Age | Commit message (Collapse) | Author |
|
This provides the optimizer with hints about code hotness, which we're
quite certain about for debug printouts (or, rather, while we
developers often hit the checks for debug printouts, we don't care
about performance while doing so).
|
|
Don't use r0 for FF_SYNC dest reg on Sandybridge, which would
smash FFID field in GS payload, that cause later URB write fail.
Also not use r0 in any URB write requiring allocate.
|
|
|
|
Sandybridge's VF would convert quads to polygon which not required
for GS then. Current GS state still would cause hang on lineloop.
|
|
The slightly less mechanical change of converting the emit_reloc calls
will follow.
|
|
This is trying to follow the spirit of the invariance rules, though
they're not specific on this point. Fixes quad-invariance piglit test
while retaining the 22s -> 18s win on glean blendFunc.
This was a regression in c67d9d84f501f145f841c0b981caff6f4dfd936f.
|
|
|
|
Rename old IGDNG to Ironlake, and set 'gen' number for
Ironlake as 5, so tracking the features with generation num
instead of special is_ironlake flag.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
|
|
Everything has been constant-sized until now, but constant buffer
handling changes will make us want some additional variable sized
array.
|
|
Saves ~480 bytes of code.
|
|
This didn't work for quad/quadstrips at all, and for all other primitive types
it only worked when they were unclipped.
Fix up the former in gs stage (could probably do without these changes and
instead set QuadsFollowProvokingVertexConvention to false), and the rest in
clip stage.
|
|
1. new PCI ids
2. fix some 3D commands on new chipset
3. fix send instruction on new chipset
4. new VUE vertex header
5. ff_sync message (added by Zou Nan Hai <nanhai.zou@intel.com>)
6. the offset in JMPI is in unit of 64bits on new chipset
7. new cube map layout
|
|
Makefile.template
|
|
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
|
|
|
Makes state emission into a 2 phase, prepare sets things up and accounts
the size of all referenced buffer objects. The emit stage then actually
does the batchbuffer touching for emitting the objects.
There is an assert in dri_emit_reloc if a reloc occurs for a buffer
that hasn't been accounted yet.
|
|
|
|
The user-space suballocator that was used avoided relocation computations by
using the general and surface state base registers and allocating those types
of buffers out of pools built on top of single buffer objects. It also
avoided calls into the buffer manager for these small state allocations, since
only one buffer object was being used.
However, the buffer allocation cost appears to be low, and with relocation
caching, computing relocations for buffers is essentially free. Additionally,
implementing the suballocator required a don't-fence-subdata flag to disable
waiting on buffer maps so that writing new data didn't block on rendering using
old data, and careful handling when mapping to update old data (which we need
to do for unavoidable relocations with FBOs). More importantly, when the
suballocator filled, it had no replacement algorithm and just threw out all
of the contents and forced them to be recomputed, which is a significant cost.
This is the first step, which just changes the buffer type, but doesn't yet
improve the hash table to not result in full recompute on overflow. Because
the buffers are all allocated out of the general buffer allocator, we can
no longer use the general/surface state bases to avoid relocations, and they
are set to 0 instead.
|
|
state."
I had forgotten part of brw_state_cache.c that made this fix not relevant for
master (last_addr comparison and flagging based on cache id).
This reverts commit a4642f3d18bdaebaba31e5dee72fe5de9d890ffb.
|
|
Otherwise, choosing a new program wouldn't necessarily update the state, and
and an old program could be executed, leading to various sorts of pretty
pictures or hangs.
|
|
The order of vertices in payload for quardstrip is (0, 1, 3, 2),
so the PV for quardstrip is c->reg.vertex[2].
|
|
There is an errata for Broadwater that threads don't have the instruction/loop
mask stacks initialized on thread spawn. In single program flow mode, those
stacks are not writable, so we can't initialize them. However, they do get
read during ELSE and ENDIF instructions. So, instead, replace branch
instructions in single program flow mode with predicated jumps (ADD to the ip
register), avoiding use of the more complicated branch instructions that may
fail. This is also a minor optimization as no ENDIF equivalent is necessary.
Signed-off-by: Keith Packard <keithp@neko.keithp.com>
|
|
This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
|