Age | Commit message (Collapse) | Author |
|
This reverts commit de6fd527a545f8344e074312544517d05573fb72.
Revert this workaround as it seems the real trouble is caused by
lineloop, which doesn't require GS convert on sandybridge actually.
|
|
This makes conformance tests stable on sandybridge D0 to track
multisample state before SF/WM state.
|
|
|
|
Fix incorrect scissor rect struct and missed scissor state pointer
setting for sandybridge.
|
|
Fixes glsl-algebraic-add-add-1.
|
|
|
|
|
|
We had to fill out all that junk when using the cache, but no more.
|
|
Now that the binding table is streamed indirect state, they were
always NULL/0.
|
|
|
|
It turns out that computing a 56 byte key to look up a 20-byte object
out of a hash table was some sort of a bad idea. Whoops.
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
after:
[ 0] gl firefox-talos-gfx 34.761 34.784 0.17% 5/6
|
|
This slightly reduces reduces cairo-gl firefox-talos-gfx runtime on my
Ironlake:
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 38.236 38.383 0.43% 5/6
after:
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
It turns out the cost of caching these objects and looking them up in
the cache again is greater than the cost of just computing the object
again, particularly when the overhead of having a separate BO to pin
is removed.
(Those that are paying close attention will note that this is a
reversal of the path I was moving the driver in a couple of years ago.
The major thing that has changed is that back then all state was
recomputed when we wrapped the streaming state buffer, including
recompiling our precious programs. Now, we're uncaching just the
objects that are cheap to compute, and retaining caching of expensive
objects)
|
|
This was bothering me when redoing the binding tables.
|
|
|
|
The cache lookup of these two little floats was .12% of total CPU time
on firefox-talos-gfx because we did it any time commonly-changed state
changed. On the other hand, updating the CC VP bo immediately whenver
CC VP state changes is a .07% overhead due to putting a driver hoook
in glEnable().
|
|
The slightly less mechanical change of converting the emit_reloc calls
will follow.
|
|
|
|
|
|
|
|
|
|
|
|
even with vs disabled, still doesn't work.
|
|
|
|
|
|
|
|
|
|
It hangs the GPU at the clipper stage, presumably because we're lacking
other setup.
|
|
Everything has been constant-sized until now, but constant buffer
handling changes will make us want some additional variable sized
array.
|
|
|
|
|
|
Bug #25194.
|
|
This keeps the individual state files from having to export their
structures for brw_state_cache initialization.
|
|
Once we've freed a miptree, we won't see any more state cache requests
that would hit the things that pointed at it until we've let the miptree
get released back into the BO cache to be reused. By leaving those
surface state and binding table pointers that pointed at it around, we
would end up with up to (500 * texture size) in memory uselessly consumed
by the state cache.
Bug #20057
Bug #23530
|
|
Its flagging of extra state that's already flagged by the vtbl new_batch
when appropriate was confusing my tracking down of the OA clear bug.
|
|
No performance difference proven at 95% confidence with my GLSL demo (n=10).
|
|
This can avoid re-uploading constant data when it isn't necessary, and is
a step towards not updating other surfaces just because constants change.
It also brings the upload of the constant buffer next to the creation.
This brings openarena performance up another 4%, to 91% of the Mesa 7.4 branch.
|
|
Also, only create VS surface state if there's a VS constant buffer to be
uploaded, and set the contents of the buffer at the same time as creation.
|
|
The new, second cache will only be used for surface-related items.
Since we can create many surfaces the original, single cache could get
filled quickly. When we cleared it, we had to regenerate shaders, etc.
With two caches, we can avoid doing that.
|
|
|
|
|
|
Previously, since my check_aperture API change, we would check each piece of
state against the batchbuffer individually, but not all the state against the
batchbuffer at once. In addition to not being terribly useful in assuring
success, it probably also increased CPU load by calling check_aperture many
times per primitive.
|
|
This avoids issues with dereferencing stale cliprects around intel_draw_buffer
time. Additionally, take advantage of cliprects staying constant for FBOs and
DRI2, and emit cliprects in the batchbuffer instead of having to flush batch
each time they change.
|
|
The i965 driver previously had it's own set of code to convert
fixed-function TNL state to a vertex program. Core Mesa has code to
do this, so there is no reason to duplicate that effort in the driver.
In fact, this duplication leads to bugs when other aspects of the Mesa
infrastructure change.
|
|
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
|
|
|
|
|
The previous change gave us only two modes, one which looped over the batch
per cliprect (3d drawing) and one that didn't (state updeast).
However, we really want 4:
- Batch doesn't care about cliprects (state updates)
- Batch needs DRAWING_RECTANGLE looping per cliprect (3d drawing)
- Batch needs to be executed just once (region fills, copies, etc.)
- Batch already includes cliprect handling, and must be flushed by unlock time
(copybuffers, clears).
All callers should now be fixed to use one of these states for any batchbuffer
emits. Thanks to Keith Whitwell for pointing out the failure.
|
|
|