Age | Commit message (Collapse) | Author |
|
This provides the optimizer with hints about code hotness, which we're
quite certain about for debug printouts (or, rather, while we
developers often hit the checks for debug printouts, we don't care
about performance while doing so).
|
|
This was slightly confused because gen6_wm_constants does the push
constant buffer, while brw_wm_constants does pull constants.
|
|
|
|
Fix incorrect scissor rect struct and missed scissor state pointer
setting for sandybridge.
|
|
Another optional ARB_imaging subset extension.
|
|
|
|
Fixes glsl-algebraic-add-add-1.
|
|
|
|
|
|
It turns out that computing a 56 byte key to look up a 20-byte object
out of a hash table was some sort of a bad idea. Whoops.
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
after:
[ 0] gl firefox-talos-gfx 34.761 34.784 0.17% 5/6
|
|
This slightly reduces reduces cairo-gl firefox-talos-gfx runtime on my
Ironlake:
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 38.236 38.383 0.43% 5/6
after:
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
It turns out the cost of caching these objects and looking them up in
the cache again is greater than the cost of just computing the object
again, particularly when the overhead of having a separate BO to pin
is removed.
(Those that are paying close attention will note that this is a
reversal of the path I was moving the driver in a couple of years ago.
The major thing that has changed is that back then all state was
recomputed when we wrapped the streaming state buffer, including
recompiling our precious programs. Now, we're uncaching just the
objects that are cheap to compute, and retaining caching of expensive
objects)
|
|
This was bothering me when redoing the binding tables.
|
|
The cache lookup of these two little floats was .12% of total CPU time
on firefox-talos-gfx because we did it any time commonly-changed state
changed. On the other hand, updating the CC VP bo immediately whenver
CC VP state changes is a .07% overhead due to putting a driver hoook
in glEnable().
|
|
The slightly less mechanical change of converting the emit_reloc calls
will follow.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
even with vs disabled, still doesn't work.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This may break the SUNOS4 build, but it's no longer relevant.
|
|
|
|
|
|
Bug #25194.
|
|
Before, if we just called glXMakeCurrent() and didn't render anything we'd
still trigger a flushFrontBuffer() call.
Now only set the intel->front_buffer_dirty field at state validation time
just before we draw something.
NOTE: additional calls to intel_check_front_buffer_rendering() might be
needed if I missed some rendering paths.
|
|
|
|
Its flagging of extra state that's already flagged by the vtbl new_batch
when appropriate was confusing my tracking down of the OA clear bug.
|
|
Check that all the textures needed by the current fragment program
actually exist and are valid.
|
|
No performance difference proven at 95% confidence with my GLSL demo (n=10).
|
|
This state flag has been unused since the ffvertex_prog move to core.
|
|
This can avoid re-uploading constant data when it isn't necessary, and is
a step towards not updating other surfaces just because constants change.
It also brings the upload of the constant buffer next to the creation.
This brings openarena performance up another 4%, to 91% of the Mesa 7.4 branch.
|
|
Also, only create VS surface state if there's a VS constant buffer to be
uploaded, and set the contents of the buffer at the same time as creation.
|
|
The new, second cache will only be used for surface-related items.
Since we can create many surfaces the original, single cache could get
filled quickly. When we cleared it, we had to regenerate shaders, etc.
With two caches, we can avoid doing that.
|
|
|
|
No more dynamic atoms so we can simplify the state validation code a little.
|
|
|
|
|