Age | Commit message (Collapse) | Author |
|
It turns out that computing a 56 byte key to look up a 20-byte object
out of a hash table was some sort of a bad idea. Whoops.
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
after:
[ 0] gl firefox-talos-gfx 34.761 34.784 0.17% 5/6
|
|
This slightly reduces reduces cairo-gl firefox-talos-gfx runtime on my
Ironlake:
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 38.236 38.383 0.43% 5/6
after:
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
It turns out the cost of caching these objects and looking them up in
the cache again is greater than the cost of just computing the object
again, particularly when the overhead of having a separate BO to pin
is removed.
(Those that are paying close attention will note that this is a
reversal of the path I was moving the driver in a couple of years ago.
The major thing that has changed is that back then all state was
recomputed when we wrapped the streaming state buffer, including
recompiling our precious programs. Now, we're uncaching just the
objects that are cheap to compute, and retaining caching of expensive
objects)
|
|
This was bothering me when redoing the binding tables.
|
|
|
|
The cache lookup of these two little floats was .12% of total CPU time
on firefox-talos-gfx because we did it any time commonly-changed state
changed. On the other hand, updating the CC VP bo immediately whenver
CC VP state changes is a .07% overhead due to putting a driver hoook
in glEnable().
|
|
|
|
It's more likely that we wrap badly in state setup than in the little
primitive packet.
|
|
It just duplicated the default/core Mesa behaviour.
|
|
|
|
|
|
This avoids many pipeline stalls in cairo-gl.
[ # ] backend test min(s) median(s) stddev. count
Before:
[ 0] gl firefox-talos-gfx 36.799 36.851 2.34% 3/3
[ 0] gl firefox-talos-svg 33.429 35.360 3.46% 3/3
After:
[ 0] gl firefox-talos-gfx 35.895 36.250 0.48% 3/3
[ 0] gl firefox-talos-svg 26.669 29.888 5.34% 3/3
This doesn't avoid all the pipeline stalls because the kernel reports
!busy for buffers on the flushing list. That should be fixed in .36.
|
|
In exchange we end up with an extra memcpy, but that seems better than
calloc/free. Each buffer is 4k maximum, and on the i965-streaming
branch this allocation was showing up as the top entry in
brw_validate_state profiling for cairo-gl.
|
|
There were entries to this function (most imporantly, prepare_render
-> update_renderbuffers) that wouldn't have had NEW_BUFFERS set, but
brw_wm_surface_state (the i965 state tracking the drawing regions)
expected this to change.
|
|
The new API makes so much more sense, I'd like to forget how the old
one worked.
|
|
The slightly less mechanical change of converting the emit_reloc calls
will follow.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The drawing rectangle is given in *inclusive* pixel values, so the range
is only [0,2047]. Hence when rendering to a 2048 wide target, such as an
extended desktop, we would issue an illegal instruction zeroing the draw
area.
Fixes:
Bug 27408: Primary and Secondary display blanks in extended
desktop mode with Compiz enabled
https://bugs.freedesktop.org/show_bug.cgi?id=27408
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
|
|
|
|
Fixes piglit/glsl-vs-vec4-indexing-4.
|
|
And sort the "case" statements alphabetically.
|
|
The support for XRGB8888 appeared in the 855 and 865, and this format
is reserved on 830/845. This should fix a regression from
b4a6169412819cc3a027c6a118f0537911145a30 that caused hangs in etracer
on 845s.
Bug #26557.
|
|
Otherwise, we'd run into minlod > maxlod, and the sampler would give
us the undefined we asked for.
Bug #24846. Fixes OGLC texlod.c.
|
|
Fixes piglit fxt1-teximage since
7554b83a21bd62b20df5a7327b69f08108ac9ab6, and also OGLC tests that hit
FXT1 with a million other things.
Bug #28184.
|
|
|
|
|
|
|
|
The pixel transfer rules state that we must set alpha to 1.0 in this case
which we can't easily do with the blitter. We can do to passes: one that
sets the alpha to 0xff and one that copies the RGB bits or we can just
use the 3D engine. Neither approach seems worth it for this case.
|
|
This adds TFP support to the swrast driver, with this I can run gnome-shell inside Xephyr slowly. I've no idea why I did it, and g-s has other rendering issues under swrast, but it might be useful to hook up llvmpipe later. I've no idea if I even want to commit it at this point.
An enhanced version might just pass the pointer in the indirect rendering case
and avoid the memcpy.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This fixes an uninitialised value use in the dri2 st when doing TFP.
It uses the driContextPriv which isn't initialised at alloc time.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Fixes the assert (and buffer overrun):
glknots: intel_batchbuffer.c:164: _intel_batchbuffer_flush: Assertion
'used >= batch->buf->size' failed.
Reported in bug:
Bug 28274 - xscreensaver's glknots hangs GPU (945GME/Pineview)
https://bugs.freedesktop.org/show_bug.cgi?id=28274
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Yes I am fixing r300c ... who knew?
|
|
This will help in bufmgr debugging and aub dumping.
|
|
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Use _mesa_meta_GenerateMipmap. It is Fast Enough(tm).
Signed-off-by: Maciej Cencora <m.cencora@gmail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
|
|
When generating or uploading a new (higher) mipmap level for an image,
we can need to allocate a miptree for a level greater than
texObj->MaxLevel.
Signed-off-by: Maciej Cencora <m.cencora@gmail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
|
|
This can happen when checking if a software fallback for a higher level
operation (such as GenerateMipmap) is needed.
Signed-off-by: Maciej Cencora <m.cencora@gmail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
|
|
We could potentially do this on G45 as well, though the units are
different. On 965, the timestamp is tied to hclk, which would make
supporting it harder.
|
|
|
|
|
|
This is a workaround for Ironlake errata. The emit_mi_flush is used
for a few purposes:
1) Flushing write caches for RTT (including blit to texture)
2) Pipe fencing for sync objects
3) Spamming cache flushes to track down cache flush bugs
Spamming cache flushes seems less important than following the docs,
and we should probably do that with a different mechanism than the one
for render cache flushes.
|