path: root/src/mesa/drivers/dri
2010-06-11i965: Remove caching of surface state objects.Eric Anholt
It turns out that computing a 56 byte key to look up a 20-byte object out of a hash table was some sort of a bad idea. Whoops.

before:
[ # ]  backend  test               min(s)  median(s)  stddev.  count
[ 0]   gl       firefox-talos-gfx  37.799  38.203     0.39%    6/6
after:
[ 0]   gl       firefox-talos-gfx  34.761  34.784     0.17%    5/6
2010-06-11i965: Convert the binding table to streamed indirect state.Eric Anholt
This slightly reduces cairo-gl firefox-talos-gfx runtime on my Ironlake:

before:
[ # ]  backend  test               min(s)  median(s)  stddev.  count
[ 0]   gl       firefox-talos-gfx  38.236  38.383     0.43%    5/6
after:
[ 0]   gl       firefox-talos-gfx  37.799  38.203     0.39%    6/6

It turns out the cost of caching these objects and looking them up in the cache again is greater than the cost of just computing the object again, particularly when the overhead of having a separate BO to pin is removed.

(Those that are paying close attention will note that this is a reversal of the path I was moving the driver in a couple of years ago. The major thing that has changed is that back then all state was recomputed when we wrapped the streaming state buffer, including recompiling our precious programs. Now, we're uncaching just the objects that are cheap to compute, and retaining caching of expensive objects.)
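As a rough illustration of what "streamed" state means here, the sketch below appends each freshly computed object to a linear buffer instead of looking it up in a hash table. The names `stream_buf`, `stream_alloc` and `stream_emit`, the 4096-byte size and the alignment handling are all assumptions made for the example; they are not the i965 driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical streaming state buffer; size and names are illustrative. */
struct stream_buf {
    uint8_t data[4096];
    size_t used;            /* bytes consumed so far */
};

/* Reserve `size` bytes at a power-of-two alignment.
 * Returns the byte offset, or -1 when the buffer is full and the caller
 * would have to wrap to a fresh buffer. */
static long stream_alloc(struct stream_buf *sb, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (sb->used + align - 1) & ~(align - 1);
    if (start > sizeof(sb->data) || size > sizeof(sb->data) - start)
        return -1;
    sb->used = start + size;
    return (long)start;
}

/* Copy a just-computed state object into the stream: no key, no lookup. */
static long stream_emit(struct stream_buf *sb, const void *state,
                        size_t size, size_t align)
{
    long off = stream_alloc(sb, size, align);
    if (off >= 0)
        memcpy(sb->data + off, state, size);
    return off;
}
```

The trade-off is the one the commit message describes: every emit recomputes and copies the object, which only pays off for objects that are cheap to build.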
2010-06-11i965: Split constant buffer setup from its surface state/binding state.Eric Anholt
This was bothering me when redoing the binding tables.
2010-06-11i965: Add support for streaming indirect state rather than caching objects.Eric Anholt
2010-06-11i965: Set the CC VP state immediately on state change.Eric Anholt
The cache lookup of these two little floats was .12% of total CPU time on firefox-talos-gfx because we did it any time commonly-changed state changed. On the other hand, updating the CC VP bo immediately whenever CC VP state changes is a .07% overhead due to putting a driver hook in glEnable().
2010-06-11i965: Update old comment about state cache sizing.Eric Anholt
2010-06-11i965: Move no_batch_wrap assertion out across the area we're trying to verify.Eric Anholt
It's more likely that we wrap badly in state setup than in the little primitive packet.
2010-06-10i965: remove UseProgram driver callbackBrian Paul
It just duplicated the default/core Mesa behaviour.
2010-06-10intel: Remove unnecessary header.Vinson Lee
2010-06-10i965: Add support for GL_ALPHA framebuffer objects.Eric Anholt
2010-06-09intel: Use the blitter to upload TexSubImage data to busy textures.Eric Anholt
This avoids many pipeline stalls in cairo-gl.

[ # ]  backend  test               min(s)  median(s)  stddev.  count
Before:
[ 0]   gl       firefox-talos-gfx  36.799  36.851     2.34%    3/3
[ 0]   gl       firefox-talos-svg  33.429  35.360     3.46%    3/3
After:
[ 0]   gl       firefox-talos-gfx  35.895  36.250     0.48%    3/3
[ 0]   gl       firefox-talos-svg  26.669  29.888     5.34%    3/3

This doesn't avoid all the pipeline stalls because the kernel reports !busy for buffers on the flushing list. That should be fixed in .36.
2010-06-09i965: Avoid calloc/free in the CURBE upload process.Eric Anholt
In exchange we end up with an extra memcpy, but that seems better than calloc/free. Each buffer is 4k maximum, and on the i965-streaming branch this allocation was showing up as the top entry in brw_validate_state profiling for cairo-gl.
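The trade described above can be sketched as follows, assuming a scratch area that lives in the context rather than being allocated per upload. `curbe_ctx`, `curbe_upload` and the zero-padding step are hypothetical names and details chosen for the example; only the 4k maximum comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CURBE_MAX_BYTES 4096  /* per-buffer maximum stated in the commit message */

/* Hypothetical context: the scratch area persists as long as the context. */
struct curbe_ctx {
    float scratch[CURBE_MAX_BYTES / sizeof(float)];
};

/* Build `padded` floats (the constants followed by zero padding) in the
 * persistent scratch area, then copy them to `dst`.
 * Returns the number of bytes written, or 0 if the request is too large. */
static size_t curbe_upload(struct curbe_ctx *ctx, const float *consts,
                           size_t n, size_t padded, void *dst)
{
    assert(padded >= n);
    size_t bytes = padded * sizeof(float);
    if (bytes > sizeof(ctx->scratch))
        return 0;
    memset(ctx->scratch, 0, bytes);                  /* stands in for calloc's zeroing */
    memcpy(ctx->scratch, consts, n * sizeof(float));
    memcpy(dst, ctx->scratch, bytes);                /* the extra copy traded for calloc/free */
    return bytes;
}
```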
2010-06-08intel: Flag NEW_BUFFERS when changing draw buffers.Eric Anholt
There were entries to this function (most importantly, prepare_render -> update_renderbuffers) that wouldn't have had NEW_BUFFERS set, but brw_wm_surface_state (the i965 state tracking the drawing regions) expected this to change.
2010-06-08intel: Convert remaining dri_bo_emit_reloc to drm_intel_bo_emit_reloc.Eric Anholt
The new API makes so much more sense, I'd like to forget how the old one worked.
2010-06-08intel: Change dri_bo_* to drm_intel_bo* to consistently use new API.Eric Anholt
The slightly less mechanical change of converting the emit_reloc calls will follow.
2010-06-08intel: Clean up stale comments in intel_batchbuffer.c.Eric Anholt
2010-06-08intel: Remove the non-gem paths for batchbuffer upload.Eric Anholt
2010-06-08intel: Update comment in intel_tex_copy from before miptree x/y rework.Eric Anholt
2010-06-08r600: Make next_inst() static.Henri Verbeet
2010-06-08r600: Assert output registers have a valid export index.Henri Verbeet
2010-06-08r600: Process exports for all written fragment outputs.Henri Verbeet
2010-06-08r600: Fill uiFP_OutputMap for all written fragment outputs.Henri Verbeet
2010-06-05r300compiler: fix scons buildJoakim Sindholt
2010-06-05i915: Only emit a MI_FLUSH when the drawing rectangle offset changes.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-05i915: Fix off-by-one for drawing rectangle.Chris Wilson
The drawing rectangle is given in *inclusive* pixel values, so the range is only [0,2047]. Hence when rendering to a 2048 wide target, such as an extended desktop, we would issue an illegal instruction zeroing the draw area.

Fixes:
Bug 27408: Primary and Secondary display blanks in extended desktop mode with Compiz enabled
https://bugs.freedesktop.org/show_bug.cgi?id=27408

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
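A minimal sketch of the inclusive-coordinate arithmetic, assuming only what the message states (valid coordinates are [0,2047]). The helper name `draw_rect_max` and the constant `DRAW_COORD_MAX` are hypothetical, and the actual packet layout is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DRAW_COORD_MAX 2047u  /* largest inclusive coordinate, per the commit message */

/* Convert an extent in pixels (width or height) to the inclusive maximum
 * coordinate. Returns 1 on success, 0 if the extent cannot be represented. */
static int draw_rect_max(uint32_t extent, uint32_t *max_out)
{
    assert(max_out != NULL);
    if (extent == 0 || extent - 1 > DRAW_COORD_MAX)
        return 0;
    *max_out = extent - 1;    /* inclusive: a 2048-wide target ends at 2047 */
    return 1;
}
```

Passing the extent itself (2048) instead of `extent - 1` is the off-by-one: it does not fit the [0,2047] range.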
2010-06-05i915: Inhibit render cache flush when changing drawing rectangle offset.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-06-05r300/compiler: implement SIN+COS+SCS for vertex shadersMarek Olšák
2010-06-05r300/compiler: implement SNE unwound for r3xx VS, natively for r5xx VSMarek Olšák
2010-06-05r300/compiler: implement SEQ unwound for r3xx VS, natively for r5xx VSMarek Olšák
Fixes piglit/glsl-vs-vec4-indexing-4.
2010-06-05r300/compiler: implement SFL for vertex shadersMarek Olšák
And sort the "case" statements alphabetically.
2010-06-04i915: Don't use XRGB8888 on 830 and 845.Eric Anholt
The support for XRGB8888 appeared in the 855 and 865, and this format is reserved on 830/845. This should fix a regression from b4a6169412819cc3a027c6a118f0537911145a30 that caused hangs in etracer on 845s. Bug #26557.
2010-06-04i915: Clamp minimum lod to maximum texture level too.Eric Anholt
Otherwise, we'd run into minlod > maxlod, and the sampler would give us the undefined we asked for. Bug #24846. Fixes OGLC texlod.c.
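The clamp can be sketched as below. `clamp_min_lod` is a hypothetical helper, and the lower clamp to 0 is an assumption added for the example; the commit message only states the upper clamp to the maximum texture level.

```c
#include <assert.h>

/* Clamp the requested minimum LOD so it never exceeds the last mip level
 * (which would give minlod > maxlod) and, as an assumption of this sketch,
 * never drops below 0. */
static float clamp_min_lod(float min_lod, int max_level)
{
    assert(max_level >= 0);
    if (min_lod < 0.0f)
        return 0.0f;
    if (min_lod > (float)max_level)
        return (float)max_level;
    return min_lod;
}
```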
2010-06-04intel: Fix intel_compressed_num_bytes for FXT1 after I broke it.Eric Anholt
Fixes piglit fxt1-teximage since 7554b83a21bd62b20df5a7327b69f08108ac9ab6, and also OGLC tests that hit FXT1 with a million other things. Bug #28184.
2010-06-03r300/compiler: print opcode names instead of numbersMarek Olšák
2010-06-02dri/swrast: Remove unnecessary header.Vinson Lee
2010-06-02intel: Remove a leftover DRI1/DRI2 conditionalKristian Høgsberg
2010-06-01intel: Fallback to meta if we're asked to CopyTexImage2D from RGB to RGBAKristian Høgsberg
The pixel transfer rules state that we must set alpha to 1.0 in this case, which we can't easily do with the blitter. We can do two passes: one that sets the alpha to 0xff and one that copies the RGB bits, or we can just use the 3D engine. Neither approach seems worth it for this case.
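For reference, the result the pixel transfer rules require can be written on the CPU as below. This assumes 32-bit pixels with alpha in the top byte, which is an assumption made for the example, and `copy_rgb_force_alpha` is a hypothetical helper rather than anything in the driver or in the meta path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the RGB bits of each pixel and force alpha to 0xff (1.0),
 * assuming alpha occupies bits 24..31 of a 32-bit pixel. */
static void copy_rgb_force_alpha(uint32_t *dst, const uint32_t *src,
                                 size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++)
        dst[i] = (src[i] & 0x00ffffffu) | 0xff000000u;
}
```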
2010-05-31swrast: add TFP support to swrast.Dave Airlie
This adds TFP support to the swrast driver. With this I can run gnome-shell inside Xephyr, slowly. I've no idea why I did it, and g-s has other rendering issues under swrast, but it might be useful to hook up llvmpipe later. I've no idea if I even want to commit it at this point.

An enhanced version might just pass the pointer in the indirect rendering case and avoid the memcpy.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-05-31gallium: fix TFP on galliumDave Airlie
This fixes an uninitialised value use in the dri2 st when doing TFP. It uses the driContextPriv which isn't initialised at alloc time.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-05-31intel: Initialize batch->reserved_space on allocationChris Wilson
Fixes the assert (and buffer overrun):

glknots: intel_batchbuffer.c:164: _intel_batchbuffer_flush: Assertion 'used >= batch->buf->size' failed.

Reported in bug:
Bug 28274 - xscreensaver's glknots hangs GPU (945GME/Pineview)
https://bugs.freedesktop.org/show_bug.cgi?id=28274

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-29r300: fix blits for textures of width/height greater than 2048 on r5xxMarek Olšák
Yes I am fixing r300c ... who knew?
2010-05-28i965: Add cache unit -> bo name mapping for more gen6 state objects.Eric Anholt
This will help in bufmgr debugging and aub dumping.
2010-05-28i965: fix PIPE_CONTROL command for gen6.Zou Nan hai
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2010-05-26Enable hardware mipmap generation for radeon.Will Dyson
Use _mesa_meta_GenerateMipmap. It is Fast Enough(tm).

Signed-off-by: Maciej Cencora <m.cencora@gmail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
2010-05-26Fix image_matches_texture_obj() MaxLevel checkWill Dyson
When generating or uploading a new (higher) mipmap level for an image, we may need to allocate a miptree for a level greater than texObj->MaxLevel.

Signed-off-by: Maciej Cencora <m.cencora@gmail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
2010-05-26Fallback to software render if there is no miptree for an imageWill Dyson
This can happen when checking if a software fallback for a higher level operation (such as GenerateMipmap) is needed.

Signed-off-by: Maciej Cencora <m.cencora@gmail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
2010-05-26i965: Add support for EXT_timer_query on Ironlake.Eric Anholt
We could potentially do this on G45 as well, though the units are different. On 965, the timestamp is tied to hclk, which would make supporting it harder.
2010-05-26intel: Handle decode of PIPE_CONTROL instructions.Eric Anholt
2010-05-26i965: Move Gen6 debugging emit_mi_flush into the Gen6 block.Eric Anholt
2010-05-26i965: Don't PIPE_CONTROL instruction cache flush.Eric Anholt
This is a workaround for Ironlake errata. The emit_mi_flush is used for a few purposes:
1) Flushing write caches for RTT (including blit to texture)
2) Pipe fencing for sync objects
3) Spamming cache flushes to track down cache flush bugs

Spamming cache flushes seems less important than following the docs, and we should probably do that with a different mechanism than the one for render cache flushes.