summaryrefslogtreecommitdiff
path: root/src/mesa/drivers/dri/i965
AgeCommit message (Collapse)Author
2010-06-11i965: Remove the surface key used to generate constant surfaces.Eric Anholt
We had to fill out all that junk when using the cache, but no more.
2010-06-11i965: Warning fixes from the i965-streaming merge.Eric Anholt
2010-06-11i965: Use the state base address to avoid relocations.Eric Anholt
This makes the binding table code simpler, and is required for gen6, which requires binding table addresses to be under 64k offset from the surface state base addr. No significant change in performance on firefox-talos-gfx.
2010-06-11i965: GC the last two arguments to brw_cache_data.Eric Anholt
Now that the binding table is streamed indirect state, they were always NULL/0.
2010-06-11i965: Remove brw_state_cache_bo_delete now that it's unused again.Eric Anholt
2010-06-11i965: Remove caching of surface state objects.Eric Anholt
It turns out that computing a 56 byte key to look up a 20-byte object out of a hash table was some sort of a bad idea. Whoops. before: [ # ] backend test min(s) median(s) stddev. count [ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6 after: [ 0] gl firefox-talos-gfx 34.761 34.784 0.17% 5/6
2010-06-11i965: Convert the binding table to streamed indirect state.Eric Anholt
This slightly reduces reduces cairo-gl firefox-talos-gfx runtime on my Ironlake: before: [ # ] backend test min(s) median(s) stddev. count [ 0] gl firefox-talos-gfx 38.236 38.383 0.43% 5/6 after: [ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6 It turns out the cost of caching these objects and looking them up in the cache again is greater than the cost of just computing the object again, particularly when the overhead of having a separate BO to pin is removed. (Those that are paying close attention will note that this is a reversal of the path I was moving the driver in a couple of years ago. The major thing that has changed is that back then all state was recomputed when we wrapped the streaming state buffer, including recompiling our precious programs. Now, we're uncaching just the objects that are cheap to compute, and retaining caching of expensive objects)
2010-06-11i965: Split constant buffer setup from its surface state/binding state.Eric Anholt
This was bothering me when redoing the binding tables.
2010-06-11i965: Add support for streaming indirect state rather than caching objects.Eric Anholt
2010-06-11i965: Set the CC VP state immediately on state change.Eric Anholt
The cache lookup of these two little floats was .12% of total CPU time on firefox-talos-gfx because we did it any time commonly-changed state changed. On the other hand, updating the CC VP bo immediately whenver CC VP state changes is a .07% overhead due to putting a driver hoook in glEnable().
2010-06-11i965: Update old comment about state cache sizing.Eric Anholt
2010-06-11i965: Move no_batch_wrap assertion out across the area we're trying to verify.Eric Anholt
It's more likely that we wrap badly in state setup than in the little primitive packet.
2010-06-10i965: remove UseProgram driver callbackBrian Paul
It just duplicated the default/core Mesa behaviour.
2010-06-10mesa: rename src/mesa/shader/ to src/mesa/program/Brian Paul
2010-06-10i965: remove UseProgram driver callbackBrian Paul
It just duplicated the default/core Mesa behaviour.
2010-06-10i965: Add support for GL_ALPHA framebuffer objects.Eric Anholt
2010-06-09i965: Avoid calloc/free in the CURBE upload process.Eric Anholt
In exchange we end up with an extra memcpy, but that seems better than calloc/free. Each buffer is 4k maximum, and on the i965-streaming branch this allocation was showing up as the top entry in brw_validate_state profiling for cairo-gl.
2010-06-08intel: Convert remaining dri_bo_emit_reloc to drm_intel_bo_emit_reloc.Eric Anholt
The new API makes so much more sense, I'd like to forget how the old one worked.
2010-06-08intel: Change dri_bo_* to drm_intel_bo* to consistently use new API.Eric Anholt
The slightly less mechanical change of converting the emit_reloc calls will follow.
2010-06-02intel: Remove a leftover DRI1/DRI2 conditionalKristian Høgsberg
2010-05-28i965: Add cache unit -> bo name mapping for more gen6 state objects.Eric Anholt
This will help in bufmgr debugging and aub dumping.
2010-05-26i965: Add support for EXT_timer_query on Ironlake.Eric Anholt
We could potentially do this on G45 as well, though the units are different. On 965, the timestamp is tied to hclk, which would make supporting it harder.
2010-05-26i965: Move Gen6 debugging emit_mi_flush into the Gen6 block.Eric Anholt
2010-05-26i965: Emit MI_FLUSH before PSP on Ironlake for clip max threads errata.Eric Anholt
2010-05-23i965: Add support for all 8 possible ARB_draw_buffers in Mesa.Eric Anholt
We should be able to do 16, but are limited by Mesa's static buffer allocations.
2010-05-23i965: Fix bit allocation for number of color regions for ARB_draw_buffers.Eric Anholt
If you used all 4 color targets we currently support, we would see 0 and end up just writing the first output. Give enough bits that we can do the maximum of 16. Fixes piglit fbo-drawbuffers-maxtargets.
2010-05-20i965: remove disabled code for cycling through MRF registers in clipping.Eric Anholt
The idea would be that you could have multiple send messages going on if nothing depended on the previous message's results and you used a different send message. The problem is that the later send requires the VUE handle returned by the first send's allocate anyway.
2010-05-18i965: Remove constant or ignored-by-hw args from FF sync message setup.Eric Anholt
2010-05-18 gen6 fix: fix a wrong bit in binding_table_pointerZou Nan hai
2010-05-17i965: Fix point coordinate replacement after airlied's ffvertex changes.Eric Anholt
This basically restores the previous state, where a vertex result slot is set up for the texcoord to be replaced with point coord. Fixes piglit point-sprite test. Bug #27625
2010-05-17i965: Add SF program disasm under INTEL_DEBUG=sf.Eric Anholt
2010-05-17i965: Make rasterization of single and multiple quad prims match.Eric Anholt
This is trying to follow the spirit of the invariance rules, though they're not specific on this point. Fixes quad-invariance piglit test while retaining the 22s -> 18s win on glean blendFunc. This was a regression in c67d9d84f501f145f841c0b981caff6f4dfd936f.
2010-05-16i965: Remove the half-baked code for multiple OQs at the same time.Eric Anholt
GL doesn't actually let you begin an OQ while one is active, so the extra work was pointless.
2010-05-16i965: Remove unused occlusion query struct field.Eric Anholt
2010-05-14i965: Set the correct provoking vertex for clipped first-mode trifans.Eric Anholt
Bug #24470: glean clipFlat test.
2010-05-14i965: Add program dumping for INTEL_DEBUG=gs.Eric Anholt
2010-05-14i965: Parse the ff_sync URB send opcode on Ironlake disasm.Eric Anholt
2010-05-14i965: Use R16G16B16A16_FLOAT for 3-component half-float.Eric Anholt
The RGBX version isn't supported as a vertex input type, but since we force the last channel's value anyway, this should be fine. The only potential risk I see is in the limiter on VBO reads past the end of the buffer forcing the whole vertex to 0 when the A channel lands past the end. Fixes piglit draw-vertices-half-float.
2010-05-14i965: Dump out the correct shared function for SEND on Ironlake.Eric Anholt
2010-05-14i965: Support INTEL_DEBUG=clip to dump the clip program.Eric Anholt
2010-05-13i965: Reduce a single GL_QUADS to GL_TRIANGLE_FAN.Eric Anholt
This is similar to the GL_QUAD_STRIP -> TRIANGLE_STRIP optimization -- the GS usage to split the quads into tris is a huge bottleneck, so a quick check improves glean blendFunc time massively (width * height of the window of single-pixel GL_QUADS, many many times). This may also end up helping with cairo performance, which sometimes ends up drawing a single quad.
2010-05-11intel: Drop viewport hack when we canKristian Høgsberg
2010-05-02Merge branch 'gles2-2'Kristian Høgsberg
Conflicts: src/mesa/drivers/dri/common/dri_util.h
2010-04-28intel: Only register ES2 extensions for ES2 contextsKristian Høgsberg
2010-04-28dri: Add DRI entrypoints to create a context for a given APIKristian Høgsberg
2010-04-21intel: Clean up chipset name and gen num for IronlakeZhenyu Wang
Rename old IGDNG to Ironlake, and set 'gen' number for Ironlake as 5, so tracking the features with generation num instead of special is_ironlake flag. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2010-04-29i965: Reject shaders with uninlined function calls instead of hanging.Eric Anholt
Most of the failure from using uninlined function calls ends up being just bad rendering, but nested function calls in the VS currently hang the GPU, so reject them and explain why.
2010-04-29i965: Fix cube map layouts on Ironlake.Eric Anholt
We were doubling up the offsets for the mipmap levels for CPU access. Instead of reimplementing i945_miptree_layout_2d with 6 cube images separated by qpitch, share that function and provide the level offsets later. Fixes piglit cubemap and fbo-cubemap.
2010-04-29i965: Implement VS MAX in a more obvious way.Eric Anholt
This should be functionally equivalent, with the possible exception of NaN handling.
2010-04-29i965: Use immediate float operands for some VS instructions.Eric Anholt
We could use this to reduce constant register pressure, but for now it makes the resulting program assembly much more readable.