path: root/src/mesa/drivers/dri/i965
Age | Commit message | Author
2011-02-21 | i965: suppress repeat-emission of identical vertex elements | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: Move repeat-instruction-suppression to batchbuffer core | Chris Wilson
Move the tracking of the last emitted instructions into the core batchbuffer routines and take advantage of the shadow batch copy to avoid extra memory allocations and copies. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
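The suppression idea can be illustrated with a minimal sketch. It is not the driver's code: `struct batch`, `batch_emit_cached` and the sizes are hypothetical names chosen for illustration. The batch is kept as a shadow copy in system memory, and for each kind of packet we remember where the last copy was written; a new packet that is byte-identical to it is simply not emitted again.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BATCH_DWORDS     1024  /* hypothetical batch size */
#define MAX_PACKET_KINDS 8     /* hypothetical number of tracked packet types */

struct batch {
    uint32_t map[BATCH_DWORDS];   /* shadow copy of the batch in system memory */
    uint32_t used;                /* dwords emitted so far */
    struct {
        uint32_t offset;          /* start of the last packet of this kind */
        uint32_t len;             /* its length in dwords, 0 = none emitted yet */
    } last[MAX_PACKET_KINDS];
};

/* Emit a packet unless it is identical to the last packet of the same kind.
 * Returns 1 if written, 0 if suppressed, -1 if there is no room. */
static int batch_emit_cached(struct batch *b, unsigned kind,
                             const uint32_t *pkt, uint32_t len)
{
    assert(pkt != NULL && kind < MAX_PACKET_KINDS && len > 0);

    /* Compare against the previous copy, which already lives in the shadow
     * batch, so no separate allocation is needed to remember it. */
    if (b->last[kind].len == len &&
        memcmp(&b->map[b->last[kind].offset], pkt, len * sizeof(*pkt)) == 0)
        return 0;

    if (b->used + len > BATCH_DWORDS)
        return -1;  /* sketch only: a real driver would flush and retry */

    memcpy(&b->map[b->used], pkt, len * sizeof(*pkt));
    b->last[kind].offset = b->used;
    b->last[kind].len = len;
    b->used += len;
    return 1;
}

/* A new batch starts with unknown hardware state, so forget old packets. */
static void batch_reset(struct batch *b)
{
    memset(b, 0, sizeof(*b));
}
```

Because the comparison reads the shadow copy rather than memory mapped through the GTT, the check stays cheap in this sketch.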
2011-02-21 | intel: use pwrite for batch | Chris Wilson
It's faster. Not only is the memcpy more efficiently performed in the kernel (making up for the system call overhead), but by not using mmap we remove the greater overhead of tracking the vma of every batch. And it means we can read back from the batch buffer without incurring the cost of an uncached read through the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: drop state_bo references to batch_bo | Chris Wilson
As we use state relocations and we know that all the state belongs to the same bo, we can drop the multiple references to the same bo. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: directly write wm state to batch | Chris Wilson
As we write directly into the batch in system memory, we do not need to write to the stack first (as was done to avoid reading back through the GTT). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: write cc straight to batch | Chris Wilson
As we write directly into the batch in system memory, we do not need to write to the stack first (as was done to avoid reading back through the GTT). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: switch gen6 to use its own cc state bo | Chris Wilson
In preparation for a greater change, use the color_calc_state_bo already provisioned for this purpose. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | intel: Buffered upload | Chris Wilson
Rather than performing lots of little writes to update the common bo upon each update, write those into a static buffer and flush that when full (or at the end of the batch). Doing so gives a dramatic performance improvement over and above using mmaped access. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
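A minimal sketch of the buffering scheme described above, under stated assumptions: `struct upload`, `upload_data`, `upload_flush` and the sizes are hypothetical, and a plain array stands in for the common bo so that the example is self-contained. Small writes land in a static staging buffer, and the (expensive) write to the bo happens only when the staging buffer fills up or when the caller flushes at the end of the batch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BO_SIZE      4096  /* stand-in for the common buffer object */
#define STAGING_SIZE 256   /* hypothetical size of the static staging buffer */

struct upload {
    uint8_t bo[BO_SIZE];            /* stand-in for GPU-visible memory */
    uint8_t staging[STAGING_SIZE];  /* small writes accumulate here */
    size_t bo_offset;               /* bo offset where staging[0] will land */
    size_t staged;                  /* bytes currently staged */
    unsigned flushes;               /* number of writes made to the bo */
};

/* Push all staged bytes to the bo in a single write. */
static void upload_flush(struct upload *u)
{
    assert(u->staged <= STAGING_SIZE);
    if (u->staged == 0)
        return;
    memcpy(u->bo + u->bo_offset, u->staging, u->staged);
    u->bo_offset += u->staged;
    u->staged = 0;
    u->flushes++;
}

/* Append data; returns its offset in the bo, or (size_t)-1 if it cannot fit. */
static size_t upload_data(struct upload *u, const void *data, size_t size)
{
    size_t offset;

    if (size > STAGING_SIZE || u->bo_offset + u->staged + size > BO_SIZE)
        return (size_t)-1;

    /* Flush only when the new data would overflow the staging buffer. */
    if (u->staged + size > STAGING_SIZE)
        upload_flush(u);

    offset = u->bo_offset + u->staged;
    memcpy(u->staging + u->staged, data, size);
    u->staged += size;
    return offset;
}
```

The caller is expected to call `upload_flush` once more before the batch is submitted, so that the tail of the staging buffer reaches the bo.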
2011-02-21 | i965: Combine vb upload buffer with the general upload buffer | Chris Wilson
Reuse the new common upload buffer for uploading temporary indices and rebuilt vertex arrays. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | intel: Pack dynamic draws together | Chris Wilson
Dynamic arrays have the tendency to be small and so allocating a bo for each one is overkill and we can exploit many efficiency gains by packing them together. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
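One way to picture the packing is a simple bump allocator that hands out aligned offsets inside a shared buffer instead of allocating one bo per array. The sketch below is only an illustration: `struct packer`, `pack_array` and `SHARED_BO_SIZE` are hypothetical names, and it tracks offsets only, without any real buffer objects.

```c
#include <assert.h>
#include <stddef.h>

#define SHARED_BO_SIZE 4096  /* hypothetical size of one shared buffer */

struct packer {
    size_t next;            /* first free byte in the current shared buffer */
    unsigned buffers_used;  /* how many shared buffers have been started */
};

/* Round an offset up to a power-of-two alignment. */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Reserve room for one small array and return its offset in the current
 * shared buffer. A fresh buffer is started when the current one is full. */
static size_t pack_array(struct packer *p, size_t size, size_t align)
{
    size_t offset;

    assert(size <= SHARED_BO_SIZE);
    if (p->buffers_used == 0) {
        p->buffers_used = 1;
        p->next = 0;
    }

    offset = align_up(p->next, align);
    if (offset + size > SHARED_BO_SIZE) {
        p->buffers_used++;   /* start a new shared buffer */
        offset = 0;
    }
    p->next = offset + size;
    return offset;
}
```

With this layout, many small dynamic arrays share one allocation, and each draw only needs the shared buffer plus an offset.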
2011-02-21 | i965: Trim the trailing NOOP from 3DSTATE_INDEX_BUFFER | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: Fallback on encountering a NULL render buffer | Chris Wilson
Following a GPU hang, or other error, the render target is not likely to have an allocated BO and so we must fall back to avoid using it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32534 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-08 | i965: Add missing DEFINE_BITS for brw dirty bits. | Kenneth Graunke
These are only used for debugging, but should be there. Found by inspection.
2011-02-08 | i965: Separate the BRW_NEW_(VS|WM)_CONSTBUF dirty bits. | Kenneth Graunke
These were incorrectly defined to the same value - likely due to a cut and paste error. Found by inspection.
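The class of bug described here can be avoided by deriving each flag from an enum of bit positions, so that two flags cannot end up with the same value through a copy-and-paste slip. The sketch below is illustrative only: the flag names `BRW_NEW_VS_CONSTBUF` and `BRW_NEW_WM_CONSTBUF` come from the commit subject, but the bit positions, the `DEFINE_BIT` macro shape and the lookup helper are assumptions, not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit positions come from an enum, so every flag gets a unique value.
 * The positions used here are illustrative, not the driver's. */
enum dirty_bit_index {
    DIRTY_BIT_VS_CONSTBUF,
    DIRTY_BIT_WM_CONSTBUF,
    DIRTY_BIT_COUNT
};

#define BRW_NEW_VS_CONSTBUF (1u << DIRTY_BIT_VS_CONSTBUF)
#define BRW_NEW_WM_CONSTBUF (1u << DIRTY_BIT_WM_CONSTBUF)

/* Debug table pairing each bit with its name (hypothetical layout). */
struct dirty_bit_name {
    uint32_t bit;
    const char *name;
};

#define DEFINE_BIT(name) { name, #name }

static const struct dirty_bit_name brw_bits[] = {
    DEFINE_BIT(BRW_NEW_VS_CONSTBUF),
    DEFINE_BIT(BRW_NEW_WM_CONSTBUF),
};

/* Return the debug name for a single dirty bit, or "unknown". */
static const char *dirty_bit_name(uint32_t bit)
{
    size_t i;

    assert(bit != 0);
    for (i = 0; i < sizeof(brw_bits) / sizeof(brw_bits[0]); i++) {
        if (brw_bits[i].bit == bit)
            return brw_bits[i].name;
    }
    return "unknown";
}
```

If the two flags shared a value, marking the VS constant buffer dirty would also mark the WM one dirty (and vice versa), and a debug table like this could only ever report one of the two names.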
2011-02-08 | i965: Rename a few more commands to match the documentation. | Kenneth Graunke
2011-02-08 | i965: Remove pointless keying of WM state on VUE size. | Eric Anholt
2011-02-05 | mesa/965: add support for GL_EXT_framebuffer_sRGB (v2) | Dave Airlie
This adds i965 support for GL_EXT_framebuffer_sRGB. It introduces a new constant to say that the driver can support sRGB-enabled FBOs, since enabling the extension doesn't mean the driver can actually support sRGB. It also adds the state flush in the core code suggested by Brian and fixes the ARB_fbo color encoding. Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-02-04 | i965: Drop the dead tracking of color_regions[]. | Eric Anholt
We pull the draw regions right out of the renderbuffers these days.
2011-02-04 | i965: Drop the INTEL_DEBUG=state spam about the cache size check. | Eric Anholt
There's way more interesting info in INTEL_DEBUG=state if you could find it among the state size checks.
2011-01-31 | i965: Emit texel offsets in sampler messages. | Kenneth Graunke
2011-01-31 | Convert everything from the talloc API to the ralloc API. | Kenneth Graunke
2011-01-23 | i965: remove _NEW_ACCUM | Brian Paul
2011-01-21 | glsl, i965: Remove unnecessary talloc includes. | Kenneth Graunke
These are already picked up by ir.h or glsl_types.h.
2011-01-20 | intel: Fix typeos from 3d028024 and 790ff232 | Ian Romanick
...and remove egg from face.
2011-01-20 | i965: Set correct values for range/precision of fragment shader types | Ian Romanick
2011-01-19 | i965/fs: Take the shared mathbox into account in instruction scheduling. | Eric Anholt
I don't have evidence for this amounting to any improvement, but it does codify a bit more what we understand so far about the pipeline.
2011-01-19 | i965/fs: Add a helper function for detecting math opcodes. | Eric Anholt
2011-01-19 | i965/fs: Assign URB/CURB register numbers after instruction scheduling. | Eric Anholt
This fixes a bunch of unnecessary barriers due to the scheduler not knowing what that arbitrary register description refers to when trying to reason about its dependencies. The result is rescheduling in the convolution kernel shader in Lightsmark, which results in avoiding register spilling and increasing the performance of the first scene from 6-7 fps midway through the panning to 11fps. The register spilling was a regression from Mesa 7.9 to Mesa 7.10.
2011-01-19 | i965/fs: Add an instruction scheduler. | Eric Anholt
Improves performance of my GLSL demo by 5.1% (+/- 1.4%, n=7). It also reschedules the giant multiply tree at the end of glsl-fs-convolution-1 so that we end up not spilling registers, producing the expected level of performance.
2011-01-19 | i965/fs: Add a helper for detecting texturing opcodes. | Eric Anholt
2011-01-18 | i965: Fix a comment typo. | Eric Anholt
2011-01-18 | i965: Fix a bug in i965 compute-to-MRF. | Eric Anholt
Fixes piglit glsl-fs-texture2d-branching. I couldn't come up with a testcase that didn't involve dead code, but it's still worthwhile to fix I think.
2011-01-17 | i965: Fix dead pointers to fp->Parameters->ParameterValues[] after realloc. | Eric Anholt
Fixes texrect-many regression with ff_fragment_shader -- as we added refs to the subsequent texcoord scaling parameters, the array got realloced to a new address while our params[] still pointed at the old location.
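The hazard is a general one with growable arrays: a pointer taken into the array becomes stale as soon as a later append makes `realloc` move the storage. A common remedy, sketched below, is to remember indices and resolve them to pointers only at the point of use. This is an illustration of the pattern rather than the fix that was committed; `struct param_list`, `add_param` and `param_value` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

struct param_list {
    float (*values)[4];  /* storage may move whenever the list grows */
    unsigned count;
    unsigned capacity;
};

/* Append one vec4 parameter; returns its index, or -1 if allocation fails. */
static int add_param(struct param_list *list, const float v[4])
{
    unsigned i;

    if (list->count == list->capacity) {
        unsigned new_cap = list->capacity ? list->capacity * 2 : 4;
        float (*grown)[4] = realloc(list->values, new_cap * sizeof(*grown));
        if (!grown)
            return -1;
        list->values = grown;  /* old pointers into the array are now stale */
        list->capacity = new_cap;
    }
    for (i = 0; i < 4; i++)
        list->values[list->count][i] = v[i];
    return (int)list->count++;
}

/* Resolve an index to a pointer only when the value is actually needed. */
static const float *param_value(const struct param_list *list, int index)
{
    assert(index >= 0 && (unsigned)index < list->count);
    return list->values[index];
}
```

An index stays valid across any number of later appends, whereas a saved `float *` is only valid until the next growth of the list.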
2011-01-16 | i965: add support for EXT_texture_sRGB_decode | Dave Airlie
We just choose the texture format depending on the srgb decode bit for the sRGB formats. Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-15 | mesa: begin implementation of GL_ARB_draw_buffers_blend | Brian Paul
2011-01-14 | i965: Replace broken handling of dead code with an assert. | Eric Anholt
This code should never have been triggered, but I often did anyway when I disabled optimization passes during debugging, then spent my time debugging that this code doesn't work.
2011-01-14 | i965: Add an invalidation of live intervals after register splitting. | Eric Anholt
No effect, since it was called before live intervals were calculated.
2011-01-14 | i965: fix fbo-srgb on i965. | Dave Airlie
Until we get the EXT_framebuffer_sRGB extension we should bind the sRGB formats for FBO as linear. Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-13 | i965: Remove unnecessary headers. | Vinson Lee
2011-01-12 | i965/fs: Do flat shading when appropriate. | Eric Anholt
We were trying to interpolate, which would end up doing unnecessary math, and doing so on undefined values. Fixes glsl-fs-flat-color.
2011-01-12 | i965: Clarify when we need to (re-)calculate live intervals. | Eric Anholt
The ad-hoc placement of recalculation somewhere between when they got invalidated and when they were next needed was confusing. This should clarify what's going on here.
2011-01-12 | i965/vs: When MOVing to produce ABS, strip negate of the operand. | Eric Anholt
We were returning the negative absolute value, instead of the absolute value. Fixes glsl-vs-abs-neg.
2011-01-12 | i965/fs: When producing ir_unop_abs of an operand, strip negate. | Eric Anholt
We were returning the negative absolute value, instead of the absolute value. Fixes glsl-fs-abs-neg.
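The two abs fixes above share one cause, which a small model makes visible. The sketch assumes that an operand carries separate abs and negate modifiers and that the absolute value is applied before the negation; `struct src_reg` and the helper names are hypothetical and only model that behaviour. Setting the abs modifier on an operand that already has negate set then yields -|x| unless negate is cleared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a source operand with modifiers. */
struct src_reg {
    float value;
    bool abs;
    bool negate;
};

/* Evaluate an operand: absolute value first, then negation (assumed order). */
static float read_src(struct src_reg r)
{
    float v = r.value;

    if (r.abs && v < 0.0f)
        v = -v;
    return r.negate ? -v : v;
}

/* Buggy variant: leaves an existing negate in place, producing -|x|. */
static struct src_reg src_abs_buggy(struct src_reg r)
{
    r.abs = true;
    return r;
}

/* Fixed variant: strip negate so the result is |x|. */
static struct src_reg src_abs(struct src_reg r)
{
    r.abs = true;
    r.negate = false;
    assert(r.abs && !r.negate);  /* postcondition of the fix */
    return r;
}
```

For an operand holding 3.0 with negate set (that is, -3.0), the buggy helper evaluates to -3.0 and the fixed one to 3.0.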
2011-01-11 | i965: Tighten up the check for flow control interfering with coalescing. | Eric Anholt
This greatly improves codegen for programs with flow control by allowing coalescing for all instructions at the top level, not just ones that follow the last flow control in the program.
2011-01-11 | i965: Remove dead fallback for stencil _Enabled but no stencil buffer. | Eric Anholt
The _Enabled field is the thing that takes into account whether there's a stencil buffer. Tested with piglit glx-visuals-stencil.
2011-01-10 | i965: Use a new miptree to avoid software fallbacks due to drawing offset. | Eric Anholt
When attaching a small mipmap level to an FBO, the original gen4 didn't have the bits to support rendering to it. Instead of falling back, just blit it to a new little miptree just for it, and let it get revalidated into the stack later just like any other new teximage. Bug #30365.
2011-01-10 | Revert "intel: Always allocate miptrees from level 0, not tObj->BaseLevel." | Eric Anholt
This reverts commit 7ce6517f3ac41bf770ab39aba4509d4f535ef663. This reverts commit d60145d06d999c5c76000499e6fa9351e11d17fa. I was wrong about which generations supported baselevel adjustment -- it's just gen4, nothing earlier. This meant that i915 would have never used the mag filter when baselevel != 0. Not a severe bug, but not an intentional regression. I think we can fix the performance issue another way.
2011-01-10 | i965: Add #defines for HiZ and separate stencil buffer commands. | Kenneth Graunke
2011-01-10 | i965: Add new HiZ related bits to WM_STATE. | Kenneth Graunke
2011-01-10 | i965: Rename more #defines to 3DSTATE rather than CMD or CMD_3D. | Kenneth Graunke
Again, this makes it match the documentation.