path: root/src/mesa/drivers/dri/i965
Age | Commit message | Author
2011-02-21 | i965: Clean up brw_prepare_vertices() | Chris Wilson
Use a temporary glarray variable to replace the numerous input->glarray. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | intel: combine short memcpy using a temporary allocated buffer | Chris Wilson
Using a temporary buffer for large discontiguous uploads into the common buffer and a single buffered upload is faster than performing the discontiguous copies through a mapping into the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: upload normal arrays as interleaved | Chris Wilson
Upload the non-vbo arrays into a single interleaved buffer object, so that we need to emit only a single vertex buffer relocation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: interleaved vbo | Chris Wilson
If the user passed in several arrays interleaved in the same vbo, only emit a single vertex buffer and relocation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: emit one vb packet per vbo | Chris Wilson
Track reuse of the vertex buffer objects and so minimise the number of vertex buffers used by the hardware (and their relocations). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: upload transient indices into the same discontiguous buffer | Chris Wilson
As we now pack the indices into a common upload buffer, we can reuse a single CMD_INDEX_BUFFER packet and translate each invocation with a start vertex offset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: suppress repeat-emission of identical vertex elements | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: Move repeat-instruction-suppression to batchbuffer core | Chris Wilson
Move the tracking of the last emitted instructions into the core batchbuffer routines and take advantage of the shadow batch copy to avoid extra memory allocations and copies. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
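The idea of suppressing repeated emission can be illustrated with a minimal sketch. It is not the driver's code: `struct batch`, `struct last_packet` and `emit_if_changed` are hypothetical names, and the sketch keeps a separate copy of the last packet, whereas the commit describes reusing the shadow batch copy to avoid exactly that extra storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PACKET_DWORDS 16

/* Hypothetical cache of the last packet emitted for one packet type. */
struct last_packet {
    uint32_t dwords[MAX_PACKET_DWORDS];
    size_t count;               /* 0 means nothing has been cached yet */
};

/* Hypothetical batch: a plain array in system memory. */
struct batch {
    uint32_t data[256];
    size_t used;                /* number of dwords written so far */
};

/* Append the packet unless it is identical to the cached one.
 * Returns true if the packet was actually written to the batch. */
static bool emit_if_changed(struct batch *b, struct last_packet *last,
                            const uint32_t *pkt, size_t count)
{
    if (count == 0 || count > MAX_PACKET_DWORDS ||
        b->used + count > sizeof(b->data) / sizeof(b->data[0]))
        return false;           /* does not fit: nothing is written */

    if (last->count == count &&
        memcmp(last->dwords, pkt, count * sizeof(*pkt)) == 0)
        return false;           /* identical to the previous emission */

    memcpy(&b->data[b->used], pkt, count * sizeof(*pkt));
    b->used += count;
    memcpy(last->dwords, pkt, count * sizeof(*pkt));
    last->count = count;
    return true;
}
```

In a sketch like this the cache would have to be reset (`count = 0`) whenever a new batch starts, since the hardware state from the previous batch can no longer be assumed.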
2011-02-21 | intel: use pwrite for batch | Chris Wilson
It's faster. Not only is the memcpy more efficiently performed in the kernel (making up for the system call overhead), but by not using mmap we remove the greater overhead of tracking the vma of every batch. And it means we can read back from the batch buffer without incurring the cost of an uncached read through the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: drop state_bo references to batch_bo | Chris Wilson
As we use state relocations and we know that all the state belongs to the same bo, we can drop the multiple references to the same bo. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: directly write wm state to batch | Chris Wilson
As we write directly into the batch in system memory, we do not need to write first to the stack (as was previously done to avoid reading back through the GTT). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: write cc straight to batch | Chris Wilson
As we write directly into the batch in system memory, we do not need to write first to the stack (as was previously done to avoid reading back through the GTT). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: switch gen6 to use its own cc state bo | Chris Wilson
In preparation for a greater change, use the color_calc_state_bo already provisioned for this purpose. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | intel: Buffered upload | Chris Wilson
Rather than performing lots of little writes to update the common bo upon each update, write those into a static buffer and flush that when full (or at the end of the batch). Doing so gives a dramatic performance improvement over and above using mmaped access. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
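The buffering scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the driver's implementation: `struct fake_bo`, `struct upload`, `upload_data` and `upload_flush` are hypothetical names, the buffer sizes are arbitrary, and a `memcpy` stands in for the single write to the buffer object that a real driver would issue through the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STAGING_SIZE 4096
#define BO_SIZE      65536

/* Hypothetical stand-in for a buffer object. */
struct fake_bo {
    unsigned char data[BO_SIZE];
    unsigned flushes;           /* number of writes made to the bo */
};

struct upload {
    struct fake_bo *bo;
    size_t bo_offset;           /* where the staged bytes will land in the bo */
    unsigned char staging[STAGING_SIZE];
    size_t staged;              /* bytes currently held in staging */
};

/* Write all staged bytes to the bo in one operation. */
static void upload_flush(struct upload *u)
{
    if (u->staged == 0)
        return;
    memcpy(u->bo->data + u->bo_offset, u->staging, u->staged);
    u->bo->flushes++;
    u->bo_offset += u->staged;
    u->staged = 0;
}

/* Queue a small write. Returns the offset the data will have in the bo,
 * or (size_t)-1 if it cannot fit. */
static size_t upload_data(struct upload *u, const void *src, size_t len)
{
    size_t offset;

    if (len > STAGING_SIZE)
        return (size_t)-1;
    if (u->staged + len > STAGING_SIZE)
        upload_flush(u);        /* staging is full: flush before queuing */
    if (u->bo_offset + u->staged + len > BO_SIZE)
        return (size_t)-1;

    offset = u->bo_offset + u->staged;
    memcpy(u->staging + u->staged, src, len);
    u->staged += len;
    return offset;
}
```

A caller would queue many small writes with `upload_data` and call `upload_flush` once at the end of the batch, so the bo sees a few large writes instead of many small ones.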
2011-02-21 | i965: Combine vb upload buffer with the general upload buffer | Chris Wilson
Reuse the new common upload buffer for uploading temporary indices and rebuilt vertex arrays. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | intel: Pack dynamic draws together | Chris Wilson
Dynamic arrays have the tendency to be small and so allocating a bo for each one is overkill and we can exploit many efficiency gains by packing them together. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: Trim the trailing NOOP from 3DSTATE_INDEX_BUFFER | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21 | i965: Fallback on encountering a NULL render buffer | Chris Wilson
Following a GPU hang, or other error, the render target is not likely to have an allocated BO and so we must fall back to avoid using it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32534 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-08 | i965: Add missing DEFINE_BITS for brw dirty bits. | Kenneth Graunke
These are only used for debugging, but should be there. Found by inspection.
2011-02-08 | i965: Separate the BRW_NEW_(VS|WM)_CONSTBUF dirty bits. | Kenneth Graunke
These were incorrectly defined to the same value - likely due to a cut and paste error. Found by inspection.
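Why two dirty flags sharing one value is a problem can be shown with a small sketch. The flag names and bit positions below are hypothetical; the actual BRW_NEW_* definitions are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names and bit positions, for illustration only. */
enum {
    NEW_VS_CONSTBUF_BAD = 1u << 4,
    NEW_WM_CONSTBUF_BAD = 1u << 4,  /* cut and paste: same bit as above */

    NEW_VS_CONSTBUF     = 1u << 4,
    NEW_WM_CONSTBUF     = 1u << 5   /* separated: its own bit */
};

/* True if the given flag is set in the dirty mask. */
static bool needs_update(uint32_t dirty, uint32_t flag)
{
    return (dirty & flag) != 0;
}
```

With the shared value, flagging only the VS constant buffer also makes `needs_update(dirty, NEW_WM_CONSTBUF_BAD)` true, so the two states cannot be told apart; with separate bits each state is re-evaluated only when its own flag is set.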
2011-02-08 | i965: Rename a few more commands to match the documentation. | Kenneth Graunke
2011-02-08 | i965: Remove pointless keying of WM state on VUE size. | Eric Anholt
2011-02-05 | mesa/965: add support for GL_EXT_framebuffer_sRGB (v2) | Dave Airlie
This adds i965 support for GL_EXT_framebuffer_sRGB. It introduces a new constant to say that the driver can support sRGB-enabled FBOs, since enabling the extension doesn't mean the driver can actually support sRGB. It also adds the state flush in the core code suggested by Brian, and fixes the ARB_fbo color encoding. Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-02-04 | i965: Drop the dead tracking of color_regions[]. | Eric Anholt
We pull the draw regions right out of the renderbuffers these days.
2011-02-04 | i965: Drop the INTEL_DEBUG=state spam about the cache size check. | Eric Anholt
There's way more interesting info in INTEL_DEBUG=state if you could find it among the state size checks.
2011-01-31 | i965: Emit texel offsets in sampler messages. | Kenneth Graunke
2011-01-31 | Convert everything from the talloc API to the ralloc API. | Kenneth Graunke
2011-01-23 | i965: remove _NEW_ACCUM | Brian Paul
2011-01-21 | glsl, i965: Remove unnecessary talloc includes. | Kenneth Graunke
These are already picked up by ir.h or glsl_types.h.
2011-01-20 | intel: Fix typeos from 3d028024 and 790ff232 | Ian Romanick
...and remove egg from face.
2011-01-20 | i965: Set correct values for range/precision of fragment shader types | Ian Romanick
2011-01-19 | i965/fs: Take the shared mathbox into account in instruction scheduling. | Eric Anholt
I don't have evidence for this amounting to any improvement, but it does codify a bit more what we understand so far about the pipeline.
2011-01-19 | i965/fs: Add a helper function for detecting math opcodes. | Eric Anholt
2011-01-19 | i965/fs: Assign URB/CURB register numbers after instruction scheduling. | Eric Anholt
This fixes a bunch of unnecessary barriers due to the scheduler not knowing what that arbitrary register description refers to when trying to reason about its dependencies. The result is rescheduling in the convolution kernel shader in Lightsmark, which results in avoiding register spilling and increasing the performance of the first scene from 6-7 fps midway through the panning to 11fps. The register spilling was a regression from Mesa 7.9 to Mesa 7.10.
2011-01-19 | i965/fs: Add an instruction scheduler. | Eric Anholt
Improves performance of my GLSL demo by 5.1% (+/- 1.4%, n=7). It also reschedules the giant multiply tree at the end of glsl-fs-convolution-1 so that we end up not spilling registers, producing the expected level of performance.
2011-01-19 | i965/fs: Add a helper for detecting texturing opcodes. | Eric Anholt
2011-01-18 | i965: Fix a comment typo. | Eric Anholt
2011-01-18 | i965: Fix a bug in i965 compute-to-MRF. | Eric Anholt
Fixes piglit glsl-fs-texture2d-branching. I couldn't come up with a testcase that didn't involve dead code, but it's still worthwhile to fix I think.
2011-01-17 | i965: Fix dead pointers to fp->Parameters->ParameterValues[] after realloc. | Eric Anholt
Fixes texrect-many regression with ff_fragment_shader -- as we added refs to the subsequent texcoord scaling parameters, the array got realloced to a new address while our params[] still pointed at the old location.
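The hazard described here is general to any array grown with `realloc`. The sketch below shows one common way to avoid it, which is to hold indices and resolve them to pointers only at the point of use. The commit's actual fix is not shown in this log, and `struct param_list`, `param_add` and `param_get` are hypothetical names, not Mesa's data structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable list of vec4 parameters. */
struct param_list {
    float (*values)[4];
    size_t count;
    size_t capacity;
};

/* Append one vec4 and return its index, or (size_t)-1 on allocation
 * failure. The array may move, so callers must not keep pointers into it. */
static size_t param_add(struct param_list *l, const float v[4])
{
    if (l->count == l->capacity) {
        size_t cap = l->capacity ? l->capacity * 2 : 4;
        float (*grown)[4] = realloc(l->values, cap * sizeof(*grown));
        if (!grown)
            return (size_t)-1;
        l->values = grown;      /* earlier pointers into the array are stale */
        l->capacity = cap;
    }
    for (int i = 0; i < 4; i++)
        l->values[l->count][i] = v[i];
    return l->count++;
}

/* Resolve an index to a pointer only when the value is needed. */
static const float *param_get(const struct param_list *l, size_t index)
{
    return index < l->count ? l->values[index] : NULL;
}
```

An index returned by `param_add` stays valid however many parameters are added afterwards, whereas a pointer taken before a later `param_add` may refer to freed memory.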
2011-01-16 | i965: add support for EXT_texture_sRGB_decode | Dave Airlie
We just choose the texture format depending on the srgb decode bit for the sRGB formats. Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-15 | mesa: begin implementation of GL_ARB_draw_buffers_blend | Brian Paul
2011-01-14 | i965: Replace broken handling of dead code with an assert. | Eric Anholt
This code should never have been triggered, but I often triggered it anyway when I disabled optimization passes during debugging, and then spent my time debugging the fact that this code doesn't work.
2011-01-14 | i965: Add an invalidation of live intervals after register splitting. | Eric Anholt
No effect, since it was called before live intervals were calculated.
2011-01-14 | i965: fix fbo-srgb on i965. | Dave Airlie
Until we get the EXT_framebuffer_sRGB extension we should bind the sRGB formats for FBO as linear. Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-13 | i965: Remove unnecessary headers. | Vinson Lee
2011-01-12 | i965/fs: Do flat shading when appropriate. | Eric Anholt
We were trying to interpolate, which would end up doing unnecessary math, and doing so on undefined values. Fixes glsl-fs-flat-color.
2011-01-12 | i965: Clarify when we need to (re-)calculate live intervals. | Eric Anholt
The ad-hoc placement of recalculation somewhere between when they got invalidated and when they were next needed was confusing. This should clarify what's going on here.
2011-01-12 | i965/vs: When MOVing to produce ABS, strip negate of the operand. | Eric Anholt
We were returning the negative absolute value, instead of the absolute value. Fixes glsl-vs-abs-neg.
2011-01-12 | i965/fs: When producing ir_unop_abs of an operand, strip negate. | Eric Anholt
We were returning the negative absolute value, instead of the absolute value. Fixes glsl-fs-abs-neg.
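The bug fixed by this commit and the matching vertex shader commit can be modelled with a small sketch of source modifiers. The model is an assumption made for illustration: `struct src_operand` and the helper functions are hypothetical, and the modifier order (absolute value first, then negate) is assumed rather than taken from the commits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical source operand whose modifiers apply when it is read. */
struct src_operand {
    float value;
    bool abs;
    bool negate;
};

/* Assumed modifier order: absolute value first, then negate. */
static float read_src(struct src_operand s)
{
    float v = s.value;
    if (s.abs && v < 0.0f)
        v = -v;
    return s.negate ? -v : v;
}

/* Buggy: sets abs but keeps an existing negate, so the read gives -|x|. */
static struct src_operand make_abs_buggy(struct src_operand s)
{
    s.abs = true;
    return s;
}

/* Fixed: strip negate so the read gives |x|. */
static struct src_operand make_abs(struct src_operand s)
{
    s.abs = true;
    s.negate = false;
    return s;
}
```

For an operand that already carries a negate modifier, `make_abs_buggy` reads back as the negative absolute value, while `make_abs` reads back as the absolute value.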
2011-01-11 | i965: Tighten up the check for flow control interfering with coalescing. | Eric Anholt
This greatly improves codegen for programs with flow control by allowing coalescing for all instructions at the top level, not just ones that follow the last flow control in the program.