path: root/src/mesa/drivers/dri
Age  Commit message  Author
2011-02-24i965: Unmap the correct pointer after discontiguous uploadChris Wilson
Fixes piglit/fbo-depth-sample-compare:
==14722== Invalid free() / delete / delete[]
==14722==    at 0x4C240FD: free (vg_replace_malloc.c:366)
==14722==    by 0x84FBBFD: intel_upload_unmap (intel_buffer_objects.c:695)
==14722==    by 0x85205BC: brw_prepare_vertices (brw_draw_upload.c:457)
==14722==    by 0x852F975: brw_validate_state (brw_state_upload.c:394)
==14722==    by 0x851FA24: brw_draw_prims (brw_draw.c:365)
==14722==    by 0x85F2221: vbo_exec_vtx_flush (vbo_exec_draw.c:389)
==14722==    by 0x85EF443: vbo_exec_FlushVertices_internal (vbo_exec_api.c:543)
==14722==    by 0x85EF49B: vbo_exec_FlushVertices (vbo_exec_api.c:973)
==14722==    by 0x86D6A16: _mesa_set_enable (enable.c:351)
==14722==    by 0x42CAD1: render_to_fbo (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722==    by 0x42CEE3: piglit_display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722==    by 0x42F508: display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722==  Address 0xc606310 is 0 bytes after a block of size 18,720 alloc'd
==14722==    at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==14722==    by 0x85202AB: copy_array_to_vbo_array (brw_draw_upload.c:256)
==14722==    by 0x85205BC: brw_prepare_vertices (brw_draw_upload.c:457)
==14722==    by 0x852F975: brw_validate_state (brw_state_upload.c:394)
==14722==    by 0x851FA24: brw_draw_prims (brw_draw.c:365)
==14722==    by 0x85F2221: vbo_exec_vtx_flush (vbo_exec_draw.c:389)
==14722==    by 0x85EF443: vbo_exec_FlushVertices_internal (vbo_exec_api.c:543)
==14722==    by 0x85EF49B: vbo_exec_FlushVertices (vbo_exec_api.c:973)
==14722==    by 0x86D6A16: _mesa_set_enable (enable.c:351)
==14722==    by 0x42CAD1: render_to_fbo (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722==    by 0x42CEE3: piglit_display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722==    by 0x42F508: display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34604 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-24intel: Protect against waiting on a NULL render target boChris Wilson
If we fall back to software rendering due to the render target being absent (GPU hang or other error in creating the named target), then we do not need to nor should we wait upon the results. Reported-by: Magnus Kessler <Magnus.Kessler@gmx.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34656 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-23intel: gen3 is particular sensitive to batch sizeChris Wilson
... and prefers a small batch, whereas gen4+ prefers a large batch to carry more state. Tuning with openarena/padman indicates that a batch size of just 4096 is best for those cases. Bugzilla: https://bugs.freedesktop.org/process_bug.cgi Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-23i915: And remember assign the new value to the state reg...Chris Wilson
Fixes regression from 298ebb78de8a6b6edf0aa0fe8d784d00bbc2930e. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34589 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-22i965: Increase Sandybridge point size clamp.Kenneth Graunke
255.875 matches the hardware documentation. Presumably this was a typo. Found by inspection. Not known to fix any issues. Reviewed-by: Eric Anholt <eric@anholt.net>
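The value 255.875 is exactly the largest number an unsigned fixed-point field with 8 integer bits and 3 fractional bits can hold. A minimal sketch of that arithmetic follows. The U8.3 encoding is an assumption here, since the commit message does not name the field format, and `ufixed_max` is a hypothetical helper, not a driver function:

```c
#include <stdint.h>

/* Hypothetical helper: largest value representable in an unsigned
 * fixed-point field with int_bits integer and frac_bits fractional bits.
 * Assumes int_bits + frac_bits < 32. */
static double ufixed_max(unsigned int_bits, unsigned frac_bits)
{
    uint32_t raw_max = (1u << (int_bits + frac_bits)) - 1u;

    /* Dividing by 2^frac_bits converts the raw integer to its real value. */
    return (double)raw_max / (double)(1u << frac_bits);
}
```

For an assumed U8.3 field the raw maximum is 2047, and 2047 / 8 = 255.875.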
2011-02-22i965/fs: Correctly set up gl_FragCoord.w on Sandybridge.Kenneth Graunke
pixel_w is the final result; wpos_w is used on gen4 to compute it. NOTE: This is a candidate for the 7.10 branch. Reviewed-by: Eric Anholt <eric@anholt.net>
2011-02-22i965/fs: Refactor control flow stack handling.Kenneth Graunke
We can't safely use fixed size arrays since Gen6+ supports unlimited nesting of control flow. NOTE: This is a candidate for the 7.10 branch. Reviewed-by: Eric Anholt <eric@anholt.net>
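One common way to drop a fixed-size array is a stack that grows on demand. The sketch below illustrates the idea only. `struct cf_stack` and its functions are hypothetical names and do not reflect how the driver actually stores its control flow state:

```c
#include <stdlib.h>

/* Hypothetical growable stack of instruction indices. */
struct cf_stack {
    int *data;
    size_t depth;
    size_t capacity;
};

/* Push one entry, doubling the storage when it is full.
 * Returns 0 on success, -1 if the allocation fails. */
static int cf_stack_push(struct cf_stack *s, int inst)
{
    if (s->depth == s->capacity) {
        size_t new_cap = s->capacity ? s->capacity * 2 : 16;
        int *p = realloc(s->data, new_cap * sizeof(*p));

        if (!p)
            return -1;
        s->data = p;
        s->capacity = new_cap;
    }
    s->data[s->depth++] = inst;
    return 0;
}

/* Pop the most recent entry; the caller must ensure depth > 0. */
static int cf_stack_pop(struct cf_stack *s)
{
    return s->data[--s->depth];
}
```

A zero-initialised `struct cf_stack` is a valid empty stack, and nesting depth is then limited only by available memory.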
2011-02-22i965/fs: Avoid register coalescing away gen6 MATH workarounds.Kenneth Graunke
The code that generates MATH instructions attempts to work around the hardware ignoring source modifiers (abs and negate) by emitting moves into temporaries. Unfortunately, this pass coalesced those registers, restoring the original problem. Avoid doing that. Fixes several OpenGL ES2 conformance failures on Sandybridge. NOTE: This is a candidate for the 7.10 branch. Reviewed-by: Eric Anholt <eric@anholt.net>
2011-02-22i965/fs: Apply source modifier workarounds to POW as well.Kenneth Graunke
Single-operand math already had these workarounds, but POW (the only two operand function) did not. It needs them too - otherwise we can hit assertion failures in brw_eu_emit.c when code is actually generated. NOTE: This is a candidate for the 7.10 branch. Reviewed-by: Eric Anholt <eric@anholt.net>
2011-02-22i965: Fix shaders that write to gl_PointSize on Sandybridge.Kenneth Graunke
gl_PointSize (VERT_RESULT_PSIZ) doesn't take up a message register, as it's part of the header. Without this fix, writing to gl_PointSize would cause the SF to read and use the wrong attributes, leading to all kinds of random-looking failures. Reviewed-by: Eric Anholt <eric@anholt.net>
2011-02-22i965: Trim the interleaved upload to the minimum number of verticesChris Wilson
... should have no impact on a properly formatted draw operation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-22i965: Reinstate max-index paranoiaChris Wilson
Don't trust the applications not to reference beyond the end of the vertex buffers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
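The kind of bound involved can be sketched as a clamp on how many vertices may be read from a buffer. This is an illustration under assumed parameters, not the driver's code, and `clamp_vertex_count` is a hypothetical helper:

```c
/* Hypothetical helper: limit the number of vertices read from a buffer of
 * buffer_size bytes, where vertex i occupies element_size bytes starting at
 * offset + i * stride. */
static unsigned clamp_vertex_count(unsigned requested, unsigned buffer_size,
                                   unsigned offset, unsigned stride,
                                   unsigned element_size)
{
    unsigned available;

    if (element_size > buffer_size || offset > buffer_size - element_size)
        return 0; /* not even the first element fits */
    if (stride == 0)
        return requested; /* every vertex reads the same element */

    /* Last valid index is (buffer_size - offset - element_size) / stride. */
    available = (buffer_size - offset - element_size) / stride + 1;
    return requested < available ? requested : available;
}
```

With a 100-byte buffer and 12-byte vertices packed back to back, only 8 vertices fit, so a request for 10 is clamped to 8.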
2011-02-22i965: Zero the offset into the vbo when uploading non-interleavedChris Wilson
Fixes regression from 559435d9152acc7162e4e60aae6591c7c6c8274b. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Fix VB packet reuse when offset for the new buffer isn't stride aligned.Eric Anholt
Fixes regression in scissor-stencil-clear and 5 other tests.
2011-02-21radeon: add default switch case to silence unhandled enum warningBrian Paul
2011-02-21intel: Fix insufficient integer width for upload buffer offsetChris Wilson
I was being overly miserly and gave the offset of the buffer into the bo insufficient bits, distracted by the adjacency of the buffer[4096]. Ref: https://bugs.freedesktop.org/show_bug.cgi?id=34541 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
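The failure mode is ordinary integer truncation. The sketch below assumes a 16-bit field purely for illustration, since the commit message does not state the real field widths, and `roundtrip_u16` is a hypothetical name:

```c
#include <stdint.h>

/* Hypothetical illustration: store an offset in a 16-bit field and read it
 * back. Offsets that need more than 16 bits silently lose their high bits. */
static uint32_t roundtrip_u16(uint32_t offset)
{
    uint16_t narrow = (uint16_t)offset; /* keeps only the low 16 bits */

    return narrow;
}
```

An offset within a 4096-byte staging array survives the round trip, but an offset into a larger bo, such as 65536 + 128, comes back as 128.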
2011-02-21i965: Remove spurious duplicate ADVANCE_BATCHChris Wilson
... a leftover from a bad merge. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i915: Emit a single relocation per vboChris Wilson
Reducing the number of relocations has lots of nice knock-on effects, not least reducing batch buffer size, auxiliary array sizes (vmalloced and copied into the kernel), processing of uncached relocations etc. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i915: Suppress emission of redundant stencil updatesChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i915: Separate BLEND from general context state.Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i915: Only flag context changes if the actual state is changedChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i915: suppress repeated sampler state emissionChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i915: Eliminate redundant CONSTANTS updatesChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Use compiler builtins when availableChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Micro-optimise check_stateChris Wilson
Replace the intermediate tests due to the logical or with the bitwise or. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
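The difference between the two forms can be sketched as follows. The struct and function names are illustrative stand-ins and not necessarily the driver's definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for a set of dirty-state bitmasks. */
struct state_flags {
    uint32_t mesa;
    uint32_t brw;
    uint32_t cache;
};

/* Logical or: short-circuits, so the compiler typically emits a test and
 * branch after each term. */
static bool check_state_logical(const struct state_flags *a,
                                const struct state_flags *b)
{
    return (a->mesa & b->mesa) || (a->brw & b->brw) || (a->cache & b->cache);
}

/* Bitwise or: combines all three masks, then performs a single test. */
static bool check_state_bitwise(const struct state_flags *a,
                                const struct state_flags *b)
{
    return ((a->mesa & b->mesa) | (a->brw & b->brw) |
            (a->cache & b->cache)) != 0;
}
```

Both functions return the same result for every input. The bitwise form is safe here because the operands have no side effects, so nothing depends on the short-circuit.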
2011-02-21intel: use throttle ioctl for throttlingChris Wilson
Rather than waiting on the first batch after the last swapbuffers to be retired, call into the kernel to wait upon the retirement of any request less than 20ms old. This has the twofold advantage of (a) not blocking any other clients from utilizing the device whilst we wait and (b) we attain higher throughput without overloading the system. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Remove unused 'next_free_page' memberChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: Skip the flush before read-pixels via blitChris Wilson
As we will flush when reading the return values of the blit, we can forgo the earlier flush. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: extend current vertex buffersChris Wilson
If the next vertex arrays are a (discontiguous) continuation of the current arrays, such that the new vertices are simply offset from the start of the current vertex buffer definitions, we can reuse those definitions and avoid the overhead of relocations and invalidations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
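A check of this kind might look like the sketch below: the new data can reuse the current buffer definition when it starts a whole number of strides past the current start. `can_extend` and its parameters are hypothetical and the driver's actual conditions may differ:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: can an array starting at new_offset reuse a vertex
 * buffer defined at cur_offset with the given stride? On success,
 * *start_vertex receives the vertex index at which the new data begins. */
static bool can_extend(uint32_t cur_offset, uint32_t new_offset,
                       uint32_t stride, uint32_t *start_vertex)
{
    uint32_t delta;

    if (stride == 0 || new_offset < cur_offset)
        return false;

    delta = new_offset - cur_offset;
    if (delta % stride != 0)
        return false; /* not stride aligned: needs a new definition */

    *start_vertex = delta / stride;
    return true;
}
```

When the check succeeds, the draw only needs a different start vertex rather than a new buffer definition and relocation.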
2011-02-21intel: Use specified alignment for writes into the upload bufferChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Clean up brw_prepare_vertices()Chris Wilson
Use a temporary glarray variable to replace the numerous input->glarray. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: combine short memcpy using a temporary allocated bufferChris Wilson
Using a temporary buffer for large discontiguous uploads into the common buffer and a single buffered upload is faster than performing the discontiguous copies through a mapping into the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: upload normal arrays as interleavedChris Wilson
Upload the non-vbo arrays into a single interleaved buffer object, so that we need to emit only a single vertex buffer relocation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: interleaved vboChris Wilson
If the user passed in several arrays interleaved in the same vbo, only emit a single vertex buffer and relocation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: emit one vb packet per vboChris Wilson
Track reuse of the vertex buffer objects and so minimise the number of vertex buffers used by the hardware (and their relocations). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
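Tracking reuse amounts to mapping each distinct buffer object to one slot. A minimal sketch, with hypothetical names and an illustrative table size:

```c
#include <stddef.h>

#define MAX_VERTEX_BUFFERS 16 /* illustrative limit, not the hardware's */

/* Hypothetical table of the buffer objects referenced so far. */
struct vb_table {
    const void *bo[MAX_VERTEX_BUFFERS];
    size_t count;
};

/* Return the slot already assigned to bo, or assign the next free one.
 * Returns -1 if the table is full. */
static int vb_slot_for_bo(struct vb_table *t, const void *bo)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->bo[i] == bo)
            return (int)i;

    if (t->count == MAX_VERTEX_BUFFERS)
        return -1;

    t->bo[t->count] = bo;
    return (int)t->count++;
}
```

Several arrays that live in the same buffer object then share a slot, so only `count` vertex buffers and relocations would need to be emitted.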
2011-02-21i965: upload transient indices into the same discontiguous bufferChris Wilson
As we now pack the indices into a common upload buffer, we can reuse a single CMD_INDEX_BUFFER packet and translate each invocation with a start vertex offset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: suppress repeat-emission of identical vertex elementsChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Move repeat-instruction-suppression to batchbuffer coreChris Wilson
Move the tracking of the last emitted instructions into the core batchbuffer routines and take advantage of the shadow batch copy to avoid extra memory allocations and copies. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: use pwrite for batchChris Wilson
It's faster. Not only is the memcpy more efficiently performed in the kernel (making up for the system call overhead), but by not using mmap we remove the greater overhead of tracking the vma of every batch. And it means we can read back from the batch buffer without incurring the cost of an uncached read through the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: drop state_bo references to batch_boChris Wilson
As we use state relocations and we know that all the state belongs to the same bo, we can drop the multiple references to the same bo. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: directly write wm state to batchChris Wilson
As we write directly into the batch in system memory, we do not need to write first to the stack (as was done to avoid reading back through the GTT). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: write cc straight to batchChris Wilson
As we write directly into the batch in system memory, we do not need to write first to the stack (as was done to avoid reading back through the GTT). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: switch gen6 to use its own cc state boChris Wilson
In preparation for a greater change, use the color_calc_state_bo already provisioned for this purpose. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: Buffered uploadChris Wilson
Rather than performing lots of little writes to update the common bo upon each update, write those into a static buffer and flush that when full (or at the end of the batch). Doing so gives a dramatic performance improvement over and above using mmaped access. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
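The buffering scheme can be sketched as a staging array that absorbs small writes and is flushed in one go. Everything here is illustrative: the names are hypothetical, and the flush only counts bytes where a real driver would write the staged data into the bo:

```c
#include <stddef.h>
#include <string.h>

#define STAGE_SIZE 4096 /* illustrative staging size */

/* Hypothetical staging state; the counters stand in for real bo writes. */
struct upload {
    unsigned char stage[STAGE_SIZE];
    size_t used;
    size_t flushes;
    size_t flushed_bytes;
};

/* Flush whatever is staged as one large write. */
static void upload_flush(struct upload *u)
{
    if (u->used == 0)
        return;
    /* A real driver would write stage[0..used) into the bo here. */
    u->flushes++;
    u->flushed_bytes += u->used;
    u->used = 0;
}

/* Stage a small write, flushing first if it would not fit. */
static void upload_write(struct upload *u, const void *data, size_t size)
{
    if (size > STAGE_SIZE) {
        /* Too large to stage: flush pending data, then write directly. */
        upload_flush(u);
        u->flushes++;
        u->flushed_bytes += size;
        return;
    }
    if (u->used + size > STAGE_SIZE)
        upload_flush(u);
    memcpy(u->stage + u->used, data, size);
    u->used += size;
}
```

In this sketch, fifty 100-byte writes cost two flushes (one when the staging array fills, one at the end of the batch) instead of fifty separate writes.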
2011-02-21intel: Replace the bo for a complete updateChris Wilson
Rather than performing a blit to completely overwrite a busy bo, simply discard it and create a new one with the fresh data. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Combine vb upload buffer with the general upload bufferChris Wilson
Reuse the new common upload buffer for uploading temporary indices and rebuilt vertex arrays. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: Pack dynamic draws togetherChris Wilson
Dynamic arrays have the tendency to be small and so allocating a bo for each one is overkill and we can exploit many efficiency gains by packing them together. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: Use system memory for DYNAMIC_DRAW source objectsChris Wilson
Dynamic draw buffers are used by clients for temporary arrays and for uploading normal vertex arrays. By keeping the data in memory, we can avoid reusing active buffer objects and reallocate them as they are changed. This is important for Sandybridge, which cannot issue blits within a batch and so ends up flushing the batch upon every update; that is, each batch only contains a single draw operation (if using dynamic arrays or regular arrays from system memory). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Trim the trailing NOOP from 3DSTATE_INDEX_BUFFERChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Fallback on encountering a NULL render bufferChris Wilson
Following a GPU hang, or other error, the render target is not likely to have an allocated BO and so we must fall back to avoid using it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32534 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>