Age | Commit message (Collapse) | Author |
|
Computation of the delta of this array from the last had a silly little
bug and ignored any initial delta==0 causing grief in Nexuiz and
friends.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
There is a silicon bug which causes unpredictable behaviour if the
URB_FENCE command should cross a cache-line boundary. Pad before the
command to avoid such occurrences. As this command only applies to
gen4/5, do the fixup unconditionally as the specs do not actually state
for which chip it was fixed (and the cost is negligible)...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
|
|
SNB has 64k urb space, we only use piece of them.
The more urb space we alloc,
the more concurrent vs threads we can run.
push the urb space usage to the limit.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
|
|
With relaxed relocation checking in the kernel, we can specify a
negative delta (i.e. pointing outside of the target bo) in order to fake
a range in a large buffer. We only then need to upload the elements used
and adjust the buffer offset such that they correspond with the indices
used in the DrawArrays.
(Depends on libdrm 0209428b3918c4336018da9293cdcbf7f8fedfb6)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This breaks nexuiz for unknown reason; disable until a true fix can be
found.
|
|
... handle all cases and not just the interleaved upload.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
... and take advantage of start_vertex_bias to trim to [min_index,
max_index] where possible (i.e. when we need to upload all arrays).
Fixes half_float_vertex(misc.fillmode.wireframe)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34595
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This reverts commit 4a3b28113c3d23ba21bb8b8f5ebab7c567083a6d, as it
caused a regression on Ironlake (bug #34646).
|
|
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
|
|
This adds the opcode and the code to convert ir_txd to OPCODE_TXD;
it doesn't actually add support yet.
|
|
Initial plumbing existed to turn the ir_txl into OPCODE_TXL, but it was
never handled.
|
|
Initial plumbing existed to turn the ir_txl into OPCODE_TXL, but it was
never handled.
|
|
The old value, BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE makes it sound like we're
doing a non-bias texture lookup. It has the same value as the new constant
BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_BIAS_COMPARE, so there should be no
functional changes.
|
|
From volume 4, page 161 of the public i965 documentation.
|
|
255.875 matches the hardware documentation. Presumably this was a typo.
NOTE: This is a candidate for the 7.10 branch, along with
commit 2bfc23fb86964e4153f57f2a56248760f6066033.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Fixes regression from aac120977d1ead319141d48d65c9bba626ec03b8.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34597
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Fixes oglc/vbo(basic.bufferdata)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34603
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Fixes piglit/fbo-depth-sample-compare:
==14722== Invalid free() / delete / delete[]
==14722== at 0x4C240FD: free (vg_replace_malloc.c:366)
==14722== by 0x84FBBFD: intel_upload_unmap (intel_buffer_objects.c:695)
==14722== by 0x85205BC: brw_prepare_vertices (brw_draw_upload.c:457)
==14722== by 0x852F975: brw_validate_state (brw_state_upload.c:394)
==14722== by 0x851FA24: brw_draw_prims (brw_draw.c:365)
==14722== by 0x85F2221: vbo_exec_vtx_flush (vbo_exec_draw.c:389)
==14722== by 0x85EF443: vbo_exec_FlushVertices_internal (vbo_exec_api.c:543)
==14722== by 0x85EF49B: vbo_exec_FlushVertices (vbo_exec_api.c:973)
==14722== by 0x86D6A16: _mesa_set_enable (enable.c:351)
==14722== by 0x42CAD1: render_to_fbo (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== by 0x42CEE3: piglit_display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== by 0x42F508: display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== Address 0xc606310 is 0 bytes after a block of size 18,720 alloc'd
==14722== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==14722== by 0x85202AB: copy_array_to_vbo_array (brw_draw_upload.c:256)
==14722== by 0x85205BC: brw_prepare_vertices (brw_draw_upload.c:457)
==14722== by 0x852F975: brw_validate_state (brw_state_upload.c:394)
==14722== by 0x851FA24: brw_draw_prims (brw_draw.c:365)
==14722== by 0x85F2221: vbo_exec_vtx_flush (vbo_exec_draw.c:389)
==14722== by 0x85EF443: vbo_exec_FlushVertices_internal (vbo_exec_api.c:543)
==14722== by 0x85EF49B: vbo_exec_FlushVertices (vbo_exec_api.c:973)
==14722== by 0x86D6A16: _mesa_set_enable (enable.c:351)
==14722== by 0x42CAD1: render_to_fbo (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== by 0x42CEE3: piglit_display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
==14722== by 0x42F508: display (in /home/ickle/git/piglit/bin/fbo-depth-sample-compare)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34604
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
255.875 matches the hardware documentation. Presumably this was a typo.
Found by inspection. Not known to fix any issues.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
pixel_w is the final result; wpos_w is used on gen4 to compute it.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
We can't safely use fixed size arrays since Gen6+ supports unlimited
nesting of control flow.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
The code that generates MATH instructions attempts to work around
the hardware ignoring source modifiers (abs and negate) by emitting
moves into temporaries. Unfortunately, this pass coalesced those
registers, restoring the original problem. Avoid doing that.
Fixes several OpenGL ES2 conformance failures on Sandybridge.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Single-operand math already had these workarounds, but POW (the only two
operand function) did not. It needs them too - otherwise we can hit
assertion failures in brw_eu_emit.c when code is actually generated.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
gl_PointSize (VERT_RESULT_PSIZ) doesn't take up a message register,
as it's part of the header. Without this fix, writing to gl_PointSize
would cause the SF to read and use the wrong attributes, leading to all
kinds of random looking failure.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
... should have no impact on a properly formatted draw operation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Don't trust the applications not to reference beyond the end of the
vertex buffers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Fixes regression from 559435d9152acc7162e4e60aae6591c7c6c8274b.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Fixes regression in scissor-stencil-clear and 5 other tests.
|
|
... a leftover from a bad merge.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Replace the intermediate tests due to the logical or with the bitwise
or.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If the next vertex arrays are a (discontiguous) continuation of the
current arrays, such that the new vertices are simply offset from the
start of the current vertex buffer definitions we can reuse those
defintions and avoid the overhead of relocations and invalidations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Use a temporary glarray variable to replace the numerous input->glarray.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Using a temporary buffer for large discontiguous uploads into the common
buffer and a single buffered upload is faster than performing the
discontiguous copies through a mapping into the GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Upload the non-vbo arrays into a single interleaved buffer object, and
so need to just emit a single vertex buffer relocation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If the user passed in several arrays interleaved in the same vbo, only
emit a single vertex buffer and relocation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Track reuse of the vertex buffer objects and so minimise the number of
vertex buffers used by the hardware (and their relocations).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we now pack the indices into a common upload buffer, we can reuse a
single CMD_INDEX_BUFFER packet and translate each invocation with a
start vertex offset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Move the tracking of the last emitted instructions into the core
batchbuffer routines and take advantage of the shadow batch copy to
avoid extra memory allocations and copies.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
It's faster. Not only is the memcpy more efficiently performed in the
kernel (making up for the system call overhead), but by not using mmap
we remove the greater overhead of tracking the vma of every batch.
And it means we can read back from the batch buffer without incurring
the cost of a uncached read through the GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we use state relocations and we know that all the state belongs to
the same bo, we can drop the multiple references to the same bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we write directly into the batch in system memory, we do not need to
write first to the stack (as was to avoid read back through the GTT)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we write directly into the batch in system memory, we do not need to
write first to the stack (as was to avoid read back through the GTT)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In preparation for a greater change, use the color_calc_state_bo already
provisioned for this purpose.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|