Age | Commit message (Collapse) | Author |
|
Fixes everything-black with meta_clear_tris on quake4-mpdemo and doom3-demo.
Bug #18844, 22077.
(cherry picked from commit 81d555068408d4343d7627c8bedda5675f09bd21)
|
|
the driver used to overwrite grf0 then use implicit move by send instruction
to move contents of grf0 to mrf1. However, we must not overwrite grf0 since
it's still used later for fb write.
Instead, do the move directly do mrf1 (we could use implicit move from another
grf reg to mrf1 but since we need a mov to encode the data anyway it doesn't
seem to make sense).
I think the dp_READ/WRITE_16 functions may suffer from the same issue.
While here also remove unnecessary msg_reg_nr parameter from the dataport
functions since always message register 1 is used.
|
|
Thanks to branching, the state of c->current_const[i].index at the point
of emitting constant loads for this instruction may not match the actual
constant currently loaded in the reg at runtime. Fixes a regression in my
GLSL program for idr's class since b58b3a786aa38dcc9d72144c2cc691151e46e3d5.
|
|
glsl compiler will not generate OPCODE_SWZ, and as a first step it would
be translated away to a MOV anyway (why?), but later internally this opcode is
generated (for EXT_texture_swizzling).
(cherry picked from commit 4ef1f8e3b52a06fcf58f78c9c36738531b91dbac)
|
|
With 1D textures, GL_TEXTURE_WRAP_T should be ignored (only
GL_TEXTURE_WRAP_S should be respected). But the i965 hardware
seems to follow the value of GL_TEXTURE_WRAP_T even when sampling
1D textures.
This fix forces GL_TEXTURE_WRAP_T to be GL_REPEAT whenever 1D
textures are used; this allows the texture to be sampled
correctly, avoiding "imaginary" border elements in the T direction.
This bug was demonstrated in the Piglit tex1d-2dborder test.
With this fix, that test passes.
(cherry picked from commit ab6c4fa582972e25f8800c77b5dd5b3a83afc996)
|
|
When out of memory (in at least one case, triggered by a longrunning
memory leak), this code will segfault and crash. By checking for the
out-of-memory condition, the system can continue, and will report
the out-of-memory error later, a much preferable outcome.
(cherry picked from commit 44a4abfd4f8695809eaec07df8eeb191d6e017d7)
|
|
Noticed while debugging a weird 1D FBO testcase that left its existing
viewport and projection matrix in place when switching drawbuffers. Didn't
fix the testcase, though.
(cherry picked from commit 3a521d84ecc646fcc65fa3fe7c5f1fdbdebe8bc2)
|
|
I don't have a testcase for this, but it seems clearly wrong.
(cherry picked from commit dc657f3929fbe03275b3fae4ef84f02e74b51114)
|
|
Before, if the VP output something that is in the attributes coming into
the WM but which isn't used by the WM, then WM would end up reading subsequent
varyings from the wrong places. This was visible with a GLSL demo
using gl_PointSize in the VS and a varying in the WM, as point size is in
the VUE but not used by the WM. There is now a regression test in piglit,
glsl-unused-varying.
(cherry picked from commit 0f5113deed91611ecdda6596542530b1849bb161)
|
|
For the TXP instruction we check if the texcoord is really a 4-component
atttibute which requires the divide by W step. This check involved the
projtex_mask field. However, the projtex_mask field was being miscalculated
because of some confusion between vertex program outputs and fragment
program inputs.
1. Rework the size_masks calculation so we correctly set bits corresponding
to fragment program input attributes.
2. Rename projtex_mask to proj_attrib_mask since we're interested in more
than just texcoords (generic varying vars too).
3. Simply the indexing of the size_masks and proj_attrib_mask fields.
4. The tracker::active[] array was mis-dimensioned. Use MAX_PROGRAM_TEMPS
instead of a magic number.
5. Update comments, add new assertions.
With these changes the Lightsmark demo/benchmark renders correctly, until
we eventually hit a GPU lockup...
|
|
Make the use_const_buffer field per-program and only call the code which
updates the constant buffer's data if the flag is set.
This should undo the perf regression from 20f3497e4b6756e330f7b3f54e8acaa1d6c92052
|
|
dri_bo_subdata()
This wraps up the unfinished business from commit a9a363f8298e9d534e60e3d2869f8677138a1e7e
|
|
need to clamp point size to user set min/max values, even for constant
point size. Fixes glean pointAtten test.
|
|
The drm_intel_gem_bo_map_gtt() call that replaced dri_bo_map() is
producing errors like:
intel_bufmgr_gem.c:689: Error preparing buffer map 39 (vp_const_buffer): Invalid argument .
and returning NULL, causing a segfault in the memcpy().
Just reverting until we can get to the root issue...
|
|
Also fixes drawing to 3D texture depth levels.
|
|
This is a CPU win in general, but in particular reduces the pain of
Mesa's calculation of min/max indices in DrawElements (wtf?).
|
|
i915 actually supports up to 4 (according to header file - not tested),
i965 up to 16 (code already handled this but slightly broken), so don't use 2
for all chips, even though angular dependency is very high.
|
|
|
|
Fixes a regression from commit 2c30fd84dfa052949a117c78d932b58c1f88b446
seen with DRI1.
|
|
The READ message's msg_control value can be 0 or 1 to indicate that the
Oword should be read into the lower or upper half of the target register.
It seems that the other half of the register gets clobbered though. So
we read into two dest registers then use a MOV to combine the upper/lower
halves.
|
|
Now that we have real constant buffers, the demands on the CURBE are lessened.
When we use real VS/WM constant buffers we only use the CURBE for clip planes.
|
|
|
|
Also enable them all regardless of screen bpp, as 32 bpp what I've been
testing against, and haven't been able to detect any screen bpp-specific
troubles with them.
|
|
For some reason, MOV instructions using immediate src values don't seem
to work reliably on the GLSL path. Disable them for now (falling back to
const buffer reads). This fixes a bunch of glean glsl1 failures.
|
|
|
|
|
|
A scatter-read should be possible, but we're just using two READs for
the time being.
|
|
|
|
Calls to release_tmps() were causing the temps holding constants to get
recycled.
|
|
|
|
There's really no need for two negation fields. This came from the
GL_NV_fragment_program extension. The new, unified Negate bitfield applies
after the absolute value step.
|
|
This mostly came down to finding the right MRF incantation in the
brw_dp_READ_4_vs() function.
Note: this feature is still disabled (but getting close to done).
|
|
Hook up a constant buffer, binding table, etc for the VS unit.
This will allow using large constant buffers with vertex shaders.
The new code is disabled at this time (use_const_buffer=FALSE).
|
|
|
|
|
|
|
|
Plus, begin the new code for vertex shader const buffers.
|
|
Used to map drawables, textures and constant buffers to surface binding
table indexes.
|
|
Fixes mem leak observed with texcombine test.
|
|
This fixes the random results that were seen when fetching a constant
inside an IF/ELSE clause. Disabling the execution mask ensures that all
the components of the register are written.
|
|
|
|
|
|
Before, the instruction's CondUpdate field was mistakenly effecting the
constant-fetch operation.
Fixes progs/glsl/bump.c demo. But there are some other issues related
to condition flags and IF/ELSE that need investigation...
|
|
This speeds up OA on my GM45 by 21% (more than the original CPU cost of
the upload path). We might still be able to squeeze a few more percent out
by avoiding repeatedly mapping/unmapping buffers as we upload elements into
them.
|
|
|
|
|
|
|
|
|
|
Everything is in place now for using a true constant buffer for GLSL fragment
shaders. Still some bugs to find though.
|
|
We were accidentally clobbering the next register.
|