Age | Commit message (Collapse) | Author |
|
There's a bunch of stuff from gen4asm and gpu-tools that we probably want
to make into a library instead of cargo-culting it around.
|
|
|
|
Bug #20821
|
|
This avoids sending a bad buffer address to the GPU due to programmer error,
and is permitted by the ARB_vbo spec. Note that we still have the opportunity
to dereference past the end of the GPU, because we aren't clipping to a
correct _MaxElement, but that appears to be harder than it should be. This
gets us the 90% solution.
Bug #19911.
|
|
See comment on Vertex URB Entry Read Length for VS_STATE.
This, combined with the previous three commits, fixes #22945.
|
|
This fix is just from code and docs inspection, but it may fix hangs on
some applications.
|
|
It appears that sometimes Mesa (and I suppose a VS could as well) emits
a program which references no vertex data, and thus we end up with
nr_enabled == 0 even though some VBs are enabled. We'd end up emitting
VB/VE packet headers of 0xffffffff in that case, leading to GPU hangs.
Bug #22945 (wine with an uncompiled VS)
|
|
The code duplication bothered me.
|
|
The LOOP/ENDLOOP pair is renamed to BGNFOR/ENDFOR as its behaviour
is similar to a C language for-loop.
The BGNLOOP2/ENDLOOP2 pair is renamed to BGNLOOP/ENDLOOP as now
there is no name collision.
|
|
In addition, it guarantees ff_sync message is issued
|
|
Previously, the FOGC attribute contained the fragment fog coord, front/back-
face flag and the gl_PointCoord.xy values. Now each of those things are
separate fragment program attributes. This simplifies quite a few things in
Mesa and gallium.
Need to test i965 driver and fix up point coord handling in the gallium/draw
module...
|
|
Fixes everything-black with meta_clear_tris on quake4-mpdemo and doom3-demo.
Bug #18844, 22077.
(cherry picked from commit 81d555068408d4343d7627c8bedda5675f09bd21)
|
|
Fixes everything-black with meta_clear_tris on quake4-mpdemo and doom3-demo.
Bug #18844, 22077.
|
|
|
|
|
|
1. new PCI ids
2. fix some 3D commands on new chipset
3. fix send instruction on new chipset
4. new VUE vertex header
5. ff_sync message (added by Zou Nan Hai <nanhai.zou@intel.com>)
6. the offset in JMPI is in unit of 64bits on new chipset
7. new cube map layout
|
|
This state flag has been unused since the ffvertex_prog move to core.
|
|
the driver used to overwrite grf0 then use implicit move by send instruction
to move contents of grf0 to mrf1. However, we must not overwrite grf0 since
it's still used later for fb write.
Instead, do the move directly do mrf1 (we could use implicit move from another
grf reg to mrf1 but since we need a mov to encode the data anyway it doesn't
seem to make sense).
I think the dp_READ/WRITE_16 functions may suffer from the same issue.
While here also remove unnecessary msg_reg_nr parameter from the dataport
functions since always message register 1 is used.
|
|
Thanks to branching, the state of c->current_const[i].index at the point
of emitting constant loads for this instruction may not match the actual
constant currently loaded in the reg at runtime. Fixes a regression in my
GLSL program for idr's class since b58b3a786aa38dcc9d72144c2cc691151e46e3d5.
|
|
1. the data type of <src1> (JMPI offset) must be D
2. execution size must be 1
3. NoMask
4. instruction compression isn't allowed.
|
|
This improves the performance of my GLSL demo by 30%. It also fixes the
VS deadlock that ut2004 had, for reasons I can't explain. Bug #21330.
|
|
If we can't fit all the VS outputs into the MRF, we need to overflow into
temporary GRF registers, then use some MOVs and a second brw_urb_WRITE()
instruction to place the overflow vertex results into the URB.
This is hit when a vertex/fragment shader pair has a large number of varying
variables (12 or more).
There's still something broken here, but it seems close...
|
|
|
|
|
|
|
|
|
|
|
|
|
|
glsl compiler will not generate OPCODE_SWZ, and as a first step it would
be translated away to a MOV anyway (why?), but later internally this opcode is
generated (for EXT_texture_swizzling).
(cherry picked from commit 4ef1f8e3b52a06fcf58f78c9c36738531b91dbac)
|
|
With 1D textures, GL_TEXTURE_WRAP_T should be ignored (only
GL_TEXTURE_WRAP_S should be respected). But the i965 hardware
seems to follow the value of GL_TEXTURE_WRAP_T even when sampling
1D textures.
This fix forces GL_TEXTURE_WRAP_T to be GL_REPEAT whenever 1D
textures are used; this allows the texture to be sampled
correctly, avoiding "imaginary" border elements in the T direction.
This bug was demonstrated in the Piglit tex1d-2dborder test.
With this fix, that test passes.
(cherry picked from commit ab6c4fa582972e25f8800c77b5dd5b3a83afc996)
|
|
When out of memory (in at least one case, triggered by a longrunning
memory leak), this code will segfault and crash. By checking for the
out-of-memory condition, the system can continue, and will report
the out-of-memory error later, a much preferable outcome.
(cherry picked from commit 44a4abfd4f8695809eaec07df8eeb191d6e017d7)
|
|
Noticed while debugging a weird 1D FBO testcase that left its existing
viewport and projection matrix in place when switching drawbuffers. Didn't
fix the testcase, though.
(cherry picked from commit 3a521d84ecc646fcc65fa3fe7c5f1fdbdebe8bc2)
|
|
I don't have a testcase for this, but it seems clearly wrong.
(cherry picked from commit dc657f3929fbe03275b3fae4ef84f02e74b51114)
|
|
Before, if the VP output something that is in the attributes coming into
the WM but which isn't used by the WM, then WM would end up reading subsequent
varyings from the wrong places. This was visible with a GLSL demo
using gl_PointSize in the VS and a varying in the WM, as point size is in
the VUE but not used by the WM. There is now a regression test in piglit,
glsl-unused-varying.
(cherry picked from commit 0f5113deed91611ecdda6596542530b1849bb161)
|
|
For the TXP instruction we check if the texcoord is really a 4-component
atttibute which requires the divide by W step. This check involved the
projtex_mask field. However, the projtex_mask field was being miscalculated
because of some confusion between vertex program outputs and fragment
program inputs.
1. Rework the size_masks calculation so we correctly set bits corresponding
to fragment program input attributes.
2. Rename projtex_mask to proj_attrib_mask since we're interested in more
than just texcoords (generic varying vars too).
3. Simply the indexing of the size_masks and proj_attrib_mask fields.
4. The tracker::active[] array was mis-dimensioned. Use MAX_PROGRAM_TEMPS
instead of a magic number.
5. Update comments, add new assertions.
With these changes the Lightsmark demo/benchmark renders correctly, until
we eventually hit a GPU lockup...
|
|
the driver used to overwrite grf0 then use implicit move by send instruction
to move contents of grf0 to mrf1. However, we must not overwrite grf0 since
it's still used later for fb write.
Instead, do the move directly do mrf1 (we could use implicit move from another
grf reg to mrf1 but since we need a mov to encode the data anyway it doesn't
seem to make sense).
I think the dp_READ/WRITE_16 functions may suffer from the same issue.
While here also remove unnecessary msg_reg_nr parameter from the dataport
functions since always message register 1 is used.
|
|
It's the last addressable byte, not the byte after the end of the buffer.
|
|
I noticed this when this MI_FLUSH showed up in IPEHR for the ut2004 hang.
Not setting the reserved bit didn't help, though.
|
|
Fixes shadowtex.c. And an assert is added to catch this sooner next time.
|
|
|
|
We could have mapped the wrong set of draw buffers. Noticed while looking
into a DRI2 glean ReadPixels issue.
|
|
|
|
|
|
|
|
|
|
We were packing according to the pitch, while the hardware appears to base
it on the base level width.
With this and the previous commit, fbo-cubemap now matches untiled behavior.
|
|
3D rendering to tiled textures was being done with non-tile-aligned offsets.
The G4X hardware has fields to let us support it easily and correctly, while
the pre-G4X hardware requires a path full of suffering, so we just fall back.
|
|
Conflicts:
src/mesa/main/api_validate.c
|
|
glsl compiler will not generate OPCODE_SWZ, and as a first step it would
be translated away to a MOV anyway (why?), but later internally this opcode is
generated (for EXT_texture_swizzling).
|
|
...rather than with linear interpolation. Modern hardware should use
perspective-corrected interpolation for colors (as for texcoords).
glHint(GL_PERSPECTIVE_CORRECTION_HINT, mode) can be used to get
linear interpolation if mode = GL_FASTEST.
|