Age | Commit message (Collapse) | Author |
|
Fixes this GCC warning with linux-x86 build.
radeon_pair_regalloc.c: In function ‘compute_live_intervals’:
radeon_pair_regalloc.c:221: warning: ISO C90 forbids mixed declarations and code
|
|
This fixes incorrect behaviour when the stencil clear value exceeds
the size of the stencil buffer, for example, when set with:
glClearStencil (~1); /* Set a bit pattern of 111...11111110 */
glClear (GL_STENCIL_BUFFER_BIT);
The clear value needs to be masked by the value 2^m - 1, where m is the
number of bits in the stencil buffer. Previously, we passed the value
masked with 0x7fffffff to _mesa_StencilFuncSeparate which then clamps,
NOT masks the value to the range 0 to 2^m - 1.
The result would be clearing the stencil buffer to 0xff, rather than 0xfe.
Signed-off-by: Peter Clifton <pcjc2@cam.ac.uk>
Signed-off-by: Brian Paul <brianp@vmware.com>
|
|
This was really just for testing purposes.
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=31894
|
|
Fixes bugzilla #31832
NOTE: This is a candidate for the 7.9 branch.
|
|
Cuts the extra CMP instruction that used to precede SEL.
|
|
|
|
|
|
|
|
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=31193
NOTE: This is a candidate for the 7.9 branch.
|
|
|
|
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
|
|
|
|
In addition, the rename_reg pass has been rewritten to use
rc_get_readers().
|
|
Previously, presubtract operations where only being used by instructions
with less than three source source registers.
|
|
|
|
|
|
|
|
|
|
When the result of the alpha instruction is being replicated to the RGB
destination register, we do not need to use alpha's destination register.
This fixes an invalid "Too many hardware temporaries used" error in
the case where a transcendent operation writes to a temporary register
greater than max_temp_regs.
NOTE: This is a candidate for the 7.9 branch.
|
|
This fixes an invalid "Too many hardware temporaries used" error in the
case where a source reads from a temporary register with an index greater
than max_temp_regs and then the source is marked as unused before the
register allocation pass.
NOTE: This is a candidate for the 7.9 branch.
|
|
Reads of registers that where not written to within the same block were
not being tracked. So in a situations like this:
0: IF
1: ADD t0, t1, t2
2: MOV t2, t1
Instruction 2 didn't know that instruction 1 read from t2, so
in some cases instruction 2 was being scheduled before instruction 1.
NOTE: This is a candidate for the 7.9 branch.
|
|
NOTE: This is a candidate for the 7.9 branch.
|
|
NOTE: This is a candidate for the 7.9 branch.
|
|
|
|
|
|
|
|
|
|
This is quite common for multitexture sampling, and not only cuts down
on the second and later set of MOVs, but typically also allows
compute-to-MRF on the first set.
No statistically siginficant performance difference in nexuiz (n=3),
but it reduces instruction count in one of its shaders and seems like
a good idea.
|
|
We were skipping it if the instruction producing the value we were
going to compute-to-mrf used its result reg as a source reg. This
meant that the typical "write interpolated color to fragment color" or
"texture from interpolated texcoord" shader didn't compute-to-MRF.
Just don't check for the interference cases until after we've checked
if this is the instruction we wanted to compute-to-MRF.
Improves nexuiz high-settings performance on my laptop 0.48% +- 0.08%
(n=3).
|
|
On pre-gen6, this turns 4 instructions into 1. We could still do
better by folding the saturate into the instruction generating the
value if nobody else uses it, but that should be a separate pass.
|
|
This hits a common case with min/max operations.
|
|
|
|
This should make it a lot harder to forget to zero things.
|
|
Fixes glsl-fs-copy-propagation-texcoords-1.
|
|
|
|
This should save on the overhead of tree-walking and provide a
convenient place to add more instruction lowering in the future.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
The vector operator collects 2, 3, or 4 scalar components into a
vector. Doing this has several advantages. First, it will make
ud-chain tracking for components of vectors much easier. Second, a
later optimization pass could collect scalars into vectors to allow
generation of SWZ instructions (or similar as operands to other
instructions on R200 and i915). It also enables an easy way to
generate IR for SWZ instructions in the ARB_vertex_program assembler.
|
|
This may grow in the near future.
|
|
The operate just like ir_unop_sin and ir_unop_cos except that they
expect their inputs to be limited to the range [-pi, pi]. Several
GPUs require this limited range for their sine and cosine
instructions, so having these as operations (along with a to-be-written
lowering pass) helps this architectures.
These new operations also matche the semantics of the
GL_ARB_fragment_program SCS instruction. Having these as operations
helps in generating GLSL IR directly from assembly fragment programs.
|
|
Signed-off-by: Viktor Novotný <noviktor@seznam.cz>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
|
|
This should prevent the field going unset in the future. See bug
http://bugs.freedesktop.org/show_bug.cgi?id=31544 for background.
Also remove unneeded calls to clear_teximage_fields().
Finally, call _mesa_set_fetch_functions() from the
_mesa_init_teximage_fields() function so callers have one less
thing to worry about.
|
|
If an instruction writes reg but nothing later uses it, then we don't
need to bother doing it. Before, we were just killing code that was
never read after it was ever written.
This removes many interpolation instructions for attributes with only
a few comopnents used. Improves nexuiz high-settings performance .46%
+/- .12% (n=3) on my Ironlake.
|
|
|
|
|
|
|
|
|
|
|
|
Default group bytes to 512 on evergreen. Don't query
tiling config yet for evergreen, the current info returned is not
adequate for evergreen (no way to get bank info).
|