Age | Commit message (Collapse) | Author |
|
Fixes glsl-vs-if-nested (70.0 is not <= 70.000648 thanks to the
swizzle bits getting set). Some safety checks are added to make sure
this doesn't happen again as we increase the usage of immediate values
in program generation.
|
|
We'll need to use the HALT instruction to do this right, like returns
from other functions.
|
|
Fixes glsl-vs-dot-vec2.
|
|
This saves an extra message reg move in the program, though I'm not
clear on whether it will have any performance impact other than cache
footprint. It will also fix those math calls on Sandybridge, where
the brw_eu_emit.c brw_math() support relies on the implied move being
used.
|
|
|
|
|
|
|
|
|
|
Mixing stderr (_mesa_print_program, _mesa_print_instruction,
_mesa_print_alu) with stdout means that when writing both to a file,
there isn't a consistent ordering between the two.
|
|
|
|
mtypes.h does not use any symbols from compiler.h.
Also add the required headers for files that depended on symbols from
compiler.h but were indirectly including compiler.h through mtypes.h.
|
|
|
|
|
|
|
|
This pulls in multiple i965 driver fixes which will help ensure better
testing coverage during development, and also gets past the conflicts
of the src/mesa/shader -> src/mesa/program move.
Conflicts:
src/mesa/Makefile
src/mesa/main/shaderapi.c
src/mesa/main/shaderobj.h
|
|
Also fix up comments, so that the difference between the two passes is
clarified.
|
|
|
|
|
|
|
|
|
|
|
|
Several routines directly analyze the grf-to-mrf moves from the Gen
binary code. When it is possible, the mov is removed and the message
register is directly written in the arithmetic instruction
Also redundant mrf-to-grf moves are removed (frequently for example,
when sampling many textures with the same uv)
Code was tested with piglit, warsow and nexuiz on an Ironlake
machine. No regression was found there
Note that the optimizations are *deactivated* on Gen4 and Gen6 since I
did test them properly yet. No reason there are bugs but who knows
The optimizations are currently done in branch free programs *only*.
Considering branches is more complicated and there are actually two
paths: one for branch free programs and one for programs with branches
Also some other optimizations should be done during the emission
itself but considering that some code is shader between vertex shaders
(AOS) and pixel shaders (SOA) and that we may have branches or not, it
is pretty hard to both factorize the code and have one good set of
strategies
|
|
Clarifies program assembly, and with a little tweak to always use
constant_map, we could cut down on constant buffer payload.
|
|
This should be more useful for developers and for bug triaging than
just generating wrong code.
|
|
Fixes glsl-vs-arrays. Bug #27388.
|
|
Fixes glsl-vs-point-size.
|
|
This has confused me twice now. It's a fixed width of 4 (usually a
region description of <4,4,1>), not 1. If it was 1, we'd have been
skipping all over register space.
|
|
|
|
|
|
The ARL value is increments of vec4 in the register file. But
PROGRAM_TEMPORARY or PROGRAM_INPUT are stored as vec4s interleaved
between the two verts being executed (thus a vec8 each), compared to
PROGRAM_STATE_VAR being packed vec4s.
Fixes:
glsl-vs-arrays-2
glsl-vs-mov-after-deref
(without regressing glsl-vs-arrays-3)
|
|
|
|
The previous support was overly complicated by trying to use the same
1-OWORD message for both offsets.
|
|
|
|
|
|
Otherwise, the second half isn't written, and we end up reading back
black.
Fixes the remaining junk drawn in glsl-max-varyings, and will likely
help with a number of large real-world shaders.
|
|
They go into the render cache, so while we don't care about their
contents after execution, failing to note them could cause the writes
to be flushed over important buffer contents later.
|
|
|
|
Otherwise, the subsequent read may not get the written value.
|
|
|
|
To quiet a compiler warning.
|
|
There was confusion on both the size of message we can send, and on
what the URB destination offset means.
The remaining problems appear to be due to spilling of regs in the
fragment shader being broken.
|
|
|
|
This cleans up some chipset dependency sprinkled around, and fixes a
potential overflow of the attribute offset array for many vertex
results.
|
|
|
|
|
|
|
|
When EU executes 'wait' instruction, it stalls and sets notification
register state. Host can issue MMIO write to clear notification
register state to allow EU continue on executing again.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
|
|
|
|
|
|
|