Age | Commit message | Author |
|
Fixes compiler warnings.
|
|
This increases the chance that GLSL programs will actually work.
Note that continues and returns are not yet lowered, so linking
will just fail if not supported.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
An operation involving a vector immediate requires its destination to
be 128-bit aligned and the destination horizontal stride to be
equivalent to 2 bytes. Fixes bad pixel_x results from gl_FragCoord,
where each pair had the same value.
|
|
The default type conversion for MOV should be fine, and RNDZ actually
requires two instructions.
|
|
|
|
Dumping back to potentially 16-wide dispatch doesn't really work out
at the moment, and hopefully I'll just be able to resolve all the
failures so we never have to do this at all.
|
|
|
|
This includes a handy little safety check to prevent the loop from
going "too long", as permitted by the spec. I haven't gone out of my
way to test it, though…
Fixes 20 more piglit tests.
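
A minimal sketch of the kind of safety check described above, assuming
a simple iteration cap. The function name and the cap parameter are
hypothetical and are not taken from the driver:

```c
/* Hypothetical model of a loop guarded by an iteration cap.
 * Counts upward from 0 until `target` is reached, but gives up
 * after `max_iterations` passes so a loop whose exit condition
 * can never become true still terminates. */
static int count_until(int target, int max_iterations)
{
    int i = 0;

    while (i != target) {
        /* Safety check: bail out instead of spinning forever. */
        if (i >= max_iterations)
            break;
        i++;
    }
    return i;
}
```

With a reachable target the cap never triggers. With an unreachable
one (such as a negative target here) the loop stops at the cap.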
|
|
Fixes 3 testcases related to discard.
|
|
Like the comparison operations, this suffered from CMP only setting
the low bit. Doing the AND instructions would be the same instruction
count as the more obvious conditional moves, so do cond moves.
Fixes glsl-fs-sign and 6 other cases, like trig functions that use
sign() internally.
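
As a rough scalar model of the conditional-move approach (an
illustration only, not the generated EU code; the function name is
made up for this sketch):

```c
/* Hypothetical scalar model of sign() built from compares and
 * conditional moves. Each `if` stands in for a compare followed by
 * a predicated move; nothing here depends on which bits the compare
 * leaves in a result register. */
static float sign_via_cond_moves(float x)
{
    float result = 0.0f;      /* start from 0.0 */

    if (x > 0.0f)
        result = 1.0f;        /* compare greater, then conditional move */
    if (x < 0.0f)
        result = -1.0f;       /* compare less, then conditional move */
    return result;
}
```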
|
|
Fixes 5 piglit tests for bias. Note that LOD is a 1.30 feature and
not yet supported.
|
|
Fixes 11 piglit tests.
|
|
Fixes build on Linux/GCC 4.4 as libdrm includes are also used by other
brw_fs_*.cpp files.
Bug #29855
|
|
Fixes glsl-algebraic-sub-zero-4.
|
|
Fixes 4 piglit tests about min, max, and clamp.
|
|
When it says it sets the LSB, that's not just a hint as to where the
result goes. Only the LSB is modified. Fixes 20 piglit cases.
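
To illustrate why this matters, here is a small model of a compare
that only touches bit 0 of its destination. The helper names are
hypothetical, and masking with 1 is just one way to get a clean
boolean out of such a result:

```c
#include <stdint.h>

/* Hypothetical model: the compare writes only the least significant
 * bit of dst and leaves whatever was in the upper 31 bits. */
static uint32_t cmp_sets_lsb_only(uint32_t dst, int condition)
{
    return (dst & ~UINT32_C(1)) | (condition ? UINT32_C(1) : UINT32_C(0));
}

/* Mask off the stale upper bits to recover a clean 0/1 value. */
static uint32_t clean_bool(uint32_t raw)
{
    return raw & UINT32_C(1);
}
```

Testing the raw register against zero gives the wrong answer whenever
the stale upper bits are nonzero, which is the failure mode described
above.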
|
|
When we're trying to do integer ops, handing a float in doesn't help.
|
|
Fixes:
glsl-fs-any.
glsl1-integer division with uniform var
|
|
Fixes glsl-fs-mod.
|
|
Fixes glsl-fs-neg and 5 other tests.
|
|
10 more piglit tests pass.
|
|
20 more piglit tests pass.
|
|
|
|
glsl-algebraic-rcp-rsq managed to use 33 registers, and we claimed to
only use 32, so the write to g32 would go stomping over the precious
g0 of some other thread.
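
The arithmetic behind this is a plain off-by-one, sketched below with
hypothetical helper names (the real register accounting lives in the
driver and is not reproduced here):

```c
/* Registers are numbered from g0, so a program whose highest
 * register is gN uses N + 1 registers. */
static int grf_count_from_highest_index(int highest_index)
{
    return highest_index + 1;
}

/* A write to register `reg_index` stays inside the thread's own
 * allocation only if the index is below the claimed count. */
static int write_is_in_bounds(int reg_index, int claimed_count)
{
    return reg_index >= 0 && reg_index < claimed_count;
}
```

With a claimed count of 32, a write to g32 is out of bounds. Claiming
33 makes it valid.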
|
|
This wouldn't catch the last failure fixed in them, because we don't
validate assignments well (due to the fact that we've got a pretty
glaring inconsistency in how we handle assignment writemasking), but
it could catch other failures we may produce.
|
|
|
|
|
|
+269 piglits
|
|
It hangs the GPU due to FB_WRITE handling being incomplete. There are
bigger issues to handle first.
|
|
|
|
I'm also fixing this upstream in libdrm, but this avoids new libdrm
dependency for the moment.
|
|
This should make debugging way easier, as now we have context for
reading large programs.
|
|
|
|
At least some tests, like glsl-vs-sign, now work.
|
|
This can successfully emit a real program that generates magenta now.
|
|
Our channel-expressions and vector-splitting changes are now applied
to a private copy of the IR that we maintain for ourselves. Uniform
assignment is still done by the core, so we continue using Mesa IR
generation not just for swrast fallbacks but also for uniform values
(since there's no storage for their contents other than
shader_program->FragmentProgram->Parameters->ParameterValues). And
most importantly, at the moment no actual codegen is hooked up other
than emitting our favorite color to the framebuffer.
|
|
Combined with the previous pass, this lets other optimization passes
do their work thanks to ir_tree_grafting. Still have regression in
instruction count with INTEL_NEW_FS, but register count is even
better.
|
|
This is a step towards implementing a GLSL IR backend for the 965
fragment shader. Because it has downsides with the current codegen,
it is hidden under the environment variable INTEL_NEW_FS.
This results in an increase in instruction count at the moment (1444
-> 1752 for glsl-fs-raytrace, 345 -> 359 on my demo), because dot
products are turned into a series of multiplies and adds instead of a
custom expansion of MULs and MACs, and by not splitting the variable
types up we don't get tree grafting and thus there are extra moves of
temporary storage. However, register count drops for the non-GLSL
path (64 -> 56 on my demo shader) because the register allocator sees
all the sub-operations.
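
A scalar sketch of the two dot-product expansions mentioned above,
with a simple operation counter. The function names and the counting
scheme are assumptions made for illustration. They model the shape of
the two expansions, not the actual instruction counts quoted above:

```c
/* Expansion 1: one multiply per channel, then adds to combine.
 * For 4 channels this is 4 multiplies + 3 adds. */
static float dot4_mul_add(const float a[4], const float b[4], int *ops)
{
    float t[4];
    float sum;
    int i;

    for (i = 0; i < 4; i++) {
        t[i] = a[i] * b[i];   /* multiply */
        (*ops)++;
    }
    sum = t[0];
    for (i = 1; i < 4; i++) {
        sum = sum + t[i];     /* add */
        (*ops)++;
    }
    return sum;
}

/* Expansion 2: one multiply to seed the accumulator, then one
 * multiply-accumulate per remaining channel (1 MUL + 3 MACs). */
static float dot4_mul_mac(const float a[4], const float b[4], int *ops)
{
    float acc = a[0] * b[0];  /* multiply */
    int i;

    (*ops)++;
    for (i = 1; i < 4; i++) {
        acc += a[i] * b[i];   /* multiply-accumulate, counted as one op */
        (*ops)++;
    }
    return acc;
}
```

In this model the multiply-and-add form takes 7 operations for a
4-component dot product and the MUL/MAC form takes 4, which is the
direction of the instruction-count regression described above.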
|
|
|