Age | Commit message | Author |
|
|
|
Fixes glsl-fs-neg and 5 other tests.
|
|
10 more piglit tests pass.
|
|
20 more piglit tests pass.
|
|
|
|
glsl-algebraic-rcp-rsq managed to use 33 registers, and we claimed to
only use 32, so the write to g32 would go stomping over the precious
g0 of some other thread.
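
The failure mode in this message can be sketched with a toy model. This is not the driver's allocation code: it assumes each thread gets a contiguous block of registers sized by the count the program claims, and the names `thread_base` and `write_reg` are hypothetical.

```python
# Toy model (assumption): a shared register file carved into per-thread
# blocks, where the block size is the register count the program *claims*.

def thread_base(thread_id, claimed_regs):
    """First slot of a thread's block in the shared file (hypothetical layout)."""
    return thread_id * claimed_regs

def write_reg(reg_file, thread_id, claimed_regs, reg_index, value):
    """Write g<reg_index> for a thread. No bounds check is performed."""
    reg_file[thread_base(thread_id, claimed_regs) + reg_index] = value

claimed = 32                        # what the program declared
reg_file = [None] * (claimed * 2)   # room for two threads

write_reg(reg_file, 1, claimed, 0, "thread1-g0")    # thread 1 sets up its g0
write_reg(reg_file, 0, claimed, 32, "thread0-g32")  # thread 0 uses a 33rd register

# Thread 0's g32 lands on the slot that holds thread 1's g0.
clobbered = reg_file[thread_base(1, claimed)]
```

In this model, claiming 33 registers instead of 32 moves thread 1's base to slot 33, so the write to g32 stays inside thread 0's block.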
|
|
This wouldn't catch the last failure fixed in them, because we don't
validate assignments well (due to the fact that we've got a pretty
glaring inconsistency in how we handle assignment writemasking), but
it could catch other failures we may produce.
|
|
|
|
|
|
We weren't smearing a component of a split RHS out to reach an unsplit
LHS's writemask, so gl_FragColor (always unsplit) would often get
uninitialized values.
Fixes: glsl-algebraic-add-add-1 (and probably many others).
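
A minimal sketch of why the smear matters, assuming a 4-channel destination and a bitmask writemask. The helpers `masked_write` and `smear` are hypothetical and only model the idea; they are not the driver's IR code.

```python
UNINIT = None  # stands in for an uninitialized channel

def masked_write(dst, src, writemask):
    """Copy src channels into dst wherever the writemask bit is set."""
    return [s if writemask & (1 << i) else d
            for i, (d, s) in enumerate(zip(dst, src))]

def smear(scalar, width=4):
    """Replicate one component across every channel (like an .xxxx swizzle)."""
    return [scalar] * width

dst = [0.0, 0.0, 0.0, 0.0]   # unsplit LHS, e.g. a vec4 output
rhs_scalar = 0.5             # one component of a split RHS
WRITEMASK_Z = 1 << 2         # the assignment targets only the z channel

# Without smearing, the value sits in channel 0 and z reads garbage.
unsmeared = [rhs_scalar, UNINIT, UNINIT, UNINIT]
bad = masked_write(dst, unsmeared, WRITEMASK_Z)

# With smearing, whichever channel the writemask selects holds the value.
good = masked_write(dst, smear(rhs_scalar), WRITEMASK_Z)
```

Here `bad` ends up with an uninitialized z channel, while `good` carries 0.5 in z and leaves the other channels untouched.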
|
|
|
|
|
|
|
|
Fix mingw build.
|
|
Thanks to Michal for spotting it.
|
|
Use an internal struct for line setup information.
|
|
Point sprites now done in the rasterizer setup code instead of
going through the draw module.
|
|
A few subpixel_snap and fixed width changes.
Conflicts:
src/gallium/drivers/llvmpipe/lp_setup_point.c
|
|
Conflicts:
src/gallium/drivers/llvmpipe/lp_setup_context.h
src/gallium/drivers/llvmpipe/lp_setup_line.c
src/gallium/drivers/llvmpipe/lp_setup_tri.c
|
|
|
|
|
|
Line rasterization that follows diamond exit rule.
Can still optimize logic for start/endpoints.
|
|
Rasterize lines directly by treating them as 4-sided polygons.
Still need to check the exact pixel rasterization.
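
One common way to treat a line as a 4-sided polygon is to offset both endpoints by half the line width along the perpendicular. The sketch below assumes that construction; the commit does not say which shape llvmpipe builds, and `line_to_quad` is a hypothetical name.

```python
import math

def line_to_quad(x0, y0, x1, y1, width=1.0):
    """Return the four corners of a rectangle covering a wide line."""
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("degenerate line: endpoints coincide")
    # Unit vector perpendicular to the line direction.
    nx, ny = -dy / length, dx / length
    # Half-width offset applied on both sides of the line.
    hx, hy = nx * width / 2.0, ny * width / 2.0
    return [
        (x0 + hx, y0 + hy),
        (x1 + hx, y1 + hy),
        (x1 - hx, y1 - hy),
        (x0 - hx, y0 - hy),
    ]

# A horizontal line of width 2 becomes a 4x2 rectangle around the x axis.
quad = line_to_quad(0.0, 0.0, 4.0, 0.0, width=2.0)
```

The resulting quad can then go through the same edge-function setup as any other polygon; exact pixel coverage rules (such as the diamond exit rule) are a separate concern that this sketch does not address.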
|
|
|
|
Looks nice, but makes almost no impact on performance - maybe
a percent or so in isosurf, nothing elsewhere. May be of use
later on.
|
|
Remove p_compiler.h.
|
|
Include p_compiler.h for boolean and INLINE symbols.
|
|
Include p_compiler.h for uint symbol.
|
|
Include p_compiler.h for uint symbol.
|
|
Remove p_compiler.h.
|
|
Include p_compiler.h for PUBLIC symbol.
|
|
Include p_compiler.h for uint32_t and boolean symbols.
|
|
|
|
This reverts commit bd25e23bf3740f59ce8859848c715daeb9e9821f.
Apart from introducing a lot of hex magic numbers and being highly impenetrable code,
it causes lots of lockups on an average piglit run that otherwise always runs without lockups.
Always run piglit before/after doing big things like this.
|
|
this adds handling for some more CF instructions and conditions
also adds parameter for stack size emission
These seem to pass on VS with the stack size hack but not on FS,
TODO: fix FS + stack size calcs
|
|
this makes op2 emission smaller, since it skips instructions
that don't write to the dst. not sure if this could have unwanted
side effects but try it and see.
|
|
though it isn't passing the test, and this instruction is pure bonghits.
|
|
|
|
+269 piglits
|
|
It hangs the GPU due to FB_WRITE handling being incomplete. There are
bigger issues to handle first.
|
|
|
|
I'm also fixing this upstream in libdrm, but this avoids new libdrm
dependency for the moment.
|
|
This should make debugging way easier, as now we have context for
reading large programs.
|
|
|
|
At least some tests, like glsl-vs-sign, now work.
|
|
This can successfully emit a real program that generates magenta now.
|
|
Our channel-expressions and vector-splitting changes now happen into a
private copy of the IR that we maintain for ourselves. Uniform
assignment still happens by the core, so we continue using Mesa IR
generation not just for swrast fallbacks but also for uniform values
(since there's no storage for their contents other than
shader_program->FragmentProgram->Parameters->ParameterValues). And
most importantly, at the moment no actual codegen is hooked up other
than emitting our favorite color to the framebuffer.
|
|
Combined with the previous pass, this lets other optimization passes
do their work thanks to ir_tree_grafting. Still have regression in
instruction count with INTEL_NEW_FS, but register count is even
better.
|
|
This is a step towards implementing a GLSL IR backend for the 965
fragment shader. Because it has downsides with the current codegen,
it is hidden under the environment variable INTEL_NEW_FS.
This results in an increase in instruction count at the moment (1444
-> 1752 for glsl-fs-raytrace, 345 -> 359 on my demo), because dot
products are turned into a series of multiplies and adds instead of a
custom expansion of MULs and MACs, and by not splitting the variable
types up we don't get tree grafting and thus there are extra moves of
temporary storage. However, register count drops for the non-GLSL
path (64 -> 56 on my demo shader) because the register allocator sees
all the sub-operations.
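
The instruction-count difference for dot products can be illustrated with a toy count. This is a sketch under the assumption that a MAC fuses one multiply and one add into a single instruction; the function names are hypothetical and the op lists are not real driver output.

```python
def dot_mul_add(a, b):
    """Dot product as separate multiplies followed by adds."""
    ops = []
    products = []
    for x, y in zip(a, b):
        products.append(x * y)
        ops.append("MUL")
    acc = products[0]
    for p in products[1:]:
        acc += p
        ops.append("ADD")
    return acc, ops

def dot_mul_mac(a, b):
    """Dot product as one MUL followed by multiply-accumulates."""
    acc = a[0] * b[0]
    ops = ["MUL"]
    for x, y in zip(a[1:], b[1:]):
        acc += x * y          # one fused multiply-accumulate
        ops.append("MAC")
    return acc, ops

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
value_add, ops_add = dot_mul_add(a, b)   # 4 MULs + 3 ADDs
value_mac, ops_mac = dot_mul_mac(a, b)   # 1 MUL + 3 MACs
```

Both forms compute the same value, but the multiply/add expansion of a 4-component dot product takes seven operations in this model against four for the MUL/MAC form, which is the direction of the regression the message reports.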
|
|
|