Age | Commit message | Author |
|
This finishes the implementation of the fragment color clamp control
for ARB_color_buffer_float. I don't wanna keep this stuff in a branch...
|
|
The scheduler and the register allocator are not good enough yet to deal
with the effects of the register rename pass. This was causing a 50%
performance drop in Lightsmark. The pass can be re-enabled once the
scheduler and the register allocator are more mature. r300 and r400
still need this pass, because it prevents a lot of shaders from using
too many texture indirections.
NOTE: This is a candidate for the 7.10 branch.
|
|
|
|
|
|
The compiler seriously needs a cleanup as far as the arrangement of functions
is concerned. It's hard to know whether some function has been implemented,
because there are so many places to search and it could be anywhere, under
any name.
|
|
It looks like the function was originally written for ARB_fragment_program.
NOTE: This is a candidate for the 7.9 branch.
|
|
In addition, the rename_reg pass has been rewritten to use
rc_get_readers().
|
|
|
|
|
|
Those are:
- dead-code elimination
- constant folding
- peephole (mainly copy propagation)
- register allocation
There are some bugs which I need to track down.
Also fix up the descriptions of all the debug options.
|
|
This cleans up the mess in r3xx_compile_fragment_program.
|
|
First list compiler passes in an array, then run the new function rc_run_compiler.
Every backend may need a different set of passes.
This cleans up the mess in r3xx_compile_vertex_program.
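
A minimal sketch of the pass-list idea, written in Python for brevity rather
than the driver's C. Everything here is hypothetical: CompilerPass,
run_compiler, and the two toy passes only illustrate "list the passes in an
array, then run them in order"; they are not the actual rc_run_compiler
interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Instruction = Tuple[str, str]  # (opcode, destination): a toy representation


@dataclass
class CompilerPass:
    name: str                                              # label for debug output
    run: Callable[[List[Instruction]], List[Instruction]]  # transforms the program
    enabled: bool = True                                   # a backend can switch a pass off


def run_compiler(program, passes):
    """Apply every enabled pass, in list order, and return the result."""
    for p in passes:
        if p.enabled:
            program = p.run(program)
    return program


# Two toy passes (assumptions, not real r300 passes).
def remove_nops(program):
    return [ins for ins in program if ins[0] != "NOP"]


def drop_repeats(program):
    out = []
    for ins in program:
        if not out or out[-1] != ins:  # skip an exact repeat of the previous instruction
            out.append(ins)
    return out


# Each backend supplies its own list of passes.
fragment_passes = [
    CompilerPass("remove nops", remove_nops),
    CompilerPass("drop repeats", drop_repeats),
]
vertex_passes = [
    CompilerPass("remove nops", remove_nops),
    CompilerPass("drop repeats", drop_repeats, enabled=False),
]

program = [("MOV", "t0"), ("NOP", ""), ("MOV", "t0"), ("ADD", "t1")]
fragment_result = run_compiler(program, fragment_passes)
vertex_result = run_compiler(program, vertex_passes)
```

Keeping the passes as data means each compile function reduces to building a
list and calling one runner.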
|
|
|
|
&c->Base == c.
|
|
I need to reduce the number of parameters of each compiler pass function.
This is part of a larger cleanup.
|
|
|
|
|
|
Wine likes to create a *lot* of constants, exceeding the size of the constant
file in hw.
|
|
i.e. relative addressing (mainly FS), saturate modifiers, exceeding
the maximum number of constants.
|
|
Single loops work, but nested loops do not.
|
|
|
|
The BGNLOOP and ENDLOOP instructions are now being used correctly, which
makes break and continue possible. The deadcode pass has been modified to
handle breaks, and the compiler is more careful about which loops are
unrolled.
|
|
This pass renames registers in order to make it easier for the pair
scheduler to group TEX instructions together.
This fixes fdo bug #28606
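
A rough sketch of why renaming helps, in Python rather than the driver's C.
It is not the pass from this commit: rename_registers, the tuple layout, and
the register names are made up, and the sketch assumes straight-line code,
whole-register writes, and that only temporaries are written more than once.

```python
def rename_registers(program, fresh_names):
    """Give every re-write of a register a fresh name and patch later reads.

    program: list of (opcode, dest, srcs) tuples.
    fresh_names: unused register names (the caller must ensure no collisions).
    """
    latest = {}       # original register -> name holding its current value
    written = set()   # registers that have been written at least once
    fresh = iter(fresh_names)
    out = []
    for opcode, dest, srcs in program:
        # Reads see the most recent name of each source register.
        srcs = tuple(latest.get(s, s) for s in srcs)
        # A second write to the same register gets a new name, which removes
        # the false dependency on the earlier value.
        new_dest = next(fresh) if dest in written else dest
        written.add(dest)
        latest[dest] = new_dest
        out.append((opcode, new_dest, srcs))
    return out


program = [
    ("TEX", "t0", ("in0",)),
    ("MOV", "out0", ("t0",)),
    ("TEX", "t0", ("in1",)),   # reuses t0, so it cannot move above the MOV
    ("MOV", "out1", ("t0",)),
]
renamed = rename_registers(program, ["t1", "t2"])
```

After renaming, the second TEX writes t1 and no longer conflicts with the
first MOV, so a scheduler is free to place both TEX instructions together.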
|
|
Signed-off-by: Marek Olšák <maraeo@gmail.com>
|
|
|
|
|
|
This also allows us to split the loop emulation into two phases: a
transformation phase, which either unrolls loops or prepares them to be
emulated, and the emulation phase, which unrolls the remaining loops until
the instruction limit is reached. The second phase runs after the deadcode
analysis in order to get a more accurate count of the number of
instructions in the body of loops.
|
|
It is not perfect, but it is the best we got.
|
|
The loop emulation unrolls loops as many times as possible while still
keeping the shader program below the maximum instruction limit. At this
point, there are no checks for constant conditionals. This is only enabled
for fragment shaders.
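
The unroll count described above can be sketched as simple arithmetic. This
is an illustration in Python, not the driver's C code; unroll_count and
emulate_loop are hypothetical names, and the sketch assumes a single loop and
treats a program of exactly max_insts instructions as still acceptable.

```python
def unroll_count(total_insts, body_insts, max_insts):
    """How many copies of the loop body fit within the instruction limit."""
    if body_insts <= 0:
        raise ValueError("loop body must contain at least one instruction")
    outside = total_insts - body_insts           # instructions not in the loop body
    return max(0, (max_insts - outside) // body_insts)


def emulate_loop(prologue, body, epilogue, max_insts):
    """Replace the loop with as many copies of its body as the limit allows."""
    total = len(prologue) + len(body) + len(epilogue)
    n = unroll_count(total, len(body), max_insts)
    return prologue + body * n + epilogue


prologue = ["MOV", "MOV"]
body = ["TEX", "MAD", "ADD"]
epilogue = ["MOV"]
unrolled = emulate_loop(prologue, body, epilogue, max_insts=12)
```

With 3 instructions outside the loop and a 3-instruction body, a limit of 12
leaves room for 3 copies of the body.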
|
|
Needed for vertex shaders too.
|
|
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
|
|
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
|
|
Yes, I'm fully aware this generates subpar code on r500.
|
|
Stuff's starting to show up in arbnpot.
|
|
|
|
This may break the vert compiler. Hopefully not.
|
|
instructions
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Note that control flow instruction support isn't actually fully functional yet.
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
This replaces the old NQSSADCE code with code that has the same functionality
but a quite different design. Instead of doing a single integrated pass, we
now build explicit data structures representing the dataflow.
This will enable analysis of flow control instructions, and could potentially
open an avenue for several dataflow-based optimizations, such as peephole
optimization, fusing MUL+ADD to MAD, and so on.
|
|
In particular, this removes the dependency on prog_instruction, which
unfortunately creates some code duplication, but also opens a path towards
adding some hardware-specific things in there.
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Attribute indices will probably be different in Gallium, so make the compiler
independent of magic values.
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|
|
Signed-off-by: Nicolai Hähnle <nhaehnle@gmail.com>
|