Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
The r300 compiler can eliminate unused uniforms and remap uniform locations
if their number surpasses hardware limits, so the limit is actually
NumParameters + NumUnusedParameters. This is important for some apps
under Wine to run.
Wine sometimes declares a uniform array of 256 vec4's and some Wine-specific
constants on top of that, so in total there is more uniforms than r300 can
handle. This was the main motivation for implementing the elimination
of unused constants.
We should allow drivers to implement fail & recovery paths where it makes
sense, so giving up too early especially when comes to uniforms is not
so good idea, though I agree there should be some hard limit for all drivers.
This patch fixes:
- glsl-fs-uniform-array-5
- glsl-vs-large-uniform-array
on drivers which can eliminate unused uniforms.
|
|
This reverts commit 7cb87dffce2c7a37f960f3a865cf92fd193dd8c5.
There were regressions (Bug #35244) and more review has been requested.
|
|
This is a step towards providing a direct route for drivers accepting
GLSL IR for codegen. Perhaps more importantly, it runs the fixed
function fragment program through the GLSL IR optimization. Having
seen how easy it is to make ugly fixed function texenv code that can
do unnecessary work, this may improve real applicatinos.
|
|
This is used in the upcoming fixed function shader_program generation,
and shader_program and ARB programs are together in this code until
both fragment and vertex ff get converted.
|
|
Since we're compiling/linking GLSL shaders we should check against
the shader uniform limits, not the legacy vertex/fragment program
parameter limits which are usually lower.
|
|
The gl_program_constants struct is for limits that are applicable to
any/all shader stages. Move the geometry shader-only fields into the
gl_constants struct.
Remove redundant MaxGeometryUniformComponents field too.
|
|
Without these checks we could create shaders with more samplers,
constants than the driver could handle. Fail linking rather than
dying later.
|
|
See https://bugs.freedesktop.org/show_bug.cgi?id=29418
|
|
For more info see fd.o bug 29418.
|
|
|
|
|
|
These files were for the ARB_vertex_program / ARB_fragement_program assembler.
|
|
|
|
Addresses excessive TEMP allocation in vertex shaders where all CONSTs are
stored into TEMPS at the start, but copy propagation was failing due to
the presence of IFs.
We could do something about loops, but ifs are easy enough.
|
|
The ACP may already be NULL, so don't try to make it NULL again.
This should fix bugzilla #34119.
|
|
Fixes glsl-vs-post-increment-01.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Both of these assertions are triggered by the test case in bugzilla
size of 0.
|
|
The original allocations use regs->regs as the context, so talloc will
happily ignore the context given here. Change it to match to clarify
that it isn't changing.
|
|
This revealed a bug in ra_get_spill_benefit where we only considered
the benefit of the first adjacency we were to remove, explaining some
of the ugly spilling I've seen in shaders. Because of the reduced
spilling, it reduces the runtime of glsl-fs-convolution-1 36.9% +/-
0.9% (n=5).
|
|
|
|
Reduces runtime of glsl-fs-convolution-1 another 13.9% +/- 0.6% (n=5).
|
|
This was recommended in the original paper, but I figued "make it run"
before "make it fast". Now we make it fast. Reduces the runtime of
glsl-fs-convolution-1 by 12.7% +/- 0.6% (n=5).
|
|
Our use of the register allocator in i965 is somewhat unusual.
Whereas most architectures would have a smaller set of registers with
fewer register classes and reuse that across compilation, we have 1,
2, and 4-register classes (usually) and a variable number up to 128
registers per compile depending on how many setup parameters and push
constants are present. As a result, when compiling large numbers of
programs (as with glean texCombine going through ff_fragment_shader),
we spent much of our CPU time in computing the q[] array. By keeping
a separate list of what the conflicts are for a particular reg, we
reduce glean texCombine time 17.0% +/- 2.3% (n=5).
We don't expect this optimization to be useful for 915, which will
have a constant register set, but it would be useful if we were switch
to this register allocator for Mesa IR.
|
|
Conflicts:
src/gallium/auxiliary/draw/draw_llvm.c
src/gallium/drivers/llvmpipe/lp_state_fs.c
src/glsl/ir_set_program_inouts.cpp
src/mesa/tnl/t_vb_program.c
|
|
|
|
Fixes these MSVC errors.
ir_to_mesa.cpp(2644) : error C2057: expected constant expression
ir_to_mesa.cpp(2644) : error C2466: cannot allocate an array of constant size 0
ir_to_mesa.cpp(2644) : error C2133: 'acp' : unknown size
ir_to_mesa.cpp(2646) : error C2070: 'ir_to_mesa_instruction *[]': illegal sizeof operand
ir_to_mesa.cpp(2709) : error C2070: 'ir_to_mesa_instruction *[]': illegal sizeof operand
ir_to_mesa.cpp(2718) : error C2070: 'ir_to_mesa_instruction *[]': illegal sizeof operand
|
|
This catches more opportunities than the prog_optimize.c code on
openarena's fixed function shaders turned to GLSL, mostly due to
looking at multiple source instructions for copy propagation
opportunities. It should also be much more CPU efficient than
prog_optimize.c's code.
|
|
Include mfeatures.h for feature tests.
|
|
This adds a new optional max_depth parameter (defaulting to 0) to
lower_if_to_cond_assign, and makes the pass only flatten if-statements
nested deeper than that.
By default, all if-statements will be flattened, just like before.
This patch also renames do_if_to_cond_assign to lower_if_to_cond_assign,
to match the new naming conventions.
|
|
|
|
|
|
|
|
|
|
|
|
This is the same as what the array dereference handler does.
Fixes piglit test glsl-link-struct-array (bugzilla #31648).
NOTE: This is a candidate for the 7.9 and 7.10 branches.
|
|
|
|
|
|
|
|
|
|
|