Age | Commit message (Collapse) | Author |
|
I don't have evidence for this amounting to any improvement,
but it does codify a bit more what we understand so far about
the pipeline.
|
|
|
|
This fixes a bunch of unnecessary barriers due to the scheduler not
knowing what that arbitrary register description refers to when trying
to reason about its dependencies.
The result is rescheduling in the convolution kernel shader in
Lightsmark, which results in avoiding register spilling and increasing
the performance of the first scene from 6-7 fps midway through the
panning to 11fps. The register spilling was a regression from Mesa
7.9 to Mesa 7.10.
|
|
Improves performance of my GLSL demo by 5.1% (+/- 1.4%, n=7). It also
reschedules the giant multiply tree at the end of
glsl-fs-convolution-1 so that we end up not spilling registers,
producing the expected level of performance.
|
|
|
|
Drivers should override the default range/precision info as needed.
No drivers do this yet.
|
|
This is a candidate for 7.9 and 7.10
|
|
|
|
(really not sure why I'm doing this).
This is a candidate for 7.9 and 7.10 branches.
|
|
Can happen during swrast fallbacks if a buffer is somehow bound as
a render target and a texture.
Fixes gnome-shell on nv20, and gets it mostly working on nv10.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
sw clears were being used and not getting the correct offsets in the span
code.
also not emitting correct offsets for CB draws to texture levels.
(I've no idea why I'm playing with r100).
This is a candidate for 7.9 and 7.10
|
|
|
|
Fixes piglit glsl-fs-texture2d-branching. I couldn't come up with a
testcase that didn't involve dead code, but it's still worthwhile to
fix I think.
|
|
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=33247
There might still be some issues with drawing multiple instances
with VBO splitting to investigate someday.
|
|
This revealed a bug in ra_get_spill_benefit where we only considered
the benefit of the first adjacency we were to remove, explaining some
of the ugly spilling I've seen in shaders. Because of the reduced
spilling, it reduces the runtime of glsl-fs-convolution-1 36.9% +/-
0.9% (n=5).
|
|
|
|
Reduces runtime of glsl-fs-convolution-1 another 13.9% +/- 0.6% (n=5).
|
|
This was recommended in the original paper, but I figued "make it run"
before "make it fast". Now we make it fast. Reduces the runtime of
glsl-fs-convolution-1 by 12.7% +/- 0.6% (n=5).
|
|
Our use of the register allocator in i965 is somewhat unusual.
Whereas most architectures would have a smaller set of registers with
fewer register classes and reuse that across compilation, we have 1,
2, and 4-register classes (usually) and a variable number up to 128
registers per compile depending on how many setup parameters and push
constants are present. As a result, when compiling large numbers of
programs (as with glean texCombine going through ff_fragment_shader),
we spent much of our CPU time in computing the q[] array. By keeping
a separate list of what the conflicts are for a particular reg, we
reduce glean texCombine time 17.0% +/- 2.3% (n=5).
We don't expect this optimization to be useful for 915, which will
have a constant register set, but it would be useful if we were switch
to this register allocator for Mesa IR.
|
|
Hopefully better than previous - this passes more mipgen tests
|
|
border color is RGBA for samples - this passes texenv tests
|
|
use introduced STATE_FB_WPOS_Y_TRANSFORM variable (thanks Marek)
this gets coords also right when using fbo
|
|
Fixes texrect-many regression with ff_fragment_shader -- as we added
refs to the subsequent texcoord scaling paramters, the array got
realloced to a new address while our params[] still pointed at the old
location.
|
|
|
|
Fixes a VTK regression after adding GL_ARB_draw_instanced.
|
|
|
|
primcount is also a parameter to glMultiDrawElements(). Use numInstances
to avoid confusion between these things.
|
|
|
|
|
|
|
|
This uses a sampler view to access the texture with the alternate format.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
We just choose the texture format depending on the srgb decode bit
for the sRGB formats.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This implements the extension by choosing a different set of texture
fetch functions when the texture parameter changes.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Conflicts:
src/gallium/auxiliary/draw/draw_llvm.c
src/gallium/drivers/llvmpipe/lp_state_fs.c
src/glsl/ir_set_program_inouts.cpp
src/mesa/tnl/t_vb_program.c
|
|
Core mesa has gained support for GL_ARB_ES2_compatibility. Make GLES
generated dispatch table use them.
|
|
Fixes piglit arb_es2_compatibility-shadercompiler
|
|
Fixes piglit arb_es2_compatibility-maxvectors.
|
|
These are ARB_ES2_compatibility float variants of the core double
entrypoints. Fixes arb_es2_compatibility-depthrangef.
|
|
|
|
Fixes these MSVC errors.
ir_to_mesa.cpp(2644) : error C2057: expected constant expression
ir_to_mesa.cpp(2644) : error C2466: cannot allocate an array of constant size 0
ir_to_mesa.cpp(2644) : error C2133: 'acp' : unknown size
ir_to_mesa.cpp(2646) : error C2070: 'ir_to_mesa_instruction *[]': illegal sizeof operand
ir_to_mesa.cpp(2709) : error C2070: 'ir_to_mesa_instruction *[]': illegal sizeof operand
ir_to_mesa.cpp(2718) : error C2070: 'ir_to_mesa_instruction *[]': illegal sizeof operand
|
|
Fixes no-op dispatch warning in piglit
arb_es2_compatibility-releaseshadercompiler.c.
|