Age | Commit message | Author |
|
We were alpha testing against an unwritten value, resulting in garbage.
(part of) Bug #35073.
|
|
The optimization loop won't reinsert noise instructions or quadop
vectors, so we were traversing the tree for nothing. Lowering vector
indexing was in the loop after do_common_optimization() to avoid the
work if it ended up that the index was actually constant, but that has
been called already in the core.
|
|
This enables the new shadow texture functions in GLSL 1.30.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@intel.com>
|
|
This reverts commit 81b34a4e3a7aec9cdf2781757408dc5e9eec79cb. There
were regressions in the core change that this depends on.
|
|
This gets one more piece of the pipeline onto the new codegen backend.
Once ARB_fragment_program can generate GLSL programs, we can nuke the
old backend.
|
|
This reverts commit 4a3b28113c3d23ba21bb8b8f5ebab7c567083a6d, as it
caused a regression on Ironlake (bug #34646).
|
|
This adds the opcode and the code to convert ir_txd to OPCODE_TXD;
it doesn't actually add support yet.
|
|
Initial plumbing existed to turn the ir_txl into OPCODE_TXL, but it was
never handled.
|
|
Initial plumbing existed to turn the ir_txl into OPCODE_TXL, but it was
never handled.
|
|
The old value, BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE, makes it sound like we're
doing a non-bias texture lookup. It has the same value as the new constant
BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_BIAS_COMPARE, so there should be no
functional changes.
|
|
pixel_w is the final result; wpos_w is used on gen4 to compute it.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
We can't safely use fixed size arrays since Gen6+ supports unlimited
nesting of control flow.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
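
As a rough illustration of the fixed-size array problem described above, here is a minimal sketch of a control-flow stack that grows on demand. The IfStack class and its members are hypothetical names invented for this example, not the driver's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a control-flow stack; not the driver's real type.
class IfStack {
public:
    // Record the instruction index of an IF when entering a nested block.
    void push(int if_inst_index) { entries_.push_back(if_inst_index); }

    // Return the index of the innermost open IF and close it.
    int pop() {
        assert(!entries_.empty());
        int top = entries_.back();
        entries_.pop_back();
        return top;
    }

    // Current nesting depth.
    std::size_t depth() const { return entries_.size(); }

private:
    std::vector<int> entries_;  // grows as needed, so there is no fixed nesting limit
};
```

Because the storage is a std::vector, pushing past any previously assumed depth simply reallocates instead of writing beyond the end of an array.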
|
|
The code that generates MATH instructions attempts to work around
the hardware ignoring source modifiers (abs and negate) by emitting
moves into temporaries. Unfortunately, this pass coalesced those
registers, restoring the original problem. Avoid doing that.
Fixes several OpenGL ES2 conformance failures on Sandybridge.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Single-operand math already had these workarounds, but POW (the only
two-operand function) did not. It needs them too - otherwise we can hit
assertion failures in brw_eu_emit.c when code is actually generated.
NOTE: This is a candidate for the 7.10 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
These are already picked up by ir.h or glsl_types.h.
|
|
|
|
This fixes a bunch of unnecessary barriers due to the scheduler not
knowing what that arbitrary register description refers to when trying
to reason about its dependencies.
The result is rescheduling in the convolution kernel shader in
Lightsmark, which results in avoiding register spilling and increasing
the performance of the first scene from 6-7 fps midway through the
panning to 11 fps. The register spilling was a regression from Mesa
7.9 to Mesa 7.10.
|
|
Improves performance of my GLSL demo by 5.1% (+/- 1.4%, n=7). It also
reschedules the giant multiply tree at the end of
glsl-fs-convolution-1 so that we end up not spilling registers,
producing the expected level of performance.
|
|
|
|
|
|
Fixes piglit glsl-fs-texture2d-branching. I couldn't come up with a
testcase that didn't involve dead code, but I think it's still worth
fixing.
|
|
Fixes texrect-many regression with ff_fragment_shader -- as we added
refs to the subsequent texcoord scaling parameters, the array got
realloced to a new address while our params[] still pointed at the old
location.
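
The stale-pointer hazard described above can be sketched as follows. This is only an illustration under assumed names (ParamList, add, get are hypothetical); it shows one common remedy, holding an index instead of a pointer, which is not necessarily how this commit fixed it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical parameter table; names are illustrative only.
struct ParamList {
    std::vector<float> values;

    // Append a value and return a stable index rather than a pointer.
    // push_back() may reallocate and move the storage, which would
    // invalidate any pointer taken into `values` before the call.
    std::size_t add(float v) {
        values.push_back(v);
        return values.size() - 1;
    }

    // Look the value up through the index at the time of use.
    float get(std::size_t index) const {
        assert(index < values.size());
        return values[index];
    }
};
```

An index stays valid across reallocation, whereas a saved `&values[i]` would keep pointing at the old location.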
|
|
This code should never have been triggered, but I often triggered it
anyway by disabling optimization passes during debugging, and then spent
my time debugging the fact that this code doesn't work.
|
|
No effect, since it was called before live intervals were calculated.
|
|
We were trying to interpolate, which would end up doing unnecessary
math, and doing so on undefined values. Fixes glsl-fs-flat-color.
|
|
The ad-hoc placement of recalculation somewhere between when they got
invalidated and when they were next needed was confusing. This should
clarify what's going on here.
|
|
We were returning the negative absolute value, instead of the absolute
value. Fixes glsl-fs-abs-neg.
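
One way a "negative absolute value" can arise is when an abs modifier is set on an operand that already carries a negate modifier. The sketch below assumes that model; SrcReg, read_src and apply_abs are hypothetical names, and the evaluation order (abs first, then negate) is an assumption of this example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical source operand with hardware-style modifiers.
struct SrcReg {
    float value;
    bool abs;
    bool negate;
};

// Assumed modifier semantics: abs is applied first, then negate.
float read_src(const SrcReg& r) {
    float v = r.abs ? std::fabs(r.value) : r.value;
    return r.negate ? -v : v;
}

// Taking abs() of an operand: set abs and clear any negate already on it.
// Leaving negate set would evaluate to -|x| instead of |x|.
SrcReg apply_abs(SrcReg r) {
    r.abs = true;
    r.negate = false;
    return r;
}
```

With these definitions, abs(-x) for x = 3 reads back as 3 rather than -3.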
|
|
This greatly improves codegen for programs with flow control by
allowing coalescing for all instructions at the top level, not just
ones that follow the last flow control in the program.
|
|
Fixes a regression in ember since switching to the native FS backend,
and the new piglit tests glsl-fs-vec4-indexing-{2,3} for catching this.
|
|
Fixes 26 piglit cases on my GM965.
|
|
|
|
Gen4 and Gen5 hardware can have a maximum supported nesting depth of 16.
Previously, shaders with control flow nested 17 levels deep would
cause a driver assertion or segmentation fault.
Gen6 (Sandybridge) hardware no longer has this restriction.
Fixes fd.o bug #31967.
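
A minimal sketch of checking nesting depth up front, so that an unsupported shader can be rejected instead of tripping an assertion. The Op enum and both function names are hypothetical; only the limit of 16 on Gen4/Gen5 and the absence of a limit on Gen6 come from the message above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical flattened control-flow markers.
enum class Op { If, Loop, End, Other };

// Deepest level of open IF/loop blocks reached anywhere in the program.
int max_nesting_depth(const std::vector<Op>& ops) {
    int depth = 0;
    int deepest = 0;
    for (Op op : ops) {
        if (op == Op::If || op == Op::Loop) {
            depth++;
            deepest = std::max(deepest, depth);
        } else if (op == Op::End && depth > 0) {
            depth--;
        }
    }
    return deepest;
}

// Gen4 and Gen5 are limited to 16 levels; Gen6 has no such restriction.
bool nesting_supported(int gen, int deepest) {
    assert(deepest >= 0);
    const int kMaxDepthPreGen6 = 16;
    return gen >= 6 || deepest <= kMaxDepthPreGen6;
}
```

A caller could fail compilation cleanly when nesting_supported() returns false.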
|
|
Determining whether a header is present for an fb write from the message
length is not correct for SIMD16 dispatch, and with more output
attributes it is hard to tell from the message length at all. This
explicitly adds a new parameter to the fb write that says whether the
header is present.
Fixes hangs and failures in many GL conformance test cases.
|
|
Fixes glsl-bug-22603.
|
|
Fixes this GCC warning.
brw_fs.cpp: In function 'brw_reg brw_reg_from_fs_reg(fs_reg*)':
brw_fs.cpp:3255: warning: 'brw_reg' may be used uninitialized in this function
|
|
Fixes:
glean/glsl1-! (not) operator (1, fail)
glean/glsl1-! (not) operator (1, pass)
|
|
With the change of extended math from having the arguments moved into
mrfs and handed off through message passing to being directly hooked
up to the EU, it looks like the piece for doing source modifiers
(negate and abs) was left out.
Fixes:
fog-modes
glean/fp1-ARB_fog_exp test
glean/fp1-ARB_fog_exp2 test
glean/fp1-Computed fog exp test
glean/fp1-Computed fog exp2 test
ext_fog_coord-modes
|
|
Previously the code only handled scalars and vectors. This new code
is modeled somewhat after similar code in ir_to_mesa.
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
Payload reg setup on gen6 depends more on the dispatch width as well
as the uses_depth, computes_depth, and other flags. That's something
we want to decide at compile time, not at cache lookup. As a bonus,
the fragment shader program cache lookup should be cheaper now that
there's less to compute for the hash key.
|
|
At this point, piglit tests for fragment shader loops are working.
|
|
There are now two targets: the hop-to-end-of-block target, and the
target for where to resume execution for active channels.
|
|
There's no more DO since there's no more mask stack, and WHILE has
been shuffled like IF was.
|
|
Fixes glsl-fs-fragdata-1, and hopefully Eve Online where I noticed
this bug in the generated shader. Bug #31952.
|
|
This is quite common for multitexture sampling, and not only cuts down
on the second and later set of MOVs, but typically also allows
compute-to-MRF on the first set.
No statistically significant performance difference in nexuiz (n=3),
but it reduces instruction count in one of its shaders and seems like
a good idea.
|
|
We were skipping it if the instruction producing the value we were
going to compute-to-mrf used its result reg as a source reg. This
meant that the typical "write interpolated color to fragment color" or
"texture from interpolated texcoord" shader didn't compute-to-MRF.
Just don't check for the interference cases until after we've checked
if this is the instruction we wanted to compute-to-MRF.
Improves nexuiz high-settings performance on my laptop 0.48% +- 0.08%
(n=3).
|
|
On pre-gen6, this turns 4 instructions into 1. We could still do
better by folding the saturate into the instruction generating the
value if nobody else uses it, but that should be a separate pass.
|
|
This hits a common case with min/max operations.
|
|
|