Age | Commit message (Collapse) | Author |
|
This should make it a lot harder to forget to zero things.
|
|
Fixes glsl-fs-copy-propagation-texcoords-1.
|
|
This should save on the overhead of tree-walking and provide a
convenient place to add more instruction lowering in the future.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
The vector operator collects 2, 3, or 4 scalar components into a
vector. Doing this has several advantages. First, it will make
ud-chain tracking for components of vectors much easier. Second, a
later optimization pass could collect scalars into vectors to allow
generation of SWZ instructions (or similar as operands to other
instructions on R200 and i915). It also enables an easy way to
generate IR for SWZ instructions in the ARB_vertex_program assembler.
|
|
This may grow in the near future.
|
|
The operate just like ir_unop_sin and ir_unop_cos except that they
expect their inputs to be limited to the range [-pi, pi]. Several
GPUs require this limited range for their sine and cosine
instructions, so having these as operations (along with a to-be-written
lowering pass) helps this architectures.
These new operations also matche the semantics of the
GL_ARB_fragment_program SCS instruction. Having these as operations
helps in generating GLSL IR directly from assembly fragment programs.
|
|
If an instruction writes reg but nothing later uses it, then we don't
need to bother doing it. Before, we were just killing code that was
never read after it was ever written.
This removes many interpolation instructions for attributes with only
a few comopnents used. Improves nexuiz high-settings performance .46%
+/- .12% (n=3) on my Ironlake.
|
|
|
|
|
|
|
|
|
|
|
|
This showed up as cairo-gl gradients being inverted on everyone but
Intel, where I'd apparently tweaked the transformation to work around
the bug. Fixes piglit fbo-fragcoord.
|
|
Silences this GCC warning.
brw_fs.cpp: In member function 'void fs_visitor::split_virtual_grfs()':
brw_fs.cpp:2516: warning: unused variable 'reg'
|
|
IF statements were getting flattened while they were broken. With
Zhenyu's last fix for ENDIF's type, everything appears to have lined
up to actually work.
This regresses two tests:
glsl1-! (not) operator (1, fail)
glsl1-! (not) operator (1, pass)
but fixes tests that couldn't work before because the IFs couldn't be
flattened:
glsl-fs-discard-01
occlusion-query-discard
(and, naturally, this should be a performance improvement for apps
that actually use IF statements to avoid executing a bunch of code).
|
|
Sometimes we swizzled in a different channel it looked like, and
sometimes we swizzled in zero. Or something.
Having looked at the output of another code generator for this chip,
this is approximately what they do, too: use align1 math on
temporaries, and then move the results into place.
Fixes:
glean/vp1-EX2 test
glean/vp1-EXP test
glean/vp1-LG2 test
glean/vp1-RCP test (reciprocal)
glean/vp1-RSQ test 1 (reciprocal square root)
shaders/glsl-cos
shaders/glsl-sin
shaders/glsl-vs-masked-cos
shaders/vpfp-generic/vp-exp-alias
|
|
This reverts commit 9c39a9fcb2c76897e9b5aff68ce197a411c4e25c.
Remove VS SPF mode, conditional instruction works for VS now.
|
|
That should also be immediate value for type W.
|
|
Fixes 10 piglit cases that were assertion failing.
|
|
Fixes assertion failure with texture swizzling in the GLSL path when
it's triggered (such as gen6 FF or ARB_fp shadow comparisons).
Fixes:
texdepth
texSwizzle
fp1-DST test
fp1-LIT test 3
|
|
Silences this GCC warning.
brw_wm_fp.c: In function 'brw_wm_pass_fp':
brw_wm_fp.c:966: warning: 'last_inst' may be used uninitialized in this function
brw_wm_fp.c:966: note: 'last_inst' was declared here
|
|
Silences this GCC warning.
brw_wm_fp.c: In function 'precalc_tex':
brw_wm_fp.c:666: warning: 'tmpcoord.Index' may be used uninitialized in this function
|
|
|
|
This provides the optimizer with hints about code hotness, which we're
quite certain about for debug printouts (or, rather, while we
developers often hit the checks for debug printouts, we don't care
about performance while doing so).
|
|
Fix compiz crash.
https://bugs.freedesktop.org/show_bug.cgi?id=31124
|
|
Fixes 6 piglit tests about stencil operations.
|
|
Matches pre-gen6, and fixes glsl-vs-large-uniform-array.
|
|
|
|
Fixes piglit user-clip, and compiz desktop switching when dragging a
window and using just 2 desktops. Bug #30446.
|
|
|
|
This fixes some insanity that would otherwise be required for GLSL
1.30 bit ops or gen6 integer uniform operations in general, at the
cost of upload-time pain. Given that we only have that pain because
mesa's mangling our integer uniforms to be floats, this something that
should be fixed outside of the shader codegen.
|
|
The assumption is that all stages are the same program or that
varyings are passed between stages using built-in varyings.
|
|
Avoids GPU hang on glsl-fs-convolution-1.
|
|
I'm trying to clamp to a minimum of 1 URB row, not a maximum of 1.
Fixes:
glsl-kwin-blur
glsl-max-varying
glsl-routing
|
|
|
|
Fixes glsl-fs-uniform-array-5.
|
|
This was slightly confused because gen6_wm_constants does the push
constant buffer, while brw_wm_constants does pull constants.
|
|
|
|
|
|
It's a little more painful than before because we don't have the handy
mask register any more, and have to make do with cooking up a value
out of the flag register.
|
|
|
|
This doesn't appear to help any testcases I'm looking at, but it looks
like it's required.
|
|
This is apparently required, as the thread will be initiated while it
still has dependencies, and this is what waits for those to be
resolved before writing color.
|
|
|
|
|
|
Fixes glsl-fs-uniform-array-5, but not 6 which fails in ir_to_mesa.
|
|
This makes it a lot easier to track down where we failed when some
code emit triggers an assert. Plus, less memory allocation for
codegen.
|
|
Fixes glsl-fs-convolution-2, which was blowing up due to the array
access insanity getting at the uniform values within the loop. Each
temporary was considered live across the whole loop.
|
|
One, it was allocating increments of 1kb, but per thread scratch space
is a power of two. Two, the new FS wasn't getting total_scratch set
at all, so everyone thought they had 1kb and writes beyond 1kb would
go stomping on a neighbor thread.
With this plus the previous register spilling for the new FS,
glsl-fs-convolution-1 passes.
|
|
g0 is used by others, and is expected to be left exactly as it was
dispatched to us. So manually move g0 into our message reg when
spilling/unspilling and update the offset in the MRF. Fixes failures
in texture sampling after having spilled a register.
|