path: root/src/mesa/drivers/dri/i965/brw_fs.cpp
Age  Commit message  Author
2011-03-15  i965: Fix alpha testing when there is no color buffer in the FBO.  Eric Anholt
We were alpha testing against an unwritten value, resulting in garbage. (part of) Bug #35073.
2011-03-15  i965: Do our lowering passes before the loop of optimization.  Eric Anholt
The optimization loop won't reinsert noise instructions or quadop vectors, so we were traversing the tree for nothing. Lowering vector indexing was in the loop after do_common_optimization() to avoid the work if it ended up that the index was actually constant, but that has been called already in the core.
2011-03-14  i965: Enable texture lookups whose return type is 'float'  Kenneth Graunke
This enables the new shadow texture functions in GLSL 1.30. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@intel.com>
2011-03-12  Revert "i965: Use the fixed function GLSL program instead of the ARB program."  Eric Anholt
This reverts commit 81b34a4e3a7aec9cdf2781757408dc5e9eec79cb. There were regressions in the core change that this depends on.
2011-03-11  i965: Use the fixed function GLSL program instead of the ARB program.  Eric Anholt
This gets one more piece of the pipeline onto the new codegen backend. Once ARB_fragment_program can generate GLSL programs, we can nuke the old backend.
2011-03-01  Revert "i965/fs: Correctly set up gl_FragCoord.w on Sandybridge."  Kenneth Graunke
This reverts commit 4a3b28113c3d23ba21bb8b8f5ebab7c567083a6d, as it caused a regression on Ironlake (bug #34646).
2011-02-25  i965/fs: Initial plumbing to support TXD.  Kenneth Graunke
This adds the opcode and the code to convert ir_txd to OPCODE_TXD; it doesn't actually add support yet.
2011-02-25  i965/fs: Complete TXL support on gen5+.  Kenneth Graunke
Initial plumbing existed to turn the ir_txl into OPCODE_TXL, but it was never handled.
2011-02-25  i965/fs: Complete TXL support on gen4.  Kenneth Graunke
Initial plumbing existed to turn the ir_txl into OPCODE_TXL, but it was never handled.
2011-02-25  i965/fs: Use a properly named constant in TXB handling.  Kenneth Graunke
The old value, BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE, makes it sound like we're doing a non-bias texture lookup. It has the same value as the new constant BRW_SAMPLER_MESSAGE_SIMD8_SAMPLE_BIAS_COMPARE, so there should be no functional changes.
2011-02-22  i965/fs: Correctly set up gl_FragCoord.w on Sandybridge.  Kenneth Graunke
pixel_w is the final result; wpos_w is used on gen4 to compute it. NOTE: This is a candidate for the 7.10 branch. Reviewed-by: Eric Anholt <eric@anholt.net>
2011-02-22  i965/fs: Refactor control flow stack handling.  Kenneth Graunke
We can't safely use fixed size arrays since Gen6+ supports unlimited nesting of control flow. NOTE: This is a candidate for the 7.10 branch. Reviewed-by: Eric Anholt <eric@anholt.net>
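The move away from fixed-size arrays can be illustrated with a growable stack. This is a minimal sketch, not the driver's code: `Instruction` and `ControlFlowStack` are hypothetical names, and it assumes the stack only needs to remember pointers to the currently open IF/DO instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an emitted instruction; not the driver's type.
struct Instruction {
    int opcode;
};

// Growable stack of open control flow instructions (illustrative only).
// std::vector grows on demand, so nesting depth is not capped by a
// compile-time array size.
class ControlFlowStack {
public:
    void push(Instruction *inst) { stack_.push_back(inst); }

    // Returns the innermost open instruction and closes it.
    Instruction *pop() {
        assert(!stack_.empty());
        Instruction *top = stack_.back();
        stack_.pop_back();
        return top;
    }

    std::size_t depth() const { return stack_.size(); }

private:
    std::vector<Instruction *> stack_;
};
```

Pushing on every IF or DO and popping on the matching ENDIF or WHILE keeps the pairing logic the same as with an array, while removing the hard depth limit.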
2011-02-22  i965/fs: Avoid register coalescing away gen6 MATH workarounds.  Kenneth Graunke
The code that generates MATH instructions attempts to work around the hardware ignoring source modifiers (abs and negate) by emitting moves into temporaries. Unfortunately, this pass coalesced those registers, restoring the original problem. Avoid doing that. Fixes several OpenGL ES2 conformance failures on Sandybridge. NOTE: This is a candidate for the 7.10 branch. Reviewed-by: Eric Anholt <eric@anholt.net>
2011-02-22  i965/fs: Apply source modifier workarounds to POW as well.  Kenneth Graunke
Single-operand math already had these workarounds, but POW (the only two operand function) did not. It needs them too - otherwise we can hit assertion failures in brw_eu_emit.c when code is actually generated. NOTE: This is a candidate for the 7.10 branch. Reviewed-by: Eric Anholt <eric@anholt.net>
2011-01-31  i965: Emit texel offsets in sampler messages.  Kenneth Graunke
2011-01-31  Convert everything from the talloc API to the ralloc API.  Kenneth Graunke
2011-01-21  glsl, i965: Remove unnecessary talloc includes.  Kenneth Graunke
These are already picked up by ir.h or glsl_types.h.
2011-01-19  i965/fs: Add a helper function for detecting math opcodes.  Eric Anholt
2011-01-19  i965/fs: Assign URB/CURB register numbers after instruction scheduling.  Eric Anholt
This fixes a bunch of unnecessary barriers due to the scheduler not knowing what that arbitrary register description refers to when trying to reason about its dependencies. The result is rescheduling in the convolution kernel shader in Lightsmark, which results in avoiding register spilling and increasing the performance of the first scene from 6-7 fps midway through the panning to 11fps. The register spilling was a regression from Mesa 7.9 to Mesa 7.10.
2011-01-19  i965/fs: Add an instruction scheduler.  Eric Anholt
Improves performance of my GLSL demo by 5.1% (+/- 1.4%, n=7). It also reschedules the giant multiply tree at the end of glsl-fs-convolution-1 so that we end up not spilling registers, producing the expected level of performance.
2011-01-19  i965/fs: Add a helper for detecting texturing opcodes.  Eric Anholt
2011-01-18  i965: Fix a comment typo.  Eric Anholt
2011-01-18  i965: Fix a bug in i965 compute-to-MRF.  Eric Anholt
Fixes piglit glsl-fs-texture2d-branching. I couldn't come up with a testcase that didn't involve dead code, but it's still worthwhile to fix I think.
2011-01-17  i965: Fix dead pointers to fp->Parameters->ParameterValues[] after realloc.  Eric Anholt
Fixes texrect-many regression with ff_fragment_shader -- as we added refs to the subsequent texcoord scaling parameters, the array got realloced to a new address while our params[] still pointed at the old location.
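The failure mode described here is the classic one of holding pointers into storage that later moves. One common way to avoid it is to record indices and resolve them late. The sketch below assumes that approach and is not taken from the commit itself; `ParameterList` and `ParamRef` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical parameter list whose storage may move when it grows.
struct ParameterList {
    std::vector<float> values;

    // Appends a value and returns its index. The index stays valid even
    // if the underlying storage is reallocated by later additions.
    std::size_t add(float v) {
        values.push_back(v);
        return values.size() - 1;
    }
};

// A reference stored as an index rather than as a raw pointer.
struct ParamRef {
    std::size_t index;

    // Looks the value up in the list's current storage at use time.
    float resolve(const ParameterList &list) const {
        return list.values[index];
    }
};
```

A raw `float *` taken before the later `add()` calls could dangle after growth, while the index-based reference still resolves to the right element.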
2011-01-14  i965: Replace broken handling of dead code with an assert.  Eric Anholt
This code should never have been triggered, but I often triggered it anyway when I disabled optimization passes during debugging, then spent my time debugging the fact that this code doesn't work.
2011-01-14  i965: Add an invalidation of live intervals after register splitting.  Eric Anholt
No effect, since it was called before live intervals were calculated.
2011-01-12  i965/fs: Do flat shading when appropriate.  Eric Anholt
We were trying to interpolate, which would end up doing unnecessary math, and doing so on undefined values. Fixes glsl-fs-flat-color.
2011-01-12  i965: Clarify when we need to (re-)calculate live intervals.  Eric Anholt
The ad-hoc placement of recalculation somewhere between when they got invalidated and when they were next needed was confusing. This should clarify what's going on here.
2011-01-12  i965/fs: When producing ir_unop_abs of an operand, strip negate.  Eric Anholt
We were returning the negative absolute value, instead of the absolute value. Fixes glsl-fs-abs-neg.
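The bug can be reproduced with a toy model of source modifiers. This sketch assumes that the abs modifier is applied before negate when a source is read, which is consistent with the "negative absolute value" symptom above. `SrcReg`, `apply_abs`, and `evaluate` are hypothetical names, not the driver's.

```cpp
#include <cassert>
#include <cmath>

// Toy source operand with modifier flags (illustrative only).
struct SrcReg {
    int reg;
    bool negate;
    bool abs;
};

// Produces the operand for abs(src). Clearing negate is the fix: if the
// flag were left set, reading the operand would yield -|x|.
SrcReg apply_abs(SrcReg src) {
    src.abs = true;
    src.negate = false;
    return src;
}

// Models how the modifiers are applied on read: abs first, then negate.
float evaluate(const SrcReg &src, float value) {
    float v = src.abs ? std::fabs(value) : value;
    return src.negate ? -v : v;
}
```

With this model, taking abs of an already negated operand returns a non-negative value for any input.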
2011-01-11  i965: Tighten up the check for flow control interfering with coalescing.  Eric Anholt
This greatly improves codegen for programs with flow control by allowing coalescing for all instructions at the top level, not just ones that follow the last flow control in the program.
2010-12-28  i965: Do lowering of array indexing of a vector in the FS.  Eric Anholt
Fixes a regression in ember since switching to the native FS backend, and the new piglit tests glsl-fs-vec4-indexing-{2,3} for catching this.
2010-12-28  i965: Fix regression in FS comparisons on original gen4 due to gen6 changes.  Eric Anholt
Fixes 26 piglit cases on my GM965.
2010-12-28  i965: Factor out the ir comparison to BRW_CONDITIONAL_* code.  Eric Anholt
2010-12-27  i965: Flatten if-statements beyond depth 16 on pre-gen6.  Kenneth Graunke
Gen4 and Gen5 hardware can have a maximum supported nesting depth of 16. Previously, shaders with control flow nested 17 levels deep would cause a driver assertion or segmentation fault. Gen6 (Sandybridge) hardware no longer has this restriction. Fixes fd.o bug #31967.
2010-12-22  i965: explicit tell header present for fb write on sandybridge  Zhenyu Wang
Determining whether a header is present for an FB write from the message length is not right for SIMD16 dispatch, and when there are more output attributes, header presence is hard to tell from the message length. This explicitly adds a new parameter to the FB write that says whether a header is present. Fixes hangs and failures in many GL conformance test cases.
2010-12-13  i965: Fix gl_FragCoord.z setup on gen6.  Eric Anholt
Fixes glsl-bug-22603.
2010-12-09  i965: Silence uninitialized variable warning.  Vinson Lee
Fixes this GCC warning.
brw_fs.cpp: In function 'brw_reg brw_reg_from_fs_reg(fs_reg*)':
brw_fs.cpp:3255: warning: 'brw_reg' may be used uninitialized in this function
2010-12-07  i965: Fix flipped value of the not-embedded-in-if on gen6.  Eric Anholt
Fixes:
glean/glsl1-! (not) operator (1, fail)
glean/glsl1-! (not) operator (1, pass)
2010-12-07  i965: Work around gen6 ignoring source modifiers on math instructions.  Eric Anholt
With the change of extended math from having the arguments moved into mrfs and handed off through message passing to being directly hooked up to the EU, it looks like the piece for doing source modifiers (negate and abs) was left out.
Fixes:
fog-modes
glean/fp1-ARB_fog_exp test
glean/fp1-ARB_fog_exp2 test
glean/fp1-Computed fog exp test
glean/fp1-Computed fog exp2 test
ext_fog_coord-modes
2010-12-07  i965: Correctly emit constants for aggregate types (array, matrix, struct)  Ian Romanick
Previously the code only handled scalars and vectors. This new code is modeled somewhat after similar code in ir_to_mesa. Reviewed-by: Eric Anholt <eric@anholt.net>
2010-12-06  i965: Move payload reg setup to compile, not lookup time.  Eric Anholt
Payload reg setup on gen6 depends more on the dispatch width as well as the uses_depth, computes_depth, and other flags. That's something we want to decide at compile time, not at cache lookup. As a bonus, the fragment shader program cache lookup should be cheaper now that there's less to compute for the hash key.
2010-12-01  i965: Add support for gen6 CONTINUE instruction emit.  Eric Anholt
At this point, piglit tests for fragment shader loops are working.
2010-12-01  i965: Add support for gen6 BREAK ISA emit.  Eric Anholt
There are now two targets: the hop-to-end-of-block target, and the target for where to resume execution for active channels.
2010-12-01  i965: Add support for gen6 DO/WHILE ISA emit.  Eric Anholt
There's no more DO since there's no more mask stack, and WHILE has been shuffled like IF was.
2010-11-29  i965: Fix type of gl_FragData[] dereference for FB write.  Eric Anholt
Fixes glsl-fs-fragdata-1, and hopefully Eve Online where I noticed this bug in the generated shader. Bug #31952.
2010-11-19  i965: Remove duplicate MRF writes in the FS backend.  Eric Anholt
This is quite common for multitexture sampling, and not only cuts down on the second and later set of MOVs, but typically also allows compute-to-MRF on the first set. No statistically significant performance difference in nexuiz (n=3), but it reduces instruction count in one of its shaders and seems like a good idea.
2010-11-19  i965: Improve compute-to-mrf.  Eric Anholt
We were skipping it if the instruction producing the value we were going to compute-to-mrf used its result reg as a source reg. This meant that the typical "write interpolated color to fragment color" or "texture from interpolated texcoord" shader didn't compute-to-MRF. Just don't check for the interference cases until after we've checked if this is the instruction we wanted to compute-to-MRF. Improves nexuiz high-settings performance on my laptop 0.48% +- 0.08% (n=3).
2010-11-19  i965: Recognize saturates and turn them into a saturated mov.  Eric Anholt
On pre-gen6, this turns 4 instructions into 1. We could still do better by folding the saturate into the instruction generating the value if nobody else uses it, but that should be a separate pass.
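The commit message does not say which IR pattern the pass matches. As an illustration only, the sketch below assumes the clamp is written as min(max(x, 0), 1) with the constants as second operands, and rewrites that one shape into a single saturate node. The `Expr` tree and every function name are hypothetical.

```cpp
#include <cassert>
#include <memory>

// Tiny expression tree used only for this illustration.
enum class Op { Var, Const, Min, Max, Saturate };

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
    Op op = Op::Var;
    float value = 0.0f; // meaningful only for Op::Const
    ExprPtr a, b;       // operands; null when unused
};

ExprPtr make(Op op, ExprPtr a = nullptr, ExprPtr b = nullptr, float value = 0.0f) {
    auto e = std::make_shared<Expr>();
    e->op = op;
    e->a = a;
    e->b = b;
    e->value = value;
    return e;
}

ExprPtr constant(float v) { return make(Op::Const, nullptr, nullptr, v); }

bool is_const(const ExprPtr &e, float v) {
    return e && e->op == Op::Const && e->value == v;
}

// Rewrites min(max(x, 0), 1) into saturate(x). Only this operand order is
// recognized; any other expression is returned unchanged.
ExprPtr fold_saturate(const ExprPtr &e) {
    if (e->op == Op::Min && is_const(e->b, 1.0f) &&
        e->a && e->a->op == Op::Max && is_const(e->a->b, 0.0f)) {
        return make(Op::Saturate, e->a->a);
    }
    return e;
}
```

A backend can then emit the saturate node as one move with the saturate bit set instead of separate compare and select instructions.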
2010-11-19  i965: Fold constants into the second arg of BRW_SEL as well.  Eric Anholt
This hits a common case with min/max operations.
2010-11-19  i965: Remove extra \n at the end of every instruction in INTEL_DEBUG=wm.  Eric Anholt