path: root/src/mesa/drivers/dri/i965/brw_fs.cpp
Age  Commit message  Author
2010-10-11  i965: Expand uniform args to gen6 math to full registers to get hstride == 1.  Eric Anholt
This is a hw requirement in math args. This also is inefficient, as we're calculating the same result 8 times, but then we've been doing that on pre-gen6 as well. If we're doing math on uniforms, though, we'd probably be better served by having some sort of mechanism for precalculating those results into another uniform value to use. Fixes 7 piglit math tests.
2010-10-11  i965: Don't compute-to-MRF in gen6 math instructions.  Eric Anholt
2010-10-11  i965: Don't consider gen6 math instructions to write to MRFs.  Eric Anholt
This was leftover from the pre-gen6 cleanups. One test regresses where compute-to-MRF now occurs.
2010-10-11  i965: Compute to MRF in the new FS backend.  Eric Anholt
This didn't produce a statistically significant performance difference in my demo (n=4) or nexuiz (n=3), but it still seems like a good idea and is recommended by the HW team.
2010-10-11  i965: Give the FB write and texture opcodes the info on base MRF, like math.  Eric Anholt
2010-10-11  i965: Give the math opcodes information on base mrf/mrf len.  Eric Anholt
This is progress towards enabling a compute-to-MRF pass.
2010-10-11  i965: Move FS backend structures to a header.  Eric Anholt
It's time to start splitting some of this up.
2010-10-11  i965: Reduce register interference checks for changed FS_OPCODE_DISCARD.  Eric Anholt
While I don't know of any performance changes from this (one extra reg available out of 128), it makes the generated asm a lot cleaner looking.
2010-10-11  i965: Split FS_OPCODE_DISCARD into two steps.  Eric Anholt
Having the single opcode write then read the reg meant that single instruction opcodes had to consider their source regs to interfere with their dest regs.
2010-10-08  i965: Add register coalescing to the new FS backend.  Eric Anholt
Improves performance of my GLSL demo 14.3% (+/- 4%, n=4) by eliminating the moves used in ir_assignment and ir_swizzle handling. Still 16.5% to go to catch up to the Mesa IR backend, presumably because instructions are almost perfectly mis-scheduled now.
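As a rough illustration of what a coalescing pass of this kind does, the sketch below deletes a MOV and points later readers at the MOV's source. It works on a toy straight-line IR: `Op`, `Inst`, and `coalesce` are hypothetical names rather than the driver's, and the safety rule (neither register is written again after the MOV, and no value needs to survive past the end of the program) is an assumption chosen to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy IR, invented for this sketch. Registers are plain integers.
enum class Op { MOV, ADD, MUL };

struct Inst {
    Op op;
    int dst;
    int src0;
    int src1;  // -1 when the opcode takes a single source
};

// Removes every MOV whose destination and source are both never written
// again later in the program, rewriting later reads of the destination to
// read the source instead. Returns the number of MOVs removed.
int coalesce(std::vector<Inst>& prog) {
    int removed = 0;
    for (std::size_t i = 0; i < prog.size();) {
        const Inst mov = prog[i];
        bool ok = (mov.op == Op::MOV);
        if (ok) assert(mov.src1 == -1 && "MOV takes exactly one source");

        // A later write to either register means the two values diverge.
        for (std::size_t j = i + 1; ok && j < prog.size(); ++j) {
            if (prog[j].dst == mov.dst || prog[j].dst == mov.src0) ok = false;
        }
        if (!ok) {
            ++i;
            continue;
        }

        // Point later readers of the MOV's destination at its source.
        for (std::size_t j = i + 1; j < prog.size(); ++j) {
            if (prog[j].src0 == mov.dst) prog[j].src0 = mov.src0;
            if (prog[j].src1 == mov.dst) prog[j].src1 = mov.src0;
        }
        prog.erase(prog.begin() + static_cast<std::ptrdiff_t>(i));
        ++removed;  // do not advance i: the next instruction slid into slot i
    }
    return removed;
}
```

For the three-instruction program `ADD r2, r0, r1; MOV r3, r2; MUL r4, r3, r3`, the pass drops the MOV and leaves the MUL reading r2 directly.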
2010-10-08  i965: Fix new FS gen6 interpolation for sparsely-populated arrays.  Eric Anholt
We'd overwrite the same element twice.
2010-10-07  i965: Normalize cubemap coordinates like is done in the Mesa IR path.  Eric Anholt
Fixes glsl-fs-texturecube-2-*
2010-10-07  i965: Disable emitting if () statements on gen6 until we really fix them.  Eric Anholt
2010-10-06  i965: Don't assume that WPOS is always provided on gen6 in the new FS.  Eric Anholt
We sensibly only provide it if the FS asks for it. We could actually skip WPOS unless the FS needed WPOS.zw, but that's something for later. Fixes: glsl-texture2d and probably many others.
2010-10-06  i965: Add support for gl_FrontFacing on gen6.  Eric Anholt
Fixes glsl1-gl_FrontFacing var (2) with new FS.
2010-10-06  i965: Refactor gl_FrontFacing setup out of general variable setup.  Eric Anholt
2010-10-06  i965: Gen6's sampler messages are the same as Ironlake.  Eric Anholt
This should fix texturing in the new FS backend.
2010-10-06  i965: Don't do 1/w multiplication in new FS for gen6  Eric Anholt
Not needed now that we're doing barycentric.
2010-10-06  i965: Fix botch in the header_present case in the new FS.  Eric Anholt
I only set it on the color_regions == 0 case, missing the important case, causing GPU hangs on pre-gen6.
2010-10-06  i965: Add back gen6 headerless FB writes to the new FS backend.  Eric Anholt
It's not that hard to detect when we need the header.
2010-10-06  i965: Also do constant propagation for the second operand of CMP.  Eric Anholt
We could do the first operand as well by flipping the comparison, but this covered several CMPs in code I was looking at.
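The operand-flipping idea mentioned above can be sketched as follows. This is an illustration only, assuming (as the message implies) that just the second operand has an immediate slot; `Cond`, `Operand`, `Cmp`, `Constants`, and `propagate_into_cmp` are hypothetical names, and the first-operand branch shows the alternative the message describes, not something the commit is stated to implement.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>

// Toy comparison instruction, invented for this sketch.
enum class Cond { L, LE, G, GE, EQ, NEQ };

struct Operand {
    bool is_imm;
    int reg;    // meaningful when !is_imm
    float imm;  // meaningful when is_imm
};

struct Cmp {
    Cond cond;
    Operand src0;
    Operand src1;
};

inline Operand reg(int r) { return Operand{false, r, 0.0f}; }
inline Operand imm(float v) { return Operand{true, -1, v}; }

// Condition that gives the same result once the operands are swapped:
// a < b is the same test as b > a, while EQ and NEQ are symmetric.
Cond flip(Cond c) {
    switch (c) {
    case Cond::L:   return Cond::G;
    case Cond::LE:  return Cond::GE;
    case Cond::G:   return Cond::L;
    case Cond::GE:  return Cond::LE;
    case Cond::EQ:  return Cond::EQ;
    case Cond::NEQ: return Cond::NEQ;
    }
    assert(false && "unknown condition");
    return c;
}

// Registers known to hold a constant, mapped to that constant.
using Constants = std::unordered_map<int, float>;

// Tries to move a known constant into the second-operand immediate slot.
// Returns true if the instruction was changed.
bool propagate_into_cmp(Cmp& cmp, const Constants& constants) {
    if (cmp.src1.is_imm) return false;  // immediate slot already in use

    // Second operand is a known constant: replace it in place.
    auto it = constants.find(cmp.src1.reg);
    if (it != constants.end()) {
        cmp.src1 = imm(it->second);
        return true;
    }

    // First operand is a known constant: swap operands, flip the condition.
    if (cmp.src0.is_imm) return false;
    it = constants.find(cmp.src0.reg);
    if (it == constants.end()) return false;
    std::swap(cmp.src0, cmp.src1);
    cmp.cond = flip(cmp.cond);
    cmp.src1 = imm(it->second);
    return true;
}
```

With r5 known to hold 2.0, `CMP.L r1, r5` becomes `CMP.L r1, 2.0`, and `CMP.L r5, r1` becomes `CMP.G r1, 2.0`.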
2010-10-06  i965: Enable the constant propagation code.  Eric Anholt
A debug disable had slipped in.
2010-10-04  i965: Add support for gen6 FB writes to the new FS.  Eric Anholt
This uses message headers for now, since we'll need it for MRT. We can cut out the header later.
2010-10-04  i965: Add initial folding of constants into operand immediate slots.  Eric Anholt
We could try to detect this in expression handling and do it proactively there, but it seems like less logic to do it in one optional pass at the end.
2010-10-04  i965: Add trivial dead code elimination in the new FS backend.  Eric Anholt
The glsl core should be handling most dead code issues for us, but we generate some things in codegen that may not get used, like the 1/w value or pixel deltas. It seems a lot easier this way than trying to work out up front whether we're going to use those values or not.
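A minimal sketch of the "generate it, then drop it if unused" approach, on a toy straight-line IR. `Inst` and `eliminate_dead_code` are hypothetical names, and the backward liveness scan is one simple way to do this, not necessarily how the commit does it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy instruction, invented for this sketch.
struct Inst {
    int dst;                // register written, or -1 for none
    std::vector<int> srcs;  // registers read
    bool has_side_effect;   // e.g. a framebuffer write: always kept
};

// Walks the program backwards and drops instructions whose result is never
// read afterwards. Assumes straight-line code and whole-register writes.
// Returns the number of instructions removed.
std::size_t eliminate_dead_code(std::vector<Inst>& prog, int num_regs) {
    assert(num_regs >= 0);
    std::vector<bool> read_later(static_cast<std::size_t>(num_regs), false);
    std::vector<Inst> kept;

    for (auto it = prog.rbegin(); it != prog.rend(); ++it) {
        const bool live =
            it->has_side_effect || (it->dst >= 0 && read_later[it->dst]);
        if (!live) continue;  // result unused: drop the instruction

        // This write satisfies the later reads, so any earlier write to the
        // same register is dead unless something in between reads it.
        if (it->dst >= 0) read_later[it->dst] = false;
        for (int s : it->srcs) read_later[s] = true;
        kept.push_back(*it);
    }

    std::reverse(kept.begin(), kept.end());
    const std::size_t removed = prog.size() - kept.size();
    prog = std::move(kept);
    return removed;
}
```

Because the scan runs backwards, a chain of instructions that only feeds a dead value is removed in the same pass.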
2010-10-04  i965: Be more conservative on live interval calculation.  Eric Anholt
This also means that our intervals now highlight dead code.
2010-10-02  i965: Add support for EXT_texture_swizzle to the new FS backend.  Eric Anholt
2010-10-01  i965: Don't try to emit code if we failed register allocation.  Eric Anholt
2010-10-01  i965: Fix off-by-ones in handling the last members of register classes.  Eric Anholt
Luckily, one of them would result in failing out register allocation when the other bugs were encountered. Applies to glsl-fs-vec4-indexing-temp-dst-in-nested-loop-combined, which still fails register allocation, but now legitimately.
2010-10-01  i965: Add a sanity check for register allocation sizes.  Eric Anholt
2010-10-01  i965: When producing a single channel swizzle, don't make a temporary.  Eric Anholt
This quickly cuts 8% of the instructions in my glsl demo.
2010-10-01  i965: Restore the forcing of aligned pairs for delta_xy on chips with PLN.  Eric Anholt
By doing so using the register allocator now, we avoid wasting a register to make the alignment happen.
2010-10-01  i965: Fix up copy'n'pasteo from moving coordinate setup around for gen4.  Eric Anholt
2010-10-01  i965: Add real support for pre-gen5 texture sampling to the new FS.  Eric Anholt
Fixes 36 testcases, including glsl-fs-shadow2d*-bias which fail on the Mesa IR backend.
2010-10-01  i965: Pre-gen6, map VS outputs (not FS inputs) to URB setup in the new FS.  Eric Anholt
We should fix the SF to actually give us just the data we need, but this fixes regressions in the new FS until then. Fixes: glsl-kwin-blur glsl-routing
2010-10-01  i965: Also increment attribute location when skipping unused slots.  Eric Anholt
Fixes glsl1-texcoord varying.
2010-10-01  i965: Fix the gen6 jump size for BREAK/CONT in new FS.  Eric Anholt
Since gen5, jumps are in increments of 64 bits instead of increments of 128-bit instructions.
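The arithmetic behind this is small: a 128-bit instruction spans two 64-bit units, so a jump over N instructions is encoded as 2N from gen5 on and as N before that. A sketch, with hypothetical helper names that are not taken from the driver:

```cpp
// Number of jump-count units one 128-bit instruction occupies: one unit per
// instruction before gen5, two 64-bit units per instruction from gen5 on.
constexpr int jump_units_per_instruction(int gen) {
    return gen >= 5 ? 2 : 1;
}

// Converts a signed distance in instructions into the value to encode.
constexpr int encode_jump_distance(int gen, int instructions) {
    return instructions * jump_units_per_instruction(gen);
}

static_assert(encode_jump_distance(4, 3) == 3, "gen4 counts instructions");
static_assert(encode_jump_distance(6, 3) == 6, "gen6 counts 64-bit units");
```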
2010-10-01  i965: Add gen6 attribute interpolation to new FS backend.  Eric Anholt
Untested, since my hardware is not booting at the moment.
2010-09-30  i965: Split the gen4 and gen5 sampler handling apart.  Eric Anholt
Trying to track the insanity of the different argument layouts for normal/shadow crossed with normal/lod/bias one generation at a time is enough. Fixes: glsl1-texture2D() with bias. (first test passing in this code that doesn't pass without it!)
2010-09-30  i965: Use the lowering pass for texture projection.  Eric Anholt
We should end up with the same code, but anyone else with this issue could share the handling (which I got wrong for shadow comparisons in the driver before).
2010-09-30  i965: Fix new FS handling of builtin uniforms with packed scalars in structs.  Eric Anholt
We were pointing each element at the .x channel of the ParameterValues. Fixes glsl1-linear fog.
2010-09-30  i965: Fix whole-structure/array assignment in new FS.  Eric Anholt
We need to walk the type tree to get the right register types for structure components. Fixes glsl-fs-statevar-call.
2010-09-29  i965: Remove my "safety counter" code from loops.  Eric Anholt
I've screwed this up enough times that I don't think it's worth it. This time, it was that I was doing it once per top-level body instruction instead of just once at the end of the loop body.
2010-09-29  i965: Add live interval analysis and hook it up to the register allocator.  Eric Anholt
Fixes 13 piglit cases that failed at register allocation before.
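To show how live intervals can feed an allocator, here is a sketch that records each register's first write and last read, then treats two registers as interfering when those ranges overlap. The names (`Inst`, `Interval`, `compute_intervals`, `interferes`) are hypothetical, and the sketch assumes straight-line code in which every register is written before it is read and read at least once. The strict overlap test is one possible choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction, invented for this sketch.
struct Inst {
    int dst;                // register written, or -1 for none
    std::vector<int> srcs;  // registers read
};

struct Interval {
    int def = -1;  // index of the first instruction writing the register
    int use = -1;  // index of the last instruction reading it
};

// One linear scan: first write sets def, every read pushes use forward.
std::vector<Interval> compute_intervals(const std::vector<Inst>& prog,
                                        int num_regs) {
    std::vector<Interval> iv(static_cast<std::size_t>(num_regs));
    for (std::size_t ip = 0; ip < prog.size(); ++ip) {
        const int pos = static_cast<int>(ip);
        for (int s : prog[ip].srcs) {
            assert(iv[s].def >= 0 && "register read before it is written");
            iv[s].use = pos;
        }
        const int d = prog[ip].dst;
        if (d >= 0 && iv[d].def < 0) iv[d].def = pos;
    }
    return iv;
}

// Two registers interfere when their ranges overlap. The comparison is
// strict, so a register whose last read is at instruction i may share
// storage with one first written at instruction i.
bool interferes(const Interval& a, const Interval& b) {
    return std::max(a.def, b.def) < std::min(a.use, b.use);
}
```

An allocator would call `interferes` for each pair of registers to fill in its interference graph, instead of assuming that every pair conflicts.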
2010-09-29  i965: First cut at register allocation using graph coloring.  Eric Anholt
The interference is totally bogus (maximal), so this is equivalent to our trivial register assignment before. As in, passes the same set of piglit tests.
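The equivalence claimed here is easy to see in a toy allocator: if every node interferes with every other node, coloring hands out one distinct register per node, exactly like a trivial sequential assignment. The sketch below uses a simple greedy coloring swapped in for illustration, which is not the allocator the driver uses, and all names in it are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// g[a][b] is true when virtual registers a and b may not share a register.
using InterferenceGraph = std::vector<std::vector<bool>>;

// Greedy coloring: each node takes the lowest hardware register not already
// taken by an interfering, previously assigned node. Returns false when some
// node cannot be colored; hw_reg is then only partially filled.
bool assign_registers(const InterferenceGraph& g, int num_hw_regs,
                      std::vector<int>& hw_reg) {
    assert(num_hw_regs >= 0);
    const std::size_t n = g.size();
    hw_reg.assign(n, -1);
    for (std::size_t v = 0; v < n; ++v) {
        assert(g[v].size() == n && "interference matrix must be square");
        std::vector<bool> taken(static_cast<std::size_t>(num_hw_regs), false);
        for (std::size_t u = 0; u < v; ++u) {
            if (g[v][u]) taken[static_cast<std::size_t>(hw_reg[u])] = true;
        }
        int choice = -1;
        for (int r = 0; r < num_hw_regs; ++r) {
            if (!taken[static_cast<std::size_t>(r)]) {
                choice = r;
                break;
            }
        }
        if (choice < 0) return false;  // out of registers
        hw_reg[v] = choice;
    }
    return true;
}

// Every node interferes with every other node (but not with itself).
InterferenceGraph maximal_interference(std::size_t n) {
    InterferenceGraph g(n, std::vector<bool>(n, true));
    for (std::size_t i = 0; i < n; ++i) g[i][i] = false;
    return g;
}
```

With `maximal_interference(3)` the nodes receive registers 0, 1, and 2 in order. Replacing the matrix with real interference data is what lets nodes share registers.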
2010-09-29  i965: Clean up the virtual GRF handling.  Eric Anholt
Now, virtual GRFs are consecutive integers, rather than offsetting the next one by the size. We need the size information to still be around for real register allocation, anyway.
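A small sketch of the numbering scheme described here: identifiers are handed out consecutively, and each register's size is kept in a side table so a later allocation step can still see it. `VirtualGrfs` and its members are hypothetical names, and the back-to-back packing is only a stand-in for a trivial assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hands out virtual register numbers 0, 1, 2, ... regardless of size, and
// remembers how many hardware registers each one needs.
class VirtualGrfs {
public:
    int alloc(int size) {
        assert(size > 0);
        sizes_.push_back(size);
        return static_cast<int>(sizes_.size()) - 1;
    }

    int size_of(int vgrf) const {
        return sizes_.at(static_cast<std::size_t>(vgrf));
    }

    // Trivial assignment: pack the registers back to back starting at
    // first_hw_reg. This is where the stored sizes are needed.
    std::vector<int> trivial_assignment(int first_hw_reg) const {
        std::vector<int> hw;
        hw.reserve(sizes_.size());
        int next = first_hw_reg;
        for (int size : sizes_) {
            hw.push_back(next);
            next += size;
        }
        return hw;
    }

private:
    std::vector<int> sizes_;
};
```

Allocating registers of sizes 1, 4, and 2 yields identifiers 0, 1, and 2. Under the older scheme described above, where the next number was offset by the size, they would presumably have been 0, 1, and 5.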
2010-09-29  i956: Make new FS discard do its work in a temp, not the null reg!  Eric Anholt
Fixes: glsl-fs-discard-02 (GPU hang) glsl1-discard statement (2)
2010-09-28  i965: Add support for builtin uniforms to the new FS backend.  Eric Anholt
Fixes 8 piglit tests.
2010-09-28  i965: Clean up obsolete FINISHME comment.  Eric Anholt
2010-09-28  i965: Fix array indexing of arrays of matrices.  Eric Anholt
The deleted code was meant to be handling indexing of a matrix, which would have been a noop if it had been correct.