path: root/src/mesa/drivers/dri/i965
Age | Commit message | Author
2010-10-15 | i965: Set the type of the null register to fix gen6 FS comparisons. | Eric Anholt
We often use reg_null as the destination when setting up the flag regs. However, on gen6 there aren't general implicit conversions to destination types from src types, so the comparison to produce the flag regs would be done on the integer result interpreted as a float. Hilarity ensued. Fixes 20 piglit cases.
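As a rough illustration of the failure mode described above (not driver code), the following C sketch compares two integers once as integers and once after their bit patterns are misread as floats. The helper names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the raw bits of a 32-bit integer as a float. */
static float bits_as_float(int32_t i)
{
    float f;
    assert(sizeof f == sizeof i);
    memcpy(&f, &i, sizeof f);
    return f;
}

/* Compare two integers correctly, as integers. */
static int less_as_int(int32_t a, int32_t b)
{
    return a < b;
}

/* Compare the same bit patterns after misreading them as floats. */
static int less_as_float_bits(int32_t a, int32_t b)
{
    return bits_as_float(a) < bits_as_float(b);
}
```

With -2 and -1, the integer comparison reports "less than", while the float reading sees two NaN bit patterns and reports false.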
2010-10-15 | i965: Fix indentation after commit 3322fbaf | Ian Romanick
2010-10-14 | glsl: Slightly change the semantic of _LinkedShaders | Ian Romanick
Previously _LinkedShaders was a compact array of the linked shaders for each shader stage. Now it is arranged such that each slot, indexed by the MESA_SHADER_* defines, refers to a specific shader stage. As a result, some slots will be NULL. This makes things a little more complex in the linker, but it simplifies things in other places. As a side effect _NumLinkedShaders is removed. NOTE: This may be a candidate for the 7.9 branch. If there are other patches that get backported to 7.9 that use _LinkedShader, this patch should be cherry picked also.
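The stage-indexed layout can be sketched in C as follows. This is a toy model under stated assumptions: the `toy_*` names are hypothetical stand-ins, not the Mesa structures or the MESA_SHADER_* defines themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MESA_SHADER_* defines. */
enum toy_stage {
    TOY_SHADER_VERTEX,
    TOY_SHADER_GEOMETRY,
    TOY_SHADER_FRAGMENT,
    TOY_SHADER_TYPES
};

struct toy_shader {
    const char *name;
};

struct toy_program {
    /* One slot per stage; a stage that was not linked stays NULL. */
    struct toy_shader *linked[TOY_SHADER_TYPES];
};

/* Count linked stages by skipping NULL slots, instead of storing a count. */
static unsigned toy_count_linked(const struct toy_program *prog)
{
    unsigned n = 0;
    assert(prog != NULL);
    for (int i = 0; i < TOY_SHADER_TYPES; i++) {
        if (prog->linked[i] != NULL)
            n++;
    }
    return n;
}
```

Callers that want a specific stage index the array directly and check for NULL; callers that want every stage loop over all slots and skip the empty ones.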
2010-10-14 | i965: Fix texturing on pre-gen5. | Eric Anholt
I broke it in 06fd639c519214b6ebcbf29127b6d9ed429f8641 by only testing 2 generations of hardware :(
2010-10-14 | i965: Add support for ir_unop_round_even via the RNDE instruction. | Kenneth Graunke
2010-10-14 | i965: Clean up a warning in the old fragment backend. | Kenneth Graunke
Hopefully this code can just go away soon.
2010-10-14 | i965: Enable the new FS backend on pre-gen6 as well. | Eric Anholt
It is now to the point where we have no regressing piglit tests. It also fixes Yo Frankie! and Humus DynamicBranching, probably due to the piglit bias tests that now work but did not on the Mesa IR backend. As a downside, performance takes about a 5-10% hit at the moment (e.g. nexuiz 19.8fps -> 18.8fps), which I plan to resolve by reintroducing 16-wide fragment shaders where possible. It is a win, though, for fragment shaders using flow control.
2010-10-14 | i965: Correctly emit the RNDZ instruction. | Kenneth Graunke
Simply using RNDU, RNDZ, or RNDE does not produce the desired result. Rather, the RND* instructions place a value in the destination register that may be 1 less than the correct answer. They can also set per-channel "increment bits" in a flag register, which, if set, mean dest needs to be incremented by 1. A second instruction - a predicated add - completes the job. Notably, RNDD always produces the correct answer in a single instruction. Fixes piglit test glsl-fs-trunc.
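The two-step sequence can be modelled for a single channel in C. This is only a toy model, not the hardware definition: it assumes the first step leaves the rounded-down value in dest and raises the increment bit when the wanted answer is one higher. All `toy_*` names are hypothetical, and the integer cast limits the model to moderately sized inputs.

```c
#include <assert.h>

/* Result of the first step for one channel: a value plus an increment bit. */
struct toy_rnd {
    float dest;
    int increment;
};

/* Round down without libm; stands in for what RNDD gives in one step. */
static float toy_floor(float x)
{
    float t;
    assert(x > -1e9f && x < 1e9f);  /* keep the cast below in range */
    t = (float)(long)x;             /* the cast truncates toward zero */
    return (t > x) ? t - 1.0f : t;
}

/* Step 1 (assumed behaviour): dest may be 1 less than the right answer. */
static struct toy_rnd toy_rndz_step(float x)
{
    struct toy_rnd r;
    r.dest = toy_floor(x);
    /* Round-toward-zero differs from round-down only for negative non-integers. */
    r.increment = (x < 0.0f && r.dest != x);
    return r;
}

/* Step 2: the predicated add, applied only where the increment bit is set. */
static float toy_predicated_add(struct toy_rnd r)
{
    return r.increment ? r.dest + 1.0f : r.dest;
}

/* Both steps together give truncation toward zero. */
static float toy_trunc(float x)
{
    return toy_predicated_add(toy_rndz_step(x));
}
```

For -1.5 the first step yields -2.0 with the increment bit set, and the predicated add brings it to -1.0. For 1.5 the bit stays clear and the first step's 1.0 is already the answer.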
2010-10-14 | i965: Use RNDZ for ir_unop_trunc in the new FS. | Kenneth Graunke
The existing code used RNDD, which rounds down, rather than toward zero.
2010-10-14 | i965: Use logical-not when emitting ir_unop_ceil. | Kenneth Graunke
Fixes piglit test glsl-fs-ceil.
2010-10-14 | i965: Add peepholing of conditional mod generation from expressions. | Eric Anholt
This usually cuts 2 out of 3 instructions for flag reg generation (if statements, conditional assignment) by producing the conditional mod in the expression representing the boolean value. Fixes glsl-fs-vec4-indexing-temp-dst-in-nested-loop-combined (register allocation no longer fails for the conditional generation proliferation).
2010-10-14 | i965: Add a function for handling the move of boolean values to flag regs. | Eric Anholt
This will be a place to peephole comparisons directly to the flag regs, and for now avoids using MOV with conditional mod on gen6, which is now illegal.
2010-10-14 | i965: Add a pass to the FS to split virtual GRFs to float channels. | Eric Anholt
Improves nexuiz performance 0.91% (+/- 0.54%, n=8)
2010-10-14 | i965: Update the live interval when coalescing regs. | Eric Anholt
2010-10-14 | i965: Set class_sizes[] for the aligned reg pair class. | Eric Anholt
So far, I've only seen this be a valgrind warning and not a real failure.
2010-10-14 | Revert "i965: fallback lineloop on sandybridge for now" | Zhenyu Wang
This reverts commit 73dab75b4165f7d2214a68d4ba8e3cb7aab9b4ac.
2010-10-14 | i965: Fix GS hang on Sandybridge | Zhenyu Wang
Don't use r0 as the FF_SYNC destination register on Sandybridge: doing so would smash the FFID field in the GS payload, which causes later URB writes to fail. Also don't use r0 in any URB write that requires an allocate.
2010-10-13 | i965: Add support for rescaling GL_TEXTURE_RECTANGLE coords to new FS. | Eric Anholt
2010-10-13 | Drop GLcontext typedef and use struct gl_context instead | Kristian Høgsberg
2010-10-13 | Rename GLvisual and __GLcontextModes to struct gl_config | Kristian Høgsberg
2010-10-12 | i965: Don't rebase the index buffer to min 0 if any arrays are in VBOs. | Eric Anholt
There was a check to only do the rebase if we didn't have everything in VBOs, but nexuiz apparently hands us a mix of VBOs and arrays, resulting in blocking on the GPU to do a rebase. Improves nexuiz 800x600, high-settings performance on my Ironlake 41% (+/- 1.3%), from 14.0fps to 19.7fps.
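A minimal sketch of the rebase decision and the rebase itself, assuming a simple per-array "lives in a VBO" flag. The `toy_*` functions are hypothetical and do not reflect the driver's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy check: rebase only when no array lives in a VBO. */
static int toy_should_rebase(const int *array_in_vbo, size_t n_arrays,
                             unsigned min_index)
{
    /* Assumption of this sketch: a minimum of 0 leaves nothing to rebase. */
    if (min_index == 0)
        return 0;
    for (size_t i = 0; i < n_arrays; i++) {
        if (array_in_vbo[i])
            return 0;  /* any VBO-backed array disables the rebase */
    }
    return 1;
}

/* Rebase the indices so that the smallest one becomes 0. */
static void toy_rebase_indices(unsigned *indices, size_t count,
                               unsigned min_index)
{
    for (size_t i = 0; i < count; i++) {
        assert(indices[i] >= min_index);
        indices[i] -= min_index;
    }
}
```

The point of the commit is the "any" in the first function: a mix of VBOs and client arrays now skips the rebase, where the earlier check skipped it only when everything was in VBOs.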
2010-10-12 | i965: Fix missing "break;" in i2b/f2b, and missing AND of CMP result. | Eric Anholt
Fixes glsl-fs-i2b.
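Both halves of this fix can be illustrated with a small C sketch. It assumes, for illustration only, that a comparison writes all ones for true, so the result must be ANDed with 1 to become a 0/1 boolean. The `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum toy_op { TOY_F2B, TOY_I2B };

/* Assumption for this sketch: a comparison writes all ones for true. */
static uint32_t toy_cmp_nonzero_f(float x)
{
    return x != 0.0f ? 0xFFFFFFFFu : 0u;
}

static uint32_t toy_cmp_nonzero_i(int32_t x)
{
    return x != 0 ? 0xFFFFFFFFu : 0u;
}

/* Convert a float or an int to a 0/1 boolean, depending on op. */
static uint32_t toy_to_bool(enum toy_op op, float f, int32_t i)
{
    uint32_t result = 0;
    assert(op == TOY_F2B || op == TOY_I2B);
    switch (op) {
    case TOY_F2B:
        result = toy_cmp_nonzero_f(f) & 1u;  /* mask the CMP result to 0/1 */
        break;  /* without this, control would fall into the I2B case */
    case TOY_I2B:
        result = toy_cmp_nonzero_i(i) & 1u;
        break;
    }
    return result;
}
```

If the first `break` were missing, `toy_to_bool(TOY_F2B, 0.0f, 7)` would be overwritten by the integer case and return 1 instead of 0.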
2010-10-11 | i965: Always use the new FS backend on gen6. | Eric Anholt
It's now much more correct for gen6 than the old backend, with just 2 regressions I've found (one of which is common with pre-gen6 and will be fixed by an array splitting IR pass). This does leave the old Mesa IR backend getting used still when we don't have GLSL IR, but the plan is to get GLSL IR input to the driver for the ARB programs and fixed function by the next release.
2010-10-11 | i965: Fix gen6 pixel_[xy] setup to avoid mixing int and float src operands. | Eric Anholt
Pre-gen6, you could mix int and float just fine. Now, you get goofy results. Fixes: glsl-arb-fragment-coord-conventions, glsl-fs-fragcoord, glsl-fs-if-greater, glsl-fs-if-greater-equal, glsl-fs-if-less, glsl-fs-if-less-equal
2010-10-11 | i965: Don't compute-to-MRF in gen6 VS math. | Eric Anholt
There was code to do this for pre-gen6 already; this just enables it for gen6 as well.
2010-10-11 | i965: Expand uniform args to gen6 math to full registers to get hstride == 1. | Eric Anholt
This is a hw requirement in math args. This also is inefficient, as we're calculating the same result 8 times, but then we've been doing that on pre-gen6 as well. If we're doing math on uniforms, though, we'd probably be better served by having some sort of mechanism for precalculating those results into another uniform value to use. Fixes 7 piglit math tests.
2010-10-11 | i965: Don't compute-to-MRF in gen6 math instructions. | Eric Anholt
2010-10-11 | i965: Add a couple of checks for gen6 math instruction limits. | Eric Anholt
2010-10-11 | i965: Don't consider gen6 math instructions to write to MRFs. | Eric Anholt
This was leftover from the pre-gen6 cleanups. One test regresses where compute-to-MRF now occurs.
2010-10-11 | i965: Compute to MRF in the new FS backend. | Eric Anholt
This didn't produce a statistically significant performance difference in my demo (n=4) or nexuiz (n=3), but it still seems like a good idea and is recommended by the HW team.
2010-10-11 | i965: Give the FB write and texture opcodes the info on base MRF, like math. | Eric Anholt
2010-10-11 | i965: Give the math opcodes information on base mrf/mrf len. | Eric Anholt
This is progress towards enabling a compute-to-MRF pass.
2010-10-11 | i965: Move FS backend structures to a header. | Eric Anholt
It's time to start splitting some of this up.
2010-10-11 | i965: Reduce register interference checks for changed FS_OPCODE_DISCARD. | Eric Anholt
While I don't know of any performance changes from this (one extra reg available out of 128), it makes the generated asm a lot cleaner looking.
2010-10-11 | i965: Split FS_OPCODE_DISCARD into two steps. | Eric Anholt
Having the single opcode write then read the reg meant that single instruction opcodes had to consider their source regs to interfere with their dest regs.
2010-10-08 | i965: Initialize member variables. | Vinson Lee
Fixes these GCC warnings.
brw_wm_fp.c: In function 'search_or_add_const4f':
brw_wm_fp.c:92: warning: 'reg.Index2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.Index2' was declared here
brw_wm_fp.c:92: warning: 'reg.RelAddr2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.RelAddr2' was declared here
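One common way to silence this class of warning is to zero the whole struct before setting the fields you care about. The sketch below shows that pattern with a hypothetical `toy_reg` type whose field names merely echo the warning text. It does not claim to show the actual change made to brw_wm_fp.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical register record; field names echo the warning text only. */
struct toy_reg {
    int File;
    int Index;
    int Index2;
    int RelAddr2;
};

/* Build a register with every member defined, not just the ones set here. */
static struct toy_reg toy_make_reg(int file, int index)
{
    struct toy_reg reg;
    assert(index >= 0);
    memset(&reg, 0, sizeof reg);  /* Index2 and RelAddr2 are now 0 */
    reg.File = file;
    reg.Index = index;
    return reg;
}
```

Returning or copying `reg` by value after this no longer reads indeterminate members, which is what the "used uninitialized" diagnostics complain about.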
2010-10-08 | i965: Silence unused variable warning on non-debug builds. | Vinson Lee
Fixes this GCC warning.
brw_vs.c: In function 'do_vs_prog':
brw_vs.c:46: warning: unused variable 'ctx'
2010-10-08 | i965: Silence unused variable warning on non-debug builds. | Vinson Lee
Fixes this GCC warning.
brw_eu_emit.c: In function 'brw_math2':
brw_eu_emit.c:1189: warning: unused variable 'intel'
2010-10-08 | i965: Add register coalescing to the new FS backend. | Eric Anholt
Improves performance of my GLSL demo 14.3% (+/- 4%, n=4) by eliminating the moves used in ir_assignment and ir_swizzle handling. Still 16.5% to go to catch up to the Mesa IR backend, presumably because instructions are almost perfectly mis-scheduled now.
2010-10-08 | i965: Enable attribute swizzling (repositioning) in the gen6 SF. | Eric Anholt
We were trying to remap a fully-filled array down to only handing the WM the components it uses. This is called attribute swizzling, and if you don't enable it you just get 1:1 mappings of inputs to outputs. This almost fixes glsl-routing, except for the highest gl_TexCoord[] indices.
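The remapping idea can be sketched as building a table from compacted WM input slots back to the VS outputs that feed them. This is a toy model with hypothetical names; the real SF state programming is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Fill map[] so that WM input slot k reads the k-th used VS output.
 * Returns the number of slots the WM receives. Without such a table,
 * slot k would simply read output k (the 1:1 mapping). */
static int toy_build_swizzle(const int *used, int n_outputs, int *map)
{
    int n = 0;
    assert(used != NULL && map != NULL);
    for (int i = 0; i < n_outputs; i++) {
        if (used[i])
            map[n++] = i;  /* compacted slot n comes from output i */
    }
    return n;
}
```

With outputs 0, 3 and 4 in use out of five, the WM sees three slots mapped to outputs 0, 3 and 4 in that order.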
2010-10-08 | i965: Fix new FS gen6 interpolation for sparsely-populated arrays. | Eric Anholt
We'd overwrite the same element twice.
2010-10-08 | i965: Fix gen6 WM push constants updates. | Eric Anholt
We would compute a new buffer, but never point the hardware at the new buffer. This partially fixes glsl-routing, as it now gets the updated uniform for which attribute to draw.
2010-10-08 | i965: Handle swizzles in the addition of YUV texture constants. | Eric Anholt
If someone happened to land a set in a different swizzle order, we would have hit an assertion failure.
2010-10-08 | i965: Drop the check for YUV constants in the param list. | Eric Anholt
_mesa_add_unnamed_constant() already does that.
2010-10-08 | i965: Drop the check for duplicate _mesa_add_state_reference. | Eric Anholt
_mesa_add_state_reference does that check for us anyway.
2010-10-07 | i965: Normalize cubemap coordinates like is done in the Mesa IR path. | Eric Anholt
Fixes glsl-fs-texturecube-2-*
2010-10-07 | i965: Disable emitting if () statements on gen6 until we really fix them. | Eric Anholt
2010-10-06 | i965: Fix gen6 pointsize handling to match pre-gen6. | Eric Anholt
Fixes point-line-no-cull. Bug #30532
2010-10-06 | i965: Don't assume that WPOS is always provided on gen6 in the new FS. | Eric Anholt
We sensibly only provide it if the FS asks for it. We could actually skip WPOS unless the FS needed WPOS.zw, but that's something for later. Fixes: glsl-texture2d and probably many others.
2010-10-06 | i965: Add support for gl_FrontFacing on gen6. | Eric Anholt
Fixes glsl1-gl_FrontFacing var (2) with new FS.