path: root/src/mesa/drivers/dri/i965
Age | Commit message | Author
2010-08-27 | i965: Add missing handling for BRW_OPCODE_SEL. | Eric Anholt
Fixes 4 piglit tests about min, max, and clamp.
2010-08-27 | i965: Mask out higher bits of the result of BRW_CMP producing a boolean. | Eric Anholt
When it says it sets the LSB, that's not just a hint as to where the result goes. Only the LSB is modified. Fixes 20 piglit cases.
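The masking described above can be sketched in plain C. This is only a model under stated assumptions: the function names are hypothetical, and the compare is simulated on a 32-bit word rather than emitted as a real instruction.

```c
#include <stdint.h>

/* Hypothetical model of a compare that writes only bit 0 of its
 * destination and leaves the other 31 bits as they were. */
static uint32_t cmp_less_lsb_only(uint32_t old_dst, float a, float b)
{
    uint32_t flag = (a < b) ? 1u : 0u;
    return (old_dst & ~1u) | flag;
}

/* Mask the compare result down to a clean 0/1 boolean. */
static uint32_t cmp_less_bool(uint32_t old_dst, float a, float b)
{
    return cmp_less_lsb_only(old_dst, a, b) & 1u;
}
```

Without the final AND, stale upper bits in the destination would make a "false" result read as nonzero.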
2010-08-27 | i965: Fix the types of immediate integer values. | Eric Anholt
When we're trying to do integer ops, handing a float in doesn't help.
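A small illustration of why a float immediate is the wrong thing to hand to an integer operation, assuming IEEE 754 single precision. The helper name is hypothetical and not part of the driver.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit encoding of a float immediate, which is what an
 * integer operation would see if it were handed the float as-is. */
static uint32_t float_imm_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For example, 1.0f encodes as 0x3f800000, so an integer add that was meant to use the value 1 would instead use 1065353216.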
2010-08-27 | i965: Add translation for RNDD and RNDZ. | Eric Anholt
Fixes: glsl-fs-any, glsl1-integer division with uniform var
2010-08-27 | i965: Add support for ir_binop_mod using do_mod_to_fract. | Eric Anholt
Fixes glsl-fs-mod.
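As a rough sketch of what a mod-to-fract lowering computes, assuming the usual identity mod(x, y) = y * fract(x / y): the helper names below are hypothetical, and the hand-written floor is only valid for values that fit in a long.

```c
/* floor() for finite values within the range of long; written by hand
 * so the sketch does not depend on the math library. */
static float floor_small(float x)
{
    float t = (float)(long)x;          /* truncates toward zero */
    return (t > x) ? t - 1.0f : t;     /* step down for negative fractions */
}

/* fract(x) = x - floor(x), always in [0, 1). */
static float fract_f(float x)
{
    return x - floor_small(x);
}

/* mod(x, y) rewritten in terms of fract, as a lowering pass might do. */
static float mod_via_fract(float x, float y)
{
    return y * fract_f(x / y);
}
```

With this definition, mod_via_fract(5.5f, 2.0f) gives 1.5 and mod_via_fract(-1.0f, 4.0f) gives 3.0, matching GLSL's floor-based mod.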
2010-08-27 | i965: Fix swapped instructions in ir_unop_abs and ir_unop_neg. | Eric Anholt
Fixes glsl-fs-neg and 5 other tests.
2010-08-27 | i965: Add generate() handling for AND, OR, XOR. | Eric Anholt
10 more piglit tests pass.
2010-08-27 | i965: Add support for if instructions in the new FS backend. | Eric Anholt
20 more piglit tests pass.
2010-08-27 | i965: When encountering an unknown opcode in new FS backend, print its name. | Eric Anholt
2010-08-27 | i965: Fix the maximum grf counting in the new FS backend. | Eric Anholt
glsl-algebraic-rcp-rsq managed to use 33 registers, and we claimed to only use 32, so the write to g32 would go stomping over the precious g0 of some other thread.
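One way such a miscount can arise is treating the highest register index as the register count. The sketch below is a hypothetical illustration of the correct arithmetic, not the driver's actual code: registers are numbered from g0, so a shader that touches g32 needs 33 of them.

```c
#include <assert.h>
#include <stddef.h>

/* Given the GRF indices a program touches, return how many registers
 * must be claimed: the highest index used, plus one. */
static int grf_used(const int *reg_index, size_t n)
{
    int max_index = -1;                 /* no registers used yet */
    for (size_t i = 0; i < n; i++) {
        assert(reg_index[i] >= 0);
        if (reg_index[i] > max_index)
            max_index = reg_index[i];
    }
    return max_index + 1;
}
```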
2010-08-27 | i965: Validate the IR tree after doing our custom optimization passes. | Eric Anholt
This wouldn't catch the last failure fixed in them, because we don't validate assignments well (due to the fact that we've got a pretty glaring inconsistency in how we handle assignment writemasking), but it could catch other failures we may produce.
2010-08-27 | i965: Add a bit of support for matrices to the new FS. | Eric Anholt
2010-08-27 | i965: Fix destination writemasking in the new FS. | Eric Anholt
2010-08-27 | i965: Fix swizzling in vector splitting for the new FS backend. | Eric Anholt
We weren't smearing a component of a split RHS out to reach an unsplit LHS's writemask, so gl_FragColor (always unsplit) would often get uninitialized values. Fixes: glsl-algebraic-add-add-1 (and probably many others).
2010-08-26 | i965: Add preliminary support for uniforms to the new FS backend. | Eric Anholt
+269 piglits
2010-08-26 | i965: Abort on gl_FragDepth in the new FS backend for now. | Eric Anholt
It hangs the GPU due to FB_WRITE handling being incomplete. There are bigger issues to handle first.
2010-08-26 | i965: Fix up and actually enable the NewShader and NewShaderProgram hooks. | Eric Anholt
2010-08-26 | i965: Hack in avoidance of c++ reserved keyword in libdrm. | Eric Anholt
I'm also fixing this upstream in libdrm, but this avoids a new libdrm dependency for the moment.
2010-08-26 | i965: Add GLSL IR-level source annotation and comments to new FS debug. | Eric Anholt
This should make debugging way easier, as now we have context for reading large programs.
2010-08-26 | i965: Use the implied move in brw_math() in the new FS. | Eric Anholt
2010-08-26 | i965: Add support for in varyings to the new FS codegen. | Eric Anholt
At least some tests, like glsl-vs-sign, now work.
2010-08-26 | i965: Start building the codegen visitor. | Eric Anholt
This can successfully emit a real program that generates magenta now.
2010-08-26 | i965: Start building direct GLSL2 IR to 965 assembly codegen. | Eric Anholt
Our channel-expressions and vector-splitting changes now happen in a private copy of the IR that we maintain for ourselves. Uniform assignment is still done by the core, so we continue using Mesa IR generation not just for swrast fallbacks but also for uniform values (since there's no storage for their contents other than shader_program->FragmentProgram->Parameters->ParameterValues). And most importantly, at the moment no actual codegen is hooked up other than emitting our favorite color to the framebuffer.
2010-08-26 | i965: Add new pass to split vectors into scalar variables | Eric Anholt
Combined with the previous pass, this lets other optimization passes do their work thanks to ir_tree_grafting. Still have regression in instruction count with INTEL_NEW_FS, but register count is even better.
2010-08-26 | i965: Add a pass for the FS to reduce vector expressions down to scalar. | Eric Anholt
This is a step towards implementing a GLSL IR backend for the 965 fragment shader. Because it has downsides with the current codegen, it is hidden under the environment variable INTEL_NEW_FS. This results in an increase in instruction count at the moment (1444 -> 1752 for glsl-fs-raytrace, 345 -> 359 on my demo), because dot products are turned into a series of multiplies and adds instead of a custom expansion of MULs and MACs, and by not splitting the variable types up we don't get tree grafting and thus there are extra moves of temporary storage. However, register count drops for the non-GLSL path (64 -> 56 on my demo shader) because the register allocator sees all the sub-operations.
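To make the instruction-count remark concrete, here is a sketch of what a four-component dot product looks like once it has been reduced to scalar operations. The function name is hypothetical, and plain C floats stand in for the per-channel IR expressions.

```c
/* dot(a, b) for vec4 written as a chain of scalar multiplies and adds,
 * the shape a channel-expression pass would produce: 4 MULs and 3 ADDs
 * instead of one MUL followed by three multiply-accumulates. */
static float dot4_scalar(const float a[4], const float b[4])
{
    float t0 = a[0] * b[0];
    float t1 = a[1] * b[1];
    float t2 = a[2] * b[2];
    float t3 = a[3] * b[3];
    float sum = t0 + t1;
    sum = sum + t2;
    sum = sum + t3;
    return sum;
}
```

Each temporary here corresponds to a separate scalar value the register allocator can see, which is consistent with the lower register count reported above.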
2010-08-26 | i965: Start building 965 FS backend. | Eric Anholt
2010-08-26 | i965: Add support for destination RelAddr writes in the VS. | Eric Anholt
Fixes: glsl-vs-varying-array
2010-08-26 | i965: Fix the test for variable indexing of shader inputs. | Eric Anholt
Shader inputs appear in source registers, not dst registers. Catches unsupported shaders in glsl-fs-varying-array and Humus RaytracedShadows.
2010-08-25 | i965: Fix detection of implicit MOVs to message regs in brw_optimize.c. | Eric Anholt
Texcoords in AmbientApertureLighting were getting trashed since the move of math arguments to implied moves, due to the logic for detecting ALU message reg writes overriding the logic for SEND implicit message reg writes.
2010-08-25 | i965: Remove unnecessary header. | Vinson Lee
2010-08-24 | i965: Fix printf format warnings on 32-bit builds. | Vinson Lee
2010-08-22 | i965: Fix 8-wide FB writes on gen6. | Eric Anholt
My merge of Zhenyu's patch on top of my previous patches broke it: my code expected a single simd16 write, and Zhenyu's simd8 path was disabled by mine. Merge the two for success.
2010-08-22 | i965: Fix brw_math1 with scalar argument in gen6 FS. | Eric Anholt
The docs claim two conflicting things: One, that a scalar source is supported. Two, source hstride must be 1 and width must be exec size. So splat a constant argument out into a full reg to operate on, since violating the second set of constraints is clearly failing. The alternative here might be to do a 1-wide exec on a constant argument for math1. It would probably save cycles too. But I'll leave that for the glsl2-965 branch. Fixes glsl-algebraic-div-one-2.shader_test.
2010-08-22 | i965: Fix up WM push constant setup on gen6. | Eric Anholt
Fixes glsl-algebraic-add-add-1.
2010-08-22 | i965: Use intel->gen >= 6 instead of IS_GEN6. | Eric Anholt
2010-08-20 | i965: Rename nr_depth_regs to nr_payload_regs. | Eric Anholt
Only 8 out of the up to 13 regs are for source/dest depth, so the name wasn't particularly appropriate. Note that this doesn't count the constant or URB payload regs. Also, don't pre-divide by 2, so it's actually a number of registers.
2010-08-20 | i965: Also use the SIMD8 FB writes for SIMD8 mode on non-SNB. | Eric Anholt
2010-08-20 | i965: Add support for FB writes on Sandybridge. | Zhenyu Wang
2010-08-20 | i965: Set the destination horiz stride even for da16, as SNB seems to need it. | Zhenyu Wang
2010-08-20 | i965: Set the maximum number of threads on Sandybridge. | Zhenyu Wang
2010-08-20 | i965: Add AccWrCtl support on Sandybridge. | Zhenyu Wang
Whenever the accumulator results are needed, this bit must be set.
2010-08-20 | i965: Mention the mlen and rlen for URB reads. | Zhenyu Wang
2010-08-20 | i965: Sandybridge doesn't have Compr4 mode, since it's not needed any more. | Zhenyu Wang
2010-08-20 | i965: Adjust disasm of subreg numbers to be in units of the register type. | Zhenyu Wang
This makes reading the code easier when matching up to the specs, which also use this format.
2010-08-20 | i965: Fix DP write channel ordering on Sandybridge. | Eric Anholt
The SIMD16 message no longer has the goofy interleaved format that made Compr4 compression necessary before.
2010-08-20 | i965: Fix compile warnings on 64-bit Linux. | Kenneth Graunke
format ‘%d’ expects type ‘int’, but argument 2 has type ‘long int’
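The warning quoted above is what GCC emits when a long is passed to a %d conversion. A minimal sketch of the usual fix follows; the helper name is hypothetical and the commit's actual call sites are not shown in this log.

```c
#include <stdio.h>

/* Format a long with the matching length modifier. Using "%d" here
 * would trigger the warning on targets where long is wider than int. */
static int format_long(char *buf, size_t size, long value)
{
    return snprintf(buf, size, "%ld", value);
}
```

When the value's type varies between builds (for example a pointer difference or a size), casting to a fixed type such as long and printing with %ld, or using %zu for size_t, keeps both 32-bit and 64-bit builds warning-free.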
2010-08-18 | i965: Set the if stack pop count when breaking out of a loop inside an if. | Eric Anholt
Otherwise, we might end up with the if stack pointing at the wrong place. Fixes GPU hang with glsl-vs-if-loop.
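The idea of a pop count can be modeled as follows. This is a hypothetical sketch of the bookkeeping, not the driver's data structures: a break leaves every if block opened since the innermost enclosing loop, so that many entries have to come off the if stack.

```c
#include <stddef.h>

/* Kinds of control-flow blocks on a (hypothetical) nesting stack. */
enum cf_kind { CF_LOOP, CF_IF };

/* Count the if blocks between the top of the nesting stack and the
 * innermost enclosing loop. `stack[depth - 1]` is the innermost block.
 * If there is no enclosing loop, every if on the stack is counted. */
static int if_pop_count_for_break(const enum cf_kind *stack, size_t depth)
{
    int pops = 0;
    while (depth > 0 && stack[depth - 1] == CF_IF) {
        pops++;
        depth--;
    }
    return pops;
}
```

For a break nested as if { loop { if { if { break } } } }, the count is 2: the outer if encloses the loop and is not left by the break.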
2010-08-18 | i965: Don't set the swizzle on an immediate value in the VS. | Eric Anholt
Fixes glsl-vs-if-nested (70.0 is not <= 70.000648 thanks to the swizzle bits getting set). Some safety checks are added to make sure this doesn't happen again as we increase the usage of immediate values in program generation.
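A sketch of how stray bits can nudge a float immediate, assuming IEEE 754 single precision. The helper name and the 0x55 bit pattern are assumptions chosen for illustration; the pattern happens to reproduce the value quoted in the message, but the log does not state which bits were actually set.

```c
#include <stdint.h>
#include <string.h>

/* OR extra bits into the raw encoding of a float, as would happen if
 * unrelated instruction fields overlapped the immediate's storage. */
static float or_bits_into_float(float f, uint32_t stray_bits)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits |= stray_bits;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

70.0f encodes as 0x428c0000, so OR-ing in 0x55 adds 85 units in the last place, giving roughly 70.000648.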
2010-08-17 | i965: Throw a link error when we see a "return" in main(). | Eric Anholt
We'll need to use the HALT instruction to do this right, like returns from other functions.
2010-08-17 | i965: Add support for DP2 in the VS. | Eric Anholt
Fixes glsl-vs-dot-vec2.