summaryrefslogtreecommitdiff
path: root/src/mesa/drivers/dri/i965/brw_wm.h
AgeCommit message (Collapse)Author
2011-03-15i965: Fix alpha testing when there is no color buffer in the FBO.Eric Anholt
We were alpha testing against an unwritten value, resulting in garbage. (part of) Bug #35073.
2011-01-07intel: Add a vtbl hook for determining if a format is renderable.Eric Anholt
By relying on just intel_span_supports_format, some formats that aren't supported pre-gen4 were not reporting FBO incomplete. And we also complained in stderr when it happened on i915 because draw_region gets called before framebuffer completeness validation.
2010-12-08i965: Drop KIL_NV from the ff/ARB_fp path since it was only used for GLSL.Eric Anholt
2010-12-06i965: Nuke brw_wm_glsl.c.Eric Anholt
It was only used for gen6 fragment programs (not GLSL shaders) at this point, and it was clearly unsuited to the task -- missing opcodes, corrupted texturing, and assertion failures hit various applications of all sorts. It was easier to patch up the non-glsl for remaining gen6 changes than to make brw_wm_glsl.c complete. Bug #30530
2010-12-06i965: Move payload reg setup to compile, not lookup time.Eric Anholt
Payload reg setup on gen6 depends more on the dispatch width as well as the uses_depth, computes_depth, and other flags. That's something we want to decide at compile time, not at cache lookup. As a bonus, the fragment shader program cache lookup should be cheaper now that there's less to compute for the hash key.
2010-11-14i965: Fix gl_FragCoord inversion when drawing to an FBO.Eric Anholt
This showed up as cairo-gl gradients being inverted on everyone but Intel, where I'd apparently tweaked the transformation to work around the bug. Fixes piglit fbo-fragcoord.
2010-10-19i965: Disable thread dispatch when the FS doesn't do any work.Eric Anholt
This should reduce the cost of generating shadow maps, for example. No performance difference measured in nexuiz, though it does trigger this path.
2010-10-13Drop GLcontext typedef and use struct gl_context insteadKristian Høgsberg
2010-10-11i965: Move FS backend structures to a header.Eric Anholt
It's time to start splitting some of this up.
2010-09-28i965: Fix up part of my Sandybridge attributes support patch.Eric Anholt
I confused the array sizing for number of files for the number of regs in a file.
2010-09-28i965: Add support for attribute interpolation on Sandybridge.Eric Anholt
Things are simpler these days thanks to barycentric interpolation parameters being handed in in the payload.
2010-09-21i965: Track the windowizer's dispatch for kill pixel, promoted, and OQEric Anholt
Looks like the problem was we weren't passing the depth to the render target as expected, so the chip would wedge. Fixes GPU hang in occlusion-query-discard. Bug #30097
2010-09-21i965: Share the KIL_NV implementation between glsl and non-glsl.Eric Anholt
2010-08-26i965: Start building direct GLSL2 IR to 965 assembly codegen.Eric Anholt
Our channel-expressions and vector-splitting changes now happen into a private copy of the IR that we maintain for ourselves. Uniform assignment still happens by the core, so we continue using Mesa IR generation not just for swrast fallbacks but also for uniform values (since there's no storage for their contents other than shader_program->FragmentProgram->Parameters->ParameterValues). And most importantly, at the moment no actual codegen is hooked up other than emitting our favorite color to the framebuffer.
2010-08-26i965: Add new pass to split vectors into scalar variablesEric Anholt
Combined with the previous pass, this lets other optimization passes do their work thanks to ir_tree_grafting. Still have regression in instruction count with INTEL_NEW_FS, but register count is even better.
2010-08-26i965: Add a pass for the FS to reduce vector expressions down to scalar.Eric Anholt
This is a step towards implementing a GLSL IR backend for the 965 fragment shader. Because it has downsides with the current codegen, it is hidden under the environment variable INTEL_NEW_FS. This results in an increase in instruction count at the moment (1444 -> 1752 for glsl-fs-raytrace, 345 -> 359 on my demo), because dot products are turned into a series of multiplies and adds instead of a custom expansion of MULs and MACs, and by not splitting the variable types up we don't get tree grafting and thus there are extra moves of temporary storage. However, register count drops for the non-GLSL path (64 -> 56 on my demo shader) because the register allocator sees all the sub-operations.
2010-08-26i965: Start building 965 FS backend.Eric Anholt
2010-08-20i965: Rename nr_depth_regs to nr_payload_regs.Eric Anholt
Only 8 out of the up to 13 regs are for source/dest depth, so the name wasn't particularly appropriate. Note that this doesn't count the constant or URB payload regs. Also, don't pre-divide by 2, so it's actually a number of registers.
2010-07-26Merge remote branch 'origin/master' into glsl2Eric Anholt
This pulls in multiple i965 driver fixes which will help ensure better testing coverage during development, and also gets past the conflicts of the src/mesa/shader -> src/mesa/program move. Conflicts: src/mesa/Makefile src/mesa/main/shaderapi.c src/mesa/main/shaderobj.h
2010-07-02i965: Add support for the DP2 opcode, which we use for dot(vec2, vec2).Eric Anholt
The original glsl compiler would generate a.x * b.x + a.y * b.y, which we would do mul+mul+add for instead of this mul+mac. Fixes glsl-fs-dot-vec2.
2010-06-30i965: Add support for OPCODE_SSG.Eric Anholt
The old compiler didn't use SSG, and instead emitted SGT/SGT/SUB. We can do a little better for SSG than we do for the SGT series.
2010-06-10mesa: rename src/mesa/shader/ to src/mesa/program/Brian Paul
2010-05-23i965: Fix bit allocation for number of color regions for ARB_draw_buffers.Eric Anholt
If you used all 4 color targets we currently support, we would see 0 and end up just writing the first output. Give enough bits that we can do the maximum of 16. Fixes piglit fbo-drawbuffers-maxtargets.
2010-03-22i965: Allow FS constants to be used as immediates instead of push/pull.Eric Anholt
The hope is to later take advantage of the reduced constant usage to free up regs. This only covers the GLSL path at the moment, because the brw_wm_emit path doesn't get the information as to whether a float value is a constant or a uniform.
2010-03-10i965: Add support for the CMP opcode in the GLSL path.Eric Anholt
This would be triggered by use of sqrt() along with control flow. Fixes piglit-fs-sqrt-branch and a bug in Yo Frankie!.
2010-01-26i965: Fix fp fragment.position handling and enable HW part of ARB_fcc.Eric Anholt
As with swrast, this fixes the default pixel center behavior which was broken, and implements the previous behavior for integer. Fixes piglit fp-arb-fragment-coord-conventions-none. The extension won't be exposed until we get the GLSL part implemented. The DRI1 origin_x/y parts are dropped since they're no longer relevant.
2009-11-19i965: Pack the brw_wm_prog_key better.Eric Anholt
2009-11-17Merge branch 'outputswritten64'Ian Romanick
Add a GLbitfield64 type and several macros to operate on 64-bit fields. The OutputsWritten field of gl_program is changed to use that type. This results in a fair amount of fallout in drivers that use programs. No changes are strictly necessary at this point as all bits used are below the 32-bit boundary. Fairly soon several bits will be added for clip distances written by a vertex shader. This will cause several bits used for varyings to be pushed above the 32-bit boundary. This will affect any drivers that support GLSL. At this point, only the i965 driver has been modified to support this eventuality. I did this as a "squash" merge. There were several places through the outputswritten64 branch where things were broken. I foresee this causing difficulties later for bisecting. The history is still available in the branch. Conflicts: src/mesa/drivers/dri/i965/brw_wm.h
2009-11-13i965: Share OPCODE_TXB between brw_wm_emit.c and brw_wm_glsl.cEric Anholt
This should fix TXB on G45 and older in the GLSL case.
2009-11-13i965: Share OPCODE_TEX between brw_wm_emit.c and brw_wm_glsl.c.Eric Anholt
New comments should explain some of the confusion about how this message works.
2009-11-10i965: avoid memsetting all the BRW_WM_MAX_INSN arrays for every compile.Eric Anholt
For an app that's blowing out the state cache, like sauerbraten, the memset of the giant arrays ended up taking 11% of the CPU even when only a "few" of the entries got used. With this, the WM program compile drops back down to 1% of CPU time. Bug #24981 (bisected to BRW_WM_MAX_INSN increase).
2009-11-06i965: Share min/max between brw_wm_emit.c and brw_wm_glsl.cEric Anholt
2009-11-06i965: Share emit_fb_write() between brw_wm_emit.c and brw_wm_glsl.cEric Anholt
This should fix issues with antialiased lines in GLSL.
2009-11-06i965: Share most of the WM functions between brw_wm_glsl.c and brw_wm_emit.cEric Anholt
The PINTERP code should be faster for brw_wm_glsl.c now since brw_wm_emit.c's had been improved, and pixel_w should no longer stomp on a neighbor to dst.
2009-11-06i965: Share math functions between brw_wm_glsl.c and brw_wm_emit.c.Eric Anholt
2009-11-06i965: Share the sop opcodes between brw_wm_glsl.c and brw_wm_emit.c.Eric Anholt
2009-11-06i965: Share OPCODE_MAD between brw_wm_glsl.c and brw_wm_emit.cEric Anholt
2009-11-06i965: Share the DP3, DP4, and DPH between brw_wm_glsl.c and brw_wm_emit.cEric Anholt
2009-11-06i965: Add generic GLSL code for unaliasing a 3-arg opcode, and share LRP code.Eric Anholt
2009-11-06i965: Share basic ALU ops between brw_wm_glsl and brw_wm_emit.cEric Anholt
This drops support for get_src_reg_imm in these, but the prospect of getting brw_wm_pass*.c onto our GLSL path is well worth some temporary pain.
2009-11-06i965: Collect GLSL src/dst regs up in generic code.Eric Anholt
This matches brw_wm_emit.c, which we'll be using shortly. There's a possible penalty here in that we'll allocate registers for unused channels, since we aren't doing ref tracking like brw_wm_pass*.c does. However, my measurements on GM965 don't show any for either OA or UT2004 with the GLSL path forced.
2009-10-30i965: Fix BRW_WM_MAX_INSN to reflect current limits.Eric Anholt
Part of fixing bug #24355.
2009-10-29i965: make brw_wm_prog_key a little smallerBrian Paul
GLushort is big enough for the swizzle and origin fields. The key could probably be made smaller still by re-ordering things. I'll hold off on that until after the outputswritten64 branch is merged. The key will get a little larger again with the GLbitfield64 fields.
2009-10-29i965: don't use context state in emit_fb_write()Brian Paul
Put the state that we care about in the hash key. Issue spotted by Keith Whitwell.
2009-10-29i965: use macros to get/set prog_instruction::Aux fieldBrian Paul
This makes things a bit easier to remember/understand.
2009-09-11i965: Move OPCODE_DDX/DDY to brw_wm_emit.c and make it actually work.Eric Anholt
Previously, it was trying to mess around with the varying's WM setup data to produce a result. Along with not actually working when passed a varying, this wouldn't work if you did dFd[xy]() on a temporary. Instead, just calculate the derivative using the neighbors in the subspan.
2009-08-12i965: drop dead scalar handling in GLSL.Eric Anholt
2009-08-12i965: Store the dispatch width in the WM compile struct.Eric Anholt
I'll be using this in merging brw_wm_emit.c and brw_wm_glsl.c
2009-08-12i965: Handle scalar result swizzling in shared GLSL/non-GLSL code.Eric Anholt
This is preparation for merging of brw_wm_glsl.c and brw_wm_emit.c, and glsl.c doesn't swizzle channel results around.
2009-08-05i965: Fix source depth reg setting for FSes reading and writing to depth.Eric Anholt
For some IZ setups, we'd forget to account for the source depth register being present, so we'd both read the wrong reg, and write output depth to the wrong reg. Bug #22603.