summaryrefslogtreecommitdiff
path: root/src/mesa/drivers/dri/i965/brw_wm_emit.c
AgeCommit message (Collapse)Author
2010-09-28i965: Add support for attribute interpolation on Sandybridge.Eric Anholt
Things are simpler these days thanks to barycentric interpolation parameters being handed in in the payload.
2010-09-21i965: Share the KIL_NV implementation between glsl and non-glsl.Eric Anholt
2010-08-22i965: Fix 8-wide FB writes on gen6.Eric Anholt
My merge of Zhenyu's patch on top of my previous patches broke it by my code expecting simd16 single write and Zhenyu's simd8 path being disabled by mine. Merge the two for success.
2010-08-22i965: Fix brw_math1 with scalar argument in gen6 FS.Eric Anholt
The docs claim two conflicting things: One, that a scalar source is supported. Two, source hstride must be 1 and width must be exec size. So splat a constant argument out into a full reg to operate on, since violating the second set of constraints is clearly failing. The alternative here might be to do a 1-wide exec on a constant argument for math1. It would probably save cycles too. But I'll leave that for the glsl2-965 branch. Fixes glsl-algebraic-div-one-2.shader_test.
2010-08-20i965: Also use the SIMD8 FB writes for SIMD8 mode on non-SNB.Eric Anholt
2010-08-20i965: Add support for FB writes on Sandybridge.Zhenyu Wang
2010-08-20i965: Fix DP write channel ordering on Sandybridge.Eric Anholt
The SIMD16 message no longer has the goofy interleaved format that made Compr4 compression necessary before.
2010-08-16i965: Use the implied move available in most brw_wm_emit brw_math() calls.Eric Anholt
This saves an extra message reg move in the program, though I'm not clear on whether it will have any performance impact other than cache footprint. It will also fix those math calls on Sandybridge, where the brw_eu_emit.c brw_math() support relies on the implied move being used.
2010-08-09i965: More s/stderr/stdout/ for program debug.Eric Anholt
2010-07-26Merge remote branch 'origin/master' into glsl2Eric Anholt
This pulls in multiple i965 driver fixes which will help ensure better testing coverage during development, and also gets past the conflicts of the src/mesa/shader -> src/mesa/program move. Conflicts: src/mesa/Makefile src/mesa/main/shaderapi.c src/mesa/main/shaderobj.h
2010-07-26i965: Fix reversed naming of the operations in compute-to-mrf optimization.Eric Anholt
Also fix up comments, so that the difference between the two passes is clarified.
2010-07-26i965: Clean up a few magic numbers to use brw_defines.h defs.Eric Anholt
2010-07-26i965: Move the GRF-to-MRF optimizations to brw_optimize.c.Eric Anholt
2010-07-26i965: Improve (i.e. remove) some grf-to-mrf unnecessary movesBenjamin Segovia
Several routines directly analyze the grf-to-mrf moves from the Gen binary code. When it is possible, the mov is removed and the message register is directly written in the arithmetic instruction Also redundant mrf-to-grf moves are removed (frequently for example, when sampling many textures with the same uv) Code was tested with piglit, warsow and nexuiz on an Ironlake machine. No regression was found there Note that the optimizations are *deactivated* on Gen4 and Gen6 since I did test them properly yet. No reason there are bugs but who knows The optimizations are currently done in branch free programs *only*. Considering branches is more complicated and there are actually two paths: one for branch free programs and one for programs with branches Also some other optimizations should be done during the emission itself but considering that some code is shader between vertex shaders (AOS) and pixel shaders (SOA) and that we may have branches or not, it is pretty hard to both factorize the code and have one good set of strategies
2010-07-02i965: Add support for the DP2 opcode, which we use for dot(vec2, vec2).Eric Anholt
The original glsl compiler would generate a.x * b.x + a.y * b.y, which we would do mul+mul+add for instead of this mul+mac. Fixes glsl-fs-dot-vec2.
2010-06-30i965: Add support for OPCODE_SSG.Eric Anholt
The old compiler didn't use SSG, and instead emitted SGT/SGT/SUB. We can do a little better for SSG than we do for the SGT series.
2010-05-14i965: Dump out the correct shared function for SEND on Ironlake.Eric Anholt
2010-04-21intel: Clean up chipset name and gen num for IronlakeZhenyu Wang
Rename old IGDNG to Ironlake, and set 'gen' number for Ironlake as 5, so tracking the features with generation num instead of special is_ironlake flag. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2010-03-22i965: Allow FS constants to be used as immediates instead of push/pull.Eric Anholt
The hope is to later take advantage of the reduced constant usage to free up regs. This only covers the GLSL path at the moment, because the brw_wm_emit path doesn't get the information as to whether a float value is a constant or a uniform.
2010-03-22i965: Optimize OPCODE_CMP by using BRW_SEL to choose results.Eric Anholt
Tested with piglit glsl-fs-sqrt-branch, fp-cmp.vpfp.
2010-03-16Revert "i965: Do FS SLT, SGT, and friends using CMP, SEL instead of CMP, ↵Eric Anholt
MOV, MOV." This reverts commit 46450c1f3f93bf4dc96696fc7e0f0eb808d9c08a. I was wrong about null reg behavior -- it reads undefined, not 0. And they're not kidding.
2010-03-12i965: Clarify the roles of emit_pixel_xy(), emit_delta_xy(), emit_wpos_xy().Eric Anholt
2010-03-12i965: Clarify that DELTAXY always occurs for both X and Y.Eric Anholt
2010-03-12i965: Do FS SLT, SGT, and friends using CMP, SEL instead of CMP, MOV, MOV.Eric Anholt
2010-03-12i965: When doing a swizzled kill pixel, don't do redundant channel compares.Eric Anholt
This was obvious when looking at the compiled output of ETQW's shaders.
2010-03-12i965: Use the SEL instruction to implement MIN and MAX.Eric Anholt
Saves an instruction over doing conditional moves.
2010-03-10i965: Use the PLN instruction when possible in interpolation.Eric Anholt
Saves an instruction in PINTERP, LINTERP, and PIXEL_W from brw_wm_glsl.c For non-GLSL it isn't used yet because the deltas have to be laid out differently.
2010-03-10i965: Add support for the CMP opcode in the GLSL path.Eric Anholt
This would be triggered by use of sqrt() along with control flow. Fixes piglit-fs-sqrt-branch and a bug in Yo Frankie!.
2010-02-19Replace the _mesa_*printf() wrappers with the plain libc versionsKristian Høgsberg
2010-01-26i965: Fix fp fragment.position handling and enable HW part of ARB_fcc.Eric Anholt
As with swrast, this fixes the default pixel center behavior which was broken, and implements the previous behavior for integer. Fixes piglit fp-arb-fragment-coord-conventions-none. The extension won't be exposed until we get the GLSL part implemented. The DRI1 origin_x/y parts are dropped since they're no longer relevant.
2010-01-08Merge branch 'mesa_7_7_branch'Brian Paul
Conflicts: src/mesa/drivers/dri/i965/brw_wm_emit.c
2010-01-06i965: fix invalid assertion in emit_xpd(), againBrian Paul
2010-01-05i965: fix invalid assertion in emit_xpd()Brian Paul
Invalid assertion found by Roel Kluin <roel.kluin@gmail.com>
2009-12-31Merge branch 'mesa_7_7_branch'Brian Paul
Conflicts: configs/darwin src/gallium/auxiliary/util/u_clear.h src/gallium/state_trackers/xorg/xorg_exa_tgsi.c src/mesa/drivers/dri/i965/brw_draw_upload.c
2009-12-28intel: Silence compiler warnings.Vinson Lee
2009-12-22intel: Replace IS_G4X() across the driver with context structure usage.Eric Anholt
Saves ~2KB of code.
2009-12-22intel: Replace IS_IGDNG checks with intel->is_ironlake or needs_ff_sync.Eric Anholt
Saves ~480 bytes of code.
2009-11-13i965: Share OPCODE_TXB between brw_wm_emit.c and brw_wm_glsl.cEric Anholt
This should fix TXB on G45 and older in the GLSL case.
2009-11-13i965: Share OPCODE_TEX between brw_wm_emit.c and brw_wm_glsl.c.Eric Anholt
New comments should explain some of the confusion about how this message works.
2009-11-13i965: Clean up emit_tex a bit.Eric Anholt
2009-11-13Merge remote branch 'origin/mesa_7_6_branch'Eric Anholt
2009-11-13i965: Clean up Ironlake sampler type definitions.Eric Anholt
They're the same regardless of execution width for 8, 4x2, and 16.
2009-11-12i965: Fix Ironlake shadow comparisons.Eric Anholt
The cube map array index arg is always present.
2009-11-06i965: Use Compr4 instruction compression mode on G4X and newer.Eric Anholt
No statistically significant performance difference at n=3 with either openarena or my GL demo, but cutting program size seems like a good thing to be doing for the hypothetical app that has a working set near icache size.
2009-11-06i965: Share min/max between brw_wm_emit.c and brw_wm_glsl.cEric Anholt
2009-11-06i965: Share emit_fb_write() between brw_wm_emit.c and brw_wm_glsl.cEric Anholt
This should fix issues with antialiased lines in GLSL.
2009-11-06i965: Share most of the WM functions between brw_wm_glsl.c and brw_wm_emit.cEric Anholt
The PINTERP code should be faster for brw_wm_glsl.c now since brw_wm_emit.c's had been improved, and pixel_w should no longer stomp on a neighbor to dst.
2009-11-06i965: Share math functions between brw_wm_glsl.c and brw_wm_emit.c.Eric Anholt
2009-11-06i965: Share the sop opcodes between brw_wm_glsl.c and brw_wm_emit.c.Eric Anholt
2009-11-06i965: Share OPCODE_MAD between brw_wm_glsl.c and brw_wm_emit.cEric Anholt