summaryrefslogtreecommitdiff
path: root/src/mesa/drivers/dri/i965/brw_eu.h
AgeCommit message (Collapse)Author
2010-10-26i965: Add EU code for dword scattered reads (constant buffer array indexing).Eric Anholt
2010-10-22i965: Add support for pull constants to the new FS backend.Eric Anholt
Fixes glsl-fs-uniform-array-5, but not 6 which fails in ir_to_mesa.
2010-10-21i965: Add support for register spilling.Eric Anholt
It can be tested with if (0) replaced with if (1) to force spilling for all virtual GRFs. Some simple tests work, but large texturing tests fail.
2010-10-19i965: Add EU emit support for gen6's new IF instruction with comparison.Eric Anholt
2010-10-14i965: Add support for ir_unop_round_even via the RNDE instruction.Kenneth Graunke
2010-10-14i965: Correctly emit the RNDZ instruction.Kenneth Graunke
Simply using RNDU, RNDZ, or RNDE does not produce the desired result. Rather, the RND* instructions place a value in the destination register that may be 1 less than the correct answer. They can also set per-channel "increment bits" in a flag register, which, if set, mean dest needs to be incremented by 1. A second instruction - a predicated add - completes the job. Notably, RNDD always produces the correct answer in a single instruction. Fixes piglit test glsl-fs-trunc.
2010-09-28i965: Add support for POW in gen6 FS.Eric Anholt
Fixes glsl-algebraic-pow-2 in brw_wm_glsl.c mode.
2010-08-30i965: Make brw_CONT and brw_BREAK take the pop count.Eric Anholt
We always need to set it, so pass it in.
2010-08-20i965: Also use the SIMD8 FB writes for SIMD8 mode on non-SNB.Eric Anholt
2010-08-20i965: Add AccWrCtl support on Sandybridge.Zhenyu Wang
Whenever the accumulator results are needed, this bit must be set.
2010-08-18i965: Don't set the swizzle on an immediate value in the VS.Eric Anholt
Fixes glsl-vs-if-nested (70.0 is not <= 70.000648 thanks to the swizzle bits getting set). Some safety checks are added to make sure this doesn't happen again as we increase the usage of immediate values in program generation.
2010-07-26i965: Fix reversed naming of the operations in compute-to-mrf optimization.Eric Anholt
Also fix up comments, so that the difference between the two passes is clarified.
2010-07-26i965: Move the GRF-to-MRF optimizations to brw_optimize.c.Eric Anholt
2010-07-21i965: Clean up brw_dp_READ_4_vs() now that it has fewer options to support.Eric Anholt
2010-07-21i965: Support relative addressed VS constant reads using the appropriate msg.Eric Anholt
The previous support was overly complicated by trying to use the same 1-OWORD message for both offsets.
2010-07-08i965: Add 'wait' instruction supportZhenyu Wang
When EU executes 'wait' instruction, it stalls and sets notification register state. Host can issue MMIO write to clear notification register state to allow EU continue on executing again. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2010-06-10mesa: rename src/mesa/shader/ to src/mesa/program/Brian Paul
2010-05-18i965: Remove constant or ignored-by-hw args from FF sync message setup.Eric Anholt
2010-03-12i965: Fix up VS DP4 sequences to avoid dependency control.Eric Anholt
This is recommended by the B-Spec. I wasn't able to measure any difference in ETQW.
2010-03-10i965: Use the PLN instruction when possible in interpolation.Eric Anholt
Saves an instruction in PINTERP, LINTERP, and PIXEL_W from brw_wm_glsl.c For non-GLSL it isn't used yet because the deltas have to be laid out differently.
2009-11-06i965: Use Compr4 instruction compression mode on G4X and newer.Eric Anholt
No statistically significant performance difference at n=3 with either openarena or my GL demo, but cutting program size seems like a good thing to be doing for the hypothetical app that has a working set near icache size.
2009-07-13i965: add support for new chipsetsXiang, Haihao
1. new PCI ids 2. fix some 3D commands on new chipset 3. fix send instruction on new chipset 4. new VUE vertex header 5. ff_sync message (added by Zou Nan Hai <nanhai.zou@intel.com>) 6. the offset in JMPI is in unit of 64bits on new chipset 7. new cube map layout
2009-06-30i965: use BRW_MAX_GRF, BRW_MAX_MRFBrian Paul
2009-06-26i965: fix fetching constants from constant buffer in glsl pathRoland Scheidegger
the driver used to overwrite grf0 then use implicit move by send instruction to move contents of grf0 to mrf1. However, we must not overwrite grf0 since it's still used later for fb write. Instead, do the move directly do mrf1 (we could use implicit move from another grf reg to mrf1 but since we need a mov to encode the data anyway it doesn't seem to make sense). I think the dp_READ/WRITE_16 functions may suffer from the same issue. While here also remove unnecessary msg_reg_nr parameter from the dataport functions since always message register 1 is used.
2009-05-12i965: increase BRW_EU_MAX_INSNBrian Paul
2009-04-17i915: fix broken indirect constant buffer readsBrian Paul
The READ message's msg_control value can be 0 or 1 to indicate that the Oword should be read into the lower or upper half of the target register. It seems that the other half of the register gets clobbered though. So we read into two dest registers then use a MOV to combine the upper/lower halves.
2009-04-16i965: implement relative addressing for VS constant buffer readsBrian Paul
A scatter-read should be possible, but we're just using two READs for the time being.
2009-04-14i965: fix VS constant buffer readsBrian Paul
This mostly came down to finding the right MRF incantation in the brw_dp_READ_4_vs() function. Note: this feature is still disabled (but getting close to done).
2009-04-14i965: checkpoint commit: VS constant buffersBrian Paul
Hook up a constant buffer, binding table, etc for the VS unit. This will allow using large constant buffers with vertex shaders. The new code is disabled at this time (use_const_buffer=FALSE).
2009-04-03i965: added brw_same_reg()Brian Paul
2009-04-03i965: added new brw_dp_READ_4() functionBrian Paul
Used to read float[4] vectors from the constant buffer/surface.
2009-03-13i965: more register number assertionsBrian Paul
2009-03-06i965: bump up BRW_EU_MAX_INSNBrian Paul
This is the size of the intermediate instruction buffer.
2009-02-13i965: rewrite the code for handling shader subroutine callsBrian Paul
Previously, the prog_instruction::Data field was used to map original Mesa instructions to brw instructions in order to resolve subroutine calls. This was a rather tangled mess. Plus it's an obstacle to implementing dynamic allocation/growing of the instruction buffer (it's still a fixed size). Mesa's GLSL compiler emits a label for each subroutine and CAL instruction. Now we use those labels to patch the subroutine calls after code generation has been done. We just keep a list of all CAL instructions that needs patching and a list of all subroutine labels. It's a simple matter to resolve them. This also consolidates some redundant post-emit code between brw_vs_emit.c and brw_wm_glsl.c and removes some loops that cleared the prog_instruction::Data fields at the end. Plus, a bunch of new comments.
2009-01-05i965: implement OPCODE_TRUNC (round toward zero) on vertex path.Brian Paul
Also, fix some RNDD vs. RNDZ confusion elsewhere.
2009-01-01i965: comments, clean-ups, re-order some functionsBrian Paul
2008-11-05i965: Implement missing OPCODE_NOISE3 instruction in fragment shaders.Gary Wong
OPCODE_NOISE4 coming later.
2008-10-31i965: support destination horiz strides in align1 access mode.Gary Wong
This is required for scatter writes in destination regions to work.
2008-06-21replace __inline and __inline__ with INLINE macroBrian Paul
2008-04-25[i965] short immediate values must be replicated to both halves of the dwordKeith Packard
The 32-bit immediate value in the i965 instruction word must contain two copies of any 16-bit constants. brw_imm_uw and brw_imm_w just needed to copy the value into both halves of the immediate value instruction field.
2008-01-29i965: new integrated graphics chipset supportXiang, Haihao
2007-12-29fix fd.o bug #13847Zou Nan hai
2007-09-30 fragment shader function call fix, gl_FragCoord fixZou Nan hai
2007-09-29 support continue, fix conditionalZou Nan hai
2007-06-21 support branch and loop in pixel shaderZou Nan hai
most of the sample working with some small modification
2007-04-12 Initial 965 GLSL supportZou Nan hai
2007-02-23Update DRI drivers for new glsl compiler.Brian
Mostly: - update #includes - update STATE_* token code
2007-01-06i965: Avoid branch instructions while in single program flow mode.Eric Anholt
There is an errata for Broadwater that threads don't have the instruction/loop mask stacks initialized on thread spawn. In single program flow mode, those stacks are not writable, so we can't initialize them. However, they do get read during ELSE and ENDIF instructions. So, instead, replace branch instructions in single program flow mode with predicated jumps (ADD to the ip register), avoiding use of the more complicated branch instructions that may fail. This is also a minor optimization as no ENDIF equivalent is necessary. Signed-off-by: Keith Packard <keithp@neko.keithp.com>
2006-08-09Add Intel i965G/Q DRI driver.Eric Anholt
This driver comes from Tungsten Graphics, with a few further modifications by Intel.