path: root/src/gallium/auxiliary/tgsi/tgsi_sse2.c
Age | Commit message | Author
2010-06-03 | tgsi: we don't support indirect input/output registers in SSE codegen yet | Brian Paul
Extend the check for indirect addressing of temp regs to include input/output regs. Fixes failure with piglit glsl-texcoord-array.shader_test test when using SSE codegen.
2010-06-03 | tgsi: whitespace cleanup | Brian Paul
2010-05-07 | tgsi: fix SOA aliasing for MUL instruction in SSE codegen | Brian Paul
Part of a fix for piglit trinity-fp1 test failure.
2010-05-06 | tgsi: make SSE ADD instruction SOA-safe | Brian Paul
To properly execute an instruction such as "ADD tmp, tmp.wzyx, foo;" with SOA we (sometimes) need to put the results into temporaries before writing the results to the destination register. This patch fixes the ADD instruction but this needs to be done for many more instructions. Helps to fix piglit fp-long-alu test (fd.o bug 27989).
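The hazard described in this commit can be sketched in plain C, outside the real SSE emitter. The names `soa_reg`, `soa_fill`, `soa_add_naive`, and `soa_add_safe` are hypothetical and invented for illustration; they are not functions from tgsi_sse2.c. The sketch assumes that an SOA register stores each channel (x, y, z, w) as a separate run of values and that an instruction is executed one destination channel at a time.

```c
#define NUM_CHANNELS 4  /* x, y, z, w */
#define QUAD_SIZE    4  /* values processed together per channel */

/* Hypothetical SOA register: one array of values per channel. */
typedef struct {
    float chan[NUM_CHANNELS][QUAD_SIZE];
} soa_reg;

/* Fill every element of each channel with the given value. */
static void soa_fill(soa_reg *r, float x, float y, float z, float w)
{
    const float v[NUM_CHANNELS] = { x, y, z, w };
    for (int c = 0; c < NUM_CHANNELS; c++)
        for (int i = 0; i < QUAD_SIZE; i++)
            r->chan[c][i] = v[c];
}

/* Naive ADD: stores each destination channel as soon as it is computed.
 * If dst is also src0 and the swizzle reads a channel that was already
 * stored, later channels see the new value instead of the old one. */
static void soa_add_naive(soa_reg *dst, const soa_reg *src0,
                          const int swz[NUM_CHANNELS], const soa_reg *src1)
{
    for (int c = 0; c < NUM_CHANNELS; c++)
        for (int i = 0; i < QUAD_SIZE; i++)
            dst->chan[c][i] = src0->chan[swz[c]][i] + src1->chan[c][i];
}

/* SOA-safe ADD: compute every channel into a temporary register first,
 * then copy the temporary to the destination in one step. */
static void soa_add_safe(soa_reg *dst, const soa_reg *src0,
                         const int swz[NUM_CHANNELS], const soa_reg *src1)
{
    soa_reg result;
    for (int c = 0; c < NUM_CHANNELS; c++)
        for (int i = 0; i < QUAD_SIZE; i++)
            result.chan[c][i] = src0->chan[swz[c]][i] + src1->chan[c][i];
    *dst = result;
}
```

With tmp = (1, 2, 3, 4), foo = (10, 20, 30, 40) and the swizzle wzyx (indices 3, 2, 1, 0), the safe version yields (14, 23, 32, 41). The naive version yields (14, 23, 53, 54), because tmp.y and tmp.x were overwritten before the z and w channels read them.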
2010-05-06 | tgsi: code refactoring | Brian Paul
2010-04-27 | tgsi: Drop BGNFOR, ENDFOR, REP, and ENDREP opcodes. | José Fonseca
2010-01-08 | Merge branch 'mesa_7_7_branch' | Brian Paul
Conflicts: src/mesa/drivers/dri/i965/brw_wm_emit.c
2010-01-07 | tgsi: fix SSE code emit for XPD | Brian Paul
Rearrange things so that the writes to the dest registers happen after we've fetched/used all src registers. The problematic instruction was: XPD TEMP[2].xyz, TEMP[0], TEMP[2]; Note that the dst reg is also a src reg. This fixes bad shading with progs/glsl/bump.c since Eric's changes to the Mesa program optimizer in commit d6690ce15fb8c7c6abf1bc0d847c1d2da2c33904. The optimizer rearranges some registers so we occasionally wind up with something like the above.
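The ordering problem with `XPD TEMP[2].xyz, TEMP[0], TEMP[2];` can be illustrated with a scalar C sketch. The functions `xpd_naive` and `xpd_safe` are hypothetical names for illustration only and do not reproduce the emitted SSE code. They assume XPD computes a three-component cross product and leaves the w component alone under an .xyz writemask.

```c
/* Naive cross product: writes each result as soon as it is computed.
 * This is wrong when dst and b are the same register. */
static void xpd_naive(float dst[4], const float a[4], const float b[4])
{
    dst[0] = a[1] * b[2] - a[2] * b[1];
    dst[1] = a[2] * b[0] - a[0] * b[2];  /* b[0] was already overwritten if dst == b */
    dst[2] = a[0] * b[1] - a[1] * b[0];  /* b[0] and b[1] were overwritten if dst == b */
}

/* Safe cross product: fetch and use every source value first,
 * then perform all destination writes at the end. */
static void xpd_safe(float dst[4], const float a[4], const float b[4])
{
    const float x = a[1] * b[2] - a[2] * b[1];
    const float y = a[2] * b[0] - a[0] * b[2];
    const float z = a[0] * b[1] - a[1] * b[0];
    dst[0] = x;
    dst[1] = y;
    dst[2] = z;
}
```

For a = (1, 2, 3) and b = (4, 5, 6), calling `xpd_safe(b, a, b)` leaves b.xyz = (-3, 6, -3), the expected cross product. Calling `xpd_naive(b, a, b)` on the same inputs leaves (-3, -15, -9).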
2010-01-07 | gallium: Fix texture sampling with explicit LOD in softpipe. | Michal Krol
2010-01-07 | gallium: Pass per-element (not per-quad) LOD bias values down to texture sampler. | Michal Krol
2010-01-05 | Remove TGSI_OPCODE_SHR, map existing usage to TGSI_OPCODE_ISHR. | Michal Krol
This is to differentiate it from its unsigned version, TGSI_OPCODE_USHR.
2009-12-14 | tgsi: add properties and system value register | Zack Rusin
adds support for properties to all parts of the tgsi framework, plus introduces a new register which will be used for system generated values.
2009-11-24 | tgsi: rename fields of tgsi_full_src_register to reduce verbosity | Keith Whitwell
SrcRegister -> Register; SrcRegisterInd -> Indirect; SrcRegisterDim -> Dimension; SrcRegisterDimInd -> DimIndirect
2009-11-24 | tgsi: rename fields of tgsi_full_dst_register to reduce verbosity | Keith Whitwell
DstRegister -> Register; DstRegisterInd -> Indirect
2009-11-24 | tgsi: rename fields of tgsi_full_declaration to reduce verbosity | Keith Whitwell
DeclarationRange -> Range
2009-11-24 | tgsi: rename fields of tgsi_full_instruction to avoid excessive verbosity | Keith Whitwell
InstructionPredicate -> Predicate; InstructionLabel -> Label; InstructionTexture -> Texture; FullSrcRegisters -> Src; FullDstRegisters -> Dst
2009-11-24 | gallium: simplify tgsi tokens further | Keith Whitwell
Drop anonymous 'Extended' fields, have every optional token named explicitly in its parent. Eg. there is now an Instruction.Label flag, etc. Drop destination modifiers and other functionality which cannot be generated by tgsi_ureg.c, which is now the primary way of creating shaders. Pull source modifiers into the source register token, drop the second negate flag. The source register token is now full - if we need to expand it, probably best to move all of the modifiers to a new token and have a single flag for it.
2009-10-23 | gallium: remove the swizzling parts of ExtSwizzle | Keith Whitwell
These haven't been used by the mesa state tracker since the conversion to tgsi_ureg, and it seems that none of the other state trackers are using it either. This helps simplify one of the biggest surprises when starting off with TGSI shaders.
2009-09-24 | Merge branch 'mesa_7_6_branch' | Brian Paul
Conflicts: src/mesa/drivers/dri/r600/r700_assembler.c, src/mesa/drivers/dri/r600/r700_chip.c, src/mesa/drivers/dri/r600/r700_render.c, src/mesa/drivers/dri/r600/r700_vertprog.c, src/mesa/drivers/dri/r600/r700_vertprog.h, src/mesa/drivers/dri/radeon/radeon_span.c
2009-09-24 | tgsi/sse: remove old comments | Brian Paul
2009-09-24 | tgsi/sse: implement SEQ, SGT, SLE, SNE | Brian Paul
2009-09-24 | tgsi/sse: Pass the lodbias, not zero. More comments. | Brian Paul
This fixes the glean/glsl1 "texture2D(), with bias" test when using SSE.
2009-09-13 | tgsi: handle some src/dst aliasing in tgsi_sse2.c | Keith Whitwell
Src/Dst aliasing (aka SOA dependencies) requires some care to ensure intermediate results do not overwrite yet-to-be read source registers. This change ensures that MOV/SWZ handle this correctly, which is poor but no worse than the current tgsi_exec.c path. Remove the fallback as there is nothing to be gained correctness-wise between the two implementations now. Fixing this properly looks like a bit of work in this code, but might be easily achieved by sending destination writes to temporary storage.
2009-09-12 | tgsi: implement saturation | Keith Whitwell
Fix recent performance regression.
2009-09-01 | tgsi: remove redundant CND0 opcode | Keith Whitwell
Can be implemented with CMP src2, src1, src0
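The equivalence can be checked with a small per-channel sketch in C. It assumes the per-channel semantics `CMP: dst = (src0 < 0) ? src1 : src2` and `CND0: dst = (src2 >= 0) ? src0 : src1`, which is how these opcodes were documented as far as can be recalled here. The function names `op_cmp`, `op_cnd0`, and `cnd0_via_cmp` are hypothetical.

```c
/* Assumed CMP semantics: select src1 when src0 is negative, else src2. */
static float op_cmp(float src0, float src1, float src2)
{
    return (src0 < 0.0f) ? src1 : src2;
}

/* Assumed CND0 semantics: select src0 when src2 is non-negative, else src1. */
static float op_cnd0(float src0, float src1, float src2)
{
    return (src2 >= 0.0f) ? src0 : src1;
}

/* CND0 expressed as CMP with the operands reordered to (src2, src1, src0). */
static float cnd0_via_cmp(float src0, float src1, float src2)
{
    return op_cmp(src2, src1, src0);
}
```

Under these assumptions the two forms agree for every ordinary condition value. A NaN condition would select different operands in the two forms, which the sketch does not attempt to handle.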
2009-08-20 | tgsi: check for SOA dependencies in SSE and PPC code generators | Brian Paul
Fall back to interpreter for now. This doesn't happen very often.
2009-08-18 | Merge branch 'mesa_7_5_branch' | Brian Paul
2009-08-18 | tgsi/sse: we don't implement saturation modes yet | Brian Paul
Fixes piglit fp-generic tests/shaders/generic/lrp_sat.fp, bug 23316.
2009-08-03 | tgsi: report opcode name in addition to the number when translation fails | Brian Paul
2009-07-31 | Rename TGSI LOOP instruction to better match their usage. | Michal Krol
The LOOP/ENDLOOP pair is renamed to BGNFOR/ENDFOR as its behaviour is similar to a C language for-loop. The BGNLOOP2/ENDLOOP2 pair is renamed to BGNLOOP/ENDLOOP as now there is no name collision.
2009-07-29 | gallium: fix SSE shadow texture instructions | Brian Paul
When sampling a 2D shadow map we need 3 texcoord components, not 2. The third component (distance from light source) is compared against the texture sample to return the result (visible vs. occluded). Also, enable proper handling of TGSI_TEXTURE_SHADOW targets in Mesa->TGSI translation. There's a possibility for breakage in gallium drivers if they fail to handle the TGSI_TEXTURE_SHADOW1D / TGSI_TEXTURE_SHADOW2D / TGSI_TEXTURE_SHADOWRECT texture targets for TGSI_OPCODE_TEX/TXP instructions, but that should be easy to fix. With these changes, progs/demos/shadowtex.c renders properly again with softpipe.
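The role of the third texcoord component can be sketched as follows. This is an illustrative model, not the softpipe or SSE sampler code: `shadow2d_sample` is a hypothetical function, the depth map is a tiny 2x2 array with nearest-texel lookup, and the comparison is assumed to be "less than or equal" (the real comparison function comes from sampler state).

```c
#define MAP_SIZE 2  /* hypothetical 2x2 depth map, stored row by row */

/* Look up the nearest texel using (s, t), then compare the third
 * texcoord component r (distance from the light) against the stored
 * depth. Returns 1.0 for visible and 0.0 for occluded. */
static float shadow2d_sample(const float *depth_map, float s, float t, float r)
{
    int x = (int)(s * MAP_SIZE);
    int y = (int)(t * MAP_SIZE);

    /* Clamp texel indices to the map edges. */
    if (x < 0) x = 0;
    if (x > MAP_SIZE - 1) x = MAP_SIZE - 1;
    if (y < 0) y = 0;
    if (y > MAP_SIZE - 1) y = MAP_SIZE - 1;

    const float stored_depth = depth_map[y * MAP_SIZE + x];
    return (r <= stored_depth) ? 1.0f : 0.0f;
}
```

With only two texcoord components there would be no `r` to compare, so the lookup could return a depth value but not a visible/occluded result.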
2009-07-23 | gallium: remove deprecated TGSI opcodes | Keith Whitwell
Various opcodes which can be implemented trivially with other TGSI opcodes, such as matrix multiplication and negation. These were not used by any state tracker or implemented by any of the drivers.
2009-07-22 | gallium: remove multiple aliases for TGSI opcodes | Keith Whitwell
This is a source of ongoing confusion. TGSI has multiple names for opcodes where the same semantics originate in multiple shader APIs. For instance, TGSI includes both Mesa/GLSL and DX/SM30 names for opcodes with the same semantics, but aliases those names to the same underlying opcode number. This makes it very difficult to visually inspect two sets of opcodes (eg in state tracker & driver) and check if they implement the same functionality. This patch arbitrarily rips out the versions of the opcodes not currently favoured by the mesa state tracker and leaves us with a single name for each distinct operation.
2009-07-22 | gallium: simplify tgsi_full_immediate struct | Keith Whitwell
Remove the need to have a pointer in this struct by just including the immediate data inline. Having a pointer in the struct introduces complications like needing to alloc/free the data pointed to, uncertainty about who owns the data, etc. There doesn't seem to be a need for it, and it is unlikely to make much difference plus or minus to performance. Added some asserts as we now will trip up on immediates with more than four elements. There were actually already quite a few such asserts, but the >4 case could be used in the future to specify indexable immediate ranges, such as lookup tables.
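The design choice can be illustrated with a simplified struct. The type `imm_inline` and the function `imm_inline_set` are hypothetical and do not match the real `tgsi_full_immediate` layout; the sketch also returns an error code for more than four elements, where the commit describes asserts.

```c
#define IMM_MAX_ELEMS 4  /* assumed limit: one four-component immediate */

/* Hypothetical immediate with its data stored inline. There is no
 * pointer, so nothing needs to be allocated or freed and a plain
 * struct copy duplicates the data. */
typedef struct {
    unsigned nr_elems;
    float value[IMM_MAX_ELEMS];
} imm_inline;

/* Copy up to IMM_MAX_ELEMS values into the immediate.
 * Returns 1 on success, 0 if count exceeds the inline capacity. */
static int imm_inline_set(imm_inline *imm, const float *data, unsigned count)
{
    if (count > IMM_MAX_ELEMS)
        return 0;
    imm->nr_elems = count;
    for (unsigned i = 0; i < count; i++)
        imm->value[i] = data[i];
    return 1;
}
```

Because the data lives inside the struct, assigning one `imm_inline` to another yields an independent copy, which sidesteps the ownership questions the commit message mentions.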
2009-07-20 | tgsi: get texturing working in vertex shader sse2 path | Keith Whitwell
2009-07-20 | tgsi: fix regression in indexed const lookups | Keith Whitwell
This function was calling get_input_base() and get_output_base() to get the names of a couple of registers to use as temps. Those functions no longer return registers, so adjust it to get the registers elsewhere. This change doesn't address the issue that it's a fairly poor way to grab a register name by calling a function with an apparently unrelated meaning.
2009-07-16 | tgsi: simplify and fix sse KIL implementation | Keith Whitwell
Use sse_movmskps to extract the correct bits of the comparison result for use in updating the killmask. Simplify some logic around identifying the set of necessary comparisons to make.
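The movmskps approach can be sketched with compiler intrinsics instead of the runtime assembler the file uses. The function `kil_update_killmask` is a hypothetical name, and the sketch assumes KIL discards an element when any tested component is negative. `_mm_cmplt_ps` and `_mm_movemask_ps` are the standard SSE intrinsics for the compare and for MOVMSKPS.

```c
#include <xmmintrin.h>  /* SSE intrinsics; requires an x86 target */

/* Each __m128 holds one component (x, y, z or w) for four elements.
 * A set bit i in the returned mask means element i is killed. */
static unsigned kil_update_killmask(unsigned killmask,
                                    __m128 x, __m128 y, __m128 z, __m128 w)
{
    const __m128 zero = _mm_setzero_ps();

    /* Each compare yields all-ones in lanes where the component is < 0. */
    __m128 neg = _mm_cmplt_ps(x, zero);
    neg = _mm_or_ps(neg, _mm_cmplt_ps(y, zero));
    neg = _mm_or_ps(neg, _mm_cmplt_ps(z, zero));
    neg = _mm_or_ps(neg, _mm_cmplt_ps(w, zero));

    /* MOVMSKPS packs the sign bit of each lane into the low four bits. */
    return killmask | (unsigned)_mm_movemask_ps(neg);
}
```

The mask extraction works because an all-ones compare result has its sign bit set, so one MOVMSKPS turns the four per-lane results into a four-bit integer that can be OR-ed into the existing kill mask.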
2009-07-16 | tgsi: initial texturing support on sse path | Keith Whitwell
Most obvious problem is drawpixels comes out blocky, but this may be an existing issue of KIL on the sse path.
2009-07-16 | tgsi: make sse function callout mechanism more generic | Keith Whitwell
Take a list of arguments rather than hardcoding TEMP_R0.
2009-07-16 | tgsi: reduce x86 reg usage in tgsi_sse generated programs | Keith Whitwell
Pass the tgsi_exec_machine struct in directly and just hold a single pointer to this struct, rather than keeping one for each of its internal members.
2009-07-16 | tgsi: make function call code in tgsi_sse.c less opaque | Keith Whitwell
Explicitly pass src and dst arguments (previously dst argument was also being used as a src). Separate argument handling from the rest of the function call emit.
2009-04-24 | tgsi: SSE code generator doesn't yet support indirect addressing of temp regs | Brian Paul
Fall back to interpreter in this case.
2009-04-10 | tgsi/sse2: Cleanup NRM/NRM4 implementation. | Michal Krol
Fix comments. Make sure .w is set to 1.0 for NRM. Optimise for non-.xyzw writemasks.
2009-04-09 | tgsi/sse2: Fix build. | Michal Krol
2009-04-09 | tgsi/sse2: Fix ARL instruction. | Michal Krol
2009-04-09 | tgsi/sse2: Fix LIT instruction. | Michal Krol
2009-02-18 | util: Move p_debug.h into util module. | José Fonseca
The debug functions depend on several util functions for OS abstractions, and these depend on debug functions, so a separate module is not possible.
2009-02-16 | gallium: fix glean's vertProg1 | Alan Hourihane
RSQ test 2 (reciprocal square root of negative value)
2009-02-10 | tgsi: Fix build -- rename Size to NrTokens. | Michal Krol
2008-11-26 | tgsi: Implement OPCODE_SSG/SGN. | Michal Krol