path: root/src/gallium/auxiliary/tgsi/tgsi_sse2.c
Age | Commit message | Author
2009-11-24tgsi: rename fields of tgsi_full_instruction to avoid excessive verbosityKeith Whitwell
InstructionPredicate -> Predicate InstructionLabel -> Label InstructionTexture -> Texture FullSrcRegisters -> Src FullDstRegisters -> Dst
2009-11-24gallium: simplify tgsi tokens furtherKeith Whitwell
Drop anonymous 'Extended' fields, have every optional token named explicitly in its parent. Eg. there is now an Instruction.Label flag, etc. Drop destination modifiers and other functionality which cannot be generated by tgsi_ureg.c, which is now the primary way of creating shaders. Pull source modifiers into the source register token, drop the second negate flag. The source register token is now full - if we need to expand it, probably best to move all of the modifiers to a new token and have a single flag for it.
2009-10-23gallium: remove the swizzling parts of ExtSwizzleKeith Whitwell
These haven't been used by the mesa state tracker since the conversion to tgsi_ureg, and it seems that none of the other state trackers are using them either. This helps remove one of the biggest surprises when starting off with TGSI shaders.
2009-09-24Merge branch 'mesa_7_6_branch'Brian Paul
Conflicts: src/mesa/drivers/dri/r600/r700_assembler.c src/mesa/drivers/dri/r600/r700_chip.c src/mesa/drivers/dri/r600/r700_render.c src/mesa/drivers/dri/r600/r700_vertprog.c src/mesa/drivers/dri/r600/r700_vertprog.h src/mesa/drivers/dri/radeon/radeon_span.c
2009-09-24tgsi/sse: remove old commentsBrian Paul
2009-09-24tgsi/sse: implement SEQ, SGT, SLE, SNEBrian Paul
2009-09-24tgsi/sse: Pass the lodbias, not zero. More comments.Brian Paul
This fixes the glean/glsl1 "texture2D(), with bias" test when using SSE.
2009-09-13tgsi: handle some src/dst aliasing in tgsi_sse2.cKeith Whitwell
Src/Dst aliasing (aka SOA dependencies) requires some care to ensure intermediate results do not overwrite yet-to-be read source registers. This change ensures that MOV/SWZ handle this correctly, which is poor but no worse than the current tgsi_exec.c path. Remove the fallback as there is nothing to be gained correctness-wise between the two implementations now. Fixing this properly looks like a bit of work in this code, but might be easily achieved by sending destination writes to temporary storage.
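The hazard described above can be illustrated with a small scalar sketch. The names `vec4`, `swizzle_naive`, and `swizzle_via_temp` are hypothetical and do not come from tgsi_sse2.c; the sketch assumes results are written one channel at a time and shows the "send destination writes to temporary storage" idea the message mentions, not the fix that was actually committed.

```c
/* Hypothetical 4-channel register; not a type from tgsi_sse2.c. */
typedef struct { float chan[4]; } vec4;

/* Channel-at-a-time swizzled move. When dst and src are the same
 * register, an early channel write can clobber a channel that a
 * later iteration still needs to read. */
static void swizzle_naive(vec4 *dst, const vec4 *src, const int swz[4])
{
    for (int i = 0; i < 4; i++)
        dst->chan[i] = src->chan[swz[i]];
}

/* Same operation, but results are staged in a temporary and only
 * committed after every source channel has been read. */
static void swizzle_via_temp(vec4 *dst, const vec4 *src, const int swz[4])
{
    vec4 tmp;
    for (int i = 0; i < 4; i++)
        tmp.chan[i] = src->chan[swz[i]];
    *dst = tmp;
}
```

With a register holding (1, 2, 3, 4) and the swizzle .yxwz applied in place, the naive version loses the original x and z values, while the staged version produces (2, 1, 4, 3).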
2009-09-12tgsi: implement saturationKeith Whitwell
Fix recent performance regression.
2009-09-01tgsi: remove redundant CND0 opcodeKeith Whitwell
Can be implemented with CMP src2, src1, src0
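A per-channel sketch of why the reordered CMP is equivalent. The semantics used here are an assumption, not stated in this log: CMP selects src1 when src0 is negative and src2 otherwise, and CND0 selects src0 when src2 is non-negative and src1 otherwise. The function names are hypothetical.

```c
/* Assumed per-channel CMP semantics: (src0 < 0) ? src1 : src2. */
static float cmp_channel(float src0, float src1, float src2)
{
    return (src0 < 0.0f) ? src1 : src2;
}

/* Assumed per-channel CND0 semantics: (src2 >= 0) ? src0 : src1. */
static float cnd0_channel(float src0, float src1, float src2)
{
    return (src2 >= 0.0f) ? src0 : src1;
}

/* CND0 expressed as CMP with the operand order reversed. */
static float cnd0_via_cmp(float src0, float src1, float src2)
{
    return cmp_channel(src2, src1, src0);
}
```

Under these assumptions the two forms agree for every non-NaN selector; a NaN selector fails both comparisons and therefore picks a different operand in each form.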
2009-08-20tgsi: check for SOA dependencies in SSE and PPC code generatorsBrian Paul
Fall back to interpreter for now. This doesn't happen very often.
2009-08-18Merge branch 'mesa_7_5_branch'Brian Paul
2009-08-18tgsi/sse: we don't implement saturation modes yetBrian Paul
Fixes piglit fp-generic tests/shaders/generic/lrp_sat.fp, bug 23316.
2009-08-03tgsi: report opcode name in addition to the number when translation failsBrian Paul
2009-07-31Rename TGSI LOOP instructions to better match their usage.Michal Krol
The LOOP/ENDLOOP pair is renamed to BGNFOR/ENDFOR as its behaviour is similar to a C language for-loop. The BGNLOOP2/ENDLOOP2 pair is renamed to BGNLOOP/ENDLOOP as now there is no name collision.
2009-07-29gallium: fix SSE shadow texture instructionsBrian Paul
When sampling a 2D shadow map we need 3 texcoord components, not 2. The third component (distance from light source) is compared against the texture sample to return the result (visible vs. occluded). Also, enable proper handling of TGSI_TEXTURE_SHADOW targets in Mesa->TGSI translation. There's a possibility for breakage in gallium drivers if they fail to handle the TGSI_TEXTURE_SHADOW1D / TGSI_TEXTURE_SHADOW2D / TGSI_TEXTURE_SHADOWRECT texture targets for TGSI_OPCODE_TEX/TXP instructions, but that should be easy to fix. With these changes, progs/demos/shadowtex.c renders properly again with softpipe.
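A minimal sketch of why a 2D shadow lookup needs three texcoord components: two select the texel and the third is the reference depth. Everything here is illustrative. The helper name, the 2x2 map, nearest sampling, and the "less than or equal" compare function are assumptions, not code from the SSE path.

```c
#define MAP_SIZE 2

/* Hypothetical nearest-neighbour 2D shadow lookup.
 * texcoord[0], texcoord[1]: position in the shadow map, in [0, 1].
 * texcoord[2]: distance from the light, compared against the stored depth.
 * Returns 1.0 (visible) or 0.0 (occluded), assuming a <= compare. */
static float sample_shadow_2d(const float map[MAP_SIZE][MAP_SIZE],
                              const float texcoord[3])
{
    int x = (int)(texcoord[0] * MAP_SIZE);
    int y = (int)(texcoord[1] * MAP_SIZE);

    /* Clamp so that a coordinate of exactly 1.0 stays inside the map. */
    if (x < 0) x = 0;
    if (y < 0) y = 0;
    if (x > MAP_SIZE - 1) x = MAP_SIZE - 1;
    if (y > MAP_SIZE - 1) y = MAP_SIZE - 1;

    return (texcoord[2] <= map[y][x]) ? 1.0f : 0.0f;
}
```

Passing only two components would leave nothing to compare against the sampled depth, which is the failure mode the commit describes.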
2009-07-23gallium: remove deprecated TGSI opcodesKeith Whitwell
Various opcodes which can be implemented trivially with other TGSI opcodes, such as matrix multiplication and negation. These were not used by any state tracker or implemented by any of the drivers.
2009-07-22gallium: remove multiple aliases for TGSI opcodesKeith Whitwell
This is a source of ongoing confusion. TGSI has multiple names for opcodes where the same semantics originate in multiple shader APIs. For instance, TGSI includes both Mesa/GLSL and DX/SM30 names for opcodes with the same semantics, but aliases those names to the same underlying opcode number. This makes it very difficult to visually inspect two sets of opcodes (eg in state tracker & driver) and check if they implement the same functionality. This patch arbitrarily rips out the versions of the opcodes not currently favoured by the mesa state tracker and leaves us with a single name for each distinct operation.
2009-07-22gallium: simplify tgsi_full_immediate structKeith Whitwell
Remove the need to have a pointer in this struct by just including the immediate data inline. Having a pointer in the struct introduces complications like needing to alloc/free the data pointed to, uncertainty about who owns the data, etc. There doesn't seem to be a need for it, and it is unlikely to make much difference plus or minus to performance. Added some asserts as we now will trip up on immediates with more than four elements. There were actually already quite a few such asserts, but the >4 case could be used in the future to specify indexable immediate ranges, such as lookup tables.
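The trade-off can be sketched with two simplified layouts. These structs and the `imm_inline_set` helper are hypothetical stand-ins, not the real `tgsi_full_immediate` definition; they only illustrate how inlining the data removes the allocation and ownership questions and why an assert on the element count becomes necessary.

```c
#include <assert.h>
#include <string.h>

#define IMM_MAX_ELEMS 4

/* Hypothetical "before": someone must allocate, free, and own data. */
struct imm_with_pointer {
    unsigned nr;
    float *data;
};

/* Hypothetical "after": the data lives inside the struct itself. */
struct imm_inline {
    unsigned nr;
    float data[IMM_MAX_ELEMS];
};

/* Fill an inline immediate; more than four elements cannot be stored,
 * so that case trips an assert. */
static void imm_inline_set(struct imm_inline *imm,
                           const float *values, unsigned nr)
{
    assert(nr <= IMM_MAX_ELEMS);
    imm->nr = nr;
    memcpy(imm->data, values, nr * sizeof values[0]);
}
```

A plain struct assignment now copies the immediate data as well, so a copy never shares storage with the original.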
2009-07-20tgsi: get texturing working in vertex shader sse2 pathKeith Whitwell
2009-07-20tgsi: fix regression in indexed const lookupsKeith Whitwell
This function was calling get_input_base() and get_output_base() to get the names of a couple of registers to use as temps. Those functions no longer return registers, so adjust it to get the registers elsewhere. This change doesn't address the issue that calling a function with an apparently unrelated meaning is a fairly poor way to grab a register name.
2009-07-16tgsi: simplify and fix sse KIL implementationKeith Whitwell
Use sse_movmskps to extract the correct bits of the comparison result for use in updating the killmask. Simplify some logic around identifying the set of necessary comparisons to make.
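A portable scalar sketch of the idea, emulating what the cmpltps and movmskps instructions compute rather than emitting them. It assumes one register holds the same channel for four fragments and that a fragment is killed when any tested channel is negative; the function names are hypothetical.

```c
#include <stdint.h>

/* Emulates a packed "less than zero" compare: each lane becomes
 * all ones where the input is negative and zero elsewhere. */
static void cmplt_zero4(const float chan[4], uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (chan[i] < 0.0f) ? 0xFFFFFFFFu : 0u;
}

/* Emulates movmskps: gathers the top bit of each 32-bit lane
 * into bits 0..3 of the result. */
static unsigned movmsk4(const uint32_t lanes[4])
{
    unsigned mask = 0;
    for (int i = 0; i < 4; i++)
        mask |= (unsigned)(lanes[i] >> 31) << i;
    return mask;
}

/* Folds one channel's comparison result into the running killmask. */
static unsigned kil_accumulate(unsigned killmask, const float chan[4])
{
    uint32_t cmp[4];
    cmplt_zero4(chan, cmp);
    return killmask | movmsk4(cmp);
}
```

Calling `kil_accumulate` once per channel that needs testing leaves one bit per fragment in the killmask, which is why only the necessary comparisons have to be made.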
2009-07-16tgsi: initial texturing support on sse pathKeith Whitwell
Most obvious problem is drawpixels comes out blocky, but this may be an existing issue of KIL on the sse path.
2009-07-16tgsi: make sse function callout mechanism more genericKeith Whitwell
Take a list of arguments rather than hardcoding TEMP_R0.
2009-07-16tgsi: reduce x86 reg usage in tgsi_sse generated programsKeith Whitwell
Pass the tgsi_exec_machine struct in directly and just hold a single pointer to this struct, rather than keeping one for each of its internal members.
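As a rough illustration of the single-pointer approach, the sketch below reaches a member through one base pointer plus a compile-time offset. `struct machine` and `temp_slot` are hypothetical stand-ins; the real `tgsi_exec_machine` layout is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical stand-in for the machine state. */
struct machine {
    float inputs[4];
    float outputs[4];
    float temps[8];
};

/* With only the base pointer held in a register, each member is
 * addressed as base + offsetof(member), so no extra register is
 * needed per member. */
static float *temp_slot(struct machine *m, unsigned index)
{
    char *base = (char *)m;
    return (float *)(base + offsetof(struct machine, temps)) + index;
}
```

In generated code the same offsets would become displacements on the single base register, which is what frees the other x86 registers.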
2009-07-16tgsi: make function call code in tgsi_sse.c less opaqueKeith Whitwell
Explicitly pass src and dst arguments (previously dst argument was also being used as a src). Separate argument handling from the rest of the function call emit.
2009-04-24tgsi: SSE code generator doesn't yet support indirect addressing of temp regsBrian Paul
Fall back to interpreter in this case.
2009-04-10tgsi/sse2: Cleanup NRM/NRM4 implementation.Michal Krol
Fix comments. Make sure .w is set to 1.0 for NRM. Optimise for non-.xyzw writemasks.
2009-04-09tgsi/sse2: Fix build.Michal Krol
2009-04-09tgsi/sse2: Fix ARL instruction.Michal Krol
2009-04-09tgsi/sse2: Fix LIT instruction.Michal Krol
2009-02-18util: Move p_debug.h into util module.José Fonseca
The debug functions depend on several util functions for os abstractions, and these depend on debug functions, so a separate module is not possible.
2009-02-16gallium: fix glean's vertProg1Alan Hourihane
RSQ test 2 (reciprocal square root of negative value)
2009-02-10tgsi: Fix build -- rename Size to NrTokens.Michal Krol
2008-11-26tgsi: Implement OPCODE_SSG/SGN.Michal Krol
2008-11-26tgsi: Implement OPCODE_ARR.Michal Krol
2008-11-26tgsi: Implement OPCODE_ROUND for SSE2 backend.Michal Krol
2008-11-12tgsi: Fix a bug with saving/restoring xmm registers upon func call.Michal Krol
2008-11-09gallium: use PIPE_ARCH_SSE to protect use of SSE intrinsics onlyBrian
This allows us to use SSE codegen with debug builds again. When PIPE_ARCH_SSE is set (w/ gcc -msse -msse2) we will also use the gcc SSE intrinsic functions.
2008-11-08gallium: implement SSE codegen for TGSI_OPCODE_NRM/NRM4Brian
2008-11-07gallium: added SSE for DP2, DP2ABrian Paul
2008-11-05Merge commit 'origin/gallium-0.1' into gallium-0.2Brian Paul
Conflicts: src/gallium/auxiliary/rtasm/rtasm_execmem.c src/mesa/shader/slang/slang_emit.c src/mesa/shader/slang/slang_log.c src/mesa/state_tracker/st_atom_framebuffer.c
2008-11-05gallium: call tgsi_set_exec_mask() and use exec mask in SSE ARL codeBrian Paul
This prevents vertex shaders from referencing invalid memory locations when the shader is operating on less than four vertices or fragments.
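A scalar sketch of the masking idea, assuming ARL stores the floor of each source value as an address and that bit i of the exec mask marks lane i as active. The function name and the choice of zero for inactive lanes are assumptions for illustration, not the generated SSE code.

```c
/* Hypothetical masked ARL: only active lanes get an address derived
 * from the source; inactive lanes get a safe index of 0, so leftover
 * garbage in unused lanes never turns into a memory reference. */
static void arl_masked(const float src[4], unsigned exec_mask, int addr[4])
{
    for (int i = 0; i < 4; i++) {
        if ((exec_mask >> i) & 1u) {
            int t = (int)src[i];      /* truncates toward zero */
            if ((float)t > src[i])    /* adjust negatives to get floor */
                t--;
            addr[i] = t;
        } else {
            addr[i] = 0;
        }
    }
}
```

With two live vertices the mask is 0x3, so lanes 2 and 3 resolve to index 0 regardless of what their source registers contain.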
2008-11-05tgsi: Implement OPCODE_TRUNC.michal
2008-11-05tgsi: Implement OPCODE_TRUNC.michal
2008-10-07gallium: Introduce PIPE_ARCH_SSE define for SSE support.José Fonseca
Besides meaning x86 and x86-64 architecture, it also depends on SSE2 support enabled on gcc. This fixes the linux-debug build.
2008-10-01tgsi: Include p_config.h.José Fonseca
2008-09-30cell: Moved X86 checks to wrap #include section so that Cell targets will compile again.Jonathan White
2008-09-30tgsi: SSE2 optimized exp2, log2 and pow implementations.José Fonseca
Special care must be taken when calling compiler-generated SSE2 functions from the runtime-generated SSE2 code: saving the xmm registers, and notifying gcc that the stack is not 16-byte aligned. It would be more efficient to keep the stack pointer 16-byte aligned, but that is too hairy, and not consistent across all x86 architectures. This has been tested in linux x86 and windows x86 userspace. Not tested on x86-64 because it is broken for other reasons (even without this change).
2008-09-08tgsi: Cleanup code.Michal Krol