Age | Commit message (Collapse) | Author |
|
Extend the check for indirect addressing of temp regs to include
input/output regs.
Fixes failure with piglit glsl-texcoord-array.shader_test test when using
SSE codegen.
|
|
|
|
Part of a fix for piglit trinity-fp1 test failure.
|
|
To properly execute an instruction such as "ADD tmp, tmp.wzyx, foo;"
with SOA we (sometimes) need to put the results into temporaries before
writing the results to the destination register.
This patch fixes the ADD instruction but this needs to be done for
many more instructions.
Helps to fix piglit fp-long-alu test (fd.o bug 27989).
|
|
|
|
|
|
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_emit.c
|
|
Rearrange things so that the writes to the dest registers happen
after we've fetched/used all src registers.
The problematic instruction was: XPD TEMP[2].xyz, TEMP[0], TEMP[2];
Note that the dst reg is also a src reg.
This fixes bad shading with progs/glsl/bump.c since Eric's changes to the
Mesa program optimizer in commit d6690ce15fb8c7c6abf1bc0d847c1d2da2c33904.
The optimizer rearranges some registers so we occasionally wind up with
something like the above.
|
|
|
|
sampler.
|
|
This is to differentiate it from its unsigned version, TGSI_OPCODE_USHR.
|
|
adds support for properties to all parts of the tgsi framework, plus
introduces a new register which will be used for system generated
values.
|
|
SrcRegister -> Register
SrcRegisterInd -> Indirect
SrcRegisterDim -> Dimension
SrcRegisterDimInd -> DimIndirect
|
|
DstRegister -> Register
DstRegisterInd -> Indirect
|
|
DeclarationRange -> Range
|
|
InstructionPredicate -> Predicate
InstructionLabel -> Label
InstructionTexture -> Texture
FullSrcRegisters -> Src
FullDstRegisters -> Dst
|
|
Drop anonymous 'Extended' fields, have every optional token named
explicitly in its parent. Eg. there is now an Instruction.Label flag,
etc.
Drop destination modifiers and other functionality which cannot be
generated by tgsi_ureg.c, which is now the primary way of creating
shaders.
Pull source modifiers into the source register token, drop the second
negate flag. The source register token is now full - if we need to
expand it, probably best to move all of the modifiers to a new token
and have a single flag for it.
|
|
These haven't been used by the mesa state tracker since the
conversion to tgsi_ureg, and it seems that none of the
other state trackers are using it either.
This helps simplify one of the biggest suprises when starting off with
TGSI shaders.
|
|
Conflicts:
src/mesa/drivers/dri/r600/r700_assembler.c
src/mesa/drivers/dri/r600/r700_chip.c
src/mesa/drivers/dri/r600/r700_render.c
src/mesa/drivers/dri/r600/r700_vertprog.c
src/mesa/drivers/dri/r600/r700_vertprog.h
src/mesa/drivers/dri/radeon/radeon_span.c
|
|
|
|
|
|
This fixes the glean/glsl1 "texture2D(), with bias" test when using SSE.
|
|
Src/Dst aliasing (aka SOA dependencies) requires some care to ensure
intermediate results do not overwrite yet-to-be read source registers.
This change ensures that MOV/SWZ handle this correctly, which is poor but
no worse than the current tgsi_exec.c path. Remove the fallback as there
is nothing to be gained correctness-wise between the two implementations now.
Fixing this properly looks like a bit of work in this code, but might be
easily achieved by sending destination writes to temporary storage.
|
|
Fix recent performance regression.
|
|
Can be implemented with CMP src2, src1, src0
|
|
Fall back to interpreter for now. This doesn't happen very often.
|
|
|
|
Fixes piglit fp-generic tests/shaders/generic/lrp_sat.fp, bug 23316.
|
|
|
|
The LOOP/ENDLOOP pair is renamed to BGNFOR/ENDFOR as its behaviour
is similar to a C language for-loop.
The BGNLOOP2/ENDLOOP2 pair is renamed to BGNLOOP/ENDLOOP as now
there is no name collision.
|
|
When sampling a 2D shadow map we need 3 texcoord components, not 2.
The third component (distance from light source) is compared against
the texture sample to return the result (visible vs. occluded).
Also, enable proper handling of TGSI_TEXTURE_SHADOW targets in Mesa->TGSI
translation. There's a possibility for breakage in gallium drivers if
they fail to handle the TGSI_TEXTURE_SHADOW1D / TGSI_TEXTURE_SHADOW2D /
TGSI_TEXTURE_SHADOWRECT texture targets for TGSI_OPCODE_TEX/TXP instructions,
but that should be easy to fix.
With these changes, progs/demos/shadowtex.c renders properly again with
softpipe.
|
|
Various opcodes which can be implemented trivially with other TGSI opcodes,
such as matrix multiplication and negation. These were not used by any
state tracker or implemented by any of the drivers.
|
|
This is a source of ongoing confusion. TGSI has multiple names for
opcodes where the same semantics originate in multiple shader APIs.
For instance, TGSI includes both Mesa/GLSL and DX/SM30 names for
opcodes with the same semantics, but aliases those names to the same
underlying opcode number.
This makes it very difficult to visually inspect two sets of opcodes
(eg in state tracker & driver) and check if they implement the same
functionality.
This patch arbitarily rips out the versions of the opcodes not currently
favoured by the mesa state tracker and leaves us with a single name
for each distinct operation.
|
|
Remove the need to have a pointer in this struct by just including
the immediate data inline. Having a pointer in the struct introduces
complications like needing to alloc/free the data pointed to, uncertainty
about who owns the data, etc. There doesn't seem to be a need for it,
and it is unlikely to make much difference plus or minus to performance.
Added some asserts as we now will trip up on immediates with more
than four elements. There were actually already quite a few such asserts,
but the >4 case could be used in the future to specify indexable immediate
ranges, such as lookup tables.
|
|
|
|
This function was calling get_input_base() and get_output_base() to
get the names of a couple of register to use as temps. Those
functions no longer return registers, so adjust it to get the
registers elsewhere.
This change doesn't address the issue that it's a fairly poor way to
grab a register name by calling a function with an apparently
unrelated meaning.
|
|
Use sse_movmskps to extract the correct bits of the comparison result
for use in updating the killmask. Simplify some logic around
identifying the set of necessary comparisons to make.
|
|
Most obvious problem is drawpixels comes out blocky, but this may be
an existing issue of KIL on the sse path.
|
|
Take a list of arguments rather than hardcoding TEMP_R0.
|
|
Pass the tgsi_exec_machine struct in directly and just hold a single
pointer to this struct, rather than keeping one for each of its
internal members.
|
|
Explictly pass src and dst arguments (previously dst argument was also
being used as a src). Separate argument handling from the rest of
the function call emit.
|
|
Fall back to interpreter in this case.
|
|
Fix comments.
Make sure .w is set to 1.0 for NRM.
Optimise for non-.xyzw writemasks.
|
|
|
|
|
|
|
|
The debug functions depend on several util function for os abstractions, and
these depend on debug functions, so a seperate module is not possible.
|
|
RSQ test 2 (reciprocal square toot of negative value)
|
|
|
|
|