Age | Commit message | Author |
|
We've been using these in the linear path for a while now. Based on
Chris's SSSE3 code, but using only SSE2 opcodes. Speed seems to be
identical, but code is simpler & removes the dependency on SSSE3.
Should be easier to extend to other rgba8 formats.
|
|
Specifically, can do early-depth-test even when alphatest or
kill-pixel are active, provided we defer the actual z write until the
final mask is available.
Improves demos/fire.c especially in the case where you get close to
the trees.
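
A minimal sketch of the deferred z write described above, assuming a 4-pixel quad, a "less than" depth function, and a bitmask of pixels killed by alphatest or kill-pixel. The function name and the kill_mask parameter are hypothetical and are not taken from the llvmpipe source:

```c
#define QUAD_SIZE 4

/*
 * Hypothetical helper: depth-test a quad before shading, but only write
 * z after the shader's kill/alphatest result is known.
 * kill_mask has bit i set when the shader discards pixel i (assumption).
 * Returns the final coverage mask.
 */
static unsigned
shade_quad_deferred_z(float zbuf[QUAD_SIZE],
                      const float frag_z[QUAD_SIZE],
                      unsigned kill_mask)
{
   unsigned mask = 0;
   int i;

   /* Early depth test: compare only, do not write yet. */
   for (i = 0; i < QUAD_SIZE; i++) {
      if (frag_z[i] < zbuf[i])
         mask |= 1u << i;
   }

   /* Nothing visible: the shader does not need to run at all. */
   if (mask == 0)
      return 0;

   /* The fragment shader would run here; alphatest/kill clear bits. */
   mask &= ~kill_mask;

   /* Deferred z write, using the final mask. */
   for (i = 0; i < QUAD_SIZE; i++) {
      if (mask & (1u << i))
         zbuf[i] = frag_z[i];
   }

   return mask;
}
```

The point of the split is that occluded quads skip the shader entirely, while killed pixels still never reach the depth buffer.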
|
|
Don't branch more than once in quick succession. Don't branch at the
end of the shader.
|
|
Avoid unnecessary masking of non-existent stencil component.
|
|
Better than GALLIVM_DEBUG if you're only interested in fragment shaders.
|
|
Don't try to emit our own phis, let LLVM's mem2reg do it for us.
|
|
Don't calculate 1/w for quads which aren't visible...
|
|
The current interpolation scheme causes precision loss.
Changing the operation order helps, but does not completely avoid the
problem.
The only short term solution is to clamp z to 1.0.
This is unfortunate, but probably unavoidable until interpolation is
improved.
|
|
Avoid multiplying fixed-point values. Calculate triangle area in
floating point and use that for culling.
Lift area calculations up a level as we are already doing this in the
triangle_both() case.
Would like to share the calculated area with attribute interpolation,
but the way the code is structured makes this difficult.
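
A rough sketch of culling on a floating-point triangle area, as described above. The types, names, and the winding convention (positive area means counter-clockwise with y pointing up) are assumptions for illustration, not the driver's actual setup code:

```c
#include <stdbool.h>

/* Hypothetical 2D position type. */
struct vec2 {
   float x, y;
};

/* Hypothetical cull modes. */
enum cull_mode {
   CULL_NONE,
   CULL_CW,
   CULL_CCW
};

/* Twice the signed area of triangle (a, b, c), computed in float. */
static float
tri_signed_area2(struct vec2 a, struct vec2 b, struct vec2 c)
{
   return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

/* Decide culling from the sign of the area alone. */
static bool
tri_is_culled(float area2, enum cull_mode mode)
{
   if (area2 == 0.0f)
      return true;               /* degenerate: nothing to draw */
   if (mode == CULL_CCW)
      return area2 > 0.0f;       /* assumed: positive area is CCW */
   if (mode == CULL_CW)
      return area2 < 0.0f;
   return false;
}
```

Computing the area once and passing it down lets every triangle path share the same culling decision.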
|
|
these aren't used anywhere, so just waste memory.
|
|
we should be checking output array not input to decide.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
we want to use the format from the sampler view which isn't always the
same as the texture format when creating sampler views.
|
|
interp data is stored in gpr0 so first interp overwrote it
and subsequent ones got wrong values
reserve register 0 so it's not used for attribs.
alternative is to interpolate attrib0 last (reverse, as r600c does)
|
|
Only cosmetic changes. No actual practical difference.
|
|
Q coordinate's coefficients also need to be multiplied by w, otherwise
it will have 1/w, causing problems with TXP.
|
|
Once a fragment shader is generated with LP_INTERP_PERSPECTIVE set for an
input, it will do a divide by w for that input. Therefore it's not OK to
treat LP_INTERP_PERSPECTIVE as LP_INTERP_LINEAR or vice-versa, even if the
attribute is known to not vary.
A better strategy would be to take the primitive into consideration when
generating the fragment shader key, and therefore avoid the per-fragment
perspective divide.
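
A simplified sketch of why the two modes cannot be swapped, assuming setup pre-scales a perspective attribute by 1/w and the shader undoes that per fragment by dividing by the interpolated 1/w. The enum and function names are hypothetical stand-ins, not the LP_INTERP_* implementation:

```c
/* Hypothetical stand-ins for the LP_INTERP_* modes. */
enum interp_mode {
   INTERP_LINEAR,
   INTERP_PERSPECTIVE
};

/* Setup side: the value whose plane coefficients get interpolated. */
static float
setup_value(float attr, float w, enum interp_mode mode)
{
   return mode == INTERP_PERSPECTIVE ? attr / w : attr;
}

/*
 * Shader side: the mode is baked into the generated shader, so the
 * perspective correction happens whether or not the attribute varies.
 * oow is the interpolated 1/w (assumption of this sketch).
 */
static float
shade_value(float interpolated, float oow, enum interp_mode mode)
{
   return mode == INTERP_PERSPECTIVE ? interpolated / oow : interpolated;
}
```

With attr = 3 and w = 2, matching modes return 3 on both paths, while linear setup fed into a perspective shader gives 3 / 0.5 = 6, even though the attribute is constant.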
|
|
this sets the stencil up for evergreen properly.
|
|
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
Since flush rework there could be only one relocation per
register in a block.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
Got a speed up by tracking the dirty blocks in a separate list instead of
looping through all blocks. This version should work with blocks that get
their dirty state disabled again, and I added a dirty check during the
flush as some blocks were already dirty.
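
One way the dirty-list idea could look, as a sketch only. The struct layout, the queued flag, and all function names are assumptions for illustration and do not come from the r600 winsys code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register block. */
struct block {
   bool dirty;                /* needs to be emitted at flush */
   bool queued;               /* already linked into the dirty list */
   struct block *next_dirty;
};

/* Hypothetical context holding the list head. */
struct context {
   struct block *dirty_list;
};

/* Mark a block dirty, linking it into the list at most once. */
static void
block_mark_dirty(struct context *ctx, struct block *b)
{
   b->dirty = true;
   if (!b->queued) {
      b->queued = true;
      b->next_dirty = ctx->dirty_list;
      ctx->dirty_list = b;
   }
}

/* Disable the dirty state again; the block stays on the list. */
static void
block_clear_dirty(struct block *b)
{
   b->dirty = false;
}

/* Walk only the queued blocks; returns how many were emitted. */
static unsigned
context_flush(struct context *ctx)
{
   unsigned emitted = 0;
   struct block *b = ctx->dirty_list;

   while (b) {
      struct block *next = b->next_dirty;

      b->queued = false;
      b->next_dirty = NULL;
      if (b->dirty) {          /* dirty check during the flush */
         b->dirty = false;
         emitted++;            /* real code would emit registers here */
      }
      b = next;
   }
   ctx->dirty_list = NULL;
   return emitted;
}
```

The flush cost then scales with the number of touched blocks rather than the total block count, and the check inside the loop covers blocks whose dirty state was cleared after they were queued.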
|
|
Flush read cache before writing register. Track flushing inside
of the same cs and avoid reflushing the same bo if not necessary. Almost
properly force flush if a bo is rendered to and then used as a texture
in the same cs (missing pipeline flush, dunno if it's needed or not).
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
since we plan on using dx10 constant buffers everywhere.
|
|
These texture formats (like R16G16B16A16_UNORM) were untested until now
because st/mesa doesn't use them. I am testing this with a hacked st/mesa
here.
|
|
Add bo offset everywhere needed if r600_bo is ever a sub bo
of a bigger bo.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
From AROS.
|
|
this code was memcmp'ing two structs, but refcounting one of them afterwards,
so any subsequent memcmp was never going to work.
again this stops unnecessary uploads of the vertex program.
|
|
Blending with DST_ALPHA is undefined. SRC_ALPHA works, though.
I bet some other formats have similar limitations too.
|
|
The hw swizzles have been obtained by a brute force approach,
and only C0 and C2 are stored in UV88, the other channels are
ignored.
R16G16 is going to be a lot trickier.
|
|
Fixes this GCC warning.
r600_shader.c: In function 'tgsi_split_literal_constant':
r600_shader.c:818: warning: unused variable 'index'
|
|
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Brian Paul <brianp@vmware.com>
|
|
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
|
|
this sets up a single loop constant like r600c does.
|
|
just a typo in the register headers.
|
|
there are some vertex formats defined in r600c not in the docs.
|
|
this shouldn't change behaviour, just push the choice of what
to do out to the shader.
|