Age | Commit message (Collapse) | Author |
|
This is a step towards moving this code into the rasterizer.
|
|
|
|
Further reduce the size of a binned triangle.
|
|
|
|
|
|
But bin lazily only into bins which are receiving geometry.
|
|
|
|
|
|
Together with the previous commit, this generalize the benefits of
d2cf757f44f4ee5554243f3279483a25886d9927 to all depth formats, in
particular:
- simpler float -> 24unorm conversion
- avoid unsigned comparisons (not directly supported on SSE) by aligning
to the least significant bit
- avoid unecessary/repeated mask ANDing
Verified with trivial/tri-z that the exact same assembly is produced for
X8Z24.
|
|
Z32_FLOAT uses <4 x float> as intermediate/destination type,
instead of <4 x i32>.
The necessary bitcasts got removed with commit
5b7eb868fde98388d80601d8dea39e679828f42f
Also use depth/stencil type and build contexts consistently, and
make the depth pointer argument a ordinary <i8 *>, to catch this
sort of issues in the future (and also to pave way for Z16 and
Z32_FLOAT_S8_X24 support).
|
|
There's no apparent reason for the former to exist. And they didn't
even have the same value.
|
|
|
|
Apply Jose's suggestions for a small but measurable improvement in
isosurf.
|
|
This reverts commit 9773722c2b09d5f0615a47cecf4347859474dc56.
Looks like there are some floor/rounding issues here that need
to be better understood.
|
|
|
|
MSVC doesn't accept more than 3 __m128i arguments.
|
|
|
|
Avoid accumulating more and more fixed point bits.
|
|
|
|
There was actually a large quantity of scalar code in these functions
previously. This tries to move more into intrinsics.
Introduce an sse2 mm_mullo_epi32 replacement to avoid sse4 dependency
in the new rasterization code.
|
|
The engine is a global owned by gallivm module.
|
|
|
|
|
|
|
|
Simply rely on mem2reg pass. It's easier and more reliable.
|
|
|
|
|
|
We've been using these in the linear path for a while now. Based on
Chris's SSSE3 code, but using only sse2 opcodes. Speed seems to be
identical, but code is simpler & removes dependency on SSE3.
Should be easier to extend to other rgba8 formats.
|
|
Specifically, can do early-depth-test even when alpahtest or
kill-pixel are active, providing we defer the actual z write until the
final mask is avaialable.
Improves demos/fire.c especially in the case where you get close to
the trees.
|
|
Don't branch more than once in quick succession. Don't branch at the
end of the shader.
|
|
Avoid unnecessary masking of non-existant stencil component.
|
|
Better than GALLIVM_DEBUG if you're only interested in fragment shaders.
|
|
Don't try to emit our own phi's, let llvm mem2reg do it for us.
|
|
Don't calculate 1/w for quads which aren't visible...
|
|
The current interpolation schemes causes precision loss.
Changing the operation order helps, but does not completely avoid the
problem.
The only short term solution is to clamp z to 1.0.
This is unfortunate, but probably unavoidable until interpolation is
improved.
|
|
|
|
|
|
|
|
|
|
|
|
Avoid multiplying fixed-point values. Calculate triangle area in
floating point use that for culling.
Lift area calculations up a level as we are already doing this in the
triangle_both() case.
Would like to share the calculated area with attribute interpolation,
but the way the code is structured makes this difficult.
|
|
|
|
Only cosmetic changes. No actual practical difference.
|
|
|
|
Q coordinate's coefficients also need to be multiplied by w, otherwise
it will have 1/w, causing problems with TXP.
|
|
Once a fragment is generated with LP_INTERP_PERSPECTIVE set for an input,
it will do a divide by w for that input. Therefore it's not OK to treat LP_INTERP_PERSPECTIVE as
LP_INTERP_LINEAR or vice-versa, even if the attribute is known to not
vary.
A better strategy would be to take the primitive in consideration when
generating the fragment shader key, and therefore avoid the per-fragment
perspective divide.
|
|
|
|
|
|
Remove duplicated include.
Signed-off-by: Brian Paul <brianp@vmware.com>
|
|
Fixes glean pbo crash.
It would be possible to avoid crashing without decoupling, but given
that state trackers give no guarantee that number of views is consistent,
that would likely cause too many state updates (or miss some).
|