| Age | Commit message | Author |
|
Together with the previous commit, this generalizes the benefits of
d2cf757f44f4ee5554243f3279483a25886d9927 to all depth formats, in
particular:
- simpler float -> 24unorm conversion
- avoid unsigned comparisons (not directly supported on SSE) by aligning
to the least significant bit
- avoid unnecessary/repeated mask ANDing
Verified with trivial/tri-z that the exact same assembly is produced for
X8Z24.
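
The first two points above can be illustrated with a small scalar sketch. This is not the code from the commit: the function names and the "depth in the upper 24 bits" layout are assumptions made for illustration, and the conversion uses a plain scale-and-round rather than whatever the driver actually emits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a depth value to a 24-bit unsigned normalized integer.
 * The scaling is done in double precision: in single precision,
 * 16777215.0f + 0.5f would round up to 2^24 and overflow 24 bits. */
static uint32_t float_to_unorm24(float z)
{
   if (!(z > 0.0f))      /* also catches NaN */
      return 0;
   if (z >= 1.0f)
      return 0xffffff;
   return (uint32_t)((double)z * 16777215.0 + 0.5);
}

/* Hypothetical packed layout: depth in the upper 24 bits of a 32-bit
 * word, 8 unrelated bits below it. Shifting right aligns the depth
 * value to the least significant bit and drops the low 8 bits, so no
 * separate mask AND is needed afterwards. */
static uint32_t align_z24_to_lsb(uint32_t packed)
{
   return packed >> 8;
}

/* Signed "less than", standing in for the signed-only 32-bit integer
 * compare that SSE2 offers. */
static bool signed_less(uint32_t a, uint32_t b)
{
   return (int32_t)a < (int32_t)b;
}
```

With depth left in the upper bits, a far value such as 0xffffff00 looks negative to a signed compare and sorts before a near value such as 0x00000100. After alignment both values fit in 24 bits, so signed and unsigned comparisons agree.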
|
|
Z32_FLOAT uses <4 x float> as intermediate/destination type,
instead of <4 x i32>.
The necessary bitcasts got removed with commit
5b7eb868fde98388d80601d8dea39e679828f42f.
Also use depth/stencil type and build contexts consistently, and
make the depth pointer argument an ordinary <i8 *>, to catch this
sort of issue in the future (and also to pave the way for Z16 and
Z32_FLOAT_S8_X24 support).
|
|
|
|
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
|
|
at the moment depth copies are failing (piglit depth-level-clamp),
so use the fallback for now until I get some time to investigate.
|
|
this changes size on 32/64 bit so is definitely not what you want to use here.
|
|
fixes segfaults
|
|
this mirrors changes in r300g, bpt is kinda useless when it comes to some
of the non-simple texture formats.
|
|
|
|
|
|
just a cleanup step towards tiling
|
|
|
|
completely removed them.
|
|
this thing will be in the cache a lot, so having massive big struct
arrays inside it won't be helping anyone.
|
|
also fixup framebuffer state copies to avoid bad state.
|
|
gallium calls them scissors, but r600 hw like r300 is better off using
cliprects to implement them as we can turn them on/off a lot easier.
|
|
|
|
There's no apparent reason for the former to exist. And they didn't
even have the same value.
|
|
|
|
|
|
this allows softpipe to be used to test shader stencil ref exporting.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
this adds the capability + a stencil semantic id, + tgsi scan support.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
Apply Jose's suggestions for a small but measurable improvement in
isosurf.
|
|
This reverts commit 9773722c2b09d5f0615a47cecf4347859474dc56.
Looks like there are some floor/rounding issues here that need
to be better understood.
|
|
|
|
|
|
MSVC doesn't accept more than 3 __m128i arguments.
|
|
|
|
Avoid accumulating more and more fixed point bits.
|
|
|
|
There was actually a large quantity of scalar code in these functions
previously. This tries to move more into intrinsics.
Introduce an sse2 mm_mullo_epi32 replacement to avoid sse4 dependency
in the new rasterization code.
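
As an illustration of what such a replacement can look like, here is a minimal sketch of a 32-bit low multiply built only from SSE2 intrinsics. It is not taken from the commit: the names `sketch_mullo_epi32` and `mullo4` are hypothetical, and a scalar fallback is included so the example also builds on targets without SSE2.

```c
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>

/* Low 32 bits of each of the four 32-bit products, SSE2 only.
 * _mm_mul_epu32 multiplies lanes 0 and 2 into two 64-bit results, so
 * the odd lanes are shifted down and multiplied separately. The low
 * 32 bits of a product are the same for signed and unsigned inputs. */
static __m128i sketch_mullo_epi32(__m128i a, __m128i b)
{
   __m128i even = _mm_mul_epu32(a, b);                 /* lanes 0, 2 */
   __m128i odd  = _mm_mul_epu32(_mm_srli_epi64(a, 32),
                                _mm_srli_epi64(b, 32)); /* lanes 1, 3 */

   /* Move the low dword of each 64-bit product into dwords 0 and 1. */
   even = _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0));
   odd  = _mm_shuffle_epi32(odd,  _MM_SHUFFLE(0, 0, 2, 0));

   /* Interleave back into lane order 0, 1, 2, 3. */
   return _mm_unpacklo_epi32(even, odd);
}

/* Array wrapper (hypothetical helper) so the routine is easy to call. */
static void mullo4(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
   __m128i va = _mm_loadu_si128((const __m128i *)a);
   __m128i vb = _mm_loadu_si128((const __m128i *)b);
   _mm_storeu_si128((__m128i *)out, sketch_mullo_epi32(va, vb));
}

#else

/* Scalar fallback for targets without SSE2; unsigned arithmetic keeps
 * the wrap-around well defined. */
static void mullo4(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
   for (int i = 0; i < 4; i++)
      out[i] = (int32_t)((uint32_t)a[i] * (uint32_t)b[i]);
}

#endif
```

The cost relative to the SSE4.1 instruction is two multiplies plus a few shifts and shuffles, which is the trade for dropping the SSE4 requirement.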
|
|
The engine is a global owned by gallivm module.
|
|
Useful to amortize the command submission/reloc overhead (e.g. etracer
goes from 72 to 109 FPS on nv4b).
|
|
fixes https://bugs.freedesktop.org/show_bug.cgi?id=30771
Reported-by: Kevin DeKorte
|
|
|
|
|
|
This could probably be done much nicer. I've spent a day chasing
a coherency problem in the kernel that turned out to be incorrect
scissor setup.
|
|
fixes glsl1-2D Texture lookup with explicit lod (Vertex shader)
|
|
We need to move the texture sampler resources out of the range of the vertex attribs.
We could probably improve this using an allocator but this is the simple answer for now.
makes mesa-demos/src/glsl/vert-tex work.
|
|
|
|
|
|
|
|
Simply rely on mem2reg pass. It's easier and more reliable.
|
|
|
|
|
|
We've been using these in the linear path for a while now. Based on
Chris's SSSE3 code, but using only sse2 opcodes. Speed seems to be
identical, but code is simpler & removes dependency on SSE3.
Should be easier to extend to other rgba8 formats.
|
|
Specifically, can do early-depth-test even when alphatest or
kill-pixel are active, provided we defer the actual z write until the
final mask is available.
Improves demos/fire.c especially in the case where you get close to
the trees.
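
The ordering described above can be sketched per fragment in scalar C. This is only an illustration under assumed names (`struct fragment`, `process_fragment`, a "less than" depth function, an alpha test standing in for kill-pixel); the driver itself works on pixel blocks with masks rather than on single fragments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fragment record, for illustration only. */
struct fragment {
   uint32_t z;
   float alpha;
};

/* Returns true when the fragment survives and its depth was written.
 * shader_runs counts how often the (expensive) shading step ran. */
static bool process_fragment(uint32_t *zbuf, const struct fragment *f,
                             float alpha_ref, unsigned *shader_runs)
{
   /* Early depth test: compare only, do not write z yet. */
   bool live = f->z < *zbuf;
   if (!live)
      return false;          /* occluded: shading is skipped entirely */

   /* Stand-in for fragment shading. */
   (*shader_runs)++;

   /* Alpha test / kill can still discard the fragment. */
   live = f->alpha > alpha_ref;

   /* Deferred z write: only once the final mask is known. */
   if (live)
      *zbuf = f->z;
   return live;
}
```

Writing z before the alpha test would let discarded fragments occlude whatever lies behind them, which is why the write has to wait while the compare can happen early.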
|