summaryrefslogtreecommitdiff
path: root/src/gallium/drivers
AgeCommit message (Collapse)Author
2010-10-12llvmpipe: try to do more of rast_tri_3_16 with intrinsicsKeith Whitwell
There was actually a large quantity of scalar code in these functions previously. This tries to move more into intrinsics. Introduce an sse2 mm_mullo_epi32 replacement to avoid sse4 dependency in the new rasterization code.
2010-10-12llvmpipe: Do not dispose the execution engine.José Fonseca
The engine is a global owned by gallivm module.
2010-10-12nouveau: Get larger push buffers.Francisco Jerez
Useful to amortize the command submission/reloc overhead (e.g. etracer goes from 72 to 109 FPS on nv4b).
2010-10-12r600g: fix typo in vertex sampling on r600Dave Airlie
fixes https://bugs.freedesktop.org/show_bug.cgi?id=30771 Reported-by: Kevin DeKorte
2010-10-11llvmpipe: Use lp_tgsi_info.José Fonseca
2010-10-11llvmpipe: Remove outdated comment about stencil testing.José Fonseca
2010-10-11r600g: don't run with scissors.Dave Airlie
This could probably be done much nicer, I've spent a day chasing a coherency problem in the kernel, that turned out to be incorrect scissor setup.
2010-10-11r600g: add TXL opcode support.Dave Airlie
fixes glsl1-2D Texture lookup with explicit lod (Vertex shader)
2010-10-11r600g: enable vertex samplers.Dave Airlie
We need to move the texture sampler resources out of the range of the vertex attribs. We could probably improve this using an allocator but this is the simple answer for now. makes mesa-demos/src/glsl/vert-tex work.
2010-10-11r600g: evergreen has no request size bit in texture word4Dave Airlie
2010-10-11r600g: fix input/output Z export mixup for evergreen.Dave Airlie
2010-10-09gallivm: Cleanup the rest of the flow module.José Fonseca
2010-10-09gallivm: Remove support for Phi generation.José Fonseca
Simply rely on mem2reg pass. It's easier and more reliable.
2010-10-09gallivm: Don't generate Phis for execution mask.José Fonseca
2010-10-09llvmpipe: Fix MSVC build. Enable the new SSE2 code on non SSE3 systems.José Fonseca
2010-10-09llvmpipe: simplified SSE2 swz/unswz routinesKeith Whitwell
We've been using these in the linear path for a while now. Based on Chris's SSSE3 code, but using only sse2 opcodes. Speed seems to be identical, but code is simpler & removes dependency on SSE3. Should be easier to extend to other rgba8 formats.
2010-10-09llvmpipe: clean up shader pre/postamble, try to catch more early-zKeith Whitwell
Specifically, can do early-depth-test even when alpahtest or kill-pixel are active, providing we defer the actual z write until the final mask is avaialable. Improves demos/fire.c especially in the case where you get close to the trees.
2010-10-09llvmpipe: try to be sensible about whether to branch after mask updatesKeith Whitwell
Don't branch more than once in quick succession. Don't branch at the end of the shader.
2010-10-09gallivm: specialized x8z24 depthtest pathKeith Whitwell
Avoid unnecessary masking of non-existant stencil component.
2010-10-09llvmpipe: dump fragment shader ir and asm when LP_DEBUG=fsKeith Whitwell
Better than GALLIVM_DEBUG if you're only interested in fragment shaders.
2010-10-09llvmpipe: use alloca for fs color outputsKeith Whitwell
Don't try to emit our own phi's, let llvm mem2reg do it for us.
2010-10-09llvmpipe: defer attribute interpolation until after mask and ztestKeith Whitwell
Don't calculate 1/w for quads which aren't visible...
2010-10-09llvmpipe: Prevent z > 1.0José Fonseca
The current interpolation schemes causes precision loss. Changing the operation order helps, but does not completely avoid the problem. The only short term solution is to clamp z to 1.0. This is unfortunate, but probably unavoidable until interpolation is improved.
2010-10-09llvmpipe: fix rasterization of vertical lines on pixel boundariesZack Rusin
2010-10-08gallivm: Avoid control flow for two-sided stencil test.José Fonseca
2010-10-08llvmpipe: fix off-by-one in tri_16Keith Whitwell
2010-10-08llvmpipe: add rast_tri_4_16 for small lines and pointsKeith Whitwell
2010-10-08llvmpipe: clean up setup_tri a littleKeith Whitwell
2010-10-08llvmpipe: avoid overflow in triangle cullingKeith Whitwell
Avoid multiplying fixed-point values. Calculate triangle area in floating point use that for culling. Lift area calculations up a level as we are already doing this in the triangle_both() case. Would like to share the calculated area with attribute interpolation, but the way the code is structured makes this difficult.
2010-10-08llvmpipe: fail gracefully on oom in scene creationKeith Whitwell
2010-10-08r600g: drop width/height per level storage.Dave Airlie
these aren't used anywhere, so just waste memory.
2010-10-08r600g: add some RG texture format support.Dave Airlie
2010-10-07r600g: fix Z export enable bits.Dave Airlie
we should be checking output array not input to decide. Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-07r600g: use format from the sampler view not from the texture.Dave Airlie
we want to use the format from the sampler view which isn't always the same as the texture format when creating sampler views.
2010-10-07r600g: fix evergreen interpolation setupAndre Maasikas
interp data is stored in gpr0 so first interp overwrote it and subsequent ones got wrong values reserve register 0 so it's not used for attribs. alternative is to interpolate attrib0 last (reverse, as r600c does)
2010-10-06llvmpipe: Cleanup depth-stencil clears.José Fonseca
Only cosmetic changes. No actual practical difference.
2010-10-06gallivm: Only apply min/max_lod when necessary.José Fonseca
2010-10-06llvmpipe: Fix sprite coord perspective interpolation of Q.José Fonseca
Q coordinate's coefficients also need to be multiplied by w, otherwise it will have 1/w, causing problems with TXP.
2010-10-06llvmpipe: Fix perspective interpolation for point sprites.José Fonseca
Once a fragment is generated with LP_INTERP_PERSPECTIVE set for an input, it will do a divide by w for that input. Therefore it's not OK to treat LP_INTERP_PERSPECTIVE as LP_INTERP_LINEAR or vice-versa, even if the attribute is known to not vary. A better strategy would be to take the primitive in consideration when generating the fragment shader key, and therefore avoid the per-fragment perspective divide.
2010-10-06llvmpipe: Dump a few missing shader key flags.José Fonseca
2010-10-06llvmpipe: make debug_fs_variant respect variant->nr_samplersKeith Whitwell
2010-10-06r600g: add evergreen stencil support.Dave Airlie
this sets the stencil up for evergreen properly.
2010-10-05r600g: userspace fence to avoid kernel call for testing bo busy statusJerome Glisse
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05r600g: simplify block relocationJerome Glisse
Since flush rework there could be only one relocation per register in a block. Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05r600g: use dirty list to track dirty blocksBas Nieuwenhuizen
Got a speed up by tracking the dirty blocks in a seperate list instead of looping through all blocks. This version should work with block that get their dirty state disabled again and I added a dirty check during the flush as some blocks were already dirty.
2010-10-05nv50: fix always true conditional in shader optimizationNicolas Kaiser
2010-10-05r600g: improve bo flushingJerome Glisse
Flush read cache before writting register. Track flushing inside of a same cs and avoid reflushing same bo if not necessary. Allmost properly force flush if bo rendered too and then use as a texture in same cs (missing pipeline flush dunno if it's needed or not). Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05r600g: drop use_mem_constant.Dave Airlie
since we plan on using dx10 constant buffers everywhere.
2010-10-05r300g: fix microtiling for 16-bits-per-channel formatsMarek Olšák
These texture formats (like R16G16B16A16_UNORM) were untested until now because st/mesa doesn't use them. I am testing this with a hacked st/mesa here.
2010-10-04r600g: allow r600_bo to be a sub allocation of a big boJerome Glisse
Add bo offset everywhere needed if r600_bo is ever a sub bo of a bigger bo. Signed-off-by: Jerome Glisse <jglisse@redhat.com>