path: root/src/gallium/drivers/llvmpipe/lp_rast_tri.c
Age  Commit message  (Author)
2010-10-15  llvmpipe: use aligned loads/stores for plane values  (Keith Whitwell)
2010-10-15  gallium: move some intrinsics helpers to u_sse.h  (Keith Whitwell)
2010-10-15  llvmpipe: slightly shrink the size of a binned triangle  (Keith Whitwell)
2010-10-12  llvmpipe: make sure intrinsics code is guarded with PIPE_ARCH_SSE  (Keith Whitwell)
2010-10-12  llmvpipe: improve mm_mullo_epi32  (José Fonseca)
Apply Jose's suggestions for a small but measurable improvement in isosurf.
2010-10-12  gallium: move sse intrinsics debug helpers to u_sse.h  (Keith Whitwell)
2010-10-12  llvmpipe: Fix MSVC build.  (José Fonseca)
MSVC doesn't accept more than 3 __m128i arguments.
2010-10-12  llvmpipe: fix typo in last commit  (Keith Whitwell)
2010-10-12  llvmpipe: try to do more of rast_tri_3_16 with intrinsics  (Keith Whitwell)
There was actually a large quantity of scalar code in these functions previously. This tries to move more into intrinsics. Introduce an sse2 mm_mullo_epi32 replacement to avoid sse4 dependency in the new rasterization code.
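The SSE2-only replacement mentioned above can be illustrated with a well-known emulation of the SSE4.1 `_mm_mullo_epi32` operation. This is a minimal sketch and not the code from the commit: the names `mullo_epi32_sse2` and `mul4_lo32` are hypothetical, and a scalar fallback is included so the example also builds where SSE2 is unavailable.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* Hypothetical helper: lane-wise 32-bit multiply using only SSE2. */
static __m128i mullo_epi32_sse2(__m128i a, __m128i b)
{
    /* _mm_mul_epu32 multiplies lanes 0 and 2 into two 64-bit products. */
    __m128i even = _mm_mul_epu32(a, b);
    /* Move lanes 1 and 3 down to lanes 0 and 2, then multiply those. */
    __m128i odd = _mm_mul_epu32(_mm_srli_si128(a, 4), _mm_srli_si128(b, 4));
    /* Gather the low 32 bits of each product into the two lowest lanes. */
    even = _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0));
    odd  = _mm_shuffle_epi32(odd,  _MM_SHUFFLE(0, 0, 2, 0));
    /* Interleave back into lane order 0, 1, 2, 3. */
    return _mm_unpacklo_epi32(even, odd);
}
#endif

/* Multiply four 32-bit lanes, keeping the low 32 bits of each product. */
void mul4_lo32(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    assert(a && b && out);
#if defined(__SSE2__)
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, mullo_epi32_sse2(va, vb));
#else
    /* Scalar fallback; unsigned arithmetic avoids signed overflow. */
    for (int i = 0; i < 4; i++)
        out[i] = (int32_t)((uint32_t)a[i] * (uint32_t)b[i]);
#endif
}
```

The low 32 bits of a product are the same for signed and unsigned operands, which is why the unsigned `_mm_mul_epu32` is sufficient here.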
2010-10-08  llvmpipe: add rast_tri_4_16 for small lines and points  (Keith Whitwell)
2010-09-13  llvmpipe: Fix non SSE2 builds.  (José Fonseca)
Should fix fdo 30168.
2010-09-12  llvmpipe: introduce tri_3_4 for tiny triangles  (Keith Whitwell)
2010-09-12  llvmpipe: refactor tri_3_16  (Keith Whitwell)
Keep step array as a set of four m128i's and reuse throughout the rasterization.
2010-09-12  llvmpipe: pass linear masks to fragment shader  (Keith Whitwell)
Fragment shader can extract the correct bits for each quad.
2010-08-31  llvmpipe: slightly simplify build_mask  (Keith Whitwell)
2010-08-31  llvmpipe: combine linear mask calculation  (Keith Whitwell)
2010-08-31  llvmpipe: intrinsics versions of build_mask functions  (Keith Whitwell)
2010-08-27  llvmpipe: native rasterization for lines  (Hui Qi Tay)
Rasterize lines directly by treating them as 4-sided polygons. Still need to check the exact pixel rasterization.
2010-08-15  llvmpipe: special case triangles which fall in a single 16x16 block  (Keith Whitwell)
Check for these and route them to a dedicated handler with one fewer level of recursive rasterization.
2010-08-15  llvmpipe: remove all traces of step arrays, pos_tables  (Keith Whitwell)
No need to calculate these values any longer, nor to store them in the bin data. Improves isosurf a bit more, 115->123 fps.
2010-08-15  llvmpipe: eliminate last usage of step array in rast_tmp.h  (Keith Whitwell)
For 16 and 64 pixel levels, calculate a mask which is linear in x and y (i.e. not in the swizzle layout). When iterating over full and partial masks, figure out position by manipulating the bit number set in the mask, rather than relying on position arrays. Similarly, calculate the lower-level c values from dcdx, dcdy and the position rather than relying on the step array.
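As a rough illustration of the approach described above, the sketch below walks the set bits of a linear 4x4 mask, derives each cell's position from the bit number, and computes that cell's c value from dcdx, dcdy and the position. The names `cell_t`, `lowest_set_bit` and `visit_mask_cells` are hypothetical, and both the bit layout (bit i covers cell (i % 4, i / 4)) and the sign convention (c + x*dcdx + y*dcdy) are assumptions rather than the driver's actual conventions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of one visited cell: origin and edge value there. */
typedef struct { int x, y; int32_t c; } cell_t;

/* Index of the lowest set bit; mask must be non-zero. */
static int lowest_set_bit(uint32_t mask)
{
    int bit = 0;
    assert(mask != 0);
    while ((mask & 1u) == 0) {
        mask >>= 1;
        bit++;
    }
    return bit;
}

/* Visit every set bit of a linear 16-bit mask. Returns the number of
 * cells written to out. No position or step tables are consulted. */
int visit_mask_cells(uint32_t mask, int32_t c0, int32_t dcdx, int32_t dcdy,
                     int cell_size, cell_t out[16])
{
    int n = 0;
    assert((mask >> 16) == 0 && cell_size > 0);
    while (mask) {
        int bit = lowest_set_bit(mask);
        int x = (bit % 4) * cell_size;   /* column from the bit number */
        int y = (bit / 4) * cell_size;   /* row from the bit number */
        out[n].x = x;
        out[n].y = y;
        out[n].c = c0 + x * dcdx + y * dcdy;
        n++;
        mask &= mask - 1;                /* clear the bit just handled */
    }
    return n;
}
```

Because the mask is linear in x and y, the position falls out of a divide and a modulo on the bit index, which is what makes the lookup tables unnecessary.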
2010-08-15  llvmpipe: version of block4 which doesn't need the full step array  (Keith Whitwell)
No noticeable slowdown with isosurf.
2010-08-15  llvmpipe: reorganize block4 loop, nice speedup  (Keith Whitwell)
isosurf 95->115 fps just by exchanging the two inner loops in this function...
2010-07-13  llvmpipe: pass mask into fragment shader  (Keith Whitwell)
Move this code back out to C for now, will generate separately. Shader now takes a mask parameter instead of C0/C1/C2/etc. Shader does not currently use that parameter and rasterizes whole pixel stamps always.
2010-02-24  llvmpipe: more lp_rasterizer_task parameter passing  (Brian Paul)
2010-02-24  llvmpipe: pass fewer parameters to rasterization functions  (Brian Paul)
2010-02-24  llvmpipe: added some assertions  (Brian Paul)
2010-02-17  llvmpipe: use ffs technique for full tiles also  (Keith Whitwell)
Need to compute two masks here for full and partial 16x16 blocks. Gives a further good improvement for isosurf particularly:
    isosurf 97 -> 108
    gears 597 -> 611
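To make the idea of two masks concrete, here is a minimal sketch that classifies a 4x4 grid of blocks against a set of edge planes and records fully covered and partially covered blocks in separate bitmasks. It is not the driver's code: `plane_t`, `eval_plane` and `classify_blocks` are hypothetical names, the inside test (c + x*dcdx + y*dcdy > 0) is an assumed convention, and the corner-sampling test is a simple stand-in for whatever the commit actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical edge plane: a point is inside when the value is > 0. */
typedef struct { int32_t c, dcdx, dcdy; } plane_t;

static int32_t eval_plane(const plane_t *p, int x, int y)
{
    return p->c + x * p->dcdx + y * p->dcdy;
}

/* Classify each block of a 4x4 grid. Bit i of a mask refers to the
 * block at column i % 4, row i / 4. Blocks outside any plane set no bit. */
void classify_blocks(const plane_t *planes, int nr_planes, int block_size,
                     uint32_t *partial_mask, uint32_t *full_mask)
{
    uint32_t partial = 0, full = 0;
    assert(planes && nr_planes > 0 && block_size > 0);
    for (int i = 0; i < 16; i++) {
        int x0 = (i % 4) * block_size, y0 = (i / 4) * block_size;
        int x1 = x0 + block_size - 1,  y1 = y0 + block_size - 1;
        int all_in = 1, any_out = 0;
        for (int j = 0; j < nr_planes; j++) {
            /* A linear function takes its extremes at the corners. */
            int in = (eval_plane(&planes[j], x0, y0) > 0)
                   + (eval_plane(&planes[j], x1, y0) > 0)
                   + (eval_plane(&planes[j], x0, y1) > 0)
                   + (eval_plane(&planes[j], x1, y1) > 0);
            if (in == 0) any_out = 1;   /* block entirely outside this plane */
            if (in != 4) all_in = 0;    /* block not entirely inside */
        }
        if (any_out)
            continue;
        if (all_in)
            full |= 1u << i;
        else
            partial |= 1u << i;
    }
    *partial_mask = partial;
    *full_mask = full;
}
```

With the two masks in hand, a caller can iterate the set bits of each one separately, so that fully covered blocks skip the finer-grained tests that partially covered blocks still need.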
2010-02-17  llvmpipe: rework do_block_16 to use bitmasks and ffs  (Keith Whitwell)
Some nice speedups:
    gears: 547 -> 597
    isosurf: 83 -> 98
Others like gloss unchanged. Could do further work in this direction.
2010-01-21  llvmpipe: use some local vars to index step arrays  (Brian Paul)
Saves a few more cycles.
2010-01-21  llvmpipe: added simple perf/statistics counting facility  (Brian Paul)
Currently counting number of tris, how many tiles of each size are fully covered, partially covered or empty, etc. Set LP_DEBUG=counters to enable. Results are printed upon context destruction.
2010-01-15  llvmpipe: skip 4x4 in/out test code  (Brian Paul)
It's a little faster to just do the in/out testing in the shader jit code.
2010-01-15  llvmpipe: added comment about lookup-tables vs. computation  (Brian Paul)
2010-01-15  llvmpipe: generate two shader varients, one omits triangle in/out testing  (Brian Paul)
When we know that a 4x4 pixel block is entirely inside of a triangle, use the jit function which omits the in/out test code. Results in a few percent speedup in many tests.
2009-12-17  llvmpipe: replace INT_MIN/2 with INT_MIN  (Brian Paul)
Since changing the in/out test we can just use INT_MIN to be sure the comparison against the step values always passes.
2009-12-17  llvmpipe: improve the in/out test a little  (Brian Paul)
Instead of:
    s = c + step
    m = s > 0
Do:
    m = step > c    (with negated c)
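The two forms of the in/out test are algebraically the same, since c + step > 0 holds exactly when step > -c. The sketch below checks this over a small range of values. The function names are hypothetical and the code is only an illustration of the identity, not the shader code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Older form: add the step to c, then compare against zero. */
int inside_add_compare(int32_t c, int32_t step)
{
    int32_t s = c + step;
    return s > 0;
}

/* Newer form: the caller stores neg_c = -c once, leaving one compare. */
int inside_compare_only(int32_t neg_c, int32_t step)
{
    return step > neg_c;
}

/* Count (c, step) pairs in [-limit, limit] where the two forms disagree. */
int count_mismatches(int32_t limit)
{
    int mismatches = 0;
    assert(limit >= 0 && limit < 10000);   /* keeps c + step far from overflow */
    for (int32_t c = -limit; c <= limit; c++)
        for (int32_t step = -limit; step <= limit; step++)
            if (inside_add_compare(c, step) != inside_compare_only(-c, step))
                mismatches++;
    return mismatches;
}
```

Negating c once per triangle edge removes an addition from every per-pixel test, which is presumably where the small improvement comes from.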
2009-12-16  llvmpipe: do final the pixel in/out triangle test in the fragment shader  (Brian Paul)
The test to determine which of the pixels in a 2x2 quad are inside the triangle is now done in the fragment shader rather than in the calling C code. This is a little faster, but there are a few more things to do.

Note that the step[] array elements are in a different order now. Rather than being in row-major order for the 4x4 grid, they're in "quad-major" order. The setup of the step arrays is a little more complicated now. So is the coarse/intermediate tile test code, but some lookup tables help with that.

Next steps:
- early-cull 2x2 quads which are totally outside the triangle.
- skip the in/out test for fully contained quads
- make the in/out comparison code tighter/faster.
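One plausible reading of "quad-major" order is that the four pixels of each 2x2 quad are stored consecutively, with the quads themselves visited left to right, top to bottom. The sketch below maps a quad-major slot to its row-major position under that assumption and fills a step array accordingly. The layout, the function names and the step formula (x*dcdx + y*dcdy) are all assumptions made for illustration and are not taken from the commit.

```c
#include <assert.h>

/* Row-major index (y * 4 + x) of the pixel held in quad-major slot i,
 * assuming quads are ordered left to right, top to bottom, and the
 * pixels inside a quad are ordered the same way. */
int quad_major_to_row_major(int i)
{
    assert(i >= 0 && i < 16);
    int quad = i / 4;                       /* which 2x2 quad */
    int sub  = i % 4;                       /* which pixel inside the quad */
    int x = (quad % 2) * 2 + (sub % 2);
    int y = (quad / 2) * 2 + (sub / 2);
    return y * 4 + x;
}

/* Fill a 16-entry step array in quad-major order for one edge. */
void build_quad_major_steps(int dcdx, int dcdy, int step[16])
{
    assert(step);
    for (int i = 0; i < 16; i++) {
        int rm = quad_major_to_row_major(i);
        int x = rm % 4, y = rm / 4;
        step[i] = x * dcdx + y * dcdy;
    }
}
```

Storing the steps this way means each group of four consecutive entries corresponds to one quad, which suits a shader that processes a quad at a time.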
2009-12-07  llvmpipe: repartition lp_rasterizer state for threading  (Brian Paul)
Some of the state is per-thread. Put that state in new lp_rasterizer_task struct.
2009-12-04  llvmpipe: use LP_DBG() macro everywhere  (Brian Paul)
2009-12-01  llvmpipe: added assertions  (Brian Paul)
And remove unused BLOCKSIZE.
2009-12-01  llvmpipe: simplify mask computation  (Brian Paul)
Make this a little easier to understand.
2009-12-01  llvmpipe: replace shifts with multiplies to be clearer  (Brian Paul)
The compiler will still do the multiplies with shifts. It's just a bit easier to follow the logic with multiplies.
2009-12-01  llvmpipe: make nr_blocks unsigned  (Brian Paul)
2009-12-01  llvmpipe: comments, reformatting and assertions in tri rast code  (Brian Paul)
2009-10-20  llvmpipe: move block list into rast struct  (Keith Whitwell)
2009-10-20  llvmpipe: build list of 4x4 blocks to be shaded  (Keith Whitwell)
2009-10-20  llvmpipe: recursive rasterization within a tile  (Keith Whitwell)
2009-10-20  llvmpipe: precalculate some offsets  (Keith Whitwell)
2009-10-19  llvmpipe: calculate masks in format desired by shader  (Keith Whitwell)
Also remove branches calculating masks for quads.
2009-10-19  llvmpipe: pre-multiply some constants by fixed_one  (Keith Whitwell)