Age | Commit message (Collapse) | Author |
|
|
|
It's still faster not to try to special case the "all pixels are
known to be inside the triangle" case.
|
|
|
|
This will make it easier to generate multiple versions of the fragment
code per variant.
|
|
This still isn't faster, but committing it for posterity.
|
|
Nice speedup for gears.
|
|
Non-mrt apps work, and the code looks correct, but not many mrt test apps
handy atm...
|
|
When the incoming c0,c1,c2 values are equal to INT_MIN it means that
all pixels are inside the triangle. Thus we can skip the detailed
pixel inside/outside triangle tests. Use the new lp_build_if()/endif()
functions to generate the branching code.
The code is disabled ATM however because it's actually a little slower
than the original code. A little more tuning may fix that though...
|
|
Conflicts:
src/gallium/auxiliary/util/u_surface.c
src/gallium/drivers/llvmpipe/Makefile
src/gallium/drivers/llvmpipe/SConscript
src/gallium/drivers/llvmpipe/lp_bld_arit.c
src/gallium/drivers/llvmpipe/lp_bld_flow.c
src/gallium/drivers/llvmpipe/lp_bld_interp.c
src/gallium/drivers/llvmpipe/lp_clear.c
src/gallium/drivers/llvmpipe/lp_context.c
src/gallium/drivers/llvmpipe/lp_context.h
src/gallium/drivers/llvmpipe/lp_draw_arrays.c
src/gallium/drivers/llvmpipe/lp_jit.c
src/gallium/drivers/llvmpipe/lp_jit.h
src/gallium/drivers/llvmpipe/lp_prim_vbuf.c
src/gallium/drivers/llvmpipe/lp_setup.c
src/gallium/drivers/llvmpipe/lp_setup_point.c
src/gallium/drivers/llvmpipe/lp_state.h
src/gallium/drivers/llvmpipe/lp_state_blend.c
src/gallium/drivers/llvmpipe/lp_state_derived.c
src/gallium/drivers/llvmpipe/lp_state_fs.c
src/gallium/drivers/llvmpipe/lp_state_sampler.c
src/gallium/drivers/llvmpipe/lp_state_surface.c
src/gallium/drivers/llvmpipe/lp_tex_cache.c
src/gallium/drivers/llvmpipe/lp_tex_cache.h
src/gallium/drivers/llvmpipe/lp_tex_sample.h
src/gallium/drivers/llvmpipe/lp_tile_cache.c
|
|
Was used only as a reference, since texture sampling is now code generated.
Already axed in the lp-binning branch too.
This fixes the llvmpipe build after recent sampling changes.
|
|
The setup tiling engine is now plugged directly into the draw module
as a rendering backend.
Removed a couple of layering violations such that the setup code no
longer reaches out into the surrounding llvmpipe state or context.
|
|
It's a 1-bit enum.
|
|
Conflicts:
configs/darwin
src/gallium/auxiliary/util/u_clear.h
src/gallium/state_trackers/xorg/xorg_exa_tgsi.c
src/mesa/drivers/dri/i965/brw_draw_upload.c
|
|
|
|
That is:
- check for no op
- update/flush draw module
- update bound state and mark it as dirty
In particular flushing the draw module is important since it may contain
unflushed primitives which would otherwise be draw with wrong state.
|
|
|
|
|
|
Instead of:
s = c + step
m = s > 0
Do:
m = step > c (with negated c)
|
|
The test to determine which of the pixels in a 2x2 quad is now done in
the fragment shader rather than in the calling C code. This is a little
faster but there's a few more things to do.
Note that the step[] array elements are in a different order now. Rather
than being in row-major order for the 4x4 grid, they're in "quad-major"
order. The setup of the step arrays is a little more complicated now.
So is the course/intermediate tile test code, but some lookup tables
help with that.
Next steps:
- early-cull 2x2 quads which are totally outside the triangle.
- skip the in/out test for fully contained quads
- make the in/out comparison code tighter/faster.
|
|
Cherry-picked from dec35d04aeb398eef159aaf8cde5e0d04622b811.
|
|
|
|
This matches the convention used by the recursive rasterizer.
Also fixed assorted typos, comments, etc.
Now tri-z.c, gears.c, etc look basically right but there's still some
cracks in triangle rasterization.
|
|
|
|
This allows for z32f depth format to work correctly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This was redundant as drivers can just keep track of whether they are
inside a begin/end query pair. We want to add more query types later
and also support nested queries, none of which map well onto a flag like
this. No driver appeared to be using the flag.
|
|
|
|
Union not worth the hassle of violating C99 or adding a name to
the structure.
|
|
|
|
|
|
This increases quake3 timedemo fps another 10%.
|
|
New control flow helper functions which keep track of all variables
and generate the correct Phi functions.
This re-enables skipping the fs execution of quads masked out by
the rasterizer, early z testing, and kill opcode.
This yields a performance improvement of around 20%.
|
|
|
|
Finally a substantial performance improvement: framerates of apps using
texturing tripled, and furthermore, enabling/disabling texturing only
affects around 15% of the framerate, which means the bottleneck is now
somewhere else.
Generated texture sampling code is not complete though -- we always
sample from the base level -- so final figures will be different.
|
|
translation.
|
|
|
|
Fixes the blank screen on non-64bit mode.
|
|
Special attention is given to the interpolation of side by side quads.
Multiplications are made only for the first quad. Interpolation of
inputs for posterior quads are done exclusively with additions, and
perspective divide if necessary.
|
|
|
|
|
|
|
|
|
|
Already included in the fragment shader.
|