Otherwise things start crashing.
|
|
|
|
This gives a ~30% shader optimization time improvement on blender.
Tested by comparing the dumped LLVM modules.
Current ordering:
time ~/llvm-git/obj/Release-Asserts/bin/opt l.bc -constprop -instcombine -mem2reg -gvn -simplifycfg
real 0m1.126s
user 0m1.108s
sys 0m0.012s
With this patch:
time ~/llvm-git/obj/Release-Asserts/bin/opt l.bc -mem2reg -constprop -instcombine -gvn -simplifycfg
real 0m0.885s
user 0m0.880s
sys 0m0.000s
The overall improvement in blender is ~15%.
Blender without the patch takes 1m13s:
edwin 5934 87.6 11.5 729440 458296 pts/5 SLl+ 17:35 1:13 blender
Blender with the patch takes 1m3s:
edwin 5726 94.2 11.2 716424 446168 pts/5 SLl+ 17:32 1:03 blender
It is still slow with the patch, but better (most of the optimization time is
taken up by GVN, see LLVM PR7023).
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>
|
|
|
|
This fixes broken 3D texture indexing when the height of the 3D texture
was less than 64 (the tile size). It's simpler to pass this as an array
(as we do with the row stride) than to compute it on the fly.
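A minimal sketch of the addressing this implies, assuming both strides are stored as per-mipmap-level arrays and simply looked up. The struct, the MAX_LEVELS limit and the function name are hypothetical and are not taken from the llvmpipe sources.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS 16   /* hypothetical limit for this sketch */

/* Hypothetical stride tables, one entry per mipmap level. */
struct tex_strides {
   unsigned row_stride[MAX_LEVELS];   /* bytes between rows of one slice */
   unsigned img_stride[MAX_LEVELS];   /* bytes between slices of a 3D image */
};

/*
 * Byte offset of texel (x, y, z) within the given mipmap level.
 * The image stride is read from the table, so any padding of the
 * slice height (e.g. up to the tile size) is already accounted for
 * by whoever filled the table.
 */
static size_t
texel_offset(const struct tex_strides *s, unsigned level,
             unsigned x, unsigned y, unsigned z,
             unsigned bytes_per_texel)
{
   assert(level < MAX_LEVELS);
   return (size_t) z * s->img_stride[level]
        + (size_t) y * s->row_stride[level]
        + (size_t) x * bytes_per_texel;
}
```

With a 64-byte row stride and a 1024-byte image stride, texel (1, 2, 3) of a 4-byte format lands at 3*1024 + 2*64 + 4 bytes.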
|
|
This branch implemented dual representations of texture/drawing surfaces:
one in the conventional linear layout and the other the tiled layout which
is used by the fragment shader pipe. Per-tile flags indicate the layout
of each image tile. In many situations this lets us avoid converting
image data between the two layouts.
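The per-tile bookkeeping described above could look roughly like the sketch below. It assumes one flag word per tile and three access modes modelled on the LP_TEXTURE_READ/READ_WRITE/WRITE_ALL flags mentioned in the squashed commits. All identifiers here are hypothetical, not the names used in the branch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout bits: a tile may hold valid data in neither,
 * one, or both layouts. */
enum tile_layout {
   LAYOUT_NONE   = 0,
   LAYOUT_LINEAR = 1 << 0,
   LAYOUT_TILED  = 1 << 1
};

/* Hypothetical access modes. */
enum access_mode { ACCESS_READ, ACCESS_READ_WRITE, ACCESS_WRITE_ALL };

/*
 * Update one tile's flags for an access in layout `want`.
 * Returns true if the caller would first have to convert the tile
 * from the other layout.
 */
static bool
request_layout(unsigned *flags, enum tile_layout want, enum access_mode mode)
{
   bool convert;

   assert(want == LAYOUT_LINEAR || want == LAYOUT_TILED);

   /* A conversion is needed only if the old contents matter, exist,
    * and are not already valid in the wanted layout. */
   convert = mode != ACCESS_WRITE_ALL &&
             *flags != LAYOUT_NONE &&
             !(*flags & want);

   if (mode == ACCESS_READ)
      *flags |= want;   /* both copies stay valid */
   else
      *flags = want;    /* the other copy is now stale */

   return convert;
}
```

Reading a tiled tile as linear triggers one conversion and leaves both copies valid; a later WRITE_ALL in either layout needs no conversion at all.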
Squashed commit of the following:
commit 563a7e3cc552fdcfcaf9ac0d4b1683c3ba2ae732
Author: Brian Paul <brianp@vmware.com>
Date: Thu Apr 8 14:48:21 2010 -0600
llvmpipe: convert points/lines to triangles with draw module
This isn't the most efficient way to render points/lines but it allows us
to run more tests.
commit a8aa763e8a717533f2b13bb6ea53cbccbede68c9
Author: Brian Paul <brianp@vmware.com>
Date: Thu Apr 8 14:47:28 2010 -0600
llvmpipe: call llvmpipe_get_texture_tile() for depth/stencil
The returned pointer isn't used, but the tile status/layout info
gets updated. Helps to fix glReadPixels(DEPTH / STENCIL).
commit 463bc64af266194acbea71cd52e26a79b8c8a260
Author: Brian Paul <brianp@vmware.com>
Date: Thu Apr 8 10:58:48 2010 -0600
llvmpipe: add store_color to debug cmd_names list
commit 784cc73fb334a9d7b7c93cbd8a1445cdf742ff58
Author: Brian Paul <brianp@vmware.com>
Date: Thu Apr 8 10:57:43 2010 -0600
llvmpipe: fix debug build
commit 792c93171ec075664f55720ffed397ac2834a4fc
Author: Brian Paul <brianp@vmware.com>
Date: Thu Apr 8 10:49:01 2010 -0600
llvmpipe: fix cube mapping
commit 882b1035db88c3dd8aebe28dc971ac30a9ee39e3
Author: Brian Paul <brianp@vmware.com>
Date: Thu Apr 8 09:53:30 2010 -0600
llvmpipe: remove some older/unused code
commit b807d32b23145301e8842824664d9f06b9c5502e
Author: Brian Paul <brianp@vmware.com>
Date: Thu Apr 8 09:29:50 2010 -0600
llvmpipe: silence warning
commit 7b337e64fec92836ccdf9d96216289dd58418e35
Author: Brian Paul <brianp@vmware.com>
Date: Wed Apr 7 17:06:08 2010 -0600
llvmpipe: clean-up, comments in lp_surface_copy()
commit c52fa36f249cc652fa8d5fdd94d6574127c08c41
Author: Brian Paul <brianp@vmware.com>
Date: Wed Apr 7 16:51:42 2010 -0600
llvmpipe: overhaul tiled/linear memory management
Now we keep per-tile layout info (linear vs. tiled, or neither or both)
and convert from one layout to the other on demand.
commit 4a50ccfd470547c9be0704005818a87014e9c0e9
Author: Brian Paul <brianp@vmware.com>
Date: Wed Apr 7 16:51:27 2010 -0600
llvmpipe: added tile read/write counters
commit b7d0ea9c687ac8773b083791623826fa604adf21
Author: Brian Paul <brianp@vmware.com>
Date: Mon Apr 5 14:54:04 2010 -0600
llvmpipe: rename some functions
commit ee45c6e5b95cbd3c8cccc9aa4d45d8aef11e20c4
Author: Brian Paul <brianp@vmware.com>
Date: Mon Apr 5 14:42:15 2010 -0600
llvmpipe: re-org some get block/tile pointer code
commit 26ce97c16c0b6520ff1538803baa772d8c3b1280
Author: Brian Paul <brianp@vmware.com>
Date: Mon Apr 5 14:34:13 2010 -0600
llvmpipe: disable bad assertions
commit 5c670481248c4d46f87f13bf3af5655925e7002d
Author: Brian Paul <brianp@vmware.com>
Date: Fri Apr 2 16:36:11 2010 -0600
llvmpipe: add a special-case optimization to lp_surface_copy()
Be more efficient when copying tiled image to linear image.
Before, the fallback path was always converting the whole source image
to linear. Now we can convert just a sub region.
commit faa684645e64d6024b3a11e4e08da825e8220b2e
Author: Brian Paul <brianp@vmware.com>
Date: Fri Apr 2 16:15:16 2010 -0600
llvmpipe: assorted texture and tile/line conversion code changes
The tiled/linear conversion functions take x/y positions now to
allow converting only sub-regions.
More texture-related helper functions.
commit baad81ec5318d44bfac1e37c7643afc0836607bb
Author: Brian Paul <brianp@vmware.com>
Date: Tue Mar 30 13:18:40 2010 -0600
llvmpipe: convert tiled->linear upon PIPE_FLUSH_SWAPBUFFERS
If we know we're about to do a swapbuffers we should immediately
convert the tiled color tiles to linear instead of later in
llvmpipe_texture_unmap() since we can take advantage of threading/
parallelism here.
commit 928dd41256811daeddb7506a49a34dbad04beaf8
Author: Brian Paul <brianp@vmware.com>
Date: Tue Mar 30 09:16:58 2010 -0600
llvmpipe: polish-up the llvmpipe_flush() code
commit dd6014abcf86c517d159b8175e0eaeb167ea2ef6
Author: Brian Paul <brianp@vmware.com>
Date: Tue Mar 30 09:15:17 2010 -0600
llvmpipe: SETUP_x enum clean-up
commit 0b1ce6da2b28a41f3389685ab93e10b43c950f5d
Author: Brian Paul <brianp@vmware.com>
Date: Fri Mar 26 10:43:37 2010 -0600
llvmpipe: remove unused vars
commit 4562663480f88162ed4452cb05569eecb67f9f39
Author: Brian Paul <brianp@vmware.com>
Date: Fri Mar 26 10:31:55 2010 -0600
llvmpipe: cope with non-existent color/depth buffers
The fragment jit functions always grab these pointers, even if they're
not used.
commit df4329edbaf204ed501f1eac0698b8198178f9af
Author: Brian Paul <brianp@vmware.com>
Date: Thu Mar 25 15:20:15 2010 -0600
llvmpipe: do all render target surface mapping/unmapping in the rast code
commit 3d0c25d5ba8b8f61e8366d4c97324e45d526ff41
Author: Brian Paul <brianp@vmware.com>
Date: Thu Mar 25 14:31:21 2010 -0600
llvmpipe: map z/stencil buffer on demand like color buffers
Plus lots of code clean-up and loose ends taken care of.
commit c3b6fddd788aef09b4b84b843b7b1272231151e8
Author: Brian Paul <brianp@vmware.com>
Date: Thu Mar 25 13:15:03 2010 -0600
llvmpipe: remove unused write_zstencil field
commit 63374d97836926a6357e9d6dd24a509a8e155c56
Author: Brian Paul <brianp@vmware.com>
Date: Thu Mar 25 09:45:59 2010 -0600
llvmpipe: add missing lp_rast_end() call
Fixes crash on window resize when LP_NUM_THREADS=0.
commit 92fe9952161cc06f6edc58778e9e5a8b9ea447dc
Author: Brian Paul <brianp@vmware.com>
Date: Wed Mar 24 10:15:19 2010 -0600
llvmpipe: add tiled/linear conversion for 16-bit Z images
commit 6605fa28c147f30df351da0e4413cab33e4db5da
Author: Brian Paul <brianp@vmware.com>
Date: Tue Mar 23 16:06:41 2010 -0600
llvmpipe: implement tiled/linear conversion for Z/stencil images
commit 804528d84ffa292ef9d49d3666cdd3fa099ff3ff
Author: Brian Paul <brianp@vmware.com>
Date: Tue Mar 23 16:05:45 2010 -0600
llvmpipe: added texture stride comment
commit 66a88c012edf670c4ac887a912f02dcff93266dd
Author: Brian Paul <brianp@vmware.com>
Date: Tue Mar 23 16:04:07 2010 -0600
llvmpipe: remove unused vars
commit e2ca8d1328316dc8b36d5f688c16d109e49a6870
Author: Brian Paul <brianp@vmware.com>
Date: Mon Mar 22 18:53:11 2010 -0600
llvmpipe: checkpoint WIP: overhaul texture/surface mapping
Conversion between tiled and linear surfaces is working everywhere now.
The LP_TEXTURE_READ/READ_WRITE/WRITE_ALL flags let us avoid unnecessary
image layout conversions.
Still some loose ends, temporary/debug code, etc.
Need to implement tiled/linear conversion for depth/stencil images.
commit f2730a03839ee8984c1f537b7cbebba24961397a
Author: Brian Paul <brianp@vmware.com>
Date: Mon Mar 22 14:41:58 2010 -0600
llvmpipe: rename/repurpose lp_rast_store_color()
commit e192a47552c5d20d2caef452ca7697e2cd852c9b
Author: Brian Paul <brianp@vmware.com>
Date: Mon Mar 22 14:38:51 2010 -0600
llvmpipe: remove lp_rast_load_color()
commit 3cff0bde4b4ab980e1c3e812700419091527c76b
Author: Brian Paul <brianp@vmware.com>
Date: Mon Mar 22 14:11:38 2010 -0600
llvmpipe: remove/consolidate texture image code
commit 3a2f08b6a550c69ef5e874f482be30252cbf8bfa
Author: Brian Paul <brianp@vmware.com>
Date: Fri Mar 19 17:03:14 2010 -0600
llvmpipe: checkpoint WIP: directly render to tiled texture buffers
We're now directly writing colors into the tiled texture image buffers.
This is a checkpoint commit with lots of dead code and temporary hacks.
Everything will get cleaned up eventually.
commit c5ca987e03870849514d4e3c99af143722a09695
Author: Brian Paul <brianp@vmware.com>
Date: Fri Mar 19 16:41:14 2010 -0600
llvmpipe: refactor code, create tile_pixel_offset()
commit 2133e8273e937cbac09cd7264d6ce53af9764ddb
Author: Brian Paul <brianp@vmware.com>
Date: Fri Mar 19 14:55:11 2010 -0600
llvmpipe: pass LP_TEXTURE_LINEAR/TILED flags around
commit b9b9d4b82b01f4588721fdc8444740f859b4a021
Author: Brian Paul <brianp@vmware.com>
Date: Fri Mar 19 14:51:05 2010 -0600
llvmpipe: checkpoint WIP: handle co-existing tiled/linear texture data
Cube maps are temporarily broken, maybe other things.
commit 4cd322e6889940b5f155fcb69041b685b9ef9273
Author: Brian Paul <brianp@vmware.com>
Date: Fri Mar 19 11:34:43 2010 -0600
progs/demos: add other modes/patterns to dissolve demo
|
|
into here.
|
|
Instead of passing an array, just pass two scalar values.
|
|
Use the new enum values rather than integers in a few places.
|
|
|
|
|
|
The stride depends on the mipmap level. Rename to row_stride to
distinguish from img_stride for 3D textures.
Fixes incorrect texel addressing in small mipmap levels.
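To illustrate why a single stride is wrong for small levels, here is a sketch that fills a per-level row_stride table. It assumes tightly packed rows with no alignment padding, which the real driver may not use. The helper names and MAX_LEVELS are hypothetical.

```c
#include <assert.h>

#define MAX_LEVELS 16   /* hypothetical limit for this sketch */

/* Halve a dimension `level` times, never going below 1. */
static unsigned
minify(unsigned size, unsigned level)
{
   unsigned s = size >> level;
   return s ? s : 1;
}

/*
 * Fill one row stride (in bytes) per mipmap level.
 * Assumption: rows are tightly packed, with no alignment padding.
 */
static void
fill_row_strides(unsigned row_stride[MAX_LEVELS], unsigned base_width,
                 unsigned num_levels, unsigned bytes_per_texel)
{
   unsigned level;

   assert(num_levels <= MAX_LEVELS);
   for (level = 0; level < num_levels; level++)
      row_stride[level] = minify(base_width, level) * bytes_per_texel;
}
```

For a 64-texel-wide, 4-byte-per-texel base level, the table starts at 256 bytes and halves per level down to 4 bytes, so addressing level 3 with the level 0 stride would skip far past each row.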
|
|
Change the texture data_ptr from just a single image pointer to an
array of image pointers, indexed by mipmap level.
We'll use this for mipmap filtering.
For now, the mipmap level is hard-coded to zero.
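A rough sketch of the data layout this describes, with the level still fixed at zero on the sampling side. The struct and function names are hypothetical. Only the idea of an array of image pointers indexed by mipmap level comes from the commit.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS 16   /* hypothetical limit for this sketch */

/* Hypothetical texture description handed to the sampling code. */
struct sampler_texture {
   unsigned num_levels;
   const void *data_ptr[MAX_LEVELS];   /* one image pointer per level */
};

/* Image pointer for an arbitrary mipmap level. */
static const void *
level_data(const struct sampler_texture *tex, unsigned level)
{
   assert(level < tex->num_levels);
   return tex->data_ptr[level];
}

/* What sampling does for now: always use the base level. */
static const void *
sample_base_image(const struct sampler_texture *tex)
{
   const unsigned level = 0;   /* hard-coded until mipmap filtering lands */
   return level_data(tex, level);
}
```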
|
|
|
|
|
|
|
|
|
|
The llvmpipe TGSI translation is a lot more complete than what was in
gallivm, so replace the latter with the former. This is needed since
the draw LLVM paths will use the same code. Effectively, the proven
llvmpipe code becomes gallivm.
|
|
Conflicts:
Makefile
src/gallium/auxiliary/util/u_surface.c
src/gallium/drivers/llvmpipe/lp_flush.c
src/gallium/drivers/llvmpipe/lp_setup.c
src/gallium/drivers/llvmpipe/lp_state_derived.c
src/gallium/drivers/llvmpipe/lp_state_fs.c
src/gallium/drivers/llvmpipe/lp_state_surface.c
src/gallium/drivers/llvmpipe/lp_tex_cache.c
src/gallium/drivers/llvmpipe/lp_texture.c
src/gallium/drivers/llvmpipe/lp_tile_cache.c
src/mesa/state_tracker/st_cb_condrender.c
|
|
|
|
Conflicts:
src/gallium/auxiliary/draw/draw_context.c
src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline.c
src/gallium/auxiliary/pipebuffer/Makefile
src/gallium/auxiliary/pipebuffer/SConscript
src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c
src/gallium/auxiliary/tgsi/tgsi_scan.c
src/gallium/drivers/i915/i915_surface.c
src/gallium/drivers/i915/i915_texture.c
src/gallium/drivers/llvmpipe/lp_setup.c
src/gallium/drivers/llvmpipe/lp_tex_sample_c.c
src/gallium/drivers/llvmpipe/lp_texture.c
src/gallium/drivers/softpipe/sp_prim_vbuf.c
src/gallium/state_trackers/xorg/xorg_dri2.c
src/gallium/winsys/drm/intel/gem/intel_drm_api.c
src/gallium/winsys/drm/nouveau/drm/nouveau_drm_api.c
src/gallium/winsys/drm/radeon/core/radeon_drm.c
src/gallium/winsys/drm/vmware/core/vmw_screen_dri.c
src/mesa/state_tracker/st_cb_clear.c
|
|
|
|
The scissor test is implemented as another per-quad operation in
the JIT code. The four scissor box params are passed via the
lp_jit_context. In the JIT code we compare the quad's x/y coords
against the clip bounds and create a new in/out mask that's AND'd
with the main quad mask.
Note: we should also do scissor testing in the triangle setup code
to improve efficiency. That's not done yet.
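A scalar C analogue of the per-quad test, for illustration only. The real code is generated through LLVM and works on vectors. The pixel ordering within the 2x2 quad, the exclusive upper bounds and all names are assumptions of this sketch.

```c
#include <assert.h>

/* Hypothetical scissor box; xmax/ymax are exclusive in this sketch. */
struct scissor_box {
   int xmin, ymin, xmax, ymax;
};

/*
 * Combine the coverage mask of a 2x2 quad whose top-left pixel is
 * (x0, y0) with the scissor test. Bit i of the mask covers pixel
 * (x0 + (i & 1), y0 + (i >> 1)).
 */
static unsigned
scissor_quad_mask(const struct scissor_box *box, int x0, int y0,
                  unsigned quad_mask)
{
   unsigned in_mask = 0;
   unsigned i;

   assert(box->xmin <= box->xmax && box->ymin <= box->ymax);

   for (i = 0; i < 4; i++) {
      const int x = x0 + (int) (i & 1);
      const int y = y0 + (int) (i >> 1);

      if (x >= box->xmin && x < box->xmax &&
          y >= box->ymin && y < box->ymax)
         in_mask |= 1u << i;
   }

   /* The in/out mask is AND'd with the main quad mask. */
   return quad_mask & in_mask;
}
```

A quad that straddles the top-left corner of the box keeps only its bottom-right pixel, while a quad fully inside the box keeps its mask unchanged.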
|
|
|
|
|
|
SSE3 != SSSE3 and so far we only use the latter.
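For reference, the two extensions are reported by different CPUID bits (leaf 1, register ECX), which is why they need separate capability flags. The sketch below only decodes an ECX value passed in by the caller. The struct and function names are hypothetical and this is not the detection code used by the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bits in ECX after executing CPUID with EAX = 1. */
#define CPUID_ECX_SSE3    (UINT32_C(1) << 0)
#define CPUID_ECX_SSSE3   (UINT32_C(1) << 9)
#define CPUID_ECX_SSE4_1  (UINT32_C(1) << 19)

/* Hypothetical capability record. */
struct cpu_caps {
   bool has_sse3;
   bool has_ssse3;
   bool has_sse4_1;
};

/* Decode the ECX value of CPUID leaf 1 into separate flags. */
static void
decode_cpuid1_ecx(uint32_t ecx, struct cpu_caps *caps)
{
   assert(caps != NULL);
   caps->has_sse3   = (ecx & CPUID_ECX_SSE3) != 0;
   caps->has_ssse3  = (ecx & CPUID_ECX_SSSE3) != 0;
   caps->has_sse4_1 = (ecx & CPUID_ECX_SSE4_1) != 0;
}
```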
|
|
|
|
Basically mimic the llvm 2.6 way of linking execution engines and
targets.
|
|
The combination of fptosi and sitofp (necessary for the
trunc/floor/ceil/round implementation) somehow becomes invalid code.
Skip the instruction combining pass when SSE4.1 is not available.
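For context, a scalar C analogue of what a floor built from those two conversions looks like: a float-to-int cast corresponds to fptosi and the cast back corresponds to sitofp. This is only an illustration of the technique, valid for inputs that fit in a 32-bit int. It is not the code gallivm emits, and the function name is hypothetical.

```c
#include <assert.h>

/*
 * floor() without a native rounding instruction.
 * Assumption: x is finite and fits in a 32-bit signed int.
 */
static float
floor_via_int(float x)
{
   int i;
   float t;

   assert(x > -2147483648.0f && x < 2147483648.0f);

   i = (int) x;     /* like fptosi: truncates toward zero */
   t = (float) i;   /* like sitofp */

   /* Truncation rounds negative non-integers up, so step back down. */
   return t > x ? t - 1.0f : t;
}
```

For example, 1.5 truncates to 1 and stays there, while -1.5 truncates to -1 and is corrected to -2.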
|
|
Note that llvmpipe still doesn't run on every processor: if you don't
have a recent processor with SSE4.1 you will still likely end up
hitting a code path for which a generic non-SSE4 version is not
implemented yet.
|
|
|
|
|
|
Finally a substantial performance improvement: framerates of apps using
texturing tripled. Furthermore, enabling/disabling texturing now changes
the framerate by only around 15%, which means the bottleneck is now
somewhere else.
Generated texture sampling code is not complete though -- we always
sample from the base level -- so final figures will be different.