author: Jonathan Adamczewski <jadamcze@utas.edu.au> 2009-05-21 08:18:03 -0600
committer: Brian Paul <brianp@vmware.com> 2009-05-21 08:18:03 -0600
commit: b4824520ecf453cd8de90e57e839cb11a698d9c0
tree: 2ec3bd7a633fd704c9b04e6acf9f6760bc4418da /src/gallium/drivers/cell/spu/spu_tri.h
parent: 5b27b4ad37bd992d2d3a6fd9d407277113544f30
cell: unroll inner loop of spu_render.c:cmd_render()

It was taking approximately 50 cycles to extract the vertex indices, calculate the vertex_header pointers and call tri_draw() for each group of three vertices.

Unrolled, it takes less than 100 cycles to extract, unpack, calculate pointers and call tri_draw() eight times.

It does have a nasty jump-tabled switch. I'm sure that there's a better way...

Code size of spu_render.o gets larger due to the extra constants and work in the inner loop, the extra stack saves and loads needed because more registers are in use, and an assert. spu_tri.o gets a little smaller.

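
The spu_render.c side of this change is not shown on this page, so the following is only a portable sketch of the general technique the message describes: processing eight triangles per loop iteration and using a fall-through switch (which compilers commonly lower to a jump table) for the leftovers. The names draw_stats, draw_one() and draw_all_unrolled() are hypothetical, draw_one() merely stands in for tri_draw(), and whether the commit's switch handles leftovers this way is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real rasterizer: records what was "drawn". */
typedef struct {
    unsigned calls;          /* number of triangles submitted */
    unsigned long index_sum; /* sum of all vertex indices seen */
} draw_stats;

/* Hypothetical stand-in for tri_draw(): consumes three vertex indices. */
static void draw_one(draw_stats *s, const uint16_t *idx)
{
    s->calls++;
    s->index_sum += (unsigned long)idx[0] + idx[1] + idx[2];
}

/* Submit num_indices / 3 triangles, eight per iteration of the main loop. */
static void draw_all_unrolled(draw_stats *s, const uint16_t *indices,
                              size_t num_indices)
{
    size_t tris;
    const uint16_t *p = indices;

    /* Assumption: the index list always holds whole triangles. */
    assert(num_indices % 3 == 0);
    tris = num_indices / 3;

    /* Unrolled body: eight calls with no per-triangle loop overhead. */
    for (; tris >= 8; tris -= 8, p += 24) {
        draw_one(s, p + 0);
        draw_one(s, p + 3);
        draw_one(s, p + 6);
        draw_one(s, p + 9);
        draw_one(s, p + 12);
        draw_one(s, p + 15);
        draw_one(s, p + 18);
        draw_one(s, p + 21);
    }

    /* 0..7 triangles remain: enter the chain at the right case and fall through. */
    switch (tris) {
    case 7: draw_one(s, p); p += 3; /* fall through */
    case 6: draw_one(s, p); p += 3; /* fall through */
    case 5: draw_one(s, p); p += 3; /* fall through */
    case 4: draw_one(s, p); p += 3; /* fall through */
    case 3: draw_one(s, p); p += 3; /* fall through */
    case 2: draw_one(s, p); p += 3; /* fall through */
    case 1: draw_one(s, p);         /* last leftover triangle */
    default: break;
    }
}
```

The trade-off matches the one noted in the message: the unrolled body and the switch make the object code larger in exchange for fewer branches and less per-triangle bookkeeping.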
Diffstat (limited to 'src/gallium/drivers/cell/spu/spu_tri.h')
-rw-r--r-- src/gallium/drivers/cell/spu/spu_tri.h | 2
1 files changed, 1 insertions, 1 deletions
diff --git a/src/gallium/drivers/cell/spu/spu_tri.h b/src/gallium/drivers/cell/spu/spu_tri.h
index aa694dd7c9..82e3b19ad7 100644
--- a/src/gallium/drivers/cell/spu/spu_tri.h
+++ b/src/gallium/drivers/cell/spu/spu_tri.h
@@ -31,7 +31,7 @@
 extern boolean
-tri_draw(const float *v0, const float *v1, const float *v2, uint tx, uint ty);
+tri_draw(const qword vs, uint tx, uint ty);
 #endif /* SPU_TRI_H */
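
The new prototype replaces three vertex pointers with a single qword argument. How the commit actually lays out that qword is not visible in this header diff, so the sketch below only illustrates the general idea under stated assumptions: three 32-bit vertex addresses occupy the first three 32-bit slots of a 128-bit value and the fourth slot is unused. The type quad_u32 is a hypothetical portable stand-in for the SPU qword vector type, and pack_vertex_addrs() and unpack_vertex_addr() are invented names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable stand-in for a 128-bit SPU qword: four 32-bit slots. */
typedef struct {
    uint32_t word[4];
} quad_u32;

/*
 * Compute three vertex addresses from a base address, a per-vertex stride
 * and three vertex indices, and pack them into one 128-bit value.
 * Assumed layout: slots 0..2 hold the addresses, slot 3 is left at zero.
 */
static quad_u32 pack_vertex_addrs(uint32_t base, uint32_t stride,
                                  uint32_t i0, uint32_t i1, uint32_t i2)
{
    quad_u32 vs;
    vs.word[0] = base + i0 * stride;
    vs.word[1] = base + i1 * stride;
    vs.word[2] = base + i2 * stride;
    vs.word[3] = 0;
    return vs;
}

/* Recover the address of vertex `slot` (0, 1 or 2) on the callee side. */
static uint32_t unpack_vertex_addr(quad_u32 vs, unsigned slot)
{
    assert(slot < 3);
    return vs.word[slot];
}
```

Passing one packed value instead of three separate pointers lets the caller compute all three addresses together and hand them over in a single register-sized argument, which fits the message's note that spu_tri.o gets a little smaller.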