Age | Commit message (Collapse) | Author |
|
block.
|
|
|
|
This also allows us to split the loop emulation into two phases: a
transformation phase, which either unrolls loops or prepares them to be
emulated, and an emulation phase, which unrolls the remaining loops until
the instruction limit is reached. The second phase runs after the
deadcode analysis in order to get a more accurate count of the number of
instructions in the body of loops.
|
|
1. Move all GL entrypoint functions and files into src/mesa/main/
This includes the ARB vp/fp, NV vp/fp, ATI fragshader and GLSL bits
that were in src/mesa/shader/
2. Move src/mesa/shader/slang/ to src/mesa/slang/ to reduce the tree depth
3. Rename src/mesa/shader/ to src/mesa/program/ since all the
remaining files are concerned with GPU programs.
4. Misc code refactoring. In particular, I got rid of most of the
GLSL-related ctx->Driver hook functions. None of the drivers used
them.
Conflicts:
src/mesa/drivers/dri/i965/brw_context.c
|
It is not perfect, but it is the best we got.
|
|
This reverts commit a9ee95651131e27d5acf3d10909b5b7e5c8d3e92.
It was based on a failure to understand how the driver allocates
memory, and causes a regression with Celestia.
Set MaxLevel to dstLevel before allocating new mipmap level.
The radeon driver will fail to allocate space for a new level that
is outside of BaseLevel..MaxLevel. Set MaxLevel before allocating.
Signed-off-by: Maciej Cencora <m.cencora@gmail.com>
|
|
Fixes segfault in mipmap_view.c demo. Bug #27212.
|
I broke this with the state streaming changes.
|
|
|
|
BOs are stored in the bufmgr, which is freed as part of the screen
structure.
|
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 31.791 32.287 1.11% 6/6
after:
[ 0] gl firefox-talos-gfx 31.198 31.675 0.96% 6/6
|
e.g. for(i=10; i>0; i--)
|
|
This only works with for loops that increment the counter.
e.g. for(i=0; i<10; i++)
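For illustration only, here is a minimal sketch of how the trip count of such a counting loop could be derived from its initial value, limit and step. The function name and signature are hypothetical and are not taken from the driver. The decrementing form (i=10; i>0; i--) is handled symmetrically.

```c
/*
 * Hypothetical helper, not driver code: number of iterations of
 *   for (i = init; i < limit; i += step)   when step > 0
 *   for (i = init; i > limit; i += step)   when step < 0
 * Returns -1 when the step is zero, since the loop then has no
 * statically known trip count.
 */
static int loop_iteration_count(int init, int limit, int step)
{
    if (step > 0)
        return limit > init ? (limit - init + step - 1) / step : 0;
    if (step < 0)
        return init > limit ? (init - limit - step - 1) / -step : 0;
    return -1;
}
```

A loop whose count comes back as -1 would have to be left for emulation rather than fully unrolled.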
|
|
The loop emulation unrolls loops as many times as possible while still
keeping the shader program below the maximum instruction limit. At this
point, there are no checks for constant conditionals. This is only enabled
for fragment shaders.
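As a rough sketch of the budget calculation this implies (an assumption about the arithmetic, not the driver's actual code), the number of copies of a loop body that fit is the remaining instruction budget divided by the body size. All names below are hypothetical.

```c
/*
 * Hypothetical helper, not driver code: how many copies of a loop body
 * fit in a program without exceeding max_insts instructions.
 *   max_insts          - hardware instruction limit for the shader
 *   insts_outside_loop - instructions that are not part of the loop body
 *   loop_body_insts    - instructions in one copy of the loop body
 */
static unsigned max_unroll_iterations(unsigned max_insts,
                                      unsigned insts_outside_loop,
                                      unsigned loop_body_insts)
{
    /* No room left, or an empty body: nothing to unroll. */
    if (loop_body_insts == 0 || insts_outside_loop >= max_insts)
        return 0;
    return (max_insts - insts_outside_loop) / loop_body_insts;
}
```

A smaller body count, for example one taken after dead code has been removed, directly increases the number of iterations that fit.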
|
|
We had to fill out all that junk when using the cache, but no more.
|
|
|
|
This makes the binding table code simpler, and is required for gen6,
which requires binding table addresses to be under 64k offset from the
surface state base addr.
No significant change in performance on firefox-talos-gfx.
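The 64k constraint above can be written as a simple range check. This is a sketch under the assumption that both addresses are plain 32-bit offsets in the same address space; the function name is hypothetical and does not exist in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check, not driver code: true if the binding table lies
 * less than 64k past the surface state base address.
 */
static bool binding_table_offset_ok(uint32_t surface_state_base,
                                    uint32_t binding_table_addr)
{
    /* A table placed below the base has no valid offset at all. */
    if (binding_table_addr < surface_state_base)
        return false;
    return binding_table_addr - surface_state_base < 64u * 1024u;
}
```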
|
|
Now that the binding table is streamed indirect state, they were
always NULL/0.
|
|
|
|
It turns out that computing a 56 byte key to look up a 20-byte object
out of a hash table was some sort of a bad idea. Whoops.
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
after:
[ 0] gl firefox-talos-gfx 34.761 34.784 0.17% 5/6
|
|
This slightly reduces cairo-gl firefox-talos-gfx runtime on my
Ironlake:
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 38.236 38.383 0.43% 5/6
after:
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
It turns out the cost of caching these objects and looking them up in
the cache again is greater than the cost of just computing the object
again, particularly when the overhead of having a separate BO to pin
is removed.
(Those that are paying close attention will note that this is a
reversal of the path I was moving the driver in a couple of years ago.
The major thing that has changed is that back then all state was
recomputed when we wrapped the streaming state buffer, including
recompiling our precious programs. Now, we're uncaching just the
objects that are cheap to compute, and retaining caching of expensive
objects)
|
|
This was bothering me when redoing the binding tables.
|
|
|
|
The cache lookup of these two little floats was .12% of total CPU time
on firefox-talos-gfx because we did it any time commonly-changed state
changed. On the other hand, updating the CC VP bo immediately whenever
CC VP state changes is a .07% overhead due to putting a driver hook
in glEnable().
|
|
|
|
It's more likely that we wrap badly in state setup than in the little
primitive packet.
|
|
It just duplicated the default/core Mesa behaviour.
|