Age | Commit message (Collapse) | Author |
|
The BGNLOOP and ENDLOOP instructions are now being used correctly, which
makes break and continue possible. The deadcode pass has been modified to
handle breaks, and the compiler is more careful about which loops are
unrolled.
|
|
|
|
|
|
When a DRI2 swap buffer is pending we need to make sure we
have the flush extension so radeon doesn't resume rendering to
or reading from the not yet blitted front buffer.
This fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=28341
https://bugs.freedesktop.org/show_bug.cgi?id=28410
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
|
|
This reverts commit 8446f257b3e3ca4a3eb2c79bc357e46343e04e87.
|
|
When DRI2 swap buffer is pending (copy buffer not pageflipping)
we need to make sure we have the flush extension so radeon doesn't
resume rendering on the not yet blitted front buffer.
Modified version of Jerome's patch to add flush extension
in the correct place.
This prepares a possible fix for:
https://bugs.freedesktop.org/show_bug.cgi?id=28341
https://bugs.freedesktop.org/show_bug.cgi?id=28410
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
|
|
r600 doesnt need the same normalization as r700 - instead it requires
range to be truncated to -pi..pi
I left the range trunc also effective on r700 althouch according the docs
it has sufficent range (-512*PI, +512*PI). The instructions seem
to be used not too often to cause perf loss because of this
Based on patches and testing by Conn Clark and Alain Perrot
|
|
mtypes.h does not use any symbols from compiler.h.
Also add the required headers for files that depended on symbols from
compiler.h but were indirectly including compiler.h through mtypes.h.
|
|
Fixes "implicit declaration of function
_mesa_get_incomplete_framebuffer" warning.
|
|
Add context.h for NEED_SECONDARY_COLOR symbol.
|
|
Add context.h for FLUSH_VERTICES symbol.
|
|
|
|
|
|
with no output
|
|
|
|
|
|
Also fix up comments, so that the difference between the two passes is
clarified.
|
|
|
|
|
|
|
|
|
|
|
|
Several routines directly analyze the grf-to-mrf moves from the Gen
binary code. When it is possible, the mov is removed and the message
register is directly written in the arithmetic instruction
Also redundant mrf-to-grf moves are removed (frequently for example,
when sampling many textures with the same uv)
Code was tested with piglit, warsow and nexuiz on an Ironlake
machine. No regression was found there
Note that the optimizations are *deactivated* on Gen4 and Gen6 since I
did test them properly yet. No reason there are bugs but who knows
The optimizations are currently done in branch free programs *only*.
Considering branches is more complicated and there are actually two
paths: one for branch free programs and one for programs with branches
Also some other optimizations should be done during the emission
itself but considering that some code is shader between vertex shaders
(AOS) and pixel shaders (SOA) and that we may have branches or not, it
is pretty hard to both factorize the code and have one good set of
strategies
|
|
Clarifies program assembly, and with a little tweak to always use
constant_map, we could cut down on constant buffer payload.
|
|
This should be more useful for developers and for bug triaging than
just generating wrong code.
|
|
Fixes glsl-vs-arrays. Bug #27388.
|
|
Fixes glsl-vs-point-size.
|
|
This has confused me twice now. It's a fixed width of 4 (usually a
region description of <4,4,1>), not 1. If it was 1, we'd have been
skipping all over register space.
|
|
|
|
|
|
This supersedes http://lists.freedesktop.org/archives/mesa-dev/2010-July/001442.html.
|
|
The ARL value is increments of vec4 in the register file. But
PROGRAM_TEMPORARY or PROGRAM_INPUT are stored as vec4s interleaved
between the two verts being executed (thus a vec8 each), compared to
PROGRAM_STATE_VAR being packed vec4s.
Fixes:
glsl-vs-arrays-2
glsl-vs-mov-after-deref
(without regressing glsl-vs-arrays-3)
|
|
|
|
The previous support was overly complicated by trying to use the same
1-OWORD message for both offsets.
|
|
|
|
|
|
Otherwise, the second half isn't written, and we end up reading back
black.
Fixes the remaining junk drawn in glsl-max-varyings, and will likely
help with a number of large real-world shaders.
|
|
They go into the render cache, so while we don't care about their
contents after execution, failing to note them could cause the writes
to be flushed over important buffer contents later.
|
|
|
|
Otherwise, the subsequent read may not get the written value.
|
|
|
|
To quiet a compiler warning.
|
|
|
|
The extension never worked, the implementation returns GLX_BAD_CONTEXT
when enabling the frame tracking.
|
|
Only r200 implemented it.
|
|
There was confusion on both the size of message we can send, and on
what the URB destination offset means.
The remaining problems appear to be due to spilling of regs in the
fragment shader being broken.
|
|
|
|
This cleans up some chipset dependency sprinkled around, and fixes a
potential overflow of the attribute offset array for many vertex
results.
|
|
|
|
|