Age | Commit message (Collapse) | Author |
|
If the loop ends with an if with one break or in a single break unroll
it. Loops that end with a continue will have that continue removed by
the redundant jump optimizer. Likewise loops that end with an
if-statement with a break at the end of both branches will have the
break pulled out after the if-statement.
Loops of the form
for (...) {
do_something1();
if (cond) {
do_something2();
break;
} else {
do_something3();
}
}
will be unrolled as
do_something1();
if (cond) {
do_something2();
} else {
do_something3();
do_something1();
if (cond) {
do_something2();
} else {
do_something3();
/* Repeat inserting iterations here.*/
}
}
ir_lower_jumps can guarantee that all loops are put in this form
and thus all loops are now potentially unrollable if an upper bound
on the number of iterations can be found.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
The loop_controls pass didn't look at the counter values it put in ir_loop
on previous iterations, so while the first iteration worked, subsequent
ones couldn't determine max_iterations.
|
|
Fixes piglit tests glsl-vs-main-return and glsl-fs-main-return.
|
|
|
|
Changes in v2:
- Base class renamed to ir_control_flow_visitor
- Tried to comply with coding style
This is a new pass that supersedes ir_if_return and "lowers" jumps
to if/else structures.
Currently it causes no regressions on softpipe and nv40, but I'm not sure
whether the piglit glsl tests are thorough enough, so consider this
experimental.
It can be asked to:
1. Pull jumps out of ifs where possible
2. Remove all "continue"s, replacing them with an "execute flag"
3. Replace all "break" with a single conditional one at the end of the loop
4. Replace all "return"s with a single return at the end of the function,
for the main function and/or other functions
This gives several great benefits:
1. All functions can be inlined after this pass
2. nv40 and other pre-DX10 chips without "continue" can be supported
3. nv30 and other pre-DX10 chips with no control flow at all are better supported
Note that for full effect we should also teach the unroller to unroll
loops with a fixed maximum number of iterations but with the canonical
conditional "break" that this pass will insert if asked to.
Continues are lowered by adding a per-loop "execute flag", initialized to
TRUE, that when cleared inhibits all execution until the end of the loop.
Breaks are lowered to continues, plus setting a "break flag" that is checked
at the end of the loop, and trigger the unique "break".
Returns are lowered to breaks/continues, plus adding a "return flag" that
causes loops to break again out of their enclosing loops until all the
loops are exited: then the "execute flag" logic will ignore everything
until the end of the function.
Note that "continue" and "return" can also be implemented by adding
a dummy loop and using break.
However, this is bad for hardware with limited nesting depth, and
prevents further optimization, and thus is not currently performed.
|
|
This is just a subclass of ir_visitor with empty implementations of all
the visit methods for non-control flow nodes.
Used to avoid duplicating that in ir_visitor subclasses.
ir_hierarchical_visitor is another way to solve this, but is less natural
for some applications.
|
|
Should fix fdo 30168.
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=29901
https://bugs.freedesktop.org/show_bug.cgi?id=30132
|
|
According to gcc documentation both are equivalent,
second are prefered as first can make conflict with existing symbols.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
|
|
Point about needing a better way to do this validated.
|
|
This applies to r6xx/r7xx/evergreen
|
|
Allows KMS EGL driver to load. We need a better way of doing this.
|
|
|
|
|
|
saves a few dwords
|
|
r700start3d already emits the context control packets
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
probably copy/paste error
|
|
|
|
|
|
For GLX 1.3 drawables, we can destroy the DRI2 drawable when the GLX
drawable is destroyed. However, for legacy drawables, there os no
good way of knowing when the application is done with it, so we just
let the DRI2 drawable linger on the server. The server will destroy
the DRI2 drawable when it destroys the X drawable or the client exits
anyway.
https://bugs.freedesktop.org/show_bug.cgi?id=30109
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=29901
|
|
|
|
different yet compatible formats
|
|
evergreen is always the same as r700 here.
|
|
no idea how/why it got there
|
|
|
|
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=30145
|
|
|
|
|
|
radeon_cs_space_check flushes the pipe context on failure, retries
the validation, and returns -1 if it fails again. At that point, there is
nothing we can do, so let's skip draw operations instead of getting stuck
in an infinite loop.
This code path ideally should never be hit.
|
|
This a leftover probably and is unnecessary, since we flush u_upload_mgr
in r300_flush.
|
|
|
|
Fixes SCons build.
|
|
|
|
Doesn't require 16-alignment, so catch more cases.
|
|
Keep step array as a set of four m128i's and reuse throughout the
rasterization.
|
|
Fragment shader can extract the correct bits for each quad.
|
|
|
|
I really don't understand the mechanism behind this, but it
seems like the way data blocks for a scene are malloced, and in
particular whether we treat them as stack or a queue, and whether
we retain the most recently allocated or least recently allocated
has a real affect (~5%) on isosurf framerates...
This is probably specific to my distro or even just my machine,
but none the less, it's nicer not to see the framerates go in the
wrong direction.
|
|
Mesa doesn't respect it anyway, but this makes it assert rather
than threads access areas of l[] that don't belong to them.
|