Age | Commit message (Collapse) | Author |
|
This turns on if conversion and unlimited loop unrolling if control
flow is not supported.
NOTE: this will change the behavior of r300g and any other driver
that doesn't advertise control flow
|
|
Changes in v3:
- Also change trace, which I forgot about
Changes in v2:
- No longer adds tessellation shaders
Currently each shader cap has FS and VS versions.
However, we want a version of them for geometry, tessellation control,
and tessellation evaluation shaders, and want to be able to easily
query a given cap type for a given shader stage.
Since having 5 duplicates of each shader cap is unmanageable, add
a new get_shader_param function that takes both a shader cap from a
new enum and a shader stage.
Drivers with non-unified shaders will first switch on the shader
and, within each case, switch on the cap.
Drivers with unified shaders instead first check whether the shader
is supported, and then switch on the cap.
MAX_CONST_BUFFERS is now per-stage.
The geometry shader cap is removed in favor of checking whether the
limit of geometry shader instructions is greater than 0, which is also
used for tessellation shaders.
WARNING: all drivers changed and compiled but only nvfx tested
|
|
|
|
Currently GLSL IR forbids any vector comparisons, and defines "ir_binop_equal"
and "ir_binop_nequal" to compare all elements and give a single bool.
This is highly unintuitive and prevents generation of optimal Mesa IR.
Hence, first rename "ir_binop_equal" to "ir_binop_all_equal" and
"ir_binop_nequal" to "ir_binop_any_nequal".
Second, readd "ir_binop_equal" and "ir_binop_nequal" with the same semantics
as less, lequal, etc.
Third, allow all comparisons to acts on vectors.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
If the loop ends with an if with one break or in a single break unroll
it. Loops that end with a continue will have that continue removed by
the redundant jump optimizer. Likewise loops that end with an
if-statement with a break at the end of both branches will have the
break pulled out after the if-statement.
Loops of the form
for (...) {
do_something1();
if (cond) {
do_something2();
break;
} else {
do_something3();
}
}
will be unrolled as
do_something1();
if (cond) {
do_something2();
} else {
do_something3();
do_something1();
if (cond) {
do_something2();
} else {
do_something3();
/* Repeat inserting iterations here.*/
}
}
ir_lower_jumps can guarantee that all loops are put in this form
and thus all loops are now potentially unrollable if an upper bound
on the number of iterations can be found.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
|
|
The loop_controls pass didn't look at the counter values it put in ir_loop
on previous iterations, so while the first iteration worked, subsequent
ones couldn't determine max_iterations.
|
|
Fixes piglit tests glsl-vs-main-return and glsl-fs-main-return.
|
|
|
|
Changes in v2:
- Base class renamed to ir_control_flow_visitor
- Tried to comply with coding style
This is a new pass that supersedes ir_if_return and "lowers" jumps
to if/else structures.
Currently it causes no regressions on softpipe and nv40, but I'm not sure
whether the piglit glsl tests are thorough enough, so consider this
experimental.
It can be asked to:
1. Pull jumps out of ifs where possible
2. Remove all "continue"s, replacing them with an "execute flag"
3. Replace all "break" with a single conditional one at the end of the loop
4. Replace all "return"s with a single return at the end of the function,
for the main function and/or other functions
This gives several great benefits:
1. All functions can be inlined after this pass
2. nv40 and other pre-DX10 chips without "continue" can be supported
3. nv30 and other pre-DX10 chips with no control flow at all are better supported
Note that for full effect we should also teach the unroller to unroll
loops with a fixed maximum number of iterations but with the canonical
conditional "break" that this pass will insert if asked to.
Continues are lowered by adding a per-loop "execute flag", initialized to
TRUE, that when cleared inhibits all execution until the end of the loop.
Breaks are lowered to continues, plus setting a "break flag" that is checked
at the end of the loop, and trigger the unique "break".
Returns are lowered to breaks/continues, plus adding a "return flag" that
causes loops to break again out of their enclosing loops until all the
loops are exited: then the "execute flag" logic will ignore everything
until the end of the function.
Note that "continue" and "return" can also be implemented by adding
a dummy loop and using break.
However, this is bad for hardware with limited nesting depth, and
prevents further optimization, and thus is not currently performed.
|
|
This is just a subclass of ir_visitor with empty implementations of all
the visit methods for non-control flow nodes.
Used to avoid duplicating that in ir_visitor subclasses.
ir_hierarchical_visitor is another way to solve this, but is less natural
for some applications.
|
|
Should fix fdo 30168.
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=29901
https://bugs.freedesktop.org/show_bug.cgi?id=30132
|
|
According to gcc documentation both are equivalent,
second are prefered as first can make conflict with existing symbols.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
|
|
Point about needing a better way to do this validated.
|
|
This applies to r6xx/r7xx/evergreen
|
|
Allows KMS EGL driver to load. We need a better way of doing this.
|
|
|
|
|
|
saves a few dwords
|
|
r700start3d already emits the context control packets
|
|
|
|
probably copy/paste error
|
|
|
|
|
|
For GLX 1.3 drawables, we can destroy the DRI2 drawable when the GLX
drawable is destroyed. However, for legacy drawables, there os no
good way of knowing when the application is done with it, so we just
let the DRI2 drawable linger on the server. The server will destroy
the DRI2 drawable when it destroys the X drawable or the client exits
anyway.
https://bugs.freedesktop.org/show_bug.cgi?id=30109
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=29901
|
|
|
|
different yet compatible formats
|
|
evergreen is always the same as r700 here.
|
|
no idea how/why it got there
|
|
|
|
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=30145
|
|
|
|
|
|
radeon_cs_space_check flushes the pipe context on failure, retries
the validation, and returns -1 if it fails again. At that point, there is
nothing we can do, so let's skip draw operations instead of getting stuck
in an infinite loop.
This code path ideally should never be hit.
|
|
This a leftover probably and is unnecessary, since we flush u_upload_mgr
in r300_flush.
|
|
|
|
Fixes SCons build.
|
|
|
|
Doesn't require 16-alignment, so catch more cases.
|
|
Keep step array as a set of four m128i's and reuse throughout the
rasterization.
|
|
Fragment shader can extract the correct bits for each quad.
|
|
|
|
I really don't understand the mechanism behind this, but it
seems like the way data blocks for a scene are malloced, and in
particular whether we treat them as stack or a queue, and whether
we retain the most recently allocated or least recently allocated
has a real affect (~5%) on isosurf framerates...
This is probably specific to my distro or even just my machine,
but none the less, it's nicer not to see the framerates go in the
wrong direction.
|
|
|
|
If not opaque, then the color buffer will have to be read any way,
therefore the specialization is pointless.
|
|
If the buffer we are attempting to map is referenced by the unsubmitted
command stream for this context, we need to flush the command stream,
however to do that we need to be able to access the context at the lowest
level map function, currently we set the buffer in the toplevel map, but this
racy between context. (we probably have a lot more issues than that.)
I'll look into a proper solution as suggested by jrfonseca when I get some time.
|