Age | Commit message (Collapse) | Author |
|
This makes a very simple 1.30 shader go from 196k of memory to 9k.
NOTE: This -may- be a candidate for the 7.9 branch, as the benefit is
substantial. However, it's not a simple change, so it may be wiser to
wait for 7.10.
|
|
We are not aware of any GPU that actually implements the cross product
as a single instruction. Hence, there's no need for it to be an opcode.
Future commits will remove it entirely.
|
|
|
|
|
|
In particular, calling the abs function is silly, since there's already
an expression opcode for that. Also, assigning to temporaries then
assigning those to the final location is rather redundant.
|
|
For consistency with the vec2/vec3/vec4 variants.
|
|
This works around MSVC's 65535 byte limit, unfortunately at the expense
of any semblance of readability and much larger file size. Hopefully I
can implement a better solution later, but for now this fixes the build.
|
|
|
|
This implements round() via the ir_unop_round_even opcode, rather than
adding a new opcode. We may wish to add one in the future, since it
might enable a small performance increase on some hardware, but for now,
this should suffice.
|
|
Implemented using the op-code introduced in the previous commit.
|
|
|
|
|
|
It turns out that most people new to this IR are surprised when an
assignment to (say) 3 components on the LHS takes 4 components on the
RHS. It also makes for quite strange IR output:
(assign (constant bool (1)) (x) (var_ref color) (swiz x (var_ref v) ))
(assign (constant bool (1)) (y) (var_ref color) (swiz yy (var_ref v) ))
(assign (constant bool (1)) (z) (var_ref color) (swiz zzz (var_ref v) ))
But even worse, even we get it wrong, as shown by this line of our
current step(float, vec4):
(assign (constant bool (1)) (w)
(var_ref t)
(expression float b2f (expression bool >=
(swiz w (var_ref x))(var_ref edge))))
where we try to assign a float to the writemasked-out x channel and
don't supply anything for the actual w channel we're writing. Drivers
right now just get lucky since ir_to_mesa spams the float value across
all the source channels of a vec4.
Instead, the RHS will now have a number of components equal to the
number of components actually being written. Hopefully this confuses
everyone less, and it also makes codegen for a scalar target simpler.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
|
|
|
|
Commit 309cd4115b7cba669a0bf858e7809cb6dae90ddf incorrectly converted
these to all_equal and any_nequal, which is the wrong operation.
|
|
|
|
|
|
|
|
Otherwise it gets used uninitialized.
|
|
Otherwise builtin_profiles contains dangling pointers the next time
_mesa_read_profile is called. I suspect this may fix bugzilla #29847,
but I was never able to reproduce it.
|
|
These need abs, and we need more tests.
|
|
|
|
ir_binop_dot is only defined for vector types. Use ir_binop_mul.
|
|
The code being generated was just stupid, considering that:
- normalize(x) = 1.0
- length(x) = x
- distance(x, y) = x - y
|
|
Fix an major regression in dc754586. Too bad that change was
obviously never tested.
|
|
|
|
|
|
|
|
|
|
When x==0, the result was wrong. Fixes piglit glsl-fs-atan-1.shader_test
|
|
When releasing the builtin functions, we were just freeing the memory,
not telling the builtin function loader that we had freed its memory.
I wish I had done ARB_ES2_compatibility so we had regression testing
of this path. Fixes segfault on changing video options in nexuiz.
|
|
DRI was doing teardown when we close the last screen, then an atexit()
was added to call it as well.
|
|
As of 1.20, variable names, function names, and structure type names all
share a single namespace, and should conflict with one another in the
same scope, or hide each other in nested scopes.
However, in 1.10, variables and functions can share the same name in the
same scope. Structure types, however, conflict with/hide both.
Fixes piglit tests redeclaration-06.vert, redeclaration-11.vert,
redeclaration-19.vert, and struct-05.vert.
|
|
Make glsl include only main/core.h from core mesa.
|
|
The previous any() implementation would generate arg0.x || arg0.y ||
arg0.z. Having an expression operation for this makes it easy for the
backend to generate something easier (DPn + SNE for 915 FS, .any
predication on 965 VS)
|
|
|
|
|
|
This should make it easier to diff the output, clean up some of the
insane whitespace, and make the strings a bit smaller.
We'll probably need to split up the prototype strings eventually, but
for now, this gets it under the 65K mark.
|
|
|
|
Fixes fd.o bug #29629.
|
|
Many functions are currently wrapped with #if 0 since we haven't
implemented them yet.
|
|
Each language version/extension and target now has a "profile" containing
all of the available builtin function prototypes. These are written in
GLSL, and come directly out of the GLSL spec (except for expanding genType).
A new builtins/ir/ folder contains the hand-written IR for each builtin,
regardless of what version includes it. Only those definitions that have
prototypes in the profile will be included.
The autogenerated IR for texture builtins is no longer written to disk,
so there's no longer any confusion as to what's hand-written or
generated.
All scripts are now in python instead of perl.
|
|
|
|
|
|
Some signatures were being generated with the wrong function name.
|
|
Fixes glsl-fs-tan-1.
|
|
So many problems here. One is that we can't do the quadrant handling
for all the channels at the same time, so we call the float(y, x)
version multiple times. I'd also left out the x == 0 handling. Also,
the quadrant handling was broken for y == 0, so there was a funny
discontinuity on the +x side if you plugged in obvious values to test.
I generated the atan(float y, float x) code from a short segment of
GLSL and pasted it in by hand. It would be nice to automate that
somehow.
Fixes:
glsl-fs-atan-1
glsl-fs-atan-2
|
|
The type signatures were completely backwards.
|
|
|
|
|