Age | Commit message (Collapse) | Author |
|
This should reduce the cost of generating shadow maps, for example.
No performance difference measured in nexuiz, though it does trigger
this path.
|
|
These were for debugging in bringup. Now that relatively complicated
apps are working, they haven't helped debug anything in quite a while.
|
|
This fixes hangs in some Z-writes-in-shaders tests, though other
pieces don't come out correctly.
Bug #30392: hang in fbo-fblit-d24s8. (still fails with bad color drawn
to some targets)
|
|
Now that MESA_MINOR=10, we no longer need the extra '0' in the
version string.
|
|
|
|
|
|
rc_get_readers_normal() supplies a list of readers for a given
instruction. This function is now being used by the copy propagate
optimization and will eventually be used by most other optimization
passes as well.
|
|
|
|
It is possible for a single pair instruction arg to select from both an
RGB and an Alpha source.
|
|
|
|
|
|
Fixes glean/bufferObject.
|
|
|
|
|
|
Fixes fbo-blit and probably several other tests.
|
|
XOR makes much more sense. Note that the previous code would have
failed for not(not(x)), but that gets optimized out.
|
|
|
|
Otherwise, it would try to handle arrays as structures, use
uninitialized memory, and crash.
|
|
We often use reg_null as the destination when setting up the flag
regs. However, on gen6 there aren't general implicit conversions to
destination types from src types, so the comparison to produce the
flag regs would be done on the integer result interpreted as a float.
Hilarity ensued.
Fixes 20 piglit cases.
|
|
|
|
Previously _LinkedShaders was a compact array of the linked shaders
for each shader stage. Now it is arranged such that each slot,
indexed by the MESA_SHADER_* defines, refers to a specific shader
stage. As a result, some slots will be NULL. This makes things a
little more complex in the linker, but it simplifies things in other
places.
As a side effect _NumLinkedShaders is removed.
NOTE: This may be a candidate for the 7.9 branch. If there are other
patches that get backported to 7.9 that use _LinkedShader, this patch
should be cherry picked also.
|
|
I broke it in 06fd639c519214b6ebcbf29127b6d9ed429f8641 by only testing
2 generations of hardware :(
|
|
|
|
Hopefully this code can just go away soon.
|
|
It is now to the point where we have no regressing piglit tests. It
also fixes Yo Frankie! and Humus DynamicBranching, probably due to the
piglit bias tests that work that didn't on the Mesa IR backend.
As a downside, performance takes about a 5-10% performance hit at the
moment (e.g. nexuiz 19.8fps -> 18.8fps), which I plan to resolve by
reintroducing 16-wide fragment shaders where possible. It is a win,
though, for fragment shaders using flow control.
|
|
Simply using RNDU, RNDZ, or RNDE does not produce the desired result.
Rather, the RND* instructions place a value in the destination register
that may be 1 less than the correct answer. They can also set per-channel
"increment bits" in a flag register, which, if set, mean dest needs to
be incremented by 1. A second instruction - a predicated add -
completes the job.
Notably, RNDD always produces the correct answer in a single
instruction.
Fixes piglit test glsl-fs-trunc.
|
|
The existing code used RNDD, which rounds down, rather than toward zero.
|
|
Fixes piglit test glsl-fs-ceil.
|
|
This cuts usually 2 out of 3 instructions for flag reg generation (if
statements, conditional assignment) by producing the conditional mod
in the expression representing the boolean value.
Fixes glsl-fs-vec4-indexing-temp-dst-in-nested-loop-combined (register
allocation no longer fails for the conditional generation
proliferation)
|
|
This will be a place to peephole comparisions directly to the flag
regs, and for now avoids using MOV with conditional mod on gen6, which
is now illegal.
|
|
Improves nexuiz performance 0.91% (+/- 0.54%, n=8)
|
|
|
|
So far, I've only seen this be a valgrind warning and not a real failure.
|
|
This reverts commit 73dab75b4165f7d2214a68d4ba8e3cb7aab9b4ac.
|
|
Don't use r0 for FF_SYNC dest reg on Sandybridge, which would
smash FFID field in GS payload, that cause later URB write fail.
Also not use r0 in any URB write requiring allocate.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
__GLcontextModes is always only used as an implementation internal struct
at this point and we shouldn't install glcore.h anymore. Anything that
needs __GLcontextModes should just include the struct in its headers files
directly.
|
|
Fixes this GCC warning.
tdfx_texman.c: In function 'tdfxTMMoveOutTM_NoLock':
tdfx_texman.c:897: warning: unused variable 'shared'
|
|
Fixes this GCC warning.
r300_state.c: In function 'r300InvalidateState':
r300_state.c:2247: warning: 'hw_format' may be used uninitialized in this function
r300_state.c:2247: note: 'hw_format' was declared here
|
|
There was a check to only do the rebase if we didn't have everything
in VBOs, but nexuiz apparently hands us a mix of VBOs and arrays,
resulting in blocking on the GPU to do a rebase.
Improves nexuiz 800x600, high-settings performance on my Ironlake 41%
(+/- 1.3%), from 14.0fps to 19.7fps.
|
|
The format selection of the CopyTexSubImage is pretty bogus still, but
this at least avoids software fallbacks in nexuiz, bringing
performance from 7.5fps to 12.8fps on my machine.
|
|
Fixes glsl-fs-i2b.
|
|
Useful to amortize the command submission/reloc overhead (e.g. etracer
goes from 72 to 109 FPS on nv4b).
|
|
|
|
It's now much more correct for gen6 than the old backend, with just 2
regressions I've found (one of which is common with pre-gen6 and will
be fixed by an array splitting IR pass).
This does leave the old Mesa IR backend getting used still when we
don't have GLSL IR, but the plan is to get GLSL IR input to the driver
for the ARB programs and fixed function by the next release.
|