Age | Commit message (Collapse) | Author |
|
This is a 2.9% (+/-.3%) performance win for my GL demo, which hits MAD
sequences for matrix transforms.
|
|
This shaves more instructions off of the VS of my GL demo, but no
performance difference this time at n=6. This also fixes a regression
that was in this path, which is now piglit's glsl-vs-mov-after-deref.
|
|
Thanks to Chia-I Wu's changes to the extension function
infrastructure, we no longer have to tell the loader which extensions
the driver might enable. This means that intelInitExtensions will
never be called with a NULL context pointer. Remove all the NULL checks.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
Before, we weren't aggressive enough in checking for the start or end
of render-to-texture. In particular, if only the ctx->ReadBuffer had
texture attachments, we were treating that as a render-to-texture case.
This fixes a regression from commit 75bdbdd90b15c8704d87ca195a364ff6a42edbb1
"intel: Don't validate in a texture image used as a render target."
|
|
|
|
|
|
|
|
|
|
|
|
For an app that's blowing out the state cache, like sauerbraten, the
memset of the giant arrays ended up taking 11% of the CPU even when only a
"few" of the entries got used. With this, the WM program compile drops back
down to 1% of CPU time.
Bug #24981 (bisected to BRW_WM_MAX_INSN increase).
|
|
|
|
Previously, we'd load linearly from ParameterValues[0] for the constants,
though ParameterValues[1] may not equal ParameterValues[0] + 4. Additionally,
the STATE_VAL type paramters didn't get updated.
Fixes piglit vp-constant-array-huge.vpfp and ET:QW object locations.
Bug #23226.
|
|
Fixes piglit vp-sge-alias test, and the googleearth ground shader. \o/
Bug #22228
(cherry picked from commit 56ab92bad8f1d05bc22b8a8471d5aeb663f220de)
|
|
Fixes piglit arl.vp.
(cherry picked from commit d52d78b4bcd6d4c0578f972c0b8ebac09e632196)
|
|
Fixes piglit vp-sge-alias test, and the googleearth ground shader. \o/
Bug #22228
|
|
Fixes piglit arl.vp.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
These are needed for HiZ which is not currently used and
the _BASE reg requires a reloc which is not currently supported
in the drm.
|
|
- consolidate DB render setup
- only enable perfect ZPASS counts and cull disable
when OQ is active
- enable early Z
|
|
These are needed for HiZ which is not currently used and
the _BASE reg requires a reloc which is not currently supported
in the drm.
|
|
|
|
|
|
Handle both NV vertex programs and NV vertex state programs passed to
glProgramStringARB.
|
|
|
|
|
|
Fixes bug 24967.
|
|
No statistically significant performance difference at n=3 with either
openarena or my GL demo, but cutting program size seems like a good
thing to be doing for the hypothetical app that has a working set near
icache size.
|
|
|
|
This should fix issues with antialiased lines in GLSL.
|
|
The PINTERP code should be faster for brw_wm_glsl.c now since brw_wm_emit.c's
had been improved, and pixel_w should no longer stomp on a neighbor to dst.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This drops support for get_src_reg_imm in these, but the prospect of getting
brw_wm_pass*.c onto our GLSL path is well worth some temporary pain.
|
|
This matches brw_wm_emit.c, which we'll be using shortly. There's a
possible penalty here in that we'll allocate registers for unused channels,
since we aren't doing ref tracking like brw_wm_pass*.c does. However, my
measurements on GM965 don't show any for either OA or UT2004 with the GLSL
path forced.
|
|
Depending on the writemask or the opcode, we can often trim the source
channels considered used for dead code elimination. This saves actual
instructions on 965 in the non-GLSL path for glean glsl1, and cleans up
the writemasks of programs even further.
|
|
This fixes the dead code elimination to work on the particular code
mentioned in the previous commit.
|
|
GLSL code such as:
vec4 result = {0, 1, 0, 0};
gl_FragColor = result;
emits code like:
0: MOV TEMP[0], CONST[0];
1: MOV OUTPUT[1], TEMP[0];
and this replaces it with:
0: MOV TEMP[0], CONST[0];
1: MOV OUTPUT[1], CONST[0];
Even when the dead code eliminator fails to clean up a now-useless MOV
instruction (since it doesn't do live/dead ranges), this should at reduce
dependencies.
|