Age | Commit message (Collapse) | Author |
|
In addition, the rename_reg pass has been rewritten to use
rc_get_readers().
|
|
Previously, presubtract operations were only being used by instructions
with fewer than three source registers.
|
|
When the result of the alpha instruction is being replicated to the RGB
destination register, we do not need to use alpha's destination register.
This fixes an invalid "Too many hardware temporaries used" error in
the case where a transcendent operation writes to a temporary register
greater than max_temp_regs.
NOTE: This is a candidate for the 7.9 branch.
|
|
This fixes an invalid "Too many hardware temporaries used" error in the
case where a source reads from a temporary register with an index greater
than max_temp_regs and then the source is marked as unused before the
register allocation pass.
NOTE: This is a candidate for the 7.9 branch.
|
|
Reads of registers that were not written to within the same block were
not being tracked. So in a situation like this:
0: IF
1: ADD t0, t1, t2
2: MOV t2, t1
Instruction 2 didn't know that instruction 1 read from t2, so
in some cases instruction 2 was being scheduled before instruction 1.
NOTE: This is a candidate for the 7.9 branch.
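The hazard in the example above is a write-after-read dependency: the MOV that writes t2 must stay after the ADD that reads it. A minimal sketch of one way to track this, assuming a hypothetical per-register table that remembers the last reader and is not reset at block boundaries such as IF (none of these names come from the compiler itself):

```c
#include <assert.h>

#define MAX_REGS 8
#define NO_INSN (-1)

/* Hypothetical tracker: index of the last instruction that read each
 * temporary. It is kept across block boundaries, so a read is remembered
 * even when the register was never written in the current block. */
struct reader_table {
    int last_reader[MAX_REGS];
};

static void reader_table_init(struct reader_table *t)
{
    for (int i = 0; i < MAX_REGS; i++)
        t->last_reader[i] = NO_INSN;
}

/* Record that instruction `insn` reads temporary `reg`. */
static void note_read(struct reader_table *t, int insn, int reg)
{
    assert(reg >= 0 && reg < MAX_REGS);
    t->last_reader[reg] = insn;
}

/* Return the instruction that a write to `reg` must be scheduled after,
 * or NO_INSN when nothing has read the register yet. */
static int write_depends_on(const struct reader_table *t, int reg)
{
    assert(reg >= 0 && reg < MAX_REGS);
    return t->last_reader[reg];
}
```

With the reads of instruction 1 recorded for t1 and t2, a later write to t2 reports instruction 1 as a dependency, so a scheduler using this table would not move the MOV ahead of the ADD.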
|
|
NOTE: This is a candidate for the 7.9 branch.
|
|
NOTE: This is a candidate for the 7.9 branch.
|
|
Gallium drivers pass all piglit tests for the two (there are 12 tests
for separate_shader_objects and 5 tests for explicit_attrib_location),
and I was told the extensions don't need any driver-specific code.
I made them dependent on PIPE_CAP_GLSL.
Signed-off-by: Brian Paul <brianp@vmware.com>
|
|
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=31779
|
|
Fix up some details in the xml files and regenerate dispatch files.
|
|
The drm winsys only ever handles one gem memory manager. Rip out
the unnecessary complication.
Reviewed-by: Jakob Bornecrantz <wallbraker@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jakob Bornecrantz <wallbraker@gmail.com>
|
|
Not using the gtt is considered harmful for performance. And for
partial uploads there's always drm_intel_bo_subdata.
Reviewed-by: Jakob Bornecrantz <wallbraker@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jakob Bornecrantz <wallbraker@gmail.com>
|
|
It's intel, so always little endian!
Reviewed-by: Jakob Bornecrantz <wallbraker@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jakob Bornecrantz <wallbraker@gmail.com>
|
|
Reviewed-by: Jakob Bornecrantz <wallbraker@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jakob Bornecrantz <wallbraker@gmail.com>
|
|
More in line with other intel drivers.
Change to use enum by Jakob Bornecrantz.
Reviewed-by: Jakob Bornecrantz <wallbraker@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jakob Bornecrantz <wallbraker@gmail.com>
|
|
It looks like this was meant to facilitate unfenced access to textures/
color/renderbuffers. It's totally incomplete and fundamentally broken
on a few levels:
- broken: The kernel needs to know about every tiled bo to fix up bit17
swizzling on swap-in.
- inflexible: fenced/unfenced relocs from execbuffer2 do the same, much
simpler.
- unneeded: with relaxed fencing tiled gem bos are as memory-efficient
as this trick.
Hence kill it.
Reviewed-by: Jakob Bornecrantz <wallbraker@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jakob Bornecrantz <wallbraker@gmail.com>
|
|
Somebody should find out what these are. They can be found on Windows
by getting a D3DCAPS9 from IDirect3D9::GetCaps() and reading the
GuardBand* values.
|
|
Fix a crash when the subrectangle is not inside the fb. Fix wrong
pipe transfer when sx > 0 or sy + height != fb->height.
This fixes "readpixels" demo.
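One way to avoid touching pixels outside the framebuffer is to clip the requested subrectangle before setting up the transfer. A minimal sketch under that assumption; `struct rect` and `clip_to_fb` are hypothetical names, and this is not the code changed by the commit:

```c
#include <assert.h>
#include <stdbool.h>

struct rect { int x, y, w, h; };

/* Clip `r` to a framebuffer of fb_w x fb_h pixels.
 * Returns false when no part of the rectangle lies inside it. */
static bool clip_to_fb(struct rect *r, int fb_w, int fb_h)
{
    assert(fb_w >= 0 && fb_h >= 0);

    int x0 = r->x < 0 ? 0 : r->x;
    int y0 = r->y < 0 ? 0 : r->y;
    int x1 = r->x + r->w > fb_w ? fb_w : r->x + r->w;
    int y1 = r->y + r->h > fb_h ? fb_h : r->y + r->h;

    if (x1 <= x0 || y1 <= y0)
        return false;

    r->x = x0;
    r->y = y0;
    r->w = x1 - x0;
    r->h = y1 - y0;
    return true;
}
```

A caller would skip the transfer entirely when the function returns false, and otherwise use the clipped origin and size for both the transfer and the destination offset.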
|
|
These two samplers use non-normalized texture coordinates. wrap_r
cannot be PIPE_TEX_WRAP_REPEAT (the default).
This fixes the assertion failure
sp_tex_sample.c:1790:get_linear_unorm_wrap: Assertion `0' failed
|
|
Fix "lookup" demo crash.
|
|
Fix OpenVG "filter" demo
Program received signal SIGSEGV, Segmentation fault.
0xb7153dc9 in str_match_no_case (pcur=0xbfffe564, str=0x0) at
tgsi/tgsi_text.c:86
86 while (*str != '\0' && *str == uprcase( *cur )) {
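The backtrace shows the loop dereferencing a NULL `str`. How the commit avoids that is not shown here; the following is only a sketch of a case-insensitive prefix match that rejects a NULL pattern, using the hypothetical name `prefix_match_no_case` and the standard `toupper` in place of `uprcase`:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when `text` starts with `str`, ignoring the
 * case of `text`. `str` is expected to be upper case, as in the loop
 * quoted above. A NULL `str` never matches instead of being dereferenced. */
static bool prefix_match_no_case(const char *text, const char *str)
{
    assert(text != NULL);
    if (str == NULL)
        return false;

    /* Stops at the end of `str`, or at the first mismatch, which also
     * covers reaching the terminator of a shorter `text`. */
    while (*str != '\0' && *str == toupper((unsigned char)*text)) {
        str++;
        text++;
    }
    return *str == '\0';
}
```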
|
|
The define is required for DRI drivers. It is not needed for
libgl-xlib, but the overhead it introduces should be minor.
|
|
We do not know how to use more; GL_ARB_draw_buffers is not exposed by the blob.
|
|
The stride between the different clip plane registers was incorrect.
https://bugs.freedesktop.org/show_bug.cgi?id=31788
agd5f: fix evergreen as well.
|
|
Fixes glsl-vs-point-size, although I meant to fix glsl-novertexdata.
Since swrast fails glsl-novertexdata too, I guess it's a core issue.
|
|
This is quite common for multitexture sampling, and not only cuts down
on the second and later sets of MOVs, but typically also allows
compute-to-MRF on the first set.
No statistically significant performance difference in nexuiz (n=3),
but it reduces instruction count in one of its shaders and seems like
a good idea.
|
|
We were skipping it if the instruction producing the value we were
going to compute-to-mrf used its result reg as a source reg. This
meant that the typical "write interpolated color to fragment color" or
"texture from interpolated texcoord" shader didn't compute-to-MRF.
Just don't check for the interference cases until after we've checked
if this is the instruction we wanted to compute-to-MRF.
Improves nexuiz high-settings performance on my laptop 0.48% +- 0.08%
(n=3).
|
|
The goal here is to avoid regressing performance on ir_to_mesa drivers
for fixed function fragment shaders requiring saturates.
|
|
On pre-gen6, this turns 4 instructions into 1. We could still do
better by folding the saturate into the instruction generating the
value if nobody else uses it, but that should be a separate pass.
|
|
Hardware pretty commonly has saturate modifiers on instructions, and
this can be used in codegen to produce those, without everyone else
needing to understand clamping other than min and max.
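For reference, a saturate is a clamp to [0, 1], which can be written with nothing but min and max. A minimal scalar sketch of that equivalence; the helper names are made up for illustration and are not part of the compiler:

```c
/* Illustrative helpers, not compiler code. */
static float sat_min(float a, float b) { return a < b ? a : b; }
static float sat_max(float a, float b) { return a > b ? a : b; }

/* Clamp to [0, 1] expressed only with min and max:
 * saturate(x) == min(max(x, 0), 1). */
static float saturate(float x)
{
    return sat_min(sat_max(x, 0.0f), 1.0f);
}
```

A backend with a native saturate modifier can emit it directly, while one without can fall back to the min/max form.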
|
|
This hits a common case with min/max operations.
|