Age | Commit message | Author |
|
There were hacks in EmitCopyBlit before to adjust offsets so that y=0 after
the offsets had been adjusted for a negative pitch. It appears that those
hacks were due to an unclear and surprising aspect of the hardware: inverting
the pitch results in the blit into the specified rectangle being inverted,
without the user needing to adjust y and base offset.
Tested with piglit copytexsubimage test on 915GM and GM965. Should fix
serious performance issues with ETQW and other applications.
|
|
The blit bitmap code already handles scissoring. This is a 15-100% speedup on
blender benchmark.blend thanks to avoiding fallbacks. Bug #17951.
|
|
Instead, have i965 and i915 both call the generic function from their Viewport.
|
|
According to Keith, the docs have these offsets the other way around.
|
|
|
|
|
|
This was a regression in 59b2c2adbbece27ccf54e58b598ea29cb3a5aa85 that broke
blender, among other apps.
|
|
|
|
As far as I can read in the docs, VS threads can be 1:1 with the pairs of
VUE handles allocated for them. Also, G4X can run twice as many threads as
before (though we won't unless we bump the preferred URB entries for VS).
|
|
We were dividing the number of URB entries by two to get number of threads,
which looks suspiciously like a copy'n'paste-o from brw_vs_state.c. Also, the
maximum number of threads is 24, not 12.
|
|
The clip thread could potentially deadlock when processing tristrips since
being moved back to dual-thread mode, as the two threads could each have 4 VUEs
referenced and not be able to allocate another one since SF processing
wasn't able to continue (needing 5 entries before it freed 2).
In constrained URB mode, similar deadlock could even have occurred with
polygons (so we cut back max_threads if we can't handle it for any primitive type).
|
|
|
|
It shouldn't offer anything new over what's in the docs (except for G4X notes),
but here it's all in one place.
|
|
This ensures all batchbuffers have the same cliprect mode after calling
_intel_batchbuffer_flush, even if there aren't any invalid commands in the
current batchbuffer. (Fixes bug #18362.)
|
|
|
|
See bug 18445.
When getting array results, __glXReadReply() always reads a multiple of
four bytes. This can cause writing to invalid memory when 'n' is not a
multiple of four.
Special-case the glAreTexturesResident() functions now.
To fix the bug, we use a temporary buffer that's a multiple of four bytes
in length.
NOTE: this commit also reverts part of commit 919ec22ecf72aa163e1b97d8c7381002131ed32c
(glx/x11: Added some #ifdef GLX_DIRECT_RENDERING protection), which
directly edited the indirect.c file rather than the Python generator!
I'm not repairing that issue at this time.
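The temporary-buffer approach described above can be sketched roughly as follows. This is only an illustration, not the code from the commit: pad4, read_reply_padded and read_n_bytes_safely are hypothetical names, and read_reply_padded merely stands in for a reader that, like __glXReadReply(), always writes a multiple of four bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Round n up to the next multiple of four. */
static size_t pad4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Hypothetical stand-in for a reply reader that always writes a
 * multiple of four bytes into dest, even when n is not one.
 * The wire buffer is assumed to hold at least pad4(n) bytes. */
static void read_reply_padded(const unsigned char *wire,
                              unsigned char *dest, size_t n)
{
    memcpy(dest, wire, pad4(n));
}

/* Read n result bytes into out without ever writing past out[n - 1].
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
static int read_n_bytes_safely(const unsigned char *wire,
                               unsigned char *out, size_t n)
{
    unsigned char *tmp;

    assert(wire != NULL && out != NULL);
    if (n == 0)
        return 0;

    /* The temporary buffer is a multiple of four bytes long, so the
     * padded write stays inside memory we own. */
    tmp = malloc(pad4(n));
    if (tmp == NULL)
        return -1;

    read_reply_padded(wire, tmp, n);
    memcpy(out, tmp, n);   /* copy back only the bytes the caller asked for */
    free(tmp);
    return 0;
}
```

With n = 5, the padded read touches eight bytes of the temporary buffer, but only five bytes reach the caller's array.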
|
|
|
|
Trunc is a more accurate description; there's no type conversion involved.
|
|
Now i965 also uses the vertex program created by Mesa core, but this vertex program
does not depend only on the Mesa state _NEW_PROGRAM, so always check whether the
current vertex program has been updated. This fixes the broken cubemap demo.
|
|
OPCODE_NOISE4 coming later.
|
|
|
|
This cuts one MOV out when setting a zero header.
|
|
The mobile and desktop chipsets are the same, and having them separate is
more typing and more chances to screw up.
|
|
Also, add a comment explaining what brw->urb.constrained tries to do.
|
|
Quoting section 11.3.10, paragraph 10.2 of the 965PRM:
10.2. If ExecSize is 1, dst.HorzStride must not be 0. Note that this is
relaxed from rule 10.1.2. Also note that this rule for destination
horizontal stride is different from that for source as stated in
rule #7.
GM45 gets very angry when rule 10.2 is violated.
Patch 58dc8b7 (i965: support destination horiz strides in align1 access mode)
added support for additional horizontal strides in the ExecSize 1 case, but
failed to notice that mesa occasionally re-purposes a register as a
temporary destination, even though it was constructed as a repeating source
with HorzStride = 0.
While, ideally, we should probably fix the code using these register
specifications, this patch simply rewrites them to use HorzStride 1 as the
pre-58dc8b7 code did.
Signed-off-by: Keith Packard <keithp@keithp.com>
|
|
|
|
GL_COLOR_INDEX mode is just like other normal formats (that is, not
depth/stencil) and is uploaded fine by TexImage.
|
|
|
|
(Only in fragment shaders, so far. Support for NOISE3 and NOISE4 to come.)
|
|
This is required for scatter writes in destination regions to work.
|
|
|
|
Previously, since my check_aperture API change, we would check each piece of
state against the batchbuffer individually, but not all the state against the
batchbuffer at once. In addition to not being terribly useful in assuring
success, it probably also increased CPU load by calling check_aperture many
times per primitive.
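The difference between the two checks can be illustrated with a small sketch. It does not use the real check_aperture interface: fits_individually and fits_together are hypothetical helpers, and the sizes are plain byte counts rather than buffer objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-buffer check: each piece of state is tested against
 * the batchbuffer on its own. Returns 1 if every piece passes. */
static int fits_individually(const size_t *sizes, size_t count,
                             size_t batch_size, size_t aperture)
{
    size_t i;

    assert(sizes != NULL || count == 0);
    if (batch_size > aperture)
        return 0;
    for (i = 0; i < count; i++) {
        if (sizes[i] > aperture - batch_size)
            return 0;
    }
    return 1;
}

/* Hypothetical combined check: the batchbuffer plus all state must fit
 * at once. Returns 1 if the total fits in the aperture. */
static int fits_together(const size_t *sizes, size_t count,
                         size_t batch_size, size_t aperture)
{
    size_t total = batch_size;
    size_t i;

    assert(sizes != NULL || count == 0);
    if (total > aperture)
        return 0;
    for (i = 0; i < count; i++) {
        /* Compare against the remaining space to avoid overflow. */
        if (sizes[i] > aperture - total)
            return 0;
        total += sizes[i];
    }
    return 1;
}
```

Three buffers of 40 units with a 10-unit batch and a 100-unit aperture pass the per-buffer check, yet need 130 units in total, which is the case the combined check catches.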
|
|
|
|
This avoids issues with dereferencing stale cliprects around intel_draw_buffer
time. Additionally, take advantage of cliprects staying constant for FBOs and
DRI2, and emit cliprects in the batchbuffer instead of having to flush batch
each time they change.
|
|
This is required for threads to be spawned with correctly sized GRF
register blocks.
|
|
|
|
|
|
Previously, we were trying to pass a name to the GEM GET_TILING_IOCTL,
which needs a handle, and failing. None of our buffers were tiled yet, but
they will be at some point with DRI2 and UXA.
|
|
(thanks Eric).
|
|
This ensures there is an unfilled batchbuffer used for emitting states again. Partial fix for #17964.
|
|
Use _mesa_copy_rect instead of a BLT operation if dri_bufmgr_check_aperture_space
still fails after flushing the batchbuffer. Partial fix for #17964.
|
|
|
|
|
|
Fix http://bugs.freedesktop.org/show_bug.cgi?id=16287.
|
|
This is nasty because there's no way in GL to output data to the stencil
buffer directly, so we have to do a dance to wrap the depth/stencil buffer
in an ARGB renderbuffer.
Improves performance of several oglconform testcases by better than a factor
of 2.
|
|
|
|
|
|
The fallback was introduced to fix bug #16697, but made the test it was
fixing run excessively long.
|
|
|
|
The fallback was introduced to fix bug #16697, but made the test it was
fixing run excessively long.
|