Age | Commit message (Collapse) | Author |
|
This provides the optimizer with hints about code hotness, which we're
quite certain about for debug printouts (or, rather, while we
developers often hit the checks for debug printouts, we don't care
about performance while doing so).
|
|
No measurable performance difference on cairo-perf-trace, but
simplifies the code and should have cache benefit in general.
|
|
Must issue a pipe control with any non-zero post sync op before
write cache flush = 1 pipe control.
|
|
This came from commit cf255e382d147fe3ca450f0dcec3525190e7dcbc
|
|
|
|
The new API makes so much more sense, I'd like to forget how the old
one worked.
|
|
The slightly less mechanical change of converting the emit_reloc calls
will follow.
|
|
|
|
|
|
Fixes the assert (and buffer overrun):
glknots: intel_batchbuffer.c:164: _intel_batchbuffer_flush: Assertion
'used >= batch->buf->size' failed.
Reported in bug:
Bug 28274 - xscreensaver's glknots hangs GPU (945GME/Pineview)
https://bugs.freedesktop.org/show_bug.cgi?id=28274
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
|
|
This is a workaround for Ironlake errata. The emit_mi_flush is used
for a few purposes:
1) Flushing write caches for RTT (including blit to texture)
2) Pipe fencing for sync objects
3) Spamming cache flushes to track down cache flush bugs
Spamming cache flushes seems less important than following the docs,
and we should probably do that with a different mechanism than the one
for render cache flushes.
|
|
Before we would throttle in the flush callback prior to round-tripping
to the server to do copyregion or swapbuffer. Now, instead just note
that we need to throttle and do it in intel_prepare_render(), which
will be called after receiving the response from the server but before
we start rendering the next frame. Even if the server also throttles
us in swapbuffer, this just makes the throttling a no-op when we hit
intel_prepare_render(). With that we can drop the
using_dri2_swapbuffers hack and just always throttle.
|
|
|
|
|
|
Cuts another 1800 bytes from the driver.
|
|
This improves tiled texture performance of OA on my 945 from 25.3fps
to 29.0fps, whereas untiled is 28.2fps, by avoiding stalls for fence
register changes.
|
|
This was accidentally (it seems) deleted in
5203b7227ccb6b618fa42f08434d4a3cf123dca2
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This really isn't supported at this point. GEM's been in the kernel for
a year, and the fake bufmgr never really worked.
|
|
|
|
This should do all the things that MI_FLUSH did, but it can be pipelined
so that further rendering isn't blocked on the flush completion unless
necessary.
|
|
|
|
Conflicts:
src/mesa/main/state.c
|
|
This fixes jerkiness in doom3 and other apps since the kernel change to
throttle less absurdly, which led to a thundering herd of frames.
Because this is a rather minimal fix, there is at least one downside: If
the whole scene completes in one batchbuffer, we'll end up stalling the GPU.
Thanks to Michel Dänzer for suggesting using glFlush to signal frame end
instead of going to all the effort of adding a new DRI2 extension.
(cherry picked from master, commit 0828579a658af01a64b5e699175dc9bbbedcd685)
|
|
This fixes jerkiness in doom3 and other apps since the kernel change to
throttle less absurdly, which led to a thundering herd of frames.
Because this is a rather minimal fix, there is at least one downside: If
the whole scene completes in one batchbuffer, we'll end up stalling the GPU.
Thanks to Michel Dänzer for suggesting using glFlush to signal frame end
instead of going to all the effort of adding a new DRI2 extension.
|
|
|
|
I keep wanting to hack this knob in as a one-time thing, so it seemed useful
to have all the time.
|
|
This ensures all batchbuffers have a same cliprect mode after calling
_intel_batchbuffer_flush even if there aren't invalid commands in the
current batch buffer. (fix bug#18362).
|
|
This avoids issues with dereferencing stale cliprects around intel_draw_buffer
time. Additionally, take advantage of cliprects staying constant for FBOs and
DRI2, and emit cliprects in the batchbuffer instead of having to flush batch
each time they change.
|
|
|
|
|
|
Mesa requires that we be able to share objects between contexts, which means
that the objects need to be created by the same bufmgr, and the bufmgr
internally requires pthread protection for thread safety.
Rely on the bufmgr having appropriate locking.
|
|
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
|
This lets GEM use pwrite, for an additional 4% or so speedup.
|
|
|
|
Track DRM GEM name changes.
Add driver hooks for bo_subdata and bo_get_subdata so that GEM can use pread
and pwrite.
|
|
Make sure 'used' tracks the right value through the whole function.
Also, use GLint for intel_batchbuffer_space in case we do bad things
in the future.
|
|
The GEM flags are much more descriptive for what we need. Since this makes
bufmgr_fake rather device-specific, move it to the intel common directory.
We've wanted to do device-specific stuff to it before.
|
|
|
|
|
|
Fencing was used in two places: ensuring that we didn't get too many frames
ahead of ourselves, and glFinish. glFinish will be satisfied by waiting on
buffers like we would do for CPU access on them. The "don't get too far ahead"
is now the responsibility of the execution manager (kernel).
|