Age | Commit message | Author |
|
This helps us avoid a bunch of mess with gl_client_arrays that we filled
with unused data, which confused readers.
|
|
Otherwise, we could choose to upload into the temporary VBO that we just fired
off to the hardware. Good for a 60% openarena (OA) performance improvement.
|
|
Since each one is only 64b, and kernel allocations are a page anyway, this
lets us reduce buffer allocation by packing many CURBEs into one buffer, for
each batchbuffer submitted. Improves openarena performance by around 10%.
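
The packing described above can be sketched as a simple bump allocator over one page-sized buffer. This is only an illustration, not the driver's code: `curbe_pool`, `pool_init`, `pool_alloc`, and the 4096-byte pool size are hypothetical names and values, and the 64-byte allocation size used in the note below assumes "64b" means 64 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the driver's actual structures. */
#define POOL_SIZE 4096u /* assumption: one page per backing buffer */

struct curbe_pool {
    uint8_t  data[POOL_SIZE]; /* stands in for one kernel buffer object */
    size_t   used;            /* bytes handed out from the current buffer */
    unsigned buffers;         /* backing buffers needed so far */
};

static void pool_init(struct curbe_pool *p)
{
    p->used = 0;
    p->buffers = 1;
}

/*
 * Return the offset of a new allocation inside the current backing
 * buffer. `align` must be a power of two. When the current buffer
 * cannot hold the request, a fresh buffer is started at offset 0.
 */
static size_t pool_alloc(struct curbe_pool *p, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(size <= POOL_SIZE);

    size_t offset = (p->used + align - 1) & ~(align - 1);
    if (offset + size > POOL_SIZE) {
        p->buffers++; /* a real driver would allocate a new buffer here */
        offset = 0;
    }
    p->used = offset + size;
    return offset;
}
```

With 64-byte allocations, 64 of them fit in one 4096-byte buffer before a second buffer is needed, instead of one page-sized kernel allocation each.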
|
|
The comment about (vbo)_exec_api.c appeared to be stale, as the VBO code seems
to only use non-named VBOs (not actual VBOs) or freshly-allocated VBO data.
This brings a 2x speedup to openarena, because we can submit nearly-full
batchbuffers instead of many 450-byte ones.
|
|
This allows us to avoid re-emitting some state when validate_state happens
multiple times per batchbuffer. Even though we flush batch per primitive
currently, that may still happen already if the primitive changed (this should
probably be fixed as well).
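
One way to skip redundant state emission within a batchbuffer is to remember the last packet emitted and compare against it. The sketch below is only a guess at the general technique, since the commit message does not say how it is done: `state_atom_cache`, `cache_new_batch`, and `emit_state` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STATE_DWORDS 8 /* assumption: small fixed-size state packets */

/* Hypothetical per-packet cache; reset whenever a new batchbuffer starts. */
struct state_atom_cache {
    uint32_t last[MAX_STATE_DWORDS]; /* copy of the last packet emitted */
    size_t   len;                    /* length of that packet in dwords */
    bool     valid;                  /* false until something was emitted */
};

static void cache_new_batch(struct state_atom_cache *c)
{
    c->valid = false; /* hardware state is unknown in a fresh batch */
}

/*
 * Append `packet` to `batch` unless an identical packet was already
 * emitted in this batchbuffer. Returns true if dwords were written.
 */
static bool emit_state(struct state_atom_cache *c,
                       const uint32_t *packet, size_t len,
                       uint32_t *batch, size_t *batch_used)
{
    assert(len <= MAX_STATE_DWORDS);

    if (c->valid && c->len == len &&
        memcmp(c->last, packet, len * sizeof *packet) == 0)
        return false; /* unchanged since the last validate: skip */

    memcpy(&batch[*batch_used], packet, len * sizeof *packet);
    *batch_used += len;

    memcpy(c->last, packet, len * sizeof *packet);
    c->len = len;
    c->valid = true;
    return true;
}
```

Calling `cache_new_batch()` at every flush keeps the comparison honest, because nothing emitted in an earlier batchbuffer can be assumed to still be in effect.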
|
|
Both drivers have ended up relying on lost_hardware being called after each
batch buffer, so update the name. This removes one of the calls on 965, which
was outside of the batchbuffer handling code and just duplicated what had
already happened through batchbuffer handling.
|
|
This adds (so far) unused PBO functions, and holds the lock while writing
to regions (which may be shared static screen regions).
|
|
The user-space suballocator that was used avoided relocation computations by
using the general and surface state base registers and allocating those types
of buffers out of pools built on top of single buffer objects. It also
avoided calls into the buffer manager for these small state allocations, since
only one buffer object was being used.
However, the buffer allocation cost appears to be low, and with relocation
caching, computing relocations for buffers is essentially free. Additionally,
implementing the suballocator required a don't-fence-subdata flag to disable
waiting on buffer maps so that writing new data didn't block on rendering using
old data, and careful handling when mapping to update old data (which we need
to do for unavoidable relocations with FBOs). More importantly, when the
suballocator filled, it had no replacement algorithm and just threw out all
of the contents and forced them to be recomputed, which is a significant cost.
This is the first step, which just changes the buffer type, but doesn't yet
improve the hash table to not result in full recompute on overflow. Because
the buffers are all allocated out of the general buffer allocator, we can
no longer use the general/surface state bases to avoid relocations, and they
are set to 0 instead.
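
With the state bases set to 0, every pointer in the batch is an absolute address and needs a relocation. The message does not spell out what "relocation caching" means; the sketch below shows one common reading, where a relocation is only rewritten when the target buffer ends up somewhere other than the presumed address. `struct reloc`, `emit_reloc`, and `apply_reloc` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation record; not the buffer manager's real type. */
struct reloc {
    size_t   dword;    /* index of the batch dword holding the pointer */
    uint32_t delta;    /* byte offset into the target buffer */
    uint32_t presumed; /* target address assumed when the dword was written */
};

/* Write the pointer using the address the target is presumed to have. */
static void emit_reloc(uint32_t *batch, struct reloc *r,
                       size_t dword, uint32_t delta, uint32_t presumed_addr)
{
    r->dword = dword;
    r->delta = delta;
    r->presumed = presumed_addr;
    batch[dword] = presumed_addr + delta;
}

/*
 * Fix up one relocation against the target's final address.
 * Returns 1 if the batch dword had to be rewritten, 0 if the
 * presumed address was already correct and no work was needed.
 */
static int apply_reloc(uint32_t *batch, struct reloc *r, uint32_t target_addr)
{
    if (r->presumed == target_addr)
        return 0; /* cached value still valid */

    batch[r->dword] = target_addr + r->delta;
    r->presumed = target_addr;
    return 1;
}
```

As long as buffers rarely move between submissions, most calls to `apply_reloc()` take the early return, which is consistent with the claim above that relocations become essentially free.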
|
|
This is currently believed to work, but at a significant performance loss.
Performance recovery should follow soon.
The dri_bo_fake_disable_backing_store() call was added to allow backing store
disable like bufmgr_fake.c did, which is a significant performance win (though
it's missing the no-fence-subdata part).
This commit is a squash merge of the 965-ttm branch, which had some history
I wanted to avoid pulling due to noisiness and brokenness at many points
for git-bisecting.
|
|
This code existed to dump logs of hardware access to be replayed in simulation.
Since we have real hardware now, it's not really needed.
|
|
into vbo-0.2
Conflicts:
src/mesa/array_cache/sources
src/mesa/drivers/dri/i965/brw_context.c
src/mesa/drivers/dri/i965/brw_draw.c
src/mesa/drivers/dri/i965/brw_fallback.c
src/mesa/drivers/dri/i965/brw_vs_emit.c
src/mesa/drivers/dri/i965/brw_vs_tnl.c
src/mesa/drivers/dri/mach64/mach64_context.c
src/mesa/main/extensions.c
src/mesa/main/getstring.c
src/mesa/tnl/sources
src/mesa/tnl/t_save_api.c
src/mesa/tnl/t_save_playback.c
src/mesa/tnl/t_vtx_api.c
src/mesa/tnl/t_vtx_exec.c
src/mesa/vbo/vbo_attrib.h
src/mesa/vbo/vbo_exec_api.c
src/mesa/vbo/vbo_save_api.c
src/mesa/vbo/vbo_save_draw.c
|
|
Submitted by Gary Wong <gtw@gnu.org>
|
|
we render. Currently requires that some state be re-examined after
every LOCK_HARDWARE().
|
|
This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
|