Age | Commit message (Collapse) | Author |
|
This lets GLSL shaders use up to 16 samplers.
Fixed function is still limited to 8 textures.
Tested with progs/glsl/samplers.c
|
|
This is fallout from the ffvertex_prog.c work. It doesn't call
ProgramStringNotify, so we don't set param_state, so we wouldn't track when
VP parameters changed, and constants wouldn't get uploaded. Instead, remove
param_state entirely and just use the real value that we want to be tracking.
Fixes rendering in openarena since BRW_NEW_BATCH got disentangled from
BRW_NEW_INDICES.
Bug #18822.
|
|
The CACHE_NEW_SURFACE bit always gets spammed since we get many different
surface BOs per state emit, but the only consumer of it wanted to just know
how many surfaces were enabled.
|
|
Fixes upload of large amounts of state for every new primitive emit.
|
|
This was causing a prepare of wm state at every primitive emit.
|
|
Previously, since my check_aperture API change, we would check each piece of
state against the batchbuffer individually, but not all the state against the
batchbuffer at once. In addition to not being terribly useful in assuring
success, it probably also increased CPU load by calling check_aperture many
times per primitive.
|
|
|
|
|
|
The i965 driver previously had it's own set of code to convert
fixed-function TNL state to a vertex program. Core Mesa has code to
do this, so there is no reason to duplicate that effort in the driver.
In fact, this duplication leads to bugs when other aspects of the Mesa
infrastructure change.
|
|
This isn't required for GEM (at least, yet), but the check_aperture code
for non-GEM results in batch getting flushed during emit. brw_state_upload
restarts state emits, but a bunch of the state emit functions were assuming
that they would be called exactly once, after prepare and before new_batch.
Bug #17179.
|
|
Makefile.template
|
|
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
|
|
|
Makes state emission into a 2 phase, prepare sets things up and accounts
the size of all referenced buffer objects. The emit stage then actually
does the batchbuffer touching for emitting the objects.
There is an assert in dri_emit_reloc if a reloc occurs for a buffer
that hasn't been accounted yet.
|
|
|
|
|
|
With DRI2 we there is no screen region until a drawable is bound to
the context. Set up the framebuffer texture in meta_draw_region instead
which should also handle the case where the draw region changes as a
result of resizing a redirected window or resizing the screen.
|
|
for WM payload. fix #10767
|
|
This helps us avoid a bunch of mess with gl_client_arrays that we filled
with unused data and confused readers.
|
|
|
|
|
|
Since each one is only 64b, and kernel allocations are a page anyway, this
lets us reduce buffer allocation by packing many CURBEs into one buffer, for
each batchbuffer submitted. Improves openarena performance by around 10%.
|
|
The previous change gave us only two modes, one which looped over the batch
per cliprect (3d drawing) and one that didn't (state updeast).
However, we really want 4:
- Batch doesn't care about cliprects (state updates)
- Batch needs DRAWING_RECTANGLE looping per cliprect (3d drawing)
- Batch needs to be executed just once (region fills, copies, etc.)
- Batch already includes cliprect handling, and must be flushed by unlock time
(copybuffers, clears).
All callers should now be fixed to use one of these states for any batchbuffer
emits. Thanks to Keith Whitwell for pointing out the failure.
|
|
The comment about (vbo)_exec_api.c appeared to be stale, as the VBO code seems
to only use non-named VBOs (not actual VBOs) or freshly-allocated VBO data.
This brings a 2x speedup to openarena, because we can submit nearly-full
batchbuffers instead of many 450-byte ones.
|
|
This allows us to avoid re-emitting some state when validate_state happens
multiple times per batchbuffer. Even though we flush batch per primitive
currently, that may still happen already if the primitive changed (this should
probably be fixed as well).
|
|
Each array element is now a BUFFER_x token rather than a BUFFER_BIT_x bitmask.
The number of active color buffers is specified by _NumColorDrawBuffers.
This builds on the previous DrawBuffer changes and will help with drivers
implementing GL_ARB_draw_buffers.
|
|
We have two consumers of relocations. One is static state buffers, which
want the same relocation every time. The other is the batchbuffer, which gets
thrown out immediately after submit. This lets us reduce repeated computation
for static state buffers, and clean up the code by moving relocations nearer
to where the state buffer is computed.
|
|
|
|
|
|
The user-space suballocator that was used avoided relocation computations by
using the general and surface state base registers and allocating those types
of buffers out of pools built on top of single buffer objects. It also
avoided calls into the buffer manager for these small state allocations, since
only one buffer object was being used.
However, the buffer allocation cost appears to be low, and with relocation
caching, computing relocations for buffers is essentially free. Additionally,
implementing the suballocator required a don't-fence-subdata flag to disable
waiting on buffer maps so that writing new data didn't block on rendering using
old data, and careful handling when mapping to update old data (which we need
to do for unavoidable relocations with FBOs). More importantly, when the
suballocator filled, it had no replacement algorithm and just threw out all
of the contents and forced them to be recomputed, which is a significant cost.
This is the first step, which just changes the buffer type, but doesn't yet
improve the hash table to not result in full recompute on overflow. Because
the buffers are all allocated out of the general buffer allocator, we can
no longer use the general/surface state bases to avoid relocations, and they
are set to 0 instead.
|
|
This is currently believed to work but be a significant performance loss.
Performance recovery should be soon to follow.
The dri_bo_fake_disable_backing_store() call was added to allow backing store
disable like bufmgr_fake.c did, which is a significant performance win (though
it's missing the no-fence-subdata part).
This commit is a squash merge of the 965-ttm branch, which had some history
I wanted to avoid pulling due to noisiness and brokenness at many points
for git-bisecting.
|
|
The only functional difference should be that 965 now gets the optimization
where textures default to 16bpp when the screen is 16bpp.
|
|
This reverts commit b2f1aa2389473ed09170713301b042661d70a48e.
Somehow I ended up with my branch's save-this-while-I-work-on-master commit
actually on master.
|
|
|
|
This code existed to dump logs of hardware access to be replayed in simulation.
Since we have real hardware now, it's not really needed.
|
|
into vbo-0.2
Conflicts:
src/mesa/array_cache/sources
src/mesa/drivers/dri/i965/brw_context.c
src/mesa/drivers/dri/i965/brw_draw.c
src/mesa/drivers/dri/i965/brw_fallback.c
src/mesa/drivers/dri/i965/brw_vs_emit.c
src/mesa/drivers/dri/i965/brw_vs_tnl.c
src/mesa/drivers/dri/mach64/mach64_context.c
src/mesa/main/extensions.c
src/mesa/main/getstring.c
src/mesa/tnl/sources
src/mesa/tnl/t_save_api.c
src/mesa/tnl/t_save_playback.c
src/mesa/tnl/t_vtx_api.c
src/mesa/tnl/t_vtx_exec.c
src/mesa/vbo/vbo_attrib.h
src/mesa/vbo/vbo_exec_api.c
src/mesa/vbo/vbo_save_api.c
src/mesa/vbo/vbo_save_draw.c
|
|
vertex/fragment programs provided as const.
bmSetFenceLock should return bmSetFence value.
|
|
|
|
Submitted by Gary Wong <gtw@gnu.org>
|
|
|
|
|
|
we render. Currenly requires that some state be re-examined after
every LOCK_HARDWARE().
|
|
as non-aliasing and cope with the >32 attributes that result, taking
materials into account.
|
|
This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
|