Age | Commit message (Collapse) | Author |
|
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 31.791 32.287 1.11% 6/6
after:
[ 0] gl firefox-talos-gfx 31.198 31.675 0.96% 6/6
|
|
Now that the binding table is streamed indirect state, they were
always NULL/0.
|
|
The cache lookup of these two little floats was .12% of total CPU time
on firefox-talos-gfx because we did it any time commonly-changed state
changed. On the other hand, updating the CC VP bo immediately whenver
CC VP state changes is a .07% overhead due to putting a driver hoook
in glEnable().
|
|
The new API makes so much more sense, I'd like to forget how the old
one worked.
|
|
The slightly less mechanical change of converting the emit_reloc calls
will follow.
|
|
Conflicts:
src/mesa/drivers/dri/intel/intel_screen.c
src/mesa/drivers/dri/intel/intel_swapbuffers.c
src/mesa/drivers/dri/r300/r300_emit.c
src/mesa/drivers/dri/r300/r300_ioctl.c
src/mesa/drivers/dri/r300/r300_tex.c
src/mesa/drivers/dri/r300/r300_texstate.c
|
|
|
|
Everything has been constant-sized until now, but constant buffer
handling changes will make us want some additional variable sized
array.
|
|
Use the currently bound draw buffer instead of the visual from the
drawable used to create the context. This cause problems generating
mipmaps for an RGBA texture in an RGB context.
This fixes the failure in piglit's glsl-lod-bias test reported in bug #25614.
|
|
It turns out that 965 and friends cannot actually render to an xRGB
surfaces. Instead, the surface has to be RGBA with writes to alpha
disabled and the blend function modified to always use 1.0 for
destination alpha.
|
|
This keeps the individual state files from having to export their
structures for brw_state_cache initialization.
|
|
If a backwards glDepthRange was supplied (as with the old Quake no-z-clearing
hack), the hardware would have always clamped because we weren't clamping to
the min of near/far and the max of near/far. Also, we shouldn't be clamping
to near/far at all when not in depth clamp mode (this usually didn't matter
since near/far are usually the same as the 0.0, 1.0 clamping you do for
fixed-point depth).
This should fix funny depth issues in PlaneShift, and fixes piglit
depth-clamp-range
|
|
|
|
|
|
|
|
Track separate back-face stencil state for OpenGL 2.0 /
GL_ATI_separate_stencil and GL_EXT_stencil_two_side. This allows all
three to be enabled in a driver. One set of state is set via the 2.0
or ATI functions and is used when STENCIL_TEST_TWO_SIDE_EXT is
disabled. The other is set by StencilFunc and StencilOp when the
active stencil face is set to BACK. The GL_EXT_stencil_two_side spec has
more details.
http://opengl.org/registry/specs/EXT/stencil_two_side.txt
|
|
Makefile.template
|
|
|
|
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
|
|
|
This is an API breakage only.
|
|
|
|
The GEM flags are much more descriptive for what we need. Since this makes
bufmgr_fake rather device-specific, move it to the intel common directory.
We've wanted to do device-specific stuff to it before.
|
|
Makes state emission into a 2 phase, prepare sets things up and accounts
the size of all referenced buffer objects. The emit stage then actually
does the batchbuffer touching for emitting the objects.
There is an assert in dri_emit_reloc if a reloc occurs for a buffer
that hasn't been accounted yet.
|
|
|
|
|
|
We have two consumers of relocations. One is static state buffers, which
want the same relocation every time. The other is the batchbuffer, which gets
thrown out immediately after submit. This lets us reduce repeated computation
for static state buffers, and clean up the code by moving relocations nearer
to where the state buffer is computed.
|
|
|
|
Note that this does not enable GL_EXT_stencil_two_side, because Mesa's computed
_TestTwoSide ends up respecting only STENCIL_TEST_TWO_SIDE_EXT (defaults to
GL_FALSE), even if the application uses only GL 2.0 / ATI entrypoints.
|
|
The user-space suballocator that was used avoided relocation computations by
using the general and surface state base registers and allocating those types
of buffers out of pools built on top of single buffer objects. It also
avoided calls into the buffer manager for these small state allocations, since
only one buffer object was being used.
However, the buffer allocation cost appears to be low, and with relocation
caching, computing relocations for buffers is essentially free. Additionally,
implementing the suballocator required a don't-fence-subdata flag to disable
waiting on buffer maps so that writing new data didn't block on rendering using
old data, and careful handling when mapping to update old data (which we need
to do for unavoidable relocations with FBOs). More importantly, when the
suballocator filled, it had no replacement algorithm and just threw out all
of the contents and forced them to be recomputed, which is a significant cost.
This is the first step, which just changes the buffer type, but doesn't yet
improve the hash table to not result in full recompute on overflow. Because
the buffers are all allocated out of the general buffer allocator, we can
no longer use the general/surface state bases to avoid relocations, and they
are set to 0 instead.
|
|
|
|
This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
|