Age | Commit message | Author |
|
|
|
|
|
|
|
|
|
|
|
|
|
Avoid issues such as 1 + (-2) producing a large
positive value.
|
|
|
|
This gave around a 3% improvement in OA (openarena).
|
|
|
|
|
|
C has no order of evaluation restrictions on function arguments, so we
attempted to realloc from new-size to new-size.
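The failure mode described above can be sketched as follows. This is a minimal illustration, not the driver's code: `struct growable`, `realloc_zeroed` and `grow` are hypothetical names, and the helper's (old size, new size) signature is an assumption based on the "from new-size to new-size" wording.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array; not the driver's actual structure. */
struct growable {
    char  *data;
    size_t size;
};

/* Hypothetical helper: resize a buffer and zero the newly added tail. */
static void *realloc_zeroed(void *old, size_t old_size, size_t new_size)
{
    char *p = realloc(old, new_size);
    if (p && new_size > old_size)
        memset(p + old_size, 0, new_size - old_size);
    return p;
}

/*
 * Problematic pattern (kept as a comment on purpose):
 *
 *     realloc_zeroed(g->data, g->size, g->size *= 2);
 *
 * C does not specify the order in which arguments are evaluated, so the
 * "old size" argument may already see the updated value.
 *
 * Computing both sizes before the call removes the dependence on order.
 */
static int grow(struct growable *g)
{
    assert(g != NULL);
    size_t old_size = g->size;
    size_t new_size = old_size ? old_size * 2 : 16;
    void *p = realloc_zeroed(g->data, old_size, new_size);
    if (!p)
        return -1;          /* original buffer is left untouched */
    g->data = p;
    g->size = new_size;
    return 0;
}
```

Naming the old and new sizes as locals before the call is the simplest fix, since the result then no longer depends on how the compiler orders argument evaluation.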
|
|
This fix ensures that the starting location of the clip program is obtained
even when a clip unit state is the same as a unit created while metaops
is active. It doesn't impact metaops, because the clip state offset
isn't emitted while metaops is active.
|
|
The WM code had this right, so copy its behavior. This reverts the flipped
SLT arguments in brw_vs_tnl, which came in with the GLSL code (probably to
work around the flipped results), and brings the code back in line with
t_vp_build.c.
|
|
|
|
|
|
|
|
Otherwise, we could choose to upload into the temporary VBO that we just fired
off to the hardware. Good for a 60% OA performance improvement.
|
|
|
|
Since each one is only 64b, and kernel allocations are a page anyway, this
lets us reduce buffer allocation by packing many CURBEs into one buffer, for
each batchbuffer submitted. Improves openarena performance by around 10%.
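The packing idea can be sketched as a simple bump allocator. This is an assumption-laden illustration: `struct curbe_pool`, `curbe_pool_alloc`, the 4096-byte page size and the reading of "64b" as 64 bytes are all hypothetical, and no real buffer-manager calls are shown.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES  4096u  /* assumed kernel allocation granularity */
#define CURBE_SIZE_BYTES 64u    /* assumed size of one CURBE */

/* Hypothetical bookkeeping for the buffer currently being filled. */
struct curbe_pool {
    size_t   next_offset;        /* first free byte in the current buffer */
    unsigned buffers_allocated;  /* how many backing buffers were needed */
};

/*
 * Return the offset at which a CURBE of `size` bytes should be placed.
 * A new backing buffer is counted only when the current one is full,
 * instead of once per CURBE.
 */
static size_t curbe_pool_alloc(struct curbe_pool *pool, size_t size)
{
    assert(size > 0 && size <= PAGE_SIZE_BYTES);
    if (pool->buffers_allocated == 0 ||
        pool->next_offset + size > PAGE_SIZE_BYTES) {
        pool->buffers_allocated++;
        pool->next_offset = 0;
    }
    size_t offset = pool->next_offset;
    pool->next_offset += size;
    return offset;
}
```

Under these assumptions, 64 CURBEs share one page-sized buffer where the unpacked scheme would have needed 64 separate allocations.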
|
|
The previous code would reference freed memory on window moves.
|
|
The previous change gave us only two modes, one which looped over the batch
per cliprect (3d drawing) and one that didn't (state updates).
However, we really want 4:
- Batch doesn't care about cliprects (state updates)
- Batch needs DRAWING_RECTANGLE looping per cliprect (3d drawing)
- Batch needs to be executed just once (region fills, copies, etc.)
- Batch already includes cliprect handling, and must be flushed by unlock time
(copybuffers, clears).
All callers should now be fixed to use one of these states for any batchbuffer
emits. Thanks to Keith Whitwell for pointing out the failure.
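The four states listed above could be modelled roughly as below. The enum and function names are illustrative and may not match the driver; the rule that a batch flushes only when two different non-"ignore" modes meet is an assumption about how such states would be combined.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative names for the four cases described in the message. */
enum cliprect_mode {
    IGNORE_CLIPRECTS,     /* batch doesn't care (state updates) */
    LOOP_CLIPRECTS,       /* DRAWING_RECTANGLE looping per cliprect (3d) */
    NO_LOOP_CLIPRECTS,    /* execute just once (region fills, copies) */
    REFERENCES_CLIPRECTS  /* cliprects baked in; flush by unlock time */
};

/* Hypothetical batch state; `flushes` only counts for demonstration. */
struct batch {
    enum cliprect_mode mode;
    unsigned flushes;
};

static void batch_flush(struct batch *b)
{
    b->flushes++;
    b->mode = IGNORE_CLIPRECTS;
}

/* Declare the mode needed by the commands about to be emitted. */
static void batch_require_mode(struct batch *b, enum cliprect_mode wanted)
{
    assert(b != NULL);
    if (wanted == IGNORE_CLIPRECTS)
        return;                 /* compatible with whatever is queued */
    if (b->mode != IGNORE_CLIPRECTS && b->mode != wanted)
        batch_flush(b);         /* conflicting modes cannot share a batch */
    b->mode = wanted;
}

/* How many times the batch is executed for a given cliprect count. */
static unsigned batch_executions(enum cliprect_mode mode,
                                 unsigned num_cliprects)
{
    return mode == LOOP_CLIPRECTS ? num_cliprects : 1;
}
```

In this sketch, state updates can be appended to any batch without forcing a flush, which is what keeps batches large.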
|
|
|
|
|
|
The comment about (vbo)_exec_api.c appeared to be stale, as the VBO code seems
to only use non-named VBOs (not actual VBOs) or freshly-allocated VBO data.
This brings a 2x speedup to openarena, because we can submit nearly-full
batchbuffers instead of many 450-byte ones.
|
|
This allows us to avoid re-emitting some state when validate_state happens
multiple times per batchbuffer. Even though we currently flush the batch per
primitive, that may already happen if the primitive changed (this should
probably be fixed as well).
|
|
Both drivers have ended up relying on lost_hardware being called after each
batch buffer, so update the name. This removes one of the calls on 965 which
was outside of the batchbuffer handling code and just duplicating what had
already happened through batchbuffer handling.
|
|
|
|
In particular, batch buffers are no longer flushed when switching from
CLIPRECTS to NO_CLIPRECTS or vice versa, and 965 just uses DRM cliprect
handling for primitives instead of trying to sneak in its own to avoid the
DRM stuff. The disadvantage is that we will re-execute state updates per
cliprect, but the advantage is that we will be able to accumulate larger
batch buffers, which were proving to be a major overhead.
|
|
|
|
|
|
Each array element is now a BUFFER_x token rather than a BUFFER_BIT_x bitmask.
The number of active color buffers is specified by _NumColorDrawBuffers.
This builds on the previous DrawBuffer changes and will help with drivers
implementing GL_ARB_draw_buffers.
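The difference between the two representations can be sketched as below. The enum values, the `BUFFER_BIT` macro and `bitmask_to_tokens` are illustrative stand-ins rather than Mesa's definitions; the returned count plays the role that _NumColorDrawBuffers has in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative BUFFER_x tokens: plain indices, one per renderbuffer. */
enum buffer_index {
    BUFFER_FRONT_LEFT,
    BUFFER_BACK_LEFT,
    BUFFER_FRONT_RIGHT,
    BUFFER_BACK_RIGHT,
    BUFFER_COUNT
};

/* Illustrative BUFFER_BIT_x form: one bit per token. */
#define BUFFER_BIT(index) (1u << (index))

/*
 * Expand a bitmask into an array of tokens, one element per active
 * buffer, and return how many elements were written.
 */
static unsigned bitmask_to_tokens(unsigned mask, enum buffer_index *out,
                                  unsigned max_out)
{
    assert(out != NULL);
    unsigned n = 0;
    for (unsigned i = 0; i < BUFFER_COUNT && n < max_out; i++) {
        if (mask & BUFFER_BIT(i))
            out[n++] = (enum buffer_index)i;
    }
    return n;
}
```

With tokens, each array element names exactly one buffer, so a driver can index its renderbuffers directly instead of decoding bits.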
|
|
These fields are no longer indexed by shader output. Now, we just have
a simple array of renderbuffer pointers.
If the shader writes to gl_FragData[i], send those colors to the N
_ColorDrawBuffers. Otherwise, replicate the single gl_FragColor (or
the fixed-function color) to the N _ColorDrawBuffers.
A few more changes and simplifications can follow from this...
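The routing rule described above can be sketched as follows. `route_colors` and its flat RGBA array layout are hypothetical and chosen only for illustration; the real code operates on renderbuffer pointers rather than plain float arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill `out` (num_draw_buffers x 4 floats) with the color destined for
 * each draw buffer:
 *   - if the shader wrote gl_FragData[i], buffer i gets frag_data[i];
 *   - otherwise the single frag_color is replicated to every buffer.
 */
static void route_colors(bool writes_frag_data,
                         const float *frag_data,   /* N x RGBA */
                         const float *frag_color,  /* one RGBA */
                         unsigned num_draw_buffers,
                         float *out)               /* N x RGBA */
{
    assert(frag_data != NULL && frag_color != NULL && out != NULL);
    for (unsigned i = 0; i < num_draw_buffers; i++) {
        const float *src = writes_frag_data ? frag_data + 4 * i
                                            : frag_color;
        memcpy(out + 4 * i, src, 4 * sizeof(float));
    }
}
```

Because the destination is indexed by draw buffer rather than by shader output, the same loop covers both the gl_FragData and gl_FragColor cases.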
|
|
We have two consumers of relocations. One is static state buffers, which
want the same relocation every time. The other is the batchbuffer, which gets
thrown out immediately after submit. This lets us reduce repeated computation
for static state buffers, and clean up the code by moving relocations nearer
to where the state buffer is computed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Without this, the WM binding tables would all collide, for example. Improves
openarena performance by around 2%.
|
|
|
|
Note that this does not enable GL_EXT_stencil_two_side, because Mesa's computed
_TestTwoSide ends up respecting only STENCIL_TEST_TWO_SIDE_EXT (defaults to
GL_FALSE), even if the application uses only GL 2.0 / ATI entrypoints.
|
|
|
|
|
|
To do so, merge the remaining necessary code from the buffers, blit, span, and
screen code into the shared code, and replace the originals with the shared
versions.
|
|
|