Age | Commit message (Collapse) | Author |
|
The slightly less mechanical change of converting the emit_reloc calls
will follow.
|
|
|
|
As with swrast, this fixes the default pixel center behavior which was
broken, and implements the previous behavior for integer. Fixes
piglit fp-arb-fragment-coord-conventions-none. The extension won't be
exposed until we get the GLSL part implemented.
The DRI1 origin_x/y parts are dropped since they're no longer relevant.
|
|
Conflicts:
src/mesa/drivers/dri/intel/intel_screen.c
src/mesa/drivers/dri/intel/intel_swapbuffers.c
src/mesa/drivers/dri/r300/r300_emit.c
src/mesa/drivers/dri/r300/r300_ioctl.c
src/mesa/drivers/dri/r300/r300_tex.c
src/mesa/drivers/dri/r300/r300_texstate.c
|
|
|
|
Everything has been constant-sized until now, but constant buffer
handling changes will make us want some additional variable sized
array.
|
|
Bug #25194.
|
|
Add a GLbitfield64 type and several macros to operate on 64-bit
fields. The OutputsWritten field of gl_program is changed to use that
type. This results in a fair amount of fallout in drivers that use
programs.
No changes are strictly necessary at this point as all bits used are
below the 32-bit boundary. Fairly soon several bits will be added for
clip distances written by a vertex shader. This will cause several
bits used for varyings to be pushed above the 32-bit boundary. This
will affect any drivers that support GLSL.
At this point, only the i965 driver has been modified to support this
eventuality.
I did this as a "squash" merge. There were several places through the
outputswritten64 branch where things were broken. I foresee this
causing difficulties later for bisecting. The history is still
available in the branch.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm.h
|
|
For an app that's blowing out the state cache, like sauerbraten, the
memset of the giant arrays ended up taking 11% of the CPU even when only a
"few" of the entries got used. With this, the WM program compile drops back
down to 1% of CPU time.
Bug #24981 (bisected to BRW_WM_MAX_INSN increase).
|
|
If the fragment shader doesn't use FRAG_ATTRIB_WPOS (gl_FragCoord) we
don't need to worry about the window size and origin in
brw_wm_populate_key().
This avoids re-generating the i965 shader code when a window is resized.
Issue spotted by Keith Whitwell.
|
|
Put the state that we care about in the hash key.
Issue spotted by Keith Whitwell.
|
|
And remove other unneeded #includes while we're at it.
|
|
Now gl_texture_image::TexFormat is a simple MESA_FORMAT_x enum.
ctx->Driver.ChooseTexture format also returns a MESA_FORMAT_x.
gl_texture_format will go away next.
|
|
|
|
I'll be using this in merging brw_wm_emit.c and brw_wm_glsl.c
|
|
For some IZ setups, we'd forget to account for the source depth register
being present, so we'd both read the wrong reg, and write output depth to
the wrong reg.
Bug #22603.
|
|
Conflicts:
src/mesa/main/api_validate.c
|
|
For the TXP instruction we check if the texcoord is really a 4-component
atttibute which requires the divide by W step. This check involved the
projtex_mask field. However, the projtex_mask field was being miscalculated
because of some confusion between vertex program outputs and fragment
program inputs.
1. Rework the size_masks calculation so we correctly set bits corresponding
to fragment program input attributes.
2. Rename projtex_mask to proj_attrib_mask since we're interested in more
than just texcoords (generic varying vars too).
3. Simply the indexing of the size_masks and proj_attrib_mask fields.
4. The tracker::active[] array was mis-dimensioned. Use MAX_PROGRAM_TEMPS
instead of a magic number.
5. Update comments, add new assertions.
With these changes the Lightsmark demo/benchmark renders correctly, until
we eventually hit a GPU lockup...
|
|
...rather than with linear interpolation. Modern hardware should use
perspective-corrected interpolation for colors (as for texcoords).
glHint(GL_PERSPECTIVE_CORRECTION_HINT, mode) can be used to get
linear interpolation if mode = GL_FASTEST.
|
|
Before, if the VP output something that is in the attributes coming into
the WM but which isn't used by the WM, then WM would end up reading subsequent
varyings from the wrong places. This was visible with a GLSL demo
using gl_PointSize in the VS and a varying in the WM, as point size is in
the VUE but not used by the WM. There is now a regression test in piglit,
glsl-unused-varying.
|
|
When out of memory (in at least one case, triggered by a longrunning
memory leak), this code will segfault and crash. By checking for the
out-of-memory condition, the system can continue, and will report
the out-of-memory error later, a much preferable outcome.
|
|
|
|
This also cuts instructions by just using the existing bit in the payload
rather than computing it from the determinant in the SF unit and passing it
as a varying down to the WM. Something still goes wrong with getting the
backface color right, but a simpler shader appears to get the right result.
|
|
This function scans the shader to see if it has any GLSL features like
conditionals and loops. Calling this during state validation is expensive.
Just call it when the shader is given to the driver and save the result.
There's some new/temporary assertions to be sure we don't get out of sync
on this.
|
|
|
|
s/FRAG_RESULT_DEPR/FRAG_RESULT_DEPTH/
s/FRAG_RESULT_COLR/FRAG_RESULT/COLOR/
Remove FRAG_RESULT_COLH (NV half-precision) output since we never used it.
Next, we might merge the COLOR and DATA outputs (COLOR0, COLOR1, etc).
|
|
|
|
|
|
|
|
|
|
If the texture swizzle is not XYZW (no-op) add an extra MOV instruction
after the TEX instruction to rearrange the components.
|
|
|
|
Track separate back-face stencil state for OpenGL 2.0 /
GL_ATI_separate_stencil and GL_EXT_stencil_two_side. This allows all
three to be enabled in a driver. One set of state is set via the 2.0
or ATI functions and is used when STENCIL_TEST_TWO_SIDE_EXT is
disabled. The other is set by StencilFunc and StencilOp when the
active stencil face is set to BACK. The GL_EXT_stencil_two_side spec has
more details.
http://opengl.org/registry/specs/EXT/stencil_two_side.txt
|
|
|
|
|
|
|
|
|
|
|
|
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
(bug #16852, #16853)
|
|
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
|
Makes state emission into a 2 phase, prepare sets things up and accounts
the size of all referenced buffer objects. The emit stage then actually
does the batchbuffer touching for emitting the objects.
There is an assert in dri_emit_reloc if a reloc occurs for a buffer
that hasn't been accounted yet.
|
|
|
|
1. Follow EXT_texture_rectangle with YCbCr texture
2. swap UV component for MESA_FORMAT_YCBCR
|
|
|
|
These don't appear to have ever been used.
|
|
|
|
Note that this does not enable GL_EXT_stencil_two_side, because Mesa's computed
_TestTwoSide ends up respecting only STENCIL_TEST_TWO_SIDE_EXT (defaults to
GL_FALSE), even if the application uses only GL 2.0 / ATI entrypoints.
|
|
The user-space suballocator that was used avoided relocation computations by
using the general and surface state base registers and allocating those types
of buffers out of pools built on top of single buffer objects. It also
avoided calls into the buffer manager for these small state allocations, since
only one buffer object was being used.
However, the buffer allocation cost appears to be low, and with relocation
caching, computing relocations for buffers is essentially free. Additionally,
implementing the suballocator required a don't-fence-subdata flag to disable
waiting on buffer maps so that writing new data didn't block on rendering using
old data, and careful handling when mapping to update old data (which we need
to do for unavoidable relocations with FBOs). More importantly, when the
suballocator filled, it had no replacement algorithm and just threw out all
of the contents and forced them to be recomputed, which is a significant cost.
This is the first step, which just changes the buffer type, but doesn't yet
improve the hash table to not result in full recompute on overflow. Because
the buffers are all allocated out of the general buffer allocator, we can
no longer use the general/surface state bases to avoid relocations, and they
are set to 0 instead.
|