Age | Commit message (Collapse) | Author |
|
|
|
1. Move all GL entrypoint functions and files into src/mesa/main/
This includes the ARB vp/vp, NV vp/fp, ATI fragshader and GLSL bits
that were in src/mesa/shader/
2. Move src/mesa/shader/slang/ to src/mesa/slang/ to reduce the tree depth
3. Rename src/mesa/shader/ to src/mesa/program/ since all the
remaining files are concerned with GPU programs.
4. Misc code refactoring. In particular, I got rid of most of the
GLSL-related ctx->Driver hook functions. None of the drivers used
them.
Conflicts:
src/mesa/drivers/dri/i965/brw_context.c
|
|
|
|
This was bothering me when redoing the binding tables.
|
|
|
|
In exchange we end up with an extra memcpy, but that seems better than
calloc/free. Each buffer is 4k maximum, and on the i965-streaming
branch this allocation was showing up as the top entry in
brw_validate_state profiling for cairo-gl.
|
|
The slightly less mechanical change of converting the emit_reloc calls
will follow.
|
|
Shaves 60k off the driver from removing the broken spans code. This
means we now require 2.6.29, which seems fair given that it's a year
old and we've removed support for non-KMS already in the last release
of 2D.
|
|
|
|
|
|
For the tiny bis of data we generally upload through the CURBEs, the
overhead of the kernel's pagetable trickery is actually rather high.
This improves cairo-gl gnome-terminal-vim performance by 3.8%.
|
|
Improves cairo-gl gnome-terminal-vim times by 11%.
|
|
The pull constants require sending out to an overworked shared unit
and waiting for a response, while push constants are nicely loaded in
for us at thread dispatch time. By putting things we access in every
VS invocation there, ETQW performance improved by 2.5% +/- 1.6% (n=6).
|
|
|
|
Fixing this is a prereq for avoiding flagging all state at new
batch time. Eliminating that still causes problems, though (notably
glean logicOp fails on my GM965).
|
|
|
|
|
|
This can avoid re-uploading constant data when it isn't necessary, and is
a step towards not updating other surfaces just because constants change.
It also brings the upload of the constant buffer next to the creation.
This brings openarena performance up another 4%, to 91% of the Mesa 7.4 branch.
|
|
Also, only create VS surface state if there's a VS constant buffer to be
uploaded, and set the contents of the buffer at the same time as creation.
|
|
Conflicts:
src/mesa/drivers/dri/i965/brw_curbe.c
src/mesa/drivers/dri/i965/brw_vs_emit.c
src/mesa/drivers/dri/i965/brw_wm_glsl.c
|
|
|
|
Make the use_const_buffer field per-program and only call the code which
updates the constant buffer's data if the flag is set.
This should undo the perf regression from 20f3497e4b6756e330f7b3f54e8acaa1d6c92052
(cherry picked from master, commit dc9705d12d162ba6d087eb762e315de9f97bc456)
|
|
Make the use_const_buffer field per-program and only call the code which
updates the constant buffer's data if the flag is set.
This should undo the perf regression from 20f3497e4b6756e330f7b3f54e8acaa1d6c92052
|
|
dri_bo_subdata()
This wraps up the unfinished business from commit a9a363f8298e9d534e60e3d2869f8677138a1e7e
|
|
The drm_intel_gem_bo_map_gtt() call that replaced dri_bo_map() is
producing errors like:
intel_bufmgr_gem.c:689: Error preparing buffer map 39 (vp_const_buffer): Invalid argument .
and returning NULL, causing a segfault in the memcpy().
Just reverting until we can get to the root issue...
|
|
This is a CPU win in general, but in particular reduces the pain of
Mesa's calculation of min/max indices in DrawElements (wtf?).
|
|
|
|
|
|
Now that we have real constant buffers, the demands on the CURBE are lessened.
When we use real VS/WM constant buffers we only use the CURBE for clip planes.
|
|
|
|
Hook up a constant buffer, binding table, etc for the VS unit.
This will allow using large constant buffers with vertex shaders.
The new code is disabled at this time (use_const_buffer=FALSE).
|
|
|
|
Plus, begin the new code for vertex shader const buffers.
|
|
|
|
Currently, shader constants are stored in the GRF (loaded from the CURBE
prior to shader execution). This severly limits the number of constants
and temps that we can support.
This new code will support (practically) unlimited size constant buffers
and free up registers in the GRF. We allocate a new buffer object for the
constants and read them with "Read" messages/instructions. When only a
small number of constants are used, we can still use the old method.
The code works for fragment shaders only (and is actually disabled) for now.
Need to do the same thing for vertex shaders and need to add the necessary
code-gen to fetch the constants which are referenced by the shader
instructions.
|
|
|
|
|
|
|
|
|
|
|
|
This is fallout from the ffvertex_prog.c work. It doesn't call
ProgramStringNotify, so we don't set param_state, so we wouldn't track when
VP parameters changed, and constants wouldn't get uploaded. Instead, remove
param_state entirely and just use the real value that we want to be tracking.
Fixes rendering in openarena since BRW_NEW_BATCH got disentangled from
BRW_NEW_INDICES.
Bug #18822.
|
|
Previously, since my check_aperture API change, we would check each piece of
state against the batchbuffer individually, but not all the state against the
batchbuffer at once. In addition to not being terribly useful in assuring
success, it probably also increased CPU load by calling check_aperture many
times per primitive.
|
|
This ensures there is an unfilled batchbuffer used for emitting states again. Partial fix for #17964.
|
|
This isn't required for GEM (at least, yet), but the check_aperture code
for non-GEM results in batch getting flushed during emit. brw_state_upload
restarts state emits, but a bunch of the state emit functions were assuming
that they would be called exactly once, after prepare and before new_batch.
Bug #17179.
|
|
Makefile.template
|
|
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
|
|
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
|
|
To do this, I had to clean up some of 965 state upload stuff. We may end
up over-emitting state in the aperture overflow case, but that should be rare,
and I'd rather have the simplification of state management.
|
|
|
|
This is an API breakage only.
|