Age | Commit message | Author |
|
Add a GLbitfield64 type and several macros to operate on 64-bit
fields. The OutputsWritten field of gl_program is changed to use that
type. This results in a fair amount of fallout in drivers that use
programs.
No changes are strictly necessary at this point as all bits used are
below the 32-bit boundary. Fairly soon several bits will be added for
clip distances written by a vertex shader. This will cause several
bits used for varyings to be pushed above the 32-bit boundary. This
will affect any drivers that support GLSL.
At this point, only the i965 driver has been modified to support this
eventuality.
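
A minimal sketch of why the wider type matters, assuming hypothetical names (bitfield64, BIT64, TEST_BIT64) and made-up slot numbers rather than the actual Mesa definitions. It shows a bit above the 32-bit boundary surviving in a 64-bit field and being lost when the field is narrowed to 32 bits:

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real type and macro names live in Mesa. */
typedef uint64_t bitfield64;

#define BIT64(b)         ((bitfield64)1 << (b))
#define TEST_BIT64(f, b) (((f) & BIT64(b)) != 0)

/* Illustrative slot numbers only, not Mesa's enum values. */
enum { SLOT_POS = 0, SLOT_VARYING_HIGH = 35 };

/* Correct test against the full 64-bit field. */
static int output_written(bitfield64 outputs_written, unsigned slot)
{
    return slot < 64 && TEST_BIT64(outputs_written, slot);
}

/* What a driver that still stores the field in 32 bits would see:
 * every bit at index 32 or higher is silently dropped. */
static int output_written_truncated(bitfield64 outputs_written, unsigned slot)
{
    uint32_t narrow = (uint32_t)outputs_written;
    return slot < 32 && ((narrow >> slot) & 1u);
}
```

A driver that keeps a 32-bit copy of the field keeps working only while all bits it cares about stay below bit 32, which is the situation described above.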
I did this as a "squash" merge. There were several places through the
outputswritten64 branch where things were broken. I foresee this
causing difficulties later for bisecting. The history is still
available in the branch.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm.h
|
|
|
|
|
|
This keeps the individual state files from having to export their
structures for brw_state_cache initialization.
|
|
i965 might support more than 4 color draw buffers. But if not, this protects
from breakage if the Mesa limit is raised.
|
|
This reverts commit 8810b8f67135185d1044746bb861fe2ff997626c.
It turns out the i965 driver uses the intel->Fallback field as a boolean,
not as a bitmask. The intelFallback() function is a no-op in the i965
driver. It would have been nice if there were some comments about this.
I'll fix that next...
|
|
|
|
Setting intel->Fallback = 1 clobbered any fallback state that was already
set. Not sure where this hack originated (the git history is a little
convoluted). Define and use a new BRW_FALLBACK_DRAW bit instead. This
shouldn't break anything and could potentially fix some bugs (but no
specific ones are known).
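
The difference between assigning to a bitmask and setting one bit in it can be sketched as follows. The struct, the bit values and the helper names are all hypothetical and only illustrate the clobbering problem described above; they are not the driver's definitions:

```c
#include <stdint.h>

/* Hypothetical fallback bits for illustration. */
#define EX_FALLBACK_TEXTURE 0x4u
#define EX_FALLBACK_DRAW    0x8u

struct ex_context {
    uint32_t fallback; /* treated as a bitmask of active fallback reasons */
};

/* The problematic pattern: plain assignment discards other reasons. */
static void set_fallback_clobber(struct ex_context *c)
{
    c->fallback = 1;
}

/* Set or clear a single reason and leave the others untouched. */
static void set_fallback_bit(struct ex_context *c, uint32_t bit, int enable)
{
    if (enable)
        c->fallback |= bit;
    else
        c->fallback &= ~bit;
}
```

This only helps if every reader of the field also treats it as a bitmask; a field that is used as a boolean elsewhere needs a different fix.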
|
|
|
|
The value was probably wrong too.
It was the same as INTEL_FALLBACK_DRAW_BUFFER.
|
|
|
|
Conflicts:
Makefile
configs/default
progs/glsl/Makefile
src/gallium/auxiliary/util/u_simple_shaders.c
src/gallium/state_trackers/glx/xlib/xm_api.c
src/mesa/drivers/dri/i965/brw_draw_upload.c
src/mesa/drivers/dri/i965/brw_vs_emit.c
src/mesa/drivers/dri/intel/intel_context.h
src/mesa/drivers/dri/intel/intel_pixel.c
src/mesa/drivers/dri/intel/intel_pixel_read.c
src/mesa/main/texenvprogram.c
src/mesa/main/version.h
|
|
|
|
The code duplication bothered me.
(cherry picked from commit 9b9cb30d128fc5f1ba77287696ecd508e640efde)
|
|
We'll use this for debug/sanity checking.
|
|
No performance difference proven at 95% confidence with my GLSL demo (n=10).
|
|
|
|
I was getting tired of doing the dance of INTEL_DEBUG=batch, copying it out,
and running intel-gen4disasm on it.
|
|
The code duplication bothered me.
|
|
This state flag has been unused since the ffvertex_prog move to core.
|
|
|
|
|
|
For the TXP instruction we check if the texcoord is really a 4-component
attribute which requires the divide by W step. This check involved the
projtex_mask field. However, the projtex_mask field was being miscalculated
because of some confusion between vertex program outputs and fragment
program inputs.
1. Rework the size_masks calculation so we correctly set bits corresponding
to fragment program input attributes.
2. Rename projtex_mask to proj_attrib_mask since we're interested in more
than just texcoords (generic varying vars too).
3. Simplify the indexing of the size_masks and proj_attrib_mask fields.
4. The tracker::active[] array was mis-dimensioned. Use MAX_PROGRAM_TEMPS
instead of a magic number.
5. Update comments, add new assertions.
With these changes the Lightsmark demo/benchmark renders correctly, until
we eventually hit a GPU lockup...
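
The idea behind the projection mask can be sketched roughly as below. This is an illustration under stated assumptions, not the driver's code: the function names, the 32-attribute limit and the per-attribute size array are hypothetical. One bit is set per fragment program input attribute that has four components, and TXP consults that bit to decide whether the divide by W is needed:

```c
#include <stdint.h>

/* Build a mask with one bit per fragment input attribute that carries
 * four components. sizes[i] is the component count of attribute i. */
static uint32_t build_proj_attrib_mask(const uint8_t sizes[], unsigned count)
{
    uint32_t mask = 0;
    for (unsigned i = 0; i < count && i < 32; i++) {
        if (sizes[i] == 4)
            mask |= (uint32_t)1 << i;
    }
    return mask;
}

/* TXP only needs the divide by W when the attribute really has a W. */
static int txp_needs_divide(uint32_t proj_attrib_mask, unsigned attrib)
{
    return attrib < 32 && ((proj_attrib_mask >> attrib) & 1u);
}
```

The important point from the commit is that the index must be a fragment program input attribute; indexing the mask with a vertex program output number sets the wrong bits.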
|
|
Conflicts:
src/mesa/main/api_validate.c
|
|
This can avoid re-uploading constant data when it isn't necessary, and is
a step towards not updating other surfaces just because constants change.
It also brings the upload of the constant buffer next to the creation.
This brings openarena performance up another 4%, to 91% of the Mesa 7.4 branch.
|
|
Make the use_const_buffer field per-program and only call the code which
updates the constant buffer's data if the flag is set.
This should undo the perf regression from 20f3497e4b6756e330f7b3f54e8acaa1d6c92052
(cherry picked from master, commit dc9705d12d162ba6d087eb762e315de9f97bc456)
|
|
Make the use_const_buffer field per-program and only call the code which
updates the constant buffer's data if the flag is set.
This should undo the perf regression from 20f3497e4b6756e330f7b3f54e8acaa1d6c92052
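
A small sketch of the per-program guard, assuming a hypothetical program struct and an upload counter that stands in for the real buffer update:

```c
/* Hypothetical per-program state; uploads counts simulated updates. */
struct ex_program {
    int use_const_buffer; /* set per program, not globally */
    unsigned uploads;
};

/* Stand-in for the code that rewrites the constant buffer's data. */
static void upload_constants(struct ex_program *p)
{
    p->uploads++;
}

/* Skip the update entirely for programs that do not use the buffer. */
static void update_program_constants(struct ex_program *p)
{
    if (!p->use_const_buffer)
        return;
    upload_constants(p);
}
```

Programs that keep their constants in the old location pay nothing for the new path.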
|
|
|
|
The new, second cache will only be used for surface-related items.
Since we can create many surfaces the original, single cache could get
filled quickly. When we cleared it, we had to regenerate shaders, etc.
With two caches, we can avoid doing that.
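
The benefit of splitting the cache can be illustrated with a toy model. Everything here (names, capacities, the clear-when-full policy) is a simplifying assumption; the real brw_state_cache is more involved:

```c
#include <stddef.h>

enum ex_cache_kind { EX_CACHE_PROGRAMS, EX_CACHE_SURFACES, EX_CACHE_COUNT };

struct ex_cache {
    size_t used;
    size_t capacity;
    unsigned clears; /* how many times the cache had to be emptied */
};

struct ex_caches {
    struct ex_cache c[EX_CACHE_COUNT];
};

/* Insert one item; a full cache is cleared first, losing its contents. */
static void ex_cache_insert(struct ex_cache *cache)
{
    if (cache->used == cache->capacity) {
        cache->used = 0;
        cache->clears++;
    }
    cache->used++;
}
```

With separate caches, a burst of surface insertions clears only the surface cache, so compiled shaders in the other cache survive and need not be regenerated.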
|
|
No more dynamic atoms so we can simplify the state validation code a little.
|
|
Now that we have real constant buffers, the demands on the CURBE are lessened.
When we use real VS/WM constant buffers we only use the CURBE for clip planes.
|
|
Hook up a constant buffer, binding table, etc for the VS unit.
This will allow using large constant buffers with vertex shaders.
The new code is disabled at this time (use_const_buffer=FALSE).
|
|
Plus, begin the new code for vertex shader const buffers.
|
|
Used to map drawables, textures and constant buffers to surface binding
table indexes.
|
|
Currently, shader constants are stored in the GRF (loaded from the CURBE
prior to shader execution). This severely limits the number of constants
and temps that we can support.
This new code will support (practically) unlimited size constant buffers
and free up registers in the GRF. We allocate a new buffer object for the
constants and read them with "Read" messages/instructions. When only a
small number of constants are used, we can still use the old method.
The code works for fragment shaders only (and is actually disabled) for now.
Need to do the same thing for vertex shaders and need to add the necessary
code-gen to fetch the constants which are referenced by the shader
instructions.
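
The choice between the two methods might look something like this sketch. The threshold value and all names are hypothetical; the commit only says that a small number of constants can keep using the old method:

```c
/* Hypothetical cut-off; the real limit depends on register pressure. */
#define EX_CURBE_CONSTANT_LIMIT 32u

enum ex_const_method {
    EX_CONST_VIA_CURBE,  /* constants preloaded into the GRF */
    EX_CONST_VIA_BUFFER  /* constants fetched from a buffer object */
};

/* Small constant sets stay in registers; larger ones go to a buffer
 * so they stop consuming GRF space. */
static enum ex_const_method choose_const_method(unsigned num_constants)
{
    return num_constants > EX_CURBE_CONSTANT_LIMIT
        ? EX_CONST_VIA_BUFFER
        : EX_CONST_VIA_CURBE;
}
```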
|
|
|
|
|
|
|
|
This function scans the shader to see if it has any GLSL features like
conditionals and loops. Calling this during state validation is expensive.
Just call it when the shader is given to the driver and save the result.
There are some new/temporary assertions to be sure we don't get out of sync
on this.
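
A sketch of the scan-once-and-cache pattern, with a hypothetical opcode enum and shader struct. The assertion in the query function mirrors the kind of temporary sanity check mentioned above by recomputing the scan and comparing it to the saved result:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction set for illustration. */
enum ex_opcode { EX_MOV, EX_ADD, EX_IF, EX_ENDIF, EX_BGNLOOP, EX_ENDLOOP };

struct ex_shader {
    const enum ex_opcode *insns;
    size_t count;
    bool uses_flow_control; /* cached result of the scan */
};

/* The expensive part: walk every instruction looking for flow control. */
static bool scan_for_flow_control(const enum ex_opcode *insns, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (insns[i] == EX_IF || insns[i] == EX_BGNLOOP)
            return true;
    }
    return false;
}

/* Called once, when the shader is handed to the driver. */
static void ex_shader_notify(struct ex_shader *s,
                             const enum ex_opcode *insns, size_t n)
{
    s->insns = insns;
    s->count = n;
    s->uses_flow_control = scan_for_flow_control(insns, n);
}

/* Called during state validation: cheap lookup plus a debug check. */
static bool ex_shader_uses_flow_control(const struct ex_shader *s)
{
    assert(s->uses_flow_control == scan_for_flow_control(s->insns, s->count));
    return s->uses_flow_control;
}
```

In a release build the assert compiles away (with NDEBUG defined), leaving only the field read on the hot path.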
|
|
|
|
Be a little more specific about what these are.
|
|
|
|
|
|
|
|
|
|
This lets GLSL shaders use up to 16 samplers.
Fixed function is still limited to 8 textures.
Tested with progs/glsl/samplers.c
|
|
This is fallout from the ffvertex_prog.c work. It doesn't call
ProgramStringNotify, so we don't set param_state, so we wouldn't track when
VP parameters changed, and constants wouldn't get uploaded. Instead, remove
param_state entirely and just use the real value that we want to be tracking.
Fixes rendering in openarena since BRW_NEW_BATCH got disentangled from
BRW_NEW_INDICES.
Bug #18822.
|
|
The CACHE_NEW_SURFACE bit always gets spammed since we get many different
surface BOs per state emit, but the only consumer of it wanted to just know
how many surfaces were enabled.
|
|
Fixes upload of large amounts of state for every new primitive emit.
|
|
This was causing a prepare of wm state at every primitive emit.
|