Age | Commit message | Author |
|
If the user passed in several arrays interleaved in the same vbo, only
emit a single vertex buffer and relocation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Track reuse of the vertex buffer objects and so minimise the number of
vertex buffers used by the hardware (and their relocations).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we now pack the indices into a common upload buffer, we can reuse a
single CMD_INDEX_BUFFER packet and translate each invocation with a
start vertex offset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Move the tracking of the last emitted instructions into the core
batchbuffer routines and take advantage of the shadow batch copy to
avoid extra memory allocations and copies.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
It's faster. Not only is the memcpy more efficiently performed in the
kernel (making up for the system call overhead), but by not using mmap
we remove the greater overhead of tracking the vma of every batch.
And it means we can read back from the batch buffer without incurring
the cost of an uncached read through the GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we use state relocations and we know that all the state belongs to
the same bo, we can drop the multiple references to the same bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we write directly into the batch in system memory, we do not need to
write first to the stack (as was done to avoid reading back through the GTT).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we write directly into the batch in system memory, we do not need to
write first to the stack (as was done to avoid reading back through the GTT).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In preparation for a greater change, use the color_calc_state_bo already
provisioned for this purpose.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Rather than performing lots of little writes to update the common bo
upon each update, write those into a static buffer and flush that when
full (or at the end of the batch). Doing so gives a dramatic performance
improvement over and above using mmapped access.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
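The batching idea described above can be sketched roughly as follows. This is only an illustration, not the driver's code: `struct upload`, `upload_append`, `upload_flush` and `STAGING_SIZE` are hypothetical names, and a plain memcpy into an in-memory array stands in for the single write to the bo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STAGING_SIZE 4096 /* hypothetical staging capacity in bytes */

struct upload {
    unsigned char staging[STAGING_SIZE]; /* system-memory staging area */
    size_t used;       /* bytes currently pending in the staging area */
    size_t bo_offset;  /* where the pending bytes will start in the bo */
    unsigned flushes;  /* number of writes issued to the bo so far */
    unsigned char *bo; /* stand-in for the buffer object's storage */
};

/* Push all pending bytes to the bo with one write. */
static void upload_flush(struct upload *u)
{
    if (u->used == 0)
        return;
    /* Assumption: a real driver would issue one pwrite-style call here. */
    memcpy(u->bo + u->bo_offset, u->staging, u->used);
    u->bo_offset += u->used;
    u->used = 0;
    u->flushes++;
}

/* Queue a small write; returns the offset the data will have in the bo. */
static size_t upload_append(struct upload *u, const void *data, size_t len)
{
    size_t offset;

    assert(len <= STAGING_SIZE);
    if (u->used + len > STAGING_SIZE)
        upload_flush(u); /* flush only when the staging area is full */
    offset = u->bo_offset + u->used;
    memcpy(u->staging + u->used, data, len);
    u->used += len;
    return offset;
}
```

A caller would append every small update with `upload_append()` and call `upload_flush()` once at the end of the batch, so many small updates cost only a handful of writes to the bo.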
|
|
Rather than performing a blit to completely overwrite a busy bo, simply
discard it and create a new one with the fresh data.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Reuse the new common upload buffer for uploading temporary indices and
rebuilt vertex arrays.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Dynamic arrays have the tendency to be small and so allocating a bo for
each one is overkill and we can exploit many efficiency gains by packing
them together.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Dynamic draw buffers are used by clients for temporary arrays and for
uploading normal vertex arrays. By keeping the data in memory, we can
avoid reusing active buffer objects and reallocate them as they are
changed. This is important for Sandybridge, which cannot issue blits
within a batch and so ends up flushing the batch upon every update; that
is, each batch only contains a single draw operation (if using dynamic
arrays or regular arrays from system memory).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Following a GPU hang, or other error, the render target is not likely to
have an allocated BO and so we must fall back to avoid using it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32534
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
|
|
|
|
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=34030
NOTE: This is a candidate for the 7.10 branch.
|
|
This adds an --enable-shared-dricore option to configure. When enabled,
DRI modules will link against a shared copy of the common mesa routines
rather than statically linking these.
This saves about 30MB on disc with a full complement of classic DRI
drivers.
v2: Only enable with a gcc-compatible compiler that handles rpath
Handle DRI_CFLAGS without filter-out magic
Build shared libraries with the full mklib voodoo
Fix typos
v3: Resolve conflicts with talloc removal patches
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
|
|
|
|
NOTE: This is a candidate for the 7.9 and 7.10 branches.
|
|
Previously the SNE and SEQ instructions would calculate the partial
result to the destination register. This would cause problems if the
destination register was also one of the source registers.
Fixes piglit tests glsl-fs-any, glsl-fs-struct-equal,
glsl-fs-struct-notequal, glsl-fs-vec4-operator-equal,
glsl-fs-vec4-operator-notequal.
NOTE: This is a candidate for the 7.9 and 7.10 branches.
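The aliasing problem described above can be illustrated with a scalar sketch. It assumes, purely for illustration, that SEQ is built from two SGE comparisons followed by a multiply; the helper names `sge`, `seq_buggy` and `seq_fixed` are hypothetical and are not the driver's functions.

```c
/* SGE: 1.0 if a >= b, otherwise 0.0. */
static float sge(float a, float b)
{
    return a >= b ? 1.0f : 0.0f;
}

/* Writes the first partial result straight into the destination. */
static void seq_buggy(float *dst, const float *src0, const float *src1)
{
    float tmp;

    *dst = sge(*src0, *src1); /* clobbers *src0 when dst == src0 */
    tmp = sge(*src1, *src0);  /* may now read the clobbered value */
    *dst = *dst * tmp;
}

/* Keeps both partial results in temporaries until the final write. */
static void seq_fixed(float *dst, const float *src0, const float *src1)
{
    float tmp0 = sge(*src0, *src1);
    float tmp1 = sge(*src1, *src0);

    *dst = tmp0 * tmp1;
}
```

With `dst` aliasing `src0` and the inputs 5 and 2, the first variant reports the values as equal, while the second variant correctly reports them as different.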
|
|
|
|
Just add the gl_api parameter to _mesa_create_context().
|
|
Just add the gl_api parameter to _mesa_initialize_context().
|
|
These are only used for debugging, but should be there.
Found by inspection.
|
|
These were incorrectly defined to the same value - likely due to a cut
and paste error. Found by inspection.
|
|
|
|
|
|
|
|
|
|
The scheduler and the register allocator are not good enough yet to deal
with the effects of the register rename pass. This was causing a 50%
performance drop in Lightsmark. The pass can be re-enabled once the
scheduler and the register allocator are more mature. r300 and r400
still need this pass, because it prevents a lot of shaders from using
too many texture indirections.
NOTE: This is a candidate for the 7.10 branch.
|
|
|
|
This adds i965 support for GL_EXT_framebuffer_sRGB. It introduces a new
constant to say that the driver can support sRGB-enabled FBOs, since enabling
the extension doesn't mean the driver can actually support sRGB.
It also adds the state flush in the core code suggested by Brian and
fixes the ARB_fbo color encoding.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
We pull the draw regions right out of the renderbuffers these days.
|
|
There's way more interesting info in INTEL_DEBUG=state if you could find
it among the state size checks.
|
|
This new interface can set up a context for OpenGL,
OpenGL ES1 and OpenGL ES2. It will be used by the egl_dri2
driver.
Signed-off-by: Haitao Feng <haitao.feng@intel.com>
|
|
|
|
There's no such device. 0x4243 is a PCI bridge ID,
not a GPU.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
|
|
Previously a register would be marked as available if any component
was written. This caused shaders such as this:
0: TEX TEMP[0].xyz, INPUT[14].xyyy, texture[0], 2D;
1: MUL TEMP[1], UNIFORM[0], TEMP[0].xxxx;
2: MAD TEMP[2], UNIFORM[1], TEMP[0].yyyy, TEMP[1];
3: MAD TEMP[1], UNIFORM[2], TEMP[0].zzzz, TEMP[2];
4: ADD TEMP[0].xyz, TEMP[1].xyzx, UNIFORM[3].xyzx;
5: TEX TEMP[1].w, INPUT[14].xyyy, texture[0], 2D;
6: MOV TEMP[0].w, TEMP[1].wwww;
7: MOV OUTPUT[2], TEMP[0];
8: END
to produce incorrect code such as this:
BEGIN
DCL S[0]
DCL T_TEX0
R[0] = MOV T_TEX0.xyyy
U[0] = TEXLD S[0],R[0]
R[0].xyz = MOV U[0]
R[1] = MUL CONST[0], R[0].xxxx
R[2] = MAD CONST[1], R[0].yyyy, R[1]
R[1] = MAD CONST[2], R[0].zzzz, R[2]
R[0].xyz = ADD R[1].xyzx, CONST[3].xyzx
R[0] = MOV T_TEX0.xyyy
U[0] = TEXLD S[0],R[0]
R[1].w = MOV U[0]
R[0].w = MOV R[1].wwww
oC = MOV R[0]
END
Note that T_TEX0 is copied to R[0], but the xyz components of R[0] are
still expected to hold a calculated value.
Fixes piglit tests draw-elements-vs-inputs, fp-kill, and
glsl-fs-color-matrix. It also fixes Meego bugzilla #13005.
NOTE: This is a candidate for the 7.9 and 7.10 branches.
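One way to picture the fix is a backwards liveness walk that tracks components rather than whole registers. The sketch below is a simplified illustration under that assumption: `struct inst`, `calc_live` and the mask macros are hypothetical names, and the `per_component` flag exists only to contrast the old and new behaviour.

```c
#include <stdint.h>

#define MASK_X 1u
#define MASK_Y 2u
#define MASK_Z 4u
#define MASK_W 8u
#define MASK_XYZ (MASK_X | MASK_Y | MASK_Z)
#define MASK_XYZW (MASK_XYZ | MASK_W)
#define NO_REG (-1)
#define MAX_TEMPS 16

struct inst {
    int dst;              /* temp index written, or NO_REG */
    unsigned dst_mask;    /* components written */
    int src[3];           /* temp indices read, or NO_REG */
    unsigned src_mask[3]; /* components read from each source */
};

/* For each instruction, record a bitmask of temps live before it runs. */
static void calc_live(const struct inst *prog, int n, uint32_t *used,
                      int per_component)
{
    uint8_t live[MAX_TEMPS] = {0};
    uint32_t regs = 0;

    for (int i = n - 1; i >= 0; i--) {
        const struct inst *in = &prog[i];

        if (in->dst != NO_REG) {
            if (per_component)
                live[in->dst] &= (uint8_t)~in->dst_mask;
            else
                live[in->dst] = 0; /* old behaviour: any write frees it */
            if (live[in->dst] == 0)
                regs &= ~(1u << in->dst);
        }
        for (int a = 0; a < 3; a++) {
            if (in->src[a] != NO_REG) {
                live[in->src[a]] |= (uint8_t)in->src_mask[a];
                regs |= 1u << in->src[a];
            }
        }
        used[i] = regs;
    }
}
```

Applied to instructions 4 to 7 of the shader above, the per-component walk keeps TEMP[0] live across the second TEX because its xyz components are still needed by the final MOV, whereas the whole-register variant reports it as free at that point.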
|
|
|
|
|
|
|
|
Swizzles are now defined everywhere as a field with 12 bits that contains
4 channels worth of meaningful information. Any channel that is unused is
set to RC_SWIZZLE_UNUSED. This change is necessary because rgb instructions
and alpha instructions were initializing channels that would never be used
(channel 3 for rgb and channels 1-3 for alpha) with 0 (aka RC_SWIZZLE_X).
This made it impossible to use generic helper functions for swizzles,
because sometimes a channel value of 0 meant unused and other times it
meant RC_SWIZZLE_X.
All hacks that tried to guess how many channels were relevant have
also been removed.
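The layout described above (12 bits, 3 bits per channel) makes generic helpers straightforward. The following is a minimal sketch, not the compiler's code: the `SWZ_*` names, `MAKE_SWZ`, `get_swz` and `used_channel_mask` are hypothetical, and every value except X = 0 is an assumption made for illustration.

```c
#include <assert.h>

/* Assumed encodings; only X = 0 is stated in the commit message. */
enum { SWZ_X = 0, SWZ_Y = 1, SWZ_Z = 2, SWZ_W = 3, SWZ_UNUSED = 7 };

/* Pack four 3-bit channel selectors into a 12-bit swizzle. */
#define MAKE_SWZ(a, b, c, d) \
    ((unsigned)(a) | ((unsigned)(b) << 3) | ((unsigned)(c) << 6) | ((unsigned)(d) << 9))

/* Extract the selector for one channel (0..3). */
static unsigned get_swz(unsigned swizzle, unsigned chan)
{
    assert(chan < 4);
    return (swizzle >> (3 * chan)) & 0x7;
}

/* Bitmask of channels that carry meaningful selectors. */
static unsigned used_channel_mask(unsigned swizzle)
{
    unsigned mask = 0;

    for (unsigned chan = 0; chan < 4; chan++)
        if (get_swz(swizzle, chan) != SWZ_UNUSED)
            mask |= 1u << chan;
    return mask;
}
```

Because unused channels carry an explicit marker, an alpha swizzle such as `MAKE_SWZ(SWZ_X, SWZ_UNUSED, SWZ_UNUSED, SWZ_UNUSED)` can be told apart from a full `xxxx` swizzle without guessing how many channels are relevant.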
|
|
The same number of shaders is now printed regardless of optimizations being
enabled or not, so that we can compare shader stats side by side easily.
|
|
The software renderer doesn't support GL_ALPHA, GL_LUMINANCE, etc
so we should report GL_FRAMEBUFFER_UNSUPPORTED during FBO validation.
|
|
Fixes the build when selecting driver=osmesa and building static libraries.
Otherwise, mklib tries to add the ‘-ltalloc’ object to the archive, which
obviously fails.
Clients which statically link to osmesa will need to link to libtalloc also,
as specified in the Libs.private of osmesa.pc.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=33360
NOTE: This is a candidate for the 7.10 branch.
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
|