summaryrefslogtreecommitdiff
path: root/src/mesa/drivers
AgeCommit message (Collapse)Author
2011-02-21intel: combine short memcpy using a temporary allocated bufferChris Wilson
Using a temporary buffer for large discontiguous uploads into the common buffer and a single buffered upload is faster than performing the discontiguous copies through a mapping into the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: upload normal arrays as interleavedChris Wilson
Upload the non-vbo arrays into a single interleaved buffer object, and so need to just emit a single vertex buffer relocation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: interleaved vboChris Wilson
If the user passed in several arrays interleaved in the same vbo, only emit a single vertex buffer and relocation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: emit one vb packet per vboChris Wilson
Track reuse of the vertex buffer objects and so minimise the number of vertex buffers used by the hardware (and their relocations). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: upload transient indices into the same discontiguous bufferChris Wilson
As we now pack the indices into a common upload buffer, we can reuse a single CMD_INDEX_BUFFER packet and translate each invocation with a start vertex offset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: suppress repeat-emission of identical vertex elementsChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Move repeat-instruction-suppression to batchbuffer coreChris Wilson
Move the tracking of the last emitted instructions into the core batchbuffer routines and take advantage of the shadow batch copy to avoid extra memory allocations and copies. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: use pwrite for batchChris Wilson
It's faster. Not only is the memcpy more efficiently performed in the kernel (making up for the system call overhead), but by not using mmap we remove the greater overhead of tracking the vma of every batch. And it means we can read back from the batch buffer without incurring the cost of a uncached read through the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: drop state_bo references to batch_boChris Wilson
As we use state relocations and we know that all the state belongs to the same bo, we can drop the multiple references to the same bo. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: directly write wm state to batchChris Wilson
As we write directly into the batch in system memory, we do not need to write first to the stack (as was to avoid read back through the GTT) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: write cc straight to batchChris Wilson
As we write directly into the batch in system memory, we do not need to write first to the stack (as was to avoid read back through the GTT) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: switch gen6 to use its own cc state boChris Wilson
In preparation for a greater change, use the color_calc_state_bo already provisioned for this purpose. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: Buffered uploadChris Wilson
Rather than performing lots of little writes to update the common bo upon each update, write those into a static buffer and flush that when full (or at the end of the batch). Doing so gives a dramatic performance improvement over and above using mmaped access. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: Replace the bo for a complete updateChris Wilson
Rather than performing a blit to completely overwrite a busy bo, simply discard it and create a new one with the fresh data. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Combine vb upload buffer with the general upload bufferChris Wilson
Reuse the new common upload buffer for uploading temporary indices and rebuilt vertex arrays. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: Pack dynamic draws togetherChris Wilson
Dynamic arrays have the tendency to be small and so allocating a bo for each one is overkill and we can exploit many efficiency gains by packing them together. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21intel: Use system memory for DYNAMIC_DRAW source objectsChris Wilson
Dynamic draw buffers are used by clients for temporary arrays and for uploading normal vertex arrays. By keeping the data in memory, we can avoid reusing active buffer objects and reallocate them as they are changed. This is important for Sandybridge which can not issue blits within a batch and so ends up flushing the batch upon every update, that is each batch only contains a single draw operation (if using dynamic arrays or regular arrays from system memory). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Trim the trailing NOOP from 3DSTATE_INDEX_BUFFERChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-21i965: Fallback on encountering a NULL render bufferChris Wilson
Following a GPU hang, or other error, the render target is not likely to have an allocated BO and so we must fallback to avoid using it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32534 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-12dri: Remove the old metaops code which has been superceded by ../common/Eric Anholt
2011-02-12radeon: Remove setup of the old dri/ meta code, which is now unused.Eric Anholt
2011-02-12intel: Remove setup of the old dri/ meta code, which is now unused.Eric Anholt
2011-02-11r300/compiler: Don't erase sources when converting RGB->AlphaTom Stellard
https://bugs.freedesktop.org/show_bug.cgi?id=34030 NOTE: This is a candidate for the 7.10 branch.
2011-02-11mesa: Optionally build a dricore support library (v3)Christopher James Halse Rogers
This an adds --enable-shared-dricore option to configure. When enabled, DRI modules will link against a shared copy of the common mesa routines rather than statically linking these. This saves about 30MB on disc with a full complement of classic DRI drivers. v2: Only enable with a gcc-compatible compiler that handles rpath Handle DRI_CFLAGS without filter-out magic Build shared libraries with the full mklib voodoo Fix typos v3: Resolve conflicts with talloc removal patches Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
2011-02-10mesa: Remove empty header file s_trispan.h.Kenneth Graunke
2011-02-10i915: Force lowering of all types of indirect array accesses in the FSIan Romanick
NOTE: This is a candidate for the 7.9 and 7.10 branches.
2011-02-10i915: Calculate partial result to temp register firstIan Romanick
Previously the SNE and SEQ instructions would calculate the partial result to the destination register. This would cause problems if the destination register was also one of the source registers. Fixes piglit tests glsl-fs-any, glsl-fs-struct-equal, glsl-fs-struct-notequal, glsl-fs-vec4-operator-equal, glsl-fs-vec4-operator-notequal. NOTE: This is a candidate for the 7.9 and 7.10 branches.
2011-02-08r200: add cast to silence warningBrian Paul
2011-02-08mesa: remove _mesa_create_context_for_api()Brian Paul
Just add the gl_api parameter to _mesa_create_context().
2011-02-08mesa: remove _mesa_initialize_context_for_api()Brian Paul
Just add the gl_api parameter to _mesa_initialize_context().
2011-02-08i965: Add missing DEFINE_BITS for brw dirty bits.Kenneth Graunke
These are only used for debugging, but should be there. Found by inspection.
2011-02-08i965: Separate the BRW_NEW_(VS|WM)_CONSTBUF dirty bits.Kenneth Graunke
These were incorrectly defined to the same value - likely due to a cut and paste error. Found by inspection.
2011-02-08i965: Rename a few more commands to match the documentation.Kenneth Graunke
2011-02-08i965: Remove pointless keying of WM state on VUE size.Eric Anholt
2011-02-07intel: Implement dri2::{Allocate,Release}BufferBenjamin Franzke
2011-02-07Add dri2::{Allocate,Release}Buffer extensionBenjamin Franzke
2011-02-05r300/compiler: Disable register rename pass on r500Tom Stellard
The scheduler and the register allocator are not good enough yet to deal with the effects of the register rename pass. This was causing a 50% performance drop in Lightsmark. The pass can be re-enabled once the scheduler and the register allocator are more mature. r300 and r400 still need this pass, because it prevents a lot of shaders from using too many texture indirections. NOTE: This is a candidate for the 7.10 branch.
2011-02-05r300/compiler: Don't count BEGIN_TEX instructions in the compiler statsTom Stellard
2011-02-05mesa/965: add support for GL_EXT_framebuffer_sRGB (v2)Dave Airlie
This adds i965 support for GL_EXT_framebuffer_sRGB, it introduces a new constant to say that the driver can support sRGB enabled FBOs since enabling the extension doesn't mean the driver can actually support sRGB. Also adds the suggested state flush in the core code suggested by Brian. fix the ARB_fbo color encoding. Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-02-04i965: Drop the dead tracking of color_regions[].Eric Anholt
We pull the draw regions right out of the renderbuffers these days.
2011-02-04i965: Drop the INTEL_DEBUG=state spam about the cache size check.Eric Anholt
There's way more interesting info in INTEL_DEBUG=state if you could find it among the state size checks.
2011-02-03swrast: add an interface createNewContextForAPIHaitao Feng
This new interface could set up context for OpenGL, OpenGL ES1 and OpenGL ES2. It will be used by egl_dri2 driver. Signed-off-by: Haitao Feng <haitao.feng@intel.com>
2011-02-03r300c: Unbreak after R4xx support was added to r300/compiler.Michel Dänzer
2011-02-01r200: remove 0x4243 pci idAlex Deucher
There's no such device. 0x4243 is a pci bridge id, not a GPU. Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
2011-02-01i915: Only mark a register as available if all components are writtenIan Romanick
Previously a register would be marked as available if any component was written. This caused shaders such as this: 0: TEX TEMP[0].xyz, INPUT[14].xyyy, texture[0], 2D; 1: MUL TEMP[1], UNIFORM[0], TEMP[0].xxxx; 2: MAD TEMP[2], UNIFORM[1], TEMP[0].yyyy, TEMP[1]; 3: MAD TEMP[1], UNIFORM[2], TEMP[0].zzzz, TEMP[2]; 4: ADD TEMP[0].xyz, TEMP[1].xyzx, UNIFORM[3].xyzx; 5: TEX TEMP[1].w, INPUT[14].xyyy, texture[0], 2D; 6: MOV TEMP[0].w, TEMP[1].wwww; 7: MOV OUTPUT[2], TEMP[0]; 8: END to produce incorrect code such as this: BEGIN DCL S[0] DCL T_TEX0 R[0] = MOV T_TEX0.xyyy U[0] = TEXLD S[0],R[0] R[0].xyz = MOV U[0] R[1] = MUL CONST[0], R[0].xxxx R[2] = MAD CONST[1], R[0].yyyy, R[1] R[1] = MAD CONST[2], R[0].zzzz, R[2] R[0].xyz = ADD R[1].xyzx, CONST[3].xyzx R[0] = MOV T_TEX0.xyyy U[0] = TEXLD S[0],R[0] R[1].w = MOV U[0] R[0].w = MOV R[1].wwww oC = MOV R[0] END Note that T_TEX0 is copied to R[0], but the xyz components of R[0] are still expected to hold a calculated value. Fixes piglit tests draw-elements-vs-inputs, fp-kill, and glsl-fs-color-matrix. It also fixes Meego bugzilla #13005. NOTE: This is a candidate for the 7.9 and 7.10 branches.
2011-01-31i965: Emit texel offsets in sampler messages.Kenneth Graunke
2011-01-31Remove talloc from the make and automake build systems.Kenneth Graunke
2011-01-31Convert everything from the talloc API to the ralloc API.Kenneth Graunke
2011-01-29r300/compiler: Standardize the number of bits used by swizzle fieldsTom Stellard
Swizzles are now defined everywhere as a field with 12 bits that contains 4 channels worth of meaningful information. Any channel that is unused is set to RC_SWIZZLE_UNUSED. This change is necessary because rgb instructions and alpha instructions were initializing channels that would never be used (channel 3 for rgb and channels 1-3 for alpha) with 0 (aka RC_SWIZZLE_X). This made it impossible to use generic helper functions for swizzles, because sometimes a channel value of 0 meant unused and other times it meant RC_SWIZZLE_X. All hacks that tried to guess how many channels were relevant have also been removed.
2011-01-28r300/compiler: print stats based on the initial number of instructionsMarek Olšák
The same number of shaders is now printed regardless of optimizations being enabled or not, so that we can compare shader stats side by side easily.