summaryrefslogtreecommitdiff
path: root/src/mesa/drivers/dri/i965/brw_wm_state.c
AgeCommit message (Collapse)Author
2011-03-12Revert "i965: Use the fixed function GLSL program instead of the ARB program."Eric Anholt
This reverts commit 81b34a4e3a7aec9cdf2781757408dc5e9eec79cb. There were regressions in the core change that this depends on.
2011-03-11i965: Use the fixed function GLSL program instead of the ARB program.Eric Anholt
This gets one more piece of the pipeline onto the new codegen backend. Once ARB_fragment_program can generate GLSL programs, we can nuke the old backend.
2011-02-08i965: Remove pointless keying of WM state on VUE size.Eric Anholt
2010-12-09i965: remove unused variable since brw_wm_glsl.c removal.Eric Anholt
2010-12-06i965: Nuke brw_wm_glsl.c.Eric Anholt
It was only used for gen6 fragment programs (not GLSL shaders) at this point, and it was clearly unsuited to the task -- missing opcodes, corrupted texturing, and assertion failures hit various applications of all sorts. It was easier to patch up the non-glsl for remaining gen6 changes than to make brw_wm_glsl.c complete. Bug #30530
2010-11-03intel: Annotate debug printout checks with unlikely().Eric Anholt
This provides the optimizer with hints about code hotness, which we're quite certain about for debug printouts (or, rather, while we developers often hit the checks for debug printouts, we don't care about performance while doing so).
2010-10-27Track separate programs for each stageIan Romanick
The assumption is that all stages are the same program or that varyings are passed between stages using built-in varyings.
2010-10-21i965: Correct scratch space allocation.Eric Anholt
One, it was allocating increments of 1kb, but per thread scratch space is a power of two. Two, the new FS wasn't getting total_scratch set at all, so everyone thought they had 1kb and writes beyond 1kb would go stomping on a neighbor thread. With this plus the previous register spilling for the new FS, glsl-fs-convolution-1 passes.
2010-10-19i965: Disable thread dispatch when the FS doesn't do any work.Eric Anholt
This should reduce the cost of generating shadow maps, for example. No performance difference measured in nexuiz, though it does trigger this path.
2010-10-14glsl: Slightly change the semantic of _LinkedShadersIan Romanick
Previously _LinkedShaders was a compact array of the linked shaders for each shader stage. Now it is arranged such that each slot, indexed by the MESA_SHADER_* defines, refers to a specific shader stage. As a result, some slots will be NULL. This makes things a little more complex in the linker, but it simplifies things in other places. As a side effect _NumLinkedShaders is removed. NOTE: This may be a candidate for the 7.9 branch. If there are other patches that get backported to 7.9 that use _LinkedShader, this patch should be cherry picked also.
2010-10-13Drop GLcontext typedef and use struct gl_context insteadKristian Høgsberg
2010-08-26i965: Start building direct GLSL2 IR to 965 assembly codegen.Eric Anholt
Our channel-expressions and vector-splitting changes now happen into a private copy of the IR that we maintain for ourselves. Uniform assignment still happens by the core, so we continue using Mesa IR generation not just for swrast fallbacks but also for uniform values (since there's no storage for their contents other than shader_program->FragmentProgram->Parameters->ParameterValues). And most importantly, at the moment no actual codegen is hooked up other than emitting our favorite color to the framebuffer.
2010-07-21i965: Set the GEM domain flags for the scratch space.Eric Anholt
They go into the render cache, so while we don't care about their contents after execution, failing to note them could cause the writes to be flushed over important buffer contents later.
2010-06-08intel: Convert remaining dri_bo_emit_reloc to drm_intel_bo_emit_reloc.Eric Anholt
The new API makes so much more sense, I'd like to forget how the old one worked.
2010-06-08intel: Change dri_bo_* to drm_intel_bo* to consistently use new API.Eric Anholt
The slightly less mechanical change of converting the emit_reloc calls will follow.
2010-04-21intel: Clean up chipset name and gen num for IronlakeZhenyu Wang
Rename old IGDNG to Ironlake, and set 'gen' number for Ironlake as 5, so tracking the features with generation num instead of special is_ironlake flag. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2010-01-19i965: Allow for variable-sized auxdata in the state cache.Eric Anholt
Everything has been constant-sized until now, but constant buffer handling changes will make us want some additional variable sized array.
2009-12-22intel: Replace IS_G4X() across the driver with context structure usage.Eric Anholt
Saves ~2KB of code.
2009-12-22intel: Replace IS_IGDNG checks with intel->is_ironlake or needs_ff_sync.Eric Anholt
Saves ~480 bytes of code.
2009-11-17Merge branch 'outputswritten64'Ian Romanick
Add a GLbitfield64 type and several macros to operate on 64-bit fields. The OutputsWritten field of gl_program is changed to use that type. This results in a fair amount of fallout in drivers that use programs. No changes are strictly necessary at this point as all bits used are below the 32-bit boundary. Fairly soon several bits will be added for clip distances written by a vertex shader. This will cause several bits used for varyings to be pushed above the 32-bit boundary. This will affect any drivers that support GLSL. At this point, only the i965 driver has been modified to support this eventuality. I did this as a "squash" merge. There were several places through the outputswritten64 branch where things were broken. I foresee this causing difficulties later for bisecting. The history is still available in the branch. Conflicts: src/mesa/drivers/dri/i965/brw_wm.h
2009-09-08i965: Respect spec requirement for pixel shader computed depth with no zbuffer.Eric Anholt
2009-07-13i965: add support for new chipsetsXiang, Haihao
1. new PCI ids 2. fix some 3D commands on new chipset 3. fix send instruction on new chipset 4. new VUE vertex header 5. ff_sync message (added by Zou Nan Hai <nanhai.zou@intel.com>) 6. the offset in JMPI is in unit of 64bits on new chipset 7. new cube map layout
2009-04-14i965: checkpoint commit: VS constant buffersBrian Paul
Hook up a constant buffer, binding table, etc for the VS unit. This will allow using large constant buffers with vertex shaders. The new code is disabled at this time (use_const_buffer=FALSE).
2009-04-03i965: minor code movement, new commentBrian Paul
2009-04-03i965: rename scratch_buffer -> scratch_bo to be consistant with other buffersBrian Paul
2009-03-23i965: Fix occlusion query when no other WM state updates occur.Eric Anholt
Turns out that XXX comment was important. We weren't flagging the WM to re-update with the statistics enable, so we got zeroes out of our query. Bug #20740, fixes piglit occlusion_query test. Signed-off-by: Eric Anholt <eric@anholt.net>
2009-03-06i965: avoid unnecessary calls to brw_wm_is_glsl()Brian Paul
This function scans the shader to see if it has any GLSL features like conditionals and loops. Calling this during state validation is expensive. Just call it when the shader is given to the driver and save the result. There's some new/temporary assertions to be sure we don't get out of sync on this.
2009-02-28mesa: rename, reorder FRAG_RESULT_x tokensBrian Paul
s/FRAG_RESULT_DEPR/FRAG_RESULT_DEPTH/ s/FRAG_RESULT_COLR/FRAG_RESULT/COLOR/ Remove FRAG_RESULT_COLH (NV half-precision) output since we never used it. Next, we might merge the COLOR and DATA outputs (COLOR0, COLOR1, etc).
2009-02-02i965: Remove brw->attribs now that we can just always look in the GLcontext.Eric Anholt
2008-11-28i965: Add a new state flag BRW_NEW_NR_SURFACES instead of CACHE_NEW_SURFACEEric Anholt
The CACHE_NEW_SURFACE bit always gets spammed since we get many different surface BOs per state emit, but the only consumer of it wanted to just know how many surfaces were enabled.
2008-11-28i965: Remove BRW_WM_LOCK dirty bit, introduced to work around lack of relocs.Eric Anholt
This was causing a prepare of wm state at every primitive emit.
2008-11-12i965: Update WM maximum threads for G4X.Eric Anholt
2008-09-10intel: track bufmgr move to libdrm_intel and bufmgr_fake irq emit/wait change.Eric Anholt
2008-08-24Revert "Revert "Merge branch 'drm-gem'""Dave Airlie
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
2008-08-24Revert "Merge branch 'drm-gem'"Dave Airlie
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03. Conflicts: src/mesa/drivers/dri/i965/brw_wm_surface_state.c
2008-08-08intel-gem: Update to new check_aperture API for classic mode.Eric Anholt
To do this, I had to clean up some of 965 state upload stuff. We may end up over-emitting state in the aperture overflow case, but that should be rare, and I'd rather have the simplification of state management.
2008-06-11[intel-gem] Chase domain flag renaming in the DRM.Eric Anholt
This is an API breakage only.
2008-06-03[intel] Convert drivers to using libdrm bufmgr code.Eric Anholt
2008-05-07GEM: Make dri_emit_reloc take GEM domain flags instead of TTM flags.Eric Anholt
The GEM flags are much more descriptive for what we need. Since this makes bufmgr_fake rather device-specific, move it to the intel common directory. We've wanted to do device-specific stuff to it before.
2008-04-18i965: initial attempt at fixing the aperture overflowDave Airlie
Makes state emission into a 2 phase, prepare sets things up and accounts the size of all referenced buffer objects. The emit stage then actually does the batchbuffer touching for emitting the objects. There is an assert in dri_emit_reloc if a reloc occurs for a buffer that hasn't been accounted yet.
2008-01-19[965] Fix WM unit cache keying that broke line stipple and polygon offset.Eric Anholt
2008-01-03[intel] Convert relocations to not be cleared out on buffer submit.Eric Anholt
We have two consumers of relocations. One is static state buffers, which want the same relocation every time. The other is the batchbuffer, which gets thrown out immediately after submit. This lets us reduce repeated computation for static state buffers, and clean up the code by moving relocations nearer to where the state buffer is computed.
2008-01-03[965] Fix some missing initialization in WM keys.Eric Anholt
2008-01-02[965] Convert WM unit to use a cache key instead of brw_cache_data.Eric Anholt
2007-12-14[965] Replace the state cache suballocator with direct dri_bufmgr use.Eric Anholt
The user-space suballocator that was used avoided relocation computations by using the general and surface state base registers and allocating those types of buffers out of pools built on top of single buffer objects. It also avoided calls into the buffer manager for these small state allocations, since only one buffer object was being used. However, the buffer allocation cost appears to be low, and with relocation caching, computing relocations for buffers is essentially free. Additionally, implementing the suballocator required a don't-fence-subdata flag to disable waiting on buffer maps so that writing new data didn't block on rendering using old data, and careful handling when mapping to update old data (which we need to do for unavoidable relocations with FBOs). More importantly, when the suballocator filled, it had no replacement algorithm and just threw out all of the contents and forced them to be recomputed, which is a significant cost. This is the first step, which just changes the buffer type, but doesn't yet improve the hash table to not result in full recompute on overflow. Because the buffers are all allocated out of the general buffer allocator, we can no longer use the general/surface state bases to avoid relocations, and they are set to 0 instead.
2007-12-12[intel] Move bufmgr back to context instead of screen, fixing glthreads.Eric Anholt
Putting the bufmgr in the screen is not thread-safe since the emit_reloc changes. It also led to a significant performance hit from pthread usage for the attempted thread-safety (up to 12% of a cpu spent on refcounting protection in single-threaded 965). The motivation had been to allow multi-context bufmgr sharing in classic mode, but it wasn't worth the cost.
2007-12-07[965] Convert the driver to dri_bufmgr interface and enable TTM.Eric Anholt
This is currently believed to work but be a significant performance loss. Performance recovery should be soon to follow. The dri_bo_fake_disable_backing_store() call was added to allow backing store disable like bufmgr_fake.c did, which is a significant performance win (though it's missing the no-fence-subdata part). This commit is a squash merge of the 965-ttm branch, which had some history I wanted to avoid pulling due to noisiness and brokenness at many points for git-bisecting.
2007-10-26Merge branch '965-glsl'Zou Nan hai
Conflicts: src/mesa/drivers/dri/i965/brw_sf.h src/mesa/drivers/dri/i965/intel_context.c
2007-10-04[965] Replace various alignment code with a shared ALIGN() macro.Eric Anholt
In the process, fix some alignment issues: - Scratch space allocation was aligned into units of 1KB, while the allocation wanted units of bytes, so we never allocated enough space for scratch. - GRF register count was programmed as ALIGN(val - 1, 16) / 16 instead of ALIGN(val, 16) / 16 - 1, which overcounted for val != 16n+1.
2007-09-27Revert "WIP 965 conversion to dri_bufmgr."Eric Anholt
This reverts commit b2f1aa2389473ed09170713301b042661d70a48e. Somehow I ended up with my branch's save-this-while-I-work-on-master commit actually on master.