summaryrefslogtreecommitdiff
path: root/src/mesa/drivers/dri/i965/brw_curbe.c
AgeCommit message (Collapse)Author
2010-02-19Replace the _mesa_*printf() wrappers with the plain libc versionsKristian Høgsberg
2010-02-19Replace _mesa_malloc, _mesa_calloc and _mesa_free with plain libc versionsKristian Høgsberg
2010-02-06i965: Keep the CURBE BO mapped and memcpy instead of subdataing.Eric Anholt
For the tiny bis of data we generally upload through the CURBEs, the overhead of the kernel's pagetable trickery is actually rather high. This improves cairo-gl gnome-terminal-vim performance by 3.8%.
2010-02-06i965: Reset the "need new CURBE BO" flag when we make a new CURBE bo.Eric Anholt
Improves cairo-gl gnome-terminal-vim times by 11%.
2010-01-19i965: Upload as many VS constants as possible through the push constants.Eric Anholt
The pull constants require sending out to an overworked shared unit and waiting for a response, while push constants are nicely loaded in for us at thread dispatch time. By putting things we access in every VS invocation there, ETQW performance improved by 2.5% +/- 1.6% (n=6).
2010-01-04intel: Drop batchbuffer cliprect_mode trackingKristian Høgsberg
2009-11-13i965: Flag BRW_NEW_CONTEXT on some context state.Eric Anholt
Fixing this is a prereq for avoiding flagging all state at new batch time. Eliminating that still causes problems, though (notably glean logicOp fails on my GM965).
2009-09-24i965: Load NV program matrices when required.Eric Anholt
2009-08-29i965: Support PROGRAM_ENV_PARAMs in brw_vs_emit.cEric Anholt
2009-05-06i965: Split WM constant buffer update from other WM surfaces.Eric Anholt
This can avoid re-uploading constant data when it isn't necessary, and is a step towards not updating other surfaces just because constants change. It also brings the upload of the constant buffer next to the creation. This brings openarena performance up another 4%, to 91% of the Mesa 7.4 branch.
2009-05-06i965: Disentangle VS constant surface state from WM surface state.Eric Anholt
Also, only create VS surface state if there's a VS constant buffer to be uploaded, and set the contents of the buffer at the same time as creation.
2009-05-01Merge branch 'const-buffer-changes'Brian Paul
Conflicts: src/mesa/drivers/dri/i965/brw_curbe.c src/mesa/drivers/dri/i965/brw_vs_emit.c src/mesa/drivers/dri/i965/brw_wm_glsl.c
2009-04-27i965: #include prog_print.h to silence warningBrian Paul
2009-04-27i965: only upload constant buffer data when we actually need the const bufferBrian Paul
Make the use_const_buffer field per-program and only call the code which updates the constant buffer's data if the flag is set. This should undo the perf regression from 20f3497e4b6756e330f7b3f54e8acaa1d6c92052 (cherry picked from master, commit dc9705d12d162ba6d087eb762e315de9f97bc456)
2009-04-27i965: only upload constant buffer data when we actually need the const bufferBrian Paul
Make the use_const_buffer field per-program and only call the code which updates the constant buffer's data if the flag is set. This should undo the perf regression from 20f3497e4b6756e330f7b3f54e8acaa1d6c92052
2009-04-24i965: use drm_intel_gem_bo_map/unmap_gtt() when possible, otherwise ↵Brian Paul
dri_bo_subdata() This wraps up the unfinished business from commit a9a363f8298e9d534e60e3d2869f8677138a1e7e
2009-04-23i965: revert part of commit 4f4907d69f9020ce17aef21b6431d2dd65e01982Brian Paul
The drm_intel_gem_bo_map_gtt() call that replaced dri_bo_map() is producing errors like: intel_bufmgr_gem.c:689: Error preparing buffer map 39 (vp_const_buffer): Invalid argument . and returning NULL, causing a segfault in the memcpy(). Just reverting until we can get to the root issue...
2009-04-23intel: Take advantage of GL_READ_ONLY_ARB to map to GEM bo_map write flag.Eric Anholt
This is a CPU win in general, but in particular reduces the pain of Mesa's calculation of min/max indices in DrawElements (wtf?).
2009-04-22i965: updates to some debug codeBrian Paul
2009-04-22i965: use new _NEW_PROGRAM_CONSTANTS flag instead of dynamic flagsBrian Paul
2009-04-17i965: updated CURBE allocation codeBrian Paul
Now that we have real constant buffers, the demands on the CURBE are lessened. When we use real VS/WM constant buffers we only use the CURBE for clip planes.
2009-04-16i965: const buffer debug code (disabled)Brian Paul
2009-04-14i965: checkpoint commit: VS constant buffersBrian Paul
Hook up a constant buffer, binding table, etc for the VS unit. This will allow using large constant buffers with vertex shaders. The new code is disabled at this time (use_const_buffer=FALSE).
2009-04-10i965: added null const_buffer pointer check in update_constant_buffer()Brian Paul
2009-04-09i965: re-org of some of the new constant buffer codeBrian Paul
Plus, begin the new code for vertex shader const buffers.
2009-04-03i965: remove unused varBrian Paul
2009-04-03i965: check-point commit of new constant buffer supportBrian Paul
Currently, shader constants are stored in the GRF (loaded from the CURBE prior to shader execution). This severly limits the number of constants and temps that we can support. This new code will support (practically) unlimited size constant buffers and free up registers in the GRF. We allocate a new buffer object for the constants and read them with "Read" messages/instructions. When only a small number of constants are used, we can still use the old method. The code works for fragment shaders only (and is actually disabled) for now. Need to do the same thing for vertex shaders and need to add the necessary code-gen to fetch the constants which are referenced by the shader instructions.
2009-03-12i965: fix const correctnessBrian Paul
2009-03-10i965: use new cast wrappersBrian Paul
2009-03-10i965: asst. code clean-ups, commentsBrian Paul
2009-02-25i965: Rename CMD_CONST_BUFFER_STATE to the CS_URB_STATE used in the docs.Eric Anholt
2009-02-02i965: Remove brw->attribs now that we can just always look in the GLcontext.Eric Anholt
2008-12-03i965: Fix failure to upload new constant data when changing programs.Eric Anholt
This is fallout from the ffvertex_prog.c work. It doesn't call ProgramStringNotify, so we don't set param_state, so we wouldn't track when VP parameters changed, and constants wouldn't get uploaded. Instead, remove param_state entirely and just use the real value that we want to be tracking. Fixes rendering in openarena since BRW_NEW_BATCH got disentangled from BRW_NEW_INDICES. Bug #18822.
2008-10-28i965: Fix check_aperture calls to cover everything needed for the prim at once.Eric Anholt
Previously, since my check_aperture API change, we would check each piece of state against the batchbuffer individually, but not all the state against the batchbuffer at once. In addition to not being terribly useful in assuring success, it probably also increased CPU load by calling check_aperture many times per primitive.
2008-10-24i965: don't emit state when dri_bufmgr_check_aperture_space fails.Xiang, Haihao
This ensures there is an unfilled batchbuffer used for emitting states again. Partial fix for #17964.
2008-09-23i965: Cope with batch getting flushed in the middle of batchbuffer emits.Eric Anholt
This isn't required for GEM (at least, yet), but the check_aperture code for non-GEM results in batch getting flushed during emit. brw_state_upload restarts state emits, but a bunch of the state emit functions were assuming that they would be called exactly once, after prepare and before new_batch. Bug #17179.
2008-09-18mesa: added "main/" prefix to includes, remove some -I paths from ↵Brian Paul
Makefile.template
2008-08-24Revert "Revert "Merge branch 'drm-gem'""Dave Airlie
This reverts commit 7c81124d7c4a4d1da9f48cbf7e82ab1a3a970a7a.
2008-08-24Revert "Merge branch 'drm-gem'"Dave Airlie
This reverts commit 53675e5c05c0598b7ea206d5c27dbcae786a2c03. Conflicts: src/mesa/drivers/dri/i965/brw_wm_surface_state.c
2008-08-08intel-gem: Update to new check_aperture API for classic mode.Eric Anholt
To do this, I had to clean up some of 965 state upload stuff. We may end up over-emitting state in the aperture overflow case, but that should be rare, and I'd rather have the simplification of state management.
2008-08-08965: cleanups to state emission from aperture checking and state ordering.Eric Anholt
2008-06-11[intel-gem] Chase domain flag renaming in the DRM.Eric Anholt
This is an API breakage only.
2008-06-03[intel] Convert drivers to using libdrm bufmgr code.Eric Anholt
2008-05-07GEM: Make dri_emit_reloc take GEM domain flags instead of TTM flags.Eric Anholt
The GEM flags are much more descriptive for what we need. Since this makes bufmgr_fake rather device-specific, move it to the intel common directory. We've wanted to do device-specific stuff to it before.
2008-04-18i965: initial attempt at fixing the aperture overflowDave Airlie
Makes state emission into a 2 phase, prepare sets things up and accounts the size of all referenced buffer objects. The emit stage then actually does the batchbuffer touching for emitting the objects. There is an assert in dri_emit_reloc if a reloc occurs for a buffer that hasn't been accounted yet.
2008-01-10[965] Improve performance by allocating CURBE buffers a page at a time.Eric Anholt
Since each one is only 64b, and kernel allocations are a page anyway, this lets us reduce buffer allocation by packing many CURBEs into one buffer, for each batchbuffer submitted. Improves openarena performance by around 10%.
2008-01-10[intel] Add more cliprect modes to cover other meanings for batch emits.Eric Anholt
The previous change gave us only two modes, one which looped over the batch per cliprect (3d drawing) and one that didn't (state updeast). However, we really want 4: - Batch doesn't care about cliprects (state updates) - Batch needs DRAWING_RECTANGLE looping per cliprect (3d drawing) - Batch needs to be executed just once (region fills, copies, etc.) - Batch already includes cliprect handling, and must be flushed by unlock time (copybuffers, clears). All callers should now be fixed to use one of these states for any batchbuffer emits. Thanks to Keith Whitwell for pointing out the failure.
2008-01-09[965] Replace the always_update dirty flag with BRW_NEW_BATCH.Eric Anholt
This allows us to avoid re-emitting some state when validate_state happens multiple times per batchbuffer. Even though we flush batch per primitive currently, that may still happen already if the primitive changed (this should probably be fixed as well).
2007-12-14[965] Replace the state cache suballocator with direct dri_bufmgr use.Eric Anholt
The user-space suballocator that was used avoided relocation computations by using the general and surface state base registers and allocating those types of buffers out of pools built on top of single buffer objects. It also avoided calls into the buffer manager for these small state allocations, since only one buffer object was being used. However, the buffer allocation cost appears to be low, and with relocation caching, computing relocations for buffers is essentially free. Additionally, implementing the suballocator required a don't-fence-subdata flag to disable waiting on buffer maps so that writing new data didn't block on rendering using old data, and careful handling when mapping to update old data (which we need to do for unavoidable relocations with FBOs). More importantly, when the suballocator filled, it had no replacement algorithm and just threw out all of the contents and forced them to be recomputed, which is a significant cost. This is the first step, which just changes the buffer type, but doesn't yet improve the hash table to not result in full recompute on overflow. Because the buffers are all allocated out of the general buffer allocator, we can no longer use the general/surface state bases to avoid relocations, and they are set to 0 instead.
2007-12-07[965] Convert the driver to dri_bufmgr interface and enable TTM.Eric Anholt
This is currently believed to work but be a significant performance loss. Performance recovery should be soon to follow. The dri_bo_fake_disable_backing_store() call was added to allow backing store disable like bufmgr_fake.c did, which is a significant performance win (though it's missing the no-fence-subdata part). This commit is a squash merge of the 965-ttm branch, which had some history I wanted to avoid pulling due to noisiness and brokenness at many points for git-bisecting.