path: root/src/gallium
Age  Commit message  Author
2011-01-08r300g: fix a surface leak when flushing ZMASKMarek Olšák
2011-01-08r300g: rework command submission and resource space checkingMarek Olšák
The motivation behind this rework is to gain some speed by reducing CPU overhead. The performance increase depends on many factors, but it's measurable (I think it's about a 10% increase in Torcs).

This commit replaces libdrm's radeon_cs_gem with our own implementation. It's optimized specifically for r300g, but r600g could use it as well. Reloc writes and space checking are faster and simpler than their counterparts in libdrm (the time complexity of all the functions is O(1) in nearly all scenarios, thanks to hashing). (libdrm's radeon_bo_gem is still being used in the driver.)

It works like this:

cs_add_reloc(cs, buf, read_domain, write_domain) adds a new relocation and also adds the size of 'buf' to the used_gart and used_vram winsys variables based on the domains, which are simply or'd for accounting purposes. The adding is skipped if the reloc is already present in the list, but any newly referenced domains are still accounted for.

cs_validate is then called, which just checks:
    used_vram/gart < vram/gart_size * 0.8
The 0.8 factor allows for some memory fragmentation.

If the validation fails, the pipe driver flushes the CS and tries to do the validation again, i.e. it validates only that one operation. If it fails again, it drops the operation on the floor and prints a nasty message to stderr.

cs_write_reloc(cs, buf) just writes a reloc that has been added using cs_add_reloc. The read_domain and write_domain parameters have been removed, because we already specify them in cs_add_reloc.

The space checking has been tested by putting small values in the vram/gart_size variables.
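The accounting described above can be sketched roughly as follows. This is a minimal illustration and not the r300g winsys code: every name (sketch_cs, sketch_cs_add_reloc, DOMAIN_GART, MAX_RELOCS and so on) is hypothetical, and a linear search stands in for the hashed lookup the commit mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the actual winsys code. */
enum { DOMAIN_GART = 1u << 0, DOMAIN_VRAM = 1u << 1 };
#define MAX_RELOCS 64

struct sketch_buf { uint64_t size; };

struct sketch_reloc { const struct sketch_buf *buf; unsigned domains; };

struct sketch_cs {
    struct sketch_reloc relocs[MAX_RELOCS];
    size_t num_relocs;
    uint64_t used_gart, used_vram; /* bytes referenced by this CS */
    uint64_t gart_size, vram_size; /* total sizes of the two heaps */
};

/* Add a relocation and charge the buffer size to every domain that this
 * buffer did not reference before. */
static void sketch_cs_add_reloc(struct sketch_cs *cs,
                                const struct sketch_buf *buf,
                                unsigned read_domain, unsigned write_domain)
{
    unsigned domains = read_domain | write_domain; /* or'd for accounting */
    unsigned added = domains;
    size_t i;

    /* Linear search for brevity; the commit describes a hashed lookup. */
    for (i = 0; i < cs->num_relocs; i++) {
        if (cs->relocs[i].buf == buf) {
            added = domains & ~cs->relocs[i].domains; /* only new domains */
            cs->relocs[i].domains |= domains;
            break;
        }
    }
    if (i == cs->num_relocs) {
        assert(cs->num_relocs < MAX_RELOCS);
        cs->relocs[cs->num_relocs].buf = buf;
        cs->relocs[cs->num_relocs].domains = domains;
        cs->num_relocs++;
    }
    if (added & DOMAIN_GART)
        cs->used_gart += buf->size;
    if (added & DOMAIN_VRAM)
        cs->used_vram += buf->size;
}

/* The space check: stay below 80% of each heap to leave room for
 * fragmentation. */
static bool sketch_cs_validate(const struct sketch_cs *cs)
{
    return cs->used_gart < cs->gart_size * 0.8 &&
           cs->used_vram < cs->vram_size * 0.8;
}
```

A caller would add the relocs for one operation, call sketch_cs_validate, and on failure flush and retry that single operation once before giving up.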
2011-01-08nvc0: fix reloc domain conflict on buffer migrationChristoph Bumiller
Occurred because the code assumed that buf->domain would remain equal to old_domain.
2011-01-08nvc0: upload user buffers only from draw info min to max indexChristoph Bumiller
There are actually applications that profit immensely from this.
2011-01-08nvc0: fix emission of first 3 u8 indices to RING_NIChristoph Bumiller
2011-01-08nvc0: reset mt transfer address after read loop over layersChristoph Bumiller
2011-01-08nvc0: tie buffer memory release to the buffer fenceChristoph Bumiller
... instead of the next fence to be emitted. This way we have a chance to reclaim the storage earlier.
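The difference between the two release policies can be illustrated with sequence numbers. This is a sketch under assumptions, not the nvc0 code: the structures and function names are hypothetical, fences are modelled as monotonically increasing 32-bit counters, and wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: fences are numbered in emission order. */
struct sketch_fences {
    uint32_t last_emitted;   /* newest fence written to the command stream */
    uint32_t last_signalled; /* newest fence the GPU has completed */
};

struct sketch_buffer {
    uint32_t fence; /* fence covering the last GPU use of this buffer */
};

/* Old policy: wait for the next fence that will be emitted. */
static uint32_t sketch_release_seq_next(const struct sketch_fences *f)
{
    return f->last_emitted + 1;
}

/* New policy: wait only for the fence already attached to the buffer. */
static uint32_t sketch_release_seq_buffer(const struct sketch_fences *f,
                                          const struct sketch_buffer *buf)
{
    assert(buf->fence <= f->last_emitted);
    return buf->fence;
}

/* Storage can be reclaimed once the chosen fence has signalled. */
static bool sketch_can_release(const struct sketch_fences *f, uint32_t seq)
{
    return f->last_signalled >= seq;
}
```

With the buffer's own fence, storage whose last use has already completed can be reclaimed immediately, whereas the old policy always waits for work that has not even been submitted yet.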
2011-01-08r300g: Remove invalid assertion.Łukasz Krotowski
Invalid after be1af4394e060677b7db6bbb8e3301e38a3363da (user buffer creation with width0 == ~0). Signed-off-by: Marek Olšák <maraeo@gmail.com>
2011-01-07r600g: Also set const_offset if the buffer is not a user buffer in r600_upload_const_buffer().Henri Verbeet
2011-01-07r600g: Update some comments for Evergreen.Henri Verbeet
2011-01-07r600g: Split ALU clauses based on used constant cache lines.Henri Verbeet
2011-01-07r600g: Consistently use the copy of the alu instruction in r600_bc_add_alu_type().Henri Verbeet
2011-01-07r600g: Store kcache settings as an array.Henri Verbeet
2011-01-07r300g: derive user buffer sizes at draw timeMarek Olšák
This only uploads the [min_index, max_index] range instead of [0, userbuf size], which greatly speeds up user buffer uploads. This is also a prerequisite for atomizing vertex arrays in st/mesa.
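As a rough illustration of the range computation, the following sketch derives the byte range touched by a draw and copies only that part of a user buffer. The helper names and the size formula (the last vertex needs only elem_size bytes, not a full stride) are assumptions made for the example, not the r300g code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper types; not taken from the driver. */
struct sketch_range {
    size_t offset; /* first byte read by the draw */
    size_t size;   /* number of bytes read by the draw */
};

/* Bytes covered by vertices min_index..max_index of one vertex buffer. */
static struct sketch_range
sketch_vertex_range(size_t stride, size_t elem_size,
                    unsigned min_index, unsigned max_index)
{
    struct sketch_range r;

    assert(min_index <= max_index);
    r.offset = (size_t)min_index * stride;
    r.size = (size_t)(max_index - min_index) * stride + elem_size;
    return r;
}

/* Copy only the referenced range instead of [0, userbuf size]. */
static size_t sketch_upload_range(void *dst, const void *user_ptr,
                                  struct sketch_range r)
{
    memcpy(dst, (const char *)user_ptr + r.offset, r.size);
    return r.size;
}
```

For a draw that touches vertices 10 to 19 with a 16-byte stride and 12-byte elements, this copies 156 bytes starting at byte 160, regardless of how large the application's array is.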
2011-01-07r600g: allow constant buffers to be user buffers.Dave Airlie
This provides an upload facility for the constant buffers, which is needed since Marek's changes that put constants in user buffers. gears at least works on my Evergreen now. Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-01-06r600g: add support for NI (Northern Islands) GPUsAlex Deucher
This adds support for Barts, Turks, and Caicos asics.
2011-01-06svga: Ensure that the wrong vdecls don't get used in swtnl pathJakob Bornecrantz
The draw module set new state that didn't require swtnl, which caused need_swtnl to be unset. This caused the call to svga_update_state(svga, SVGA_STATE_SWTNL_DRAW) from the vbuf backend to overwrite the vdecls we set up there with the real buffers' vdecls.
2011-01-06r300g: fix corruption when nr_cbufs==0 and multiwrites enabledMarek Olšák
https://bugs.freedesktop.org/show_bug.cgi?id=32634
2011-01-06r300g: remove the buffer range checkingMarek Olšák
It's no longer needed because the upload buffer remains mapped while the CS is being filled (openarena, ut2004 and others that this code was for do not use VBOs by default).
2011-01-06r300g: skip buffer validation of upload buffers when appropriateMarek Olšák
because the upload buffers are reused for subsequent draw operations.
2011-01-06util: add comments to u_upload_mgr and u_inlinesMarek Olšák
2011-01-06tgsi: remove redundant name tables from tgsi_text, use those from tgsi_dumpMarek Olšák
I also specified the array sizes in the header so that one can use the Elements macro on it.
2011-01-06gallium: drivers should reference vertex buffersMarek Olšák
So that a state tracker can unreference them after set_vertex_buffers.
2011-01-06u_upload_mgr: new featuresMarek Olšák
- Added a parameter to specify a minimum offset that should be returned. r300g needs this to better implement user buffer uploads. This weird requirement comes from the fact that the Radeon DRM doesn't support negative offsets.
- Added a parameter to notify a driver that an upload flush occurred. A driver may skip buffer validation if there was no flush, resulting in better performance.
- Added a new upload function that returns a pointer to the upload buffer directly, so that the buffer can be filled e.g. by the translate module.
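The three features listed above can be pictured with a toy allocator. This is a simplified stand-in and not the u_upload_mgr interface: the structure, the function name and its parameters are hypothetical, and a "flush" is modelled by simply rewinding a single buffer that stays mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified upload manager state. */
struct sketch_upload {
    unsigned char *map; /* stays mapped until the buffer is flushed */
    size_t size;        /* total size of the upload buffer */
    size_t offset;      /* next free byte */
};

/* Reserve 'size' bytes at an offset that is at least min_out_offset.
 * Returns a pointer the caller can fill directly, reports the chosen
 * offset, and sets *flushed when the buffer had to be flushed first. */
static unsigned char *
sketch_upload_alloc(struct sketch_upload *up, size_t min_out_offset,
                    size_t size, size_t *out_offset, bool *flushed)
{
    size_t start = up->offset > min_out_offset ? up->offset : min_out_offset;

    assert(min_out_offset + size <= up->size);
    *flushed = false;
    if (start + size > up->size) {
        /* Out of space. A real manager would flush and allocate a new
         * buffer; this sketch just rewinds the existing one. */
        *flushed = true;
        start = min_out_offset;
    }
    up->offset = start + size;
    *out_offset = start;
    return up->map + start;
}
```

A driver using something like this could skip revalidating the upload buffer whenever *flushed comes back false, and could pass min_index * stride as the minimum offset so that subtracting it from the returned offset never goes negative. That last use is an inference from the commit message, not something it spells out.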
2011-01-06u_upload_mgr: keep the upload buffer mapped until it is flushedMarek Olšák
The map/unmap overhead can be significant even though there is no waiting on busy buffers. There is simply a huge number of uploads. This is a performance optimization for Torcs, a car racing game.
2011-01-06nvc0: Fix typo of nvc0_mm.c in SConscript.Vinson Lee
2011-01-05st/xorg: Flesh out colour map support and support depth 8.Michel Dänzer
2011-01-04r600g: support up to 64 shader constantsAlex Deucher
From the r600 ISA: Each ALU clause can lock up to four sets of constants into the constant cache. Each set (one cache line) is 16 128-bit constants. These are split into two groups. Each group can be from a different constant buffer (out of 16 buffers). Each group of two constants consists of either [Line] and [Line+1] or [line + loop_ctr] and [line + loop_ctr +1]. For supporting more than 64 constants, we need to break the code into multiple ALU clauses based on what sets of constants are needed in that clause. Note: This is a candidate for the 7.10 branch. Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
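The 64-constant limit follows from the figures quoted above: 4 locked cache lines times 16 constants per line. A minimal sketch of the bookkeeping needed to decide when a clause must be split is shown below. It is an illustration only: the names are hypothetical, and it ignores the pairing of lines into two groups and the choice of constant buffer described in the ISA text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    CONSTS_PER_LINE = 16, /* one cache line holds 16 128-bit constants */
    MAX_LOCKED_LINES = 4  /* an ALU clause can lock up to four lines */
};

/* Hypothetical per-clause record of the cache lines locked so far. */
struct sketch_clause {
    unsigned lines[MAX_LOCKED_LINES];
    size_t num_lines;
};

/* Cache line that holds a given constant. */
static unsigned sketch_const_line(unsigned const_index)
{
    return const_index / CONSTS_PER_LINE;
}

/* Try to make const_index addressable in this clause. Returns false when
 * all four lines are taken by other lines, i.e. a new ALU clause has to
 * be started. */
static bool sketch_clause_lock(struct sketch_clause *c, unsigned const_index)
{
    unsigned line = sketch_const_line(const_index);
    size_t i;

    assert(c->num_lines <= MAX_LOCKED_LINES);
    for (i = 0; i < c->num_lines; i++) {
        if (c->lines[i] == line)
            return true; /* line already locked by this clause */
    }
    if (c->num_lines == MAX_LOCKED_LINES)
        return false;
    c->lines[c->num_lines++] = line;
    return true;
}
```

A shader that only touches constants 0 to 63 never needs more than four lines, which is why that range fits into a single clause without splitting.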
2011-01-04Merge remote branch 'origin/nvc0'Christoph Bumiller
2011-01-04nvc0: fix index size method value for u8 indicesChristoph Bumiller
2011-01-04nvc0: set the correct FP header bit for multiple colour outputsChristoph Bumiller
2011-01-04nvc0: delete memory caches and fence on screen destructionChristoph Bumiller
2011-01-04nvc0: use mov instead of ld for scalar const loadsChristoph Bumiller
2011-01-04nvc0: fix resource unmap after vertex pushChristoph Bumiller
2011-01-04nvc0: use the proper typed opcodes in constant foldingChristoph Bumiller
2011-01-04nvc0: demagic GP invocation count bitfieldChristoph Bumiller
2011-01-04nvc0: rewrite the 9097 GRAPH macrosChristoph Bumiller
2011-01-04i965g: include brw_types.h instead of GL/gl.hBrian Paul
Alternatively, some search&replace could be used to replace all occurrences of GLint with int, etc. in the driver.
2011-01-04llvmpipe: Include p_compiler.h in lp_scene_queue.h.Vinson Lee
Include p_compiler.h for boolean symbol.
2011-01-04llvmpipe: Include p_compiler.h in lp_perf.h.Vinson Lee
Include p_compiler.h for int64_t symbol.
2011-01-04llvmpipe: Include missing headers in lp_bld_depth.hVinson Lee
Include p_compiler.h for boolean symbol. Include p_state.h for pipe_stencil_state symbol.
2011-01-04llvmpipe: Include p_compiler.h in lp_bld_alpha.h.Vinson Lee
Include p_compiler.h for boolean symbol. Add forward declaration for gallivm_state struct.
2011-01-04i965g: Include p_compiler.h in intel_decode.h.Vinson Lee
Include p_compiler.h for uint32_t symbol.
2011-01-04i965g: Include gl.h in intel_structs.h.Vinson Lee
Include gl.h for OpenGL symbols.
2010-12-30util: Add forward declarations in u_index_modify.h.Vinson Lee
2010-12-30tgsi: Clean up header file inclusion in tgsi_text.h.Vinson Lee
2010-12-30graw: Include p_shader_tokens.h for tgsi_token struct.Vinson Lee
2010-12-30tgsi: Clean up header file inclusion in tgsi_sanity.h.Vinson Lee
2010-12-30drm/nvc0: don't un-bind every subchannel on initBen Skeggs
The initial values in the grctx are 0x0000 anyway, and re-binding them all to 0x0000 destroys some init done by the nouveau drm. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2010-12-29util: add a way to store translated indices to user memory in u_index_modifyMarek Olšák
I am about to use the upload buffer in r300g instead.