path: root/src/gallium/drivers
Age | Commit message | Author
2010-08-21 | nvfx: new 2D: use new 2D engine in Gallium | Luca Barbieri
This patch implements nv04_surface_copy/fill using the new 2D engine module. It supports falling back to the 3D engine using the u_blitter module, which will be added in a later patch. Also adds support for using the 3D engine, reusing the u_blitter module created for r300. This is used for unswizzling and copies between swizzled surfaces.
2010-08-21 | nv04-nv40: new 2D: add new Gallium-independent 2D engine | Luca Barbieri
This patch adds a brand new nv04-nv40 2D engine module. It should correctly implement all operations involving swizzled and 3D-swizzled surfaces. This code is independent of the Gallium framework and can thus be reused in the DDX and classic Mesa drivers (it's only likely to be useful in the latter, though). Currently, surface_copy and surface_fill are broken for 3D textures, for swizzled source textures and possibly for some misaligned cases.
The code is based around the new nv04_region structure, which encapsulates the information from pipe_surface needed for the 2D engine and CPU copies. The use of nv04_region makes the code independent of the Gallium framework and allows transforming the nv04_region without clobbering the pipe_surface. The existing M2MF, blitter, and SWIZZLED_SURFACE paths have been improved and a new CPU path has been added. There is also support for telling the caller to use the 3D engine.
The main feature of the copy/fill setup algorithm is linearization/contiguous-linearization of swizzled surfaces. The idea of linearization is that some swizzled surfaces are laid out like linear ones (1xN, 2xN, Nx1) and can thus be used as such (e.g. useful for copying single pixels). Also, some rectangles (e.g. the whole surface) are contiguous in memory. If both the source and destination rectangles are swizzled but contiguous, then they can both be regarded as linear: this is the idea of "contiguous linearization". This, for instance, allows using the 2D engine to duplicate the content of a swizzled surface to another swizzled surface, by pretending they are actually linear.
After linearization, the result may not be 64-byte aligned. Another transformation is done to enlarge the linear surface so that it becomes 64-byte aligned. This is also used to 64-byte align swizzled texture mipmaps.
The inner loop of the CPU path is as optimized as possible without using SSE/SSE2.
Future improvements could include SSE/SSE2 support, and possibly a faster coordinate swizzling algorithm (which is however not used in the inner loop). It may be a good idea to autogenerate swizzling code at least for all possible POT 2D texture dimensions (less than 256), maybe for all 3D ones too (less than 4096). Also, it would be a very good idea to make a copy with the GPU first if the source surface is in uncached memory.
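The linearization idea above can be illustrated with a minimal sketch. It is not the driver's code: the function names are hypothetical, and the bit order is an assumption (x supplies the lowest bit, the bits of x and y alternate while both dimensions have bits left, and the leftover bits of the larger dimension go on top). Under that assumption, 1xN, 2xN and Nx1 surfaces come out identical to a row-major layout, while a 4x4 surface does not. The last helper sketches the 64-byte alignment step by moving the surface start down to a 64-byte boundary and reporting how far the rectangle must shift in x to compensate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: swizzled (Morton-order) element index for a
 * power-of-two w x h surface. Assumed bit order: x gives the lowest bit,
 * x and y bits alternate while both dimensions have bits left, and the
 * remaining bits of the larger dimension are appended on top. */
uint32_t swizzle_index(uint32_t x, uint32_t y, uint32_t w, uint32_t h)
{
    uint32_t index = 0, shift = 0;

    assert(w && h && (w & (w - 1)) == 0 && (h & (h - 1)) == 0);
    assert(x < w && y < h);

    while (w > 1 || h > 1) {
        if (w > 1) {
            index |= (x & 1u) << shift++;
            x >>= 1;
            w >>= 1;
        }
        if (h > 1) {
            index |= (y & 1u) << shift++;
            y >>= 1;
            h >>= 1;
        }
    }
    return index;
}

/* Returns 1 if every element of the swizzled surface sits at the index it
 * would have in a linear, row-major layout of the same size. */
int swizzled_layout_is_linear(uint32_t w, uint32_t h)
{
    uint32_t x, y;

    for (y = 0; y < h; y++)
        for (x = 0; x < w; x++)
            if (swizzle_index(x, y, w, h) != y * w + x)
                return 0;
    return 1;
}

/* Hypothetical helper: move a linear surface's start down to a 64-byte
 * boundary and return how many pixels the rectangle's x origin must shift
 * right to compensate. Assumes *offset is a multiple of bytes_per_pixel
 * and that bytes_per_pixel divides 64. */
uint32_t align_linear_start(uint32_t *offset, uint32_t bytes_per_pixel)
{
    uint32_t delta = *offset & 63u;

    assert(bytes_per_pixel && 64 % bytes_per_pixel == 0);
    assert(*offset % bytes_per_pixel == 0);

    *offset -= delta;
    return delta / bytes_per_pixel;
}
```

For example, swizzled_layout_is_linear(2, 8) holds under the assumed bit order, so a copy on such a surface could be issued as an ordinary linear copy.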
2010-08-21 | nvfx: new 2D: rewrite transfer code to use staging transfers | Luca Barbieri
This greatly simplifies the code, and avoids ad-hoc copy code. Also, these new transfers work for buffers too, even though they are still used for miptrees only.
2010-08-21 | nvfx: new 2D: rewrite miptree code, adapt transfers | Luca Barbieri
Changes:
- Disable swizzling on non-RGBA 2D textures, since the current 2D code is mostly broken in those cases. A later patch will fix this. Thanks to Andrew Randrianasulu who reported this.
- Fix compressed texture transfers and hack around the current 2D code's inability to copy compressed textures by using direct access. Thanks to Andrew Randrianasulu who reported this.
This patch rewrites all the miptree layout and transfer code in the nvfx driver.
The current code is broken in several ways:
1. 3D textures are laid out first by face, then by level, which is incorrect
2. Cube maps should have 128-byte aligned faces
3. Swizzled textures have a strange alignment test that seems unnecessary
4. We store the image_offsets for each face/slice but they can be easily computed instead
5. "Swizzling" is not supported for compressed formats. They can be "swizzled" but swizzling only means that there are no gaps (pitch is level-dependent) and the layout is still linear
6. Swizzling is not supported for non-RGBA formats. All formats (except possibly depth) can be swizzled according to my testing.
The miptree layout is rewritten based on my empirical testing, which I posted in the "miptree findings" mail. The image_offset array is removed, since it can be calculated with a simple multiplication; the only array in the miptree structure is now the one for mipmap level starts, which it seems cannot be easily computed in constant time. Also, we now directly store a nouveau_bo instead of a pipe_buffer in the miptree structure, like nv50 does.
Support for render temporaries is removed, and will be readded in a later patch. Note that the current temporary code is broken, because it does not copy the temporary back on render cache flushes.
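The point about dropping the image_offset array can be sketched as follows. This is an illustration under stated assumptions, not the nvfx layout: the struct and function names are hypothetical, levels are assumed to be packed with no gaps (as the message describes for swizzled textures), and a cube-map style layout is assumed in which each face holds a full mip chain and starts on a 128-byte boundary. The only per-level array is the table of level starts; a face's image offset is then one multiplication plus a lookup.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LEVELS 16

/* Hypothetical miptree: level_offset[] is the only per-level array. */
struct miptree_sketch {
    uint32_t width, height;            /* level 0 size in pixels */
    uint32_t bytes_per_pixel;
    uint32_t num_levels;
    uint32_t level_offset[MAX_LEVELS]; /* start of each level within a face */
    uint32_t face_size;                /* bytes per face, 128-byte aligned */
};

/* Size of one dimension at a given mip level, never below 1. */
static uint32_t minify(uint32_t size, uint32_t level)
{
    uint32_t s = size >> level;
    return s ? s : 1;
}

/* Fill in level_offset[] and face_size, assuming tightly packed levels. */
void miptree_layout(struct miptree_sketch *mt)
{
    uint32_t level, offset = 0;

    assert(mt->num_levels >= 1 && mt->num_levels <= MAX_LEVELS);

    for (level = 0; level < mt->num_levels; level++) {
        mt->level_offset[level] = offset;
        offset += minify(mt->width, level) * minify(mt->height, level)
                  * mt->bytes_per_pixel;
    }
    mt->face_size = (offset + 127u) & ~127u; /* round up to 128 bytes */
}

/* Image offset computed on demand instead of stored per face. */
uint32_t miptree_image_offset(const struct miptree_sketch *mt,
                              uint32_t face, uint32_t level)
{
    assert(level < mt->num_levels);
    return face * mt->face_size + mt->level_offset[level];
}
```

For an 8x8 texture with 4 bytes per pixel and 4 levels, the level starts come out as 0, 256, 320 and 336, and each face occupies 384 bytes after alignment.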
2010-08-21 | nvfx: add nouveau_resource_on_gpu | Luca Barbieri
Add a function to get whether a resource is likely on the GPU or not. Currently always returns TRUE.
2010-08-21 | nvfx: add linear flag for buffers | Luca Barbieri
2010-08-21 | nvfx: properly unreference bound objects on context destruction | Luca Barbieri
2010-08-21 | nvfx: reference count bound objects | Luca Barbieri
2010-08-21 | nvfx: fix format support code for compressed texture | Luca Barbieri
A source line was put in the wrong place.
2010-08-21 | trace: Don't immediately destroy the pipe's sampler view in the trace driver. | Alex Corscadden
The trace driver's implementation of sampler_view_destroy was calling directly into the underlying pipe's sampler_view_destroy implementation. This causes problems for pipes that keep references to sampler views even after the state tracker has released them. Instead, we'll simply drop the trace driver's reference to the pipe's sampler view. Signed-off-by: José Fonseca <jfonseca@vmware.com>
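The difference between destroying an object and dropping one reference to it can be shown with a small reference-counting sketch. The struct and helper below are hypothetical stand-ins, not the Gallium or trace driver API, and a flag is set where real code would free the object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted view. */
struct view_sketch {
    int refcount;
    int destroyed; /* stand-in for actually freeing the object */
};

/* Make *slot point at view, taking a reference on the new object and
 * releasing the one previously held in the slot. The old object is only
 * destroyed when its last reference goes away. */
void view_reference(struct view_sketch **slot, struct view_sketch *view)
{
    struct view_sketch *old = *slot;

    if (view)
        view->refcount++;
    *slot = view;
    if (old && --old->refcount == 0)
        old->destroyed = 1;
}
```

If a wrapper holds one reference and the underlying driver holds another, the wrapper calling view_reference(&its_slot, NULL) leaves the view alive until the driver releases its own reference.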
2010-08-21 | trace: Trace the correct version of the resource when setting the index buffer. | Alex Corscadden
The trace driver was tracing the unwrapped version of the index buffer when setting the index buffer. This caused an assert validating that a resource belonged to the trace driver to fail. Instead, we'll log the unmodified index buffer structure when setting the index buffer. Signed-off-by: José Fonseca <jfonseca@vmware.com>
2010-08-20 | r600g: add POW instruction | Jerome Glisse
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-08-20 | r600g: cleanup definition, fix segfault when no valid pixel shader | Jerome Glisse
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-08-20 | r600g: add occlusion query support | Dave Airlie
Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-08-20 | galahad: remove incorrect comment just added | Luca Barbieri
2010-08-20 | nv50: use NV50TIC_0_2_TARGET_RECT | Luca Barbieri
2010-08-20 | galahad: check resource_create template | Luca Barbieri
2010-08-20 | gallium: make all checks for PIPE_TEXTURE_2D check for PIPE_TEXTURE_RECT too | Luca Barbieri
Searched for them with: git grep -E '[!=]=.*PIPE_TEXTURE_2D|PIPE_TEXTURE_2D.*[!=]=|case.*PIPE_TEXTURE_2D' Behavior hasn't been changed.
2010-08-20 | galahad, i915g: Copy over constant buffer index check. | Corbin Simpson
2010-08-20 | galahad, i915g: Move over a few state asserts. | Corbin Simpson
2010-08-19 | galahad: Make it obvious on stderr that Galahad's active. | Corbin Simpson
2010-08-19 | r300g: do not use fastfill with 16-bit zbuffers | Marek Olšák
To my knowledge, there is no way to flush zmask and thus write the clear value. This fixes zbuffer reads, among other things.
2010-08-19 | r600g: update comments about ALU src operands | Alex Deucher
2010-08-19 | r600g: add sin/cos | Dave Airlie
This pretty much ports the code from r600c. It doesn't always seem to work quite perfectly, but I can't find anything wrong in this code. I'm guessing either literal input or constants aren't always working.
2010-08-19 | r600g: add a chiprev type for r600/r700/evergreen instead of using family | Dave Airlie
2010-08-19 | r600g: add SSG, SEQ, SGT and SNE | Dave Airlie
2010-08-18 | r600g: add FRC, FLR, DDX and DDY | Dave Airlie
the first two are straight op2's and the DDX/DDY are taken from r600c.
2010-08-18 | r600g: add SGE and SLE opcodes | Dave Airlie
fixes fp-set-01 and glsl-fs-step
2010-08-18 | r600g: add TXB support | Dave Airlie
fixes biased texturing tests
2010-08-18 | r600g: fix TXP vs TEX in shader. | Dave Airlie
Don't do perspective for TEX, and also copy input to a temporary for TEX. Also add tex opcode names.
2010-08-18 | r600g: add two simple tgsi opcodes. | Dave Airlie
makes glsl-fs-log2 and glsl1-integer division with uniform var pass
2010-08-18 | r600g: fix point size | Dave Airlie
fixes piglit pointAtten and point-sprite tests
2010-08-18 | r600g: fixup pitch alignment like r600c. | Dave Airlie
This still needs work, passes tex3d, fbo-scissor-bitmap, scissor-bitmap
2010-08-18 | r600g: fix height calcs for miptree | Dave Airlie
h needs to be rounded up, this probably needs revisiting when we get to tiling etc. fixes fbo-generatemipmap-npot
2010-08-18 | r600g: emit texture level offset in CB/DB setup. | Dave Airlie
8 more piglit tests pass, fbo-clearmipmap, fbo-copyteximage, fbo-generatemipmap, fbo-generatemipmap-nonsquare, fbo-generatemipmap-scissor, fbo-generatemipmap-viewport, gen-teximage, gen-texsubimage
2010-08-17 | r600g: fix fake pixel output | Jerome Glisse
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-08-17 | r300g: fix context destroy under hyperz | Dave Airlie
we were destroying the mm before unrefing all the objects, so segfault. Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-08-17 | r600g: add user clip plane support. | Dave Airlie
Apart from the fact that the radeon.h/r600_states.h editing is a nightmare, this wasn't so bad. Passes the piglit user-clip test now, and also the trivial tests. Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-08-16 | r300g: fix assert in the rasterizer block for r3xx-r4xx | Marek Olšák
Reported-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
2010-08-16 | r300g: fix an invalid pointer in free | Marek Olšák
2010-08-16 | r300g: Let hyperz init fail | nobled
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2010-08-16 | r300g: Fix leaks in failed context creation | nobled
This changes r300_destroy_context() so it can be called on a partially-initialized context, and uses it when r300_create_context() hits a fatal error. This makes sure r300_create_context() doesn't leak memory or neglect to call r300_update_num_contexts() when it fails. Signed-off-by: Marek Olšák <maraeo@gmail.com>
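The cleanup pattern described here, a destroy function that tolerates a half-built object so the create function can call it on any failure, can be sketched as below. All names are hypothetical; the counter stands in for whatever bookkeeping the real driver does, and the fail_second_alloc parameter exists only to simulate an allocation failure.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a global count of live contexts. */
int live_contexts;

/* Hypothetical context with two separately allocated members. */
struct ctx_sketch {
    void *pool_a;
    void *pool_b;
};

/* Safe on a partially initialized context: calloc() left unset members
 * NULL, and free(NULL) is a no-op. */
void ctx_destroy(struct ctx_sketch *ctx)
{
    if (!ctx)
        return;
    free(ctx->pool_a);
    free(ctx->pool_b);
    free(ctx);
    live_contexts--;
}

/* Every failure path funnels through ctx_destroy(), so nothing leaks and
 * the counter is always brought back in line. */
struct ctx_sketch *ctx_create(int fail_second_alloc)
{
    struct ctx_sketch *ctx = calloc(1, sizeof(*ctx));

    if (!ctx)
        return NULL;
    live_contexts++;

    ctx->pool_a = malloc(64);
    if (!ctx->pool_a)
        goto fail;

    ctx->pool_b = fail_second_alloc ? NULL : malloc(64);
    if (!ctx->pool_b)
        goto fail;

    return ctx;

fail:
    ctx_destroy(ctx);
    return NULL;
}
```

Zero-initializing the context up front is what makes the single cleanup path safe: any member not yet set is NULL when ctx_destroy() runs.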
2010-08-16 | r300g: Fix macro | nobled
This fixes a potential bug if (has_hyperz) is false (it would still init the atom as if has_hyperz were true). Signed-off-by: Marek Olšák <maraeo@gmail.com>
2010-08-16 | r300/compiler: implement DP2 opcode | Marek Olšák
2010-08-16 | r300/compiler: implement SSG opcode | Marek Olšák
2010-08-15 | llvmpipe: special case triangles which fall in a single 16x16 block | Keith Whitwell
Check for these and route them to a dedicated handler with one fewer levels of recursive rasterization.
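One plausible form of the check is a bounding-box test. This is a sketch, not llvmpipe's code: the function name is hypothetical, and the bounds are assumed to be non-negative, inclusive pixel coordinates. A box lies inside a single 16x16 block when its minimum and maximum coordinates map to the same block index on both axes.

```c
#include <assert.h>

/* Hypothetical check: does an inclusive pixel bounding box fall entirely
 * inside one 16x16 block? Shifting right by 4 divides by 16, giving the
 * block index on each axis. */
int bbox_in_single_block(int minx, int miny, int maxx, int maxy)
{
    assert(minx >= 0 && miny >= 0 && minx <= maxx && miny <= maxy);
    return (minx >> 4) == (maxx >> 4) && (miny >> 4) == (maxy >> 4);
}
```

A triangle passing this test can go straight to a handler for that one block instead of walking the coarser levels first.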
2010-08-15 | llvmpipe: consolidate several loops in lp_rast_triangle | Keith Whitwell
2010-08-15 | llvmpipe: remove all traces of step arrays, pos_tables | Keith Whitwell
No need to calculate these values any longer, nor to store them in the bin data. Improves isosurf a bit more, 115->123 fps.
2010-08-15 | llvmpipe: eliminate last usage of step array in rast_tmp.h | Keith Whitwell
For 16 and 64 pixel levels, calculate a mask which is linear in x and y (i.e. not in the swizzle layout). When iterating over full and partial masks, figure out position by manipulating the bit number set in the mask, rather than relying on position arrays. Similarly, calculate the lower-level c values from dcdx, dcdy and the position rather than relying on the step array.
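The approach can be sketched for a 4x4 grid as follows. This is an illustration with assumptions, not the rast_tmp.h code: the function name is hypothetical, the mask is assumed to be row-major with bit number y*4 + x, and the plane value is assumed to be c0 + x*dcdx + y*dcdy (the real code may use different signs or scaling). The position is recovered from the bit number alone, and the c value from the position, so no lookup tables are needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical walk over a 16-bit mask covering a 4x4 grid, with bit
 * number = y*4 + x. For each set bit, derive (x, y) from the bit number
 * and compute the plane value there. Writes results to out_c[] and
 * returns how many bits were set. */
int visit_mask(uint32_t mask, int c0, int dcdx, int dcdy, int out_c[16])
{
    int count = 0;

    assert((mask & ~0xffffu) == 0);

    while (mask) {
        unsigned bit = 0;

        /* Find the lowest set bit (portable loop; a compiler intrinsic
         * such as a count-trailing-zeros builtin could replace this). */
        while (!(mask & (1u << bit)))
            bit++;
        mask &= mask - 1; /* clear that bit */

        out_c[count++] = c0 + (int)(bit & 3) * dcdx + (int)(bit >> 2) * dcdy;
    }
    return count;
}
```

Because the mask is linear in x and y, the same shift-and-mask arithmetic works at every level; only the grid cell size changes.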
2010-08-15 | llvmpipe: don't refer to plane->step when dcdx or dcdy would do | Keith Whitwell