Age | Commit message (Collapse) | Author |
|
This patch implements nv04_surface_copy/fill using the new 2D engine module.
It supports falling back to the 3D engine using the u_blitter module, which will be
added in a later patch.
Also adds support for using the 3D engine, reusing the u_blitter module
created for r300.
This is used for unswizzling and copies between swizzled surfaces.
|
|
This patch add a brand new nv04-nv40 2D engine module.
It should correctly implement all operations involving swizzled, and 3D-swizzled surfaces.
This code is independent from the Gallium framework and can thus be reused in the DDX and classic Mesa drivers (it's only likely to be useful in the latter, though).
Currently, surface_copy and surface_fill are broken for 3D textures, for swizzled source textures and possibly for some misaligned cases
The code is based around the new nv04_region structure, which encapsulates the information from pipe_surface needed for the 2D engine and CPU copies.
The use of nv04_region makes the code independent of the Gallium framework and allows to transform the nv04_region without clobbering the nv04_region.
The existing M2MF, blitter, and SWIZZLED_SURFACE paths have been improved and a new CPU path has been added.
There is also support to tell the caller to use the 3D engine.
The main feature of the copy/fill setup algorithm is linearization/contiguous-linearization of swizzled surfaces.
The idea of linearization is that some swizzled surfaces are laid out like linear ones (1xN, 2xN, Nx1) and can thus be used as such (e.g. useful for copying single pixels).
Also, some rectangles (e.g. the whole surface) are contiguous in memory. If both the source and destination rectangles are swizzled but contiguous, then they can be regarded as both linear: this is the idea of "contiguous linearization".
This, for instance, allows to use the 2D engine to duplicate the content of a swizzled surface to another swizzled surface, by pretending they are actually linear.
After linearization, the result may not be 64-byte aligned. Another transformation is done to enlarge the linear surface so that it becomes 64-byte aligned.
This is also used to 64-byte align swizzled texture mipmaps.
The inner loop of the CPU path is as optimized as possible without using SSE/SSE2.
Future improvements could include SSE/SSE2 support, and possibly a faster coordinate swizzling algorithm (which is however not used in the inner loop).
It may be a good idea to autogenerate swizzling code at least for all possible POT 2D texture dimensions (less than 256), maybe for all 3D ones too (less than 4096).
Also, it woud be a very good idea to make a copy with the GPU first if the source surface is in uncached memory.
|
|
This greatly simplifies the code, and avoids ad-hoc copy code.
Also, these new transfers work for buffers too, even though they
are still used for miptrees only.
|
|
Changes:
- Disable swizzling on non-RGBA 2D textures, since the current 2D
code is mostly broken in those cases. A later patch will fix this.
Thanks to Andrew Randrianasulu who reported this.
- Fix compressed texture transfers and hack around the current 2D
code inability to copy compressed textures by using direct access.
Thanks to Andrew Randrianasulu who reported this.
This patch rewrites all the miptree layout and transfer code in the
nvfx driver.
The current code is broken in several ways:
1. 3D textures are laid out first by face, then by level, which is
incorrect
2. Cube maps should have 128-byte aligned faces
3. Swizzled textures have a strange alignment test that seems
unnecessary
4. We store the image_offsets for each face/slice but they can be
easily computed instead
5. "Swizzling" is not supported for compressed formats. They can be
"swizzled" but swizzling only means that there are no gaps (pitch is
level-dependant) and the layout is still linear
6. Swizzling is not supported for non-RGBA formats. All formats (except
possibly depth) can be swizzled according to my testing.
The miptree layout is rewritten based on my empirical testing, which I
posted in the "miptree findings" mail.
The image_offset array is removed, since it can be calculated with a
simple multiplication; the only array in the miptree structure is now
the one for mipmap level starts, which it seems cannot be easily
computed in constant time.
Also, we now directly store a nouveau_bo instead of a pipe_buffer in
the miptree structure, like nv50 does.
Support for render temporaries is removed, and will be readded in a
later patch.
Note that the current temporary code is broken, because it does not
copy the temporary back on render cache flushes.
|
|
Add a function to get whether a resource is likely on the GPU or not.
Currently always returns TRUE.
|
|
|
|
|
|
|
|
A source line was put in the wrong place.
|
|
Searched for them with:
git grep -E '[!=]=.*PIPE_TEXTURE_2D|PIPE_TEXTURE_2D.*[!=]=|case.*PIPE_TEXTURE_2D'
Behavior hasn't been changed.
|
|
Apparently they have always been broken, even before unification.
Fixes a lot of stuff, starting from morph3d and lighting in teapot
with textures disabled.
|
|
|
|
That is, remove pipe_context::draw_arrays, pipe_context::draw_elements,
pipe_context::draw_arrays_instanced,
pipe_context::draw_elements_instanced,
pipe_context::draw_range_elements.
|
|
Some drivers define a generic function that is called by all drawing
functions. To implement draw_vbo for such drivers, either draw_vbo
calls the generic function or the prototype of the generic function is
changed to match draw_vbo.
Other drivers have no such generic function. draw_vbo is implemented by
calling either draw_arrays and draw_elements.
For most drivers, set_index_buffer does not mark the state dirty for
tracking. Instead, the index buffer state is emitted whenever draw_vbo
is called, just like the case with draw_elements. It surely can be
improved.
|
|
|
|
Signed-off-by: Patrice Mandin <patmandin@gmail.com>
|
|
|
|
we need to change it to support composite types
|
|
more consistent with rest of gallium naming conventions.
Also rename driver-internal names for these the same.
|
|
|
|
Conflicts:
src/mesa/state_tracker/st_gen_mipmap.c
src/mesa/state_tracker/st_texture.c
|
|
Signed-off-by: Patrice Mandin <patmandin@gmail.com>
|
|
|
|
prevents segfault when state trackers try to set default mask.
Other option would be to make this required only for drivers
supporting multisampling, but this seems more clean.
Only dummy implementations (for normal drivers) provided (no driver
supports multisampling yet neither).
|
|
this probably needs further cleanup (just getting a surface for the resource
seems quite nonoptimal and potentially cause unnecessary copies I think)
|
|
The linux-debug target builds...
|
|
Use front/back instead of cw/ccw throughout.
Also, use offset_point/line/fill instead of offset_cw/ccw.
Brings gallium representation of this state into line with its main
user, and also what turns out to be the most common hardware
representation.
This fixes a long-standing bias in the interface towards the
architecture of the software rasterizer.
|
|
|
|
|
|
Signed-off-by: Corbin Simpson <MostAwesomeDude@gmail.com>
|
|
Signed-off-by: Patrice Mandin <patmandin@gmail.com>
|
|
libdrm-2.4.20 and earlier include the nouveau/nouveau_class.h header. A
later version of libdrm will not ship this header. Mesa also has this
header at src/gallium/drivers.
The symbol NV34TCL_VTXFMT_TYPE_HALF is needed by nvfx_vbo.c. This symbol
is not in the libdrm copy of the header but is in the Mesa copy of the
header. This patch moves src/gallium/drivers to the beginning of the
include paths such that when building on hosts with libdrm-2.4.20 or
ealier the build uses the copy in Mesa.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Conflicts:
src/gallium/auxiliary/draw/draw_context.c
src/gallium/auxiliary/draw/draw_pipe_aaline.c
src/gallium/drivers/llvmpipe/lp_context.c
|
|
Don't include nvfx_context.h and use a forward reference instead.
nvfx_context.h includes nvfx_screen.h (itself).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|