Age | Commit message (Collapse) | Author |
|
Remove redundant initialization from commit
3b01b52bd78e3d2fc857feacebd815a5fae00c94 noticed by tstellar.
|
|
|
|
|
|
|
|
|
|
|
|
Fixes SCons build.
|
|
I thought I couldn't skip emitting this packet in some cases.
Well it looks like I can.
|
|
|
|
|
|
|
|
|
|
|
|
Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=33185
|
|
|
|
Based on Dave's branch.
The majority of this commit is a cleanup, mainly renaming things.
There wasn't much code to import, just ioctl calls.
Also done:
- implemented unsynchronized bo_map (important optimization!)
- radeon_bo_is_referenced_by_cs is no longer a refcount hack
- dropped the libdrm_radeon dependency
I'm surprised that this has resulted in less code in the end.
|
|
|
|
So that we can implement resource_copy on arbitrary data.
|
|
Transfers and create/destroy are still handled separately.
|
|
|
|
|
|
This partially reverts commit 91eba2567eab9409d94efc3c1f07a4a3731d0047.
Conflicts:
src/gallium/drivers/r300/r300_blit.c
|
|
|
|
|
|
|
|
We don't have to unmap and recreate the upload buffer when a flush occurs.
This should also prevent buffer allocations from failing.
|
|
|
|
|
|
|
|
1) Only translate the [min_index, max_index] range.
2) Upload translated vertices via the uploader.
3) Rename valid_vertex_buffer[] to real_vertex_buffer[]
|
|
|
|
|
|
|
|
Also cleanup the whole thing.
|
|
|
|
|
|
|
|
This drops the memblock manager for ZMASK. Instead, only one zbuffer can be
compressed at a time. Note that this does not necessarily have to be slower.
When there is a large number of zbuffers, compression might be used more often
than it was before. It's also easier to debug.
How it works:
1) 'clear' turns the compression on.
2) If some other zbuffer is set or the currently-bound zbuffer is used
for texturing, the driver decompresses it and then turns the compression off.
Notes:
- The ZMASK clear has been refactored, so that only one packet3 is used to clear
ZMASK.
- The 8x8 compression mode is disabled. I couldn't make it work without issues.
- Also removed driver-specific stuff from u_blitter.
Driver status:
- RV530 and R580 appear to just work (finally).
- RV570 should work, but there may be an issue that we don't correctly
calculate the number of dwords to clear, resulting in a partially
uninitialized zbuffer.
- RS690 misrenders as if no ZMASK clear happened. No idea what's going on.
- RV350 may even hardlock. This issue was already present and this patch doesn't
fix it.
I think we are still missing some hardware info we need to make the zbuffer
compression work fully.
Note that there is also an issue with HiZ, resulting in a sort of blocky
zigzagged corruption around some objects.
|
|
|
|
I couldn't make it work.
GB_TILE_CONFIG.Z_EXTENDED, which enables per-pixel Z clamping, and
VAP_CLIP_CNTL.CLIP_DISABLE, which disables clipping, do help, but they
also add regressions like random graphics corruptions in some games.
|
|
|
|
This reverts commit 88550083b3857184445075e70fed8b2eed4952a1.
|
|
r400 fragment shaders now support up to 64 temporary registers,
512 ALU instructions, and 512 TEX instructions.
|
|
We are not required to do the linear->sRGB conversion if ARB_framebuffer_sRGB
is unsupported. However I think the conversion should work in hw except
for blending, which matches the D3D9 behavior.
|
|
The hw can't do it and the code was useless anyway (it's lowered
in the GLSL compiler).
|
|
|
|
Performance++.
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=32912
The fix is to call update_derived_state before user buffer uploads.
I've also moved some code around.
Unfortunately, there are still some ZMASK-related bugs which cause
misrendering, i.e. flushing doesn't always work and glean/fbo fails.
|
|
|
|
The motivation behind this rework is to get some speed by reducing
CPU overhead. The performance increase depends on many factors,
but it's measurable (I think it's about 10% increase in Torcs).
This commit replaces libdrm's radeon_cs_gem with our own implemention.
It's optimized specifically for r300g, but r600g could use it as well.
Reloc writes and space checking are faster and simpler than their
counterparts in libdrm (the time complexity of all the functions
is O(1) in nearly all scenarios, thanks to hashing).
(libdrm's radeon_bo_gem is still being used in the driver.)
It works like this:
cs_add_reloc(cs, buf, read_domain, write_domain) adds a new relocation and
also adds the size of 'buf' to the used_gart and used_vram winsys variables
based on the domains, which are simply or'd for the accounting purposes.
The adding is skipped if the reloc is already present in the list, but it
accounts any newly-referenced domains.
cs_validate is then called, which just checks:
used_vram/gart < vram/gart_size * 0.8
The 0.8 number allows for some memory fragmentation. If the validation
fails, the pipe driver flushes CS and tries do the validation again,
i.e. it validates only that one operation. If it fails again, it drops
the operation on the floor and prints some nasty message to stderr.
cs_write_reloc(cs, buf) just writes a reloc that has been added using
cs_add_reloc. The read_domain and write_domain parameters have been removed,
because we already specify them in cs_add_reloc.
The space checking has been tested by putting small values in vram/gart_size
variables.
|