The motivation behind this rework is to get some speed by reducing
CPU overhead. The performance increase depends on many factors,
but it's measurable (I think it's about 10% increase in Torcs).
This commit replaces libdrm's radeon_cs_gem with our own implementation.
It's optimized specifically for r300g, but r600g could use it as well.
Reloc writes and space checking are faster and simpler than their
counterparts in libdrm (the time complexity of all the functions
is O(1) in nearly all scenarios, thanks to hashing).
(libdrm's radeon_bo_gem is still being used in the driver.)
It works like this:
cs_add_reloc(cs, buf, read_domain, write_domain) adds a new relocation and
also adds the size of 'buf' to the used_gart and used_vram winsys variables
based on the domains, which are simply or'd for the accounting purposes.
The adding is skipped if the reloc is already present in the list, but any
newly-referenced domains are still accounted for.
cs_validate is then called, which just checks:
used_vram/gart < vram/gart_size * 0.8
The 0.8 number allows for some memory fragmentation. If the validation
fails, the pipe driver flushes the CS and tries to do the validation again,
i.e. it validates only that one operation. If it fails again, it drops
the operation on the floor and prints some nasty message to stderr.
cs_write_reloc(cs, buf) just writes a reloc that has been added using
cs_add_reloc. The read_domain and write_domain parameters have been removed,
because we already specify them in cs_add_reloc.
The space checking has been tested by putting small values in vram/gart_size
variables.
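
A minimal sketch of the accounting described above, not the actual winsys code. The struct layouts, the `DOMAIN_*` flags, `cs_init`, `cs_lookup` and the fixed-size open-addressing hash table are all assumptions made for illustration; only the names cs_add_reloc, cs_validate, used_gart, used_vram and the 0.8 factor come from the text.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical domain flags, for illustration only. */
enum { DOMAIN_GTT = 1u << 0, DOMAIN_VRAM = 1u << 1 };

#define MAX_RELOCS 64
#define HASH_SIZE  128 /* power of two and > MAX_RELOCS, so probing ends */

struct buf   { uint32_t handle; uint64_t size; };
struct reloc { const struct buf *buf; unsigned domains; };

struct cs {
    struct reloc relocs[MAX_RELOCS];
    int          hash[HASH_SIZE];   /* reloc index + 1, 0 means empty */
    unsigned     num_relocs;
    uint64_t     used_gart, used_vram;
    uint64_t     gart_size, vram_size;
};

void cs_init(struct cs *cs, uint64_t gart_size, uint64_t vram_size)
{
    memset(cs, 0, sizeof(*cs));
    cs->gart_size = gart_size;
    cs->vram_size = vram_size;
}

/* Returns the reloc index of 'buf', or -1; '*slot' is where it would go. */
static int cs_lookup(const struct cs *cs, const struct buf *buf, unsigned *slot)
{
    unsigned i = buf->handle & (HASH_SIZE - 1);

    while (cs->hash[i] != 0) {
        if (cs->relocs[cs->hash[i] - 1].buf == buf)
            break;
        i = (i + 1) & (HASH_SIZE - 1);   /* linear probing */
    }
    *slot = i;
    return cs->hash[i] - 1;
}

/* Adds the buffer size to the counters of the given domains. */
static void cs_account(struct cs *cs, const struct buf *buf, unsigned domains)
{
    if (domains & DOMAIN_GTT)
        cs->used_gart += buf->size;
    if (domains & DOMAIN_VRAM)
        cs->used_vram += buf->size;
}

bool cs_add_reloc(struct cs *cs, const struct buf *buf,
                  unsigned read_domain, unsigned write_domain)
{
    unsigned domains = read_domain | write_domain; /* or'd for accounting */
    unsigned slot;
    int idx = cs_lookup(cs, buf, &slot);

    if (idx >= 0) {
        /* Already in the list: only account newly-referenced domains. */
        cs_account(cs, buf, domains & ~cs->relocs[idx].domains);
        cs->relocs[idx].domains |= domains;
        return true;
    }
    if (cs->num_relocs == MAX_RELOCS)
        return false;

    cs->relocs[cs->num_relocs].buf = buf;
    cs->relocs[cs->num_relocs].domains = domains;
    cs->hash[slot] = (int)++cs->num_relocs;
    cs_account(cs, buf, domains);
    return true;
}

/* used_vram/gart < vram/gart_size * 0.8 */
bool cs_validate(const struct cs *cs)
{
    return (double)cs->used_gart < (double)cs->gart_size * 0.8 &&
           (double)cs->used_vram < (double)cs->vram_size * 0.8;
}
```

In this sketch a caller would add every buffer an operation needs, call cs_validate, and on failure flush, reset the counters and retry once, as the message describes.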
|
|
|
|
|
|
|
|
Small perf improvement in ipers.
radeon_drm_get_cs_handle is exactly what this commit tries to avoid
in every write_reloc.
|
|
On Lightsmark on my r500, this drops the bufmgr allocations out of the sysprof profile.
|
|
Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=31841
|
|
|
|
NOTE: This is a candidate for the 7.9 branch.
|
|
Based on commit 3ddc714b20ac4e28b80c6f88d1993445fff2262c by Dave Airlie.
NOTE: This is a candidate for the 7.9 branch.
|
|
Caused by 0b9eb5c9bb03e5134d9a41786178100109e80c5a.
To test: run glxgears and resize the window.
|
|
This fixes a DRM deadlock in the cubestorm xscreensaver, because somehow
there must not be 2 different BOs relocated in one CS if both BOs back
the same handle. I was told it is impossible to happen, but apparently
it is not, or there is something else wrong.
|
|
https://bugs.freedesktop.org/show_bug.cgi?id=30145
|
|
If the buffer we are attempting to map is referenced by the unsubmitted
command stream for this context, we need to flush the command stream.
However, to do that we need to be able to access the context in the
lowest-level map function. Currently we set the buffer in the toplevel map,
but this is racy between contexts. (We probably have a lot more issues than that.)
I'll look into a proper solution as suggested by jrfonseca when I get some time.
|
|
|
|
This makes it compatible with the modified DRM interface in drm-radeon-testing.
Also, now you need to set RADEON_HYPERZ=1 to be able to use hyperz.
It's not bug-free yet.
|
|
|
|
|
|
This implements fast Z clear, Z compression, and HiZ support for r300->r500
GPUs.
It also allows cbzb clears when fast Z clears are being used for the ZB.
It requires a kernel with hyper-z support.
Thanks to Marek Olšák <maraeo@gmail.com>, who started this off, and Alex Deucher at AMD for providing lots of hints.
v2:
[squashed zmask ram size fix]
[squashed r300g/blitter: fix Z readback when compressed]
v3:
rebase around texture changes in master - .1 fix more bits
v4:
migrated to using u_mm in r300_texture to manage hiz/zmask rams consistently
disabled HiZ when using OQ
flush z-cache before turning hyper-z off
update hyper-z state on dsa state change
store depthclearvalue across cbzb clears and replace it afterwards.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
The driver gets a buffer and its size in resource_from_handle.
It computes the required minimum buffer size from given texture
properties, and compares the two sizes.
This is to detect DDX bugs early.
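
A rough sketch of such a check, assuming a single-level, untiled 2D texture. The `tex_desc` struct and both function names are hypothetical; the real driver would also have to account for tiling, mipmap levels and alignment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical texture description, for illustration only. */
struct tex_desc {
    uint32_t width;            /* in pixels */
    uint32_t height;           /* in rows */
    uint32_t bytes_per_pixel;
    uint32_t stride;           /* row pitch in bytes, as given by the caller */
};

/* Minimum buffer size for a linear layout: one stride per row. */
uint64_t tex_min_buffer_size(const struct tex_desc *t)
{
    uint64_t row = (uint64_t)t->width * t->bytes_per_pixel;
    uint64_t stride = t->stride > row ? t->stride : row;

    return stride * t->height;
}

/* Compares the size of the imported buffer with the required minimum. */
bool tex_buffer_is_big_enough(const struct tex_desc *t, uint64_t buffer_size)
{
    return buffer_size >= tex_min_buffer_size(t);
}
```

A failed comparison at import time points at the component that allocated the buffer (here, the DDX) rather than at a later rendering error.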
|
|
|
|
|
|
Also print relocation failures on non-debug builds.
|
|
|
|
This flush happens when changing the tiling flags, and it should really be
done in the context.
I hope this fixes FDO bug #28630.
|
|
Conflicts:
src/gallium/state_trackers/egl/x11/native_dri2.c
src/gallium/state_trackers/egl/x11/native_x11.c
src/gallium/state_trackers/egl/x11/native_x11.h
src/gallium/state_trackers/xorg/xorg_driver.c
src/gallium/winsys/radeon/drm/radeon_drm.c
|
|
|
|
|
|
Currently unconditional and causes segfaults.
|
|
|
|
|
|
I have had a look at the libdrm sources and they just contain more or less
the same checking we do in macros, and begin_cs may realloc the CS buffer
if we overflow it, which never happens with r300g. So these are pretty
much useless.
There is a small but measurable performance increase by dropping the two
functions.
|
|
|
|
|
|
|
|
|
|
|
|
With the removal of DRI1 support there was no use of this argument;
some drivers didn't even check it properly.
|
|
|
|
The regression first showed up after this state tracker change:
b0427bedde80e3189524651a327235bdfddbc613.
FDO bug #28082.
|
|
It's already done in r300_emit_buffer_validate.
This also fixes Total Annihilation 3D on debug builds at least.
|
|
|
|
|
|
See also the libdrm commit af98ccf4dd5dcb1b904ec32b9bd1521e6bf7dda5.
|
|
|
|
|
|
|
|
Also try to wrap trace around the driver on non-debug builds; it's free.
|
|
It saves a few libdrm calls and unnecessary flushes.
|
|
This is a bug in the CS checker that causes the CS to be rejected.
|