Age | Commit message (Collapse) | Author |
|
It's faster. Not only is the memcpy more efficiently performed in the
kernel (making up for the system call overhead), but by not using mmap
we remove the greater overhead of tracking the vma of every batch.
And it means we can read back from the batch buffer without incurring
the cost of a uncached read through the GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we use state relocations and we know that all the state belongs to
the same bo, we can drop the multiple references to the same bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we write directly into the batch in system memory, we do not need to
write first to the stack (as was to avoid read back through the GTT)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Following a GPU hang, or other error, the render target is not likely to
have an allocated BO and so we must fallback to avoid using it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32534
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This adds i965 support for GL_EXT_framebuffer_sRGB, it introduces a new
constant to say that the driver can support sRGB enabled FBOs since enabling
the extension doesn't mean the driver can actually support sRGB.
Also adds the suggested state flush in the core code suggested by Brian.
fix the ARB_fbo color encoding.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
We just choose the texture format depending on the srgb decode bit
for the sRGB formats.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Until we get the EXT_framebuffer_sRGB extension we should bind the sRGB
formats for FBO as linear.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
This reverts commit 7ce6517f3ac41bf770ab39aba4509d4f535ef663.
This reverts commit d60145d06d999c5c76000499e6fa9351e11d17fa.
I was wrong about which generations supported baselevel adjustment --
it's just gen4, nothing earlier. This meant that i915 would have
never used the mag filter when baselevel != 0. Not a severe bug, but
not an intentional regression. I think we can fix the performance
issue another way.
|
|
By relying on just intel_span_supports_format, some formats that
aren't supported pre-gen4 were not reporting FBO incomplete. And we
also complained in stderr when it happened on i915 because draw_region
gets called before framebuffer completeness validation.
|
|
BaseLevel/MaxLevel are mostly used for two things: clamping texture
access for FBO rendering, and limiting the used mipmap levels when
incrementally loading textures. By restricting our mipmap trees to
just the current BaseLevel/MaxLevel, we caused reallocation thrashing
in the common case, for a theoretical win if someone really did want
just levels 2..4 or whatever of their texture object.
Bug #30366
|
|
This has always been ugly about our texture code -- object base/max
level vs intel object first/last level vs image level vs miptree
first/last level. We now get rid of intelObj->first_level which is
just tObj->BaseLevel, and make intelObj->_MaxLevel clearly based off
of tObj->_MaxLevel instead of duplicating its code (incorrectly, as
image->MaxLog2 only considers width/height and not depth!)
|
|
It was quite a mess by trying to do NULL renderbuffers and real
renderbuffers in the same function. This clarifies the common case of
real renderbuffers.
|
|
This makes
fbo-generatemipmap-formats GL_EXT_texture_sRGB-s3tc
match
fbo-generatemipmap-formats GL_EXT_texture_compression_s3tc
and swrast in bad DXT1_RGBA alpha=0 handling, but it means we won't
unpack and repack someone's textures into uncompressed SARGB8 format.
|
|
There are exceptions to the table for depth texturing or rendering to
not-quite-supported formats thanks to the non-orthogonal component
selection for surface formats, but it's still a lot simpler.
|
|
Fixes depth-tex-modes-rg.
|
|
This is said to be required in the spec, even when you aren't doing writes.
|
|
This fixes some insanity that would otherwise be required for GLSL
1.30 bit ops or gen6 integer uniform operations in general, at the
cost of upload-time pain. Given that we only have that pain because
mesa's mangling our integer uniforms to be floats, this something that
should be fixed outside of the shader codegen.
|
|
Fixes glsl-fs-uniform-array-5, but not 6 which fails in ir_to_mesa.
|
|
|
|
|
|
Tested with fbo-generatemipmap-formats GL_EXT_texture_srgb. The test
still fails on SLA8, though.
|
|
1. Move all GL entrypoint functions and files into src/mesa/main/
This includes the ARB vp/vp, NV vp/fp, ATI fragshader and GLSL bits
that were in src/mesa/shader/
2. Move src/mesa/shader/slang/ to src/mesa/slang/ to reduce the tree depth
3. Rename src/mesa/shader/ to src/mesa/program/ since all the
remaining files are concerned with GPU programs.
4. Misc code refactoring. In particular, I got rid of most of the
GLSL-related ctx->Driver hook functions. None of the drivers used
them.
Conflicts:
src/mesa/drivers/dri/i965/brw_context.c
|
|
We had to fill out all that junk when using the cache, but no more.
|
|
|
|
This makes the binding table code simpler, and is required for gen6,
which requires binding table addresses to be under 64k offset from the
surface state base addr.
No significant change in performance on firefox-talos-gfx.
|
|
It turns out that computing a 56 byte key to look up a 20-byte object
out of a hash table was some sort of a bad idea. Whoops.
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
after:
[ 0] gl firefox-talos-gfx 34.761 34.784 0.17% 5/6
|
|
This slightly reduces reduces cairo-gl firefox-talos-gfx runtime on my
Ironlake:
before:
[ # ] backend test min(s) median(s) stddev. count
[ 0] gl firefox-talos-gfx 38.236 38.383 0.43% 5/6
after:
[ 0] gl firefox-talos-gfx 37.799 38.203 0.39% 6/6
It turns out the cost of caching these objects and looking them up in
the cache again is greater than the cost of just computing the object
again, particularly when the overhead of having a separate BO to pin
is removed.
(Those that are paying close attention will note that this is a
reversal of the path I was moving the driver in a couple of years ago.
The major thing that has changed is that back then all state was
recomputed when we wrapped the streaming state buffer, including
recompiling our precious programs. Now, we're uncaching just the
objects that are cheap to compute, and retaining caching of expensive
objects)
|
|
This was bothering me when redoing the binding tables.
|
|
|
|
|
|
The new API makes so much more sense, I'd like to forget how the old
one worked.
|
|
The slightly less mechanical change of converting the emit_reloc calls
will follow.
|
|
|
|
Fixes assertion failure in fbo-generatemipmap-npot.
|
|
The pitch is not really an inherent part of the miptree, since it's
not part of any of the layout calculations, and it's dictated by the
libdrm-allocated region pitch now.
|
|
|
|
|
|
|
|
|
|
Everything has been constant-sized until now, but constant buffer
handling changes will make us want some additional variable sized
array.
|
|
|
|
Conflicts:
docs/relnotes.html
src/gallium/drivers/llvmpipe/lp_tex_sample_c.c
src/gallium/drivers/r300/r300_cs.h
src/mesa/drivers/dri/i965/brw_wm_surface_state.c
src/mesa/main/enums.c
|
|
If we ever had a non-tile-aligned tiled renderbuffer, the math was all
off. Use the existing x,y coordinates instead of trying to
reconstruct them from an incorrectly-calculated offset value.
|
|
This is part of the GL_EXT_draw_buffers2 extension and part of GL 3.0.
The ctx->Color.ColorMask field is now a 2-D array. Until drivers are
modified to support per-buffer color masking, they can just look at
the 0th color mask.
The new _mesa_ColorMaskIndexed() function will be called by
glColorMaskIndexedEXT() or glColorMaski().
|
|
Saves ~2KB of code.
|
|
Use the currently bound draw buffer instead of the visual from the
drawable used to create the context. This cause problems generating
mipmaps for an RGBA texture in an RGB context.
This fixes the failure in piglit's glsl-lod-bias test reported in bug #25614.
|
|
Now that XRGB is supported, we don't need to hack around cases of an RGBA
format buffer with an internal format of GL_RGB.
|
|
It turns out that 965 and friends cannot actually render to an xRGB
surfaces. Instead, the surface has to be RGBA with writes to alpha
disabled and the blend function modified to always use 1.0 for
destination alpha.
|
|
Since the texformat branch merge, the value of intel_renderbuffer::texformat
is just a copy of gl_renderbuffer::Format.
|
|
|