Age | Commit message (Collapse) | Author |
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Move the tracking of the last emitted instructions into the core
batchbuffer routines and take advantage of the shadow batch copy to
avoid extra memory allocations and copies.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
It's faster. Not only is the memcpy more efficiently performed in the
kernel (making up for the system call overhead), but by not using mmap
we remove the greater overhead of tracking the vma of every batch.
And it means we can read back from the batch buffer without incurring
the cost of a uncached read through the GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we use state relocations and we know that all the state belongs to
the same bo, we can drop the multiple references to the same bo.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we write directly into the batch in system memory, we do not need to
write first to the stack (as was to avoid read back through the GTT)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As we write directly into the batch in system memory, we do not need to
write first to the stack (as was to avoid read back through the GTT)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
In preparation for a greater change, use the color_calc_state_bo already
provisioned for this purpose.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Rather than performing lots of little writes to update the common bo
upon each update, write those into a static buffer and flush that when
full (or at the end of the batch). Doing so gives a dramatic performance
improvement over and above using mmaped access.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Reuse the new common upload buffer for uploading temporary indices and
rebuilt vertex arrays.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Dynamic arrays have the tendency to be small and so allocating a bo for
each one is overkill and we can exploit many efficiency gains by packing
them together.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Following a GPU hang, or other error, the render target is not likely to
have an allocated BO and so we must fallback to avoid using it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32534
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
These are only used for debugging, but should be there.
Found by inspection.
|
|
These were incorrectly defined to the same value - likely due to a cut
and paste error. Found by inspection.
|
|
|
|
|
|
This adds i965 support for GL_EXT_framebuffer_sRGB, it introduces a new
constant to say that the driver can support sRGB enabled FBOs since enabling
the extension doesn't mean the driver can actually support sRGB.
Also adds the suggested state flush in the core code suggested by Brian.
fix the ARB_fbo color encoding.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
We pull the draw regions right out of the renderbuffers these days.
|
|
There's way more interesting info in INTEL_DEBUG=state if you could find
it among the state size checks.
|
|
|
|
|
|
|
|
These are already picked up by ir.h or glsl_types.h.
|
|
...and remove egg from face.
|
|
|
|
I don't have evidence for this amounting to any improvement,
but it does codify a bit more what we understand so far about
the pipeline.
|
|
|
|
This fixes a bunch of unnecessary barriers due to the scheduler not
knowing what that arbitrary register description refers to when trying
to reason about its dependencies.
The result is rescheduling in the convolution kernel shader in
Lightsmark, which results in avoiding register spilling and increasing
the performance of the first scene from 6-7 fps midway through the
panning to 11fps. The register spilling was a regression from Mesa
7.9 to Mesa 7.10.
|
|
Improves performance of my GLSL demo by 5.1% (+/- 1.4%, n=7). It also
reschedules the giant multiply tree at the end of
glsl-fs-convolution-1 so that we end up not spilling registers,
producing the expected level of performance.
|
|
|
|
|
|
Fixes piglit glsl-fs-texture2d-branching. I couldn't come up with a
testcase that didn't involve dead code, but it's still worthwhile to
fix I think.
|
|
Fixes texrect-many regression with ff_fragment_shader -- as we added
refs to the subsequent texcoord scaling paramters, the array got
realloced to a new address while our params[] still pointed at the old
location.
|
|
We just choose the texture format depending on the srgb decode bit
for the sRGB formats.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
This code should never have been triggered, but I often did anyway
when I disabled optimization passes during debugging, then spent my
time debugging that this code doesn't work.
|
|
No effect, since it was called before live intervals were calculated.
|
|
Until we get the EXT_framebuffer_sRGB extension we should bind the sRGB
formats for FBO as linear.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
|
|
We were trying to interpolate, which would end up doing unnecessary
math, and doing so on undefined values. Fixes glsl-fs-flat-color.
|
|
The ad-hoc placement of recalculation somewhere between when they got
invalidated and when they were next needed was confusing. This should
clarify what's going on here.
|
|
We were returning the negative absolute value, instead of the absolute
value. Fixes glsl-vs-abs-neg.
|
|
We were returning the negative absolute value, instead of the absolute
value. Fixes glsl-fs-abs-neg.
|
|
This greatly improves codegen for programs with flow control by
allowing coalescing for all instructions at the top level, not just
ones that follow the last flow control in the program.
|
|
The _Enabled field is the thing that takes into account whether
there's a stencil buffer. Tested with piglit glx-visuals-stencil.
|
|
When attaching a small mipmap level to an FBO, the original gen4
didn't have the bits to support rendering to it. Instead of falling
back, just blit it to a new little miptree just for it, and let it get
revalidated into the stack later just like any other new teximage.
Bug #30365.
|
|
This reverts commit 7ce6517f3ac41bf770ab39aba4509d4f535ef663.
This reverts commit d60145d06d999c5c76000499e6fa9351e11d17fa.
I was wrong about which generations supported baselevel adjustment --
it's just gen4, nothing earlier. This meant that i915 would have
never used the mag filter when baselevel != 0. Not a severe bug, but
not an intentional regression. I think we can fix the performance
issue another way.
|
|
|
|
|
|
Again, this makes it match the documentation.
|