Age | Commit message (Collapse) | Author |
|
This provides the optimizer with hints about code hotness, which we're
quite certain about for debug printouts (or, rather, while we
developers often hit the checks for debug printouts, we don't care
about performance while doing so).
|
|
Fixes glsl-fs-uniform-array-5, but not 6 which fails in ir_to_mesa.
|
|
Sandybridge's PS constant buffer payload size is decided from
push const buffer command, incorrect size would cause wrong data
in payload for position and vertex attributes. This fixes coefficients
for tex2d/tex3d.
|
|
Accumulator update flag must be set for implicit update on sandybridge.
|
|
Sending down data that doesn't get read doesn't make any sense, and
would make handling things like gl_FrontFacing and gl_PointCoord
harder.
|
|
Things are simpler these days thanks to barycentric interpolation
parameters being handed in in the payload.
|
|
|
|
With recent changes to the GLSL compiler, these opcode should never be
seen in these drivers.
|
|
We always need to set it, so pass it in.
|
|
This is the same as 8de8c97275e9555183a7e8f2238143657bbe60b2 for the
VS, and fixes glsl-fs-if-nested-loop and the mandelbrot demo.
Bug #29498
|
|
Only 8 out of the up to 13 regs are for source/dest depth, so the name
wasn't particularly appropriate. Note that this doesn't count the
constant or URB payload regs. Also, don't pre-divide by 2, so it's
actually a number of registers.
|
|
|
|
This pulls in multiple i965 driver fixes which will help ensure better
testing coverage during development, and also gets past the conflicts
of the src/mesa/shader -> src/mesa/program move.
Conflicts:
src/mesa/Makefile
src/mesa/main/shaderapi.c
src/mesa/main/shaderobj.h
|
|
The original glsl compiler would generate a.x * b.x + a.y * b.y, which
we would do mul+mul+add for instead of this mul+mac.
Fixes glsl-fs-dot-vec2.
|
|
The old compiler didn't use SSG, and instead emitted SGT/SGT/SUB. We
can do a little better for SSG than we do for the SGT series.
|
|
|
|
|
|
Rename old IGDNG to Ironlake, and set 'gen' number for
Ironlake as 5, so tracking the features with generation num
instead of special is_ironlake flag.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
|
|
Fixes piglit glsl-orangebook-ch06-bump, regressed with
4fc57322258a750c0a9cabc77372b5ccde1fa877
|
|
The hope is to later take advantage of the reduced constant usage to
free up regs. This only covers the GLSL path at the moment, because
the brw_wm_emit path doesn't get the information as to whether a float
value is a constant or a uniform.
|
|
I keep finding the desire to force this path to debug it instead of
cooking up goofy-looking testcases to do so.
|
|
Saves an instruction in PINTERP, LINTERP, and PIXEL_W from
brw_wm_glsl.c For non-GLSL it isn't used yet because the deltas have
to be laid out differently.
|
|
This would be triggered by use of sqrt() along with control flow.
Fixes piglit-fs-sqrt-branch and a bug in Yo Frankie!.
|
|
|
|
Corresponds to d225a25e21a24508aea3b877c78beb35502e942d and fixes
piglit glsl-fs-loop-nested. Bug #25173.
|
|
We were doing it ad-hoc before, as instructions with potential
aliasing problems were identified. But thanks to swizzling basically
anything can have aliasing, so just do it generally at source reg
setup time. This is somewhat inefficient, because sometimes an
operation doesn't need unaliasing protection if the swizzling is safe,
but the unaliasing before didn't cover those cases either.
Fixes piglit glsl-fs-loop.
|
|
|
|
Conflicts:
configs/darwin
src/gallium/auxiliary/util/u_clear.h
src/gallium/state_trackers/xorg/xorg_exa_tgsi.c
src/mesa/drivers/dri/i965/brw_draw_upload.c
|
|
|
|
|
|
Caught by clang.
|
|
Saves ~480 bytes of code.
|
|
Add a GLbitfield64 type and several macros to operate on 64-bit
fields. The OutputsWritten field of gl_program is changed to use that
type. This results in a fair amount of fallout in drivers that use
programs.
No changes are strictly necessary at this point as all bits used are
below the 32-bit boundary. Fairly soon several bits will be added for
clip distances written by a vertex shader. This will cause several
bits used for varyings to be pushed above the 32-bit boundary. This
will affect any drivers that support GLSL.
At this point, only the i965 driver has been modified to support this
eventuality.
I did this as a "squash" merge. There were several places through the
outputswritten64 branch where things were broken. I foresee this
causing difficulties later for bisecting. The history is still
available in the branch.
Conflicts:
src/mesa/drivers/dri/i965/brw_wm.h
|
|
This should fix TXB on G45 and older in the GLSL case.
|
|
New comments should explain some of the confusion about how this message
works.
|
|
They're the same regardless of execution width for 8, 4x2, and 16.
|
|
|
|
This should fix issues with antialiased lines in GLSL.
|
|
The PINTERP code should be faster for brw_wm_glsl.c now since brw_wm_emit.c's
had been improved, and pixel_w should no longer stomp on a neighbor to dst.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This drops support for get_src_reg_imm in these, but the prospect of getting
brw_wm_pass*.c onto our GLSL path is well worth some temporary pain.
|
|
This matches brw_wm_emit.c, which we'll be using shortly. There's a
possible penalty here in that we'll allocate registers for unused channels,
since we aren't doing ref tracking like brw_wm_pass*.c does. However, my
measurements on GM965 don't show any for either OA or UT2004 with the GLSL
path forced.
|
|
This makes things a bit easier to remember/understand.
|
|
Previously, it was trying to mess around with the varying's
WM setup data to produce a result. Along with not actually working when
passed a varying, this wouldn't work if you did dFd[xy]() on a temporary.
Instead, just calculate the derivative using the neighbors in the subspan.
|
|
The instructions we're translating already went through the brw_wm_pass_fp()
function which does the sampler->texture unit mapping. We were applying
the sample->unit mapping a second time in the GLSL texture emitters.
Often, this made no difference but other times it could lead to accessing
an invalid texture and could cause a GPU lockup.
|