Age | Commit message (Collapse) | Author |
|
|
|
Fixes glsl-fs-uniform-array-5, but not 6 which fails in ir_to_mesa.
|
|
It can be tested with if (0) replaced with if (1) to force spilling for all
virtual GRFs. Some simple tests work, but large texturing tests fail.
|
|
|
|
|
|
Simply using RNDU, RNDZ, or RNDE does not produce the desired result.
Rather, the RND* instructions place a value in the destination register
that may be 1 less than the correct answer. They can also set per-channel
"increment bits" in a flag register, which, if set, mean dest needs to
be incremented by 1. A second instruction - a predicated add -
completes the job.
Notably, RNDD always produces the correct answer in a single
instruction.
Fixes piglit test glsl-fs-trunc.
|
|
Fixes glsl-algebraic-pow-2 in brw_wm_glsl.c mode.
|
|
We always need to set it, so pass it in.
|
|
|
|
Whenever the accumulator results are needed, this bit must be set.
|
|
Fixes glsl-vs-if-nested (70.0 is not <= 70.000648 thanks to the
swizzle bits getting set). Some safety checks are added to make sure
this doesn't happen again as we increase the usage of immediate values
in program generation.
|
|
Also fix up comments, so that the difference between the two passes is
clarified.
|
|
|
|
|
|
The previous support was overly complicated by trying to use the same
1-OWORD message for both offsets.
|
|
When EU executes 'wait' instruction, it stalls and sets notification
register state. Host can issue MMIO write to clear notification
register state to allow EU continue on executing again.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
|
|
|
|
|
|
This is recommended by the B-Spec. I wasn't able to measure any
difference in ETQW.
|
|
Saves an instruction in PINTERP, LINTERP, and PIXEL_W from
brw_wm_glsl.c For non-GLSL it isn't used yet because the deltas have
to be laid out differently.
|
|
No statistically significant performance difference at n=3 with either
openarena or my GL demo, but cutting program size seems like a good
thing to be doing for the hypothetical app that has a working set near
icache size.
|
|
1. new PCI ids
2. fix some 3D commands on new chipset
3. fix send instruction on new chipset
4. new VUE vertex header
5. ff_sync message (added by Zou Nan Hai <nanhai.zou@intel.com>)
6. the offset in JMPI is in unit of 64bits on new chipset
7. new cube map layout
|
|
|
|
the driver used to overwrite grf0 then use implicit move by send instruction
to move contents of grf0 to mrf1. However, we must not overwrite grf0 since
it's still used later for fb write.
Instead, do the move directly do mrf1 (we could use implicit move from another
grf reg to mrf1 but since we need a mov to encode the data anyway it doesn't
seem to make sense).
I think the dp_READ/WRITE_16 functions may suffer from the same issue.
While here also remove unnecessary msg_reg_nr parameter from the dataport
functions since always message register 1 is used.
|
|
|
|
The READ message's msg_control value can be 0 or 1 to indicate that the
Oword should be read into the lower or upper half of the target register.
It seems that the other half of the register gets clobbered though. So
we read into two dest registers then use a MOV to combine the upper/lower
halves.
|
|
A scatter-read should be possible, but we're just using two READs for
the time being.
|
|
This mostly came down to finding the right MRF incantation in the
brw_dp_READ_4_vs() function.
Note: this feature is still disabled (but getting close to done).
|
|
Hook up a constant buffer, binding table, etc for the VS unit.
This will allow using large constant buffers with vertex shaders.
The new code is disabled at this time (use_const_buffer=FALSE).
|
|
|
|
Used to read float[4] vectors from the constant buffer/surface.
|
|
|
|
This is the size of the intermediate instruction buffer.
|
|
Previously, the prog_instruction::Data field was used to map original Mesa
instructions to brw instructions in order to resolve subroutine calls. This
was a rather tangled mess. Plus it's an obstacle to implementing dynamic
allocation/growing of the instruction buffer (it's still a fixed size).
Mesa's GLSL compiler emits a label for each subroutine and CAL instruction.
Now we use those labels to patch the subroutine calls after code generation
has been done. We just keep a list of all CAL instructions that needs patching
and a list of all subroutine labels. It's a simple matter to resolve them.
This also consolidates some redundant post-emit code between brw_vs_emit.c and
brw_wm_glsl.c and removes some loops that cleared the prog_instruction::Data
fields at the end.
Plus, a bunch of new comments.
|
|
Also, fix some RNDD vs. RNDZ confusion elsewhere.
|
|
|
|
OPCODE_NOISE4 coming later.
|
|
This is required for scatter writes in destination regions to work.
|
|
|
|
The 32-bit immediate value in the i965 instruction word must contain two
copies of any 16-bit constants. brw_imm_uw and brw_imm_w just needed to
copy the value into both halves of the immediate value instruction field.
|
|
|
|
|
|
|
|
|
|
most of the sample working with some small modification
|
|
|
|
Mostly:
- update #includes
- update STATE_* token code
|
|
There is an errata for Broadwater that threads don't have the instruction/loop
mask stacks initialized on thread spawn. In single program flow mode, those
stacks are not writable, so we can't initialize them. However, they do get
read during ELSE and ENDIF instructions. So, instead, replace branch
instructions in single program flow mode with predicated jumps (ADD to the ip
register), avoiding use of the more complicated branch instructions that may
fail. This is also a minor optimization as no ENDIF equivalent is necessary.
Signed-off-by: Keith Packard <keithp@neko.keithp.com>
|
|
This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
|