path: root/src/gallium/drivers/nv50/nv50_program.c
Age  Commit message  Author
2009-10-23  gallium: remove the swizzling parts of ExtSwizzle  Keith Whitwell
These haven't been used by the mesa state tracker since the conversion to tgsi_ureg, and it seems that none of the other state trackers are using them either. This helps simplify one of the biggest surprises when starting off with TGSI shaders.
2009-10-19  nv50: add support for address regs  Christoph Bumiller
Allow indirect uniform access and increase the limit on parameters from 128 to 512.
2009-10-19  nv50: cleanup emit_kil  Christoph Bumiller
2009-10-19  nv50: implement TGSI_OPCODE_CMP  Christoph Bumiller
2009-10-19  nv50: quick fix for insn src negation  Christoph Bumiller
We only have a per-nv50_reg negation flag; if an nv50_reg is used more than once in a TGSI op with different sign modes, we'd generate wrong code. We probably can't do much better without more invasive changes.
2009-10-19  nv50: add support for DDX and DDY opcodes  Christoph Bumiller
2009-09-25  nv50: fix TEX for WriteMask not equal 0xf  Christoph Bumiller
If you e.g. only need alpha, it ends up in the first reg, not the last, as it would when reading rgb too.
2009-09-25  nv50: RCP and RSQ cannot load from VP inputs  Christoph Bumiller
2009-09-25  nv50: fix CEIL and TRUNC  Christoph Bumiller
Separated the integer rounding mode flag for cvt.
2009-09-25  nv50: implement BGNLOOP, BRK, ENDLOOP  Christoph Bumiller
There's a good chance a loop won't execute correctly though since our TEMP allocation assumes programs to be executed linearly. Will fix later.
2009-09-25  nv50: implement IF, ELSE, ENDIF opcodes  Christoph Bumiller
2009-09-15  nv50: fix stupid thinko in emit_set  Christoph Bumiller
When swapping sources 0 and 1, EQ of course does *not* become NE, etc. Introduced in 2b963f5c723401aa2646bd48eefe065cd335e280.
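The rule stated above can be sketched as a small helper. This is only an illustration, not the code in emit_set: the enum and function names are hypothetical. Swapping the operands mirrors the ordering comparisons, while the symmetric ones (EQ, NE) stay as they are.

```c
#include <assert.h>

/* Hypothetical condition codes; these names are illustrative and are
 * not taken from nv50_program.c. */
enum cmp_op { CMP_LT, CMP_LE, CMP_GT, CMP_GE, CMP_EQ, CMP_NE };

/* Return the operator op2 such that (b op2 a) equals (a op b). */
static enum cmp_op cmp_swap_operands(enum cmp_op op)
{
    switch (op) {
    case CMP_LT: return CMP_GT;
    case CMP_LE: return CMP_GE;
    case CMP_GT: return CMP_LT;
    case CMP_GE: return CMP_LE;
    case CMP_EQ: return CMP_EQ; /* symmetric: unchanged by a swap */
    case CMP_NE: return CMP_NE; /* symmetric: unchanged by a swap */
    }
    assert(!"unknown comparison");
    return op;
}
```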
2009-09-15  nv50: let programs use the whole param buffer  Christoph Bumiller
Allocation is unnecessary since all uniforms are uploaded on every constant buffer change anyway.
2009-09-15  nv50: add preliminary support for point sprites  Christoph Bumiller
2009-09-15  nv50: add support for point size per vertex  Christoph Bumiller
2009-09-15  nv50: add support for light-twoside  Christoph Bumiller
2009-09-15  nv50: proper linkage between VP and FP  Christoph Bumiller
This moves construction of the mapping between VP outputs and FP inputs into validation. The map also contains slots for special outputs like clip distance and point size, so we need to at least merge the VP related and FP related parts on validation if we want to support those. Now we match every single FP input component with results from the VP and leave those not read out of the map, or replace those not written by 0 (xyz) or 1 (w). The bitmap indicating linear interpolants is also filled, and flat FP inputs are mapped in only after non-flat ones, as is required. Furthermore, we can save some space by only fetching VP attrs we actually use, and avoid wasting any output regs because of TGSI using less than 4 components.
2009-09-15  nv50: move allocation of pc regs  Christoph Bumiller
Make use of tgsi_shader_info to determine how many nv50_regs we need to allocate, whether program uses KIL, or writes DEPR.
2009-09-15  nv50: nicer initialization of nv50_regs  Christoph Bumiller
2009-09-15  nv50: handle CEIL and TRUNC opcodes  Christoph Bumiller
2009-09-15  nv50: handle SEQ, SGT, SLE, SNE opcodes  Christoph Bumiller
2009-09-15  nv50: SIN and COS use src0.w for dst.w  Christoph Bumiller
2009-09-15  nv50: use broadcast TEMP reg in tx_insn  Christoph Bumiller
Makes some opcode cases nicer and might reduce the total nr of TEMPs required, or save some MOVs.
2009-09-15  nv50: add nv50_tgsi_insn to handle swizzles safely  Christoph Bumiller
2009-09-15  nv50: add functions for swizzle resolution  Christoph Bumiller
We're going to try to reorder the scalar ops of a vector instr to accommodate swizzles that would otherwise require us to emit to an additional TEMP first (like MOV R0.xy, R0.zx).
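A minimal sketch of such a reordering, assuming destination and source are the same register and each scalar op writes component c while reading component swz[c]. The function name and interface are hypothetical and are not the ones added by this commit. An op is safe to emit once no other pending op still needs to read the component it overwrites; if no such op exists, the swizzle forms a cycle and an extra TEMP is needed after all.

```c
#include <assert.h>

/* Order the scalar ops of "MOV R.mask, R.swz" (same register on both
 * sides) so that no component is overwritten before it has been read.
 * swz[c] is the source component read by the op writing component c.
 * Returns the number of ops placed in order[], or -1 if the swizzle
 * contains a cycle (e.g. MOV R0.xy, R0.yx) and needs a TEMP.
 * Hypothetical helper, for illustration only. */
static int order_scalar_ops(unsigned mask, const int swz[4], int order[4])
{
    unsigned pending = mask & 0xf;
    int n = 0;

    while (pending) {
        int picked = -1;

        for (int d = 0; d < 4 && picked < 0; d++) {
            int blocked = 0;

            if (!(pending & (1u << d)))
                continue;
            /* d is blocked while another pending op still reads it */
            for (int c = 0; c < 4; c++)
                if (c != d && (pending & (1u << c)) && swz[c] == d)
                    blocked = 1;
            if (!blocked)
                picked = d;
        }
        if (picked < 0)
            return -1; /* cycle: cannot be resolved by reordering */

        assert(n < 4);
        order[n++] = picked;
        pending &= ~(1u << picked);
    }
    return n;
}
```

For MOV R0.xy, R0.zx this yields the order y, x: R0.y is written from R0.x first, and only then is R0.x overwritten with R0.z.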
2009-09-15  nv50: extend insn src mask function  Christoph Bumiller
Extend its usage to avoiding e.g. emission of negation instructions in tx_insn for sources we don't need.
2009-09-03  nv50: move centroid, flat bits when making interp long  Christoph Bumiller
Before this, just the perspective divide bit was moved in convert_to_long of the load interpolant instruction.
2009-09-02  nv50: SWZ is the same as MOV from our perspective  Ben Skeggs
2009-08-17  nv50: whitespace fixes and deobfuscation  Maarten Maathuis
2009-08-15  nv50: align registers used with TEX to 4  Christoph Bumiller
The TEX instruction is passed the first index of a contiguous range of 4 TEMP registers that contain coordinates / LOD and, after execution, the texel values. It seems the first index is required to be a multiple of 4 on some (older?) cards.
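Rounding a register index up to the next multiple of 4 is a one-line bit trick. The helper below is a hypothetical illustration of that arithmetic only; it says nothing about how the driver's TEMP allocator is actually structured.

```c
#include <assert.h>
#include <limits.h>

/* Round a register index up to the next multiple of 4.
 * Hypothetical helper, not taken from nv50_program.c. */
static unsigned align_up_4(unsigned idx)
{
    assert(idx <= UINT_MAX - 3u); /* avoid wrap-around */
    return (idx + 3u) & ~3u;
}
```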
2009-08-14  nv50: fix typo in REALLOC's 2nd argument in ctor_immd  Christoph Bumiller
2009-07-22  gallium: simplify tgsi_full_immediate struct  Keith Whitwell
Remove the need to have a pointer in this struct by just including the immediate data inline. Having a pointer in the struct introduces complications like needing to alloc/free the data pointed to, uncertainty about who owns the data, etc. There doesn't seem to be a need for it, and it is unlikely to make much difference plus or minus to performance. Added some asserts as we now will trip up on immediates with more than four elements. There were actually already quite a few such asserts, but the >4 case could be used in the future to specify indexable immediate ranges, such as lookup tables.
2009-06-05  nouveau: remove unneeded code from ws, use pipe_buffer_ instead of ws->  Ben Skeggs
2009-06-05  nouveau: move channel creation into pipe drivers  Ben Skeggs
2009-06-05  nouveau: call notifier/grobj etc funcs directly  Ben Skeggs
libdrm_nouveau is linked with the winsys, there's no good reason to do all this through yet another layer.
2009-06-05  nouveau: pass nouveau_bo instead of pipe_buffer to so_ calls  Ben Skeggs
2009-05-28  nv50: negate sources directly where supported  Christoph Bumiller
2009-05-28  nv50: introduce emit_cvt and use it  Christoph Bumiller
This makes some code cleaner, and we can now easily do CEIL and TRUNC.
2009-05-28  nv50: fix TXP  Christoph Bumiller
For TXP we need to divide texture coords by their w component, or use the coords' 1/w in the perspective interpolation instruction. This also tries to support 1D, 3D and CUBE textures, and lets the instruction only load the components that are used.
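The projective divide itself is simple arithmetic. The sketch below shows it on plain floats in software, as an illustration of what the emitted shader code has to compute; the function name is hypothetical, and the driver does this with hardware instructions rather than C.

```c
#include <assert.h>

/* Divide the first three texture coordinates by the w component.
 * One reciprocal is computed and reused, mirroring an RCP followed by
 * MULs. Hypothetical illustration only. */
static void txp_project(const float coord[4], float out[3])
{
    float rcp_w;

    assert(coord[3] != 0.0f); /* w == 0 is not handled in this sketch */
    rcp_w = 1.0f / coord[3];
    for (int i = 0; i < 3; i++)
        out[i] = coord[i] * rcp_w;
}
```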
2009-05-28  nv50: use multiple constant buffers  Christoph Bumiller
Use different buffers for immds, FP params, and VP params. One has to map constant buffer indices in shader code to buffers defined via CB_DEF. In principle, we could use more buffers so we'd have to change the shader code less frequently.
2009-05-28  nv50: don't look for unfreed temps in free_nv50_pc  Christoph Bumiller
Since we stopped using alloc_temp to get hw indices for FP attrs there shouldn't be any non-deallocated temps left.
2009-05-28  nv50: release hw TEMPs early  Christoph Bumiller
Since we know when we don't use a TEMP or FP ATTR register anymore, we can release their hw resources early.
2009-05-28  nv50: allow immediates for MOV, ADD and MUL  Christoph Bumiller
Immediates are inlined now where possible, so we need to set pc->allow32 to FALSE in LIT where we have the conditional MOV, since immediates swallow the predicate bits.
2009-05-28  nv50: enable half insns for MOV and MUL  Christoph Bumiller
2009-05-28  nv50: make sure half-long insns are paired  Christoph Bumiller
I chose to just convert unpaired 32 bit length instructions after parsing all instructions, although it might be possible to determine beforehand whether there would be any lone ones, and then even do some swapping to bring them together ...
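A post-pass of this kind might look like the sketch below. It assumes each instruction is either 32 or 64 bits long and that 32-bit instructions are only valid in adjacent pairs; the flag array and function name are hypothetical stand-ins for the driver's own instruction list. Any 32-bit instruction without a 32-bit neighbour to pair with is widened.

```c
#include <assert.h>

/* is_long[i] is nonzero for a 64-bit instruction, zero for a 32-bit one.
 * Walk the program once; keep adjacent 32-bit instructions as pairs and
 * convert any lone 32-bit instruction to the long form.
 * Returns the number of instructions converted.
 * Hypothetical sketch, not the driver's actual data structures. */
static int pair_short_insns(int is_long[], int count)
{
    int converted = 0;
    int i = 0;

    assert(count >= 0);
    while (i < count) {
        if (is_long[i]) {
            i++;                 /* already 64 bits wide */
        } else if (i + 1 < count && !is_long[i + 1]) {
            i += 2;              /* two short insns form a pair */
        } else {
            is_long[i] = 1;      /* lone short insn: widen it */
            converted++;
            i++;
        }
    }
    return converted;
}
```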
2009-05-28  nv50: enable KIL in register 19a8  Christoph Bumiller
2009-05-28  nv50: don't overwrite sources before they're used  Christoph Bumiller
This would have happened in e.g. ADD TEMP[0], TEMP[0].xyxy, TEMP[1] or RCP/RSQ TEMP[i], TEMP[i].
2009-05-28  nv50: put FP outputs where they belong  Christoph Bumiller
Depth output in fragment programs should end up in the first register after the color outputs.
2009-05-28  nv50: modified FP attribute loading  Christoph Bumiller
VP outputs that should be loadable in the FP are mapped to interpolant indices by HPOS, COL0 etc.; of course HPOS is always written, so the highest byte of 1988 is a bitmask that selects which components of HPOS are used for interpolants, i.e. the FP inputs in COL0 start at index POPCNT(1988[24:28]).
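The index computation can be sketched as follows. This reads the notation 1988[24:28] as the four bits 24 to 27 (one per HPOS component), which is an assumption; the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Given the value written to register 1988, return the interpolant
 * index at which the COL0 inputs start: the number of HPOS components
 * enabled in the mask. Assumes the mask occupies bits 24..27.
 * Hypothetical helper, for illustration only. */
static unsigned col0_base_index(uint32_t reg_1988)
{
    uint32_t mask = (reg_1988 >> 24) & 0xfu;
    unsigned count = 0;

    while (mask) {
        count += mask & 1u;
        mask >>= 1;
    }
    assert(count <= 4); /* HPOS has at most four components */
    return count;
}
```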
2009-05-28  nv50: inspect decl semantic and interpolation mode  Christoph Bumiller
Record interpolation mode for attributes while parsing declarations, and also remember the indices of FP color inputs and FP depth output, which has to end up in the highest output register.