path: root/src/gallium/drivers/nv50/nv50_program.c
Age | Commit message | Author
2009-11-04 | nv50: fix shader emit_tex for cube textures | Christoph Bumiller
2009-11-04 | nv50: add abs-modifier for emit_minmax | Christoph Bumiller
2009-11-01 | nv50: handle TGSI_SEMANTIC_FACE | Christoph Bumiller
2009-11-01 | nv50: make IF condition safe | Christoph Bumiller
Don't assume that a SET that writes to IF's argument directly precedes the IF.
2009-11-01 | nv50: implement TGSI_OPCODE_AND/OR/XOR | Christoph Bumiller
Will use AND for gl_FrontFacing, the face input is either 0 or 0xffffffff.
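A minimal sketch of why a bitwise AND is enough here, assuming the face input really is either 0 or 0xffffffff as the message says. ANDing that mask with the bit pattern of 1.0f yields a float that is exactly 1.0 or 0.0. The helper name `face_mask_to_float` is hypothetical and does not appear in nv50_program.c; whether the driver uses this exact constant is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: face_mask is assumed to be 0 or 0xffffffff. */
static float face_mask_to_float(uint32_t face_mask)
{
    const uint32_t one_bits = 0x3f800000u; /* IEEE-754 single-precision 1.0f */
    uint32_t bits;
    float result;

    assert(face_mask == 0u || face_mask == 0xffffffffu);

    /* All-ones keeps the bits of 1.0f, all-zeros clears them to 0.0f. */
    bits = face_mask & one_bits;
    memcpy(&result, &bits, sizeof result);
    return result;
}
```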
2009-10-31 | nv50: use SIFC also for shader upload | Christoph Bumiller
Adds a more generic SIFC transfer function.
2009-10-31 | nv50: make MRTs work | Christoph Bumiller
We have to indicate to the hw whether the FP exports multiple colour results. Method 0x121c is used to specify the number of RTs. Also deactivate zeta explicitly if there's no zsbuf.
2009-10-23 | nv50: do SIGN_SET as one instruction | Christoph Bumiller
2009-10-23 | nv50: fix saturation outside of tx_insn case | Christoph Bumiller
2009-10-23 | nv50: allow all 127 TEMP regs | Christoph Bumiller
We should really learn to not waste so many though.
2009-10-23 | nv50: fix address reg code | Christoph Bumiller
Contained some rather obvious thinking errors before, and didn't consider offsets from TGSI ADDRESS regs.
2009-10-23 | gallium: remove the swizzling parts of ExtSwizzle | Keith Whitwell
These haven't been used by the mesa state tracker since the conversion to tgsi_ureg, and it seems that none of the other state trackers are using them either. This helps simplify one of the biggest surprises when starting off with TGSI shaders.
2009-10-19 | nv50: add support for address regs | Christoph Bumiller
Allow indirect uniform access and increase the limit on parameters from 128 to 512.
2009-10-19 | nv50: cleanup emit_kil | Christoph Bumiller
2009-10-19 | nv50: implement TGSI_OPCODE_CMP | Christoph Bumiller
2009-10-19 | nv50: quick fix for insn src negation | Christoph Bumiller
We only have a per nv50_reg negation flag, if an nv50_reg is used more than once in a TGSI op with different sign modes, we'd generate wrong code. We probably can't do much better without more invasive changes.
2009-10-19 | nv50: add support for DDX and DDY opcodes | Christoph Bumiller
2009-09-25 | nv50: fix TEX for WriteMask not equal 0xf | Christoph Bumiller
If you e.g. only need alpha, it ends up in the first reg, not the last, as it would when reading rgb too.
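The packing described above can be sketched as follows. This assumes that the results of enabled components are written to consecutive registers starting at the first one, which generalizes the alpha-only case from the message; the behaviour for other masks is an assumption, and `tex_result_slot` is a hypothetical name.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns the offset (0..3) from the first TEX
 * register at which component c lands for a given write mask, or -1 if
 * the component is not written. Bit 0 of mask is x, bit 3 is w.
 */
static int tex_result_slot(unsigned mask, int c)
{
    int slot = 0;
    int i;

    assert(c >= 0 && c < 4);

    if (!(mask & (1u << c)))
        return -1;

    /* Each enabled component below c occupies one earlier register. */
    for (i = 0; i < c; i++)
        if (mask & (1u << i))
            slot++;
    return slot;
}
```

With a full mask of 0xf, alpha sits at offset 3; with a mask of 0x8 it sits at offset 0.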
2009-09-25 | nv50: RCP and RSQ cannot load from VP inputs | Christoph Bumiller
2009-09-25 | nv50: fix CEIL and TRUNC | Christoph Bumiller
Separated the integer rounding mode flag for cvt.
2009-09-25 | nv50: implement BGNLOOP, BRK, ENDLOOP | Christoph Bumiller
There's a good chance a loop won't execute correctly though since our TEMP allocation assumes programs to be executed linearly. Will fix later.
2009-09-25 | nv50: implement IF, ELSE, ENDIF opcodes | Christoph Bumiller
2009-09-15 | nv50: fix stupid thinko in emit_set | Christoph Bumiller
When swapping sources 0 and 1, EQ of course does *not* become NE, etc. Introduced in 2b963f5c723401aa2646bd48eefe065cd335e280.
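The rule behind this fix can be written down in a few lines. Swapping the operands of a comparison mirrors the ordering tests (LT becomes GT, LE becomes GE) but leaves EQ and NE unchanged, since equality is symmetric. The enum and function names below are hypothetical and are not the driver's own condition codes.

```c
#include <assert.h>

/* Hypothetical condition codes, not the hardware encoding. */
enum cmp_op { CMP_LT, CMP_LE, CMP_EQ, CMP_NE, CMP_GE, CMP_GT };

/* Returns the op that gives the same result once a and b are exchanged. */
static enum cmp_op swap_cmp_operands(enum cmp_op op)
{
    switch (op) {
    case CMP_LT: return CMP_GT;
    case CMP_LE: return CMP_GE;
    case CMP_GE: return CMP_LE;
    case CMP_GT: return CMP_LT;
    case CMP_EQ: return CMP_EQ; /* symmetric, must not turn into NE */
    case CMP_NE: return CMP_NE; /* symmetric, must not turn into EQ */
    }
    assert(!"unknown comparison op");
    return op;
}

/* Reference evaluation, used only to check the mapping. */
static int eval_cmp(enum cmp_op op, float a, float b)
{
    switch (op) {
    case CMP_LT: return a < b;
    case CMP_LE: return a <= b;
    case CMP_EQ: return a == b;
    case CMP_NE: return a != b;
    case CMP_GE: return a >= b;
    case CMP_GT: return a > b;
    }
    return 0;
}
```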
2009-09-15 | nv50: let programs use the whole param buffer | Christoph Bumiller
Allocation is unnecessary since all uniforms are uploaded on every constant buffer change anyway.
2009-09-15 | nv50: add preliminary support for point sprites | Christoph Bumiller
2009-09-15 | nv50: add support for point size per vertex | Christoph Bumiller
2009-09-15 | nv50: add support for light-twoside | Christoph Bumiller
2009-09-15 | nv50: proper linkage between VP and FP | Christoph Bumiller
This moves construction of the mapping between VP outputs and FP inputs into validation. The map also contains slots for special outputs like clip distance and point size, so we need to at least merge the VP related and FP related parts on validation if we want to support those. Now we match every single FP input component with results from the VP and leave those not read out of the map, or replace those not written by 0 (xyz) or 1 (w). The bitmap indicating linear interpolants is also filled, and flat FP inputs are mapped in only after non-flat ones, as is required. Furthermore, we can save some space by only fetching VP attrs we actually use, and avoid wasting any output regs because of TGSI using less than 4 components.
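A rough sketch of the per-component matching described above, for a single attribute. It assumes a map entry is either a VP output register index or a marker for a constant; the names `link_fp_input`, `LINK_CONST_ZERO` and `LINK_CONST_ONE` are hypothetical, and the real map layout (special slots, flat inputs placed last) is not modelled here.

```c
#include <assert.h>

/* Hypothetical markers for components the VP does not write. */
enum { LINK_CONST_ZERO = -1, LINK_CONST_ONE = -2 };

/*
 * Fills map[] with one entry per FP input component that is actually
 * read and returns the number of entries. Bit 0 of each mask is x,
 * bit 3 is w.
 */
static int link_fp_input(unsigned vp_write_mask, const int vp_out_reg[4],
                         unsigned fp_read_mask, int map[4])
{
    int n = 0;
    int c;

    assert((vp_write_mask | fp_read_mask) <= 0xfu);

    for (c = 0; c < 4; c++) {
        if (!(fp_read_mask & (1u << c)))
            continue; /* not read by the FP: left out of the map */
        if (vp_write_mask & (1u << c))
            map[n++] = vp_out_reg[c];
        else
            map[n++] = (c == 3) ? LINK_CONST_ONE : LINK_CONST_ZERO;
    }
    return n;
}
```

For a VP that writes only .xy and an FP that reads .xyw, this gives three entries: the two VP registers followed by the constant 1 for w.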
2009-09-15 | nv50: move allocation of pc regs | Christoph Bumiller
Make use of tgsi_shader_info to determine how many nv50_regs we need to allocate, whether program uses KIL, or writes DEPR.
2009-09-15 | nv50: nicer initialization of nv50_regs | Christoph Bumiller
2009-09-15 | nv50: handle CEIL and TRUNC opcodes | Christoph Bumiller
2009-09-15 | nv50: handle SEQ, SGT, SLE, SNE opcodes | Christoph Bumiller
2009-09-15 | nv50: SIN and COS use src0.w for dst.w | Christoph Bumiller
2009-09-15 | nv50: use broadcast TEMP reg in tx_insn | Christoph Bumiller
Makes some opcode cases nicer and might reduce the total nr of TEMPs required, or save some MOVs.
2009-09-15 | nv50: add nv50_tgsi_insn to handle swizzles safely | Christoph Bumiller
2009-09-15 | nv50: add functions for swizzle resolution | Christoph Bumiller
We're going to try to reorder the scalar ops of a vector instr to accommodate swizzles that would otherwise require us to emit to an additional TEMP first (like MOV R0.xy, R0.zx).
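One way to picture the reordering, as a sketch and not the driver's actual algorithm: when source and destination are the same register, the scalar op that writes component c must be emitted only after every other pending op that still reads c. The function `order_scalar_ops` is hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper. mask selects the destination components (bit 0
 * is x), swz[c] is the source component read by the op writing c.
 * Writes a safe emission order to order[] and returns the op count,
 * or -1 if the dependencies form a cycle and a TEMP is needed.
 */
static int order_scalar_ops(unsigned mask, const int swz[4], int order[4])
{
    unsigned done = 0;
    int total = 0, n = 0;
    int c, d;

    assert(mask <= 0xfu);
    for (c = 0; c < 4; c++)
        if (mask & (1u << c))
            total++;

    while (n < total) {
        int progressed = 0;
        for (c = 0; c < 4; c++) {
            int blocked = 0;
            if (!(mask & (1u << c)) || (done & (1u << c)))
                continue;
            /* Writing c is unsafe while another pending op reads c. */
            for (d = 0; d < 4; d++)
                if (d != c && (mask & (1u << d)) && !(done & (1u << d)) &&
                    swz[d] == c)
                    blocked = 1;
            if (!blocked) {
                order[n++] = c;
                done |= 1u << c;
                progressed = 1;
            }
        }
        if (!progressed)
            return -1; /* e.g. MOV R0.xy, R0.yx */
    }
    return n;
}
```

For MOV R0.xy, R0.zx the y component is emitted first (it reads the old x), then x, so no extra TEMP is required. A pure swap such as MOV R0.xy, R0.yx cannot be ordered this way and returns -1.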
2009-09-15 | nv50: extend insn src mask function | Christoph Bumiller
Extend its usage to avoiding e.g. emission of negation instructions in tx_insn for sources we don't need.
2009-09-03 | nv50: move centroid, flat bits when making interp long | Christoph Bumiller
Before this, just the perspective divide bit was moved in convert_to_long of the load interpolant instruction.
2009-09-02 | nv50: SWZ is the same as MOV from our perspective | Ben Skeggs
2009-08-17 | nv50: whitespace fixes and deobfuscation | Maarten Maathuis
2009-08-15 | nv50: align registers used with TEX to 4 | Christoph Bumiller
The TEX instruction is passed the first index of a contiguous range of 4 TEMP registers that contain coordinates / LOD and, after execution, the texel values. It seems the first index is required to be a multiple of 4 on some (older ?) cards.
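The alignment requirement amounts to rounding the first register index up to the next multiple of 4. A minimal sketch, with the hypothetical name `align_tex_base`; how the driver's TEMP allocator applies this is not shown in the message.

```c
#include <assert.h>

/* Rounds a register index up to the next multiple of 4 (hypothetical helper). */
static unsigned align_tex_base(unsigned first_free)
{
    unsigned base = (first_free + 3u) & ~3u;

    assert(base % 4u == 0u && base >= first_free);
    return base;
}
```

For example, if the first free TEMP is 5, the TEX range would start at 8 and cover registers 8 to 11.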
2009-08-14 | nv50: fix typo in REALLOC's 2nd argument in ctor_immd | Christoph Bumiller
2009-07-22 | gallium: simplify tgsi_full_immediate struct | Keith Whitwell
Remove the need to have a pointer in this struct by just including the immediate data inline. Having a pointer in the struct introduces complications like needing to alloc/free the data pointed to, uncertainty about who owns the data, etc. There doesn't seem to be a need for it, and it is unlikely to make much difference plus or minus to performance. Added some asserts as we now will trip up on immediates with more than four elements. There were actually already quite a few such asserts, but the >4 case could be used in the future to specify indexable immediate ranges, such as lookup tables.
2009-06-05 | nouveau: remove unneeded code from ws, use pipe_buffer_ instead of ws-> | Ben Skeggs
2009-06-05 | nouveau: move channel creation into pipe drivers | Ben Skeggs
2009-06-05 | nouveau: call notifier/grobj etc funcs directly | Ben Skeggs
libdrm_nouveau is linked with the winsys, there's no good reason to do all this through yet another layer.
2009-06-05 | nouveau: pass nouveau_bo instead of pipe_buffer to so_ calls | Ben Skeggs
2009-05-28 | nv50: negate sources directly where supported | Christoph Bumiller
2009-05-28 | nv50: introduce emit_cvt and use it | Christoph Bumiller
This makes some code cleaner, and we can now easily do CEIL and TRUNC.
2009-05-28 | nv50: fix TXP | Christoph Bumiller
For TXP we need to divide texture coords by their w component, or use the coords' 1/w in the perspective interpolation instruction. This also tries to support 1D, 3D and CUBE textures, and lets the instruction only load the components that are used.
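The projective divide itself is simple arithmetic: compute 1/w once and multiply each coordinate that the texture target uses. The sketch below assumes 1 coordinate for 1D, 2 for 2D and 3 for 3D and CUBE; `txp_project` is a hypothetical name and this is CPU-side illustration, not the shader code the driver emits.

```c
#include <assert.h>

/*
 * Hypothetical helper: divides the first ncomp texture coordinates by
 * coord[3] (w). Only the components that are used are touched.
 */
static void txp_project(const float coord[4], int ncomp, float out[3])
{
    float rcp_w;
    int i;

    assert(ncomp >= 1 && ncomp <= 3);

    rcp_w = 1.0f / coord[3]; /* one reciprocal, reused for every component */
    for (i = 0; i < ncomp; i++)
        out[i] = coord[i] * rcp_w;
}
```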