| Age | Commit message (Collapse) | Author | 
|---|
|  |  | 
|  | Besides meaning x86 and x86-64 architecture, it also depends on SSE2
support enabled on gcc.
This fixes the linux-debug build. | 
|  | The assertion failed when we ran out of exec memory.
Found with conform texcombine test. | 
|  |  | 
|  |  | 
|  |  | 
|  | This set of code changes are for stencil code generation
support.  Both one-sided and two-sided stenciling are supported.
In addition to the raw code generation changes, these changes had
to be made elsewhere in the system:
- Added new "register set" feature to the SPE assembly generation.
  A "register set" is a way to allocate multiple registers and free
  them all at the same time, delegating register allocation management
  to the spe_function unit.  It's quite useful in complex register
  allocation schemes (like stenciling).
- Added and improved SPE macro calculations.
  These are operations between registers and unsigned integer
  immediates.  In many cases, the calculation can be performed
  with a single instruction; the macros will generate the
  single instruction if possible, or generate a register load
  and register-to-register operation if not.  These macro
  functions are: spe_load_uint() (which has new ways to
  load a value in a single instruction), spe_and_uint(),
  spe_xor_uint(), spe_compare_equal_uint(), and spe_compare_greater_uint().
- Added facing to fragment generation.  While rendering, the rasterizer
  needs to be able to determine front- and back-facing fragments, in order
  to correctly apply two-sided stencil.  That requires these changes:
  - Added front_winding field to the cell_command_render block, so that
    the state tracker could communicate to the rasterizer what it
    considered to be the front-facing direction.
  - Added fragment facing as an input to the fragment function.
  - Calculated facing is passed during emit_quad(). | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | compile again. | 
|  |  | 
|  | - Use a lookup table for log2.
- Compute (float) (1 << ipart) by tweaking with the exponent directly to
avoid integer overflow and float conversion.
- Also table negative exponents to avoid float division and branching.
- Implement util_fast_exp as function of util_fast_exp2. | 
|  | Special care must be taken when calling compiler generated SSE2 functions
from the runtime generated SSE2: saving the xmm registers, and notify gcc
the stack is not 16byte aligned.
It would be more efficient to keep the stack pointer 16byte aligned, but
too hairy, and not consistent in all x86 architectures.
This has been tested in linux x86 and windows x86 userspace. Not tested on
x86-64 because it is broken for other reasons (even without this change). | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | into gallium-0.2 | 
|  |  | 
|  | - rtasm_ppc_spe.c, rtasm_ppc_spe.h: added a new macro function
  "spe_load_uint" for loading and splatting unsigned integers
  in a register; it will use "ila" for values 18 bits or less,
  "ilh" for word values that are symmetric across halfwords,
  "ilhu" for values that have zeroes in their bottom halfwords,
  or "ilhu" followed by "iohl" for general 32-bit values.
  Of the 15 color masks of interest, 4 are 18 bits or less,
  2 are symmetric across halfwords, 3 are zero in the bottom
  halfword, and 6 require two instructions to load.
- cell_gen_fragment.c: added full codegen for logic op and
  color mask. | 
|  | Conflicts:
	src/mesa/shader/slang/slang_link.c | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | - Added new "macro" functions spe_float_min() and spe_float_max()
  to rtasm_ppc_spe.{ch}.  These emit instructions that cause
  the minimum or maximum of each element in a vector of floats
  to be saved in the destination register.
- Major changes to cell_gen_fragment.c to implement all the blending
  modes (except for the mysterious D3D-based PIPE_BLENDFACTOR_SRC1_COLOR,
  PIPE_BLENDFACTOR_SRC1_ALPHA, PIPE_BLENDFACTOR_INV_SRC1_COLOR, and
  PIPE_BLENDFACTOR_INV_SRC1_ALPHA).
- Some revamping of code in cell_gen_fragment.c: use the new spe_float_min()
  and spe_float_max() functions (instead of expanding these calculations
  inline via macros); create and use an inline utility function for handling
  "optional" register allocation (for the {1,1,1,1} vector, and the
  blend color vectors) instead of expanding with macros; use the Float
  Multiply and Subtract (fnms) instruction to simplify and optimize many
  blending calculations. |