LLVMPIPE -- a fork of softpipe that employs LLVM for code generation.


Status
======

Done so far is:

 - the whole fragment pipeline is code generated in a single function
 
   - input interpolation
   
   - depth testing
 
   - texture sampling (not all state/formats are supported) 
   
   - fragment shader TGSI translation
     - same level of support as the TGSI SSE2 exec machine, with the exception
       we don't fallback to TGSI interpretation when an unsupported opcode is
       found, but just ignore it
     - done in SoA layout
     - input interpolation also code generated
 
   - alpha testing
 
   - blend (including logic ops)
     - both in SoA and AoS layouts, but only the former used for now
 
 - code is generic
   - intermediates can be vectors of floats, ubytes, fixed point, etc, and of
     any width and length
   - not all operations are implemented for these types yet though

Most mesa/progs/demos/* work. 

To do (probably by this order):

 - code generate stipple and stencil testing

 - translate the remaining bits of texture sampling state

 - translate TGSI control flow instructions, and all other remaining opcodes
 
 - integrate with the draw module for VS code generation

 - code generate the triangle setup and rasterization


Requirements
============

 - Linux
 
 - udis86, http://udis86.sourceforge.net/ . Use my repository, which decodes
   opcodes not yet supported by upstream.
 
     git clone git://people.freedesktop.org/~jrfonseca/udis86
     cd udis86
     ./configure --with-pic
     make
     sudo make install
 
 - LLVM 2.5. On Debian based distributions do:
 
     aptitude install llvm-dev

   There is a typo in one of the llvm-dev 2.5 headers, that causes compilation
   errors in the debug build:

     --- /usr/include/llvm-c/Core.h.orig	2009-08-10 15:38:54.000000000 +0100
     +++ /usr/include/llvm-c/Core.h	2009-08-10 15:38:25.000000000 +0100
     @@ -831,7 +831,7 @@
        template<typename T>
        inline T **unwrap(LLVMValueRef *Vals, unsigned Length) {
          #if DEBUG
     -    for (LLVMValueRef *I = Vals, E = Vals + Length; I != E; ++I)
     +    for (LLVMValueRef *I = Vals, *E = Vals + Length; I != E; ++I)
            cast<T>(*I);
          #endif
          return reinterpret_cast<T**>(Vals);
 
 - A x86 or amd64 processor with support for sse2, sse3, and sse4.1 SIMD
   instructions. This is necessary because we emit several SSE intrinsics for
   convenience. See /proc/cpuinfo to know what your CPU supports.
 
 - scons


Building
========

To build everything invoke scons as:

  scons debug=yes statetrackers=mesa drivers=llvmpipe winsys=xlib dri=false -k

Alternatively, you can build it with GNU make, if you prefer, by invoking it as

  make linux-llvm

but the rest of these instructions assume that scons is used.


Using
=====

Building will create a drop-in alternative for libGL.so. To use it set the
environment variables:

  export LD_LIBRARY_PATH=$PWD/build/linux-x86_64-debug/lib:$LD_LIBRARY_PATH

or

  export LD_LIBRARY_PATH=$PWD/build/linux-x86-debug/lib:$LD_LIBRARY_PATH

For performance evaluation pass debug=no to scons, and use the corresponding
lib directory without the "-debug" suffix.


Unit testing
============

Building will also create several unit tests in
build/linux-???-debug/gallium/drivers/llvmpipe:

 - lp_test_blend: blending
 - lp_test_conv: SIMD vector conversion
 - lp_test_format: pixel unpacking/packing

Some of this tests can output results and benchmarks to a tab-separated-file
for posterior analysis, e.g.:

  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv


Development Notes
=================

- When looking to this code by the first time start in lp_state_fs.c, and 
  then skim through the lp_bld_* functions called in there, and the comments
  at the top of the lp_bld_*.c functions.  

- All lp_bld_*.[ch] are isolated from the rest of the driver, and could/may be 
  put in a stand-alone Gallium state -> LLVM IR translation module.

- We use LLVM-C bindings for now. They are not documented, but follow the C++
  interfaces very closely, and appear to be complete enough for code
  generation. See 
  http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html
  for a stand-alone example.