Age | Commit message (Collapse) | Author |
|
This branch defines a gallivm_state structure which contains the
LLVMBuilderRef, LLVMContextRef, etc. All data structures built with
this object can be periodically freed during a "garbage collection"
operation.
The gallivm_state object has to be passed to most of the builder
functions where LLVMBuilderRef used to be used.
Conflicts:
src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
src/gallium/drivers/llvmpipe/lp_state_setup.c
|
|
The viewport state was being baked in at compile time (oops...)
|
|
|
|
Corrections in store_clip to store clip coordinates in AoS form.
Viewport & cliptest flag options based on variant key.
Put back draw_pt_post_vs and now 2 paths based on whether clipping
occurs or not.
|
|
|
|
|
|
Triangle strip alternates the front/back orientation of its triangles.
max_vertices was made even so that varray never splitted a triangle
strip at the wrong positions.
It did not work with triangle strips with adjacencies. And it is no
longer relevant with vsplit.
|
|
The higher bits of draw elements are no longer used for the stipple or
edge flags.
|
|
Update the middle end interface to pass the primitive flags from the
frontends to the pipeline. No frontend sets the flags yet.
|
|
A primitive may be splitted in frontends. The splitted primitives
should convey certain flag bits so that the decomposer can correctly
decide the stipple or edge flags.
This commit adds flags to draw_prim_info and updates the decomposer to
honor the flags. Frontends and middle ends will be updated later.
|
|
Plumb the constant buffer sizes down into the tgsi interpreter where
we can do bounds checking. Optional debug code warns upon out-of-bounds
reading. Plus add a few other assertions in the TGSI interpreter.
|
|
|
|
fixes instancing in draw llvm
|
|
|
|
we used to create and cache unltimited number of variant, this
change limits the number of variants kept around to a fixed number.
the change is based on a similar patch by Roland for llvmpipe fragment
shaders.
|
|
|
|
We were previously calculating a value which was either the geometry
shader output primitive or the application's input primitive, and
passing that to the various front/middle/back components for use as
the ultimate rendering primtive.
Unfortunately, this was not correct -- if the vcache decomposition
path is active and geometry shaders are *not* active, we can end up
with a third primitive -- specifically the decomposed version of the
input primitive.
Rather than trying to precalculate this, just let the individual
components inform their successors about which primitive type they are
recieving.
|
|
register masks, multiple output buffers, multiple primitives,
non-linear vertices (elts) and stride semantics.
|
|
Keith came up with a new way of running the pipeline which involves passing
a few info structs around (for fetch, vertices and prims) and allows us
to correctly handle cases where we endup with multiple primitives generated
by the pipeline itself.
|
|
of vertices
lots and lots of fixes for geometry shaders. in particular now we work when the gs
emits a different primitive than the one the pipeline was started with and also
we work when gs emits more vertices than would fit in the original buffer.
|
|
interface wise we have everything needed by d3d10 and gl transform feedback.
the draw module misses implementation of some corner cases (e.g. when stream
output wants different number of components per output than normal rendering
paths)
|
|
|
|
Unneeded since we code generate the vertex fecthes.
|
|
|
|
we were only running the llvm paths when the input elts were linear,
now we can handle abritrary fetch elts arrays. we do this by generating
two paths - linear and fetch_elts one and just selecting the right one
at run time.
|
|
we were only using the jited function in the linear case, now drawelts
correctly uses the same path. it results in a significant gain in
real world apps (openarena went from 23fps to 29fps)
|
|
our code resets pipe_vertex_buffer's with different offsets when rendering
vbo, meaning that we kept creating insane number of shaders even for simple
apps e.g. geartrain had 54 shaders and it was taking almost 27 seconds just to
compile them. this patch passes pipe_vertex_buffer's to the jit function and lets
it to the stride/buffer_offset computation at run time. the slowdown at runtime
is largely unnoticable but the we go from 54 shaders to 3, and from 27 seconds to less
than 1.
|
|
|
|
|
|
|
|
we don't index within the outputs but only within the inputs
|
|
the loop was doing a NE comparison which we could have skipped if the prim
was triangles (3 verts) and our step was 4 verts. also fix offsets in conversion
to aos.
|
|
there's no good way of aligning the output's, and since the vertex_header
is variable sized in the first place we need to extract elements from a vector
and store them individually into an array. this gets the basic examples working
again
|
|
|
|
|
|
the values passed are still not right, but the general scheme
is looking good.
|
|
code generate big chunks of the vertex pipeline in order to speed up
software vertex processing.
|