Age  Commit message  Author
2010-09-13loop_unroll: unroll loops with (lowered) breaksLuca Barbieri
If the loop ends with an if with one break or in a single break unroll it. Loops that end with a continue will have that continue removed by the redundant jump optimizer. Likewise loops that end with an if-statement with a break at the end of both branches will have the break pulled out after the if-statement.

Loops of the form

   for (...) {
      do_something1();
      if (cond) {
         do_something2();
         break;
      } else {
         do_something3();
      }
   }

will be unrolled as

   do_something1();
   if (cond) {
      do_something2();
   } else {
      do_something3();
      do_something1();
      if (cond) {
         do_something2();
      } else {
         do_something3();
         /* Repeat inserting iterations here.*/
      }
   }

ir_lower_jumps can guarantee that all loops are put in this form and thus all loops are now potentially unrollable if an upper bound on the number of iterations can be found.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
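To illustrate why the nested form is equivalent, the sketch below writes a loop of the shape described above next to a hand-unrolled version. It is not code from Mesa: the function names, the integer accumulator standing in for the do_something calls, and the bound of three iterations are all assumptions made for the example.

```c
/* Hypothetical upper bound on the trip count; the real pass takes it
 * from loop analysis rather than from a constant like this. */
#define MAX_ITERATIONS 3

/* Original shape: a body followed by an if whose "then" branch ends in a break. */
int run_loop(const int *cond)
{
    int acc = 0;
    for (int i = 0; i < MAX_ITERATIONS; i++) {
        acc += 1;           /* stands in for do_something1() */
        if (cond[i]) {
            acc += 100;     /* stands in for do_something2() */
            break;
        } else {
            acc += 10;      /* stands in for do_something3() */
        }
    }
    return acc;
}

/* Unrolled by hand: each further iteration is nested inside the else
 * branch of the previous one, so no break is needed. */
int run_unrolled(const int *cond)
{
    int acc = 0;
    acc += 1;
    if (cond[0]) {
        acc += 100;
    } else {
        acc += 10;
        acc += 1;
        if (cond[1]) {
            acc += 100;
        } else {
            acc += 10;
            acc += 1;
            if (cond[2]) {
                acc += 100;
            } else {
                acc += 10;
            }
        }
    }
    return acc;
}
```

Both functions return the same value for every combination of the three conditions, which is the property the unroller relies on.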
2010-09-13glsl2: Add pass to remove redundant jumpsIan Romanick
2010-09-13glsl: Explain file naming conventionIan Romanick
2010-09-13loop_controls: fix analysis of already analyzed loopsLuca Barbieri
The loop_controls pass didn't look at the counter values it put in ir_loop on previous iterations, so while the first iteration worked, subsequent ones couldn't determine max_iterations.
2010-09-13i965: Request that returns be lowered in shader mainIan Romanick
Fixes piglit tests glsl-vs-main-return and glsl-fs-main-return.
2010-09-13glsl: call ir_lower_jumps according to compiler optionsLuca Barbieri
2010-09-13glsl: add continue/break/return unification/elimination pass (v2)Luca Barbieri
Changes in v2:
- Base class renamed to ir_control_flow_visitor
- Tried to comply with coding style

This is a new pass that supersedes ir_if_return and "lowers" jumps to if/else structures.

Currently it causes no regressions on softpipe and nv40, but I'm not sure whether the piglit glsl tests are thorough enough, so consider this experimental.

It can be asked to:
1. Pull jumps out of ifs where possible
2. Remove all "continue"s, replacing them with an "execute flag"
3. Replace all "break"s with a single conditional one at the end of the loop
4. Replace all "return"s with a single return at the end of the function, for the main function and/or other functions

This gives several great benefits:
1. All functions can be inlined after this pass
2. nv40 and other pre-DX10 chips without "continue" can be supported
3. nv30 and other pre-DX10 chips with no control flow at all are better supported

Note that for full effect we should also teach the unroller to unroll loops with a fixed maximum number of iterations but with the canonical conditional "break" that this pass will insert if asked to.

Continues are lowered by adding a per-loop "execute flag", initialized to TRUE, that when cleared inhibits all execution until the end of the loop.

Breaks are lowered to continues, plus setting a "break flag" that is checked at the end of the loop and triggers the unique "break".

Returns are lowered to breaks/continues, plus adding a "return flag" that causes loops to break again out of their enclosing loops until all the loops are exited: then the "execute flag" logic will ignore everything until the end of the function.

Note that "continue" and "return" can also be implemented by adding a dummy loop and using break. However, this is bad for hardware with limited nesting depth, and prevents further optimization, and thus is not currently performed.
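The flag-based lowering of "continue" and "break" can be sketched by hand in C. This is only an illustration of the idea in the message above, not output of the pass: the functions and the summing example are hypothetical, and resetting the flags at the top of each iteration is an assumption of this sketch.

```c
#include <stdbool.h>

/* Original loop: skips negative values with "continue" and stops at the
 * first zero with "break". */
int sum_original(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] < 0)
            continue;
        if (v[i] == 0)
            break;
        sum += v[i];
    }
    return sum;
}

/* Hand-lowered version: no "continue", and one conditional "break" at the
 * end of the loop body. */
int sum_lowered(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        bool execute_flag = true;   /* cleared to inhibit the rest of the body */
        bool break_flag = false;    /* checked once, at the end of the body */

        if (v[i] < 0)
            execute_flag = false;   /* was: continue */

        if (execute_flag) {
            if (v[i] == 0) {
                break_flag = true;  /* was: break */
                execute_flag = false;
            }
        }

        if (execute_flag)
            sum += v[i];

        if (break_flag)
            break;                  /* the single remaining break */
    }
    return sum;
}
```

For an input such as {1, -2, 3, 0, 5} both versions add 1 and 3, skip -2, and stop at the zero.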
2010-09-13glsl: add ir_control_flow_visitorLuca Barbieri
This is just a subclass of ir_visitor with empty implementations of all the visit methods for non-control flow nodes. Used to avoid duplicating that in ir_visitor subclasses. ir_hierarchical_visitor is another way to solve this, but is less natural for some applications.
2010-09-13llvmpipe: Fix non SSE2 builds.José Fonseca
Should fix fdo 30168.
2010-09-13r300g/swtcl: unlock VBO after draw_flushMarek Olšák
https://bugs.freedesktop.org/show_bug.cgi?id=29901
https://bugs.freedesktop.org/show_bug.cgi?id=30132
2010-09-13llvmpipe: Change asm to __asm__.Witold Baryluk
According to the gcc documentation the two are equivalent; the second is preferred because the first can conflict with existing symbols.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
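A minimal sketch of the spelling in question, assuming GCC or a compatible compiler. In strict ISO modes such as -std=c99, GCC does not treat `asm` as a keyword, while `__asm__` is always accepted. The `compiler_barrier` and `store_then_load` helpers are hypothetical and are not taken from llvmpipe.

```c
/* An empty inline-assembly statement with a "memory" clobber acts as a
 * compiler-level barrier: the compiler may not reorder memory accesses
 * across it. Spelled __asm__ so it also builds with -std=c99. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* Stores a value, forces the compiler to treat memory as changed,
 * then reads the value back. */
int store_then_load(int *p, int value)
{
    *p = value;
    compiler_barrier();
    return *p;
}
```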
2010-09-13EGL DRI2: 0xa011 is Pineview not IronlakeJesse Barnes
Point about needing a better way to do this validated.
2010-09-13r600c: const buffer sizes must be a multiple of 16 constsAlex Deucher
This applies to r6xx/r7xx/evergreen
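One common way to satisfy a multiple-of-16 requirement is to round the count up with a mask. The helper below is a hypothetical sketch of that arithmetic, not the code from the r600c driver.

```c
/* Round a constant count up to the next multiple of 16.
 * The mask trick works because 16 is a power of two. */
unsigned align_const_count(unsigned num_consts)
{
    return (num_consts + 15u) & ~15u;
}
```

With this helper, a count of 17 becomes 32, while a count that is already a multiple of 16 is left unchanged.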
2010-09-13EGL DRI2: add PCI ID for Ironlake mobileJesse Barnes
Allows KMS EGL driver to load. We need a better way of doing this.
2010-09-13r600c/eg: remove obsolete commentAlex Deucher
2010-09-13r600c/eg: remove unused emit timestamp functionAlex Deucher
2010-09-13r600c/eg: emit CB_BLEND_ALPHA with the other blend valuesAlex Deucher
saves a few dwords
2010-09-13r600c: remove redundant state emit on evergreenAlex Deucher
r700start3d already emits the context control packets
2010-09-13nv50: fix TXP depth comparison valueChristoph Bumiller
2010-09-13nv50: fix indirect CONST access with large or negative offsetsChristoph Bumiller
2010-09-13nv50: MOV TEMP[0], -CONST[0] must be float32 negationChristoph Bumiller
2010-09-13nv50: interp cannot write flags regChristoph Bumiller
2010-09-13nv50: check for immediates when turning MUL ADD into MADChristoph Bumiller
2010-09-13nv50: handle TGSI EXP and LOG againChristoph Bumiller
2010-09-13mesa: Revert accidentally committed vertex code chunkKristian Høgsberg
2010-09-13r600c: eg: fix typoAndre Maasikas
probably copy/paste error
2010-09-13r600c: eg: 256 float4 constants may need more than 256 bytesAndre Maasikas
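The arithmetic behind the subject line, as a small sketch: a float4 constant is four floats, so 256 of them take far more than 256 bytes. The helper name is hypothetical, and the 4096-byte figure assumes a 4-byte float, which holds on common platforms.

```c
#include <stddef.h>

/* Bytes needed for `count` float4 constants, where each constant is a
 * vector of four floats. */
size_t float4_consts_size(size_t count)
{
    return count * 4 * sizeof(float);
}
```

With a 4-byte float, 256 constants need 256 * 4 * 4 = 4096 bytes.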
2010-09-13r600c: eg - fix uninitialized variableAndre Maasikas
2010-09-13glx: Don't destroy DRI2 drawables for legacy glx drawablesKristian Høgsberg
For GLX 1.3 drawables, we can destroy the DRI2 drawable when the GLX drawable is destroyed. However, for legacy drawables, there is no good way of knowing when the application is done with it, so we just let the DRI2 drawable linger on the server. The server will destroy the DRI2 drawable when it destroys the X drawable or the client exits anyway.

https://bugs.freedesktop.org/show_bug.cgi?id=30109
2010-09-13r300g: fix SWTCLMarek Olšák
https://bugs.freedesktop.org/show_bug.cgi?id=29901
2010-09-13llvmpipe: Unbreak rasterization on 64bit.José Fonseca
2010-09-13gallium: Change the resource_copy_region semantics to allow copies between different yet compatible formatsJosé Fonseca
2010-09-13r600g: evergreen fixup dsa state for running query.Dave Airlie
evergreen is always the same as r700 here.
2010-09-13r600c: remove stray unmap callAndre Maasikas
no idea how/why it got there
2010-09-13llvmpipe: use gcc asm only with gccJosé Fonseca
2010-09-13r300g: print unassigned FS inputs for DBG_RSMarek Olšák
2010-09-13r300g: fix map_bufferMarek Olšák
https://bugs.freedesktop.org/show_bug.cgi?id=30145
2010-09-13r300/compiler: fix warningsMarek Olšák
2010-09-13r300g: add new debug options for dumping scissor regs and disabling CBZB clearMarek Olšák
2010-09-13r300g: skip rendering if CS space validation failsMarek Olšák
radeon_cs_space_check flushes the pipe context on failure, retries the validation, and returns -1 if it fails again. At that point, there is nothing we can do, so let's skip draw operations instead of getting stuck in an infinite loop. This code path ideally should never be hit.
2010-09-13r300g: remove u_upload_flush from r300_draw_arraysMarek Olšák
This is probably a leftover and is unnecessary, since we flush u_upload_mgr in r300_flush.
2010-09-12nvfx: Remove unused variables.Vinson Lee
2010-09-12nvfx: Move declaration before code.Vinson Lee
Fixes SCons build.
2010-09-12llvmpipe: introduce tri_3_4 for tiny trianglesKeith Whitwell
2010-09-12llvmpipe: allow tri_3_16 at any 4-aligned location within a tileKeith Whitwell
Doesn't require 16-alignment, so catch more cases.
2010-09-12llvmpipe: refactor tri_3_16Keith Whitwell
Keep step array as a set of four m128i's and reuse throughout the rasterization.
2010-09-12llvmpipe: pass linear masks to fragment shaderKeith Whitwell
Fragment shader can extract the correct bits for each quad.
2010-09-12llvmpipe: fix warnings on both 32 and 64 bit buildsKeith Whitwell
2010-09-12llvmpipe: fix weird performance regression in isosurfKeith Whitwell
I really don't understand the mechanism behind this, but it seems that the way data blocks for a scene are malloced, in particular whether we treat them as a stack or a queue and whether we retain the most recently allocated or the least recently allocated block, has a real effect (~5%) on isosurf framerates. This is probably specific to my distro or even just my machine, but nonetheless it's nicer not to see the framerates go in the wrong direction.
2010-09-12nv50: match TEMP limit with nv50 ir builderChristoph Bumiller
Mesa doesn't respect it anyway, but this makes it assert rather than letting threads access areas of l[] that don't belong to them.