summaryrefslogtreecommitdiff
path: root/src/mesa/drivers/dri/i965
AgeCommit message (Collapse)Author
2009-06-30i965: first attempt at handling URB overflow when there's too many vs outputsBrian Paul
If we can't fit all the VS outputs into the MRF, we need to overflow into temporary GRF registers, then use some MOVs and a second brw_urb_WRITE() instruction to place the overflow vertex results into the URB. This is hit when a vertex/fragment shader pair has a large number of varying variables (12 or more). There's still something broken here, but it seems close...
2009-06-30i965: use BRW_MAX_MRFBrian Paul
2009-06-30i965: use BRW_MAX_GRF, BRW_MAX_MRFBrian Paul
2009-06-30i965: move BRW_MAX_GRF, define BRW_MAX_MRFBrian Paul
2009-06-30i965: defined BRW_MAX_MRFBrian Paul
2009-06-30i965: comments and a new assertionBrian Paul
2009-06-29intel: Move note_unlock() implementation to the one place it's needed.Eric Anholt
2009-06-26i965: fix fetching constants from constant buffer in glsl pathRoland Scheidegger
the driver used to overwrite grf0 then use implicit move by send instruction to move contents of grf0 to mrf1. However, we must not overwrite grf0 since it's still used later for fb write. Instead, do the move directly do mrf1 (we could use implicit move from another grf reg to mrf1 but since we need a mov to encode the data anyway it doesn't seem to make sense). I think the dp_READ/WRITE_16 functions may suffer from the same issue. While here also remove unnecessary msg_reg_nr parameter from the dataport functions since always message register 1 is used.
2009-06-23i965: Set the max index buffer address correctly according to the docs.Eric Anholt
It's the last addressable byte, not the byte after the end of the buffer.
2009-06-23i965: Don't set a reserved bit in MI_FLUSH.Eric Anholt
I noticed this when this MI_FLUSH showed up in IPEHR for the ut2004 hang. Not setting the reserved bit didn't help, though.
2009-06-23i965: Fix packed depth/stencil textures to be Y-tiled as well.Eric Anholt
Fixes shadowtex.c. And an assert is added to catch this sooner next time.
2009-06-19intel: Also get the DRI2 front buffer when doing front buffer reading.Eric Anholt
2009-06-19intel: Update Mesa state before span setup in glReadPixels.Eric Anholt
We could have mapped the wrong set of draw buffers. Noticed while looking into a DRI2 glean ReadPixels issue.
2009-06-19i965: initial code for loops in vertex programsBrian Paul
2009-06-19i965: asst clean-ups, etc in brw_vs_emit()Brian Paul
2009-06-19i965: asst clean-ups, var renaming in brw_wm_emit_glsl()Brian Paul
2009-06-17i965: Add decode for the G4X x,y offset in surface state.Eric Anholt
2009-06-17i965: Fix up texture layout for small things with wide pitches (tiled)Eric Anholt
We were packing according to the pitch, while the hardware appears to base it on the base level width. With this and the previous commit, fbo-cubemap now matches untiled behavior.
2009-06-17i965: Fall back or appropriately adjust offsets of drawing to tiled regions.Eric Anholt
3D rendering to tiled textures was being done with non-tile-aligned offsets. The G4X hardware has fields to let us support it easily and correctly, while the pre-G4X hardware requires a path full of suffering, so we just fall back.
2009-06-16Merge branch 'mesa_7_5_branch'Brian Paul
Conflicts: src/mesa/main/api_validate.c
2009-06-16i965: fix bugs in projective texture coordinatesBrian Paul
For the TXP instruction we check if the texcoord is really a 4-component atttibute which requires the divide by W step. This check involved the projtex_mask field. However, the projtex_mask field was being miscalculated because of some confusion between vertex program outputs and fragment program inputs. 1. Rework the size_masks calculation so we correctly set bits corresponding to fragment program input attributes. 2. Rename projtex_mask to proj_attrib_mask since we're interested in more than just texcoords (generic varying vars too). 3. Simply the indexing of the size_masks and proj_attrib_mask fields. 4. The tracker::active[] array was mis-dimensioned. Use MAX_PROGRAM_TEMPS instead of a magic number. 5. Update comments, add new assertions. With these changes the Lightsmark demo/benchmark renders correctly, until we eventually hit a GPU lockup...
2009-06-16i965: handle OPCODE_SWZ in the glsl pathRoland Scheidegger
glsl compiler will not generate OPCODE_SWZ, and as a first step it would be translated away to a MOV anyway (why?), but later internally this opcode is generated (for EXT_texture_swizzling).
2009-06-12i965: interpolate colors with perspective correction by defaultBrian Paul
...rather than with linear interpolation. Modern hardware should use perspective-corrected interpolation for colors (as for texcoords). glHint(GL_PERSPECTIVE_CORRECTION_HINT, mode) can be used to get linear interpolation if mode = GL_FASTEST.
2009-06-04intel: Add support for tiled textures.Eric Anholt
This is about a 30% performance win in OA with high settings on my GM45, and experiments with 915GM indicate that it'll be around a 20% win there. Currently, 915-class hardware is seriously hurt by the fact that we use fence regs to control the tiling even for 3D instructions that could live without them, so we spend a bunch of time waiting on previous rendering in order to pull fences off. Thus, the texture_tiling driconf option defaults off there for now.
2009-06-02i965: Support OPCODE_TRUNC in the brw_wm_fp.c code.Eric Anholt
This gets two more glean glsl1 tests using the non-GLSL path.
2009-05-21i965: fix whitespace in brw_tex_layout.cEric Anholt
The broken indentation was driving me crazy, so fix other stuff while I'm here.
2009-05-21i956: Make state dependency of SF on drawbuffer bounds match Mesa's.Eric Anholt
Noticed while debugging a weird 1D FBO testcase that left its existing viewport and projection matrix in place when switching drawbuffers. Didn't fix the testcase, though.
2009-05-21i965: rename var: s/tmp/vs_inputs/Brian Paul
2009-05-14i965: Fix varying payload reg assignment for the non-GLSL-instructions path.Eric Anholt
I don't have a testcase for this, but it seems clearly wrong.
2009-05-14i965: Fix register allocation of GLSL fp inputs.Eric Anholt
Before, if the VP output something that is in the attributes coming into the WM but which isn't used by the WM, then WM would end up reading subsequent varyings from the wrong places. This was visible with a GLSL demo using gl_PointSize in the VS and a varying in the WM, as point size is in the VUE but not used by the WM. There is now a regression test in piglit, glsl-unused-varying.
2009-05-14i965: fix 1D texture borders with GL_CLAMP_TO_BORDERRobert Ellison
With 1D textures, GL_TEXTURE_WRAP_T should be ignored (only GL_TEXTURE_WRAP_S should be respected). But the i965 hardware seems to follow the value of GL_TEXTURE_WRAP_T even when sampling 1D textures. This fix forces GL_TEXTURE_WRAP_T to be GL_REPEAT whenever 1D textures are used; this allows the texture to be sampled correctly, avoiding "imaginary" border elements in the T direction. This bug was demonstrated in the Piglit tex1d-2dborder test. With this fix, that test passes.
2009-05-12i965: enable additional code in emit_fb_write()Brian Paul
Not 100% sure this is right, but the invalid assertion is fixed...
2009-05-12i965: increase BRW_EU_MAX_INSNBrian Paul
2009-05-12i965: commentBrian Paul
2009-05-11i965: handle extended swizzle terms (0,1) in get_src_reg()Brian Paul
Fixes failed assertion in progs/glsl/twoside.c (but still wrong rendering).
2009-05-08i965: improve debug loggingRobert Ellison
Looking for memory leaks that were causing crashes in my environment in a situation where valgrind would not work, I ended up improving the i965 debug traces so I could better see where the memory was being allocated and where it was going, in the regions and miptrees code, and in the state caches. These traces were specific enough that external scripts could determine what elements were not being released, and where the memory leaks were. I also ended up creating my own backtrace code in intel_regions.c, to determine exactly where regions were being allocated and for what, since valgrind wasn't working. Because it was useful, I left it in, but disabled and compiled out. It can be activated by changing a flag at the top of the file.
2009-05-08i965: fix segfault on low memory conditionsRobert Ellison
When out of memory (in at least one case, triggered by a longrunning memory leak), this code will segfault and crash. By checking for the out-of-memory condition, the system can continue, and will report the out-of-memory error later, a much preferable outcome.
2009-05-08intel: Add a metaops version of glGenerateMipmapEXT/SGIS_generate_mipmaps.Eric Anholt
In addition to being HW accelerated, it avoids the incorrect (black) rendering of the mipmaps that SW was doing in fbo-generatemipmap. Improves the performance of the mipmap generation and drawing in fbo-generatemipmap by 30%.
2009-05-08i965: const qualifiersBrian Paul
2009-05-08i965: don't use GRF regs 126,127 for WM programsBrian Paul
They seem to be used for something else and using them for shader temps seems to lead to GPU lock-ups. Call _mesa_warning() when we run out of temps. Also, clean up some debug code.
2009-05-07i965: relAddr local var (to make debug/test a little easier)Brian Paul
2009-05-06i965: Remove bad constant buffer constant-reg-already-loaded optimization.Eric Anholt
Thanks to branching, the state of c->current_const[i].index at the point of emitting constant loads for this instruction may not match the actual constant currently loaded in the reg at runtime. Fixes a regression in my GLSL program for idr's class since b58b3a786aa38dcc9d72144c2cc691151e46e3d5.
2009-05-06i965: Remove the forced lack of caching for renderbuffer surface state.Eric Anholt
This snuck in with the multi-draw-buffers commit, and is a major penalty to performance. It doesn't appear to be required, as the only dependency the surface BO has is on the state key (and if there's some other dependency, it should just be in the key). This brings openarena performance up to almost 2% faster than Mesa 7.4.
2009-05-06i965: Remove _NEW_PROGRAM from brw_wm_surfaces setup dependencies.Eric Anholt
This was a leftover from the brw_wm_constant_buffer change.
2009-05-06i965: Split WM constant buffer update from other WM surfaces.Eric Anholt
This can avoid re-uploading constant data when it isn't necessary, and is a step towards not updating other surfaces just because constants change. It also brings the upload of the constant buffer next to the creation. This brings openarena performance up another 4%, to 91% of the Mesa 7.4 branch.
2009-05-06i965: Disentangle VS constant surface state from WM surface state.Eric Anholt
Also, only create VS surface state if there's a VS constant buffer to be uploaded, and set the contents of the buffer at the same time as creation.
2009-05-06i965: Don't create constant buffers if they won't be used.Eric Anholt
Really, the creation and upload of constants should be in the same place, since they should only happen together, and a state flag should be triggered by them so that we don't thrash state around so much for just updating constants. But this still recovers openarena performance by another 19%, leaving us 16% behind Mesa 7.4 branch.
2009-05-01Merge branch 'const-buffer-changes'Brian Paul
Conflicts: src/mesa/drivers/dri/i965/brw_curbe.c src/mesa/drivers/dri/i965/brw_vs_emit.c src/mesa/drivers/dri/i965/brw_wm_glsl.c
2009-04-27i965: #include prog_print.h to silence warningBrian Paul
2009-04-27i965: only upload constant buffer data when we actually need the const bufferBrian Paul
Make the use_const_buffer field per-program and only call the code which updates the constant buffer's data if the flag is set. This should undo the perf regression from 20f3497e4b6756e330f7b3f54e8acaa1d6c92052 (cherry picked from master, commit dc9705d12d162ba6d087eb762e315de9f97bc456)