path: root/src/gallium/drivers/r600/r600_state_common.c
Age | Commit message | Author | Files | Lines
2013-03-11 | r600g: atomize vertex shader | Marek Olšák | 1 | -13/+25
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-11 | r600g: inline r600_pipe_shader function | Marek Olšák | 1 | -2/+2
Also rename other functions so that their names make sense. Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2013-03-01 | r600g: always map uninitialized buffer range as unsynchronized | Marek Olšák | 1 | -0/+4
Any driver can implement this simple and efficient optimization. Team Fortress 2 hits it always. The DISCARD_RANGE codepath is not even used with TF2 anymore, so we avoid a ton of useless buffer copies. Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> NOTE: This is a candidate for the 9.1 branch.
2013-03-01 | r600g: unify vgt states | Marek Olšák | 1 | -15/+7
The states were split because we thought it caused a hardlock. Now we know the hardlock was caused by something else and has since been fixed. Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-03-01 | r600g: atomize streamout enabling | Marek Olšák | 1 | -19/+42
This doesn't fix any issue we know of, but there is indeed a weak spot in draw_vbo where streamout can fail. After streamout is enabled, the need_cs_space call can flush the context, which causes the streamout to be disabled right after it was enabled, and bad things happen. One way to fix it is to atomize the beginning part, so that no context flush can happen between streamout enabling and the first drawing. Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-02-28 | r600g: workaround hyperz lockup on evergreen | Jerome Glisse | 1 | -0/+10
This workaround disables hyperz if writes to the zbuffer are disabled. Somehow, using hyperz when not writing to the zbuffer triggers a GPU lockup. See: https://bugs.freedesktop.org/show_bug.cgi?id=60848 Candidate for 9.1. Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-02-22 | r600g: r6xx deadlock workaround (v6) | Alex Deucher | 1 | -0/+6
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=50655 https://bugs.freedesktop.org/show_bug.cgi?id=47116 v2: flush along with workaround. v3: just need a flush v4: try WAIT_UNTIL v5: switch to PS partial flush v6: rework patch Note: this is a candidate for the 9.1 branch. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-02-21 | r600g: Fix memory leak in r600_shader_select. | Vinson Lee | 1 | -0/+1
Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reported-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-02-12 | r600g: fix lockup when hyperz & alpha test are enabled together. v3 | Jerome Glisse | 1 | -0/+5
It seems that enabling alpha test confuses the GPU about the order in which it should perform Z testing. So force the order programmed through db shader control. v2: Only force z order when alpha test is enabled v3: Update db shader when binding new dsa + spelling fix Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-01-31 | r600g: add cs memory usage accounting and limit it v3 | Jerome Glisse | 1 | -1/+12
We are now seeing CSs that can go over the vram+gtt size. To avoid failing, flush early any CS that goes over 70% (gtt+vram) usage; 70% is used to allow for some fragmentation. The idea is to compute a gross estimate of the memory requirement of each draw call. After each draw call, memory will be precisely accounted, so the uncertainty is only on the current draw call. In practice this gave a very good estimate (+/- 10% of the target memory limit). v2: Remove leftovers from the testing version, remove useless NULL checking. Improve commit message. v3: Add comment to code on memory accounting precision Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>
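As a rough illustration of the early-flush rule in this commit message, the check could look like the C sketch below. The names `cs_mem_tracker` and `cs_needs_flush` are hypothetical and are not taken from the r600g source; only the 70% threshold over (gtt + vram) comes from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping; these names are not from the r600g source. */
struct cs_mem_tracker {
    uint64_t vram_size; /* total VRAM in bytes */
    uint64_t gtt_size;  /* total GTT in bytes */
    uint64_t used;      /* bytes already accounted for in the current CS */
};

/* Returns true when adding a draw with the given gross estimate would push
 * the CS past 70% of (gtt + vram), meaning the CS should be flushed first. */
static bool cs_needs_flush(const struct cs_mem_tracker *t, uint64_t estimate)
{
    assert(t != NULL);
    uint64_t limit = (t->vram_size + t->gtt_size) / 10 * 7;
    return t->used + estimate > limit;
}
```

In this sketch the caller would flush, reset `used` to zero, and then account for the draw precisely once it has been recorded, so only the current draw is ever estimated.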
2013-01-28 | r600g: fix segfault with old kernel (9.1-branchpoint) | Jerome Glisse | 1 | -1/+3
Old kernels do not have DMA support; the patches that were pushed were missing some of the checks needed to avoid using DMA. Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 | r600g: add async for staging buffer upload v2 | Jerome Glisse | 1 | -3/+3
v2: Add virtual address to dma src/dst offset for cayman Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-28 | r600g: add multi ring support with dma as first second ring v4 | Jerome Glisse | 1 | -13/+16
We keep track of ring emission order in a stack; whenever we need to flush, we empty the stack in FIFO order. There are a few helper functions for bo mapping and other ring activities that make sure the ring stack is properly flushed and submitted. v2: fix the st flush path, and other flush paths, to properly flush all rings if necessary v3: - improve names of ring helpers - make sure that each time a cs is going to be written it ends up at the top of the stack, to avoid issues such as: STACK[0] = dma (with bo A,B) STACK[1] = gfx (with bo C,D). Now if code tries to emit a dma command relative to bo C or D, it will start writing the cmd stream into the cs and, once it reaches the point where it adds a relocation, it will flush. At that point the cs will have cmds that don't have proper relocations in the relocation buffer, and the kernel will just refuse to run. v4: - Drop the stack idea, as it turns out there is no way to use it or benefit from it. Any time the driver starts commands on another ring, it always needs to flush the previous ring. So make the code simpler by not using a stack. Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-01-11 | r600g: texture buffer object + glsl 1.40 enable support (v2) | Dave Airlie | 1 | -11/+114
This adds TBO support to r600g, and with GLSL 1.40 enabled, we now get 3.1 core profiles advertised for r600g. The r600/700 implementation is a bit different from the evergreen one, as r6/7 hw lacks vertex fetch swizzles. So we implement it by passing 5 constants per sampler to the shader, the shader uses the first 4 as masks for each component and the 5th as the alpha value to OR in. Now TXQ is also broken so we have to pass a constant for the buffer size, on evergreen we just pass this, on r6/7 we pass it as the 6th element in the const info buffer. v1.1: drop return as DDX doesn't use a texture type v2: add r600/700 support. Signed-off-by: Dave Airlie <airlied@redhat.com>
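The r600/700 swizzle emulation described in this message (four per-component masks plus a fifth value ORed into alpha) can be pictured with the CPU-side C sketch below. This is only one reading of the commit message: the struct `tbo_consts` and the function `tbo_fixup` are hypothetical, and the real work happens in the generated shader, not in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the five per-sampler constants. */
struct tbo_consts {
    uint32_t mask[4];  /* one AND mask per component (x, y, z, w) */
    uint32_t alpha_or; /* value ORed into the w (alpha) component */
};

/* Applies the masks to a fetched texel and ORs in the alpha value,
 * operating on the raw 32-bit words of each component. */
static void tbo_fixup(const struct tbo_consts *c,
                      const uint32_t in[4], uint32_t out[4])
{
    assert(c != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = in[i] & c->mask[i];
    out[3] |= c->alpha_or;
}
```

For example, a two-channel format could zero the z and w masks and set `alpha_or` to `0x3f800000`, the bit pattern of 1.0f, so that missing components read as (0, 1).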
2013-01-08 | r600g: implement buffer copying using CP DMA for R7xx, Evergreen, Cayman | Marek Olšák | 1 | -3/+3
R6xx doesn't work - the issue seems to be with flushing (sometimes the destination buffer contains garbage). There are no hangs, so we're good. R7xx doesn't seem to have any alignment restriction despite our initial thinking. Everything just works. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-22 | r600g: rename GPU_FLUSH -> INVAL_READ_CACHES | Marek Olšák | 1 | -4/+4
because that's what it does.
2012-12-20 | r600g: add cs tracing infrastructure for lockup pin pointing | Jerome Glisse | 1 | -0/+26
It's a build-time option: set R600_TRACE_CS to 1 and it will print every CS to stderr along with the CS trace point value, which gives the last offset into a CS processed by the GPU. Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-12-20 | r600g: rework flusing and synchronization pattern v7 | Jerome Glisse | 1 | -11/+8
This brings r600g almost in line with the closed source driver when it comes to flushing and synchronization patterns. v2-v4: history lost somewhere in outer space v5: Fix compute size of flushing, use define for flags, update worst case cs size requirement for flush, treat rs780 and newer as r7xx when it comes to streamout. v6: Fix num dw computation for framebuffer state, remove dead code, use define instead of hardcoded value. v7: Remove dead code Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-12-12 | r600g: suballocate memory for fetch shaders from a large buffer | Marek Olšák | 1 | -1/+3
Fetch shaders are usually destroyed at the context destruction by the state tracker, so we can put them all in a large buffer without wasting memory. This reduces the number of relocations sent to the kernel a little bit. Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 | r600g: suballocate memory for the STRMOUT_BUFFER_FILLED_SIZE register | Marek Olšák | 1 | -11/+11
Instead of having a 4-byte buffer for each streamout target, we suballocate each dword from a 4K buffer. This further reduces the overall number of relocations. Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
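The dword suballocation described here can be sketched as a simple bump allocator over a 4K pool. The names below (`dword_pool`, `dword_pool_alloc`) are hypothetical; the actual driver code may use a different allocator and handle a full pool differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BYTES 4096u /* one 4K backing buffer, as in the commit message */

/* Hypothetical bump allocator handing out one dword (4 bytes) at a time. */
struct dword_pool {
    uint32_t next_offset; /* byte offset of the next free dword */
};

/* Returns the byte offset of a fresh dword inside the pool,
 * or -1 when the pool is exhausted. */
static int32_t dword_pool_alloc(struct dword_pool *p)
{
    assert(p != NULL);
    if (p->next_offset + 4 > POOL_BYTES)
        return -1;
    int32_t offset = (int32_t)p->next_offset;
    p->next_offset += 4;
    return offset;
}
```

Because up to 1024 streamout targets share one buffer in this sketch, a command stream referencing many of them needs a relocation for only that single buffer, which is the saving the message describes.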
2012-12-07 | gallium/u_blitter: fix conflict with u_memory.h | Marek Olšák | 1 | -0/+1
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-10 | r600g: add initial cube map array support (v2) | Dave Airlie | 1 | -1/+35
This contains the evergreen support. Support is possible on rv670 upwards and the code in here should work, but it doesn't and I haven't debugged it to figure out why. Beyond just adding support for the cube map array sampling, r600 resinfo isn't conformant with the GL specification, which states the number of layers should be returned for the textureSize, so we have to track in an external constant buffer the layers for each sampler if we need them in the shader. v2: only update the sampler constants if the sampler views have changed, as suggested by Marek. Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-11-09 | r600g: clarify const buffer numbering and handling | Dave Airlie | 1 | -1/+1
For cube map arrays I'll need another driver private constant buffer, and looking forward to UBOs. So clean up with some defines, that can be modified when adding cube map array and ubos later. Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-10-31 | r600g: avoid shader needing too many gpr to lockup the gpu v2 | Jerome Glisse | 1 | -12/+15
On r6xx/r7xx, shader resource management needs to make sure that the shader does not go over the GPR register limit. Each specific ASIC has a maximum number of registers that can be split between shader stages. For each stage, the shader must not use more registers than the limit programmed. v2: Print an error message when discarding a draw. Don't add another boolean to the context structure, but rather propagate the discard boolean through the call chain. Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-10-31 | gallium: add start_slot parameter to set_vertex_buffers | Marek Olšák | 1 | -26/+27
This allows updating only a subrange of buffer bindings. set_vertex_buffers(pipe, start_slot, count, NULL) unbinds buffers in that range. Binding NULL resources unbinds buffers too (both buffer and user_buffer must be NULL). The meta ops are adapted to only save, change, and restore the single slot they use. The cso_context can save and restore only one vertex buffer slot. The clients can query which one it is using cso_get_aux_vertex_buffer_slot. It's currently set to 0. (the Draw module breaks if it's set to non-zero) It should decrease the CPU overhead when using a lot of meta ops, but the drivers must be able to treat each vertex buffer slot as a separate state (only r600g does so at the moment). I can imagine this also being useful for optimizing some OpenGL use cases. Reviewed-by: Brian Paul <brianp@vmware.com>
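The subrange semantics described in this message can be illustrated with a small stand-alone C model. This is not the Gallium interface: `vb`, `vb_state`, and `set_vbs` are hypothetical names, and the model keeps a single `buffer` pointer per slot instead of the separate buffer and user_buffer fields the message mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 16

/* Hypothetical, simplified vertex buffer binding. */
struct vb {
    const void *buffer; /* NULL means "nothing bound" */
    unsigned stride;
};

struct vb_state {
    struct vb slots[MAX_SLOTS];
    uint32_t enabled_mask; /* bit i is set when slot i has a buffer bound */
};

/* Updates only slots [start, start + count). Passing bufs == NULL, or an
 * entry whose buffer is NULL, unbinds the corresponding slot. */
static void set_vbs(struct vb_state *s, unsigned start, unsigned count,
                    const struct vb *bufs)
{
    assert(s != NULL && start + count <= MAX_SLOTS);
    for (unsigned i = 0; i < count; i++) {
        unsigned slot = start + i;
        if (bufs != NULL && bufs[i].buffer != NULL) {
            s->slots[slot] = bufs[i];
            s->enabled_mask |= 1u << slot;
        } else {
            s->slots[slot].buffer = NULL;
            s->slots[slot].stride = 0;
            s->enabled_mask &= ~(1u << slot);
        }
    }
}
```

A meta op in this model would save one slot, call `set_vbs` with `count` set to 1 for its own buffer, and restore that slot afterwards, leaving every other binding untouched.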
2012-10-29 | r600g: implement texturing with 8x MSAA compressed surfaces for Evergreen | Marek Olšák | 1 | -2/+2
The 2x and 4x MSAA cases are completely broken. The lfdptr instruction returns garbage there. The 8x MSAA case is broken on Cayman, though at least the result looks somewhat correct. Only the 8x MSAA case works on Evergreen and is enabled.
2012-10-15 | r600g: emit the border color only when it's needed | Marek Olšák | 1 | -0/+21
That depends on the texture wrap modes and filtering.
2012-10-12 | r600g: move shader structures into r600_shader.h | Marek Olšák | 1 | -0/+1
2012-10-11 | r600g: put user indices in the command stream for small index counts | Marek Olšák | 1 | -14/+27
This improves performance a little bit if there are lots of small indexed draw commands.
2012-10-11 | r600g: inline r600_translate_index_buffer | Marek Olšák | 1 | -6/+22
2012-10-10 | r600g: move DB_SHADER_CONTROL into db_misc_state | Marek Olšák | 1 | -6/+6
Also update the register value in more appropriate places than r600_update_derived_state. Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 | r600g: atomize depth-stencil-alpha state | Marek Olšák | 1 | -17/+9
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 | r600g: atomize rasterizer state | Marek Olšák | 1 | -12/+8
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 | r600g: sort variables in r600_context | Marek Olšák | 1 | -11/+7
Some variables have been removed from there too. Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 | r600g: atomize scissor state | Marek Olšák | 1 | -22/+4
The workaround for R600 lacking VPORT_SCISSOR_ENABLE has also been simplified. Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 | r600g: atomize polygon offset state | Marek Olšák | 1 | -4/+6
POLY_OFFSET_DB_FMT_CNTL is moved to the framebuffer state, because it only depends on the zbuffer format. Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 | r600g: atomize fetch shader | Marek Olšák | 1 | -39/+3
The state object is actually a buffer, it's literally a buffer containing the shader code. Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 | r600g: remove the dual_src_blend flag from the shader key | Marek Olšák | 1 | -1/+3
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 | r600g: atomize blend state | Marek Olšák | 1 | -20/+40
This is not so trivial, because we disable blending if the dual src blending is turned on and the number of color outputs is less than 2. I decided to create 2 command buffers in the blend state object and just switch between them when needed, because there are other states unrelated to blending (like the color mask) and those shouldn't be changed (the old code had it wrong). Reviewed-by: Jerome Glisse <jglisse@redhat.com>
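The two-command-buffer approach described in this message can be sketched as follows. The struct and function names are hypothetical and the command buffers are reduced to opaque dword arrays; only the selection rule (dual source blending on and fewer than 2 color outputs) comes from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical blend state object carrying two prebuilt command buffers. */
struct blend_state {
    const uint32_t *cmds;          /* commands with blending as requested */
    const uint32_t *cmds_no_blend; /* same state, but with blending disabled */
    bool dual_src_blend;           /* state uses dual source blending */
};

/* Picks the command buffer to emit for the current pixel shader. */
static const uint32_t *blend_select_cmds(const struct blend_state *b,
                                         unsigned nr_color_outputs)
{
    assert(b != NULL);
    if (b->dual_src_blend && nr_color_outputs < 2)
        return b->cmds_no_blend;
    return b->cmds;
}
```

Since both buffers are built once when the state object is created, unrelated fields such as the color mask stay identical in both, and switching costs only a pointer choice at emit time.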
2012-10-10 | r600g: inline r600_atom_dirty | Marek Olšák | 1 | -19/+19
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-10-10 | r600g: remove the "atom" variable from r600_command_buffer | Marek Olšák | 1 | -12/+1
r600_command_buffer is not an atom. The "atoms" have evolved into state slots (or groups of state slots) where you can bind states. There is a fixed amount of atoms (state slots) in the context. The command buffers are nothing like that. They represent states, not state slots. We could probably give r600_atom a better name someday. Reviewed-by: Jerome Glisse <jglisse@redhat.com>
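The distinction drawn here between states (command buffers) and state slots (atoms) can be modeled with the C sketch below. All names are hypothetical and much simpler than the real r600_atom and r600_command_buffer structures; the sketch only shows a fixed set of slots, each pointing at whichever state is currently bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_ATOMS 4 /* fixed number of state slots in the context */

/* A state: a prebuilt sequence of command dwords. */
struct cmd_buffer {
    const uint32_t *dw;
    unsigned num_dw;
};

/* A state slot: remembers which state is bound and whether it must be emitted. */
struct atom {
    const struct cmd_buffer *bound;
    bool dirty;
};

struct context {
    struct atom atoms[NUM_ATOMS];
};

/* Binds a state to a slot and marks the slot dirty. */
static void bind_state(struct context *ctx, unsigned slot,
                       const struct cmd_buffer *cb)
{
    assert(ctx != NULL && slot < NUM_ATOMS);
    ctx->atoms[slot].bound = cb;
    ctx->atoms[slot].dirty = true;
}

/* Copies the commands of every dirty slot into out, in slot order, clears
 * the dirty flags, and returns the number of dwords written. */
static unsigned emit_dirty(struct context *ctx, uint32_t *out, unsigned max_dw)
{
    unsigned n = 0;
    for (unsigned i = 0; i < NUM_ATOMS; i++) {
        struct atom *a = &ctx->atoms[i];
        if (!a->dirty)
            continue;
        if (a->bound != NULL) {
            assert(n + a->bound->num_dw <= max_dw);
            memcpy(out + n, a->bound->dw, a->bound->num_dw * sizeof(uint32_t));
            n += a->bound->num_dw;
        }
        a->dirty = false;
    }
    return n;
}
```

In this model many command buffers can exist at once, but only `NUM_ATOMS` slots do, which matches the message's point that a command buffer represents a state and not a slot.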
2012-09-30 | r600g: implement blit | Marek Olšák | 1 | -1/+1
Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-30 | gallium/u_blitter: add gallium blit implementation | Marek Olšák | 1 | -1/+1
The original blit function is extended and the other functions reuse it. Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-22 | r600g: atomize framebuffer state | Marek Olšák | 1 | -7/+9
Tested on RS880, Evergreen and Cayman. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-22 | r600g: don't snoop context state while building shaders | Marek Olšák | 1 | -14/+14
Let's use the shader key describing the state. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-13 | r600g: convert the remnants of VGT state into immediate register writes/atoms v4 | Marek Olšák | 1 | -16/+35
v2: Group vgt register together to avoid lockup v3: Split multi primitive register and index bias register v4: Bump R600_NUM_ATOMS Signed-off-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 | r600g: emit the primitive type and associated regs only if the type is changed | Marek Olšák | 1 | -30/+30
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 | r600g: add clip_misc_state for clip registers emitted in draw_vbo | Marek Olšák | 1 | -11/+29
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 | r600g: atomize clip state | Marek Olšák | 1 | -0/+18
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
2012-09-13 | r600g: atomize blend color | Marek Olšák | 1 | -11/+12
Reviewed-by: Jerome Glisse <jglisse@redhat.com>