path: root/src/gallium/drivers/vc4/vc4_cl.h
Age | Commit message | Author | Files | Lines
2017-12-01 | broadcom/vc4: Simplify the relocation handling for index buffers. | Eric Anholt | 1 | -15/+0
Originally there was CL code for handling various relocations back when I had relocs for the TSDA/TA buffers. Now that the kernel handles those entirely on its own, I can inline that code into the one place using it.
2017-10-27 | vc4: fix release build | Eric Engestrom | 1 | -6/+6
Mesa's DEBUG and assert's NDEBUG are not tied to each other, so we need to explicitly compile this code out.
Fixes: 3df78928786134874eafa "vc4: Drop reloc_count tracking for debug asserts on non-debug builds."
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
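The mismatch described above can be sketched as follows. This is a minimal illustration, not the actual vc4_cl.h code: `struct dbg_cl`, `dbg_cl_start_reloc()` and `dbg_cl_reloc()` are hypothetical names. The point is that a field that only exists under `DEBUG` must not be touched by code that is only disabled by `NDEBUG`, so both the field and its checks sit behind the same macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command-list struct; the bookkeeping field exists only
 * when DEBUG is defined. */
struct dbg_cl {
        uint32_t emitted;
#ifdef DEBUG
        uint32_t reloc_count;
#endif
};

static inline void
dbg_cl_start_reloc(struct dbg_cl *cl, uint32_t n)
{
#ifdef DEBUG
        /* Guarded by DEBUG, not only by NDEBUG: a build with neither macro
         * defined would otherwise reference a field that does not exist. */
        assert(cl->reloc_count == 0);
        cl->reloc_count = n;
#else
        (void)cl;
        (void)n;
#endif
}

static inline void
dbg_cl_reloc(struct dbg_cl *cl)
{
#ifdef DEBUG
        assert(cl->reloc_count > 0);
        cl->reloc_count--;
#endif
        cl->emitted++;
}
```

The sketch compiles in all four combinations of `DEBUG` and `NDEBUG`, which is the property the fix is after.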
2017-06-30 | vc4: Start using XML unpack functions in CL dump. | Eric Anholt | 1 | -1/+0
For now this is a no-op on the output, but it makes it clear that we've had weird things going on with things like V3D21_CLIPPER_Z_SCALE_AND_OFFSET.
2017-06-30 | vc4: Move rasterizer state packing to CSO creation time. | Eric Anholt | 1 | -0/+5
This gets our vc4_emit.c size back down a bit:
before:
    1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
    968 0 0 968 3c8 src/gallium/drivers/vc4/.libs/vc4_emit.o
2017-06-30 | vc4: Convert the driver to emitting the shader record using pack macros. | Eric Anholt | 1 | -10/+37
2017-06-30 | vc4: Simplify pack header usage | Eric Anholt | 1 | -6/+9
Take the CL pointer in, which will be useful for enabling relocs. However, our code expands a bit more:
before:
    4449 0 0 4449 1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
    988 0 0 988 3dc src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
    4481 0 0 4481 1181 src/gallium/drivers/vc4/.libs/vc4_draw.o
    1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
2017-06-30 | vc4: Start using the pack header. | Eric Anholt | 1 | -0/+63
This slightly inflates the size of the generated code, in exchange for getting us some convenient tools.
before:
    4389 0 0 4389 1125 src/gallium/drivers/vc4/.libs/vc4_draw.o
    808 0 0 808 328 src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
    4449 0 0 4449 1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
    988 0 0 988 3dc src/gallium/drivers/vc4/.libs/vc4_emit.o
2016-09-14 | vc4: Move the render job state into a separate structure. | Eric Anholt | 1 | -6/+7
This is a preparation step for having multiple jobs being queued up at the same time.
2015-07-14 | vc4: Drop reloc_count tracking for debug asserts on non-debug builds. | Eric Anholt | 1 | -0/+10
Cuts another 88 bytes of compiled code.
2015-07-14 | vc4: Rework cl handling to be friendlier to the compiler. | Eric Anholt | 1 | -47/+66
Drops 680 bytes of code, from avoiding a bunch of extra updates to the next pointer in the struct.
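A rough sketch of the idea, assuming the saving comes from keeping the write cursor in a local variable instead of updating the struct's next pointer after every byte. The names `struct cursor_cl`, `cursor_start()`, `cursor_u8()`, `cursor_end()` and `emit_three()` are hypothetical and are not taken from vc4_cl.h. Because byte stores may alias any object, a compiler generally has to reload and store a struct member cursor around each write, whereas a local cursor can stay in a register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command list: a buffer plus a "next byte" cursor. */
struct cursor_cl {
        uint8_t *base;
        uint8_t *next;
};

/* Take a local copy of the cursor once per packet. */
static inline uint8_t *
cursor_start(struct cursor_cl *cl)
{
        return cl->next;
}

/* Write through the local cursor; the struct is not touched here. */
static inline void
cursor_u8(uint8_t **out, uint8_t v)
{
        *(*out)++ = v;
}

/* Publish the cursor back to the struct once, at the end. */
static inline void
cursor_end(struct cursor_cl *cl, uint8_t *out)
{
        cl->next = out;
}

static void
emit_three(struct cursor_cl *cl)
{
        uint8_t *out = cursor_start(cl);

        cursor_u8(&out, 1);
        cursor_u8(&out, 2);
        cursor_u8(&out, 3);
        cursor_end(cl, out);
}
```

Whether this matches the structure of the actual rework is an assumption; it only illustrates why fewer updates to the struct member can shrink the generated code.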
2015-07-14 | vc4: Make a helper function for getting the current offset in the CL. | Eric Anholt | 1 | -5/+10
I needed to rewrite this a bit for safety checking in the next commit. Despite being a static inline of the same thing that was being done, we lose 36 bytes of code for some reason.
2015-07-14 | vc4: Drop separate cl*_reloc_hindex(). | Eric Anholt | 1 | -18/+6
Now that RCL generation is in the kernel, we don't have any other callers. Oddly, the compiler generates another 8 bytes of code for this, but the simplification is worth it.
2015-07-14 | vc4: Store reloc pointers as pointers, not offsets. | Eric Anholt | 1 | -5/+5
Now that we don't resize the CL as we build (it's set up at the top by vc4_start_draw()), we can store the pointers instead of offsets from the base. Saves a bit of math in emitting relocs (about 60 bytes of code).
2015-06-16 | vc4: Move vc4_packet.h to the kernel/ directory, since it's also shared. | Eric Anholt | 1 | -1/+1
I want to notice discrepancies when I diff -u between Mesa and the kernel.
2014-12-25 | vc4: Handle unaligned accesses in CL emits. | Eric Anholt | 1 | -1/+52
As of 229bf4475ff0a5dbeb9bc95250f7a40a983c2e28 we started getting SIGBUS from unaligned accesses on the hardware, for reasons I haven't figured out. However, we should be avoiding unaligned accesses anyway, and our CL setup certainly would have produced them.
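One common way to avoid unaligned accesses when writing multi-byte values into a byte stream is to go through `memcpy`, which has no alignment requirement. The sketch below is an illustration under that assumption; `put_unaligned_32()` and `get_unaligned_32()` are hypothetical helper names, and the commit itself may use a different technique.

```c
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value at an address that may not be 4-byte aligned.
 * Casting ptr to uint32_t * and dereferencing it would be undefined
 * behavior for a misaligned address and can fault on some ARM setups. */
static inline void
put_unaligned_32(void *ptr, uint32_t val)
{
        memcpy(ptr, &val, sizeof(val));
}

/* Load a 32-bit value from a possibly unaligned address. */
static inline uint32_t
get_unaligned_32(const void *ptr)
{
        uint32_t val;

        memcpy(&val, ptr, sizeof(val));
        return val;
}
```

Compilers typically lower a fixed-size `memcpy` like this to plain load and store instructions where the target allows it, so the portable form does not have to cost more than the cast.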
2014-12-25 | vc4: Don't bother zero-initializing the shader reloc indices. | Eric Anholt | 1 | -2/+2
They should all be set to real values by the time they're read, and ideally if you used valgrind you'd see uninitialized value uses.
2014-12-25 | vc4: Fix the argument type for cl_u16(). | Eric Anholt | 1 | -1/+1
It doesn't matter, since it just got truncated to 16 inside, anyway.
2014-12-24 | vc4: Optimize CL emits by doing size checks up front. | Eric Anholt | 1 | -10/+7
The optimizer obviously doesn't have the ability to rewrite these to skip the size checks per call, so we have to do it manually. Improves a norast benchmark on simulation by 0.779706% +/- 0.405838% (n=6087).
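The pattern can be sketched like this: reserve space for a whole packet once, then let the per-field emit helpers write without checking. All names here (`struct sized_cl`, `sized_cl_ensure_space()`, `sized_cl_u8()`, `sized_emit_packet()`) are hypothetical, and the growth policy is an assumption made for the example rather than the driver's actual behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable command list. */
struct sized_cl {
        uint8_t *base;
        size_t used;
        size_t size;
};

/* The only place that checks capacity and grows the buffer. */
static void
sized_cl_ensure_space(struct sized_cl *cl, size_t space)
{
        if (cl->used + space <= cl->size)
                return;

        size_t new_size = cl->size ? cl->size : 64;
        while (new_size < cl->used + space)
                new_size *= 2;

        uint8_t *new_base = realloc(cl->base, new_size);
        if (!new_base)
                abort();
        cl->base = new_base;
        cl->size = new_size;
}

/* Per-field emit: no branch on the fast path, only a debug assertion. */
static inline void
sized_cl_u8(struct sized_cl *cl, uint8_t v)
{
        assert(cl->used + 1 <= cl->size);
        cl->base[cl->used++] = v;
}

/* One size check for the whole 3-byte packet, then unchecked writes. */
static void
sized_emit_packet(struct sized_cl *cl, uint8_t opcode, uint8_t a, uint8_t b)
{
        sized_cl_ensure_space(cl, 3);
        sized_cl_u8(cl, opcode);
        sized_cl_u8(cl, a);
        sized_cl_u8(cl, b);
}
```

The caller has to know the packet size in advance, which is the trade-off for removing the per-call check.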
2014-12-24 | vc4: Avoid repeated hindex lookups in the loop over tiles. | Eric Anholt | 1 | -3/+9
Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673% (n=20).
2014-10-17 | vc4: Make some assertions about how many flushes/EOFs the simulator sees. | Eric Anholt | 1 | -1/+1
This caught the previous commit's bug in the kernel validator.
2014-09-18 | vc4: Actually implement VC4_DEBUG=cl. | Eric Anholt | 1 | -0/+1
2014-08-11 | vc4: Rename GEM_HANDLES to be in a namespace. | Eric Anholt | 1 | -1/+1
It's not a real VC4 hardware packet, but I've put in a comment to explain it.
2014-08-11 | vc4: Switch simulator to using kernel validator | Eric Anholt | 1 | -12/+10
This ensures that when I'm using the simulator, I get a closer match to what behavior on real hardware will be. It lets me rapidly iterate on the kernel validation code (which otherwise has a several-minute turnaround time), and helps catch buffer overflow bugs in the userspace driver faster.
2014-08-08 | vc4: Initial skeleton driver import. | Eric Anholt | 1 | -0/+132
This mostly just takes every draw call and turns it into a sequence of commands that clear the FBO and draw a single shaded triangle to it, regardless of the actual input vertices or shaders. I copied the initial driver skeleton mostly from freedreno, and I've preserved Rob Clark's copyright for those. I also based my initial hardcoded shaders and command lists on Scott Mansell (phire)'s "hackdriver" project, though the bit patterns of the shaders emitted end up being different.
v2: Rebase on gallium megadrivers changes.
v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
v4: Rely on simpenrose actually being installed when building for simulation.
v5: Add more header duplicate-include guards.
v6: Apply Emil's review (protection against vc4 sim and ilo at the same time, and dropping the dricommon drm bits) and fix a copyright header (thanks, Roland)