path: root/src/gallium/drivers/freedreno/Makefile.sources
Age | Commit message | Author | Files | Lines
2017-05-04 | freedreno/a5xx: compute shader support | Rob Clark | 1 | -0/+2
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-22 | freedreno: add support for hw accumulating queries | Rob Clark | 1 | -0/+2
Some queries on a4xx and all queries on a5xx can do result accumulation on CP so we don't need to track per-tile samples. We do still need to handle pausing/resuming while switching batches (in case the query is active over multiple draws which are executed out of order). So introduce new accumulated-query helpers for these sorts of queries, since it doesn't really fit in cleanly with the original query infrastructure. Signed-off-by: Rob Clark <robdclark@gmail.com>
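The pause/resume bookkeeping described above can be sketched on the CPU side. This is only an illustration: on the hardware the CP does the accumulation, and `acc_query`, `acc_query_resume` and `acc_query_pause` are hypothetical names, not the driver's helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU-side model of an accumulating query. */
struct acc_query {
    uint64_t result;   /* total accumulated so far */
    uint64_t start;    /* counter value at the last resume */
    bool     active;   /* true between resume and pause */
};

/* Start (or restart) counting from the current counter value. */
void acc_query_resume(struct acc_query *q, uint64_t counter)
{
    assert(!q->active);
    q->start = counter;
    q->active = true;
}

/* Stop counting and fold the elapsed delta into the running total. */
void acc_query_pause(struct acc_query *q, uint64_t counter)
{
    assert(q->active);
    q->result += counter - q->start;
    q->active = false;
}
```

Because each pause folds its own delta into `result`, the total no longer depends on per-tile samples being tracked separately.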
2016-11-30 | freedreno/a5xx: initial support | Rob Clark | 1 | -0/+27
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-30 | freedreno: add batch-cache and batch reordering | Rob Clark | 1 | -0/+2
Note that I originally also had an entry point that would construct a key and do lookup from a pipe_surface. I ended up not needing that (yet?) but it is easy enough to re-introduce later if we need it for the blit path. For now, not enabled by default, but can be enabled (on a3xx/a4xx) with FD_MESA_DEBUG=reorder. Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-30 | freedreno: introduce fd_batch | Rob Clark | 1 | -0/+2
Introduce the batch object, to track a batch/submit's worth of ringbuffers and other bookkeeping. In this first step, just move the ringbuffers into batch, since that is mostly uninteresting churn. For now there is just a single batch at a time. Note that one outcome of this change is that rb's are allocated/freed on each use. But the expectation is that the bo pool in libdrm_freedreno will save us the GEM bo alloc/free which was the initial reason to implement a rb pool in gallium. The purpose of the batch is to eventually facilitate out-of-order rendering, with batches associated to framebuffer state, and tracking the dependencies on other batches. Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-04-25 | freedreno/ir3: fix sin/cos | Rob Clark | 1 | -0/+3
We seem to need range reduction to get sane results. Fixes glmark2 jellyfish bench, and a whole bunch of dEQP-GLES3.functional.shaders.builtin_functions.precision.{sin,cos,tan}.*

v2: squashed in android build fixes from Rob Herring

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-01-03 | freedreno/ir3: refactor NIR IR handling | Rob Clark | 1 | -0/+1
Immediately convert into NIR and do an initial key-agnostic lowering/optimization pass. This should let us share most of the per-variant transformations between each variant, and hopefully minimize the draw-time variant creation part of the compilation process. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 | freedreno/ir3: introduce ir3_compiler object | Rob Clark | 1 | -0/+1
Right now, just provides a cleaner way to get at the gpu-id, given the separation between compiler and context. But we will need this also to hold the reg-set for new register allocation. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 | freedreno/ir3: remove tgsi f/e | Rob Clark | 1 | -2/+0
Also remove ir3_flatten which was only used by tgsi f/e. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21 | freedreno/ir3: drop dot graph dumping | Rob Clark | 1 | -1/+1
At least for now.. right now the instruction and instruction list printing should suffice, and the re-working of ir3_block would require a lot of changes in that code. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17 | freedreno/ir3/nir: lower if/else | Rob Clark | 1 | -0/+2
For now, completely flatten if/else blocks. That will almost certainly change once we have flow control. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-05 | freedreno/ir3: add NIR compiler | Rob Clark | 1 | -0/+1
The NIR compiler frontend is an alternative to the TGSI f/e, producing the same ir3 IR and using the same backend passes for scheduling, etc. It is not enabled by default yet, as there are still some regressions. To enable, use 'FD_MESA_DEBUG=nir'. It is enough to use with, for example, xonotic or supertuxkart. With the NIR f/e, scalarizing and a number of other lowering steps happen in NIR, so we don't have to do them in ir3. Which simplifies the f/e and allows the lowered instructions to pass through other optimization stages. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-15 | freedreno/ir3: remove old compiler | Rob Clark | 1 | -1/+0
Now that piglit is no longer falling back to old compiler for any tests, we can remove it. Hurray \o/ Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07 | freedreno/ir3: simplify RA | Rob Clark | 1 | -2/+2
Group inputs/outputs, in addition to fanin/fanout, as they must also exist in sequential scalar registers. This lets us simplify RA by working in terms of neighbor groups.

NOTE: has the slight problem that it can't optimize out mov's for things like:

  MOV OUT[n], IN[m]

To avoid this, instead of trying to figure out what mov's we can eliminate, we first remove all mov's prior to grouping, and then re-insert mov's as needed while grouping inputs/outputs/fanins.

Eventually we'd prefer the frontend to not insert extra mov's in the first place (so we don't have to bother removing them). This is the plan for an eventual NIR based frontend, so separate out the instr grouping (which will still be needed for NIR frontend) from the mov elimination (which won't).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-23 | freedreno/ir3: split out legalize pass | Rob Clark | 1 | -0/+1
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-04 | freedreno/a4xx: fd4_util -> fd4_format | Rob Clark | 1 | -2/+2
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-29 | freedreno/a3xx: fd3_util -> fd3_format | Ilia Mirkin | 1 | -2/+2
All the "util" helpers are actually format-related. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-11-16 | freedreno: add missing headers in Makefile.sources | Emil Velikov | 1 | -1/+14
... or autotools will fail to pick them up for the distribution tarball. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-15 | freedreno: add adreno 420 support | Rob Clark | 1 | -0/+14
Very initial support. Basic stuff working (es2gears, es2tri, and maybe about half of glmark2). Expect broken stuff. Still missing: mem->gmem (restore), queries, mipmaps (blob segfaults!), hw binning, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-14 | freedreno: use tgsi_lowering | Rob Clark | 1 | -2/+0
Now that the freedreno_lowering code is moved to tgsi_lowering, remove our private copy and switch over to using the common version. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-05 | gallium/freedreno: ship all files in the tarball | Emil Velikov | 1 | -12/+63
- include all headers in Makefile.sources
- sort the list(s)
- bundle the android build

Cc: freedreno@lists.freedesktop.org
Cc: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2014-07-25 | freedreno/ir3: split out shader compiler from a3xx | Rob Clark | 1 | -11/+14
Move the bits we want to share between generations from fd3_program to ir3_shader. So overall structure is:

  fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3
                                    |- ...
                                    \- ir3_shader_variant -> ir3

So the ir3_shader becomes the topmost generation-neutral object, which manages the set of variants, each of which generates, compiles, and assembles its own IR.

There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/, etc.

Keep the split between the gallium level stateobj and the shader helper object because it might be a good idea to pre-compute some generation-specific register values (ie. anything that is independent of linking).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
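The shader-owns-variants relationship above can be pictured with a small lookup-or-create sketch. The types and the `shader_get_variant` function are hypothetical stand-ins, not the real ir3_shader API; the variant key is reduced to a single integer and compilation is left as a comment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for ir3_shader / ir3_shader_variant. */
struct shader_variant {
    uint32_t key;                  /* state the variant was built for */
    struct shader_variant *next;
};

struct shader {
    struct shader_variant *variants;   /* all variants built so far */
};

/* Return the variant matching key, creating it on first use. */
struct shader_variant *shader_get_variant(struct shader *s, uint32_t key)
{
    struct shader_variant *v;

    assert(s != NULL);
    for (v = s->variants; v; v = v->next)
        if (v->key == key)
            return v;

    v = calloc(1, sizeof(*v));
    if (!v)
        return NULL;
    v->key = key;          /* a real driver would compile and assemble here */
    v->next = s->variants;
    s->variants = v;
    return v;
}
```

Asking twice for the same key returns the same variant, so the per-variant work happens once per distinct key.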
2014-05-13 | freedreno/a3xx: occlusion query support | Rob Clark | 1 | -0/+1
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-13 | freedreno: add support for hw queries | Rob Clark | 1 | -0/+1
Real GPU queries need some infrastructure to track samples per tile and accumulate the results. But fortunately this can be shared across GPU generations. See: https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-13 | freedreno/query: allow multiple query implementations | Rob Clark | 1 | -0/+1
Split out fd_query into an abstract base class, to allow multiple implementations. The current sw based queries are moved into fd_sw_query. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-02-23 | freedreno/a3xx: drop hand-coded blit/solid shaders | Rob Clark | 1 | -0/+1
Instead, in the common code, construct these shaders from TGSI. For now we let a2xx keep its hand-coded shaders, as its compiler isn't quite up to the job yet. All the same it is a net drop in code size and gets rid of special cases. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-02-03 | freedreno/a3xx/compiler: new compiler | Rob Clark | 1 | -0/+6
The new compiler generates a dependency graph of instructions, including a few meta-instructions to handle PHI and preserve some extra information needed for register assignment, etc.

The depth pass assigns a weight/depth to each node (based on sum of instruction cycles of a given node and all its dependent nodes), which is used to schedule instructions. The scheduling takes into account the minimum number of cycles/slots between dependent instructions, etc. This was something that could not be handled properly with the original compiler (which was more of a naive TGSI translator than an actual compiler).

The register assignment is currently split out as a standalone pass. I expect that it will be replaced at some point, once I figure out what to do about relative addressing (which is currently the only thing that should cause fallback to old compiler).

There are a couple new debug options for FD_MESA_DEBUG env var:

  optmsgs - enable debug prints in optimizer
  optdump - dump instruction graph in .dot format, for example:
    http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png
    http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot

At this point, thanks to proper handling of instruction scheduling, the new compiler fixes a lot of things that were broken before, and does not appear to break anything that was working before[1]. So even though it is not finished, it seems useful to merge it in its current state.

[1] Not merged in this commit, because I'm not sure if it really belongs in mesa tree, but the following commit implements a simple shader emulator, which I've used to compare the output of the new compiler to the original compiler (ie. run it on all the TGSI shaders dumped out via ST_DEBUG=tgsi with various games/apps): https://github.com/freedreno/mesa/commit/163b6306b1660e05ece2f00d264a8393d99b6f12

Signed-off-by: Rob Clark <robclark@freedesktop.org>
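To make the depth pass a little more concrete, here is a minimal sketch of weighting nodes in a dependency graph. It is an assumption-laden simplification: it takes "dependent nodes" to mean the sources an instruction reads, and it uses the longest chain (own cycles plus the deepest source) rather than the exact sum the commit message mentions. `struct node` and `node_depth` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SRCS 4

/* Hypothetical instruction node in a dependency graph. */
struct node {
    int cycles;                    /* cost of this instruction, >= 1 */
    size_t num_srcs;               /* how many sources it reads */
    struct node *srcs[MAX_SRCS];   /* instructions it depends on */
    int depth;                     /* 0 until computed */
};

/* Depth = own cycles plus the deepest source chain (memoized). */
int node_depth(struct node *n)
{
    int deepest = 0;

    if (n->depth)
        return n->depth;

    assert(n->cycles >= 1 && n->num_srcs <= MAX_SRCS);
    for (size_t i = 0; i < n->num_srcs; i++) {
        int d = node_depth(n->srcs[i]);
        if (d > deepest)
            deepest = d;
    }
    n->depth = n->cycles + deepest;
    return n->depth;
}
```

A scheduler could then prefer the ready instruction with the largest depth, so that long dependency chains start early.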
2014-02-03 | freedreno/a3xx/compiler: split out old compiler | Rob Clark | 1 | -0/+1
For the time being, keep old compiler as fallback for things that the new compiler does not support yet. Split out as its own commit to make the later new-compiler commits easier to follow. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-02-03 | freedreno/a3xx/compiler: prepare for new compiler | Rob Clark | 1 | -1/+1
Shuffle things around to prepare for new compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-02-01 | freedreno: add tgsi lowering pass | Rob Clark | 1 | -0/+1
Currently lowers the following instructions: DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH, DP2 translating these into equivalent simpler TGSI instructions. This probably should be moved to util so other drivers can use it, but just adding under freedreno for now so that I can clear out a lot of the lowering code in a3xx compiler before beginning to add new compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>
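As an illustration of what lowering to simpler instructions means, the sketch below expresses two of the listed opcodes, DP3 and LRP, in terms of scalar multiply and multiply-add. The decompositions follow the opcodes' mathematical definitions; they are not necessarily the exact sequences the pass emits, and `op_mul`, `op_mad`, `lower_dp3` and `lower_lrp` are hypothetical names.

```c
#include <assert.h>

/* Scalar stand-ins for the simpler MUL and MAD instructions. */
float op_mul(float a, float b) { return a * b; }
float op_mad(float a, float b, float c) { return a * b + c; }

/* DP3: a.x*b.x + a.y*b.y + a.z*b.z as one MUL followed by two MADs. */
float lower_dp3(const float a[3], const float b[3])
{
    float t = op_mul(a[0], b[0]);
    t = op_mad(a[1], b[1], t);
    return op_mad(a[2], b[2], t);
}

/* LRP: w*x + (1 - w)*y, rewritten as mad(w, x, mad(-w, y, y)). */
float lower_lrp(float w, float x, float y)
{
    float t = op_mad(-w, y, y);   /* (1 - w) * y */
    return op_mad(w, x, t);
}
```

Once every complex opcode is rewritten this way, the backend only has to handle the small set of primitive instructions.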
2014-01-08 | freedreno: add basic query support | Rob Clark | 1 | -0/+1
Add for now some simple/basic query support (ie. things not actually requiring the GPU). Might change around a bit when I actually add GPU queries, but for now this enables some useful performance info in the GALLIUM_HUD. For example:

  GALLIUM_HUD=fps+batches+batches-sysmem+batches-gmem+restores,draw-calls

The driver-specific queries are:

 + draw-calls
 + batches - number of batches per second, sum of batches-sysmem plus batches-gmem
 + batches-gmem - render a set of tiles in GMEM, for each tile (optionally) system mem -> gmem (restore), plus N draws, plus gmem -> system mem (resolve) per second
 + batches-sysmem - N draws to system memory (GMEM bypass) per second
 + restores - number of GMEM batches that required restore per second

Ideally for GMEM rendering, you want batches-gmem to equal fps. If the app is doing something that triggers multiple passes (ie. requires extra round trip gmem <-> system memory) then the # of batches per second will go up relative to fps.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-16 | freedreno: compact a2xx and a3xx makefiles into parent ones | Johannes Obermayr | 1 | -0/+32
Nearly everything within the three Makefile.am's is identical. Let's simplify things a little. v2: Rebase and rewrite the commit message (Emil Velikov) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-01 | freedreno: consolidate C sources list into Makefile.sources | Emil Velikov | 1 | -0/+11
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>