mesa/mesa - The Mesa 3D Graphics Library (mirrored from https://gitlab.freedesktop.org/mesa/mesa)

Age	Commit message	Author	Files	Lines
2017-05-04	freedreno/a5xx: compute shader support	Rob Clark	1	-0/+2
	Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-22	freedreno: add support for hw accumulating queries	Rob Clark	1	-0/+2
	Some queries on a4xx and all queries on a5xx can do result accumulation on CP so we don't need to track per-tile samples. We do still need to handle pausing/resuming while switching batches (in case the query is active over multiple draws which are executed out of order). So introduce new accumulated-query helpers for these sorts of queries, since it doesn't really fit in cleanly with the original query infrastructure. Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-30	freedreno/a5xx: initial support	Rob Clark	1	-0/+27
	Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-30	freedreno: add batch-cache and batch reordering	Rob Clark	1	-0/+2
	Note that I originally also had an entry-point that would construct a key and do lookup from a pipe_surface. I ended up not needing that (yet?) but it is easy enough to re-introduce later if we need it for the blit path. For now, not enabled by default, but can be enabled (on a3xx/a4xx) with FD_MESA_DEBUG=reorder. Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-30	freedreno: introduce fd_batch	Rob Clark	1	-0/+2
	Introduce the batch object, to track a batch/submit's worth of ringbuffers and other bookkeeping. In this first step, just move the ringbuffers into batch, since that is mostly uninteresting churn. For now there is just a single batch at a time. Note that one outcome of this change is that rb's are allocated/freed on each use. But the expectation is that the bo pool in libdrm_freedreno will save us the GEM bo alloc/free which was the initial reason to implement a rb pool in gallium. The purpose of the batch is to eventually facilitate out-of-order rendering, with batches associated to framebuffer state, and tracking the dependencies on other batches. Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-04-25	freedreno/ir3: fix sin/cos	Rob Clark	1	-0/+3
	We seem to need range reduction to get sane results. Fixes glmark2 jellyfish bench, and a whole bunch of dEQP-GLES3.functional.shaders.builtin_functions.precision.{sin,cos,tan}.* v2: squashed in android build fixes from Rob Herring Signed-off-by: Rob Clark <robclark@freedesktop.org>
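	The commit message above does not say which range reduction was used. As a rough illustration only, a minimal sketch of wrapping an angle into [-pi, pi) before handing it to a sin/cos unit could look like the following. The names `reduce_range` and `floor_f` are hypothetical and are not taken from the ir3 source.

```c
/* Hypothetical helper: floor() for moderate magnitudes, written by hand
 * so that this sketch does not depend on libm. */
static float floor_f(float v)
{
    float t = (float)(long)v;      /* truncates toward zero */
    return (t > v) ? t - 1.0f : t; /* step down for negative fractions */
}

/* Hypothetical helper: wrap x (radians) into [-pi, pi).
 * Assumption: the hardware sin/cos only gives sane results near zero,
 * so large inputs are reduced by whole periods first. */
static float reduce_range(float x)
{
    const float two_pi = 6.28318530718f;
    const float inv_two_pi = 1.0f / two_pi;

    /* Express x in periods, shifted by half a period. */
    float t = x * inv_two_pi + 0.5f;
    /* Keep only the fractional part, which lies in [0, 1). */
    t = t - floor_f(t);
    /* Shift back and convert from periods to radians. */
    return (t - 0.5f) * two_pi;
}
```

	Since sin and cos are periodic, the reduced value differs from the input by a whole number of periods, so the result of the trigonometric function is unchanged up to rounding.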
2016-01-03	freedreno/ir3: refactor NIR IR handling	Rob Clark	1	-0/+1
	Immediately convert into NIR and do an initial key-agnostic lowering/optimization pass. This should let us share most of the per-variant transformations between each variant, and hopefully minimize the draw-time variant creation part of the compilation process. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21	freedreno/ir3: introduce ir3_compiler object	Rob Clark	1	-0/+1
	Right now, just provides a cleaner way to get at the gpu-id, given the separation between compiler and context. But we will need this also to hold the reg-set for new register allocation. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21	freedreno/ir3: remove tgsi f/e	Rob Clark	1	-2/+0
	Also remove ir3_flatten which was only used by tgsi f/e. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-06-21	freedreno/ir3: drop dot graph dumping	Rob Clark	1	-1/+1
	At least for now.. right now the instruction and instruction list printing should suffice, and the re-working of ir3_block would require a lot of changes in that code. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-04-17	freedreno/ir3/nir: lower if/else	Rob Clark	1	-0/+2
	For now, completely flatten if/else blocks. That will almost certainly change once we have flow control. Signed-off-by: Rob Clark <robclark@freedesktop.org>
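	To illustrate what flattening an if/else means in general (this is a generic sketch, not the ir3 or NIR implementation, and the function names are hypothetical): both sides of the branch are evaluated unconditionally and a select picks the result.

```c
/* Original form, with real control flow. */
static float branchy(int cond, float a, float b)
{
    float r;
    if (cond)
        r = a + b;
    else
        r = a * b;
    return r;
}

/* Flattened form: both sides are computed, then one is selected.
 * No branch is needed as long as neither side has side effects. */
static float flattened(int cond, float a, float b)
{
    float then_val = a + b;           /* "then" side, always executed */
    float else_val = a * b;           /* "else" side, always executed */
    return cond ? then_val : else_val; /* select on the condition */
}
```

	The trade-off is that the flattened form always pays for both sides, which is why it is described above as something likely to change once flow control is available.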
2015-04-05	freedreno/ir3: add NIR compiler	Rob Clark	1	-0/+1
	The NIR compiler frontend is an alternative to the TGSI f/e, producing the same ir3 IR and using the same backend passes for scheduling, etc. It is not enabled by default yet, as there are still some regressions. To enable, use 'FD_MESA_DEBUG=nir'. It is enough to use with, for example, xonotic or supertuxkart. With the NIR f/e, scalarizing and a number of other lowering steps happen in NIR, so we don't have to do them in ir3. Which simplifies the f/e and allows the lowered instructions to pass through other optimization stages. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-15	freedreno/ir3: remove old compiler	Rob Clark	1	-1/+0
	Now that piglit is no longer falling back to old compiler for any tests, we can remove it. Hurray \o/ Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-01-07	freedreno/ir3: simplify RA	Rob Clark	1	-2/+2
	Group inputs/outputs, in addition to fanin/fanout, as they must also exist in sequential scalar registers. This lets us simplify RA by working in terms of neighbor groups. NOTE: has the slight problem that it can't optimize out mov's for things like: MOV OUT[n], IN[m] To avoid this, instead of trying to figure out what mov's we can eliminate, we first remove all mov's prior to grouping, and then re-insert mov's as needed while grouping inputs/outputs/fanins. Eventually we'd prefer the frontend to not insert extra mov's in the first place (so we don't have to bother removing them). This is the plan for an eventual NIR based frontend, so separate out the instr grouping (which will still be needed for NIR frontend) from the mov elimination (which won't). Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-23	freedreno/ir3: split out legalize pass	Rob Clark	1	-0/+1
	Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-12-04	freedreno/a4xx: fd4_util -> fd4_format	Rob Clark	1	-2/+2
	Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-11-29	freedreno/a3xx: fd3_util -> fd3_format	Ilia Mirkin	1	-2/+2
	All the "util" helpers are actually format-related Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>
2014-11-16	freedreno: add missing headers in Makefile.sources	Emil Velikov	1	-1/+14
	... or autotools will fail to pick them up for the distribution tarball. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-11-15	freedreno: add adreno 420 support	Rob Clark	1	-0/+14
	Very initial support. Basic stuff working (es2gears, es2tri, and maybe about half of glmark2). Expect broken stuff. Still missing: mem->gmem (restore), queries, mipmaps (blob segfaults!), hw binning, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-14	freedreno: use tgsi_lowering	Rob Clark	1	-2/+0
	Now that the freedreno_lowering code is moved to tgsi_lowering, remove our private copy and switch over to using the common version. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-09-05	gallium/freedreno: ship all files in the tarball	Emil Velikov	1	-12/+63
	- include all headers in Makefile.sources
	- sort the list(s)
	- bundle the android build
	Cc: freedreno@lists.freedesktop.org Cc: Rob Clark <robclark@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>
2014-07-25	freedreno/ir3: split out shader compiler from a3xx	Rob Clark	1	-11/+14
	Move the bits we want to share between generations from fd3_program to ir3_shader. So overall structure is:
	  fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3
	                                    |- ...
	                                    \- ir3_shader_variant -> ir3
	So the ir3_shader becomes the topmost generation neutral object, which manages the set of variants each of which generates, compiles, and assembles its own ir. There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/, etc. Keep the split between the gallium level stateobj and the shader helper object because it might be a good idea to pre-compute some generation specific register values (ie. anything that is independent of linking). Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-13	freedreno/a3xx: occlusion query support	Rob Clark	1	-0/+1
	Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-13	freedreno: add support for hw queries	Rob Clark	1	-0/+1
	Real GPU queries need some infrastructure to track samples per tile and accumulate the results. But fortunately this can be shared across GPU generation. See: https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-13	freedreno/query: allow multiple query implementations	Rob Clark	1	-0/+1
	Split out fd_query into an abstract base class, to allow multiple implementations. The current sw based queries are moved into fd_sw_query. Signed-off-by: Rob Clark <robclark@freedesktop.org>
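	An "abstract base class" in C is usually a struct holding a table of function pointers, embedded as the first member of each implementation. The sketch below shows that pattern under stated assumptions: every name in it (`query`, `query_funcs`, `sw_query`, `sw_query_init`) is hypothetical and chosen for illustration, not copied from the freedreno source.

```c
#include <stdbool.h>
#include <stdint.h>

struct query;

/* Table of operations every query implementation must provide. */
struct query_funcs {
    void (*begin)(struct query *q);
    void (*end)(struct query *q);
    bool (*get_result)(struct query *q, uint64_t *result);
};

/* Abstract base: callers only ever see this type. */
struct query {
    const struct query_funcs *funcs;
};

/* One implementation: a software query that samples a CPU-side counter. */
struct sw_query {
    struct query base;       /* must be first so the downcast below is valid */
    const uint64_t *counter; /* counter being observed (assumption) */
    uint64_t begin_value;
    uint64_t end_value;
};

static void sw_begin(struct query *q)
{
    struct sw_query *sq = (struct sw_query *)q;
    sq->begin_value = *sq->counter;
}

static void sw_end(struct query *q)
{
    struct sw_query *sq = (struct sw_query *)q;
    sq->end_value = *sq->counter;
}

static bool sw_get_result(struct query *q, uint64_t *result)
{
    struct sw_query *sq = (struct sw_query *)q;
    *result = sq->end_value - sq->begin_value;
    return true; /* software results are always available immediately */
}

static const struct query_funcs sw_query_funcs = {
    .begin = sw_begin,
    .end = sw_end,
    .get_result = sw_get_result,
};

static void sw_query_init(struct sw_query *sq, const uint64_t *counter)
{
    sq->base.funcs = &sw_query_funcs;
    sq->counter = counter;
    sq->begin_value = 0;
    sq->end_value = 0;
}
```

	A hardware-backed implementation would supply its own function table and private struct while callers keep going through `q->funcs->begin(q)` and friends.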
2014-02-23	freedreno/a3xx: drop hand-coded blit/solid shaders	Rob Clark	1	-0/+1
	Instead in the common code, construct these shaders from TGSI. For now we let a2xx keep its hand coded shaders, as its compiler isn't quite up to the job yet. All the same it is a net drop in code size and gets rid of special cases. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-02-03	freedreno/a3xx/compiler: new compiler	Rob Clark	1	-0/+6
	The new compiler generates a dependency graph of instructions, including a few meta-instructions to handle PHI and preserve some extra information needed for register assignment, etc.
	The depth pass assigns a weight/depth to each node (based on sum of instruction cycles of a given node and all its dependent nodes), which is used to schedule instructions. The scheduling takes into account the minimum number of cycles/slots between dependent instructions, etc., which was something that could not be handled properly with the original compiler (which was more of a naive TGSI translator than an actual compiler).
	The register assignment is currently split out as a standalone pass. I expect that it will be replaced at some point, once I figure out what to do about relative addressing (which is currently the only thing that should cause fallback to old compiler).
	There are a couple new debug options for FD_MESA_DEBUG env var:
	  optmsgs - enable debug prints in optimizer
	  optdump - dump instruction graph in .dot format, for example:
	    http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png
	    http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot
	At this point, thanks to proper handling of instruction scheduling, the new compiler fixes a lot of things that were broken before, and does not appear to break anything that was working before[1]. So even though it is not finished, it seems useful to merge it in its current state.
	[1] Not merged in this commit, because I'm not sure if it really belongs in mesa tree, but the following commit implements a simple shader emulator, which I've used to compare the output of the new compiler to the original compiler (ie. run it on all the TGSI shaders dumped out via ST_DEBUG=tgsi with various games/apps): https://github.com/freedreno/mesa/commit/163b6306b1660e05ece2f00d264a8393d99b6f12
	Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-02-03	freedreno/a3xx/compiler: split out old compiler	Rob Clark	1	-0/+1
	For the time being, keep old compiler as fallback for things that the new compiler does not support yet. Split out as its own commit to make the later new-compiler commits easier to follow. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-02-03	freedreno/a3xx/compiler: prepare for new compiler	Rob Clark	1	-1/+1
	Shuffle things around to prepare for new compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-02-01	freedreno: add tgsi lowering pass	Rob Clark	1	-0/+1
	Currently lowers the following instructions: DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH, DP2 translating these into equivalent simpler TGSI instructions. This probably should be moved to util so other drivers can use it, but just adding under freedreno for now so that I can clear out a lot of the lowering code in a3xx compiler before beginning to add new compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>
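	As an illustration of what "translating into equivalent simpler instructions" means, the sketch below expresses two of the listed opcodes, DP3 and LRP, using only multiply and multiply-add. It models the arithmetic in plain C and is not the instruction sequence the lowering pass actually emits; `vec4`, `mul`, `mad`, `lower_dp3` and `lower_lrp` are hypothetical names.

```c
typedef struct { float x, y, z, w; } vec4;

/* Stand-ins for the simple instructions the lowered code is built from. */
static float mul(float a, float b) { return a * b; }
static float mad(float a, float b, float c) { return a * b + c; }

/* DP3: dst = a.x*b.x + a.y*b.y + a.z*b.z, as one MUL and two MADs. */
static float lower_dp3(vec4 a, vec4 b)
{
    float t = mul(a.x, b.x);
    t = mad(a.y, b.y, t);
    t = mad(a.z, b.z, t);
    return t;
}

/* LRP (per component): dst = s0*s1 + (1 - s0)*s2, as two MADs.
 * (1 - s0)*s2 is rewritten as (-s0)*s2 + s2. */
static float lower_lrp(float s0, float s1, float s2)
{
    float t = mad(-s0, s2, s2);
    return mad(s0, s1, t);
}
```

	Doing this once in a shared pass means a backend only has to implement the small set of simple instructions rather than every composite opcode.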
2014-01-08	freedreno: add basic query support	Rob Clark	1	-0/+1
	Add for now some simple/basic query support (ie. things not actually requiring the GPU). Might change around a bit when I actually add GPU queries, but for now this enables some useful performance info in the GALLIUM_HUD. For example:
	  GALLIUM_HUD=fps+batches+batches-sysmem+batches-gmem+restores,draw-calls
	The driver specific queries are:
	  + draw-calls
	  + batches - number of batches per second, sum of batches-sysmem plus batches-gmem
	  + batches-gmem - render a set of tiles in GMEM, for each tile (optionally) system mem -> gmem (restore), plus N draws, plus gmem -> system mem (resolve) per second
	  + batches-sysmem - N draws to system memory (GMEM bypass) per second
	  + restores - number of GMEM batches that required restore per second
	Ideally for GMEM rendering, you want batches-gmem to equal fps. If the app is doing something that triggers multiple passes (ie. requires extra round trip gmem <-> system memory) then the # of batches per second will go up relative to fps. Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-11-16	freedreno: compact a2xx and a3xx makefiles into parent ones	Johannes Obermayr	1	-0/+32
	Nearly everything within the three Makefile.am's is identical. Let's simplify things a little. v2: Rebase and rewrite the commit message (Emil Velikov) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-01	freedreno: consolidate C sources list into Makefile.sources	Emil Velikov	1	-0/+11
	Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>