path: root/src/gallium
Age | Commit message | Author | Files | Lines
2016-08-06 | radeonsi: add a standalone compiler amdgcn_glslc | Marek Olšák | 3 | -0/+323
This will be used by GLSL lit tests. For developers only. It shouldn't be distributable and it doesn't use the Mesa build system.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | radeonsi: add environment variable SI_FORCE_FAMILY | Marek Olšák | 1 | -0/+32
This will be used by:
    amdgcn_glslc -mcpu=[family]
It can also be used for shader-db if you want stats for a different family.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
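A minimal sketch of how an environment-variable override such as SI_FORCE_FAMILY could be resolved. This is not the radeonsi code: the `demo_*` names, the family table, and the assumption that the variable holds a lowercase family name are all hypothetical and only illustrate the idea of replacing the detected family with a forced one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical family list for illustration; not the driver's real table. */
enum demo_family {
   DEMO_FAMILY_UNKNOWN = 0,
   DEMO_FAMILY_TAHITI,
   DEMO_FAMILY_BONAIRE,
   DEMO_FAMILY_TONGA
};

struct demo_family_name {
   const char *name;
   enum demo_family family;
};

static const struct demo_family_name demo_family_names[] = {
   { "tahiti", DEMO_FAMILY_TAHITI },
   { "bonaire", DEMO_FAMILY_BONAIRE },
   { "tonga", DEMO_FAMILY_TONGA },
};

/* Return the family whose name matches exactly, or DEMO_FAMILY_UNKNOWN. */
enum demo_family demo_lookup_family(const char *name)
{
   size_t count = sizeof(demo_family_names) / sizeof(demo_family_names[0]);

   if (!name)
      return DEMO_FAMILY_UNKNOWN;

   for (size_t i = 0; i < count; i++) {
      if (strcmp(name, demo_family_names[i].name) == 0)
         return demo_family_names[i].family;
   }
   return DEMO_FAMILY_UNKNOWN;
}

/* Prefer the family named in SI_FORCE_FAMILY when it is set and recognized;
 * otherwise keep the family detected from the hardware. */
enum demo_family demo_pick_family(enum demo_family detected)
{
   assert(detected != DEMO_FAMILY_UNKNOWN);

   enum demo_family forced = demo_lookup_family(getenv("SI_FORCE_FAMILY"));
   return forced != DEMO_FAMILY_UNKNOWN ? forced : detected;
}
```

In this sketch an unset or unrecognized value silently falls back to the detected family; a real driver might prefer to report an error instead.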
2016-08-06 | winsys/radeon: implement cs_get_next_fence | Marek Olšák | 2 | -2/+29
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | winsys/amdgpu: implement cs_get_next_fence | Marek Olšák | 2 | -4/+35
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | gallium/radeon: add cs_get_next_fence winsys callback | Marek Olšák | 1 | -0/+7
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | gallium/radeon: count contexts | Marek Olšák | 2 | -0/+4
We don't wanna use unflushed fences when we have multiple contexts.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | gallium/radeon: count gfx IB flushes | Marek Olšák | 3 | -1/+3
This will be used as a counter for whether fence_finish needs to flush the IB.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
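The two counters above (contexts and gfx IB flushes) suggest a pattern that can be sketched roughly as follows. All `demo_*` types and functions are hypothetical stand-ins, not the gallium/radeon structures: a fence remembers the flush count at creation, and a later wait only has to flush the IB if that count has not moved since.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical screen and context state for illustration only. */
struct demo_screen {
   unsigned num_contexts;        /* how many contexts share the screen */
};

struct demo_context {
   unsigned num_gfx_ib_flushes;  /* incremented on every gfx IB flush */
};

struct demo_fence {
   unsigned flushes_at_creation; /* flush count when the fence was made */
};

/* Unflushed fences are only considered when a single context exists. */
bool demo_may_use_unflushed_fence(const struct demo_screen *screen)
{
   assert(screen);
   return screen->num_contexts == 1;
}

struct demo_fence demo_create_unflushed_fence(const struct demo_context *ctx)
{
   assert(ctx);
   struct demo_fence fence = { ctx->num_gfx_ib_flushes };
   return fence;
}

void demo_flush(struct demo_context *ctx)
{
   assert(ctx);
   ctx->num_gfx_ib_flushes++;
}

/* If the counter is unchanged, the IB holding the fence was never
 * submitted, so waiting on the fence must flush first. */
bool demo_fence_needs_flush(const struct demo_context *ctx,
                            const struct demo_fence *fence)
{
   assert(ctx && fence);
   return fence->flushes_at_creation == ctx->num_gfx_ib_flushes;
}
```

Comparing counters avoids tracking each IB individually, at the cost of assuming that any flush after fence creation submitted the fence's IB.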
2016-08-06 | gallium/radeon: move radeon_winsys::cs_memory_below_limit to drivers | Marek Olšák | 7 | -52/+32
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | gallium/radeon: inline radeon_winsys::query_memory_usage | Marek Olšák | 4 | -15/+1
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | gallium/radeon/winsyses: expose per-IB used_vram and used_gart to drivers | Marek Olšák | 5 | -25/+24
The following patches will use this.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | gallium/radeon/winsyses: print CS submission error number | Marek Olšák | 2 | -2/+2
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | radeonsi: flush if constant, shader, and streamout buffers use too much memory | Marek Olšák | 1 | -15/+18
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
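The "flush if ... use too much memory" commits in this series share one idea, sketched below under stated assumptions. The `demo_*` names and the limit values are hypothetical; only the per-IB `used_vram` and `used_gart` counters are named in the log. Before a buffer is added to the current IB, its size is checked against the limits, and the IB is flushed first if it would not fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-IB accounting; field names follow the log's wording. */
struct demo_ib {
   uint64_t used_vram;
   uint64_t used_gart;
   unsigned num_flushes;
};

/* Hypothetical limits; how a driver derives them is not shown here. */
struct demo_limits {
   uint64_t vram_limit;
   uint64_t gart_limit;
};

/* True if the IB can take `vram` and `gart` more bytes within the limits. */
bool demo_memory_below_limit(const struct demo_ib *ib,
                             const struct demo_limits *limits,
                             uint64_t vram, uint64_t gart)
{
   assert(ib && limits);
   return ib->used_vram + vram < limits->vram_limit &&
          ib->used_gart + gart < limits->gart_limit;
}

/* Submitting the IB releases its accounted memory. */
void demo_flush_ib(struct demo_ib *ib)
{
   assert(ib);
   ib->used_vram = 0;
   ib->used_gart = 0;
   ib->num_flushes++;
}

/* Flush first when the new buffer would push the IB over a limit. */
void demo_add_buffer(struct demo_ib *ib, const struct demo_limits *limits,
                     uint64_t vram, uint64_t gart)
{
   if (!demo_memory_below_limit(ib, limits, vram, gart))
      demo_flush_ib(ib);

   ib->used_vram += vram;
   ib->used_gart += gart;
}
```

With limits of 100 bytes each, adding two 60-byte VRAM buffers in a row triggers exactly one flush, and the second buffer is accounted to the fresh IB.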
2016-08-06 | radeonsi: flush if sampler views and images use too much memory | Marek Olšák | 2 | -19/+63
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | radeonsi: deal with high vertex buffer memory usage correctly | Marek Olšák | 3 | -3/+10
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | radeonsi: take compute shader and dispatch indirect memory usage into account | Marek Olšák | 1 | -0/+6
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | radeonsi: take scratch buffer and draw indirect memory usage into account | Marek Olšák | 1 | -0/+6
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | radeonsi: check IB memory usage of CP DMA operations | Marek Olšák | 1 | -0/+5
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 | gallium/radeon: add r600_resource::vram_usage and gart_usage | Marek Olšák | 3 | -12/+19
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-05 | util: Move format_r11g11b10f.h to src/util | Jason Ekstrand | 3 | -234/+1
It's used from both mesa main and gallium.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 | util: Move format_rgb9e5.h to src/util | Jason Ekstrand | 4 | -164/+2
It's used from both mesa main and gallium.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-04 | swr: [rasterizer core] static analysis fixes for conservative rast | Tim Rowley | 2 | -5/+10
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer core] implement InnerConservative input coverage | Tim Rowley | 6 | -182/+357
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer core] remove CanEarlyZ function | Tim Rowley | 1 | -6/+0
Test is now in SetupPipeline.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer core] use 32x32 macrotile for openswr | Tim Rowley | 1 | -4/+4
Significant performance increase (up to 2x) on high geometry workloads.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer fetch] add support for 24bit format fetch | Tim Rowley | 1 | -0/+1
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer fetch] additional fetch format support | Tim Rowley | 1 | -3/+15
Add support for 0 pitch in fetch.
Add support for USCALE/SSCALE for 32bit integer fetches.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer jitter] fix potential jit exit crash | Tim Rowley | 1 | -1/+6
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer core] update sync handling | Tim Rowley | 5 | -15/+15
Sync now uses a callback to ensure that it's called by the last thread moving past a DC. This will help with the new counter handling.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer core] rename variable | Tim Rowley | 1 | -7/+7
Avoid nested declarations of the same name within a single function.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer jitter] adjust extern "C" block scope | Tim Rowley | 1 | -3/+5
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer core] conservative rast degenerate handling | Tim Rowley | 5 | -144/+332
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 | swr: [rasterizer core] allow hexadecimal for integer knobs | Tim Rowley | 1 | -3/+6
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
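One common way to accept hexadecimal in an integer setting is to let `strtol` pick the base from the prefix by passing base 0. The sketch below shows that technique only; it is not the swr knob parser, and `demo_parse_int_knob` is a hypothetical helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse an integer setting. With base 0, strtol accepts decimal ("32"),
 * hexadecimal with a 0x prefix ("0x20"), and octal with a leading 0. */
bool demo_parse_int_knob(const char *text, long *out)
{
   assert(out);

   if (!text || *text == '\0')
      return false;

   char *end = NULL;
   errno = 0;
   long value = strtol(text, &end, 0);

   /* Reject out-of-range values, empty parses, and trailing garbage. */
   if (errno != 0 || end == text || *end != '\0')
      return false;

   *out = value;
   return true;
}
```

A side effect of base 0 worth knowing: a value such as "010" is read as octal 8, not decimal 10.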
2016-08-04 | vc4: Move scalarizing and some lowering to link time. | Eric Anholt | 1 | -5/+12
This works out to be a wash in terms of memory usage: We use more memory to store the separate ALU instructions, but we optimize out a lot of code as well. The main result, though, is that we do more of our work at link time rather than draw time.
2016-08-04 | vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far. | Eric Anholt | 3 | -25/+81
We don't want to bake the whole array into the FS key, because of the hashing overhead. But we can keep a set of the arrays seen, and use a pointer to the copy in as the array's proxy. Between this and the previous patch, gl-1.0-blend-func now passes on hardware, where previously it was filling the 256MB CMA area with shaders and OOMing. Drops 712 shaders from shader-db.
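The "set of arrays seen, pointer as proxy" idea can be sketched as interning: identical input arrays map to one stored copy, so a shader key can hold and compare a single pointer instead of hashing the whole array. Everything named `demo_*` below is hypothetical, and the linear search stands in for whatever set structure the driver actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_INPUTS 8
#define DEMO_MAX_SETS 64

/* One stored copy of an input-slot array. */
struct demo_fs_inputs {
   unsigned num_inputs;
   uint8_t slots[DEMO_MAX_INPUTS];
};

/* Fixed-size storage, so pointers to entries stay valid. */
struct demo_inputs_set {
   struct demo_fs_inputs entries[DEMO_MAX_SETS];
   unsigned count;
};

/* Return the canonical copy of `slots`, adding it on first sight.
 * Returns NULL when the set is full. */
const struct demo_fs_inputs *
demo_intern_inputs(struct demo_inputs_set *set,
                   const uint8_t *slots, unsigned num_inputs)
{
   assert(set && slots);
   assert(num_inputs <= DEMO_MAX_INPUTS);

   for (unsigned i = 0; i < set->count; i++) {
      const struct demo_fs_inputs *entry = &set->entries[i];
      if (entry->num_inputs == num_inputs &&
          memcmp(entry->slots, slots, num_inputs) == 0)
         return entry;
   }

   if (set->count == DEMO_MAX_SETS)
      return NULL;

   struct demo_fs_inputs *entry = &set->entries[set->count++];
   entry->num_inputs = num_inputs;
   memcpy(entry->slots, slots, num_inputs);
   return entry;
}
```

Two arrays with the same contents intern to the same pointer, so a key comparison reduces to pointer equality and the key stays small.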
2016-08-04 | vc4: Don't recompile the CS when the FS changes. | Eric Anholt | 1 | -0/+2
The compiled_fs_id is a proxy for the vc4->prog.fs->input_slots[], but only the VS dereferences it. Drops 754 shaders from shader-db.
2016-08-04 | vc4: Move FS inputs setup out to a helper function. | Eric Anholt | 1 | -34/+41
It's a pretty big block, and I was about to make it bigger.
2016-08-04 | vl/dri3: Destroy Present event context when destroying drawable v2 | Michel Dänzer | 1 | -5/+16
Without this, the X server may accumulate stale Present event contexts if a client performs several video decoding sessions using the same window.

v2: Based on Chris Wilson's review:
* Use xcb_discard_reply() instead of free(xcb_request_check())

Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
2016-08-03 | vc4: Avoid generating a custom shader per level in glGenerateMipmaps(). | Eric Anholt | 3 | -7/+25
We were baking in the LOD of the source level to each shader. Instead, pass it in as a uniform -- this requires storing it to a temp register, but that's better than compiling a ton of separate shaders:

total instructions in shared programs: 115032 -> 115036 (0.00%)
instructions in affected programs: 96 -> 100 (4.17%)
LOST: 572
2016-08-03 | vc4: Tell valgrind about BO allocations from mmap time to destroy. | Eric Anholt | 2 | -0/+11
This helps in debugging memory pressure. It would be nice if we could tell valgrind about it all the way from allocation time to destroy, but we need a pointer to hand to VALGRIND_MALLOCLIKE_BLOCK.
2016-08-03 | vc4: Fix a leak of the src[] array of VPM reads in optimization. | Eric Anholt | 1 | -4/+5
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-08-03 | vc4: Fix leak of the bo_handles table. | Eric Anholt | 1 | -0/+1
2016-08-03 | vc4: Fix handling of UBO range offsets. | Eric Anholt | 1 | -2/+3
The ranges are in units of bytes, not dwords. This wasn't caught by piglit tests because ttn tends to make one big uniform file, so we only had one UBO range with a src and dst offset of 0.
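The bytes-versus-dwords mix-up described above is easy to reproduce with pointer arithmetic: adding a byte offset to a `uint32_t *` advances four times too far. The sketch below shows the byte-correct form; `demo_ubo_range` and its field names are hypothetical, not the vc4 structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical range descriptor; all three fields are in bytes. */
struct demo_ubo_range {
   uint32_t src_offset;
   uint32_t dst_offset;
   uint32_t size;
};

/* Copy one range between dword buffers. The offsets are byte counts,
 * so the arithmetic is done on byte pointers rather than on uint32_t
 * pointers, which would scale each offset by 4. */
void demo_copy_ubo_range(uint32_t *dst, const uint32_t *src,
                         const struct demo_ubo_range *range)
{
   assert(dst && src && range);

   memcpy((uint8_t *)dst + range->dst_offset,
          (const uint8_t *)src + range->src_offset,
          range->size);
}
```

With both offsets at 0 the buggy and correct forms behave identically, which matches the note that a single range starting at 0 hid the problem from the tests.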
2016-08-03 | vc4: Dump NIR at shader state creation time as well. | Eric Anholt | 1 | -0/+8
I keep wanting to see this version of the NIR.
2016-08-03 | r600g: use last_gfx_fence like radeonsi | Marek Olšák | 1 | -3/+12
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-03 | gallium/radeon: move last_gfx_fence from radeonsi to common code | Marek Olšák | 5 | -7/+7
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-03 | radeonsi: skip unnecessary si_update_shaders calls | Marek Olšák | 4 | -7/+27
Small decrease in draw call overhead.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-03 | radeonsi: print the command line to VM fault reports (v2) | Marek Olšák | 1 | -0/+3
v2: rebase on top of Brian's commit
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-03 | ddebug: print the command line to all logs (v2) | Marek Olšák | 1 | -0/+4
for piglit with the pipelined hang detection mode

v2: rebase on top of Brian's commit
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-03 | ddebug: don't use fmemopen on non-Linux OS | Marek Olšák | 1 | -0/+5
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97140
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-03 | radeonsi: don't set the last parameter component of llvm.AMDGPU.cube | Marek Olšák | 1 | -2/+8
LLVM doesn't use it.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>