path: root/src/gallium
Age | Commit message | Author | Files | Lines
2015-10-23 | virgl: add driver for virtio-gpu 3D (v2) | Dave Airlie | 28 | -0/+5918
virgl is the 3D acceleration backend for the virtio-gpu shipping with qemu. The 3D acceleration is designed around gallium and TGSI as the virtualisation layer. The backend renderer currently translates the virgl interface into OpenGL.

This is the initial import of the driver to mesa. The kernel driver portions are lined up for drm-next.

Currently this driver supports up to GL3.3 and some misc extensions if the host driver exposes them. It is planned to iterate the virgl API to new GL levels as mesa host drivers gain features.

v2:
  fix resource tracking across flushes to avoid ->bind hack in mapping.
  consolidate mapping and waiting code for transfers.
  use u_range for dirty tracking.
  handle larger shaders in protocol.
  include virtgpu_drm.h in mesa for now.
  add translation layer for gallium tgsi to virgl tgsi.

Signed-off-by: Dave Airlie <>
2015-10-23 | tgsi: try and handle overflowing shaders. (v2) | Dave Airlie | 2 | -3/+9
This is used to detect error in virgl if we overflow the shader dumping buffers. v2: return a bool. Reviewed-by: Marek Olšák <> Signed-off-by: Dave Airlie <>
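As a rough illustration of reporting a dump-buffer overflow through a bool, the sketch below writes formatted text into a fixed buffer and remembers whether anything was truncated. The `text_sink` struct and both function names are hypothetical; this is not the actual TGSI dump code.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded text sink; not the real TGSI dump context. */
struct text_sink {
   char *ptr;      /* next write position */
   size_t left;    /* bytes remaining, including the terminating NUL */
   bool overflow;  /* set once any write did not fit */
};

static void sink_printf(struct text_sink *s, const char *fmt, ...)
{
   va_list ap;
   int written;

   if (s->overflow)
      return;

   va_start(ap, fmt);
   written = vsnprintf(s->ptr, s->left, fmt, ap);
   va_end(ap);

   /* vsnprintf returns the length it wanted to write. */
   if (written < 0 || (size_t)written >= s->left) {
      s->overflow = true;
      return;
   }
   s->ptr += written;
   s->left -= (size_t)written;
}

/* Returns true if the whole dump fit, false if the buffer overflowed. */
static bool dump_two_lines(char *buf, size_t size)
{
   struct text_sink s = { buf, size, false };

   sink_printf(&s, "FRAG\n");
   sink_printf(&s, "MOV OUT[0], CONST[%d]\n", 0);
   return !s.overflow;
}
```

A caller can then treat a `false` return as an error instead of sending a truncated shader.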
2015-10-23 | tgsi: add option to dump floats as hex values | Dave Airlie | 3 | -2/+30
This adds support to the parser to accept hex values as floats, and then adds support to the dumper to allow the user to select to dump float as 32-bit hex numbers. This is required to get accurate values for virgl use of TGSI. Reviewed-by: Marek Olšák <> Signed-off-by: Dave Airlie <>
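The reason hex output gives accurate values is that it carries the exact 32-bit pattern of the float, so nothing is lost in a decimal round trip. A minimal sketch of that idea follows; the helper names and the `0x%08x` format are assumptions for illustration, not the TGSI dumper's actual syntax.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern (memcpy avoids aliasing issues). */
static uint32_t float_to_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof u);
   return u;
}

/* Inverse of float_to_bits(). */
static float bits_to_float(uint32_t u)
{
   float f;
   memcpy(&f, &u, sizeof f);
   return f;
}

/* Format the float as an 8-digit hex literal; returns snprintf's result. */
static int format_float_hex(char *buf, size_t size, float f)
{
   return snprintf(buf, size, "0x%08x", (unsigned)float_to_bits(f));
}
```

Parsing the hex text back with `bits_to_float()` reproduces the original value bit for bit, which a decimal dump with limited digits does not guarantee.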
2015-10-22 | svga: Condition preemptive flush on draw emission | Sinclair Yeh | 4 | -5/+25
On ultra high resolution modes, the preemptive flush flag can be set midway through command submission, a condition that a flush-retry cannot recover from, causing rendering artifacts. This patch prevents a preemptive flush until a draw has been emitted. Signed-off-by: Sinclair Yeh <> Reviewed-by: Thomas Hellstrom <> Reviewed-by: Charmaine Lee <> Reviewed-by: Brian Paul <>
2015-10-22 | svga: try to avoid index generation for some primitive types | Brian Paul | 1 | -0/+14
The svga device doesn't directly support quads, quad strips or polygons so we have to convert those types to indexed triangle lists. But we can sometimes avoid that if we're drawing flat/constant-colored prims and we don't have to worry about provoking vertex. Reviewed-by: Charmaine Lee <> Reviewed-by: José Fonseca <>
2015-10-22 | svga: avoid provoking vertex conversion when possible | Brian Paul | 1 | -1/+14
Provoking vertex comes into play when doing flat shading. But if we know that all fragments in a primitive are the same color, the provoking vertex doesn't matter. Check for that case and use whichever provoking vertex convention is supported by the device. This avoids generating an index buffer to do the PV conversion. Reviewed-by: Charmaine Lee <> Reviewed-by: José Fonseca <>
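The decision described above can be sketched as a small predicate. All names here (`pv_mode`, `need_pv_conversion`, the `hw_*` flags) are hypothetical and only model the reasoning; the svga driver's real state tracking is more involved.

```c
#include <stdbool.h>

/* Hypothetical provoking-vertex conventions. */
enum pv_mode { PV_FIRST, PV_LAST };

/* Returns true if an index buffer must be generated to move the
 * provoking vertex to the position the device expects. */
static bool need_pv_conversion(bool flat_shaded, bool constant_color,
                               enum pv_mode api_pv,
                               bool hw_first, bool hw_last)
{
   /* Without flat shading, or with one color for every fragment,
    * the provoking vertex cannot change the result. */
   if (!flat_shaded || constant_color)
      return false;

   /* Otherwise conversion is needed only if the device lacks the
    * convention the API asked for. */
   return api_pv == PV_FIRST ? !hw_first : !hw_last;
}
```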
2015-10-22 | svga: detect constant color writes in fragment shaders | Brian Paul | 5 | -2/+77
Examine the fragment shader to try to detect TGSI shaders which use "MOV OUT[0], CONST[i]" to write a constant value for the fragment color. In this case, all fragments will have the same color (unless blending is enabled). This is a common case for OpenGL code such as: glColor(), glBegin(), glVertex(), ..., glEnd() when lighting/fog/etc are disabled. In this case, the Mesa/gallium state tracker actually generates a simple "MOV OUT[0], CONST[i]" fragment shader. This will be used by the next commit to avoid provoking vertex conversion (creating/rewriting an index buffer) when drawing flat-shaded primitives. Reviewed-by: Charmaine Lee <> Reviewed-by: José Fonseca <>
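A minimal sketch of this kind of scan, assuming a heavily simplified instruction record: the `instr` struct and the enums below are hypothetical stand-ins, since real TGSI tokens carry writemasks, swizzles and more.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical instruction record. */
enum opcode { OP_MOV, OP_MUL, OP_TEX };
enum reg_file { FILE_CONST, FILE_INPUT, FILE_TEMP, FILE_OUTPUT };

struct instr {
   enum opcode op;
   enum reg_file dst_file;
   int dst_index;
   enum reg_file src_file;
   int src_index;
};

/* True if OUT[0] is written, and only by "MOV OUT[0], CONST[i]". */
static bool writes_constant_color(const struct instr *insts, size_t n)
{
   bool found = false;
   size_t i;

   for (i = 0; i < n; i++) {
      /* Ignore instructions that do not write the color output. */
      if (insts[i].dst_file != FILE_OUTPUT || insts[i].dst_index != 0)
         continue;
      /* Any other kind of write to OUT[0] disqualifies the shader. */
      if (insts[i].op != OP_MOV || insts[i].src_file != FILE_CONST)
         return false;
      found = true;
   }
   return found;
}
```

As the commit message notes, a positive result only implies identical fragment colors when blending is disabled, so a caller would still have to check the blend state.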
2015-10-22 | radeon/uvd: don't expose HEVC on old UVD hw (v3) | Alex Deucher | 1 | -32/+18
The section for UVD 2 and older was not updated when HEVC support was added. Reported by Kano on irc. v2: integrate the UVD2 and older checks into the main switch statement. v3: handle encode checking as well. Encode is already checked in the top case statement, so drop encode checks in the lower case statement. Reviewed-by: Christian König <> Signed-off-by: Alex Deucher <> Cc:
2015-10-22 | gallivm: Translate all util_cpu_caps bits to LLVM attributes. | Jose Fonseca | 1 | -2/+34
This should prevent disparity between features Mesa and LLVM believe are supported by the CPU. Tested on an i7-3720QM w/ LLVM 3.3 and 3.6. v2: Increase SmallVector initial size as suggested by Gustaw Smolarczyk. Reviewed-by: Roland Scheidegger <> CC: "10.6 11.0" <>
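To illustrate the idea of stating every capability explicitly, the sketch below turns a few detected CPU flags into an LLVM-style feature list, where `+name` enables and `-name` disables a feature. The `cpu_caps` struct and `build_mattrs()` are hypothetical, and only three features are shown.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of detected CPU capabilities. */
struct cpu_caps {
   int has_sse4_1;
   int has_avx;
   int has_avx2;
};

/* Writes a list such as "+sse4.1,-avx,-avx2" into buf.
 * Returns the string length, or -1 if it did not fit. */
static int build_mattrs(const struct cpu_caps *caps, char *buf, size_t size)
{
   int n = snprintf(buf, size, "%csse4.1,%cavx,%cavx2",
                    caps->has_sse4_1 ? '+' : '-',
                    caps->has_avx ? '+' : '-',
                    caps->has_avx2 ? '+' : '-');

   return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Emitting the negative entries as well as the positive ones is what keeps the code generator from assuming a feature that the runtime detection rejected.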
2015-10-22 | ilo: make sure there is HiZ before resolving | Chia-I Wu | 1 | -2/+4
We do not want to perform a depth resolve on an MCS enabled surface.
2015-10-22 | ilo: fix max thread count for HS on Gen8 | Chia-I Wu | 1 | -3/+5
It is in DW2 on Gen8.
2015-10-21 | svga: fix clip plane regression after recent tgsi_scan change | Brian Paul | 1 | -2/+2
Before the change "tgsi/scan: use properties for clip/cull distance writemasks", the tgsi_shader_info::num_written_clipdistance field was a multiple of four, now it's an accurate count. In the svga driver, we need a minor change to the loop test. Reviewed-by: Charmaine Lee <>
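Once the field holds an exact scalar count rather than a multiple of four, any code that iterates per vec4 register has to round up. One plausible shape of that arithmetic is sketched below; the helper name is hypothetical and the actual svga loop test may be written differently.

```c
/* Number of vec4 registers needed to hold `count` scalar clip
 * distances: round up to the next multiple of four, then divide. */
static unsigned clip_dist_vec4_count(unsigned count)
{
   return (count + 3) / 4;
}
```

With the old multiple-of-four values a plain `count / 4` gave the same answer, which is why the difference only shows up after the tgsi_scan change.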
2015-10-21 | osmesa: Expose GL entry points for Windows build via DEF file. | Nigel Stewart | 2 | -0/+674
Bugzilla: CC: "10.6 11.0" <> Signed-off-by: Jose Fonseca <>
2015-10-20 | svga: add switch case for PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT | Brian Paul | 1 | -0/+2
A third instance of this was needed but missed in the previous commit. Return 32 as for the two other cases. Reviewed-by: Roland Scheidegger <> Reviewed-by: Charmaine Lee <>
2015-10-20 | draw: fix splitting of line loops (v2) | Brian Paul | 4 | -8/+32
When the draw module splits long line loops, the sections are emitted as line strips. But the primitive type wasn't set correctly so each section was being drawn as a loop, introducing extra line segments. To fix this, we pass a new DRAW_LINE_LOOP_AS_STRIP flag to the run() function. The linear/elt_run() functions have to check for this flag and set their primitive type accordingly. No piglit regressions. Fixes piglit's lineloop with -count 4097 or higher. Bugzilla: Reviewed-by: Roland Scheidegger <>
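The segment counting behind this fix can be checked with a small model: a loop of n vertices has exactly n segments, and splitting it into overlapping strips plus one closing segment preserves that count. The function and its `max_verts` parameter are hypothetical; the draw module's real split logic is not reproduced here.

```c
#include <assert.h>

/* Count the line segments produced when a line loop of n vertices is
 * split into sections of at most max_verts vertices, each section is
 * drawn as a line strip, and one final segment closes the loop. */
static unsigned split_loop_segments(unsigned n, unsigned max_verts)
{
   unsigned segments = 0;
   unsigned start = 0;

   assert(n >= 2 && max_verts >= 2);

   while (start + 1 < n) {
      unsigned count = n - start;

      if (count > max_verts)
         count = max_verts;

      segments += count - 1;   /* a strip of count vertices */
      start += count - 1;      /* next section reuses the last vertex */
   }

   return segments + 1;        /* last vertex back to vertex 0 */
}
```

If each section were drawn as a loop instead, every section would gain its own closing segment, which matches the extra lines described in the commit message.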
2015-10-20 | gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT | Marek Olšák | 15 | -1/+42
This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now. I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver. Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that. Bugzilla: Cc: 11.0 10.6 <> Reviewed-by: Brian Paul <>
2015-10-20 | vc4: Switch our vertex attr lowering to being NIR-based. | Eric Anholt | 2 | -143/+200
This exposes more information to NIR's optimization, and should be particularly useful when we do range-based optimization.

total uniforms in shared programs: 32066 -> 32065 (-0.00%)
uniforms in affected programs: 21 -> 20 (-4.76%)
total instructions in shared programs: 93104 -> 92630 (-0.51%)
instructions in affected programs: 31901 -> 31427 (-1.49%)
2015-10-20 | vc4: Add limited support for ibfe/ubfe. | Eric Anholt | 1 | -0/+42
This is just enough to cover our unpack modes, which will be used by some new NIR-based lowering in the next commit.
2015-10-20 | tgsi/scan: use properties for clip/cull distance writemasks | Marek Olšák | 1 | -14/+14
No changes needed for drivers already relying on tgsi_shader_info. Reviewed-by: Brian Paul <>
2015-10-20 | gallium: add new properties for clip and cull distance usage | Marek Olšák | 3 | -1/+15
The TGSI usage mask can't be used, because these are declared as an output array of 2 elements. Reviewed-by: Ilia Mirkin <> Reviewed-by: Brian Paul <>
2015-10-20 | radeonsi: enable BC_OPTIMIZE if centroid isn't used | Marek Olšák | 1 | -1/+5
This solution was recommended by a Catalyst developer. Reviewed-by: Michel Dänzer <>
2015-10-20 | radeonsi: fix the export_prim_id field size in the shader key | Marek Olšák | 1 | -2/+2
Reviewed-by: Michel Dänzer <>
2015-10-20 | radeonsi: support thread-safe shaders shared by multiple contexts | Marek Olšák | 9 | -199/+224
The "current" shader pointer is moved from the CSO to the context, so that the CSO is mostly immutable. The only drawback is that the "current" pointer isn't saved when unbinding a shader and it must be looked up when the shader is bound again. This is also a prerequisite for multithreaded shader compilation. Reviewed-by: Michel Dänzer <>
2015-10-20 | gallium: add PIPE_CAP_SHAREABLE_SHADERS | Marek Olšák | 15 | -0/+16
I'll let drivers figure out how to do it. Reviewed-by: Ilia Mirkin <>
2015-10-20 | radeonsi: add support for ARB_texture_view | Marek Olšák | 2 | -7/+22
All tests pass. We don't need to do much - just set CUBE if the view target is CUBE or CUBE_ARRAY, otherwise set the resource target. The reason this can be so simple is that texture instructions have a greater effect on the target than the sampler view. Thanks Glenn for the piglit test. Reviewed-by: Michel Dänzer <>
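The target selection rule stated above is short enough to write out directly. The enum and function below are hypothetical stand-ins for the driver's own types, shown only to make the rule concrete.

```c
/* Hypothetical texture targets; the driver uses gallium's own enum. */
enum tex_target { TEX_2D, TEX_2D_ARRAY, TEX_CUBE, TEX_CUBE_ARRAY };

/* Pick the target programmed into the sampler view: CUBE when the
 * view is a cube or cube array, otherwise the resource's target. */
static enum tex_target view_hw_target(enum tex_target resource_target,
                                      enum tex_target view_target)
{
   if (view_target == TEX_CUBE || view_target == TEX_CUBE_ARRAY)
      return TEX_CUBE;

   return resource_target;
}
```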
2015-10-20 | vc4: Use nir_foreach_variable | Boyan Ding | 3 | -7/+7
Signed-off-by: Boyan Ding <> Reviewed-by: Eric Anholt <>
2015-10-19 | st/omx/dec/h264: fix field picture type 0 poc disorder | Leo Liu | 1 | -4/+8
Signed-off-by: Leo Liu <> Reviewed-by: Christian König <> Cc: "10.6 11.0" <>
2015-10-19 | scons: Build nir/glsl_types.cpp once. | Jose Fonseca | 6 | -27/+2
Undoes early hacks, and ensures nir/glsl_types.cpp is built once, and only once. The root problem is that SCons doesn't know about NIR nor any source file in the NIR_FILES source list. Tested with libgl-gdi and libgl-xlib scons targets. Reviewed-by: Brian Paul <>
2015-10-19 | svga: fix incorrect round-down arithmetic | Brian Paul | 1 | -1/+1
Spotted by Roland. Luckily, this code should never really be hit since the const buffer size and offset should already be multiples of 16. I could probably add more assertions to that effect, but let's just fix the arithmetic for now. Reviewed-by: Roland Scheidegger <>
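For reference, rounding down to a multiple of 16 is usually written with a mask of the low bits. The sketch below is a generic helper with a hypothetical name; the commit message does not say what the original mistake was, so this shows only the intended arithmetic.

```c
#include <assert.h>

/* Round value down to a multiple of alignment, which must be a
 * non-zero power of two (16 for constant buffer offsets and sizes). */
static unsigned round_down_pow2(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

   /* Clearing the low bits drops the remainder. */
   return value & ~(alignment - 1);
}
```

For inputs that are already multiples of 16, as the message expects, the function returns them unchanged.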
2015-10-19 | st/va: Added support for NV12 to IYUV conversion in vlVaGetImage | Indrajit Das | 1 | -3/+5
Reviewed-by: Christian König <>
2015-10-19 | st/va: Used correct parameter to derive the value of the "h" variable in vlVaCreateImage | Indrajit Das | 1 | -1/+1
Cc: "11.0" <> Reviewed-by: Christian König <> Reviewed-by: Emil Velikov <>
2015-10-18 | ilo: set VME for 3DSTATE_PS | Chia-I Wu | 1 | -1/+6
When the bit is not set, we can see sampling artifacts on triangle edges when the mip filter is not GEN6_MIPFILTER_NONE.
2015-10-18 | ilo: ignore prefer_linear_threshold when zero | Chia-I Wu | 2 | -3/+3
This was the intended behavior but it did not work as intended until now.
2015-10-18 | ilo: remove some unused kernel params | Chia-I Wu | 2 | -22/+0
2015-10-18 | ilo: remove unused ilo_shader_get_type() | Chia-I Wu | 2 | -12/+0
2015-10-18 | ilo: remove u_debug.h inclusion from ilo_core.h | Chia-I Wu | 2 | -1/+2
Move it to ilo_debug.h.
2015-10-18 | ilo: remove u_memory.h inclusion from ilo_core.h | Chia-I Wu | 3 | -1/+3
We do not make allocations generally in the core.
2015-10-18 | nvc0: do not bind input params at compute state init on Fermi | Samuel Pitoiset | 1 | -8/+0
It looks like binding a constant buffer on compute overwrites the 3D state. To avoid that, we already re-bind all the 3D constant buffers after launching a compute grid but this is not enough. Binding the constant buffer of input parameters for the compute state at initialization corrupts the 3D constant buffers, and it's just useless to bind it because this is not needed until we really launch a grid. This fixes some piglit regressions related to interpolation tests introduced in "nvc0: enable compute support by default on Fermi". Fixes: 00d6186 (nvc0: enable compute support by default on Fermi) Signed-off-by: Samuel Pitoiset <> Reviewed-by: Ilia Mirkin <>
2015-10-17 | radeonsi: don't use the AMDGPU intrinsic for CMP | Marek Olšák | 1 | -9/+22
No difference according to shader-db. Reviewed-by: Michel Dänzer <> Reviewed-by: Tom Stellard <>
2015-10-17 | radeonsi: use LRP from gallivm | Marek Olšák | 1 | -2/+0
Totals:
SGPRS: 344552 -> 344368 (-0.05 %)
VGPRS: 197132 -> 197552 (0.21 %)
Code Size: 7375376 -> 7366304 (-0.12 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1679360 -> 1615872 (-3.78 %) bytes per wave

Totals from affected shaders:
SGPRS: 47736 -> 47552 (-0.39 %)
VGPRS: 27952 -> 28372 (1.50 %)
Code Size: 1392724 -> 1383652 (-0.65 %) bytes
LDS: 39 -> 39 (0.00 %) blocks
Scratch: 513024 -> 449536 (-12.38 %) bytes per wave

Reviewed-by: Michel Dänzer <>
2015-10-17 | radeonsi: don't emit AMDGPU intrinsics for integer abs, min, max | Marek Olšák | 1 | -10/+50
No difference according to shader-db. (with the new S_ABS_I32 pattern) Reviewed-by: Michel Dänzer <> Reviewed-by: Tom Stellard <>
2015-10-17 | radeonsi: don't emit AMDGPU intrinsics for EX2, ROUND, TRUNC | Marek Olšák | 1 | -3/+3
No difference according to shader-db. Reviewed-by: Michel Dänzer <> Reviewed-by: Tom Stellard <>
2015-10-17 | radeonsi: initialize output, temp, and address registers to "undef" | Marek Olšák | 1 | -4/+15
This removes "v_mov v0, 0" which typically occurs before exports.

Totals:
SGPRS: 345216 -> 344552 (-0.19 %)
VGPRS: 197684 -> 197132 (-0.28 %)
Code Size: 7390408 -> 7375376 (-0.20 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1842176 -> 1679360 (-8.84 %) bytes per wave

Totals from affected shaders:
SGPRS: 101336 -> 100672 (-0.66 %)
VGPRS: 53920 -> 53368 (-1.02 %)
Code Size: 2170176 -> 2155144 (-0.69 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Scratch: 1015808 -> 852992 (-16.03 %) bytes per wave

Reviewed-by: Michel Dänzer <> Reviewed-by: Tom Stellard <>
2015-10-17 | gallivm: implement the correct version of LRP | Marek Olšák | 1 | -6/+13
The previous version has precision issues. This can be a problem with tessellation. Sadly, I can't find the article where I read it anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert this. v2: added the comment
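The kind of precision issue at stake can be seen by comparing two algebraically equal ways to interpolate. TGSI defines LRP as `src0 * src1 + (1 - src0) * src2`; the one-multiply shortcut below is a common alternative, shown here as an assumption about what a "previous version" might look like rather than a copy of the old gallivm code. Function names are hypothetical.

```c
/* Definition form: t*a + (1 - t)*b. At t == 1 the b term is
 * multiplied by exactly zero, so the result is exactly a. */
static float lrp_precise(float t, float a, float b)
{
   return t * a + (1.0f - t) * b;
}

/* Shortcut form: b + t*(a - b). One multiply fewer, but a - b is
 * rounded first, so the endpoint a may not be reproduced exactly. */
static float lrp_fast(float t, float a, float b)
{
   return b + t * (a - b);
}
```

For example, with `a = 1.0f`, `b = 1e8f` and `t = 1.0f`, single-precision rounding of `a - b` typically makes the shortcut return `0.0f`, while the definition form returns `1.0f`. Mismatched endpoints like this are a known source of cracks between neighbouring tessellated patches, which is consistent with the tessellation concern in the message.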
2015-10-17 | gallivm: set correct opcode info from unary/binary/ternary emits | Marek Olšák | 1 | -3/+6
and clear the emit_data structure. The new radeonsi min/max opcode implementation requires this. (it looks good according to Roland S.)
2015-10-17 | radeonsi: implement vertex color clamping | Marek Olšák | 5 | -4/+52
This is only supported in the compatibility profile (without GS and tess). Reviewed-by: Michel Dänzer <>
2015-10-17 | radeonsi: implement fragment color clamping | Marek Olšák | 6 | -2/+18
using the shader key for now. Reviewed-by: Michel Dänzer <>
2015-10-17 | radeonsi: clean up other scratch buffer functions | Marek Olšák | 1 | -15/+8
Reviewed-by: Michel Dänzer <>
2015-10-17 | radeonsi: clean up copy-pasted scratch buffer updates | Marek Olšák | 1 | -26/+13
Reviewed-by: Michel Dänzer <>
2015-10-17 | radeonsi: unify shader create functions | Marek Olšák | 1 | -40/+9
The shader specifies the processor type, so use that instead. Reviewed-by: Michel Dänzer <>