Age | Commit message | Author | Files | Lines
73 min. | freedreno/ci/a306: split off snorm blending failures (HEAD, main) | Ilia Mirkin | 1 | -35/+37
The hardware doesn't support this. Signed-off-by: Ilia Mirkin <> Part-of: <>
73 min. | freedreno/ci/a306: split off the f32 blend / texturing failures | Ilia Mirkin | 1 | -57/+60
The hardware doesn't support this. Signed-off-by: Ilia Mirkin <> Part-of: <>
73 min. | freedreno/ci/a306: separate msaa fails | Ilia Mirkin | 1 | -201/+204
The driver does not implement MSAA. Once it does, these can be split up further. Signed-off-by: Ilia Mirkin <> Part-of: <>
109 min. | Use TLS context/dispatch with shared-glapi | Jesse Natalie | 2 | -34/+44
However they have to be called via _glapi_get_dispatch/context. This would be safe to do on any platform, but the extra indirection is only necessary on Windows since TLS vars can't be exported from a DLL. Reviewed-by: Emma Anholt <> Part-of: <>
2 hours | freedreno/a3xx: add some legacy formats | Ilia Mirkin | 2 | -28/+17
These can be used in "legacy" buffer textures. Signed-off-by: Ilia Mirkin <> Part-of: <>
2 hours | freedreno/ci/a306: add additional skip which hangchecks | Ilia Mirkin | 1 | -0/+4
I was having trouble getting a run to complete without this. Was working earlier, not sure what changed. Signed-off-by: Ilia Mirkin <> Part-of: <>
3 hours | freedreno/a6xx: Set the tess BO ptrs in the program stateobj. | Emma Anholt | 2 | -25/+27
Saves some draw-time work for tess. Part-of: <>
3 hours | freedreno/a6xx: Skip emitting tess BO pointers past the shader's constlen. | Emma Anholt | 1 | -2/+5
Some shaders don't want these pointers, and going past the constlen would potentially overwrite consts from other draws. This is a port of a fix from turnip. Part-of: <>
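The guard this commit describes can be sketched in C as follows. This is only an illustration under stated assumptions, not the freedreno code: the `const_file` struct, the `emit_ptrs_within_constlen` helper, the 16-slot file size, and the vec4-unit bookkeeping are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a const file: 16 vec4 slots of 4 dwords each. */
#define CONST_FILE_VEC4S 16

struct const_file {
   uint32_t dwords[CONST_FILE_VEC4S * 4];
};

/* Write num_dwords of pointer data starting at vec4 slot base_vec4, but only
 * if the whole write fits inside the shader's constlen (in vec4 units).
 * Returns true if the data was written, false if it was skipped. */
static bool
emit_ptrs_within_constlen(struct const_file *cf, uint32_t constlen_vec4,
                          uint32_t base_vec4, const uint32_t *ptrs,
                          uint32_t num_dwords)
{
   uint32_t num_vec4s = (num_dwords + 3) / 4;

   assert(constlen_vec4 <= CONST_FILE_VEC4S);

   /* Skip the write when it would land past the shader's constlen, since
    * those slots may hold consts belonging to other draws. */
   if (base_vec4 + num_vec4s > constlen_vec4)
      return false;

   memcpy(&cf->dwords[base_vec4 * 4], ptrs, num_dwords * sizeof(uint32_t));
   return true;
}
```

A shader whose constlen ends before the pointer slot leaves the const file untouched, which is the behaviour the commit message asks for.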
3 hours | freedreno/a6xx: Allocate a fixed-size tess factor BO. | Emma Anholt | 10 | -83/+48
Saves per-batch allocations, avoids reallocation for various vertex counts, and avoids needing the indirect tess addrs constobj so that we could emit the relocs to the tess BO after we'd emitted all the draws. Also apparently it fixes one of our CTS fails. Part-of: <>
4 hours | radv: Don't emit framebuffer state if there is no renderpass active. | Bas Nieuwenhuizen | 1 | -1/+1
The framebuffer state could still be dirty from when the previous renderpass was bound. Fixes: 5632359959f ("radv: Remove the skipping of framebuffer emission if we don't have a framebuffer.") Closes: Reviewed-by: Samuel Pitoiset <> Part-of: <>
5 hours | d3d12: Support compat level 330 | Jesse Natalie | 3 | -166/+11
Reviewed-by: Bill Kristiansen <> Part-of: <>
5 hours | venus: ignore framebuffer for VkCommandBuffer executed outside of render pass | Ryan Neph | 1 | -5/+17
The Vulkan spec states[1]:

> If the VkCommandBuffer will not be executed within a render pass instance,
> or if the render pass instance was begun with vkCmdBeginRenderingKHR,
> renderPass, subpass, and framebuffer are ignored.

but venus will still try to encode them, resulting in a guest-side assert or host-side command stream error.

[1]:
Signed-off-by: Ryan Neph <> Reviewed-by: Chia-I Wu <> Reviewed-by: Yiwei Zhang <> Part-of: <>
7 hours | nir: Make nir_build_alu() variants per 1-4 arg count. | Emma Anholt | 3 | -5/+73
This saves a bunch of generated code to pack up the extra NULLs to get to 4 args, and saves executing the conditions in nir_build_alu() to then skip those NULLs. Saves another 27kb on disk. Reviewed-by: Jason Ekstrand <> Part-of: <>
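The difference between a single NULL-padded builder and per-arity variants can be sketched with toy types. Everything below (`struct value`, `struct alu_instr`, `build_alu_padded`, `build_alu1`, `build_alu2`) is hypothetical and stands in for the real NIR builder, which is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for an SSA value and an ALU instruction. */
struct value { int id; };

struct alu_instr {
   unsigned num_srcs;
   const struct value *src[4];
};

/* Old shape: every caller passes four arguments, padding with NULL, and the
 * builder has to test each one to find where the real sources end. */
static struct alu_instr
build_alu_padded(const struct value *s0, const struct value *s1,
                 const struct value *s2, const struct value *s3)
{
   const struct value *srcs[4] = { s0, s1, s2, s3 };
   struct alu_instr alu = { 0 };

   for (unsigned i = 0; i < 4 && srcs[i] != NULL; i++)
      alu.src[alu.num_srcs++] = srcs[i];
   return alu;
}

/* New shape: one entry point per source count, so call sites pass no padding
 * and the builder runs no NULL checks. */
static struct alu_instr
build_alu1(const struct value *s0)
{
   assert(s0 != NULL);
   struct alu_instr alu = { .num_srcs = 1, .src = { s0 } };
   return alu;
}

static struct alu_instr
build_alu2(const struct value *s0, const struct value *s1)
{
   assert(s0 != NULL && s1 != NULL);
   struct alu_instr alu = { .num_srcs = 2, .src = { s0, s1 } };
   return alu;
}
```

Both shapes produce the same instruction; the per-arity form just moves the source count from a run-time check to the choice of function.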
7 hours | nir: Uninline a bunch of nir.h functions. | Emma Anholt | 3 | -441/+501
I aimed for "things that look like big switch statements, or cases where the compiler is unlikely to be able to constant-propagate an argument into something useful." Saves another 80kb on disk. No perf difference on iris shader-db, n=23. Reviewed-by: Jason Ekstrand <> Part-of: <>
8 hours | iris: Drop redundant iris_resource_disable_aux call | Nanley Chery | 1 | -4/+2
Drop the call to iris_resource_disable_aux in iris_resource_configure_aux. With the previous patches, we no longer create CCS surfaces and pick the AUX_NONE usage. As a result, if the aux usage is NONE, all iris_resource fields already indicate that aux is disabled. Reviewed-by: Kenneth Graunke <> Part-of: <>
8 hours | iris: Enable CCS_E on 32-bpc float formats on TGL+ | Nanley Chery | 1 | -4/+5
Allow CCS_E on these formats on TGL+ for a couple of reasons:

1) TGL doesn't have the option to fall back to CCS_D/fast-clears like prior platforms do.
2) The CCS compression scheme on TGL improves to encode more than 3 levels of compression. This should help floating point formats.

In my measurements, enabling this on TGL results in a minor performance improvement on Paraview (+0.06%) rather than a major regression like on prior platforms. The improvement was measured by taking the average of 3 runs of: -d 256 -f 600. Also, the Intel performance CI reports a 3.81% ±0.12% FPS improvement in Bioshock Infinite. Reviewed-by: Kenneth Graunke <> Part-of: <>
8 hours | intel/isl: Unify fmt checks in isl_surf_supports_ccs | Nanley Chery | 2 | -13/+8
On TGL+, require that the surface format supports CCS_E in order to support CCS. This aligns with the ISL code that pads the primary surface for CCS on this platform. Pre-TGL, require support for either CCS_D or CCS_E. Reviewed-by: Kenneth Graunke <> Part-of: <>
9 hours | docs: update calendar and link release notes for 21.3.1 | Eric Engestrom | 2 | -2/+3
Part-of: <>
9 hours | docs: add release notes for 21.3.1 | Eric Engestrom | 1 | -0/+132
Part-of: <>
10 hours | CI/d3d12: Add a quick_shader run | Jesse Natalie | 2 | -4/+13145
Refactor the YML for some DRY, and rename the existing pass from "-windows" to "-quick_gl" to disambiguate it. Reviewed-by: Enrico Galli <> Acked-by: Daniel Stone <> Part-of: <>
10 hours | CI/windows: Move reference files to relevant ci subdirectories | Jesse Natalie | 5 | -1507/+1508
Reviewed-by: Enrico Galli <> Acked-by: Daniel Stone <> Part-of: <>
10 hours | CI/windows: Move SPIRV-to-DXIL test YML to microsoft folder | Jesse Natalie | 2 | -20/+20
Reviewed-by: Enrico Galli <> Acked-by: Daniel Stone <> Part-of: <>
10 hours | CI/windows: Move D3D12 test YML to D3D12 driver folder | Jesse Natalie | 2 | -24/+24
Reviewed-by: Enrico Galli <> Acked-by: Daniel Stone <> Part-of: <>
11 hours | freedreno/crashdec: Basic GMU log decoding | Rob Clark | 1 | -0/+31
Looks like each entry is four dwords, with the second dword being a timestamp. Signed-off-by: Rob Clark <> Part-of: <>
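The layout guessed at in this commit message (four dwords per entry, timestamp in the second dword) can be walked with a few lines of C. This is a sketch of that layout only: the helper names are hypothetical, and the meaning of the other three dwords is not stated in the message, so the code does not interpret them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: each log entry is four dwords (16 bytes). */
#define GMU_LOG_ENTRY_DWORDS 4

/* Number of complete entries in a log buffer of size_bytes bytes. */
static size_t
gmu_log_num_entries(size_t size_bytes)
{
   return size_bytes / (GMU_LOG_ENTRY_DWORDS * sizeof(uint32_t));
}

/* Timestamp of entry idx: the second dword of that entry. */
static uint32_t
gmu_log_timestamp(const uint32_t *log, size_t num_dwords, size_t idx)
{
   assert((idx + 1) * GMU_LOG_ENTRY_DWORDS <= num_dwords);
   return log[idx * GMU_LOG_ENTRY_DWORDS + 1];
}
```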
11 hours | freedreno/crashdec: Fallback to chip_id for GPU id | Rob Clark | 1 | -1/+9
Signed-off-by: Rob Clark <> Part-of: <>
11 hours | freedreno/crashdec: HFI queue decoding | Rob Clark | 4 | -0/+592
Signed-off-by: Rob Clark <> Part-of: <>
11 hours | freedreno/crashdec: Split out mempool decoding | Rob Clark | 4 | -329/+403
Before we start adding GMU HFI decoding, let's split the other big section-specific decoding (mempool) out into its own file. Signed-off-by: Rob Clark <> Part-of: <>
12 hours | turnip: Move CP_SET_SUBDRAW_SIZE to vkCmdBindPipeline() time. | Emma Anholt | 1 | -37/+19
Now that the subdraw size is constant for a pipeline, this lets tess draws avoid the slow path in vkCmdDraw*(). Part-of: <>
12 hours | turnip: use SUBDRAW_SIZE and constant sized tess bos | Jonathan Marek | 3 | -124/+77
This fixes the problem of large indirect draws, and at the same time avoids allocating too large buffers for tessellation. Reworked by @anholt to use a separate tess factor BO so we can skip the WFIs to set the TESSFACTOR_ADDR. Signed-off-by: Jonathan Marek <> Part-of: <>
12 hours | freedreno/ir3: Make a shared helper for the tess factor stride. | Emma Anholt | 3 | -35/+25
Part-of: <>
13 hours | nouveau/nir: Use natural alignment for scalars | M Henning | 1 | -4/+8
We used to request vec4 alignment for everything on the nir codepath, but this triggers an assertion failure since a0b82c24b6, which prohibits vec4 alignment on scalars. Since requiring vec4 alignment on scalars is a little silly anyway, this patch relaxes the alignment to natural alignment for scalars. Fixes about 27 crashing tests in piglit and deqp on kepler, including e.g. piglit/tests/spec/glsl-1.30/execution/fs-large-local-array.shader_test Reviewed-by: Karol Herbst <> Part-of: <>
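A size/alignment rule of the kind this commit describes might look like the following. The function name, its parameters, and the choice to keep 16-byte alignment for every non-scalar are assumptions made for illustration; the nouveau callback itself is not shown in the message.

```c
#include <assert.h>

/* Hypothetical size/alignment rule: scalars are aligned to their own size
 * (natural alignment); anything wider keeps vec4 (16-byte) alignment. */
static void
scalar_or_vec4_size_align(unsigned bit_size, unsigned components,
                          unsigned *size, unsigned *align)
{
   assert(bit_size % 8 == 0);
   assert(components >= 1 && components <= 4);

   unsigned comp_bytes = bit_size / 8;

   *size = comp_bytes * components;
   *align = (components == 1) ? comp_bytes : 16;
}
```

Under this rule a 32-bit scalar is 4-byte aligned instead of 16-byte aligned, which is the relaxation the commit message refers to.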
14 hours | util/u_trace/perfetto: add new env variable to enable perfetto | Lionel Landwerlin | 5 | -6/+39
When using the Vulkan API, command buffers can be recorded way before perfetto is enabled. This can be problematic if you want already recorded command buffers to produce traces. This new environment variable makes perfetto enabled internally so that command buffers are recorded with timestamps, even though no perfetto recording happens.

v2: rename to GPU_TRACE_INSTRUMENT (Rob)
v3: Move instrumentation check to generated headers (Danylo)
    Decouple instrumentation enabling from tracing (Danylo)

Signed-off-by: Lionel Landwerlin <> Reviewed-by: Danylo Piliaiev <> Part-of: <>
14 hours | util/u_trace: add end_of_pipe property to tracepoints | Lionel Landwerlin | 6 | -5/+11
In order to capture the timestamp when things actually end on Intel GPU HW, we need to know whether the timestamp should be captured at the top or the end of the pipeline.

v2: use one line python if/else (Danylo)

Signed-off-by: Lionel Landwerlin <> Reviewed-by: Danylo Piliaiev <> Part-of: <>
19 hours | glsl: fix for unused variable in glsl_types.cpp | Viktoriia Palianytsia | 1 | -1/+1
The previously unused variable vector_elements is now used in the return from decode_type_from_blob instead of encoded.basic.vector_elements. The code set vector_elements from encoded.basic.vector_elements and then applied its adjustments to vector_elements only, but the variable was never passed on to any other variable or function. Closes: Signed-off-by: Viktoriia Palianytsia <> Reviewed-by: Karol Herbst <> Part-of: <>
20 hours | spirv: handle SpvOpMemberName | Marcin Ślusarz | 3 | -10/+41
Now we can see field names in structs instead of generic "fieldN" with NIR_PRINT=1. Reviewed-by: Caio Oliveira <> Part-of: <>
20 hours | nir/opt_deref: don't try to cast empty structures | Lionel Landwerlin | 1 | -0/+4
Found while running valgrind:

==3583454== Invalid read of size 4
==3583454==    at 0xF48336: glsl_get_struct_field_offset (nir_types.cpp:84)
==3583454==    by 0xC7CD0D: opt_replace_struct_wrapper_cast (nir_deref.c:1068)
==3583454==    by 0xC7CDD9: opt_deref_cast (nir_deref.c:1087)
==3583454==    by 0xC7DD8E: nir_opt_deref_impl (nir_deref.c:1369)
==3583454==    by 0xC7DF4E: nir_opt_deref (nir_deref.c:1428)
==3583454==    by 0xA63F3C: brw_kernel_from_spirv (brw_kernel.c:325)
==3583454==    by 0xA3BC2C: main (intel_clc.c:481)
==3583454==  Address 0xe4f7e88 is 24 bytes after a block of size 48 in arena "client"

Signed-off-by: Lionel Landwerlin <> Cc: mesa-stable Reviewed-by: Jason Ekstrand <> Part-of: <>
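The invalid read in the trace above comes from asking for a field offset on a struct that has no fields. A guard of the following shape avoids it. The `struct_type` model and the `first_field_at_offset_zero` helper are hypothetical and much simpler than the GLSL type system; they only show the idea of checking the field count before indexing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy struct type: a field count plus one byte offset per field. */
struct struct_type {
   unsigned num_fields;
   const unsigned *field_offsets;
};

/* A cast to the struct's first field only makes sense if there is a first
 * field. Checking the count first keeps us from reading field_offsets[0]
 * on an empty struct. */
static bool
first_field_at_offset_zero(const struct struct_type *type)
{
   assert(type != NULL);

   if (type->num_fields == 0)
      return false;

   return type->field_offsets[0] == 0;
}
```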
21 hours | gallium/d3d12: Don't use designated initializers | Boris Brezillon | 1 | -1/+2
Use of designated initializers requires at least '/std:c++20', and mesa is using c++14 by default. Fixes: 8d3a3e7a00b ("microsoft/compiler: Use textures for SRVs") Signed-off-by: Boris Brezillon <> Reviewed-by: Jesse Natalie <> Part-of: <>
21 hours | microsoft/compiler: Fix dxil_nir_create_bare_samplers() | Boris Brezillon | 1 | -18/+15
_mesa_hash_table_u64_search() returns the data directly, not a hash_entry object. We also need to take the descriptor set into account for this pass to work properly on Vulkan shaders. Fixes: 46bc7cf6783 ("microsoft/compiler: Rewrite sampler splitting pass to be smarter and handle derefs") Signed-off-by: Boris Brezillon <> Reviewed-by: Jesse Natalie <> Part-of: <>
28 hours | freedreno/ci: add piglit runs for a306 | Ilia Mirkin | 4 | -0/+678
Signed-off-by: Ilia Mirkin <> Reviewed-by: Emma Anholt <> Part-of: <>
31 hours | android: define cpp_rtti=false because libLLVM is built w/o RTTI (v2) | Mauro Rossi | 1 | -0/+1
libLLVM for Android is built without RTTI, but after commit ad86267 mesa inherits meson's default of RTTI enabled. cpp_rtti=false is added to meson options in android/

(v2) Add Fixes tag and use spaces instead of tabs for aligning the trailing \

Signed-off-by: Mauro Rossi <> Fixes: ad862674 ("meson: Don't override built-in cpp_rtti option, error if it's invalid") Cc: "21.3" "21.2" mesa-stable Reviewed-by: Marijn Suijten <> Part-of: <>
31 hours | Revert "android: define cpp_rtti=false because libLLVM is built w/o RTTI" | Mauro Rossi | 1 | -1/+0
This reverts commit f659d00000a1a3667f9861d01d5828dd12ec6857. The revert is done because the essential Fixes tag was missing, and to apply a better version that can be picked for mesa-stable. Acked-by: Marijn Suijten <> Part-of: <>
33 hours | aco: don't create DPP instructions with SGPR operands | Rhys Perry | 2 | -2/+20
Signed-off-by: Rhys Perry <> Reviewed-by: Daniel Schürmann <> Fixes: 2e6834d4f6c ("aco: combine DPP into VALU before RA") Part-of: <>
37 hours | panfrost: Add empty tile flags to GenXML | Alyssa Rosenzweig | 2 | -0/+4
These flags control special CRC handling for empty tiles using the CRC clear colour field added on Bifrost. Their use depends on CRC being used. We missed these flags earlier; let's add them since they are used by the Valhall DDK but are not new to Valhall. Signed-off-by: Alyssa Rosenzweig <> Part-of: <>
38 hours | radv: fix resetting the entire vertex input dynamic state | Samuel Pitoiset | 1 | -8/+1
If there are holes, e.g. the application first sets vertex attributes 0 and 1, then vertex attributes 0 and 7, the format of vertex attribute 1 is still the previous one, while it should be FORMAT_INVALID to avoid a GPU hang. This fixes a GPU hang with Yuzu. Closes: Cc: 21.3 mesa-stable Signed-off-by: Samuel Pitoiset <> Reviewed-by: Rhys Perry <> Part-of: <>
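The scenario in this message (set attributes 0 and 1, then 0 and 7) can be reproduced with a small model. The `vertex_input_state` struct, the `set_vertex_input` helper, `MAX_ATTRIBS`, and the value 0 for `FORMAT_INVALID` are all assumptions made for the sketch and do not mirror radv's data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ATTRIBS 32
#define FORMAT_INVALID 0u /* assumed sentinel value for this sketch */

struct vertex_input_state {
   uint32_t formats[MAX_ATTRIBS];
};

/* Replace the whole vertex input state. The full reset up front is the
 * point: attributes not named in this call fall back to FORMAT_INVALID
 * instead of keeping a stale format from an earlier call. */
static void
set_vertex_input(struct vertex_input_state *state, const unsigned *locations,
                 const uint32_t *formats, unsigned count)
{
   memset(state, 0, sizeof(*state));

   for (unsigned i = 0; i < count; i++) {
      assert(locations[i] < MAX_ATTRIBS);
      state->formats[locations[i]] = formats[i];
   }
}
```

Resetting only the attributes that the new call names would leave attribute 1 with its old format after the second call, which is the hole the commit closes.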
39 hours | anv: Drop code from get_blorp_surf_for_anv_buffer | Nanley Chery | 1 | -15/+0
The code to handle ASTC surfaces hasn't been needed since commit dd92179a72 ("anv: Canonicalize buffer formats for image/buffer copies"). Reviewed-by: Jason Ekstrand <> Part-of: <>
39 hours | anv: Allow transfer-only linear ASTC images | Nanley Chery | 2 | -11/+11
Some apps depend on this to run. Closes: Reviewed-by: Jason Ekstrand <> Part-of: <>
39 hours | anv: Require transfer features for transfer usages | Nanley Chery | 1 | -6/+10
In order for an image to support the transfer usage, require that its format can be used for blits or copies. Reviewed-by: Jason Ekstrand <> Part-of: <>
39 hours | iris: Allow GPU-based uploads of ASTC textures | Nanley Chery | 1 | -5/+0
ISL recently started allowing linear ASTC surfaces to be created. With that in place, iris can perform GPU-based uploads to ASTC textures in the same way it does so with other compressed surfaces. We're not aware of any reason to continue special-casing ASTC texture uploads, so we get rid of the code which does so. Reviewed-by: Jason Ekstrand <> Part-of: <>
39 hours | intel/isl: Allow creating non-Y-tiled ASTC surfaces | Nanley Chery | 2 | -6/+8
The sampler can only decode ASTC surfaces that are Y-tiled. ISL has been asserting this restriction at surface creation time. However, some drivers want to create a surface that is only used for copying compressed data. And during the copy, the surface won't have a compressed format. To enable this behavior, we choose to move the tiling assertion to the moment a surface state is created for the sampler. Reviewed-by: Jason Ekstrand <> Part-of: <>
40 hours | blorp: Disallow multisampling for BLORP compute blits and copies. | Kenneth Graunke | 2 | -5/+25
We don't support typed image writes for multisampling, so we can't handle multisampled destinations. We also usually handle MSAA by running the fragment shader per-sample, which we aren't accounting for in our compute shaders, so we can't handle MSAA sources either. We could do both of these things if we really wanted to, but we don't. Reviewed-by: Jordan Justen <> Part-of: <>