Diffstat (limited to 'docs')
-rw-r--r-- | docs/_static/specs/EGL_MESA_x11_native_visual_id.txt | 80
-rw-r--r-- | docs/_static/specs/WL_create_wayland_buffer_from_image.spec (renamed from docs/_static/specs/OLD/WL_create_wayland_buffer_from_image.spec) | 0
-rw-r--r-- | docs/android.rst | 4
-rw-r--r-- | docs/drivers/panfrost.rst | 301
-rw-r--r-- | docs/drivers/panfrost/drm-shim.rst | 84
-rw-r--r-- | docs/drivers/panfrost/instancing.rst | 112
-rw-r--r-- | docs/drivers/panfrost/texcomp.rst | 17
-rw-r--r-- | docs/drivers/panfrost/tiling.rst | 38
-rw-r--r-- | docs/envvars.rst | 8
-rw-r--r-- | docs/features.txt | 23
-rw-r--r-- | docs/header-stubs/compiler/spirv/spirv_info.h | 1
-rw-r--r-- | docs/header-stubs/vk_enum_to_str.h | 0
-rw-r--r-- | docs/release-calendar.csv | 7
-rw-r--r-- | docs/relnotes.rst | 2
-rw-r--r-- | docs/relnotes/24.0.7.rst | 155
-rw-r--r-- | docs/relnotes/new_features.txt | 3
-rw-r--r-- | docs/rusticl.rst | 4
17 files changed, 547 insertions, 292 deletions
diff --git a/docs/_static/specs/EGL_MESA_x11_native_visual_id.txt b/docs/_static/specs/EGL_MESA_x11_native_visual_id.txt new file mode 100644 index 00000000000..de30c399ef0 --- /dev/null +++ b/docs/_static/specs/EGL_MESA_x11_native_visual_id.txt @@ -0,0 +1,80 @@ +Name + + MESA_x11_native_visual_id + +Name Strings + + EGL_MESA_x11_native_visual_id + +Contact + + Eric Engestrom <eric@engestrom.ch> + +Status + + Complete, shipping. + +Version + + Version 2, May 10, 2024 + +Number + + EGL Extension #TBD + +Extension Type + + EGL display extension + +Dependencies + + None. This extension is written against the + wording of the EGL 1.5 specification. + +Overview + + This extension allows EGL_NATIVE_VISUAL_ID to be used in + eglChooseConfig() for a display of type EGL_PLATFORM_X11_EXT. + +IP Status + + Open-source; freely implementable. + +New Types + + None + +New Procedures and Functions + + None + +New Tokens + + None + +In section 3.4.1.1 "Selection of EGLConfigs" of the EGL 1.5 +Specification, replace: + + If EGL_MAX_PBUFFER_WIDTH, EGL_MAX_PBUFFER_HEIGHT, + EGL_MAX_PBUFFER_PIXELS, or EGL_NATIVE_VISUAL_ID are specified in + attrib list, then they are ignored [...] + +with: + + If EGL_MAX_PBUFFER_WIDTH, EGL_MAX_PBUFFER_HEIGHT, + or EGL_MAX_PBUFFER_PIXELS are specified in attrib list, then they + are ignored [...]. EGL_NATIVE_VISUAL_ID is ignored except on + a display of type EGL_PLATFORM_X11_EXT when EGL_ALPHA_SIZE is + greater than zero. + +Issues + + None. 
+ +Revision History + + Version 1, March 27, 2024 (Eric Engestrom) + Initial draft + Version 2, May 10, 2024 (David Heidelberg) + add EGL_ALPHA_SIZE condition + add Extension type and set it to display extension diff --git a/docs/_static/specs/OLD/WL_create_wayland_buffer_from_image.spec b/docs/_static/specs/WL_create_wayland_buffer_from_image.spec index aa5eb4d24d9..aa5eb4d24d9 100644 --- a/docs/_static/specs/OLD/WL_create_wayland_buffer_from_image.spec +++ b/docs/_static/specs/WL_create_wayland_buffer_from_image.spec diff --git a/docs/android.rst b/docs/android.rst index 3ac75171e57..0034706bb75 100644 --- a/docs/android.rst +++ b/docs/android.rst @@ -34,8 +34,8 @@ Then, create your Meson cross file to use it, something like this [host_machine] system = 'android' - cpu_family = 'arm' - cpu = 'aarch64' + cpu_family = 'aarch64' + cpu = 'armv8' endian = 'little' Now, use that cross file for your Android build directory (as in this diff --git a/docs/drivers/panfrost.rst b/docs/drivers/panfrost.rst index 7fc1a32e9f0..a8b63da2441 100644 --- a/docs/drivers/panfrost.rst +++ b/docs/drivers/panfrost.rst @@ -3,33 +3,31 @@ Panfrost The Panfrost driver stack includes an OpenGL ES implementation for Arm Mali GPUs based on the Midgard and Bifrost microarchitectures. It is **conformant** -on Mali-G52 and Mali-G57 but **non-conformant** on other GPUs. 
The following -hardware is currently supported: - -========= ============= ============ ======= -Product Architecture OpenGL ES OpenGL -========= ============= ============ ======= -Mali T600 Midgard (v4) 2.0 2.1 -Mali T620 Midgard (v4) 2.0 2.1 -Mali T720 Midgard (v4) 2.0 2.1 -Mali T760 Midgard (v5) 3.1 3.1 -Mali T820 Midgard (v5) 3.1 3.1 -Mali T830 Midgard (v5) 3.1 3.1 -Mali T860 Midgard (v5) 3.1 3.1 -Mali T880 Midgard (v5) 3.1 3.1 -Mali G72 Bifrost (v6) 3.1 3.1 -Mali G31 Bifrost (v7) 3.1 3.1 -Mali G51 Bifrost (v7) 3.1 3.1 -Mali G52 Bifrost (v7) 3.1 3.1 -Mali G76 Bifrost (v7) 3.1 3.1 -Mali G57 Valhall (v9) 3.1 3.1 -Mali G310 Valhall (v10) 3.1 3.1 -Mali G610 Valhall (v10) 3.1 3.1 -========= ============= ============ ======= +on `Mali-G52 <https://www.khronos.org/conformance/adopters/conformant-products/opengles#submission_949>`_ +and `Mali-G57 <https://www.khronos.org/conformance/adopters/conformant-products/opengles#submission_980>`_ +but **non-conformant** on other GPUs. The following hardware is currently +supported: + ++--------------------+---------------+-----------+--------+ +| Models | Architecture | OpenGL ES | OpenGL | ++====================+===============+===========+========+ +| T600, T620, T720 | Midgard (v4) | 2.0 | 2.1 | ++--------------------+---------------+-----------+--------+ +| T760, T820, T830 | Midgard (v5) | 3.1 | 3.1 | +| T860, T880 | | | | ++--------------------+---------------+-----------+--------+ +| G72 | Bifrost (v6) | 3.1 | 3.1 | ++--------------------+---------------+-----------+--------+ +| G31, G51, G52, G76 | Bifrost (v7) | 3.1 | 3.1 | ++--------------------+---------------+-----------+--------+ +| G57 | Valhall (v9) | 3.1 | 3.1 | ++--------------------+---------------+-----------+--------+ +| G310, G610 | Valhall (v10) | 3.1 | 3.1 | ++--------------------+---------------+-----------+--------+ Other Midgard and Bifrost chips (e.g. G71) are not yet supported. 
-Older Mali chips based on the Utgard architecture (Mali 400, Mali 450) are +Older Mali chips based on the Utgard architecture (Mali-400, Mali-450) are supported in the :doc:`Lima <lima>` driver, not Panfrost. Lima is also available in Mesa. @@ -61,255 +59,12 @@ Panfrost developers and users hang out on IRC at ``#panfrost`` on OFTC. Note that registering and authenticating with ``NickServ`` is required to prevent spam. `Join the chat. <https://webchat.oftc.net/?channels=panfrost>`_ -Compressed texture support --------------------------- - -In the driver, Panfrost supports ASTC, ETC, and all BCn formats (e.g. RGTC, -S3TC, etc.) However, Panfrost depends on the hardware to support these formats -efficiently. All supported Mali architectures support these formats, but not -every system-on-chip with a Mali GPU support all these formats. Many lower-end -systems lack support for some BCn formats, which can cause problems when playing -desktop games with Panfrost. To check whether this issue applies to your -system-on-chip, Panfrost includes a ``panfrost_texfeatures`` tool to query -supported formats. - -To use this tool, include the option ``-Dtools=panfrost`` when configuring Mesa. -Then inside your Mesa build directory, the tool is located at -``src/panfrost/tools/panfrost_texfeatures``. Copy it to your target device, -set as executable as necessary, and run on the target device. A table of -supported formats will be printed to standard output. - -drm-shim --------- - -Panfrost implements ``drm-shim``, stubbing out the Panfrost kernel interface. -Use cases for this functionality include: - -- Future hardware bring up -- Running shader-db on non-Mali workstations -- Reproducing compiler (and some driver) bugs without Mali hardware - -Although Mali hardware is usually paired with an Arm CPU, Panfrost is portable C -code and should work on any Linux machine. In particular, you can test the -compiler on shader-db on an Intel desktop. 
- -To build Mesa with Panfrost drm-shim, configure Meson with -``-Dgallium-drivers=panfrost`` and ``-Dtools=drm-shim``. See the above -building section for a full invocation. The drm-shim binary will be built to -``build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so``. - -To use, set the ``LD_PRELOAD`` environment variable to the drm-shim binary. It -may also be necessary to set ``LIBGL_DRIVERS_PATH`` to the location where Mesa -was installed. - -By default, drm-shim mocks a Mali-G52 system. To select a specific Mali GPU, -set the ``PAN_GPU_ID`` environment variable to the desired GPU ID: - -========= ============= ======= -Product Architecture GPU ID -========= ============= ======= -Mali-T720 Midgard (v4) 720 -Mali-T860 Midgard (v5) 860 -Mali-G72 Bifrost (v6) 6221 -Mali-G52 Bifrost (v7) 7212 -Mali-G57 Valhall (v9) 9093 -Mali-G610 Valhall (v10) a867 -========= ============= ======= - -Additional GPU IDs are enumerated in the ``panfrost_model_list`` list in -``src/panfrost/lib/pan_props.c``. - -As an example: assuming Mesa is installed to a local path ``~/lib`` and Mesa's -build directory is ``~/mesa/build``, a shader can be compiled for Mali-G52 as: - -.. code-block:: sh - - ~/shader-db$ BIFROST_MESA_DEBUG=shaders \ - LIBGL_DRIVERS_PATH=~/lib/dri/ \ - LD_PRELOAD=~/mesa/build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so \ - PAN_GPU_ID=7212 \ - ./run shaders/glmark/1-1.shader_test - -The same shader can be compiled for Mali-T720 as: - -.. code-block:: sh - - ~/shader-db$ MIDGARD_MESA_DEBUG=shaders \ - LIBGL_DRIVERS_PATH=~/lib/dri/ \ - LD_PRELOAD=~/mesa/build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so \ - PAN_GPU_ID=720 \ - ./run shaders/glmark/1-1.shader_test - -These examples set the compilers' ``shaders`` debug flags to dump the optimized -NIR, backend IR after instruction selection, backend IR after register -allocation and scheduling, and a disassembly of the final compiled binary. 
- -As another example, this invocation runs a single dEQP test "on" Mali-G52, -pretty-printing GPU data structures and disassembling all shaders -(``PAN_MESA_DEBUG=trace``) as well as dumping raw GPU memory -(``PAN_MESA_DEBUG=dump``). The ``EGL_PLATFORM=surfaceless`` environment variable -and various flags to dEQP mimic the surfaceless environment that our -continuous integration (CI) uses. This eliminates window system dependencies, -although it requires a specially built CTS: - -.. code-block:: sh - - ~/VK-GL-CTS/build/external/openglcts/modules$ PAN_MESA_DEBUG=trace,dump \ - LIBGL_DRIVERS_PATH=~/lib/dri/ \ - LD_PRELOAD=~/mesa/build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so \ - PAN_GPU_ID=7212 EGL_PLATFORM=surfaceless \ - ./glcts --deqp-surface-type=pbuffer \ - --deqp-gl-config-name=rgba8888d24s8ms0 --deqp-surface-width=256 \ - --deqp-surface-height=256 -n \ - dEQP-GLES31.functional.shaders.builtin_functions.common.abs.float_highp_compute - -U-interleaved tiling ---------------------- - -Panfrost supports u-interleaved tiling. U-interleaved tiling is -indicated by the ``DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED`` modifier. - -The tiling reorders whole pixels (blocks). It does not compress or modify the -pixels themselves, so it can be used for any image format. Internally, images -are divided into tiles. Tiles occur in source order, but pixels (blocks) within -each tile are reordered according to a space-filling curve. - -For regular formats, 16x16 tiles are used. This harmonizes with the default tile -size for binning and CRCs (transaction elimination). It also means a single line -(16 pixels) at 4 bytes per pixel equals a single 64-byte cache line. - -For formats that are already block compressed (S3TC, RGTC, etc), 4x4 tiles are -used, where entire blocks are reorder. Most of these formats compress 4x4 -blocks, so this gives an effective 16x16 tiling. This justifies the tile size -intuitively, though it's not a rule: ASTC may uses larger blocks. 
- -Within a tile, the X and Y bits are interleaved (like Morton order), but with a -twist: adjacent bit pairs are XORed. The reason to add XORs is not obvious. -Visually, addresses take the form:: - - | y3 | (x3 ^ y3) | y2 | (y2 ^ x2) | y1 | (y1 ^ x1) | y0 | (y0 ^ x0) | - -Reference routines to encode/decode u-interleaved images are available in -``src/panfrost/shared/test/test-tiling.cpp``, which documents the space-filling -curve. This reference implementation is used to unit test the optimized -implementation used in production. The optimized implementation is available in -``src/panfrost/shared/pan_tiling.c``. - -Although these routines are part of Panfrost, they are also used by Lima, as Arm -introduced the format with Utgard. It is the only tiling supported on Utgard. On -Mali-T760 and newer, Arm Framebuffer Compression (AFBC) is more efficient and -should be used instead where possible. However, not all formats are -compressible, so u-interleaved tiling remains an important fallback on Panfrost. - -Instancing ----------- - -The attribute descriptor lets the attribute unit compute the address of an -attribute given the vertex and instance ID. Unfortunately, the way this works is -rather complicated when instancing is enabled. - -To explain this, first we need to explain how compute and vertex threads are -dispatched. When a quad is dispatched, it receives a single, linear index. -However, we need to translate that index into a (vertex id, instance id) pair. -One option would be to do: - -.. math:: - \text{vertex id} = \text{linear id} \% \text{num vertices} - - \text{instance id} = \text{linear id} / \text{num vertices} - -but this involves a costly division and modulus by an arbitrary number. -Instead, we could pad num_vertices. We dispatch padded_num_vertices * -num_instances threads instead of num_vertices * num_instances, which results -in some "extra" threads with vertex_id >= num_vertices, which we have to -discard. 
The more we pad num_vertices, the more "wasted" threads we -dispatch, but the division is potentially easier. - -One straightforward choice is to pad num_vertices to the next power of two, -which means that the division and modulus are just simple bit shifts and -masking. But the actual algorithm is a bit more complicated. The thread -dispatcher has special support for dividing by 3, 5, 7, and 9, in addition -to dividing by a power of two. As a result, padded_num_vertices can be -1, 3, 5, 7, or 9 times a power of two. This results in less wasted threads, -since we need less padding. - -padded_num_vertices is picked by the hardware. The driver just specifies the -actual number of vertices. Note that padded_num_vertices is a multiple of four -(presumably because threads are dispatched in groups of 4). Also, -padded_num_vertices is always at least one more than num_vertices, which seems -like a quirk of the hardware. For larger num_vertices, the hardware uses the -following algorithm: using the binary representation of num_vertices, we look at -the most significant set bit as well as the following 3 bits. Let n be the -number of bits after those 4 bits. Then we set padded_num_vertices according to -the following table: - -========== ======================= -high bits padded_num_vertices -========== ======================= -1000 :math:`9 \cdot 2^n` -1001 :math:`5 \cdot 2^{n+1}` -101x :math:`3 \cdot 2^{n+2}` -110x :math:`7 \cdot 2^{n+1}` -111x :math:`2^{n+4}` -========== ======================= - -For example, if num_vertices = 70 is passed to glDraw(), its binary -representation is 1000110, so n = 3 and the high bits are 1000, and -therefore padded_num_vertices = :math:`9 \cdot 2^3` = 72. - -The attribute unit works in terms of the original linear_id. if -num_instances = 1, then they are the same, and everything is simple. -However, with instancing things get more complicated. There are four -possible modes, two of them we can group together: - -1. 
Use the linear_id directly. Only used when there is no instancing. - -2. Use the linear_id modulo a constant. This is used for per-vertex -attributes with instancing enabled by making the constant equal -padded_num_vertices. Because the modulus is always padded_num_vertices, this -mode only supports a modulus that is a power of 2 times 1, 3, 5, 7, or 9. -The shift field specifies the power of two, while the extra_flags field -specifies the odd number. If shift = n and extra_flags = m, then the modulus -is :math:`(2m + 1) \cdot 2^n`. As an example, if num_vertices = 70, then as -computed above, padded_num_vertices = :math:`9 \cdot 2^3`, so we should set -extra_flags = 4 and shift = 3. Note that we must exactly follow the hardware -algorithm used to get padded_num_vertices in order to correctly implement -per-vertex attributes. - -3. Divide the linear_id by a constant. In order to correctly implement -instance divisors, we have to divide linear_id by padded_num_vertices times -to user-specified divisor. So first we compute padded_num_vertices, again -following the exact same algorithm that the hardware uses, then multiply it -by the GL-level divisor to get the hardware-level divisor. This case is -further divided into two more cases. If the hardware-level divisor is a -power of two, then we just need to shift. The shift amount is specified by -the shift field, so that the hardware-level divisor is just -:math:`2^\text{shift}`. +Technical details +----------------- -If it isn't a power of two, then we have to divide by an arbitrary integer. -For that, we use the well-known technique of multiplying by an approximation -of the inverse. The driver must compute the magic multiplier and shift -amount, and then the hardware does the multiplication and shift. The -hardware and driver also use the "round-down" optimization as described in -https://ridiculousfish.com/files/faster_unsigned_division_by_constants.pdf. 
-The hardware further assumes the multiplier is between :math:`2^{31}` and -:math:`2^{32}`, so the high bit is implicitly set to 1 even though it is set -to 0 by the driver -- presumably this simplifies the hardware multiplier a -little. The hardware first multiplies linear_id by the multiplier and -takes the high 32 bits, then applies the round-down correction if -extra_flags = 1, then finally shifts right by the shift field. +You can read more technical details about Panfrost here: -There are some differences between ridiculousfish's algorithm and the Mali -hardware algorithm, which means that the reference code from ridiculousfish -doesn't always produce the right constants. Mali does not use the pre-shift -optimization, since that would make a hardware implementation slower (it -would have to always do the pre-shift, multiply, and post-shift operations). -It also forces the multiplier to be at least :math:`2^{31}`, which means -that the exponent is entirely fixed, so there is no trial-and-error. -Altogether, given the divisor d, the algorithm the driver must follow is: +.. toctree:: + :glob: -1. Set shift = :math:`\lfloor \log_2(d) \rfloor`. -2. Compute :math:`m = \lceil 2^{shift + 32} / d \rceil` and :math:`e = 2^{shift + 32} % d`. -3. If :math:`e <= 2^{shift}`, then we need to use the round-down algorithm. Set - magic_divisor = m - 1 and extra_flags = 1. 4. Otherwise, set magic_divisor = - m and extra_flags = 0. + panfrost/* diff --git a/docs/drivers/panfrost/drm-shim.rst b/docs/drivers/panfrost/drm-shim.rst new file mode 100644 index 00000000000..874ac37c2f9 --- /dev/null +++ b/docs/drivers/panfrost/drm-shim.rst @@ -0,0 +1,84 @@ + +drm-shim +======== + +Panfrost implements ``drm-shim``, stubbing out the Panfrost kernel interface. 
+Use cases for this functionality include: + +- Future hardware bring up +- Running shader-db on non-Mali workstations +- Reproducing compiler (and some driver) bugs without Mali hardware + +Although Mali hardware is usually paired with an Arm CPU, Panfrost is portable C +code and should work on any Linux machine. In particular, you can test the +compiler on shader-db on an Intel desktop. + +To build Mesa with Panfrost drm-shim, configure Meson with +``-Dgallium-drivers=panfrost`` and ``-Dtools=drm-shim``. See the above +building section for a full invocation. The drm-shim binary will be built to +``build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so``. + +To use, set the ``LD_PRELOAD`` environment variable to the drm-shim binary. It +may also be necessary to set ``LIBGL_DRIVERS_PATH`` to the location where Mesa +was installed. + +By default, drm-shim mocks a Mali-G52 system. To select a specific Mali GPU, +set the ``PAN_GPU_ID`` environment variable to the desired GPU ID: + +========= ============= ======= +Product Architecture GPU ID +========= ============= ======= +Mali-T720 Midgard (v4) 720 +Mali-T860 Midgard (v5) 860 +Mali-G72 Bifrost (v6) 6221 +Mali-G52 Bifrost (v7) 7212 +Mali-G57 Valhall (v9) 9093 +Mali-G610 Valhall (v10) a867 +========= ============= ======= + +Additional GPU IDs are enumerated in the ``panfrost_model_list`` list in +``src/panfrost/lib/pan_props.c``. + +As an example: assuming Mesa is installed to a local path ``~/lib`` and Mesa's +build directory is ``~/mesa/build``, a shader can be compiled for Mali-G52 as: + +.. code-block:: sh + + ~/shader-db$ BIFROST_MESA_DEBUG=shaders \ + LIBGL_DRIVERS_PATH=~/lib/dri/ \ + LD_PRELOAD=~/mesa/build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so \ + PAN_GPU_ID=7212 \ + ./run shaders/glmark/1-1.shader_test + +The same shader can be compiled for Mali-T720 as: + +.. 
code-block:: sh + + ~/shader-db$ MIDGARD_MESA_DEBUG=shaders \ + LIBGL_DRIVERS_PATH=~/lib/dri/ \ + LD_PRELOAD=~/mesa/build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so \ + PAN_GPU_ID=720 \ + ./run shaders/glmark/1-1.shader_test + +These examples set the compilers' ``shaders`` debug flags to dump the optimized +NIR, backend IR after instruction selection, backend IR after register +allocation and scheduling, and a disassembly of the final compiled binary. + +As another example, this invocation runs a single dEQP test "on" Mali-G52, +pretty-printing GPU data structures and disassembling all shaders +(``PAN_MESA_DEBUG=trace``) as well as dumping raw GPU memory +(``PAN_MESA_DEBUG=dump``). The ``EGL_PLATFORM=surfaceless`` environment variable +and various flags to dEQP mimic the surfaceless environment that our +continuous integration (CI) uses. This eliminates window system dependencies, +although it requires a specially built CTS: + +.. code-block:: sh + + ~/VK-GL-CTS/build/external/openglcts/modules$ PAN_MESA_DEBUG=trace,dump \ + LIBGL_DRIVERS_PATH=~/lib/dri/ \ + LD_PRELOAD=~/mesa/build/src/panfrost/drm-shim/libpanfrost_noop_drm_shim.so \ + PAN_GPU_ID=7212 EGL_PLATFORM=surfaceless \ + ./glcts --deqp-surface-type=pbuffer \ + --deqp-gl-config-name=rgba8888d24s8ms0 --deqp-surface-width=256 \ + --deqp-surface-height=256 -n \ + dEQP-GLES31.functional.shaders.builtin_functions.common.abs.float_highp_compute diff --git a/docs/drivers/panfrost/instancing.rst b/docs/drivers/panfrost/instancing.rst new file mode 100644 index 00000000000..d4565af3155 --- /dev/null +++ b/docs/drivers/panfrost/instancing.rst @@ -0,0 +1,112 @@ +Instancing +========== + +The attribute descriptor lets the attribute unit compute the address of an +attribute given the vertex and instance ID. Unfortunately, the way this works is +rather complicated when instancing is enabled. + +To explain this, first we need to explain how compute and vertex threads are +dispatched. 
When a quad is dispatched, it receives a single, linear index. +However, we need to translate that index into a (vertex id, instance id) pair. +One option would be to do: + +.. math:: + \text{vertex id} = \text{linear id} \% \text{num vertices} + + \text{instance id} = \text{linear id} / \text{num vertices} + +but this involves a costly division and modulus by an arbitrary number. +Instead, we could pad num_vertices. We dispatch padded_num_vertices * +num_instances threads instead of num_vertices * num_instances, which results +in some "extra" threads with vertex_id >= num_vertices, which we have to +discard. The more we pad num_vertices, the more "wasted" threads we +dispatch, but the division is potentially easier. + +One straightforward choice is to pad num_vertices to the next power of two, +which means that the division and modulus are just simple bit shifts and +masking. But the actual algorithm is a bit more complicated. The thread +dispatcher has special support for dividing by 3, 5, 7, and 9, in addition +to dividing by a power of two. As a result, padded_num_vertices can be +1, 3, 5, 7, or 9 times a power of two. This results in less wasted threads, +since we need less padding. + +padded_num_vertices is picked by the hardware. The driver just specifies the +actual number of vertices. Note that padded_num_vertices is a multiple of four +(presumably because threads are dispatched in groups of 4). Also, +padded_num_vertices is always at least one more than num_vertices, which seems +like a quirk of the hardware. For larger num_vertices, the hardware uses the +following algorithm: using the binary representation of num_vertices, we look at +the most significant set bit as well as the following 3 bits. Let n be the +number of bits after those 4 bits. 
Then we set padded_num_vertices according to +the following table: + +========== ======================= +high bits padded_num_vertices +========== ======================= +1000 :math:`9 \cdot 2^n` +1001 :math:`5 \cdot 2^{n+1}` +101x :math:`3 \cdot 2^{n+2}` +110x :math:`7 \cdot 2^{n+1}` +111x :math:`2^{n+4}` +========== ======================= + +For example, if num_vertices = 70 is passed to glDraw(), its binary +representation is 1000110, so n = 3 and the high bits are 1000, and +therefore padded_num_vertices = :math:`9 \cdot 2^3` = 72. + +The attribute unit works in terms of the original linear_id. if +num_instances = 1, then they are the same, and everything is simple. +However, with instancing things get more complicated. There are four +possible modes, two of them we can group together: + +1. Use the linear_id directly. Only used when there is no instancing. + +2. Use the linear_id modulo a constant. This is used for per-vertex +attributes with instancing enabled by making the constant equal +padded_num_vertices. Because the modulus is always padded_num_vertices, this +mode only supports a modulus that is a power of 2 times 1, 3, 5, 7, or 9. +The shift field specifies the power of two, while the extra_flags field +specifies the odd number. If shift = n and extra_flags = m, then the modulus +is :math:`(2m + 1) \cdot 2^n`. As an example, if num_vertices = 70, then as +computed above, padded_num_vertices = :math:`9 \cdot 2^3`, so we should set +extra_flags = 4 and shift = 3. Note that we must exactly follow the hardware +algorithm used to get padded_num_vertices in order to correctly implement +per-vertex attributes. + +3. Divide the linear_id by a constant. In order to correctly implement +instance divisors, we have to divide linear_id by padded_num_vertices times +to user-specified divisor. 
So first we compute padded_num_vertices, again +following the exact same algorithm that the hardware uses, then multiply it +by the GL-level divisor to get the hardware-level divisor. This case is +further divided into two more cases. If the hardware-level divisor is a +power of two, then we just need to shift. The shift amount is specified by +the shift field, so that the hardware-level divisor is just +:math:`2^\text{shift}`. + +If it isn't a power of two, then we have to divide by an arbitrary integer. +For that, we use the well-known technique of multiplying by an approximation +of the inverse. The driver must compute the magic multiplier and shift +amount, and then the hardware does the multiplication and shift. The +hardware and driver also use the "round-down" optimization as described in +https://ridiculousfish.com/files/faster_unsigned_division_by_constants.pdf. +The hardware further assumes the multiplier is between :math:`2^{31}` and +:math:`2^{32}`, so the high bit is implicitly set to 1 even though it is set +to 0 by the driver -- presumably this simplifies the hardware multiplier a +little. The hardware first multiplies linear_id by the multiplier and +takes the high 32 bits, then applies the round-down correction if +extra_flags = 1, then finally shifts right by the shift field. + +There are some differences between ridiculousfish's algorithm and the Mali +hardware algorithm, which means that the reference code from ridiculousfish +doesn't always produce the right constants. Mali does not use the pre-shift +optimization, since that would make a hardware implementation slower (it +would have to always do the pre-shift, multiply, and post-shift operations). +It also forces the multiplier to be at least :math:`2^{31}`, which means +that the exponent is entirely fixed, so there is no trial-and-error. +Altogether, given the divisor d, the algorithm the driver must follow is: + +1. Set shift = :math:`\lfloor \log_2(d) \rfloor`. +2. 
Compute :math:`m = \lceil 2^{shift + 32} / d \rceil` and :math:`e = 2^{shift + 32} % d`. +3. If :math:`e <= 2^{shift}`, then we need to use the round-down algorithm. Set + magic_divisor = m - 1 and extra_flags = 1. 4. Otherwise, set magic_divisor = + m and extra_flags = 0. diff --git a/docs/drivers/panfrost/texcomp.rst b/docs/drivers/panfrost/texcomp.rst new file mode 100644 index 00000000000..2cb6c9d59a0 --- /dev/null +++ b/docs/drivers/panfrost/texcomp.rst @@ -0,0 +1,17 @@ +Compressed texture support +========================== + +In the driver, Panfrost supports ASTC, ETC, and all BCn formats (e.g. RGTC, +S3TC, etc.) However, Panfrost depends on the hardware to support these formats +efficiently. All supported Mali architectures support these formats, but not +every system-on-chip with a Mali GPU support all these formats. Many lower-end +systems lack support for some BCn formats, which can cause problems when playing +desktop games with Panfrost. To check whether this issue applies to your +system-on-chip, Panfrost includes a ``panfrost_texfeatures`` tool to query +supported formats. + +To use this tool, include the option ``-Dtools=panfrost`` when configuring Mesa. +Then inside your Mesa build directory, the tool is located at +``src/panfrost/tools/panfrost_texfeatures``. Copy it to your target device, +set as executable as necessary, and run on the target device. A table of +supported formats will be printed to standard output. diff --git a/docs/drivers/panfrost/tiling.rst b/docs/drivers/panfrost/tiling.rst new file mode 100644 index 00000000000..08c311bd55a --- /dev/null +++ b/docs/drivers/panfrost/tiling.rst @@ -0,0 +1,38 @@ + +U-interleaved tiling +==================== + +Panfrost supports u-interleaved tiling. U-interleaved tiling is +indicated by the ``DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED`` modifier. + +The tiling reorders whole pixels (blocks). It does not compress or modify the +pixels themselves, so it can be used for any image format. 
Internally, images +are divided into tiles. Tiles occur in source order, but pixels (blocks) within +each tile are reordered according to a space-filling curve. + +For regular formats, 16x16 tiles are used. This harmonizes with the default tile +size for binning and CRCs (transaction elimination). It also means a single line +(16 pixels) at 4 bytes per pixel equals a single 64-byte cache line. + +For formats that are already block compressed (S3TC, RGTC, etc), 4x4 tiles are +used, where entire blocks are reorder. Most of these formats compress 4x4 +blocks, so this gives an effective 16x16 tiling. This justifies the tile size +intuitively, though it's not a rule: ASTC may uses larger blocks. + +Within a tile, the X and Y bits are interleaved (like Morton order), but with a +twist: adjacent bit pairs are XORed. The reason to add XORs is not obvious. +Visually, addresses take the form:: + + | y3 | (x3 ^ y3) | y2 | (y2 ^ x2) | y1 | (y1 ^ x1) | y0 | (y0 ^ x0) | + +Reference routines to encode/decode u-interleaved images are available in +``src/panfrost/shared/test/test-tiling.cpp``, which documents the space-filling +curve. This reference implementation is used to unit test the optimized +implementation used in production. The optimized implementation is available in +``src/panfrost/shared/pan_tiling.c``. + +Although these routines are part of Panfrost, they are also used by Lima, as Arm +introduced the format with Utgard. It is the only tiling supported on Utgard. On +Mali-T760 and newer, Arm Framebuffer Compression (AFBC) is more efficient and +should be used instead where possible. However, not all formats are +compressible, so u-interleaved tiling remains an important fallback on Panfrost. 
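Note on the tiling.rst hunk above: the address layout it describes for a 16x16 tile can be sketched in a few lines of Python. This is an illustration only, not the code in ``src/panfrost/shared/pan_tiling.c``. It assumes the leftmost field of the diagram (``y3``) is the most significant bit of the in-tile index, and the function name ``u_interleaved_index`` is hypothetical.

```python
def u_interleaved_index(x: int, y: int) -> int:
    """Return the index of pixel (x, y) inside one 16x16 u-interleaved tile.

    Assumed bit layout, most significant bit first (taken from the diagram):
    | y3 | x3^y3 | y2 | x2^y2 | y1 | x1^y1 | y0 | x0^y0 |
    """
    if not (0 <= x < 16 and 0 <= y < 16):
        raise ValueError("x and y must lie inside a 16x16 tile")

    index = 0
    for bit in range(4):
        x_bit = (x >> bit) & 1
        y_bit = (y >> bit) & 1
        # Even output bits hold x XOR y, odd output bits hold y.
        index |= (x_bit ^ y_bit) << (2 * bit)
        index |= y_bit << (2 * bit + 1)
    return index


if __name__ == "__main__":
    # Print the order of the top-left 4x4 corner of a tile.
    for row in range(4):
        print([u_interleaved_index(col, row) for col in range(4)])
```

Because ``y`` is stored directly and ``x`` can be recovered as ``(x ^ y) ^ y``, the mapping is a bijection on the 256 positions of a tile, so every pixel gets a unique slot.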
diff --git a/docs/envvars.rst b/docs/envvars.rst index a68483ae352..59df3964722 100644 --- a/docs/envvars.rst +++ b/docs/envvars.rst @@ -594,6 +594,9 @@ Intel driver environment variables ``sf`` emit messages about the strips & fans unit (for old gens, includes the SF program) + ``shader-print`` + allow developer print traces added by `brw_nir_printf` to be + printed out on the console ``soft64`` enable implementation of software 64bit floating point support ``sparse`` @@ -1085,6 +1088,11 @@ Rusticl environment variables - ``sync`` waits on the GPU to complete after every event - ``validate`` validates any internally generated SPIR-Vs, e.g. through compiling OpenCL C code +.. envvar:: RUSTICL_MAX_WORK_GROUPS + + Limits the amount of threads per dimension in a work-group. Useful for splitting up long running + tasks to increase responsiveness or to simulate the lowering of huge global sizes for testing. + .. _clc-env-var: clc environment variables diff --git a/docs/features.txt b/docs/features.txt index a314ae05469..b481aeb5f91 100644 --- a/docs/features.txt +++ b/docs/features.txt @@ -485,8 +485,8 @@ Vulkan 1.3 -- all DONE: anv, lvp, nvk, radv, tu, vn VK_KHR_synchronization2 DONE (anv, dzn, hasvk, lvp, nvk, panvk, radv, tu, v3dv, vn) VK_KHR_zero_initialize_workgroup_memory DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) VK_EXT_4444_formats DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) - VK_EXT_extended_dynamic_state DONE (anv, hasvk, lvp, nvk, radv, tu, vn) - VK_EXT_extended_dynamic_state2 DONE (anv, hasvk, lvp, nvk, radv, tu, vn) + VK_EXT_extended_dynamic_state DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) + VK_EXT_extended_dynamic_state2 DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) VK_EXT_inline_uniform_block DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) VK_EXT_pipeline_creation_cache_control DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) VK_EXT_pipeline_creation_feedback DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) @@ -508,7 +508,7 @@ Khronos 
extensions that are not part of any Vulkan version: VK_KHR_deferred_host_operations DONE (anv, hasvk, lvp, radv) VK_KHR_display DONE (anv, nvk, pvr, radv, tu, v3dv) VK_KHR_display_swapchain not started - VK_KHR_dynamic_rendering_local_read DONE (lvp) + VK_KHR_dynamic_rendering_local_read DONE (lvp, radv) VK_KHR_external_fence_fd DONE (anv, hasvk, nvk, pvr, radv, tu, v3dv, vn) VK_KHR_external_fence_win32 not started VK_KHR_external_memory_fd DONE (anv, dzn, hasvk, lvp, nvk, pvr, radv, tu, v3dv, vn) @@ -540,7 +540,7 @@ Khronos extensions that are not part of any Vulkan version: VK_KHR_shader_maximal_reconvergence DONE (anv, lvp, nvk, radv) VK_KHR_shader_subgroup_rotate DONE (anv, nvk, radv) VK_KHR_shader_subgroup_uniform_control_flow DONE (anv, hasvk, nvk, radv) - VK_KHR_shader_quad_control DONE (radv) + VK_KHR_shader_quad_control DONE (anv, radv) VK_KHR_shared_presentable_image not started VK_KHR_surface DONE (anv, dzn, hasvk, lvp, nvk, panvk, pvr, radv, tu, v3dv, vn) VK_KHR_surface_protected_capabilities DONE (anv, lvp, nvk, radv, tu, v3dv, vn) @@ -561,14 +561,14 @@ Khronos extensions that are not part of any Vulkan version: VK_EXT_calibrated_timestamps DONE (anv, hasvk, nvk, lvp, radv, vn) VK_EXT_color_write_enable DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) VK_EXT_conditional_rendering DONE (anv, hasvk, lvp, nvk, radv, tu, vn) - VK_EXT_conservative_rasterization DONE (anv, radv, vn) + VK_EXT_conservative_rasterization DONE (anv, nvk, radv, vn) VK_EXT_custom_border_color DONE (anv, hasvk, lvp, nvk, panvk, radv, tu, v3dv, vn) VK_EXT_debug_marker DONE (radv) VK_EXT_debug_report DONE (anv, dzn, lvp, nvk, pvr, radv, tu, v3dv) VK_EXT_depth_bias_control DONE (anv, nvk, radv) VK_EXT_depth_clip_control DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) VK_EXT_depth_clip_enable DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) - VK_EXT_depth_range_unrestricted DONE (anv/gen20+, radv, lvp) + VK_EXT_depth_range_unrestricted DONE (anv/gen20+, nvk, radv, lvp) 
VK_EXT_descriptor_buffer DONE (anv, lvp, radv, tu) VK_EXT_device_address_binding_report DONE (radv) VK_EXT_device_fault DONE (radv) @@ -590,10 +590,11 @@ Khronos extensions that are not part of any Vulkan version: VK_EXT_headless_surface DONE (anv, dzn, hasvk, lvp, nvk, panvk, pvr, radv, tu, v3dv, vn) VK_EXT_image_2d_view_of_3d DONE (anv, hasvk, lvp, nvk, radv, tu, vn) VK_EXT_image_compression_control DONE (radv) - VK_EXT_image_drm_format_modifier DONE (anv, hasvk, radv/gfx9+, tu, v3dv, vn) + VK_EXT_image_drm_format_modifier DONE (anv, hasvk, nvk, radv/gfx9+, tu, v3dv, vn) VK_EXT_image_sliced_view_of_3d DONE (anv, nvk, radv/gfx10+) VK_EXT_image_view_min_lod DONE (anv, hasvk, nvk, radv, tu, vn) VK_EXT_index_type_uint8 DONE (anv, hasvk, nvk, lvp, panvk, pvr, radv/gfx8+, tu, v3dv, vn) + VK_EXT_legacy_vertex_attributes DONE (anv, lvp, radv, tu) VK_EXT_line_rasterization DONE (anv, hasvk, nvk, lvp, radv, tu, v3dv, vn) VK_EXT_load_store_op_none DONE (anv, nvk, radv, tu, v3dv, vn) VK_EXT_memory_budget DONE (anv, hasvk, lvp, nvk, pvr, radv, tu, v3dv, vn) @@ -607,12 +608,12 @@ Khronos extensions that are not part of any Vulkan version: VK_EXT_pci_bus_info DONE (anv, hasvk, nvk, radv, vn) VK_EXT_physical_device_drm DONE (anv, hasvk, nvk, radv, tu, v3dv, vn) VK_EXT_pipeline_library_group_handles DONE (anv, radv) - VK_EXT_pipeline_robustness DONE (anv, radv, v3dv) + VK_EXT_pipeline_robustness DONE (anv, nvk, radv, v3dv) VK_EXT_post_depth_coverage DONE (anv/gfx11+, lvp, radv/gfx10+, tu) VK_EXT_primitive_topology_list_restart DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) VK_EXT_primitives_generated_query DONE (anv, hasvk, lvp, nvk, radv, tu, vn) VK_EXT_provoking_vertex DONE (anv, hasvk, lvp, nvk, radv, tu, v3dv, vn) - VK_EXT_queue_family_foreign DONE (anv, hasvk, lvp, radv, tu, vn) + VK_EXT_queue_family_foreign DONE (anv, hasvk, nvk, lvp, radv, tu, vn) VK_EXT_rasterization_order_attachment_access DONE (lvp, tu, vn) VK_EXT_robustness2 DONE (anv, hasvk, lvp, nvk, radv, tu, vn) 
VK_EXT_sample_locations DONE (anv, hasvk, nvk, radv/gfx9-, tu/a650+) @@ -663,7 +664,9 @@ Khronos extensions that are not part of any Vulkan version: VK_EXT_depth_clamp_zero_one DONE (anv, radv) VK_INTEL_shader_integer_functions2 DONE (anv, hasvk, radv) VK_KHR_map_memory2 DONE (anv, nvk, radv, tu) - + VK_EXT_map_memory_placed DONE (anv, nvk, radv, tu) + VK_MESA_image_alignment_control DONE (radv) + VK_EXT_legacy_dithering DONE (anv) Clover OpenCL 1.0 -- all DONE: diff --git a/docs/header-stubs/compiler/spirv/spirv_info.h b/docs/header-stubs/compiler/spirv/spirv_info.h new file mode 100644 index 00000000000..d8db07f5f1b --- /dev/null +++ b/docs/header-stubs/compiler/spirv/spirv_info.h @@ -0,0 +1 @@ +struct spirv_capabilities {}; diff --git a/docs/header-stubs/vk_enum_to_str.h b/docs/header-stubs/vk_enum_to_str.h new file mode 100644 index 00000000000..e69de29bb2d --- /dev/null +++ b/docs/header-stubs/vk_enum_to_str.h diff --git a/docs/release-calendar.csv b/docs/release-calendar.csv index ed0ddca652c..8c3af5c6648 100644 --- a/docs/release-calendar.csv +++ b/docs/release-calendar.csv @@ -1,5 +1,2 @@ -24.0,2024-05-08,24.0.7,Eric Engestrom -,2024-05-22,24.0.8,Eric Engestrom -24.1,2024-05-01,24.1.0-rc2,Eric Engestrom -,2024-05-08,24.1.0-rc3,Eric Engestrom -,2024-05-15,24.1.0-rc4,Eric Engestrom,or 24.1.0 final +24.0,2024-05-22,24.0.8,Eric Engestrom +24.1,2024-05-22,24.1.0-rc5,Eric Engestrom,or 24.1.0 final diff --git a/docs/relnotes.rst b/docs/relnotes.rst index bf788eaae16..3d273e35112 100644 --- a/docs/relnotes.rst +++ b/docs/relnotes.rst @@ -3,6 +3,7 @@ Release Notes The release notes summarize what's new or changed in each Mesa release. +- :doc:`24.0.7 release notes <relnotes/24.0.7>` - :doc:`24.0.6 release notes <relnotes/24.0.6>` - :doc:`24.0.5 release notes <relnotes/24.0.5>` - :doc:`24.0.4 release notes <relnotes/24.0.4>` @@ -417,6 +418,7 @@ The release notes summarize what's new or changed in each Mesa release. 
:maxdepth: 1 :hidden: + 24.0.7 <relnotes/24.0.7> 24.0.6 <relnotes/24.0.6> 24.0.5 <relnotes/24.0.5> 24.0.4 <relnotes/24.0.4> diff --git a/docs/relnotes/24.0.7.rst b/docs/relnotes/24.0.7.rst new file mode 100644 index 00000000000..0eaecdec76f --- /dev/null +++ b/docs/relnotes/24.0.7.rst @@ -0,0 +1,155 @@ +Mesa 24.0.7 Release Notes / 2024-05-08 +====================================== + +Mesa 24.0.7 is a bug fix release which fixes bugs found since the 24.0.6 release. + +Mesa 24.0.7 implements the OpenGL 4.6 API, but the version reported by +glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / +glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. +Some drivers don't support all the features required in OpenGL 4.6. OpenGL +4.6 is **only** available if requested at context creation. +Compatibility contexts may report a lower version depending on each driver. + +Mesa 24.0.7 implements the Vulkan 1.3 API, but the version reported by +the apiVersion property of the VkPhysicalDeviceProperties struct +depends on the particular driver being used. + +SHA256 checksum +--------------- + +:: + + 7454425f1ed4a6f1b5b107e1672b30c88b22ea0efea000ae2c7d96db93f6c26a mesa-24.0.7.tar.xz + + +New features +------------ + +- None + + +Bug fixes +--------- + +- mesa 24 intel A770 KOTOR black shadow smoke scenes +- Graphical glitches in RPCS3 after updating Vulkan Intel drivers +- [R600] OpenGL and VDPAU regression in Mesa 23.3.0 - some bitmaps get distorted. 
+- VAAPI radeonsi: VBAQ broken with HEVC +- radv: vkCmdWaitEvents2 is broken +- Zink: enabled extensions and features may not match + + +Changes +------- + +Boris Brezillon (3): + +- panfrost: do not write outside num_wg_sysval +- panfrost: Add the BO containing fragment program descriptor to the batch +- pan/kmod: Make default allocator thread-safe + +Constantine Shablia (2): + +- pan/bi: fix 1D array tex coord lowering +- panfrost: report correct MAX_VARYINGS + +Daniel Schürmann (1): + +- aco/ra: fix kill flags after renaming fixed Operands + +David Rosca (5): + +- radeonsi/vcn: Allocate session buffer in VRAM +- radeonsi/vcn: Fix 10bit HEVC VPS general_profile_compatibility_flags +- radeonsi/vcn: Only enable VBAQ with rate control mode +- frontends/va: Fix AV1 slice_data_offset with multiple slice data buffers +- Revert "radeonsi/vcn: AV1 skip the redundant bs resize" + +Eric Engestrom (6): + +- docs: add sha256sum for 24.0.6 +- .pick_status.json: Update to 86281ef15fca378ef48bcb072a762168e537820d +- .pick_status.json: Mark 0666a715c7210558017ce717f6b0b947c679a68e as denominated +- .pick_status.json: Update to 603982ea802b3846e91a943b413a7baf430e875d +- .pick_status.json: Update to 9666756f603f0285d8a93ef93db1c7ec702b671f +- .pick_status.json: Update to b8e79d2769b4a4aed7e2103cf0405acc5bdadb86 + +Erik Faye-Lund (2): + +- panfrost: correct first-tracking for signature +- panvk: avoid dereferencing a null-pointer + +Georg Lehmann (1): + +- radv, radeonsi: don't use D16 for f2f16_rtz + +Gert Wollny (1): + +- zink/kopper: Wait for last QueuePresentKHR to finish before acquiring for readback + +Ian Romanick (1): + +- intel/brw: Fix optimize_extract_to_float for i2f of unsigned extract + +Iván Briano (2): + +- anv: check requirements for VK_IMAGE_USAGE_FRAGMENT_SHADING_RATE +- anv: fix casting to graphics_pipeline_base + +Karol Herbst (2): + +- nir: fix nir_shader_get_function_for_name for functions without names. 
+- rusticl: use stream uploader for cb0 if prefered + +Kenneth Graunke (1): + +- isl: Set MOCS to uncached for Gfx12.0 blitter sources/destinations + +Konstantin Seurer (1): + +- radv: Handle all dependencies of CmdWaitEvents2 + +Lionel Landwerlin (2): + +- anv: disable dual source blending state if not used in shader +- intel/brw: fixup wm_prog_data_barycentric_modes() + +Mike Blumenkrantz (8): + +- zink: reconstruct features pnext after determining extension support +- glthread: check for invalid primitive modes in DrawElementsBaseVertex +- zink: prune zink_shader::programs under lock +- zink: fully wait on all program fences during ctx destroy +- kopper: fix bufferage/swapinterval handling for non-window swapchains +- zink: slightly better swapinterval failure handling +- zink: clean up accidental debug print +- zink: add a tu flake + +Patrick Lerda (1): + +- gallium/auxiliary/vl: fix typo which negatively impacts the src_stride initialization + +Rohan Garg (1): + +- anv: formatting fix when printing pipe controls + +Samuel Pitoiset (1): + +- radv: fix image format properties with fragment shading rate usage + +Sviatoslav Peleshko (1): + +- anv: Fix descriptor sampler offsets assignment + +Tapani Pälli (1): + +- iris: change stream uploader default size to 2MB + +Yiwei Zhang (2): + +- venus: avoid client allocators for ring internals +- venus: fix to destroy all pipeline handles on early error paths + +Yusuf Khan (1): + +- nouveau: Fix crash when destination or source screen fences are null diff --git a/docs/relnotes/new_features.txt b/docs/relnotes/new_features.txt index e69de29bb2d..eec9619f4ac 100644 --- a/docs/relnotes/new_features.txt +++ b/docs/relnotes/new_features.txt @@ -0,0 +1,3 @@ +VK_KHR_dynamic_rendering_local_read on RADV +VK_EXT_legacy_vertex_attributes on lavapipe, ANV, Turnip and RADV +VK_MESA_image_alignment_control on RADV diff --git a/docs/rusticl.rst b/docs/rusticl.rst index 28cf87dda0e..5b7f56e24f2 100644 --- a/docs/rusticl.rst +++ 
b/docs/rusticl.rst @@ -32,8 +32,8 @@ To build Rusticl you need to satisfy the following build dependencies: The minimum versions to build Rusticl are: - Rust: 1.66 -- Meson: 1.3.1 -- Bindgen: 0.62.0 +- Meson: 1.4.0 +- Bindgen: 0.65.0 - LLVM: 15.0.0 - Clang: 15.0.0 Updating clang requires a rebuilt of mesa and rusticl if and only if the value of |