path: root/src/gallium/drivers/radeonsi
Age | Commit message | Author | Files | Lines
8 hours | radeonsi: respect pipe_picture_desc::flush_flags | Chia-I Wu | 6 | -7/+7
It is not always possible to assume PIPE_FLUSH_ASYNC. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28771>
8 hours | radeonsi: prep for pipe_picture_desc::flush_flags | Chia-I Wu | 3 | -19/+19
Make sure video codecs support flush flags. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28771>
13 hours | radeonsi/vcn: set accurate size for dec header and index_codec | David (Ming Qiang) Wu | 1 | -1/+10
Each codec has its own size in the dec message, for example: AVC has sizeof(rvcn_dec_message_avc_t) and AV1 has sizeof(rvcn_dec_message_av1_t) This patch will set the correct size for index_codec section and set the total_size properly for the dec message header. Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28886>
23 hours | ac/debug,radv: Read UMR wave dumps into memory before parsing | Konstantin | 1 | -1/+1
Allows RADV to reuse the wave dump, which leads to more consistency between pipeline.log and umr_waves.log. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28838>
38 hours | radeonsi: implement user_data_amd for 5, 6, and 7 components correctly (tag: 24.1-branchpoint) | Marek Olšák | 3 | -5/+17
NIR can't handle those component counts, so we have to split it into 2 SGPR vectors where each has max 4 components. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
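The split described in the commit above can be sketched as follows. This is a minimal illustration in C, not the driver's code: `split_components` and `struct vec_split` are hypothetical names. The only rules taken from the log are that each of the two SGPR vectors holds at most 4 components and that the sysval has at most 8 components.

```c
#include <assert.h>

/* Hypothetical illustration: component counts of the two SGPR vectors. */
struct vec_split {
   unsigned lo; /* components in the first vector (at most 4) */
   unsigned hi; /* components in the second vector (0 if not needed) */
};

static struct vec_split split_components(unsigned num_components)
{
   struct vec_split s;

   /* Assumption: user_data_amd has between 1 and 8 components. */
   assert(num_components >= 1 && num_components <= 8);

   /* Fill the first vector up to 4 components, put the rest in the second. */
   s.lo = num_components < 4 ? num_components : 4;
   s.hi = num_components - s.lo;
   return s;
}
```

Under these assumptions, a 5-component value becomes a 4-component vector plus a 1-component vector, and a 7-component value becomes 4 plus 3.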
38 hours | radeonsi: use ip_type in debug code instead of hardcoding GFX | Marek Olšák | 3 | -16/+26
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: always run nir_opt_16bit_tex_image | Marek Olšák | 1 | -1/+1
It optimizes constants in srcs to 16 bits. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: only expose 8 EQAA samples due to shader limitations | Marek Olšák | 1 | -5/+5
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: don't add whether NIR is used into the shader key | Marek Olšák | 1 | -2/+1
This is from when we had TGSI and NIR was a debug option. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: make clear_render_target clear DCC directly instead of via pipe->clear() | Marek Olšák | 2 | -6/+94
This extracts the relevant parts from si_fast_clear. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: enable fast FB clears for conditional rendering | Marek Olšák | 5 | -14/+18
They use compute shaders, which always support the render condition. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: don't flush CB and DB if there have been no draw calls | Marek Olšák | 2 | -9/+31
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: don't flush CB in si_launch_grid_internal_images if not needed | Marek Olšák | 1 | -3/+5
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: don't use si_get_flush_flags() for flushing images | Marek Olšák | 3 | -4/+14
si_make_{CB/DB}_shader_coherent are more correct. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: disable VRS flat shading for selected 8xMSAA and thick tiling cases | Marek Olšák | 3 | -1/+12
for better slow clear performance Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi/gfx11: implement DCC clear to "single" for fast non-0/1 clears | Marek Olšák | 5 | -5/+157
If the clear color isn't 0 or 1, we used a slow clear. This adds a new DCC clear where the DCC buffer is cleared to a special value and the clear color is stored at the beginning of each 256B block in the image. It can be very fast, but it's not always faster than a slow clear. There is a heuristic that determines whether this new fast clear is better. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
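The image layout this commit message describes (the clear color stored at the beginning of each 256B block) can be pictured with a CPU-side sketch. This is an illustration only, not the driver's code: the function name, the flat byte-buffer view of the image, and the `BLOCK_SIZE` macro are assumptions. The separate step of clearing the DCC buffer to its special value is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Block granularity named in the commit message. */
#define BLOCK_SIZE 256

/* Hypothetical helper: write the clear color at the start of every
 * 256-byte block of a linear byte buffer standing in for the image. */
static void store_clear_color_per_block(uint8_t *image, size_t image_size,
                                        const void *color, size_t color_size)
{
   /* Assumption: one color value fits inside a single block. */
   assert(color_size > 0 && color_size <= BLOCK_SIZE);

   for (size_t offset = 0; offset + color_size <= image_size; offset += BLOCK_SIZE)
      memcpy(image + offset, color, color_size);
}
```

Only a few bytes per block are written instead of every pixel, which is why such a clear can be fast. The commit notes that it is still not always faster than a slow clear, hence the heuristic.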
38 hours | radeonsi: don't call resource_copy_region in pipe->blit | Marek Olšák | 1 | -10/+0
It's slower because it forces preservation of NaNs. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: change allow_flat_shading to make it a single condition | Marek Olšák | 1 | -6/+4
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: remove si_use_compute_copy_for_float_formats | Marek Olšák | 1 | -26/+0
Gfx blits preserve NaNs now, so this is no longer needed. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: use simpler UINT fallback formats for draw-based resource_copy_region | Marek Olšák | 1 | -10/+5
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: preserve NaNs in draw-based resource_copy_region | Marek Olšák | 1 | -3/+7
Gfx copies are faster sometimes, so they should be able to copy anything. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: move blitter clear_render_target impl into si_gfx_clear_render_target | Marek Olšák | 2 | -0/+15
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: move blitter resource_copy_region implementation to si_gfx_copy_image | Marek Olšák | 2 | -7/+20
for a new performance test. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: allow input NIR to use descriptors in image opcodes | Marek Olšák | 1 | -0/+5
Skip lowering because there is nothing to lower. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: don't expose samples_identical and don't lower FMASK if it's disabled | Marek Olšák | 2 | -2/+3
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: fix initialization of occlusion query buffers for disabled RBs | Marek Olšák | 1 | -10/+22
GFX9+ should assume the enabled RB results are packed (no holes). Same as PAL. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: move TCS epilog key bits to the key->ge.opt section | Marek Olšák | 4 | -22/+12
Since the TCS epilog is no more, this is required to apply those bits to monolithic shaders. tessfactors_are_def_in_all_invocs was unused. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: check has_stable_pstate in the winsys | Marek Olšák | 1 | -5/+3
so that we don't duplicate the condition everywhere Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi: add the radeonsi_optimize_io option into the shader cache key | Marek Olšák | 1 | -3/+2
otherwise the options would be ignored if the shader cache had already cached the same shader with the option inverted. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
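The failure mode this commit fixes can be shown with a small sketch: if an option that changes compilation is not part of the cache key, both settings map to the same entry and the first cached result wins. The code below is a hypothetical illustration in C, not the driver's code. `shader_cache_key` and `fnv1a` are made-up names, and FNV-1a is only a stand-in for whatever hash the real shader cache uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over a byte range; a stand-in hash chosen for brevity. */
static uint32_t fnv1a(uint32_t hash, const void *data, size_t size)
{
   const uint8_t *bytes = data;

   assert(data != NULL || size == 0);

   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= 16777619u; /* 32-bit FNV prime */
   }
   return hash;
}

/* Hypothetical cache key: the shader bytes followed by the option value,
 * so that toggling the option produces a different key. */
static uint32_t shader_cache_key(const void *shader, size_t size, int optimize_io)
{
   const uint8_t option = optimize_io ? 1 : 0;
   uint32_t hash = fnv1a(2166136261u, shader, size); /* FNV offset basis */

   return fnv1a(hash, &option, sizeof(option));
}
```

With the option folded into the key, the same shader compiled with the option on and off occupies two distinct cache entries, so a stale entry compiled with the inverted setting is never returned.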
38 hours | radeonsi: use the same nir_lower_subgroups_options as RADV | Marek Olšák | 6 | -22/+33
Some FREE calls are removed because nir_options is always NULL there. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi/gfx11: enable DCC fast clears for 8-bit and 16-bit formats | Marek Olšák | 1 | -4/+0
They seem to work fine. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi/gfx11: don't prefetch constants in binaries into the instruction cache | Marek Olšák | 4 | -8/+18
Only prefetch shader instructions. There will be more GFX versions in that list. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
38 hours | radeonsi/ci: update gfx11 failures | Marek Olšák | 1 | -0/+4
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725>
3 days | ac,radeonsi: add helpers to compute the number of tess patches/lds size | Samuel Pitoiset | 1 | -75/+6
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28015>
9 days | radeonsi: Adds return on failure to get plane info | Surafel Assefa | 1 | -4/+6
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27456>
13 days | nir: change "user_data_amd" sysval from 4 to 8 components | Marek Olšák | 2 | -2/+2
so that we can pass more fast constants to compute shaders (without reading memory in the shader). Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28606>
13 days | ac/llvm: remove handling of input and output loads/stores that are lowered | Marek Olšák | 2 | -7/+3
There is a lot that we still use. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28607>
13 days | ac/llvm: add support for 16-bit coordinates (A16) for image (non-sampler) opcodes | Marek Olšák | 1 | -0/+1
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28607>
2024-04-11 | radeonsi/uvd_enc: update to use correct padding size | nyanmisaka | 2 | -4/+4
Update padding size calculation to use cropping. Original method could result in 0 padding, which generated unnecessary noise in the encoding result. Cc: mesa-stable Fixes: mesa/mesa#9196 Signed-off-by: nyanmisaka <nst799610810@gmail.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28369>
2024-04-11 | nir: rename to nir_opt_16bit_tex_image | Georg Lehmann | 1 | -8/+8
Not sure what I was thinking when I wrote this pass (probably not much), but opt makes more sense and matches other nir passes. Fold is usually used for constants, and this pass handles more than those. Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28662>
2024-04-10 | radeonsi/vpe: add support for p010 | Peyton Lee | 1 | -6/+42
add support for p010, correct the settings of format and buffer pitch. Signed-off-by: Peyton Lee <peytolee@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28518>
2024-04-03 | ac/nir/tess: Remove superfluous args for reserved TCS outputs. | Timur Kristóf | 1 | -2/+0
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28487>
2024-04-03 | ac/nir/tess: Clarify when a TCS output is stored in LDS or VRAM. | Timur Kristóf | 1 | -1/+1
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28487>
2024-04-01 | radeonsi/vcn: use num_instances from radeon_info | Sathishkumar S | 1 | -3/+3
num_instances is used to track ip count not num_queues. Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28252>
2024-03-30 | ac/nir/tess: Remove dead code that was meant for epilogs. | Timur Kristóf | 1 | -3/+1
We no longer need to emit store_output intrinsics at the end of the shaders. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
2024-03-30 | radeonsi: Use one more bit for number of patches in TCS offchip layout. | Timur Kristóf | 3 | -19/+11
There was 1 more bit left, may as well use it for something. In the future, this may allow increasing the maximum number of patches per workgroup. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
2024-03-30 | radeonsi: Remove tess bits from VS state. | Timur Kristóf | 4 | -26/+5
These parts are not used anymore, therefore we no longer need to change the VS state when tessellation states change. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
2024-03-30 | radeonsi: Add number of VS outputs to TCS output layout. | Timur Kristóf | 3 | -9/+17
Use tcs_offchip_layout instead of VS state to determine the number of LS outputs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
2024-03-30 | radeonsi: Delete TCS epilogs entirely. | Timur Kristóf | 9 | -610/+5
Always emit the tessellation factor writes in the main shader, which is doable now that the necessary information is in the tcs_offchip_layout SGPR. This eliminates the need for TCS epilogs, so delete them entirely from RadeonSI. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>
2024-03-30 | radeonsi: Implement dynamic TCS intrinsics for non-monolithic shaders. | Timur Kristóf | 3 | -3/+17
Put the primitive mode and whether TES reads tess factors into the tcs_offchip_layout SGPR, so they can be used by the main shader instead of needing the epilog. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28425>