Age | Commit message | Author | Files | Lines
3 days | radeonsi/sqtt: export shader code to RGP | Pierre-Eric Pelloux-Prayer | 3 | -1/+231
With these changes the shader code is visible in RGP. Vk pipeline feature is emulated using si_update_shaders: when shaders are updated we compute a sha1 of their code and use it as a pipeline hash. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>
3 days | radeonsi/sqtt: don't always use WGP 0 | Pierre-Eric Pelloux-Prayer | 1 | -3/+12
Because it may be disabled. Instead use the cu mask to pick the first active WGP. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>
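The selection described above can be sketched as follows. This is a minimal illustration, not the driver's code: it assumes the mask has one bit per CU and that a WGP groups two adjacent CUs, and the helper name `first_active_wgp` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: return the index of the first WGP that has at least
 * one active CU, or -1 if the mask is empty.
 * Assumptions: bit i of cu_mask is set when CU i is active, and WGP n
 * covers CUs 2n and 2n+1. */
static int first_active_wgp(uint32_t cu_mask)
{
   for (int cu = 0; cu < 32; cu++) {
      if (cu_mask & (UINT32_C(1) << cu))
         return cu / 2;
   }
   return -1;
}
```

With this sketch, a mask whose two lowest CUs are disabled (for example 0xC) selects WGP 1 instead of the hardcoded WGP 0.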
3 days | radeonsi/sqtt: remove duplicate token | Pierre-Eric Pelloux-Prayer | 1 | -1/+0
V_008D18_REG_INCLUDE_CONTEXT was set twice. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>
3 days | radeonsi/sqtt: keep a copy of the uploaded shader code | Pierre-Eric Pelloux-Prayer | 2 | -2/+16
Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>
3 days | ac/rgp: make the max gap between shader code a warning | Pierre-Eric Pelloux-Prayer | 1 | -11/+12
For radeonsi the shaders don't live in the same BOs, so they're unlikely to be less than 0x1000 bytes apart. This commit therefore bumps the threshold to 0x10000 and warns once when hitting it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>
3 days | radeonsi: properly set SPI_SHADER_PGM_HI_ES | Pierre-Eric Pelloux-Prayer | 1 | -1/+1
When not using S_00B324_MEM_BASE the value isn't properly truncated. Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>
3 days | radv: don't set sx_blend_opt_epsilon for V_028C70_COLOR_10_11_11 | Rhys Perry | 1 | -3/+1
Matches radeonsi and PAL. From PAL: // 1 is recommended, but doesn't provide sufficient precision Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4394 Fixes: ed946381564 ("radv: Enable RB+ where possible.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9427>
5 days | radeonsi: don't crash on NULL images in si_check_needs_implicit_sync | Marek Olšák | 1 | -1/+1
This fixes CTS test: KHR-GL46.arrays_of_arrays_gl.AtomicUsage Fixes: bddc0e023c "radeonsi: fix read from compute / write from draw sync" Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9361>
6 days | radeonsi: don't index si_context::shaders with enum gl_shader_stage | Marek Olšák | 2 | -3/+5
Fixes: a8373b3d387 "radeonsi: store si_context::xxx_shader members in union" Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9313>
7 days | radeonsi,radv: do not overallocate the SQTT buffer size | Samuel Pitoiset | 4 | -18/+27
The number of shader engines isn't always 4. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9307>
8 days | frontends/va: Use correct size for secondary planes. | Bas Nieuwenhuizen | 1 | -2/+6
And initialize the whandle format while at it. Fixes: f7a4051b836 ("radeonsi: Check pitch and offset for validity.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4126 Reviewed-by: Simon Ser <contact@emersion.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9236>
8 days | radeonsi/uvd: make format modifiers-aware | Simon Ser | 3 | -4/+24
When format modifiers are supported, use resource_create_with_modifiers instead of resource_create. This allows radeonsi to set the modifier field, and allows VA-API clients to have a proper modifier instead of DRM_FORMAT_MOD_INVALID. Signed-off-by: Simon Ser <contact@emersion.fr> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9308>
13 days | radv: Disable displayable DCC for GFX8 properly. | Bas Nieuwenhuizen | 1 | -1/+1
On scanout, the GFX8 ac_surface doesn't clear the size; it simply doesn't allocate space, and hence dcc_offset is 0. This is the same as radeonsi. Fixes: 7acb30de8ac ("radv: Enable displayable DCC.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4346 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9221>
13 days | radv: Implement displayable DCC retiling. | Bas Nieuwenhuizen | 7 | -0/+347
Straightforward implementation using the retile map from radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9042>
13 days | ci: Move the dEQP and traces expectations to the per-driver CI dirs. | Eric Anholt | 46 | -7/+3
This means less custom test-source-dep stuff for these drivers, though it means that touching the CI expects files will cause a bit more retesting:
- broadcom drivers retest as a group (but Igalia requested that organization of CI files)
- radv+radeonsi retest as a group
- lvp+llvmpipe retest as a group

Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9161>
2021-02-19 | ac/rgp,radeonsi,radv: pass struct thread_trace_data to ac_sqtt_dump_data() | Yogesh Mohan Marimuthu | 5 | -8/+16
struct thread_trace_data holds struct rgp_code_object, struct rgp_loader_events, struct rgp_pso_correlation data. This data is required in function ac_sqtt_dump_data(). This patch makes the code changes required to pass struct thread_trace_data to function ac_sqtt_dump_data(). Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8609>
2021-02-19 | radv: set correct value for OFFCHIP_BUFFERING on GFX10+ | Samuel Pitoiset | 1 | -1/+1
Higher values break tessellation. I was only able to reproduce this by switching back/from AMDVLK which was really weird... According to Marek (1c6eca23fdd8), it looks like it's related to register shadowing and PAL enables it, that probably explains a bit. Copied from PAL and RadeonSI. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4207 Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2498 Fixes: 74d69299d16 ("radv/gfx10: double the number of tessellation offchip buffers per SE") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9141>
2021-02-17 | radeonsi: force dcc clear to use compute clear | Pierre-Eric Pelloux-Prayer | 1 | -2/+8
After the previous commit, when running the following deqp-gles31 caselist:
dEQP-GLES31.functional.image_load_store.2d.format_reinterpret.rgba32f_rgba32ui
dEQP-GLES31.functional.image_load_store.2d.format_reinterpret.rgba32f_rgba32i

The second test always fails on gfx10. I don't know why, but forcing the dcc clear from si_decompress_dcc to use compute fixes the problem. The test caselist wasn't failing before because the dcc disable step was done in si_resource_copy_region, before calling si_compute_copy_image. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8958>
2021-02-17 | radeonsi: enable dcc image stores on gfx10+ | Pierre-Eric Pelloux-Prayer | 7 | -15/+20
This was implemented in 1d3bffaf9cb7ade0676bab969b5d33d6bdabcec8, but was missing the WRITE_COMPRESS_ENABLE bit, and was then disabled by 4dc6ed2a59040f04648eadbffeb1522587d00f3. This commit reimplements it to:
- avoid disabling dcc when uploading FP16 textures (see si_use_compute_copy_for_float_formats)
- be able to use compute to upload textures in more cases, rather than using the blit path

Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8958>
2021-02-17 | radeonsi: replace force_cp_dma arg of si_clear_buffer by enum | Pierre-Eric Pelloux-Prayer | 6 | -13/+25
The new enum has 3 values:
- SI_CP_DMA_CLEAR_METHOD: equivalent to force_cp_dma = true
- SI_COMPUTE_CLEAR_METHOD: to force the clear to use compute
- SI_AUTO_SELECT_CLEAR_METHOD: equivalent to force_cp_dma = false

No functional change yet, but this will be used later. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8958>
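As a rough illustration of the mapping between the old boolean and the new enum, the sketch below uses the enumerator names from the commit message. The enum tag `clear_method` and the helper `clear_method_from_force_cp_dma` are hypothetical and only show how existing call sites could translate.

```c
#include <stdbool.h>

/* Enumerator names are taken from the commit message; the enum tag and
 * the helper below are hypothetical. */
enum clear_method {
   SI_CP_DMA_CLEAR_METHOD,      /* old behavior of force_cp_dma = true */
   SI_COMPUTE_CLEAR_METHOD,     /* new: always clear with a compute shader */
   SI_AUTO_SELECT_CLEAR_METHOD, /* old behavior of force_cp_dma = false */
};

/* Translate the old boolean argument into the equivalent enum value. */
static enum clear_method clear_method_from_force_cp_dma(bool force_cp_dma)
{
   return force_cp_dma ? SI_CP_DMA_CLEAR_METHOD : SI_AUTO_SELECT_CLEAR_METHOD;
}
```

Note that SI_COMPUTE_CLEAR_METHOD has no boolean equivalent, which is what makes the enum necessary for later commits.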
2021-02-17 | radeonsi: set MEM_ORDERED optimally | Marek Olšák | 3 | -6/+34
It must be 1 only if both sampler and non-sampler VMEM instructions that return something are used. BVH counts as a sampler instruction. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
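The rule stated above reduces to a logical AND of two gathered flags. The sketch below assumes hypothetical field and function names (`uses_sampler_or_bvh_return`, `uses_other_vmem_return`, `mem_ordered_bit`); only the rule itself comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical per-shader info; the field names are illustrative only. */
struct vmem_usage {
   bool uses_sampler_or_bvh_return; /* sampler (or BVH) VMEM that returns data */
   bool uses_other_vmem_return;     /* non-sampler VMEM that returns data */
};

/* MEM_ORDERED is 1 only when both kinds of returning VMEM instructions
 * are present in the shader; otherwise it can stay 0. */
static unsigned mem_ordered_bit(const struct vmem_usage *u)
{
   return (u->uses_sampler_or_bvh_return && u->uses_other_vmem_return) ? 1u : 0u;
}
```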
2021-02-17 | radeonsi: gather shader info about VMEM usage for MEM_ORDERED | Marek Olšák | 2 | -0/+51
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 | radeonsi: gather shader info about indirect UBO/SSBO/samplers/images | Marek Olšák | 2 | -1/+46
A future commit will use it. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 | radeonsi: gather info about bindless images and memory stores with strstr(intr) | Marek Olšák | 1 | -54/+13
This is only code simplification. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 | radeonsi: fix the value of uses_bindless_samplers | Marek Olšák | 1 | -14/+6
We don't have any nir_variables for uniforms, so this code wasn't doing anything. Also, uniform handles are almost always uniforms. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 | radeonsi: do late NIR optimizations after uniform inlining | Marek Olšák | 3 | -9/+16
This was missing. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 | radeonsi: allocate filled_size for streamout targets in set_streamout_buffers | Marek Olšák | 1 | -9/+9
so that create_stream_output_target doesn't use the context and can be called from any thread. This is for u_threaded_context. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 | radeonsi: improve comments in si_emit_derived_tess_state | Marek Olšák | 1 | -11/+8
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 | radeonsi: for tess, determine the minimum num_patches before optimizing tg size | Marek Olšák | 1 | -16/+16
Doing these MINs at the end could have undone optimizations for the LDS size and threadgroup size, so move the MINs up. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028>
2021-02-17 | radeonsi: fix si_check_render_feedback | Pierre-Eric Pelloux-Prayer | 2 | -9/+17
si_check_render_feedback only relied on si_images::enabled_mask and si_samplers::enabled_mask to determine if a texture was being used both as input and output. Given that some samplers/images can be considered active (so accounted for by enabled_mask) but not used by the current shader, this could lead to false positives. This commit fixes this by and-ing the above masks with the information from shader_info for each active shader. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4227 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8869>
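The masking step can be sketched as below. The function and parameter names are hypothetical; the only part taken from the commit message is that the bound-slot mask is and-ed with what the current shader actually uses.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the slots that are both bound
 * (enabled_mask) and actually referenced by the current shader
 * (shader_used_mask, assumed to come from shader info). */
static uint32_t slots_to_check(uint32_t enabled_mask, uint32_t shader_used_mask)
{
   return enabled_mask & shader_used_mask;
}
```

A slot that is bound but unused by the shader drops out of the result, so it can no longer trigger a false render-feedback detection.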
2021-02-17 | radeonsi: fix read from compute / write from draw sync | Pierre-Eric Pelloux-Prayer | 5 | -4/+67
A compute dispatch should see the result of a previous draw command. radeonsi was missing this implicit sync, causing rendering artifacts: the compute shader was reading from a texture still being written to by the previous draw. Framebuffer BOs are marked with RADEON_USAGE_NEEDS_IMPLICIT_SYNC, so compute jobs will sync. v2: use RADEON_USAGE_NEEDS_IMPLICIT_SYNC v3: unconditionally make CB coherent after a flush Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> (v3) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v3) Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4032 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2878 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1336 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8869>
2021-02-17 | radeonsi: store si_context::xxx_shader members in union | Pierre-Eric Pelloux-Prayer | 8 | -197/+189
This allows accessing them individually (sctx->shader.ps) or using array indexing (sctx->shaders[PIPE_SHADER_FRAGMENT]). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8869>
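The union layout this describes can be sketched in a reduced form. Everything below is a simplified stand-in: the two-stage enum, `struct shader_slot`, and `struct ctx` are hypothetical, and the real context has more stages whose member order must match the enum used for indexing.

```c
/* Hypothetical two-stage enum; the named members below are declared in
 * the same order so that both views alias the same storage. */
enum stage { STAGE_VERTEX, STAGE_FRAGMENT, STAGE_COUNT };

struct shader_slot {
   int id;
};

struct ctx {
   union {
      struct {
         struct shader_slot vs; /* aliases shaders[STAGE_VERTEX] */
         struct shader_slot ps; /* aliases shaders[STAGE_FRAGMENT] */
      } shader;
      struct shader_slot shaders[STAGE_COUNT];
   };
};

/* Both views must cover exactly the same bytes. */
_Static_assert(sizeof(((struct ctx *)0)->shader) ==
               sizeof(((struct ctx *)0)->shaders),
               "named and indexed views must have the same size");
```

Writing through one view and reading through the other is only safe when the index enum matches the member order, which is why indexing with a different stage enum is a bug.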
2021-02-17 | radeonsi: fix indentation issue in si_texture.c | Pierre-Eric Pelloux-Prayer | 1 | -2/+2
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8869>
2021-02-15 | mesa: remove an optional GL error about mapped buffers during execution | Marek Olšák | 3 | -31/+9
Not having this here, even if the branch is not taken, increases CPU performance by 2% on radeonsi. If some drivers need this, the spec does allow GL termination, meaning abort(), which is a more effective alternative given that this never happens. You may ask, do we really pay a 2% performance hit for every conditional not taken? For some of them, we do. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8798>
2021-02-13 | radeonsi: add debug options nodisplaytiling and nodisplaydcc | Marek Olšák | 3 | -2/+9
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8892>
2021-02-13 | radeonsi: skip s_sendmsg(gs_alloc_req) for NGG passthrough on new chips | Marek Olšák | 3 | -2/+10
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8892>
2021-02-09 | ci: Disable two radeonsi jobs | Tomeu Vizoso | 1 | -4/+4
The machine to which these boards are connected is having trouble keeping up when the rootfs are expanded. This is causing jobs to time out and fail. So, as a mitigation measure, reduce the load by disabling two of these jobs until the root problem is solved. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8930>
2021-02-09 | gallium/u_tests: test no-op fragment shader instead of NULL fragment shader | Marek Olšák | 1 | -2/+6
radeonsi stopped supporting NULL fragment shaders. This makes the test pass. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8906>
2021-02-08 | radeonsi: don't use cp_dma prefetch on GFX6 | Pierre-Eric Pelloux-Prayer | 1 | -2/+4
It's not supported. Fixes: 47587758f21 ("radeonsi: prefetch VB descriptors right after uploading") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4211 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8914>
2021-02-03 | winsys/amdgpu,radeonsi: add HUD counters for how much memory is wasted by slabs | Marek Olšák | 9 | -1/+47
Slabs always allocate the next power of two size from their pools. This wastes memory if the size is not a power of two. bo->base.size is overwritten because the default is the allocated power of two size, but we need the real size to compute the wasted size in amdgpu_bo_slab_destroy. entry_size is added to the hole in pb_slab_entry to hold the real entry size. Like other memory stats, no atomics are used. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8683>
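The waste computation follows directly from the rounding rule. The sketch below is a standalone illustration with hypothetical helper names (`next_pow2`, `slab_wasted_bytes`); it is not the winsys code and assumes the requested size is at most 2^63 bytes.

```c
#include <stdint.h>

/* Round size up to the next power of two (returns size itself when it is
 * already a power of two). Assumes 1 <= size <= 2^63 so the shift below
 * cannot overflow. */
static uint64_t next_pow2(uint64_t size)
{
   uint64_t p = 1;
   while (p < size)
      p <<= 1;
   return p;
}

/* Bytes wasted when a slab entry of the rounded-up size holds a buffer
 * whose real size is real_size. */
static uint64_t slab_wasted_bytes(uint64_t real_size)
{
   return next_pow2(real_size) - real_size;
}
```

For instance, a 5000-byte request lands in an 8192-byte entry, so 3192 bytes would be counted as wasted, while a 4096-byte request wastes nothing.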
2021-02-02 | radeonsi: tune NGG shader culling vertex threshold for each chip | Marek Olšák | 1 | -2/+18
These are based on my testing and estimation. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8434>
2021-02-02 | radeonsi: simplify the NGG culling condition in si_draw_vbo | Marek Olšák | 3 | -22/+14
Changes:
- disallow NGG culling for GS, fast launch for tess using template args (GS can't do NGG culling, tess can't do fast launch)
- skip checking current_rast_prim with tessellation (bake the condition into ngg_cull_vert_threshold)
- use only 1 vertex count threshold for enabling NGG shader culling to simplify it. I think it doesn't have a big impact. The threshold computation depends on more parameters than just fast launch.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8434>
2021-02-02 | radeonsi: set current_rast_prim at bind time for tess and GS | Marek Olšák | 2 | -21/+45
It doesn't have to be done in draw_vbo. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8434>
2021-02-01 | radv: prefer CP DMA for GTT buffer copies/clears on dGPUs due to slow PCIe | Samuel Pitoiset | 1 | -2/+26
The CP DMA bandwidth is always better than PCIe, so I think wasting compute resources is not a good idea. This is only enabled on GFX10+ because untested on older gens and also because RadeonSI does that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8763>
2021-01-30 | radeonsi: precompute NGG cull flags in si_create_rs_state | Marek Olšák | 3 | -17/+28
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8794>
2021-01-30 | radeonsi: prefetch VB descriptors right after uploading | Marek Olšák | 3 | -36/+16
This skips the logic that sets and checks prefetch_L2_mask. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8794>
2021-01-30 | radeonsi: set VB user SGPRs in si_upload_vertex_buffer_descriptors | Marek Olšák | 2 | -41/+65
so that we don't have to enter the state emit loop and invoke the more complicated function si_emit_graphics_shader_pointers. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8794>
2021-01-30 | radeonsi: reorganize si_draw_vbo for lower register pressure (part 2) | Marek Olšák | 1 | -55/+62
Move statements that use the least number of local variables as close to the beginning as possible. Also move local variables closer to their use. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8794>
2021-01-30 | radeonsi: reorganize si_draw_vbo for lower register pressure (part 1) | Marek Olšák | 1 | -31/+39
Move statements that use the least number of local variables as close to the beginning as possible. Also move local variables closer to their use. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8794>
2021-01-30 | radeonsi: optimize si_emit_prefetch_L2 when it's split | Marek Olšák | 1 | -54/+60
When using the prefetch with VS_ONLY=true followed by VS_ONLY=false, we tested the VS_ONLY bits in the mask when executing VS_ONLY=false where the bits were always 0. It's also useless to clear the prefetch mask when VS_ONLY=true. This commit skips those tests by splitting the function properly using BEFORE_DRAW and AFTER_DRAW template parameters. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8794>