Age | Commit message (Collapse) | Author | Files | Lines |
|
Depth/stencil/samplemask inputs are first to match
create_fs_jump_to_epilog().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26413>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26413>
|
|
For GFX11 only.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26413>
|
|
On RDNA3, alpha to coverage needs to be exported through MRTZ when
depth, stencil or samplemask are also exported. This option will allow
us to export MRTZ from PS epilogs instead of the main FS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26413>
|
|
This currently has no effects because the store_output instructions
are removed earlier (in ac_nir_lower_ps). Though, this will be needed
for exporting MRTZ from PS epilogs for alpha to coverage on RDNA3.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26413>
|
|
Since we switched to the "on-demand" path for GPL, this is dead code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26398>
|
|
RADV currently has two paths for PS epilogs:
- the first one is mostly used by GPL to compile fragment shader epilogs
as part of the graphics pipeline. It's supposed to be optimal because
fragment shader epilogs are compiled in the pipeline and eventually
cached.
- the second one (the "on-demand" path) is required when some dynamic
states are used because otherwise it's just impossible to compile the
fragment shader. These epilogs are compiled during cmdbuf recording
when all needed info are known, they are also cached in memory. This
is the main path for Zink.
Having two different paths isn't ideal for maintenance but there is
another problem. On RDNA3, alpha to coverage needs to be exported as
part of MRTZ when either depth/stencil/samplemask are exported. The
problem being that with GPL, the PSO multisample state can be NULL when
the frag shader lib is created, which means that we can't know if atc
needs to be exported or not, even if it's static. The solution seems to
to always use on-demand fragment shader epilogs for GPL on RDNA3.
So far, I think that switching to on-demand PS epilogs unconditionally
for GPL shouldn't hurt performance and that will simplify a lot of
things.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26398>
|
|
Otherwise, we can transition to exact before p_jump_to_epilog, then
transition to WQM again and then back to exact:
p_jump_to_epilog //transitions to exact
p_logical_end //transitions to wqm
p_end_wqm //transitions to exact
We rely on ssa elimination to clean most of this up.
fossil-db (navi21):
Totals from 1 (0.00% of 79330) affected shaders:
Instrs: 111 -> 110 (-0.90%)
CodeSize: 572 -> 568 (-0.70%)
Copies: 16 -> 15 (-6.25%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25440>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26231>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26231>
|
|
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26028>
|
|
Currently, shader part caching logic is duplicated between VS prolog and
PS/TCS epilogs. This commit introduces a common abstraction to
deduplicate the code.
Additionally, there are a few design decisions that diverts from the
current implementation:
1. A simple mutex is used instead of reader-writer lock. Prolog/epilog
constructions are serialized, removing the need to free duplicate
objects in case of a race.
2. A CS-local cache is used to quickly lookup an entry without holding a
lock. This eliminates locking in over 99% of cases.
3. A set is used to reduce number of allocations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26028>
|
|
The main change is to use struct radv_vs_prolog_key directly instead of
the compressed representation to simplify an upcoming rework in prolog /
epilog caching. In doing so the state struct pointer was replaced with
an inline struct.
Care was also taken to pre-mask all the states with the active attribute
mask and other masks when it makes sense; this ensures that we don't
accidentally use information not hashed into the key during compilation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26023>
|
|
This was broken as the field was never assigned to. This will also be
dropped from the upcoming prolog/epilog lookup rework, as it adds to
code complexity while the benefit of saving one hash table memory access
seems questionable.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26023>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24989>
|
|
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24989>
|
|
PS with epilog does not need to fix_exports. And radeonsi use
p_end_with_regs so does not have jump instruction at last.
radeonsi may also have exec restore instruction, so may break
before reach to p_end_with_regs.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
|
|
radeonsi need to compact color export for ps epilog while radv does not.
radv will fill empty color slot, so won't affected by this change.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
|
|
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
|
|
It's now used by ps epilog only.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
|
|
Will add other mrtz args for radeonsi.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
|
|
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
|
|
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
|
|
ps needs to handle wqm:
1. main part may compute with args from prolog in wqm mode, so
prolog need to compute these args in wqm mode too.
2. prolog and main part need to end with exact exec, so next
shader part which inherit previous shader part's exec won't
do valid job for helper threads
1 need p_end_with_regs to operate in wqm mode and itself can't
be exact, otherwise some move instruction added by it won't be
in wqm mode so helper threads' compute result is not passed to
next shader part as args.
2 is done by p_end_wqm added by finish_program automatically
after p_end_with_regs.
Piglit tests can trigger the problem:
1. gl-2.1-polygon-stipple-fs
a. ps prolog call discard_if
b. ps main pass wqm exec to epilog
c. ps epilog export color for discarded pixel
2. fs-fwidth-color.shader_test
a. ps prolog need to pass args computed in wqm mode
b. set p_end_with_regs to exact will end wqm mode before
the move instructions, so helper threads's result is not
passed to next shader part
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24973>
|
|
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25444>
|
|
From https://cgit.freedesktop.org/drm-misc/
commit d59e75eef52d89201aaf5342a3ac23ddf3e9b112
Author: Danilo Krummrich <dakr@redhat.com>
Date: Mon Oct 2 15:46:48 2023 +0200
drm/nouveau: exec: report max pushs through getparam
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25444>
|
|
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25010>
|
|
With merged shaders, the long-jump should be emitted inside the
divergent if (ie. only for TCS invocations) and other non TCS
invocations should just end the program.
This fixes a bunch of failures with CTS by forcing TCS epilogs on
RDNA2.
Not sure how RadeonSI will handle that but maybe doing the merged
wave info thing in epilogs would help.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24832>
|
|
TES would be NULL because everything is merged to GS.
Found by inspection.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24890>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24890>
|
|
Instead of using bit 23 of nvk_cmd_push::range for this, pass it as a
separate bool. This lets us use the actual kernel flag with the new
UAPI.
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Tested-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24840>
|
|
From https://cgit.freedesktop.org/drm-misc/
commit 443f9e0b1ab5e3b95abf8606097d13e30e2f2413
Author: Danilo Krummrich <dakr@redhat.com>
Date: Wed Aug 23 20:15:34 2023 +0200
drm/nouveau: uapi: don't pass NO_PREFETCH flag implicitl
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Tested-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24840>
|
|
ACO no longer requires these arguments.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24838>
|
|
When TCS and TES aren't linked together and TCS exports unused outputs,
the per-patch data offset needs to be adjusted. This is similar to the
LS-HS vertex stride when VS and TCS aren't linked together.
This fixes a bunch of failures by forcing the driver to use TCS epilogs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24776>
|
|
This implements jumping from the main TCS to the epilog.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
|
|
For TCS epilogs, we will have to pass SGPRs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
|
|
On GFX9+, VS is merged with TCS which means this function is called
twice and the epilog was emitted in both shader parts.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
|
|
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
|
|
ACO skip it for epilogs now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
|
|
It's similar to when the patch control points value is dynamic.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24643>
|
|
The number of SGPRs need to be adjusted.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24747>
|
|
Now it only has tcs epilog build, will add more prolog/epilog to it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24443>
|
|
Prepare to be shared with prolog/epilog generation which
does not have si_shader param.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24443>
|
|
For shared with aco tcs epilog creation.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24443>
|
|
It's always true.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24443>
|
|
epilog does not use scratch so has no scratch arg.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442>
|
|
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442>
|
|
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24442>
|