author: Rhys Perry <pendingchaos02@gmail.com> 2021-06-02 16:30:35 +0100
committer: Eric Engestrom <eric@engestrom.ch> 2021-06-05 18:11:04 +0200
commit: f74f86547b3cad14899d3eb4afcc755306a46888
tree: 93841600362bedb4909b5e101aa718351bc829b2
parent: 0d224336f781b0ad8956c2def650f5caf0cb2f4f
aco: don't create 4 and 5 dword NSA instructions on GFX10

"stability issues", apparently: https://reviews.llvm.org/D103348

fossil-db (Navi10):
Totals from 4512 (3.01% of 149839) affected shaders:
VGPRs: 221516 -> 223308 (+0.81%); split: -0.07%, +0.88%
CodeSize: 23000080 -> 23070672 (+0.31%); split: -0.08%, +0.39%
MaxWaves: 107718 -> 107496 (-0.21%); split: +0.11%, -0.32%
Instrs: 4321890 -> 4362822 (+0.95%); split: -0.00%, +0.95%
Latency: 71495710 -> 71581476 (+0.12%); split: -0.07%, +0.19%
InvThroughput: 11858568 -> 11938960 (+0.68%); split: -0.00%, +0.68%
VClause: 76575 -> 76585 (+0.01%); split: -0.05%, +0.07%
SClause: 168771 -> 168709 (-0.04%); split: -0.06%, +0.02%
Copies: 182305 -> 221948 (+21.75%); split: -0.00%, +21.75%
PreVGPRs: 194657 -> 195635 (+0.50%); split: -0.00%, +0.50%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c353895c922 ("aco: use non-sequential addressing")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898>
(cherry picked from commit 903f814b78f3bdb6f330889277ada147070bfd7b)
-rw-r--r-- .pick_status.json 2
-rw-r--r-- src/amd/compiler/README-ISA.md 5
-rw-r--r-- src/amd/compiler/aco_instruction_selection.cpp 6
3 files changed, 11 insertions, 2 deletions
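
Reviewer note: the subject line counts dwords while the patch's `max_nsa_size` counts addresses (5 before GFX10.3, 13 from GFX10.3). The sketch below shows how the two could line up. It assumes a 2-dword base MIMG encoding that carries the first address, plus one extra dword for every group of up to four further addresses; that layout is an assumption and is not stated in this commit. `ChipClass`, `mimg_nsa_dwords` and `use_nsa` are hypothetical stand-ins for illustration, not ACO's actual definitions.

```cpp
#include <cassert>

// Hypothetical stand-in for ACO's chip_class enum; only the ordering matters.
enum ChipClass { GFX9, GFX10, GFX10_3 };

// Encoded size in dwords of a MIMG instruction with num_addrs address VGPRs.
// Assumption: 2 base dwords hold the first address, and each additional dword
// holds up to four more addresses.
unsigned mimg_nsa_dwords(unsigned num_addrs)
{
   assert(num_addrs >= 1);
   return 2 + (num_addrs - 1 + 3) / 4; // ceil((num_addrs - 1) / 4) extra dwords
}

// Mirrors the condition added to emit_mimg() in this patch.
bool use_nsa(ChipClass chip, unsigned num_coords)
{
   unsigned max_nsa_size = chip >= GFX10_3 ? 13 : 5;
   return chip >= GFX10 && num_coords <= max_nsa_size;
}
```

Under that assumption, 5 addresses fit in 3 dwords and 13 addresses fit in 5 dwords, which would match both the subject line and the two `max_nsa_size` values.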
diff --git a/.pick_status.json b/.pick_status.json
index 1e5327ffa67..c58e91393ce 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -1084,7 +1084,7 @@
         "description": "aco: don't create 4 and 5 dword NSA instructions on GFX10",
         "nominated": true,
         "nomination_type": 1,
-        "resolution": 0,
+        "resolution": 1,
         "main_sha": null,
         "because_sha": "c353895c92270c0e2a6e2b849c24d558efae0d5e"
     },
diff --git a/src/amd/compiler/README-ISA.md b/src/amd/compiler/README-ISA.md
index a790522ba4f..cb9d8da0298 100644
--- a/src/amd/compiler/README-ISA.md
+++ b/src/amd/compiler/README-ISA.md
@@ -250,3 +250,8 @@ Only `s_waitcnt_vscnt null, 0`. Needed even if the first instruction is a load.
 ### NSAClauseBug
 
 "MIMG-NSA in a hard clause has unpredictable results on GFX10.1"
+
+### NSAMaxSize5
+
+NSA MIMG instructions should be limited to 3 dwords before GFX10.3 to avoid
+stability issues: https://reviews.llvm.org/D103348
diff --git a/src/amd/compiler/aco_instruction_selection.cpp b/src/amd/compiler/aco_instruction_selection.cpp
index 75ab73528a5..594d195374f 100644
--- a/src/amd/compiler/aco_instruction_selection.cpp
+++ b/src/amd/compiler/aco_instruction_selection.cpp
@@ -5555,7 +5555,11 @@ static MIMG_instruction *emit_mimg(Builder& bld, aco_opcode op,
                                    unsigned wqm_mask=0,
                                    Operand vdata=Operand(v1))
 {
-   if (bld.program->chip_class < GFX10) {
+   /* Limit NSA instructions to 3 dwords on GFX10 to avoid stability issues. */
+   unsigned max_nsa_size = bld.program->chip_class >= GFX10_3 ? 13 : 5;
+   bool use_nsa = bld.program->chip_class >= GFX10 && coords.size() <= max_nsa_size;
+
+   if (!use_nsa) {
       Temp coord = coords[0];
       if (coords.size() > 1) {
          coord = bld.tmp(RegType::vgpr, coords.size());