summaryrefslogtreecommitdiff
path: root/src
diff options
context:
space:
mode:
authorKenneth Graunke <kenneth@whitecape.org>2017-04-21 01:28:13 -0700
committerAndres Gomez <agomez@igalia.com>2017-04-26 12:34:24 +0300
commitde9483a6cb97a34d47a04660061afd18a9fc6f41 (patch)
treec20b66c8faf7d7b0d01942000c389320629fd33a /src
parent4c2356f13cb199dad474d38fecd424d200e4cffd (diff)
i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().
opt_register_coalesce() was optimizing sequences such as: mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D mov(8) m4.zw:F, vgrf5.xxxy:F into: mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D This doesn't work - if we're going to reswizzle MACH, we'd need to reswizzle the MUL as well. Here, the MUL fills the accumulator's .zw components with attr18.yy * attr19.yy. But the MACH instruction expects .z to contain attr18.x * attr19.x. Bogus results ensue. No change in shader-db on Haswell. Prevents regressions in Timothy's patches to use enhanced layouts for varying packing (which rearrange code just enough to trigger this pre-existing bug, but were fine themselves). Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit 2faf227ec2e22c7a37e0a54783a3f0a0062ac852) Squashed with commit: i965/vec4: Use reads_accumulator_implicitly(), not MACH checks. Curro pointed out that I should not just check for MACH, but use the reads_accumulator_implicitly() helper, which would also prevent the same bug with MAC and SADA2 (if we ever decide to use them). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net> (cherry picked from commit 6b10c37b9c3a73add73f444fe1aee73c9ec82c94)
Diffstat (limited to 'src')
-rw-r--r--src/mesa/drivers/dri/i965/brw_vec4.cpp7
1 files changed, 7 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 5e60eb657a7..a7694089e93 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1066,6 +1066,13 @@ vec4_instruction::can_reswizzle(const struct gen_device_info *devinfo,
if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW)
return false;
+ /* We can't swizzle implicit accumulator access. We'd have to
+ * reswizzle the producer of the accumulator value in addition
+ * to the consumer (i.e. both MUL and MACH). Just skip this.
+ */
+ if (reads_accumulator_implicitly())
+ return false;
+
if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW)
return false;