summaryrefslogtreecommitdiff
path: root/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp
AgeCommit message (Collapse)AuthorFilesLines
2015-05-15i965: Fix FS unit testsIan Romanick1-1/+2
Commit 3687d75 changed the fs_visitor constructors, but it didn't update all the users. As a result, 'make check' fails. I added the explicit cast to the gl_program* parameter to make it more clear which NULL was which. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@Whitecape.org>
2015-04-22i965: Do better fake context setup in unit testsJason Ekstrand1-1/+5
In future tests, we will start relying on devinfo and not just brw in the compiler. Changing this now keeps these tests from failing in the future. Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-20i965/fs: Use correct null destination register in cmod testsIan Romanick1-3/+3
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89670 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: Vinson Lee <vlee@freedesktop.org>
2015-03-17i965/fs: Handle CMP.nz ... 0 and AND.nz ... 1 similarly in cmod propagationIan Romanick1-0/+105
Espically on platforms that do not natively generate 0u and ~0u for Boolean results, we generate a lot of sequences where a CMP is followed by an AND with 1. emit_bool_to_cond_code does this, for example. On ILK, this results in a sequence like: add(8) g3<1>F g8<8,8,1>F -g4<0,1,0>F cmp.l.f0(8) g3<1>D g3<8,8,1>F 0F and.nz.f0(8) null g3<8,8,1>D 1D (+f0) iff(8) Jump: 6 The AND.nz is obviously redundant. By propagating the cmod, we can instead generate add.l.f0(8) null g8<8,8,1>F -g4<0,1,0>F (+f0) iff(8) Jump: 6 Existing code already handles the propagation from the CMP to the ADD. Shader-db results: GM45 (0x2A42): total instructions in shared programs: 3550829 -> 3550788 (-0.00%) instructions in affected programs: 10028 -> 9987 (-0.41%) helped: 24 Iron Lake (0x0046): total instructions in shared programs: 4993146 -> 4993105 (-0.00%) instructions in affected programs: 9675 -> 9634 (-0.42%) helped: 24 Ivy Bridge (0x0166): total instructions in shared programs: 6291870 -> 6291794 (-0.00%) instructions in affected programs: 17914 -> 17838 (-0.42%) helped: 48 Haswell (0x0426): total instructions in shared programs: 5779256 -> 5779180 (-0.00%) instructions in affected programs: 16694 -> 16618 (-0.46%) helped: 48 Broadwell (0x162E): total instructions in shared programs: 6823088 -> 6823014 (-0.00%) instructions in affected programs: 15824 -> 15750 (-0.47%) helped: 46 No chage on Sandy Bridge or on any platform when NIR is used. v2: Add unit tests suggested by Matt. Remove spurious writes_flag() check on scan_inst when scan_inst is known to be BRW_OPCODE_CMP (also suggested by Matt). v3: Fix some comments and remove some explicit int() casts in fs_reg constructors in the unit tests. Both suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-04i965/fs: Don't propagate cmod to inst with different type.Matt Turner1-0/+34
Cc: 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89317 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23i965/fs: Add support for removing MOV.NZ instructions.Matt Turner1-0/+32
For some reason, we occasionally write the flag register with a MOV.NZ instruction: add(8) g25<1>F -g6<0,1,0>F g15<8,8,1>F cmp.l.f0(8) g26<1>D g25<8,8,1>F 0F mov.nz.f0(8) null g26<8,8,1>D A MOV.NZ instruction on the result of a CMP is like comparing for equality with true in C. It's useless. Removing it allows us to generate: add.l.f0(8) null -g6<0,1,0>F g15<8,8,1>F total instructions in shared programs: 5955701 -> 5951657 (-0.07%) instructions in affected programs: 302910 -> 298866 (-1.34%) GAINED: 1 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23i965/fs: Allow flipping cond mod for negated arguments.Matt Turner1-0/+33
This allows us to apply the optimization in cases where the CMP's argument is negated, by flipping the conditional mod. For example, it allows us to optimize this: add(8) temp a b cmp.l.f0(8) null -temp 0.0 into add.g.f0(8) temp a b total instructions in shared programs: 5958360 -> 5955701 (-0.04%) instructions in affected programs: 466880 -> 464221 (-0.57%) GAINED: 0 LOST: 1 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23i965/fs: Propagate cmod across flag read if it contains the same value.Matt Turner1-0/+41
total instructions in shared programs: 5959463 -> 5958900 (-0.01%) instructions in affected programs: 70031 -> 69468 (-0.80%) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-23i965/fs: Add unit tests for cmod propagation pass.Matt Turner1-0/+311
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>