summaryrefslogtreecommitdiff
path: root/lib/Target/X86/X86InstrInfo.cpp
AgeCommit message (Collapse)AuthorFilesLines
2015-12-20[X86] Use range-based for loop. NFCCraig Topper1-3/+2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256127 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17[X86] Use push-pop for materializing small constants under 'minsize'Hans Wennborg1-0/+48
Use the 3-byte (4 with REX prefix) push-pop sequence for materializing small constants. This is smaller than using a mov (5, 6 or 7 bytes depending on size and REX prefix), but it's likely to be slower, so only used for 'minsize'. This is a follow-up to r255656. Differential Revision: http://reviews.llvm.org/D15549 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255936 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-15Fix "Not having LAHF/SAHF" assert.Hans Wennborg1-1/+2
It wants to assert that the subtarget is 64-bit, not the register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255703 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-15[X86] Smaller code for materializing 32-bit 1 and -1 constantsHans Wennborg1-5/+43
"movl $-1, %eax" is 5 bytes, "xorl %eax, %eax; decl %eax" is 3 bytes. This commit makes LLVM use the latter when optimizing for size. Differential Revision: http://reviews.llvm.org/D14971 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255656 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()Matthias Braun1-15/+10
computeRegisterLiveness() was broken in that it reported dead for a register even if a subregister was alive. I assume this was because the results of analayzePhysRegs() are hard to understand with respect to subregisters. This commit: Changes the results of analyzePhysRegs (=struct PhysRegInfo) to be clearly understandable, also renames the fields to avoid silent breakage of third-party code (and improve the grammar). Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling it and removing workarounds for the bug. This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033 Differential Revision: http://reviews.llvm.org/D15320 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255362 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-05[X86][ADX] Added memory folding patterns and stack folding testsSimon Pilgrim1-0/+6
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254844 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-04X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supportedHans Wennborg1-4/+25
These instructions are not supported by all CPUs in 64-bit mode. Emitting them causes Chromium to crash on start-up for users with such chips. (GCC puts these instructions behind -msahf on 64-bit for the same reason.) This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering from before r244503 when the instructions are not available. Differential Revision: http://reviews.llvm.org/D15240 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254793 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-04X86InstrInfo::copyPhysReg: workaround reg livenessJF Bastien1-3/+13
Summary: computeRegisterLiveness and analyzePhysReg are currently getting confused about liveness in some cases, breaking copyPhysReg's calculation of whether AX is dead in some cases. Work around this issue temporarily by assuming that AX is always live. See detail in: https://llvm.org/bugs/show_bug.cgi?id=25033#c7 And associated bugs PR24535 PR25033 PR24991 PR24992 PR25201. This workaround makes the code correct but slightly inefficient, but it seems to confuse the machine instr verifier which now things EAX was undefined in some cases where it's being conservatively saved / restored. Reviewers: majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15198 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254680 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-01[X86] Use range-based for loops. NFCCraig Topper1-6/+6
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254387 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-01[X86] Use array_lengthof instead of calculating manually. Also change index ↵Craig Topper1-7/+7
types to size_t to match. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-30Revert r254279 "[X86] Use ArrayRef. NFC". It seems to have upset an MSVC ↵Craig Topper1-4/+7
build bot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254280 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-30[X86] Use ArrayRef. NFCCraig Topper1-7/+4
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254279 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-26X86-FMA3: Improved/enabled the memory folding optimization for scalar loadsVyacheslav Klochkov1-0/+12
generated for _mm_losd_s{s,d}() intrinsics and used in scalar FMAs generated for FMA intrinsics _mm_f{madd,msub,nmadd,nmsub}_s{s,d}(). Reviewer: David Kreitzer Differential Revision: http://reviews.llvm.org/D14762 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254140 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-24[x86] remove duplicate movq instruction defs (PR25554)Sanjay Patel1-2/+0
We had duplicated definitions for the same hardware '[v]movq' instructions. For example with SSE: def MOVZQI2PQIrr : RS2I<0x6E, MRMSrcReg, (outs VR128:$dst), (ins GR64:$src), "mov{d|q}\t{$src, $dst|$dst, $src}", // X86-64 only [(set VR128:$dst, (v2i64 (X86vzmovl (v2i64 (scalar_to_vector GR64:$src)))))], IIC_SSE_MOVDQ>; def MOV64toPQIrr : RS2I<0x6E, MRMSrcReg, (outs VR128:$dst), (ins GR64:$src), "mov{d|q}\t{$src, $dst|$dst, $src}", [(set VR128:$dst, (v2i64 (scalar_to_vector GR64:$src)))], IIC_SSE_MOVDQ>, Sched<[WriteMove]>; As shown in the test case and PR25554: https://llvm.org/bugs/show_bug.cgi?id=25554 This causes us to miss reusing an operand because later passes don't know these 'movq' are the same instruction. This patch deletes one pair of these defs. Sadly, this won't fix the original test case in the bug report. Something else is still broken. Differential Revision: http://reviews.llvm.org/D14941 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253988 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19[X86] Use existing MachineInstrBuilder::addDisp to create offseted pointer. NFC.Simon Pilgrim1-8/+1
Minor code duplication tidyup to D13988 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253606 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19AVX-512: Fixed COPY_TO_REGCLASS for mask registersElena Demikhovsky1-13/+49
Copying one mask register to another under BW should be done with kmovq instruction, otherwise we can loose some bits. Copying 8 bits under DQ may be done with kmovb. Differential Revision: http://reviews.llvm.org/D14812 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253563 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-13X86-FMA3: Implemented commute transformations FMA*_Int instructions.Vyacheslav Klochkov1-118/+206
It made it possible to apply the memory folding optimization for the 2nd operand of FMA*_Int instructions. Reviewer: Quentin Colombet Differential Revision: http://reviews.llvm.org/D14550 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252973 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-12My first/test commit. Removed a trailing whitespace.Vyacheslav Klochkov1-1/+1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252940 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-06Improved the operands commute transformation for X86-FMA3 instructions.Andrew Kaylor1-27/+326
All 3 operands of FMA3 instructions are commutable now. Patch by Slava Klochkov Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab). Differential Revision: http://reviews.llvm.org/D13269 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252335 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04Warning fix.Simon Pilgrim1-2/+2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252078 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04[X86][SSE] Add general memory folding for (V)INSERTPS instructionSimon Pilgrim1-7/+71
This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction. The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening. This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done. It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted. Unlike the previous implementation we now set the insertion source index to zero, although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll Differential Revision: http://reviews.llvm.org/D13988 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252074 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04Created new X86 FMA3 opcodes (FMA*_Int) that are used now for lowering of ↵Andrew Kaylor1-0/+24
scalar FMA intrinsics. Patch by Slava Klochkov The key difference between FMA* and FMA*_Int opcodes is that FMA*_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result. Reviewers: Quentin Colombet, Elena Demikhovsky Differential Revision: http://reviews.llvm.org/D13710 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252060 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-26AVX512: Add AVX-512 not materializable instructions. Igor Breger1-1/+29
Otherwise value can be reused , despite its value could be changed - produces incorrect assembler. https://llvm.org/bugs/show_bug.cgi?id=25270 Differential Revision: http://reviews.llvm.org/D14057 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251275 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01[WinEH] Make FuncletLayout more robust against catchretDavid Majnemer1-3/+5
Catchret transfers control from a catch funclet to an earlier funclet. However, it is not completely clear which funclet the catchret target is part of. Make this clear by stapling the catchret target's funclet membership onto the CATCHRET SDAG node. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249052 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-30[x86] enable machine combiner reassociations for 256-bit vector logical ↵Sanjay Patel1-0/+3
integer insts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248955 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28Improved the interface of methods commuting operands, improved X86-FMA3 ↵Andrew Kaylor1-59/+57
mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-21[Machine Combiner] Refactor machine reassociation code to be target-independent.Chad Rosier1-206/+9
No functional change intended. Patch by Haicheng Wu <haicheng@codeaurora.org>! http://reviews.llvm.org/D12887 PR24522 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248164 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-17AVX-512: shufflevector for i1 vectors <2 x i1> .. <64 x i1>Elena Demikhovsky1-0/+4
AVX-512 does not provide an instruction that shuffles mask register. So I do the following way: mask-2-simd , shuffle simd , simd-2-mask Differential Revision: http://reviews.llvm.org/D12727 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247876 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12[x86] enable machine combiner reassociations for 128-bit vector logical ↵Sanjay Patel1-0/+6
integer insts (2nd try) The changes in: test/CodeGen/X86/machine-cp.ll are just due to scheduling differences after some logic instructions were reassociated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247516 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12revert r247506; need to verify changes in existing testsSanjay Patel1-6/+0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247507 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12[x86] enable machine combiner reassociations for 128-bit vector logical ↵Sanjay Patel1-0/+6
integer insts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247506 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-03[x86] enable machine combiner reassociations for scalar 'xor' instsSanjay Patel1-0/+4
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246781 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-01rename "slow-unaligned-mem-under-32" to slow-unaligned-mem-16" (NFCI)Sanjay Patel1-3/+3
This is a follow-on suggested by: http://reviews.llvm.org/D12154 ( http://reviews.llvm.org/rL245729 ) http://reviews.llvm.org/D10662 ( http://reviews.llvm.org/rL245075 ) This makes the attribute name match most of the existing lowering logic and regression test expectations. But the current use of this attribute is inconsistent; see the FIXME comment for "allowsMisalignedMemoryAccesses()". That change will result in functional changes and should be coming soon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246585 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-31[x86] enable machine combiner reassociations for scalar 'or' instsSanjay Patel1-0/+4
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246481 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-30[MIR Serialization] static -> static const in ↵Hal Finkel1-1/+1
getSerializable*MachineOperandTargetFlags Make the arrays 'static const' instead of just 'static'. Post-commit review comment from Roman Divacky on IRC. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246376 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-28[x86] enable machine combiner reassociations for scalar 'and' instsSanjay Patel1-1/+5
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-26Expose hasLiveCondCodeDef as a member function of the X86InstrInfo class. NFCAndrew Kaylor1-1/+1
This takes the existing static function hasLiveCondCodeDef and makes it a member function of the X86InstrInfo class. This is a useful utility function that an upcoming change would like to use. NFC. Patch by: Kevin B. Smith Differential Revision: http://reviews.llvm.org/D12371 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246073 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-21[x86] enable machine combiner reassociations for 256-bit vector min/maxSanjay Patel1-0/+4
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-21[x86] invert logic for attribute 'FeatureFastUAMem'Sanjay Patel1-3/+8
This is a 'no functional change intended' patch. It removes one FIXME, but adds several more. Motivation: the FeatureFastUAMem attribute may be too general. It is used to determine if any sized misaligned memory access under 32-bytes is 'fast'. From the added FIXME comments, however, you can see that we're not consistent about this. Changing the name of the attribute makes it clearer to see the logic holes. Changing this to a 'slow' attribute also means we don't have to add an explicit 'fast' attribute to new chips; fast unaligned accesses have been standard for several generations of CPUs now. Differential Revision: http://reviews.llvm.org/D12154 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245729 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-21[x86] enable machine combiner reassociations for 128-bit vector min/maxSanjay Patel1-0/+8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245715 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-19[x86] enable machine combiner reassociations for scalar double-precision min/maxSanjay Patel1-0/+4
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245506 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-19[x86] enable machine combiner reassociations for scalar single-precision ↵Sanjay Patel1-0/+2
maximums git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245504 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-15[x86] enable machine combiner reassociations for scalar single-precision ↵Sanjay Patel1-0/+6
minimums git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245166 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-12fix typo; NFCSanjay Patel1-1/+1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244753 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-12[x86] enable machine combiner reassociations for 256-bit vector FP mul/addSanjay Patel1-0/+4
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244705 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-11PseudoSourceValue: Replace global manager with a manager in a machine function.Alex Lorenz1-2/+2
This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244693 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-11[x86] enable machine combiner reassociations for 128-bit vector ↵Sanjay Patel1-2/+6
single/double multiplies git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-10x86: Emit LAHF/SAHF instead of PUSHF/POPFJF Bastien1-26/+51
NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (priviledged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS, this commit now generated LAHF/SAHF instead, for all of x86 (not just NaCl) because it leads to an overall performance gain over PUSHF/POPF. As with the previous patch this code generation is pretty bad because it occurs very later, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire. I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are: | Time per call (ms) | Runtime (ms) | Benchmark | | 0.000012514 | 6257 | sete.i386 | | 0.000012810 | 6405 | sete.i386-fast | | 0.000010456 | 5228 | sete.x86-64 | | 0.000010496 | 5248 | sete.x86-64-fast | | 0.000012906 | 6453 | lahf-sahf.i386 | | 0.000013236 | 6618 | lahf-sahf.i386-fast | | 0.000010580 | 5290 | lahf-sahf.x86-64 | | 0.000010304 | 5152 | lahf-sahf.x86-64-fast | | 0.000028056 | 14028 | pushf-popf.i386 | | 0.000027160 | 13580 | pushf-popf.i386-fast | | 0.000023810 | 11905 | pushf-popf.x86-64 | | 0.000026468 | 13234 | pushf-popf.x86-64-fast | Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seems to be worth teaching LLVM about individual flags, at least not for this purpose. Reviewers: rnk, jvoung, t.p.northover Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D6629 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244503 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-10fix minsize detection: minsize attribute implies optimizing for sizeSanjay Patel1-5/+2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244499 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-10fix minsize detection: minsize attribute implies optimizing for sizeSanjay Patel1-3/+1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244464 91177308-0d34-0410-b5e6-96231b3b80d8