Age | Commit message | Author | Files | Lines |
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256127 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Use the 3-byte (4 with REX prefix) push-pop sequence for materializing
small constants. This is smaller than using a mov (5, 6 or 7 bytes
depending on size and REX prefix), but it's likely to be slower, so
it is only used for 'minsize'.
This is a follow-up to r255656.
Differential Revision: http://reviews.llvm.org/D15549
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255936 91177308-0d34-0410-b5e6-96231b3b80d8
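The size trade-off described in this commit can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper names (`fitsInImm8`, `pushPopSize`, `movImmSize`, `usePushPop`) are hypothetical, and the byte counts simply restate the x86 encodings the message refers to.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM APIs); byte counts follow standard x86 encodings.

// True if V fits the sign-extended 8-bit immediate of `push imm8` (6A ib).
bool fitsInImm8(std::int64_t V) { return V >= -128 && V <= 127; }

// push imm8 (2 bytes) + pop reg (1 byte, plus 1 REX byte for r8-r15).
unsigned pushPopSize(bool NeedsRex) { return 2 + 1 + (NeedsRex ? 1 : 0); }

// mov of a 32-bit immediate into a register:
//   movl $imm32, %eax  -> B8+r id         = 5 bytes
//   movl $imm32, %r8d  -> REX B8+r id     = 6 bytes
//   movq $imm32, %rax  -> REX.W C7 /0 id  = 7 bytes
unsigned movImmSize(bool Is64BitDest, bool NeedsRex) {
  if (Is64BitDest)
    return 7;
  return NeedsRex ? 6 : 5;
}

// Assumed policy: pick push/pop only when optimizing for minimum size and the
// constant fits in 8 bits. The real patch may apply further conditions.
bool usePushPop(std::int64_t V, bool OptForMinSize) {
  return OptForMinSize && fitsInImm8(V);
}
```

Under these assumptions the push-pop form saves between 2 and 4 bytes per constant, which is why it is attractive for 'minsize' despite the likely speed cost.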
|
|
It wants to assert that the subtarget is 64-bit, not the register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255703 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
"movl $-1, %eax" is 5 bytes, "xorl %eax, %eax; decl %eax" is 3 bytes.
This commit makes LLVM use the latter when optimizing for size.
Differential Revision: http://reviews.llvm.org/D14971
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255656 91177308-0d34-0410-b5e6-96231b3b80d8
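For reference, the two sequences compared in this commit can be written out byte by byte. The sketch below assumes 32-bit mode encodings; the array and function names are illustrative only.

```cpp
#include <array>
#include <cstdint>

// 32-bit mode encodings (names are illustrative, not from LLVM).
// movl $-1, %eax   -> B8 FF FF FF FF
constexpr std::array<std::uint8_t, 5> MovMinusOne = {{0xB8, 0xFF, 0xFF, 0xFF, 0xFF}};
// xorl %eax, %eax  -> 31 C0
// decl %eax        -> 48
constexpr std::array<std::uint8_t, 3> XorThenDec = {{0x31, 0xC0, 0x48}};

// Models what the short sequence computes, whatever EAX held before.
std::uint32_t xorThenDec(std::uint32_t Eax) {
  Eax ^= Eax; // xorl %eax, %eax : EAX becomes 0
  --Eax;      // decl %eax       : wraps around to 0xFFFFFFFF
  return Eax;
}
```

In 64-bit mode the byte 0x48 is a REX prefix rather than `decl %eax`, so the decrement needs the two-byte `FF C8` form there; the array above covers the 32-bit encoding only.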
|
|
computeRegisterLiveness() was broken in that it reported a register as
dead even if a subregister was alive. I assume this was because the
results of analyzePhysRegs() are hard to understand with respect to
subregisters.
This commit changes the results of analyzePhysRegs (=struct
PhysRegInfo) to be clearly understandable, and also renames the fields to
avoid silent breakage of third-party code (and improve the grammar).
It also fixes all (two) users of computeRegisterLiveness() in llvm by
re-enabling it and removing workarounds for the bug.
This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033
Differential Revision: http://reviews.llvm.org/D15320
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255362 91177308-0d34-0410-b5e6-96231b3b80d8
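The bug class described in this commit can be illustrated with a toy register model. Nothing below is LLVM code: the `Reg` enum, `overlapping`, and both query functions are hypothetical, and the model only covers EAX and its sub-registers.

```cpp
#include <set>
#include <vector>

// Toy register model, not LLVM's: EAX contains AX, which contains AL and AH.
enum Reg { AL, AH, AX, EAX };

// Registers that share bits with R (including R itself).
std::vector<Reg> overlapping(Reg R) {
  switch (R) {
  case AL:
    return {AL, AX, EAX};
  case AH:
    return {AH, AX, EAX};
  case AX:
    return {AL, AH, AX, EAX};
  case EAX:
    return {AL, AH, AX, EAX};
  }
  return {};
}

// Buggy query: only looks for R itself in the live set.
bool isDeadNaive(Reg R, const std::set<Reg> &Live) {
  return Live.count(R) == 0;
}

// Conservative query: R is dead only if nothing overlapping it is live.
bool isDeadWithAliases(Reg R, const std::set<Reg> &Live) {
  for (Reg O : overlapping(R))
    if (Live.count(O))
      return false;
  return true;
}
```

With only AL live, the naive query calls EAX dead while the alias-aware one does not, which is the kind of disagreement the commit message describes.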
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254844 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
These instructions are not supported by all CPUs in 64-bit mode. Emitting them
causes Chromium to crash on start-up for users with such chips.
(GCC puts these instructions behind -msahf on 64-bit for the same reason.)
This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets
and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering
from before r244503 when the instructions are not available.
Differential Revision: http://reviews.llvm.org/D15240
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254793 91177308-0d34-0410-b5e6-96231b3b80d8
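The decision this patch introduces can be sketched as a small predicate. The types and names here (`ToySubtarget`, `FlagsCopy`, `pickFlagsCopy`) are hypothetical stand-ins, and the fallback is assumed to be the PUSHF/POPF lowering that r244503 replaced.

```cpp
// Hypothetical names; this models only the decision, not copyPhysReg itself.
enum class FlagsCopy { LahfSahf, PushfPopf };

struct ToySubtarget {
  bool Is64Bit;
  bool HasLahfSahf; // stands in for FeatureLAHFSAHF
};

// LAHF/SAHF are always usable in 32-bit mode; in 64-bit mode it depends on
// whether the CPU advertises the feature.
bool canUseLahfSahf(const ToySubtarget &ST) {
  return !ST.Is64Bit || ST.HasLahfSahf;
}

// Fall back to the older (assumed PUSHF/POPF) lowering when unavailable.
FlagsCopy pickFlagsCopy(const ToySubtarget &ST) {
  return canUseLahfSahf(ST) ? FlagsCopy::LahfSahf : FlagsCopy::PushfPopf;
}
```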
|
|
Summary:
computeRegisterLiveness and analyzePhysReg are currently getting
confused about liveness in some cases, breaking copyPhysReg's
calculation of whether AX is dead. Work around this issue
temporarily by assuming that AX is always live.
See detail in: https://llvm.org/bugs/show_bug.cgi?id=25033#c7
And associated bugs PR24535 PR25033 PR24991 PR24992 PR25201.
This workaround makes the code correct but slightly inefficient, but it
seems to confuse the machine instr verifier, which now thinks EAX was
undefined in some cases where it's being conservatively saved /
restored.
Reviewers: majnemer, sanjoy
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15198
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254680 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254387 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
types to size_t to match.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254386 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
build bot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254280 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254279 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
generated for _mm_load_s{s,d}() intrinsics and used in scalar FMAs generated
for FMA intrinsics _mm_f{madd,msub,nmadd,nmsub}_s{s,d}().
Reviewer: David Kreitzer
Differential Revision: http://reviews.llvm.org/D14762
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254140 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
We had duplicated definitions for the same hardware '[v]movq' instructions. For example with SSE:
def MOVZQI2PQIrr : RS2I<0x6E, MRMSrcReg, (outs VR128:$dst), (ins GR64:$src),
"mov{d|q}\t{$src, $dst|$dst, $src}", // X86-64 only
[(set VR128:$dst, (v2i64 (X86vzmovl (v2i64 (scalar_to_vector GR64:$src)))))],
IIC_SSE_MOVDQ>;
def MOV64toPQIrr : RS2I<0x6E, MRMSrcReg, (outs VR128:$dst), (ins GR64:$src),
"mov{d|q}\t{$src, $dst|$dst, $src}",
[(set VR128:$dst, (v2i64 (scalar_to_vector GR64:$src)))],
IIC_SSE_MOVDQ>, Sched<[WriteMove]>;
As shown in the test case and PR25554:
https://llvm.org/bugs/show_bug.cgi?id=25554
This causes us to miss reusing an operand because later passes don't know these 'movq' are the same instruction.
This patch deletes one pair of these defs.
Sadly, this won't fix the original test case in the bug report. Something else is still broken.
Differential Revision: http://reviews.llvm.org/D14941
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253988 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Minor code duplication tidyup to D13988
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253606 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Copying one mask register to another under BW should be done with the kmovq instruction, otherwise we can lose some bits.
Copying 8 bits under DQ may be done with kmovb.
Differential Revision: http://reviews.llvm.org/D14812
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253563 91177308-0d34-0410-b5e6-96231b3b80d8
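Why a narrow copy loses bits can be shown with a toy model of a mask register as a 64-bit integer. The functions below are illustrative only; in particular, the priority between the BW and DQ cases in `maskCopyWidth` is an assumption, since the message does not spell it out.

```cpp
#include <cstdint>

// Toy model: copying only the low `Bits` bits of a mask register.
// Widths correspond to kmovb (8), kmovw (16), and kmovq (64).
std::uint64_t copyMask(std::uint64_t Src, unsigned Bits) {
  return Bits >= 64 ? Src : (Src & ((std::uint64_t{1} << Bits) - 1));
}

// Hypothetical selection rule based on the commit message: a full 64-bit copy
// under BW, an 8-bit copy for 8-bit masks under DQ, otherwise 16 bits.
unsigned maskCopyWidth(bool HasBWI, bool HasDQI, unsigned MaskBits) {
  if (HasBWI)
    return 64;
  if (HasDQI && MaskBits <= 8)
    return 8;
  return 16;
}
```

For a mask such as `0xFFFF0000FFFF0001`, a 16-bit copy keeps only `0x0001`, while the 64-bit copy preserves every bit.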
|
|
It made it possible to apply the memory folding optimization for the 2nd
operand of FMA*_Int instructions.
Reviewer: Quentin Colombet
Differential Revision: http://reviews.llvm.org/D14550
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252973 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252940 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
All 3 operands of FMA3 instructions are commutable now.
Patch by Slava Klochkov
Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab).
Differential Revision: http://reviews.llvm.org/D13269
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252335 91177308-0d34-0410-b5e6-96231b3b80d8
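Commuting FMA3 operands works because the three encodings (132, 213, 231) differ only in which operands are multiplied and which is added. A scalar sketch, using plain multiply-add rather than a fused operation and illustrative function names:

```cpp
// Scalar models of the three FMA3 operand orders; operand 1 is also the
// destination in the real instructions. Not fused, for illustration only.
double fma132(double Op1, double Op2, double Op3) { return Op1 * Op3 + Op2; }
double fma213(double Op1, double Op2, double Op3) { return Op2 * Op1 + Op3; }
double fma231(double Op1, double Op2, double Op3) { return Op2 * Op3 + Op1; }
```

Starting from `fma213(a, b, c)`, swapping operands 1 and 2 keeps the 213 form, swapping operands 1 and 3 requires the 231 form (`fma231(c, b, a)`), and swapping operands 2 and 3 requires the 132 form (`fma132(a, c, b)`). All three compute `a * b + c`.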
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252078 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction.
The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening.
This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done.
It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted.
Unlike the previous implementation, we now set the insertion source index to zero. Although this index is ignored by the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address. I've added a test for this at insertps-unfold-load-bug.ll
Differential Revision: http://reviews.llvm.org/D13988
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252074 91177308-0d34-0410-b5e6-96231b3b80d8
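The pointer-offset adjustment described in this commit can be sketched from the INSERTPS immediate layout: bits 7:6 select the source element, bits 5:4 the destination element, and bits 3:0 the zero mask. The struct and helper below are hypothetical and only model the arithmetic, not the foldMemoryOperandCustom implementation.

```cpp
#include <cstdint>

// Result of folding a vector load into the memory form (hypothetical type).
struct FoldedInsertPS {
  unsigned PtrOffset; // byte offset added to the original vector address
  std::uint8_t Imm;   // immediate with the source index cleared to zero
};

// The memory form loads a single 32-bit float, so the address must point at
// the selected element and the source index in the immediate becomes zero.
FoldedInsertPS foldInsertPSLoad(std::uint8_t Imm) {
  unsigned SrcIdx = (Imm >> 6) & 0x3;
  FoldedInsertPS F;
  F.PtrOffset = SrcIdx * 4;
  F.Imm = static_cast<std::uint8_t>(Imm & 0x3F);
  return F;
}
```

For example, an immediate of `0x9A` (source 2, destination 1, zero mask `0xA`) folds to a pointer offset of 8 bytes and an immediate of `0x1A`.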
|
|
scalar FMA intrinsics.
Patch by Slava Klochkov
The key difference between FMA* and FMA*_Int opcodes is that FMA*_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions because the upper bits of the scalar FMA intrinsic result must be taken from the 1st operand; commuting it would change those upper bits and invalidate the intrinsic's result.
Reviewers: Quentin Colombet, Elena Demikhovsky
Differential Revision: http://reviews.llvm.org/D13710
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252060 91177308-0d34-0410-b5e6-96231b3b80d8
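The restriction on commuting the 1st operand can be seen in a toy model of a scalar FMA intrinsic on four-float vectors. The type alias and function name are illustrative, and the 213 operand order is assumed for the example.

```cpp
#include <array>

using V4 = std::array<float, 4>;

// Toy model of a scalar FMA intrinsic in the 213 form: lane 0 receives
// Op2 * Op1 + Op3, while lanes 1..3 pass through unchanged from operand 1.
V4 fma213ss(const V4 &Op1, const V4 &Op2, const V4 &Op3) {
  V4 R = Op1;
  R[0] = Op2[0] * Op1[0] + Op3[0];
  return R;
}
```

Swapping operands 1 and 2 leaves lane 0 unchanged, because multiplication commutes, but the upper lanes now come from a different vector, so the two results differ.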
|
|
Otherwise the value can be reused even though it could have changed, which produces incorrect assembly.
https://llvm.org/bugs/show_bug.cgi?id=25270
Differential Revision: http://reviews.llvm.org/D14057
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251275 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Catchret transfers control from a catch funclet to an earlier funclet.
However, it is not completely clear which funclet the catchret target is
part of. Make this clear by stapling the catchret target's funclet
membership onto the CATCHRET SDAG node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249052 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
integer insts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248955 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
mem-folding&coalescing.
Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com)
Differential Revision: http://reviews.llvm.org/D11370
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248735 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
No functional change intended.
Patch by Haicheng Wu <haicheng@codeaurora.org>!
http://reviews.llvm.org/D12887
PR24522
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248164 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
AVX-512 does not provide an instruction that shuffles a mask register, so I lower it as follows:
mask-to-SIMD, shuffle SIMD, SIMD-to-mask
Differential Revision: http://reviews.llvm.org/D12727
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247876 91177308-0d34-0410-b5e6-96231b3b80d8
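The three-step lowering described in this commit can be modeled in scalar code. This is a sketch under assumptions: an 8-lane mask, 32-bit lanes, and hypothetical function names; it does not reflect the actual instructions the patch selects.

```cpp
#include <array>
#include <cstdint>

constexpr unsigned NumLanes = 8;
using Vec = std::array<std::int32_t, NumLanes>;
using Indices = std::array<unsigned, NumLanes>;

// Step 1, mask -> SIMD: each set bit becomes an all-ones lane.
Vec maskToVec(std::uint8_t K) {
  Vec V{};
  for (unsigned I = 0; I < NumLanes; ++I)
    V[I] = ((K >> I) & 1) ? -1 : 0;
  return V;
}

// Step 2, shuffle SIMD: result lane I takes source lane Idx[I].
Vec shuffleVec(const Vec &V, const Indices &Idx) {
  Vec R{};
  for (unsigned I = 0; I < NumLanes; ++I)
    R[I] = V[Idx[I]];
  return R;
}

// Step 3, SIMD -> mask: collect the sign bit of each lane.
std::uint8_t vecToMask(const Vec &V) {
  std::uint8_t K = 0;
  for (unsigned I = 0; I < NumLanes; ++I)
    if (V[I] < 0)
      K = static_cast<std::uint8_t>(K | (1u << I));
  return K;
}

std::uint8_t shuffleMask(std::uint8_t K, const Indices &Idx) {
  return vecToMask(shuffleVec(maskToVec(K), Idx));
}
```

Reversing the lanes of the mask `0x05`, for instance, yields `0xA0`.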
|
|
integer insts (2nd try)
The changes in:
test/CodeGen/X86/machine-cp.ll
are just due to scheduling differences after some logic instructions were reassociated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247516 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247507 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
integer insts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247506 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246781 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This is a follow-on suggested by:
http://reviews.llvm.org/D12154 ( http://reviews.llvm.org/rL245729 )
http://reviews.llvm.org/D10662 ( http://reviews.llvm.org/rL245075 )
This makes the attribute name match most of the existing lowering logic
and regression test expectations.
But the current use of this attribute is inconsistent; see the FIXME
comment for "allowsMisalignedMemoryAccesses()". That change will
result in functional changes and should be coming soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246585 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246481 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
getSerializable*MachineOperandTargetFlags
Make the arrays 'static const' instead of just 'static'. Post-commit review
comment from Roman Divacky on IRC. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246376 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246300 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This takes the existing static function hasLiveCondCodeDef and makes it a member function of the X86InstrInfo class. This is a useful utility function that an upcoming change would like to use. NFC.
Patch by: Kevin B. Smith
Differential Revision: http://reviews.llvm.org/D12371
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246073 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245735 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This is a 'no functional change intended' patch. It removes one FIXME, but adds several more.
Motivation: the FeatureFastUAMem attribute may be too general. It is used to determine if any
sized misaligned memory access under 32-bytes is 'fast'. From the added FIXME comments, however,
you can see that we're not consistent about this. Changing the name of the attribute makes it
clearer to see the logic holes.
Changing this to a 'slow' attribute also means we don't have to add an explicit 'fast' attribute
to new chips; fast unaligned accesses have been standard for several generations of CPUs now.
Differential Revision: http://reviews.llvm.org/D12154
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245729 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245715 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245506 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
maximums
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245504 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
minimums
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245166 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244753 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244705 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This commit removes the global manager variable which is responsible for
storing and allocating pseudo source values and instead it introduces a new
manager class named 'PseudoSourceValueManager'. Machine functions now own an
instance of the pseudo source value manager class.
This commit also modifies the 'get...' methods in the 'MachinePointerInfo'
class to construct pseudo source values using the instance of the pseudo
source value manager object from the machine function.
This commit updates calls to the 'get...' methods from the 'MachinePointerInfo'
class in a lot of different files because those calls now need to pass in a
reference to a machine function to those methods.
This change will make it easier to serialize pseudo source values as it will
enable me to transform the mips specific MipsCallEntry PseudoSourceValue
subclass into two target independent subclasses.
Reviewers: Akira Hatanaka
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244693 91177308-0d34-0410-b5e6-96231b3b80d8
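The ownership change described in this commit can be sketched with toy classes. None of these are the LLVM types: `ToyPseudoSourceValue`, `ToyPSVManager`, and `ToyMachineFunction` are hypothetical stand-ins that only show a per-function manager replacing global state, and a 'get...' helper that now takes the function as a parameter.

```cpp
#include <map>
#include <memory>

// Toy stand-ins, not the LLVM classes.
struct ToyPseudoSourceValue {
  int FrameIndex;
};

class ToyPSVManager {
  std::map<int, std::unique_ptr<ToyPseudoSourceValue>> FixedStack;

public:
  // Returns the unique value for a frame index, creating it on first use.
  const ToyPseudoSourceValue *getFixedStack(int FI) {
    std::unique_ptr<ToyPseudoSourceValue> &Slot = FixedStack[FI];
    if (!Slot)
      Slot.reset(new ToyPseudoSourceValue{FI});
    return Slot.get();
  }
};

// Each function owns its manager, so there is no global state.
class ToyMachineFunction {
  ToyPSVManager PSVManager;

public:
  ToyPSVManager &getPSVManager() { return PSVManager; }
};

// Mirrors the new shape of the 'get...' helpers: callers pass the function.
const ToyPseudoSourceValue *getFixedStack(ToyMachineFunction &MF, int FI) {
  return MF.getPSVManager().getFixedStack(FI);
}
```

In this model, two lookups of the same frame index within one function return the same object, while different functions get distinct objects.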
|
|
single/double multiplies
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244657 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (privileged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS; this commit now generates LAHF/SAHF instead, for all of x86 (not just NaCl), because it leads to an overall performance gain over PUSHF/POPF.
As with the previous patch, this code generation is pretty bad because it occurs very late, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire.
I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are:
| Time per call (ms) | Runtime (ms) | Benchmark |
| 0.000012514 | 6257 | sete.i386 |
| 0.000012810 | 6405 | sete.i386-fast |
| 0.000010456 | 5228 | sete.x86-64 |
| 0.000010496 | 5248 | sete.x86-64-fast |
| 0.000012906 | 6453 | lahf-sahf.i386 |
| 0.000013236 | 6618 | lahf-sahf.i386-fast |
| 0.000010580 | 5290 | lahf-sahf.x86-64 |
| 0.000010304 | 5152 | lahf-sahf.x86-64-fast |
| 0.000028056 | 14028 | pushf-popf.i386 |
| 0.000027160 | 13580 | pushf-popf.i386-fast |
| 0.000023810 | 11905 | pushf-popf.x86-64 |
| 0.000026468 | 13234 | pushf-popf.x86-64-fast |
Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seem to be worth teaching LLVM about individual flags, at least not for this purpose.
Reviewers: rnk, jvoung, t.p.northover
Subscribers: llvm-commits
Differential revision: http://reviews.llvm.org/D6629
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244503 91177308-0d34-0410-b5e6-96231b3b80d8
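The benchmark table in this commit can be checked with a little arithmetic. The helpers below are illustrative; the inputs are copied from the table (runtime and time per call, both in ms).

```cpp
// Number of calls implied by a row: total runtime divided by time per call.
double callsRun(double RuntimeMs, double PerCallMs) {
  return RuntimeMs / PerCallMs;
}

// How many times slower one runtime is than another.
double slowdown(double SlowMs, double FastMs) { return SlowMs / FastMs; }
```

The sete.i386 row implies about 500 million calls (6257 / 0.000012514). Comparing the non-fast rows, PUSHF/POPF takes roughly 2.17 times as long as LAHF/SAHF on i386 (14028 / 6453) and roughly 2.25 times as long on x86-64 (11905 / 5290).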
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244499 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244464 91177308-0d34-0410-b5e6-96231b3b80d8
|