summaryrefslogtreecommitdiff
path: root/arch/x86/events/intel/core.c
AgeCommit message (Collapse)AuthorFilesLines
2016-07-03perf/x86/intel: Update event constraints when HT is offStephane Eranian1-0/+29
This patch updates the event constraints for non-PEBS mode for Intel Broadwell and Skylake processors. When HT is off, each CPU gets 8 generic counters. However, not all events can be programmed on any of the 8 counters. This patch adds the constraints for the MEM_* events which can only be measured on the bottom 4 counters. The constraints are also valid when HT is off because, then, there are only 4 generic counters and they are the bottom counters. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: kan.liang@intel.com Link: http://lkml.kernel.org/r/1467411742-13245-1-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-12Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar1-1/+1
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-12perf/x86: Fix undefined shift on 32-bit kernelsAndrey Ryabinin1-1/+1
Jim reported: UBSAN: Undefined behaviour in arch/x86/events/intel/core.c:3708:12 shift exponent 35 is too large for 32-bit type 'long unsigned int' The use of 'unsigned long' type obviously is not correct here, make it 'unsigned long long' instead. Reported-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Imre Palik <imrep@amazon.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 2c33645d366d ("perf/x86: Honor the architectural performance monitoring version") Link: http://lkml.kernel.org/r/1462974711-10037-1-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports itAlexander Shishkin1-0/+1
Not all cores prevent using Intel PT and LBRs simultaneously, although most of them still do as of today. This patch adds an opt-in flag for such cores to disable mutual exclusivity between PT and LBR; also flip it on for Goldmont. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461857746-31346-4-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar1-0/+2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05perf/x86: Add model numbers for Kabylake CPUsAndi Kleen1-0/+2
Everything the same as Skylake, just new model numbers. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461977748-17616-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23perf/x86/intel: Add LBR filter support for Silvermont and Airmont CPUsKan Liang1-1/+1
LBR filtering is also supported on the Silvermont and Airmont microarchitectures. The layout of MSR_LBR_SELECT is the same as Nehalem. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460706825-46163-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23perf/x86/intel: Add Goldmont CPU supportKan Liang1-0/+157
Add perf core PMU support for Intel Goldmont CPU cores: - The init code is based on Silvermont. - There is a new cache event list, based on the Silvermont cache event list. - Goldmont has 32 LBR entries. It also uses new LBRv6 format, which report the cycle information using upper 16-bit of the LBR_TO. - It's recommended to use CPU_CLK_UNHALTED.CORE_P + NPEBS for precise cycles. For details, please refer to the latest SDM058: http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdf Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460706167-45320-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-23perf/x86/intel: Add model number for Skylake Server to perfAndi Kleen1-0/+1
Everything the same as base Skylake, just a new model number. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460751933-2264-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/intel: Fix PEBS data source interpretation on Nehalem/WestmereAndi Kleen1-0/+2
Jiri reported some time ago that some entries in the PEBS data source table in perf do not agree with the SDM. We investigated and the bits changed for Sandy Bridge, but the SDM was not updated. perf already implements the bits correctly for Sandy Bridge and later. This patch patches it up for Nehalem and Westmere. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/1456871124-15985-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/pebs: Add proper PEBS constraints for BroadwellStephane Eranian1-1/+1
This patch adds a Broadwell specific PEBS event constraint table. Broadwell has a fix for the HT corruption bug erratum HSD29 on Haswell. Therefore, there is no need to mark events 0xd0, 0xd1, 0xd2, 0xd3 has requiring the exclusive mode across both sibling HT threads. This holds true for regular counting and sampling (see core.c) and PEBS (ds.c) which we fix in this patch. In doing so, we relax evnt scheduling for these events, they can now be programmed on any 4 counters without impacting what is measured on the sibling thread. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: adrian.hunter@intel.com Cc: jolsa@redhat.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1457034642-21837-4-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/pebs: Add workaround for broken OVFL status on HSW+Stephane Eranian1-0/+10
This patch fixes an issue with the GLOBAL_OVERFLOW_STATUS bits on Haswell, Broadwell and Skylake processors when using PEBS. The SDM stipulates that when the PEBS iterrupt threshold is crossed, an interrupt is posted and the kernel is interrupted. The kernel will find GLOBAL_OVF_SATUS bit 62 set indicating there are PEBS records to drain. But the bits corresponding to the actual counters should NOT be set. The kernel follows the SDM and assumes that all PEBS events are processed in the drain_pebs() callback. The kernel then checks for remaining overflows on any other (non-PEBS) events and processes these in the for_each_bit_set(&status) loop. As it turns out, under certain conditions on HSW and later processors, on PEBS buffer interrupt, bit 62 is set but the counter bits may be set as well. In that case, the kernel drains PEBS and generates SAMPLES with the EXACT tag, then it processes the counter bits, and generates normal (non-EXACT) SAMPLES. I ran into this problem by trying to understand why on HSW sampling on a PEBS event was sometimes returning SAMPLES without the EXACT tag. This should not happen on user level code because HSW has the eventing_ip which always point to the instruction that caused the event. The workaround in this patch simply ensures that the bits for the counters used for PEBS events are cleared after the PEBS buffer has been drained. With this fix 100% of the PEBS samples on my user code report the EXACT tag. Before: $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase $ perf report -D | fgrep SAMPLES PERF_RECORD_SAMPLE(IP, 0x2): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y \--- EXACT tag is missing After: $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase $ perf report -D | fgrep SAMPLES PERF_RECORD_SAMPLE(IP, 0x4002): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y \--- EXACT tag is set The problem tends to appear more often when multiple PEBS events are used. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: adrian.hunter@intel.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1457034642-21837-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-08perf/x86/intel: Fix PEBS warning by only restoring active PMU in pmiKan Liang1-2/+13
This patch tries to fix a PEBS warning found in my stress test. The following perf command can easily trigger the pebs warning or spurious NMI error on Skylake/Broadwell/Haswell platforms: sudo perf record -e 'cpu/umask=0x04,event=0xc4/pp,cycles,branches,ref-cycles,cache-misses,cache-references' --call-graph fp -b -c1000 -a Also the NMI watchdog must be enabled. For this case, the events number is larger than counter number. So perf has to do multiplexing. In perf_mux_hrtimer_handler, it does perf_pmu_disable(), schedule out old events, rotate_ctx, schedule in new events and finally perf_pmu_enable(). If the old events include precise event, the MSR_IA32_PEBS_ENABLE should be cleared when perf_pmu_disable(). The MSR_IA32_PEBS_ENABLE should keep 0 until the perf_pmu_enable() is called and the new event is precise event. However, there is a corner case which could restore PEBS_ENABLE to stale value during the above period. In perf_pmu_disable(), GLOBAL_CTRL will be set to 0 to stop overflow and followed PMI. But there may be pending PMI from an earlier overflow, which cannot be stopped. So even GLOBAL_CTRL is cleared, the kernel still be possible to get PMI. At the end of the PMI handler, __intel_pmu_enable_all() will be called, which will restore the stale values if old events haven't scheduled out. Once the stale pebs value is set, it's impossible to be corrected if the new events are non-precise. Because the pebs_enabled will be set to 0. x86_pmu.enable_all() will ignore the MSR_IA32_PEBS_ENABLE setting. As a result, the following NMI with stale PEBS_ENABLE trigger pebs warning. The pending PMI after enabled=0 will become harmless if the NMI handler does not change the state. This patch checks cpuc->enabled in pmi and only restore the state when PMU is active. Here is the dump: Call Trace: <NMI> [<ffffffff813c3a2e>] dump_stack+0x63/0x85 [<ffffffff810a46f2>] warn_slowpath_common+0x82/0xc0 [<ffffffff810a483a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8100fe2e>] intel_pmu_drain_pebs_nhm+0x2be/0x320 [<ffffffff8100caa9>] intel_pmu_handle_irq+0x279/0x460 [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 [<ffffffff811f290d>] ? vunmap_page_range+0x20d/0x330 [<ffffffff811f2f11>] ? unmap_kernel_range_noflush+0x11/0x20 [<ffffffff8148379f>] ? ghes_copy_tofrom_phys+0x10f/0x2a0 [<ffffffff814839c8>] ? ghes_read_estatus+0x98/0x170 [<ffffffff81005a7d>] perf_event_nmi_handler+0x2d/0x50 [<ffffffff810310b9>] nmi_handle+0x69/0x120 [<ffffffff810316f6>] default_do_nmi+0xe6/0x100 [<ffffffff810317f2>] do_nmi+0xe2/0x130 [<ffffffff817aea71>] end_repeat_nmi+0x1a/0x1e [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40 <<EOE>> <IRQ> [<ffffffff81006df8>] ? x86_perf_event_set_period+0xd8/0x180 [<ffffffff81006eec>] x86_pmu_start+0x4c/0x100 [<ffffffff8100722d>] x86_pmu_enable+0x28d/0x300 [<ffffffff811994d7>] perf_pmu_enable.part.81+0x7/0x10 [<ffffffff8119cb70>] perf_mux_hrtimer_handler+0x200/0x280 [<ffffffff8119c970>] ? __perf_install_in_context+0xc0/0xc0 [<ffffffff8110f92d>] __hrtimer_run_queues+0xfd/0x280 [<ffffffff811100d8>] hrtimer_interrupt+0xa8/0x190 [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0 [<ffffffff81051bd8>] local_apic_timer_interrupt+0x38/0x60 [<ffffffff817af01d>] smp_apic_timer_interrupt+0x3d/0x50 [<ffffffff817ad15c>] apic_timer_interrupt+0x8c/0xa0 <EOI> [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0 [<ffffffff81123de5>] ? smp_call_function_single+0xd5/0x130 [<ffffffff81123ddb>] ? smp_call_function_single+0xcb/0x130 [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0 [<ffffffff8119765a>] event_function_call+0x10a/0x120 [<ffffffff8119c660>] ? ctx_resched+0x90/0x90 [<ffffffff811971e0>] ? cpu_clock_event_read+0x30/0x30 [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60 [<ffffffff8119772b>] _perf_event_enable+0x5b/0x70 [<ffffffff81197388>] perf_event_for_each_child+0x38/0xa0 [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60 [<ffffffff811a0ffd>] perf_ioctl+0x12d/0x3c0 [<ffffffff8134d855>] ? selinux_file_ioctl+0x95/0x1e0 [<ffffffff8124a3a1>] do_vfs_ioctl+0xa1/0x5a0 [<ffffffff81036d29>] ? sched_clock+0x9/0x10 [<ffffffff8124a919>] SyS_ioctl+0x79/0x90 [<ffffffff817ac4b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4 ---[ end trace aef202839fe9a71d ]--- Uhhuh. NMI received for unknown reason 2d on CPU 2. Do you have a strange power saving mode enabled? Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1457046448-6184-1-git-send-email-kan.liang@intel.com [ Fixed various typos and other small details. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event.h to its new homeBorislav Petkov1-1/+1
Now that all functionality has been moved to arch/x86/events/, move the perf_event.h header and adjust include paths. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-18-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-17perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.cBorislav Petkov1-0/+3773
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1455098123-11740-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>