* [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
@ 2024-03-31  4:18 Andrii Nakryiko
  2024-03-31  4:18 ` [PATCH v4 1/4] perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined Andrii Nakryiko
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Andrii Nakryiko @ 2024-03-31  4:18 UTC
  To: x86, peterz, mingo, tglx
  Cc: bpf, linux-kernel, jolsa, song, kernel-team, Andrii Nakryiko

Add AMD-specific implementation of perf_snapshot_branch_stack static call that
allows LBR capture from arbitrary points in the kernel. This is utilized by
BPF programs. See patch #3 for all the details.

Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
inlined and has no branches, to minimize LBR snapshot contamination.

Patch #4 removes an artificial restriction on perf events with LBR enabled.

Andrii Nakryiko (4):
  perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
  perf/x86/amd: avoid taking branches before disabling LBR
  perf/x86/amd: support capturing LBR from software events
  perf/x86/amd: don't reject non-sampling events with configured LBR

 arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
 arch/x86/events/amd/lbr.c    | 11 +----------
 arch/x86/events/perf_event.h | 11 +++++++++++
 3 files changed, 48 insertions(+), 11 deletions(-)

-- 
2.43.0



* [PATCH v4 1/4] perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
  2024-03-31  4:18 [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events Andrii Nakryiko
@ 2024-03-31  4:18 ` Andrii Nakryiko
  2024-03-31  4:18 ` [PATCH v4 2/4] perf/x86/amd: avoid taking branches before disabling LBR Andrii Nakryiko
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Andrii Nakryiko @ 2024-03-31  4:18 UTC
  To: x86, peterz, mingo, tglx
  Cc: bpf, linux-kernel, jolsa, song, kernel-team, Andrii Nakryiko,
	Sandipan Das

In the following patches we will enable LBR capture on AMD CPUs at an
arbitrary point in time, which means that LBR recording won't be frozen
automatically by hardware as part of a hardware overflow event. So we
need to minimize the number of branches and function calls/returns on
the path to freezing LBR, so that the LBR snapshot is altered as little
as possible.

amd_pmu_core_disable_all() is one of the functions on this path, and is
already marked as __always_inline. But it calls amd_pmu_set_global_ctl(),
which is marked as just inline. So, to guarantee that no function call
is generated anywhere on this path, mark amd_pmu_set_global_ctl() as
__always_inline as well.

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 arch/x86/events/amd/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index aec16e581f5b..c5bcbc87d057 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -618,7 +618,7 @@ static void amd_pmu_cpu_dead(int cpu)
 	}
 }
 
-static inline void amd_pmu_set_global_ctl(u64 ctl)
+static __always_inline void amd_pmu_set_global_ctl(u64 ctl)
 {
 	wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, ctl);
 }
-- 
2.43.0



* [PATCH v4 2/4] perf/x86/amd: avoid taking branches before disabling LBR
  2024-03-31  4:18 [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events Andrii Nakryiko
  2024-03-31  4:18 ` [PATCH v4 1/4] perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined Andrii Nakryiko
@ 2024-03-31  4:18 ` Andrii Nakryiko
  2024-03-31  4:18 ` [PATCH v4 3/4] perf/x86/amd: support capturing LBR from software events Andrii Nakryiko
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Andrii Nakryiko @ 2024-03-31  4:18 UTC
  To: x86, peterz, mingo, tglx
  Cc: bpf, linux-kernel, jolsa, song, kernel-team, Andrii Nakryiko,
	Sandipan Das

In the following patches we will enable LBR capture on AMD CPUs at an
arbitrary point in time, which means that LBR recording won't be frozen
automatically by hardware as part of a hardware overflow event. So we
need to minimize the number of branches and function calls/returns on
the path to freezing LBR, so that the LBR snapshot is altered as little
as possible.

As such, split out LBR disabling logic from the sanity checking logic
inside amd_pmu_lbr_disable_all(). This will ensure that no branches are
taken before LBR is frozen in the functionality added in the next patch.
Use __always_inline to also eliminate any possible function calls.
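
For illustration only, the split described above can be modeled in
user space roughly as follows. This is a sketch, not the kernel code:
the fake_* variables and FAKE_* bit values are hypothetical stand-ins
for the MSRs and flags that the real helper accesses via
rdmsrl()/wrmsrl().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two MSRs; bit positions are illustrative. */
#define FAKE_LBRV2EN		(1ULL << 6)
#define FAKE_FREEZE_ON_PMI	(1ULL << 11)

static uint64_t fake_dbg_extn_cfg = FAKE_LBRV2EN;
static uint64_t fake_debugctl = FAKE_FREEZE_ON_PMI;
static unsigned int fake_lbr_users = 1;

/* Force-inlined core: straight-line loads and stores, no conditionals. */
static inline __attribute__((always_inline)) void fake_lbr_disable(void)
{
	fake_dbg_extn_cfg &= ~FAKE_LBRV2EN;
	fake_debugctl &= ~FAKE_FREEZE_ON_PMI;
}

/* Checked wrapper for ordinary callers: the sanity check lives here. */
static void fake_lbr_disable_all(void)
{
	if (!fake_lbr_users)
		return;

	fake_lbr_disable();
}

/* Small usage check: the wrapper clears both bits when LBR has users. */
static void fake_lbr_selfcheck(void)
{
	fake_lbr_disable_all();
	assert(!(fake_dbg_extn_cfg & FAKE_LBRV2EN));
	assert(!(fake_debugctl & FAKE_FREEZE_ON_PMI));
}
```

A snapshot path can call fake_lbr_disable() directly and skip the
conditional, while existing callers keep the checked wrapper.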

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 arch/x86/events/amd/lbr.c    |  7 +------
 arch/x86/events/perf_event.h | 11 +++++++++++
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/amd/lbr.c b/arch/x86/events/amd/lbr.c
index 4a1e600314d5..0e4de028590d 100644
--- a/arch/x86/events/amd/lbr.c
+++ b/arch/x86/events/amd/lbr.c
@@ -412,16 +412,11 @@ void amd_pmu_lbr_enable_all(void)
 void amd_pmu_lbr_disable_all(void)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
-	u64 dbg_ctl, dbg_extn_cfg;
 
 	if (!cpuc->lbr_users || !x86_pmu.lbr_nr)
 		return;
 
-	rdmsrl(MSR_AMD_DBG_EXTN_CFG, dbg_extn_cfg);
-	rdmsrl(MSR_IA32_DEBUGCTLMSR, dbg_ctl);
-
-	wrmsrl(MSR_AMD_DBG_EXTN_CFG, dbg_extn_cfg & ~DBG_EXTN_CFG_LBRV2EN);
-	wrmsrl(MSR_IA32_DEBUGCTLMSR, dbg_ctl & ~DEBUGCTLMSR_FREEZE_LBRS_ON_PMI);
+	__amd_pmu_lbr_disable();
 }
 
 __init int amd_pmu_lbr_init(void)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index fb56518356ec..4dddf0a7e81e 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1329,6 +1329,17 @@ void amd_pmu_lbr_enable_all(void);
 void amd_pmu_lbr_disable_all(void);
 int amd_pmu_lbr_hw_config(struct perf_event *event);
 
+static __always_inline void __amd_pmu_lbr_disable(void)
+{
+	u64 dbg_ctl, dbg_extn_cfg;
+
+	rdmsrl(MSR_AMD_DBG_EXTN_CFG, dbg_extn_cfg);
+	rdmsrl(MSR_IA32_DEBUGCTLMSR, dbg_ctl);
+
+	wrmsrl(MSR_AMD_DBG_EXTN_CFG, dbg_extn_cfg & ~DBG_EXTN_CFG_LBRV2EN);
+	wrmsrl(MSR_IA32_DEBUGCTLMSR, dbg_ctl & ~DEBUGCTLMSR_FREEZE_LBRS_ON_PMI);
+}
+
 #ifdef CONFIG_PERF_EVENTS_AMD_BRS
 
 #define AMD_FAM19H_BRS_EVENT 0xc4 /* RETIRED_TAKEN_BRANCH_INSTRUCTIONS */
-- 
2.43.0



* [PATCH v4 3/4] perf/x86/amd: support capturing LBR from software events
  2024-03-31  4:18 [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events Andrii Nakryiko
  2024-03-31  4:18 ` [PATCH v4 1/4] perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined Andrii Nakryiko
  2024-03-31  4:18 ` [PATCH v4 2/4] perf/x86/amd: avoid taking branches before disabling LBR Andrii Nakryiko
@ 2024-03-31  4:18 ` Andrii Nakryiko
  2024-03-31  4:18 ` [PATCH v4 4/4] perf/x86/amd: don't reject non-sampling events with configured LBR Andrii Nakryiko
  2024-04-01  9:29 ` [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events Ingo Molnar
  4 siblings, 0 replies; 9+ messages in thread
From: Andrii Nakryiko @ 2024-03-31  4:18 UTC
  To: x86, peterz, mingo, tglx
  Cc: bpf, linux-kernel, jolsa, song, kernel-team, Andrii Nakryiko,
	Sandipan Das

Upstream commit c22ac2a3d4bd ("perf: Enable branch record for software
events") added the ability to capture LBR (Last Branch Records) on Intel
CPUs from inside a BPF program at pretty much any arbitrary point. This
is an extremely useful capability that helps debug otherwise
hard-to-debug problems, because LBR is now available based on
application-defined conditions, not just hardware-supported events.

retsnoop ([0]) is one tool that takes full advantage of this
functionality and has proved to be extremely useful in practice.

Now, AMD Zen4 CPUs have gained support for similar LBR functionality,
but the necessary wiring inside the kernel is not yet set up. This patch
rectifies that, following an approach similar to the original patch
for Intel CPUs. We implement an AMD-specific callback to be called
through the perf_snapshot_branch_stack static call.

Previous preparatory patches ensured that amd_pmu_core_disable_all() and
__amd_pmu_lbr_disable() will be completely inlined and will have no
branches, so LBR snapshot contamination will be minimized.

This was tested on an AMD Bergamo CPU and worked well when used from
the aforementioned retsnoop tool.

  [0] https://github.com/anakryiko/retsnoop
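
As a rough user-space model of the callback's contract (freeze, read,
copy at most min(cnt, lbr_nr) entries, re-enable), something like the
following sketch may help. All model_* names and MODEL_LBR_NR are
hypothetical; the freeze and re-enable steps are only marked by
comments, since they need real hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_LBR_NR 16	/* assumed LBR depth for this model only */

struct model_branch_entry {
	uint64_t from;
	uint64_t to;
};

/* Stand-in for the per-CPU buffer that the LBR read step fills. */
static struct model_branch_entry model_lbr[MODEL_LBR_NR];

static unsigned int model_snapshot(struct model_branch_entry *entries,
				   unsigned int cnt)
{
	unsigned int i;

	assert(entries != NULL);

	/* Step 1 (not modeled): IRQs off, counters off, LBR off. */

	/* Step 2: "read" the frozen records into the per-CPU buffer. */
	for (i = 0; i < MODEL_LBR_NR; i++) {
		model_lbr[i].from = 0x1000 + i;
		model_lbr[i].to = 0x2000 + i;
	}

	/* Step 3: truncate to what the caller's buffer can hold. */
	if (cnt > MODEL_LBR_NR)
		cnt = MODEL_LBR_NR;
	memcpy(entries, model_lbr, sizeof(*entries) * cnt);

	/* Step 4 (not modeled): re-enable counters and LBR, restore IRQs. */

	return cnt;
}
```

The point of the model is the return value: a caller that passes a
smaller buffer gets a truncated snapshot, and a caller that passes a
larger one gets at most the hardware depth.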

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 arch/x86/events/amd/core.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index c5bcbc87d057..ed53ce091664 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -878,6 +878,37 @@ static int amd_pmu_handle_irq(struct pt_regs *regs)
 	return amd_pmu_adjust_nmi_window(handled);
 }
 
+/*
+ * AMD-specific callback invoked through perf_snapshot_branch_stack static
+ * call, defined in include/linux/perf_event.h. See its definition for API
+ * details. It's up to caller to provide enough space in *entries* to fit all
+ * LBR records, otherwise returned result will be truncated to *cnt* entries.
+ */
+static int amd_pmu_v2_snapshot_branch_stack(struct perf_branch_entry *entries, unsigned int cnt)
+{
+	struct cpu_hw_events *cpuc;
+	unsigned long flags;
+
+	/*
+	 * The sequence of steps to freeze LBR should be completely inlined
+	 * and contain no branches to minimize contamination of LBR snapshot
+	 */
+	local_irq_save(flags);
+	amd_pmu_core_disable_all();
+	__amd_pmu_lbr_disable();
+
+	cpuc = this_cpu_ptr(&cpu_hw_events);
+
+	amd_pmu_lbr_read();
+	cnt = min(cnt, x86_pmu.lbr_nr);
+	memcpy(entries, cpuc->lbr_entries, sizeof(struct perf_branch_entry) * cnt);
+
+	amd_pmu_v2_enable_all(0);
+	local_irq_restore(flags);
+
+	return cnt;
+}
+
 static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -1414,6 +1445,10 @@ static int __init amd_core_pmu_init(void)
 		static_call_update(amd_pmu_branch_reset, amd_pmu_lbr_reset);
 		static_call_update(amd_pmu_branch_add, amd_pmu_lbr_add);
 		static_call_update(amd_pmu_branch_del, amd_pmu_lbr_del);
+
+		/* Only support branch_stack snapshot on perfmon v2 */
+		if (x86_pmu.handle_irq == amd_pmu_v2_handle_irq)
+			static_call_update(perf_snapshot_branch_stack, amd_pmu_v2_snapshot_branch_stack);
 	} else if (!amd_brs_init()) {
 		/*
 		 * BRS requires special event constraints and flushing on ctxsw.
-- 
2.43.0



* [PATCH v4 4/4] perf/x86/amd: don't reject non-sampling events with configured LBR
  2024-03-31  4:18 [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2024-03-31  4:18 ` [PATCH v4 3/4] perf/x86/amd: support capturing LBR from software events Andrii Nakryiko
@ 2024-03-31  4:18 ` Andrii Nakryiko
  2024-04-01  9:29 ` [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events Ingo Molnar
  4 siblings, 0 replies; 9+ messages in thread
From: Andrii Nakryiko @ 2024-03-31  4:18 UTC
  To: x86, peterz, mingo, tglx
  Cc: bpf, linux-kernel, jolsa, song, kernel-team, Andrii Nakryiko,
	Sandipan Das

Now that it's possible to capture LBR on AMD CPUs from BPF at an
arbitrary point, there is no reason to artificially limit this feature
to just sampling events. So the corresponding check is removed. AFAIU,
there are no correctness implications of doing this (and it was possible
to bypass this check by just setting perf_event's sample_period to 1
anyway, so it doesn't guard all that much).

Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 arch/x86/events/amd/lbr.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/events/amd/lbr.c b/arch/x86/events/amd/lbr.c
index 0e4de028590d..75920f895d67 100644
--- a/arch/x86/events/amd/lbr.c
+++ b/arch/x86/events/amd/lbr.c
@@ -310,10 +310,6 @@ int amd_pmu_lbr_hw_config(struct perf_event *event)
 {
 	int ret = 0;
 
-	/* LBR is not recommended in counting mode */
-	if (!is_sampling_event(event))
-		return -EINVAL;
-
 	ret = amd_pmu_lbr_setup_filter(event);
 	if (!ret)
 		event->attach_state |= PERF_ATTACH_SCHED_CB;
-- 
2.43.0



* Re: [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
  2024-03-31  4:18 [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2024-03-31  4:18 ` [PATCH v4 4/4] perf/x86/amd: don't reject non-sampling events with configured LBR Andrii Nakryiko
@ 2024-04-01  9:29 ` Ingo Molnar
  2024-04-02  2:16   ` Andrii Nakryiko
  4 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2024-04-01  9:29 UTC
  To: Andrii Nakryiko
  Cc: x86, peterz, mingo, tglx, bpf, linux-kernel, jolsa, song,
	kernel-team


* Andrii Nakryiko <andrii@kernel.org> wrote:

> Add AMD-specific implementation of perf_snapshot_branch_stack static call that
> allows LBR capture from arbitrary points in the kernel. This is utilized by
> BPF programs. See patch #3 for all the details.
> 
> Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
> inlined and have no branches, to minimize LBR snapshot contamination.
> 
> Patch #4 removes an artificial restriction on perf events with LBR enabled.
> 
> Andrii Nakryiko (4):
>   perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
>   perf/x86/amd: avoid taking branches before disabling LBR
>   perf/x86/amd: support capturing LBR from software events
>   perf/x86/amd: don't reject non-sampling events with configured LBR
> 
>  arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
>  arch/x86/events/amd/lbr.c    | 11 +----------
>  arch/x86/events/perf_event.h | 11 +++++++++++
>  3 files changed, 48 insertions(+), 11 deletions(-)

So there's a new conflict with patch #2, probably due to interaction 
with this recent fix that is now upstream:

   598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability")

I don't think it should change the logic of the snapshot feature 
materially, X86_FEATURE_AMD_LBR_PMC_FREEZE should be orthogonal to it, 
as the LBR snapshot isn't taken from a PMI.

Thanks,

	Ingo


* Re: [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
  2024-04-01  9:29 ` [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events Ingo Molnar
@ 2024-04-02  2:16   ` Andrii Nakryiko
  2024-04-03  8:06     ` Ingo Molnar
  0 siblings, 1 reply; 9+ messages in thread
From: Andrii Nakryiko @ 2024-04-02  2:16 UTC
  To: Ingo Molnar
  Cc: Andrii Nakryiko, x86, peterz, mingo, tglx, bpf, linux-kernel,
	jolsa, song, kernel-team

On Mon, Apr 1, 2024 at 2:30 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Andrii Nakryiko <andrii@kernel.org> wrote:
>
> > Add AMD-specific implementation of perf_snapshot_branch_stack static call that
> > allows LBR capture from arbitrary points in the kernel. This is utilized by
> > BPF programs. See patch #3 for all the details.
> >
> > Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
> > inlined and have no branches, to minimize LBR snapshot contamination.
> >
> > Patch #4 removes an artificial restriction on perf events with LBR enabled.
> >
> > Andrii Nakryiko (4):
> >   perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
> >   perf/x86/amd: avoid taking branches before disabling LBR
> >   perf/x86/amd: support capturing LBR from software events
> >   perf/x86/amd: don't reject non-sampling events with configured LBR
> >
> >  arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
> >  arch/x86/events/amd/lbr.c    | 11 +----------
> >  arch/x86/events/perf_event.h | 11 +++++++++++
> >  3 files changed, 48 insertions(+), 11 deletions(-)
>
> So there's a new conflict with patch #2, probably due to interaction
> with this recent fix that is now upstream:
>
>    598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability")
>
> I don't think it should change the logic of the snapshot feature
> materially, X86_FEATURE_AMD_LBR_PMC_FREEZE should be orthogonal to it,
> as the LBR snapshot isn't taken from a PMI.
>

Yep, seems like there was a parallel change to related code in
perf/urgent branch. And yes, you are right that it's orthogonal and
doesn't regress anything as far as branching and whatnot (just
retested everything on real hardware). So I've rebased my patches on
top of perf/urgent, will send v5 momentarily. Sorry for an extra round
on this.

> Thanks,
>
>         Ingo


* Re: [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
  2024-04-02  2:16   ` Andrii Nakryiko
@ 2024-04-03  8:06     ` Ingo Molnar
  2024-04-03 16:08       ` Andrii Nakryiko
  0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2024-04-03  8:06 UTC
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, x86, peterz, mingo, tglx, bpf, linux-kernel,
	jolsa, song, kernel-team


* Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Mon, Apr 1, 2024 at 2:30 AM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > > Add AMD-specific implementation of perf_snapshot_branch_stack static call that
> > > allows LBR capture from arbitrary points in the kernel. This is utilized by
> > > BPF programs. See patch #3 for all the details.
> > >
> > > Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
> > > inlined and have no branches, to minimize LBR snapshot contamination.
> > >
> > > Patch #4 removes an artificial restriction on perf events with LBR enabled.
> > >
> > > Andrii Nakryiko (4):
> > >   perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
> > >   perf/x86/amd: avoid taking branches before disabling LBR
> > >   perf/x86/amd: support capturing LBR from software events
> > >   perf/x86/amd: don't reject non-sampling events with configured LBR
> > >
> > >  arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
> > >  arch/x86/events/amd/lbr.c    | 11 +----------
> > >  arch/x86/events/perf_event.h | 11 +++++++++++
> > >  3 files changed, 48 insertions(+), 11 deletions(-)
> >
> > So there's a new conflict with patch #2, probably due to interaction
> > with this recent fix that is now upstream:
> >
> >    598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability")
> >
> > I don't think it should change the logic of the snapshot feature
> > materially, X86_FEATURE_AMD_LBR_PMC_FREEZE should be orthogonal to it,
> > as the LBR snapshot isn't taken from a PMI.
> >
> 
> Yep, seems like there was a parallel change to related code in 
> perf/urgent branch. And yes, you are right that it's orthogonal and 
> doesn't regress anything as far as branching and whatnot (just 
> retested everything on real hardware). So I've rebased my patches on 
> top of perf/urgent, will send v5 momentarily.

Thank you - it's now all in tip:perf/core and lined up for v6.10.

> Sorry for an extra round on this.

Not your doing really - just crossing patches.

Thanks,

	Ingo


* Re: [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
  2024-04-03  8:06     ` Ingo Molnar
@ 2024-04-03 16:08       ` Andrii Nakryiko
  0 siblings, 0 replies; 9+ messages in thread
From: Andrii Nakryiko @ 2024-04-03 16:08 UTC
  To: Ingo Molnar
  Cc: Andrii Nakryiko, x86, peterz, mingo, tglx, bpf, linux-kernel,
	jolsa, song, kernel-team

On Wed, Apr 3, 2024 at 1:06 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Mon, Apr 1, 2024 at 2:30 AM Ingo Molnar <mingo@kernel.org> wrote:
> > >
> > >
> > > * Andrii Nakryiko <andrii@kernel.org> wrote:
> > >
> > > > Add AMD-specific implementation of perf_snapshot_branch_stack static call that
> > > > allows LBR capture from arbitrary points in the kernel. This is utilized by
> > > > BPF programs. See patch #3 for all the details.
> > > >
> > > > Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
> > > > inlined and have no branches, to minimize LBR snapshot contamination.
> > > >
> > > > Patch #4 removes an artificial restriction on perf events with LBR enabled.
> > > >
> > > > Andrii Nakryiko (4):
> > > >   perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
> > > >   perf/x86/amd: avoid taking branches before disabling LBR
> > > >   perf/x86/amd: support capturing LBR from software events
> > > >   perf/x86/amd: don't reject non-sampling events with configured LBR
> > > >
> > > >  arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
> > > >  arch/x86/events/amd/lbr.c    | 11 +----------
> > > >  arch/x86/events/perf_event.h | 11 +++++++++++
> > > >  3 files changed, 48 insertions(+), 11 deletions(-)
> > >
> > > So there's a new conflict with patch #2, probably due to interaction
> > > with this recent fix that is now upstream:
> > >
> > >    598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability")
> > >
> > > I don't think it should change the logic of the snapshot feature
> > > materially, X86_FEATURE_AMD_LBR_PMC_FREEZE should be orthogonal to it,
> > > as the LBR snapshot isn't taken from a PMI.
> > >
> >
> > Yep, seems like there was a parallel change to related code in
> > perf/urgent branch. And yes, you are right that it's orthogonal and
> > doesn't regress anything as far as branching and whatnot (just
> > retested everything on real hardware). So I've rebased my patches on
> > top of perf/urgent, will send v5 momentarily.
>
> Thank you - it's now all in tip:perf/core and lined up for v6.10.

Great, thank you!

>
> > Sorry for an extra round on this.
>
> Not your doing really - just crossing patches.
>
> Thanks,
>
>         Ingo
