LKML Archive mirror
* [PATCH 0/4] convert perf to local64_t
@ 2010-05-21 13:42 Peter Zijlstra
  2010-05-21 13:42 ` [PATCH 1/4] arch: local64_t Peter Zijlstra
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Peter Zijlstra @ 2010-05-21 13:42 UTC
  To: Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo
  Cc: Frederic Weisbecker, Steven Rostedt, David Miller, Paul Mundt,
	Will Deacon, Deng-Cheng Zhu, Peter Zijlstra, LKML

These patches introduce local64_t.

Since perf_event::count is only modified cross-CPU when child counters
feed back their changes on exit, and we can use a secondary variable
for that, we can convert perf to use local64_t instead of atomic64_t
and use instructions without bus-lock semantics.

The local64_t implementation uses local_t on 64-bit, since local_t is
of type long; on 32-bit it falls back to atomic64_t. Architectures can
provide their own implementation as usual.
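
In miniature, the conversion works like the sketch below. This is an
illustration only, not code from the series: demo_total() is a made-up
name, but its body is what patch 2/4 calls perf_event_count(), using
the field types that patches 3/4 and 4/4 introduce.

	static u64 demo_total(struct perf_event *event)
	{
		/*
		 * Hot path (event's own CPU): local64_add(delta, &event->count)
		 * - atomic wrt interrupts on this CPU, no bus lock.
		 *
		 * Cross-CPU path (child exit only): the child adds its total
		 * into the atomic secondary variable instead:
		 * atomic64_add(child_val, &event->child_count).
		 */
		return local64_read(&event->count) +
		       atomic64_read(&event->child_count);
	}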





* [PATCH 1/4] arch: local64_t
  2010-05-21 13:42 [PATCH 0/4] convert perf to local64_t Peter Zijlstra
@ 2010-05-21 13:42 ` Peter Zijlstra
  2010-05-21 14:47   ` Kyle McMartin
  2010-05-21 13:42 ` [PATCH 2/4] perf: Add perf_event_count() Peter Zijlstra
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2010-05-21 13:42 UTC
  To: Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo
  Cc: Frederic Weisbecker, Steven Rostedt, David Miller, Paul Mundt,
	Will Deacon, Deng-Cheng Zhu, Peter Zijlstra, LKML, linux-arch

[-- Attachment #1: local64.patch --]
[-- Type: text/plain, Size: 10524 bytes --]

Implements local64_t.

On 64-bit, local_t is of size long, and thus we make local64_t an alias.
On 32-bit, we fall back to atomic64_t.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-arch@vger.kernel.org
LKML-Reference: <new-submission>
---
 arch/alpha/include/asm/local64.h      |    1 
 arch/arm/include/asm/local64.h        |    1 
 arch/avr32/include/asm/local64.h      |    1 
 arch/blackfin/include/asm/local64.h   |    1 
 arch/cris/include/asm/local64.h       |    1 
 arch/frv/include/asm/local64.h        |    1 
 arch/frv/kernel/local64.h             |    1 
 arch/h8300/include/asm/local64.h      |    1 
 arch/ia64/include/asm/local64.h       |    1 
 arch/m32r/include/asm/local64.h       |    1 
 arch/m68k/include/asm/local64.h       |    1 
 arch/microblaze/include/asm/local64.h |    1 
 arch/mips/include/asm/local64.h       |    1 
 arch/mn10300/include/asm/local64.h    |    1 
 arch/parisc/include/asm/local64.h     |    1 
 arch/powerpc/include/asm/local64.h    |    1 
 arch/s390/include/asm/local64.h       |    1 
 arch/score/include/asm/local64.h      |    1 
 arch/sh/include/asm/local64.h         |    1 
 arch/sparc/include/asm/local64.h      |    1 
 arch/x86/include/asm/local64.h        |    1 
 arch/xtensa/include/asm/local64.h     |    1 
 include/asm-generic/local64.h         |   96 ++++++++++++++++++++++++++++++++++
 23 files changed, 118 insertions(+)

Index: linux-2.6/arch/alpha/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/alpha/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/arm/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/arm/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/avr32/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/avr32/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/blackfin/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/blackfin/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/cris/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/cris/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/frv/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/frv/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/frv/kernel/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/frv/kernel/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/h8300/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/h8300/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/ia64/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/ia64/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/m32r/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/m32r/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/m68k/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/m68k/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/microblaze/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/microblaze/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/mips/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/mips/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/mn10300/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/mn10300/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/parisc/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/parisc/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/powerpc/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/powerpc/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/s390/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/s390/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/score/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/score/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/sh/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/sh/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/sparc/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/sparc/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/x86/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/x86/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/arch/xtensa/include/asm/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/xtensa/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
Index: linux-2.6/include/asm-generic/local64.h
===================================================================
--- /dev/null
+++ linux-2.6/include/asm-generic/local64.h
@@ -0,0 +1,96 @@
+#ifndef _ASM_GENERIC_LOCAL64_H
+#define _ASM_GENERIC_LOCAL64_H
+
+#include <linux/percpu.h>
+#include <asm/types.h>
+
+/*
+ * A signed long type for operations which are atomic for a single CPU.
+ * Usually used in combination with per-cpu variables.
+ *
+ * This is the default implementation, which uses atomic64_t.  Which is
+ * rather pointless.  The whole point behind local64_t is that some processors
+ * can perform atomic adds and subtracts in a manner which is atomic wrt IRQs
+ * running on this CPU.  local64_t allows exploitation of such capabilities.
+ */
+
+/* Implement in terms of atomics. */
+
+#if BITS_PER_LONG == 64
+
+#include <asm/local.h>
+
+typedef struct {
+	local_t a;
+} local64_t;
+
+#define LOCAL64_INIT(i)	{ LOCAL_INIT(i) }
+
+#define local64_read(l)		local_read(&(l)->a)
+#define local64_set(l,i)	local_set((&(l)->a),(i))
+#define local64_inc(l)		local_inc(&(l)->a)
+#define local64_dec(l)		local_dec(&(l)->a)
+#define local64_add(i,l)	local_add((i),(&(l)->a))
+#define local64_sub(i,l)	local_sub((i),(&(l)->a))
+
+#define local64_sub_and_test(i, l) local_sub_and_test((i), (&(l)->a))
+#define local64_dec_and_test(l) local_dec_and_test(&(l)->a)
+#define local64_inc_and_test(l) local_inc_and_test(&(l)->a)
+#define local64_add_negative(i, l) local_add_negative((i), (&(l)->a))
+#define local64_add_return(i, l) local_add_return((i), (&(l)->a))
+#define local64_sub_return(i, l) local_sub_return((i), (&(l)->a))
+#define local64_inc_return(l)	local_inc_return(&(l)->a)
+
+#define local64_cmpxchg(l, o, n) local_cmpxchg((&(l)->a), (o), (n))
+#define local64_xchg(l, n)	local_xchg((&(l)->a), (n))
+#define local64_add_unless(l, _a, u) local_add_unless((&(l)->a), (_a), (u))
+#define local64_inc_not_zero(l)	local_inc_not_zero(&(l)->a)
+
+/* Non-atomic variants, ie. preemption disabled and won't be touched
+ * in interrupt, etc.  Some archs can optimize this case well. */
+#define __local64_inc(l)	local64_set((l), local64_read(l) + 1)
+#define __local64_dec(l)	local64_set((l), local64_read(l) - 1)
+#define __local64_add(i,l)	local64_set((l), local64_read(l) + (i))
+#define __local64_sub(i,l)	local64_set((l), local64_read(l) - (i))
+
+#else /* BITS_PER_LONG != 64 */
+
+#include <asm/atomic.h>
+
+/* Don't use typedef: don't want them to be mixed with atomic_t's. */
+typedef struct {
+	atomic64_t a;
+} local64_t;
+
+#define LOCAL64_INIT(i)	{ ATOMIC_LONG_INIT(i) }
+
+#define local64_read(l)		atomic64_read(&(l)->a)
+#define local64_set(l,i)	atomic64_set((&(l)->a),(i))
+#define local64_inc(l)		atomic64_inc(&(l)->a)
+#define local64_dec(l)		atomic64_dec(&(l)->a)
+#define local64_add(i,l)	atomic64_add((i),(&(l)->a))
+#define local64_sub(i,l)	atomic64_sub((i),(&(l)->a))
+
+#define local64_sub_and_test(i, l) atomic64_sub_and_test((i), (&(l)->a))
+#define local64_dec_and_test(l) atomic64_dec_and_test(&(l)->a)
+#define local64_inc_and_test(l) atomic64_inc_and_test(&(l)->a)
+#define local64_add_negative(i, l) atomic64_add_negative((i), (&(l)->a))
+#define local64_add_return(i, l) atomic64_add_return((i), (&(l)->a))
+#define local64_sub_return(i, l) atomic64_sub_return((i), (&(l)->a))
+#define local64_inc_return(l)	atomic64_inc_return(&(l)->a)
+
+#define local64_cmpxchg(l, o, n) atomic64_cmpxchg((&(l)->a), (o), (n))
+#define local64_xchg(l, n)	atomic64_xchg((&(l)->a), (n))
+#define local64_add_unless(l, _a, u) atomic64_add_unless((&(l)->a), (_a), (u))
+#define local64_inc_not_zero(l)	atomic64_inc_not_zero(&(l)->a)
+
+/* Non-atomic variants, ie. preemption disabled and won't be touched
+ * in interrupt, etc.  Some archs can optimize this case well. */
+#define __local64_inc(l)	local64_set((l), local64_read(l) + 1)
+#define __local64_dec(l)	local64_set((l), local64_read(l) - 1)
+#define __local64_add(i,l)	local64_set((l), local64_read(l) + (i))
+#define __local64_sub(i,l)	local64_set((l), local64_read(l) - (i))
+
+#endif /* BITS_PER_LONG != 64 */
+
+#endif /* _ASM_GENERIC_LOCAL64_H */
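
A minimal usage sketch of the API this header introduces (illustrative
only: struct demo_event and demo_update() are hypothetical, but the
cmpxchg retry loop mirrors how patch 4/4 updates hwc->prev_count in the
PMU drivers):

	#include <asm/local64.h>

	struct demo_event {
		local64_t count;
		local64_t prev_count;
	};

	static u64 demo_update(struct demo_event *e, u64 new_raw)
	{
		u64 prev;
	again:
		prev = local64_read(&e->prev_count);
		/*
		 * An interrupt or NMI on this CPU may have updated
		 * prev_count under us; cmpxchg detects that and we retry.
		 */
		if (local64_cmpxchg(&e->prev_count, prev, new_raw) != prev)
			goto again;

		/* Atomic wrt interrupts on this CPU only - no bus lock. */
		local64_add(new_raw - prev, &e->count);
		return local64_read(&e->count);
	}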




* [PATCH 2/4] perf: Add perf_event_count()
  2010-05-21 13:42 [PATCH 0/4] convert perf to local64_t Peter Zijlstra
  2010-05-21 13:42 ` [PATCH 1/4] arch: local64_t Peter Zijlstra
@ 2010-05-21 13:42 ` Peter Zijlstra
  2010-05-21 13:42 ` [PATCH 3/4] perf: Add child_count Peter Zijlstra
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2010-05-21 13:42 UTC
  To: Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo
  Cc: Frederic Weisbecker, Steven Rostedt, David Miller, Paul Mundt,
	Will Deacon, Deng-Cheng Zhu, LKML, Peter Zijlstra

[-- Attachment #1: perf-event_count.patch --]
[-- Type: text/plain, Size: 2456 bytes --]

Create a helper function for those sites that want to read the event count.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
---
 kernel/perf_event.c |   17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -1701,6 +1701,11 @@ static void __perf_event_read(void *info
 	event->pmu->read(event);
 }
 
+static inline u64 perf_event_count(struct perf_event *event)
+{
+	return atomic64_read(&event->count);
+}
+
 static u64 perf_event_read(struct perf_event *event)
 {
 	/*
@@ -1720,7 +1725,7 @@ static u64 perf_event_read(struct perf_e
 		raw_spin_unlock_irqrestore(&ctx->lock, flags);
 	}
 
-	return atomic64_read(&event->count);
+	return perf_event_count(event);
 }
 
 /*
@@ -2280,7 +2285,7 @@ void perf_event_update_userpage(struct p
 	++userpg->lock;
 	barrier();
 	userpg->index = perf_event_index(event);
-	userpg->offset = atomic64_read(&event->count);
+	userpg->offset = perf_event_count(event);
 	if (event->state == PERF_EVENT_STATE_ACTIVE)
 		userpg->offset -= atomic64_read(&event->hw.prev_count);
 
@@ -3125,7 +3130,7 @@ static void perf_output_read_one(struct 
 	u64 values[4];
 	int n = 0;
 
-	values[n++] = atomic64_read(&event->count);
+	values[n++] = perf_event_count(event);
 	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) {
 		values[n++] = event->total_time_enabled +
 			atomic64_read(&event->child_total_time_enabled);
@@ -3162,7 +3167,7 @@ static void perf_output_read_group(struc
 	if (leader != event)
 		leader->pmu->read(leader);
 
-	values[n++] = atomic64_read(&leader->count);
+	values[n++] = perf_event_count(leader);
 	if (read_format & PERF_FORMAT_ID)
 		values[n++] = primary_event_id(leader);
 
@@ -3174,7 +3179,7 @@ static void perf_output_read_group(struc
 		if (sub != event)
 			sub->pmu->read(sub);
 
-		values[n++] = atomic64_read(&sub->count);
+		values[n++] = perf_event_count(sub);
 		if (read_format & PERF_FORMAT_ID)
 			values[n++] = primary_event_id(sub);
 
@@ -5272,7 +5277,7 @@ static void sync_child_event(struct perf
 	if (child_event->attr.inherit_stat)
 		perf_event_read_event(child_event, child);
 
-	child_val = atomic64_read(&child_event->count);
+	child_val = perf_event_count(child_event);
 
 	/*
 	 * Add back the child's count to the parent's count:




* [PATCH 3/4] perf: Add child_count
  2010-05-21 13:42 [PATCH 0/4] convert perf to local64_t Peter Zijlstra
  2010-05-21 13:42 ` [PATCH 1/4] arch: local64_t Peter Zijlstra
  2010-05-21 13:42 ` [PATCH 2/4] perf: Add perf_event_count() Peter Zijlstra
@ 2010-05-21 13:42 ` Peter Zijlstra
  2010-05-21 13:42 ` [PATCH 4/4] perf: Convert perf_event to local_t Peter Zijlstra
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2010-05-21 13:42 UTC
  To: Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo
  Cc: Frederic Weisbecker, Steven Rostedt, David Miller, Paul Mundt,
	Will Deacon, Deng-Cheng Zhu, LKML, Peter Zijlstra

[-- Attachment #1: perf-event-child_count.patch --]
[-- Type: text/plain, Size: 1715 bytes --]

Only child counters adding back their values into the parent counter
are responsible for cross-CPU updates to event->count.

So if we pull that out into a new child_count variable, we get an
event->count that is only modified locally.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
---
 include/linux/perf_event.h |    1 +
 kernel/perf_event.c        |    4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

Index: linux-2.6/include/linux/perf_event.h
===================================================================
--- linux-2.6.orig/include/linux/perf_event.h
+++ linux-2.6/include/linux/perf_event.h
@@ -648,6 +648,7 @@ struct perf_event {
 
 	enum perf_event_active_state	state;
 	atomic64_t			count;
+	atomic64_t			child_count;
 
 	/*
 	 * These are the total time in nanoseconds that the event
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -1703,7 +1703,7 @@ static void __perf_event_read(void *info
 
 static inline u64 perf_event_count(struct perf_event *event)
 {
-	return atomic64_read(&event->count);
+	return atomic64_read(&event->count) + atomic64_read(&event->child_count);
 }
 
 static u64 perf_event_read(struct perf_event *event)
@@ -5282,7 +5282,7 @@ static void sync_child_event(struct perf
 	/*
 	 * Add back the child's count to the parent's count:
 	 */
-	atomic64_add(child_val, &parent_event->count);
+	atomic64_add(child_val, &parent_event->child_count);
 	atomic64_add(child_event->total_time_enabled,
 		     &parent_event->child_total_time_enabled);
 	atomic64_add(child_event->total_time_running,




* [PATCH 4/4] perf: Convert perf_event to local_t
  2010-05-21 13:42 [PATCH 0/4] convert perf to local64_t Peter Zijlstra
                   ` (2 preceding siblings ...)
  2010-05-21 13:42 ` [PATCH 3/4] perf: Add child_count Peter Zijlstra
@ 2010-05-21 13:42 ` Peter Zijlstra
  2010-05-21 14:52 ` [PATCH 1/4] arch: local64_t David Howells
  2010-05-26 10:08 ` [PATCH 0/4] convert perf to local64_t Frederic Weisbecker
  5 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2010-05-21 13:42 UTC
  To: Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo
  Cc: Frederic Weisbecker, Steven Rostedt, David Miller, Paul Mundt,
	Will Deacon, Deng-Cheng Zhu, LKML, Peter Zijlstra

[-- Attachment #1: perf-local-count.patch --]
[-- Type: text/plain, Size: 18067 bytes --]

Since all modifications to event->count (and ->prev_count and
->period_left) are now local to a CPU, change them to local64_t so we
avoid the LOCK'ed ops.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
---
 arch/arm/kernel/perf_event.c     |   18 ++++++++--------
 arch/powerpc/kernel/perf_event.c |   34 +++++++++++++++----------------
 arch/sh/kernel/perf_event.c      |    6 ++---
 arch/sparc/kernel/perf_event.c   |   18 ++++++++--------
 arch/x86/kernel/cpu/perf_event.c |   18 ++++++++--------
 include/linux/perf_event.h       |    7 +++---
 kernel/perf_event.c              |   42 +++++++++++++++++++--------------------
 7 files changed, 72 insertions(+), 71 deletions(-)

Index: linux-2.6/arch/arm/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/arch/arm/kernel/perf_event.c
+++ linux-2.6/arch/arm/kernel/perf_event.c
@@ -128,20 +128,20 @@ armpmu_event_set_period(struct perf_even
 			struct hw_perf_event *hwc,
 			int idx)
 {
-	s64 left = atomic64_read(&hwc->period_left);
+	s64 left = local64_read(&hwc->period_left);
 	s64 period = hwc->sample_period;
 	int ret = 0;
 
 	if (unlikely(left <= -period)) {
 		left = period;
-		atomic64_set(&hwc->period_left, left);
+		local64_set(&hwc->period_left, left);
 		hwc->last_period = period;
 		ret = 1;
 	}
 
 	if (unlikely(left <= 0)) {
 		left += period;
-		atomic64_set(&hwc->period_left, left);
+		local64_set(&hwc->period_left, left);
 		hwc->last_period = period;
 		ret = 1;
 	}
@@ -149,7 +149,7 @@ armpmu_event_set_period(struct perf_even
 	if (left > (s64)armpmu->max_period)
 		left = armpmu->max_period;
 
-	atomic64_set(&hwc->prev_count, (u64)-left);
+	local64_set(&hwc->prev_count, (u64)-left);
 
 	armpmu->write_counter(idx, (u64)(-left) & 0xffffffff);
 
@@ -168,18 +168,18 @@ armpmu_event_update(struct perf_event *e
 	s64 delta;
 
 again:
-	prev_raw_count = atomic64_read(&hwc->prev_count);
+	prev_raw_count = local64_read(&hwc->prev_count);
 	new_raw_count = armpmu->read_counter(idx);
 
-	if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count,
+	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
 			     new_raw_count) != prev_raw_count)
 		goto again;
 
 	delta = (new_raw_count << shift) - (prev_raw_count << shift);
 	delta >>= shift;
 
-	atomic64_add(delta, &event->count);
-	atomic64_sub(delta, &hwc->period_left);
+	local64_add(delta, &event->count);
+	local64_sub(delta, &hwc->period_left);
 
 	return new_raw_count;
 }
@@ -433,7 +433,7 @@ __hw_perf_event_init(struct perf_event *
 	if (!hwc->sample_period) {
 		hwc->sample_period  = armpmu->max_period;
 		hwc->last_period    = hwc->sample_period;
-		atomic64_set(&hwc->period_left, hwc->sample_period);
+		local64_set(&hwc->period_left, hwc->sample_period);
 	}
 
 	err = 0;
Index: linux-2.6/arch/powerpc/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/perf_event.c
+++ linux-2.6/arch/powerpc/kernel/perf_event.c
@@ -410,15 +410,15 @@ static void power_pmu_read(struct perf_e
 	 * Therefore we treat them like NMIs.
 	 */
 	do {
-		prev = atomic64_read(&event->hw.prev_count);
+		prev = local64_read(&event->hw.prev_count);
 		barrier();
 		val = read_pmc(event->hw.idx);
-	} while (atomic64_cmpxchg(&event->hw.prev_count, prev, val) != prev);
+	} while (local64_cmpxchg(&event->hw.prev_count, prev, val) != prev);
 
 	/* The counters are only 32 bits wide */
 	delta = (val - prev) & 0xfffffffful;
-	atomic64_add(delta, &event->count);
-	atomic64_sub(delta, &event->hw.period_left);
+	local64_add(delta, &event->count);
+	local64_sub(delta, &event->hw.period_left);
 }
 
 /*
@@ -444,10 +444,10 @@ static void freeze_limited_counters(stru
 		if (!event->hw.idx)
 			continue;
 		val = (event->hw.idx == 5) ? pmc5 : pmc6;
-		prev = atomic64_read(&event->hw.prev_count);
+		prev = local64_read(&event->hw.prev_count);
 		event->hw.idx = 0;
 		delta = (val - prev) & 0xfffffffful;
-		atomic64_add(delta, &event->count);
+		local64_add(delta, &event->count);
 	}
 }
 
@@ -462,7 +462,7 @@ static void thaw_limited_counters(struct
 		event = cpuhw->limited_counter[i];
 		event->hw.idx = cpuhw->limited_hwidx[i];
 		val = (event->hw.idx == 5) ? pmc5 : pmc6;
-		atomic64_set(&event->hw.prev_count, val);
+		local64_set(&event->hw.prev_count, val);
 		perf_event_update_userpage(event);
 	}
 }
@@ -666,11 +666,11 @@ void hw_perf_enable(void)
 		}
 		val = 0;
 		if (event->hw.sample_period) {
-			left = atomic64_read(&event->hw.period_left);
+			left = local64_read(&event->hw.period_left);
 			if (left < 0x80000000L)
 				val = 0x80000000L - left;
 		}
-		atomic64_set(&event->hw.prev_count, val);
+		local64_set(&event->hw.prev_count, val);
 		event->hw.idx = idx;
 		write_pmc(idx, val);
 		perf_event_update_userpage(event);
@@ -842,8 +842,8 @@ static void power_pmu_unthrottle(struct 
 	if (left < 0x80000000L)
 		val = 0x80000000L - left;
 	write_pmc(event->hw.idx, val);
-	atomic64_set(&event->hw.prev_count, val);
-	atomic64_set(&event->hw.period_left, left);
+	local64_set(&event->hw.prev_count, val);
+	local64_set(&event->hw.period_left, left);
 	perf_event_update_userpage(event);
 	perf_enable();
 	local_irq_restore(flags);
@@ -1108,7 +1108,7 @@ const struct pmu *hw_perf_event_init(str
 	event->hw.config = events[n];
 	event->hw.event_base = cflags[n];
 	event->hw.last_period = event->hw.sample_period;
-	atomic64_set(&event->hw.period_left, event->hw.last_period);
+	local64_set(&event->hw.period_left, event->hw.last_period);
 
 	/*
 	 * See if we need to reserve the PMU.
@@ -1146,16 +1146,16 @@ static void record_and_restart(struct pe
 	int record = 0;
 
 	/* we don't have to worry about interrupts here */
-	prev = atomic64_read(&event->hw.prev_count);
+	prev = local64_read(&event->hw.prev_count);
 	delta = (val - prev) & 0xfffffffful;
-	atomic64_add(delta, &event->count);
+	local64_add(delta, &event->count);
 
 	/*
 	 * See if the total period for this event has expired,
 	 * and update for the next period.
 	 */
 	val = 0;
-	left = atomic64_read(&event->hw.period_left) - delta;
+	left = local64_read(&event->hw.period_left) - delta;
 	if (period) {
 		if (left <= 0) {
 			left += period;
@@ -1193,8 +1193,8 @@ static void record_and_restart(struct pe
 	}
 
 	write_pmc(event->hw.idx, val);
-	atomic64_set(&event->hw.prev_count, val);
-	atomic64_set(&event->hw.period_left, left);
+	local64_set(&event->hw.prev_count, val);
+	local64_set(&event->hw.period_left, left);
 	perf_event_update_userpage(event);
 }
 
Index: linux-2.6/arch/sh/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/arch/sh/kernel/perf_event.c
+++ linux-2.6/arch/sh/kernel/perf_event.c
@@ -185,10 +185,10 @@ static void sh_perf_event_update(struct 
 	 * this is the simplest approach for maintaining consistency.
 	 */
 again:
-	prev_raw_count = atomic64_read(&hwc->prev_count);
+	prev_raw_count = local64_read(&hwc->prev_count);
 	new_raw_count = sh_pmu->read(idx);
 
-	if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count,
+	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
 			     new_raw_count) != prev_raw_count)
 		goto again;
 
@@ -203,7 +203,7 @@ again:
 	delta = (new_raw_count << shift) - (prev_raw_count << shift);
 	delta >>= shift;
 
-	atomic64_add(delta, &event->count);
+	local64_add(delta, &event->count);
 }
 
 static void sh_pmu_disable(struct perf_event *event)
Index: linux-2.6/arch/sparc/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/arch/sparc/kernel/perf_event.c
+++ linux-2.6/arch/sparc/kernel/perf_event.c
@@ -571,18 +571,18 @@ static u64 sparc_perf_event_update(struc
 	s64 delta;
 
 again:
-	prev_raw_count = atomic64_read(&hwc->prev_count);
+	prev_raw_count = local64_read(&hwc->prev_count);
 	new_raw_count = read_pmc(idx);
 
-	if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count,
+	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
 			     new_raw_count) != prev_raw_count)
 		goto again;
 
 	delta = (new_raw_count << shift) - (prev_raw_count << shift);
 	delta >>= shift;
 
-	atomic64_add(delta, &event->count);
-	atomic64_sub(delta, &hwc->period_left);
+	local64_add(delta, &event->count);
+	local64_sub(delta, &hwc->period_left);
 
 	return new_raw_count;
 }
@@ -590,27 +590,27 @@ again:
 static int sparc_perf_event_set_period(struct perf_event *event,
 				       struct hw_perf_event *hwc, int idx)
 {
-	s64 left = atomic64_read(&hwc->period_left);
+	s64 left = local64_read(&hwc->period_left);
 	s64 period = hwc->sample_period;
 	int ret = 0;
 
 	if (unlikely(left <= -period)) {
 		left = period;
-		atomic64_set(&hwc->period_left, left);
+		local64_set(&hwc->period_left, left);
 		hwc->last_period = period;
 		ret = 1;
 	}
 
 	if (unlikely(left <= 0)) {
 		left += period;
-		atomic64_set(&hwc->period_left, left);
+		local64_set(&hwc->period_left, left);
 		hwc->last_period = period;
 		ret = 1;
 	}
 	if (left > MAX_PERIOD)
 		left = MAX_PERIOD;
 
-	atomic64_set(&hwc->prev_count, (u64)-left);
+	local64_set(&hwc->prev_count, (u64)-left);
 
 	write_pmc(idx, (u64)(-left) & 0xffffffff);
 
@@ -1086,7 +1086,7 @@ static int __hw_perf_event_init(struct p
 	if (!hwc->sample_period) {
 		hwc->sample_period = MAX_PERIOD;
 		hwc->last_period = hwc->sample_period;
-		atomic64_set(&hwc->period_left, hwc->sample_period);
+		local64_set(&hwc->period_left, hwc->sample_period);
 	}
 
 	return 0;
Index: linux-2.6/arch/x86/kernel/cpu/perf_event.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/perf_event.c
+++ linux-2.6/arch/x86/kernel/cpu/perf_event.c
@@ -294,10 +294,10 @@ x86_perf_event_update(struct perf_event 
 	 * count to the generic event atomically:
 	 */
 again:
-	prev_raw_count = atomic64_read(&hwc->prev_count);
+	prev_raw_count = local64_read(&hwc->prev_count);
 	rdmsrl(hwc->event_base + idx, new_raw_count);
 
-	if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count,
+	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
 					new_raw_count) != prev_raw_count)
 		goto again;
 
@@ -312,8 +312,8 @@ again:
 	delta = (new_raw_count << shift) - (prev_raw_count << shift);
 	delta >>= shift;
 
-	atomic64_add(delta, &event->count);
-	atomic64_sub(delta, &hwc->period_left);
+	local64_add(delta, &event->count);
+	local64_sub(delta, &hwc->period_left);
 
 	return new_raw_count;
 }
@@ -437,7 +437,7 @@ static int x86_setup_perfctr(struct perf
 	if (!hwc->sample_period) {
 		hwc->sample_period = x86_pmu.max_period;
 		hwc->last_period = hwc->sample_period;
-		atomic64_set(&hwc->period_left, hwc->sample_period);
+		local64_set(&hwc->period_left, hwc->sample_period);
 	} else {
 		/*
 		 * If we have a PMU initialized but no APIC
@@ -884,7 +884,7 @@ static int
 x86_perf_event_set_period(struct perf_event *event)
 {
 	struct hw_perf_event *hwc = &event->hw;
-	s64 left = atomic64_read(&hwc->period_left);
+	s64 left = local64_read(&hwc->period_left);
 	s64 period = hwc->sample_period;
 	int ret = 0, idx = hwc->idx;
 
@@ -896,14 +896,14 @@ x86_perf_event_set_period(struct perf_ev
 	 */
 	if (unlikely(left <= -period)) {
 		left = period;
-		atomic64_set(&hwc->period_left, left);
+		local64_set(&hwc->period_left, left);
 		hwc->last_period = period;
 		ret = 1;
 	}
 
 	if (unlikely(left <= 0)) {
 		left += period;
-		atomic64_set(&hwc->period_left, left);
+		local64_set(&hwc->period_left, left);
 		hwc->last_period = period;
 		ret = 1;
 	}
@@ -922,7 +922,7 @@ x86_perf_event_set_period(struct perf_ev
 	 * The hw event starts counting from this event offset,
 	 * mark it to be able to extra future deltas:
 	 */
-	atomic64_set(&hwc->prev_count, (u64)-left);
+	local64_set(&hwc->prev_count, (u64)-left);
 
 	wrmsrl(hwc->event_base + idx,
 			(u64)(-left) & x86_pmu.cntval_mask);
Index: linux-2.6/include/linux/perf_event.h
===================================================================
--- linux-2.6.orig/include/linux/perf_event.h
+++ linux-2.6/include/linux/perf_event.h
@@ -486,6 +486,7 @@ struct perf_guest_info_callbacks {
 #include <linux/cpu.h>
 #include <asm/atomic.h>
 #include <asm/local.h>
+#include <asm/local64.h>
 
 #define PERF_MAX_STACK_DEPTH		255
 
@@ -535,10 +536,10 @@ struct hw_perf_event {
 		struct arch_hw_breakpoint	info;
 #endif
 	};
-	atomic64_t			prev_count;
+	local64_t			prev_count;
 	u64				sample_period;
 	u64				last_period;
-	atomic64_t			period_left;
+	local64_t			period_left;
 	u64				interrupts;
 
 	u64				freq_time_stamp;
@@ -647,7 +648,7 @@ struct perf_event {
 	const struct pmu		*pmu;
 
 	enum perf_event_active_state	state;
-	atomic64_t			count;
+	local64_t			count;
 	atomic64_t			child_count;
 
 	/*
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -1116,9 +1116,9 @@ static void __perf_event_sync_stat(struc
 	 * In order to keep per-task stats reliable we need to flip the event
 	 * values when we flip the contexts.
 	 */
-	value = atomic64_read(&next_event->count);
-	value = atomic64_xchg(&event->count, value);
-	atomic64_set(&next_event->count, value);
+	value = local64_read(&next_event->count);
+	value = local64_xchg(&event->count, value);
+	local64_set(&next_event->count, value);
 
 	swap(event->total_time_enabled, next_event->total_time_enabled);
 	swap(event->total_time_running, next_event->total_time_running);
@@ -1505,10 +1505,10 @@ static void perf_adjust_period(struct pe
 
 	hwc->sample_period = sample_period;
 
-	if (atomic64_read(&hwc->period_left) > 8*sample_period) {
+	if (local64_read(&hwc->period_left) > 8*sample_period) {
 		perf_disable();
 		perf_event_stop(event);
-		atomic64_set(&hwc->period_left, 0);
+		local64_set(&hwc->period_left, 0);
 		perf_event_start(event);
 		perf_enable();
 	}
@@ -1549,7 +1549,7 @@ static void perf_ctx_adjust_freq(struct 
 
 		perf_disable();
 		event->pmu->read(event);
-		now = atomic64_read(&event->count);
+		now = local64_read(&event->count);
 		delta = now - hwc->freq_count_stamp;
 		hwc->freq_count_stamp = now;
 
@@ -1703,7 +1703,7 @@ static void __perf_event_read(void *info
 
 static inline u64 perf_event_count(struct perf_event *event)
 {
-	return atomic64_read(&event->count) + atomic64_read(&event->child_count);
+	return local64_read(&event->count) + atomic64_read(&event->child_count);
 }
 
 static u64 perf_event_read(struct perf_event *event)
@@ -2105,7 +2105,7 @@ static unsigned int perf_poll(struct fil
 static void perf_event_reset(struct perf_event *event)
 {
 	(void)perf_event_read(event);
-	atomic64_set(&event->count, 0);
+	local64_set(&event->count, 0);
 	perf_event_update_userpage(event);
 }
 
@@ -2287,7 +2287,7 @@ void perf_event_update_userpage(struct p
 	userpg->index = perf_event_index(event);
 	userpg->offset = perf_event_count(event);
 	if (event->state == PERF_EVENT_STATE_ACTIVE)
-		userpg->offset -= atomic64_read(&event->hw.prev_count);
+		userpg->offset -= local64_read(&event->hw.prev_count);
 
 	userpg->time_enabled = event->total_time_enabled +
 			atomic64_read(&event->child_total_time_enabled);
@@ -3937,14 +3937,14 @@ static u64 perf_swevent_set_period(struc
 	hwc->last_period = hwc->sample_period;
 
 again:
-	old = val = atomic64_read(&hwc->period_left);
+	old = val = local64_read(&hwc->period_left);
 	if (val < 0)
 		return 0;
 
 	nr = div64_u64(period + val, period);
 	offset = nr * period;
 	val -= offset;
-	if (atomic64_cmpxchg(&hwc->period_left, old, val) != old)
+	if (local64_cmpxchg(&hwc->period_left, old, val) != old)
 		goto again;
 
 	return nr;
@@ -3990,7 +3990,7 @@ static void perf_swevent_add(struct perf
 {
 	struct hw_perf_event *hwc = &event->hw;
 
-	atomic64_add(nr, &event->count);
+	local64_add(nr, &event->count);
 
 	if (!regs)
 		return;
@@ -4001,7 +4001,7 @@ static void perf_swevent_add(struct perf
 	if (nr == 1 && hwc->sample_period == 1 && !event->attr.freq)
 		return perf_swevent_overflow(event, 1, nmi, data, regs);
 
-	if (atomic64_add_negative(nr, &hwc->period_left))
+	if (local64_add_negative(nr, &hwc->period_left))
 		return;
 
 	perf_swevent_overflow(event, 0, nmi, data, regs);
@@ -4283,8 +4283,8 @@ static void cpu_clock_perf_event_update(
 	u64 now;
 
 	now = cpu_clock(cpu);
-	prev = atomic64_xchg(&event->hw.prev_count, now);
-	atomic64_add(now - prev, &event->count);
+	prev = local64_xchg(&event->hw.prev_count, now);
+	local64_add(now - prev, &event->count);
 }
 
 static int cpu_clock_perf_event_enable(struct perf_event *event)
@@ -4292,7 +4292,7 @@ static int cpu_clock_perf_event_enable(s
 	struct hw_perf_event *hwc = &event->hw;
 	int cpu = raw_smp_processor_id();
 
-	atomic64_set(&hwc->prev_count, cpu_clock(cpu));
+	local64_set(&hwc->prev_count, cpu_clock(cpu));
 	perf_swevent_start_hrtimer(event);
 
 	return 0;
@@ -4324,9 +4324,9 @@ static void task_clock_perf_event_update
 	u64 prev;
 	s64 delta;
 
-	prev = atomic64_xchg(&event->hw.prev_count, now);
+	prev = local64_xchg(&event->hw.prev_count, now);
 	delta = now - prev;
-	atomic64_add(delta, &event->count);
+	local64_add(delta, &event->count);
 }
 
 static int task_clock_perf_event_enable(struct perf_event *event)
@@ -4336,7 +4336,7 @@ static int task_clock_perf_event_enable(
 
 	now = event->ctx->time;
 
-	atomic64_set(&hwc->prev_count, now);
+	local64_set(&hwc->prev_count, now);
 
 	perf_swevent_start_hrtimer(event);
 
@@ -4777,7 +4777,7 @@ perf_event_alloc(struct perf_event_attr 
 		hwc->sample_period = 1;
 	hwc->last_period = hwc->sample_period;
 
-	atomic64_set(&hwc->period_left, hwc->sample_period);
+	local64_set(&hwc->period_left, hwc->sample_period);
 
 	/*
 	 * we currently do not support PERF_FORMAT_GROUP on inherited events
@@ -5216,7 +5216,7 @@ inherit_event(struct perf_event *parent_
 		hwc->sample_period = sample_period;
 		hwc->last_period   = sample_period;
 
-		atomic64_set(&hwc->period_left, sample_period);
+		local64_set(&hwc->period_left, sample_period);
 	}
 
 	child_event->overflow_handler = parent_event->overflow_handler;
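
On x86-64 the practical effect of this conversion is dropping the LOCK
prefix from the hot-path counter updates. Roughly (a sketch only -
actual code generation depends on compiler and architecture):

	local64_add(delta, &event->count);	/* new: plain "addq" to memory */

	atomic64_add(delta, &event->count);	/* old: "lock addq" - the
						 * bus-locked RMW this patch
						 * removes from the fast path */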




* Re: [PATCH 1/4] arch: local64_t
  2010-05-21 13:42 ` [PATCH 1/4] arch: local64_t Peter Zijlstra
@ 2010-05-21 14:47   ` Kyle McMartin
  0 siblings, 0 replies; 9+ messages in thread
From: Kyle McMartin @ 2010-05-21 14:47 UTC
  To: Peter Zijlstra
  Cc: Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo,
	Frederic Weisbecker, Steven Rostedt, David Miller, Paul Mundt,
	Will Deacon, Deng-Cheng Zhu, LKML, linux-arch

On Fri, May 21, 2010 at 03:42:06PM +0200, Peter Zijlstra wrote:
> Implements local64_t.
> 
> On 64-bit, local_t is of size long, and thus we make local64_t an alias.
> On 32-bit, we fall back to atomic64_t.
> 
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: linux-arch@vger.kernel.org
> LKML-Reference: <new-submission>

Header looks good.

Acked-by: Kyle McMartin <kyle@mcmartin.ca>


* Re: [PATCH 1/4] arch: local64_t
  2010-05-21 13:42 [PATCH 0/4] convert perf to local64_t Peter Zijlstra
                   ` (3 preceding siblings ...)
  2010-05-21 13:42 ` [PATCH 4/4] perf: Convert perf_event to local_t Peter Zijlstra
@ 2010-05-21 14:52 ` David Howells
  2010-05-26 10:08 ` [PATCH 0/4] convert perf to local64_t Frederic Weisbecker
  5 siblings, 0 replies; 9+ messages in thread
From: David Howells @ 2010-05-21 14:52 UTC
  To: Peter Zijlstra
  Cc: dhowells, Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo,
	Frederic Weisbecker, Steven Rostedt, David Miller, Paul Mundt,
	Will Deacon, Deng-Cheng Zhu, LKML, linux-arch



Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> + * This is the default implementation, which uses atomic64_t.  Which is
> + * rather pointless.  The whole point behind local64_t is that some processors
> + * can perform atomic adds and subtracts in a manner which is atomic wrt IRQs
> + * running on this CPU.  local64_t allows exploitation of such capabilities.

Interesting...  What FRV does in atomic64-ops.S should probably be rebranded
local64_t, and atomic64_t ops be based on that in non-SMP mode.

What I did on FRV was to emulate LL/ST instructions using some of the
chip's plentiful condition bits - but it only works on UP systems.

David
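
For illustration, the layering David suggests might look like the
hypothetical sketch below on a UP kernel (not FRV's actual
atomic64-ops.S): on UP, excluding local interrupts already gives full
atomicity, which is exactly the guarantee the local64_t ops provide.

	#ifndef CONFIG_SMP
	static inline void up_atomic64_add(long long i, atomic64_t *v)
	{
		unsigned long flags;

		local_irq_save(flags);	/* IRQ-atomic == atomic on UP */
		v->counter += i;
		local_irq_restore(flags);
	}
	#endif /* !CONFIG_SMP */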


* Re: [PATCH 0/4] convert perf to local64_t
  2010-05-21 13:42 [PATCH 0/4] convert perf to local64_t Peter Zijlstra
                   ` (4 preceding siblings ...)
  2010-05-21 14:52 ` [PATCH 1/4] arch: local64_t David Howells
@ 2010-05-26 10:08 ` Frederic Weisbecker
  2010-05-26 10:11   ` Peter Zijlstra
  5 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2010-05-26 10:08 UTC
  To: Peter Zijlstra, Ingo Molnar
  Cc: Paul Mackerras, Arnaldo Carvalho de Melo, Steven Rostedt,
	David Miller, Paul Mundt, Will Deacon, Deng-Cheng Zhu, LKML

On Fri, May 21, 2010 at 03:42:05PM +0200, Peter Zijlstra wrote:
> These patches introduce local64_t.
> 
> Since perf_event::count is only modified cross-CPU when child counters
> feed back their changes on exit, and we can use a secondary variable
> for that, we can convert perf to use local64_t instead of atomic64_t
> and use instructions without bus-lock semantics.
> 
> The local64_t implementation uses local_t on 64-bit, since local_t is
> of type long; on 32-bit it falls back to atomic64_t. Architectures can
> provide their own implementation as usual.


It seems nobody disagrees with it. Can we give it a try?

Thanks.



* Re: [PATCH 0/4] convert perf to local64_t
  2010-05-26 10:08 ` [PATCH 0/4] convert perf to local64_t Frederic Weisbecker
@ 2010-05-26 10:11   ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2010-05-26 10:11 UTC
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo,
	Steven Rostedt, David Miller, Paul Mundt, Will Deacon,
	Deng-Cheng Zhu, LKML

On Wed, 2010-05-26 at 12:08 +0200, Frederic Weisbecker wrote:
> On Fri, May 21, 2010 at 03:42:05PM +0200, Peter Zijlstra wrote:
> > These patches introduce local64_t.
> > 
> > Since perf_event::count is only modified cross-CPU when child counters
> > feed back their changes on exit, and we can use a secondary variable
> > for that, we can convert perf to use local64_t instead of atomic64_t
> > and use instructions without bus-lock semantics.
> > 
> > The local64_t implementation uses local_t on 64-bit, since local_t is
> > of type long; on 32-bit it falls back to atomic64_t. Architectures can
> > provide their own implementation as usual.
> 
> 
> It seems nobody disagrees with it. Can we give it a try?

I'll push it to mingo around -rc2 or so, to let the dust settle from the
current merge window.


end of thread (newest message: 2010-05-26 10:11 UTC)

Thread overview: 9 messages
2010-05-21 13:42 [PATCH 0/4] convert perf to local64_t Peter Zijlstra
2010-05-21 13:42 ` [PATCH 1/4] arch: local64_t Peter Zijlstra
2010-05-21 14:47   ` Kyle McMartin
2010-05-21 13:42 ` [PATCH 2/4] perf: Add perf_event_count() Peter Zijlstra
2010-05-21 13:42 ` [PATCH 3/4] perf: Add child_count Peter Zijlstra
2010-05-21 13:42 ` [PATCH 4/4] perf: Convert perf_event to local_t Peter Zijlstra
2010-05-21 14:52 ` [PATCH 1/4] arch: local64_t David Howells
2010-05-26 10:08 ` [PATCH 0/4] convert perf to local64_t Frederic Weisbecker
2010-05-26 10:11   ` Peter Zijlstra
