LKML Archive mirror
* [PATCH 0/9] Reduce timekeeping lock hold time v2
@ 2012-03-02  7:12 John Stultz
  2012-03-02  7:12 ` [PATCH 1/9] time: Condense timekeeper.xtime into xtime_sec John Stultz
                   ` (8 more replies)
  0 siblings, 9 replies; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

Once again, here is the second half of the timekeeping cleanups (the
first half is already in -tip), as well as some changes to reduce the
timekeeping lock hold times.

This work grew out of some of Eric Dumazet's and Thomas Gleixner's
suggestions, after they noticed the xtime_lock hold time could be
on the long side.

The basic idea is that we keep a shadow copy of the timekeeper
structure, which can be updated while readers are accessing the
time. Then we only have to block readers while we switch to the
newly updated structure.

Changelog:
v2: 
This revision includes some formatting changes suggested by Ingo,
as well as a patch converting shift values from ints to u32 to
avoid any unintended extra work caused by the compiler. There
are also a few minor bug fixes that I caught with some additional
testing.

Let me know if you have any further issues with the patchset.

If not, feel free to pull the tree from:
  git://git.linaro.org/people/jstultz/linux.git fortglx/3.4/time

John Stultz (9):
  time: Condense timekeeper.xtime into xtime_sec
  time: Rework timekeeping functions to take timekeeper ptr as argument
  time: Split timekeeper lock into separate reader/writer locks
  time: Update timekeeper structure using a local shadow
  time: Shadow cycle_last in timekeeper structure
  time: Reduce timekeeper read lock hold time
  time: Convert the timekeeper's wlock to a raw_spin_lock
  time: Whitespace cleanups per Ingo's requests
  time: Explicitly use u32 instead of int for shift values

 kernel/time/timekeeping.c |  506 ++++++++++++++++++++++++++-------------------
 1 files changed, 290 insertions(+), 216 deletions(-)

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
-- 
1.7.3.2.146.gca209



* [PATCH 1/9] time: Condense timekeeper.xtime into xtime_sec
  2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
@ 2012-03-02  7:12 ` John Stultz
  2012-03-02 21:13   ` Christoph Lameter
  2012-03-02  7:12 ` [PATCH 2/9] time: Rework timekeeping functions to take timekeeper ptr as argument John Stultz
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

The timekeeper struct has an xtime_nsec field, which keeps the
sub-nanosecond remainder.  This ends up somewhat duplicating
the timekeeper.xtime.tv_nsec value, and we have to do extra
work to keep them apart, copying the full nsec portion out and
back in over and over.

This patch simplifies some of the logic by splitting the timekeeper
xtime value into timekeeper.xtime_sec and reusing the
timekeeper.xtime_nsec field for the sub-second portion
(stored as higher-resolution shifted nanoseconds).

This simplifies some of the accumulation logic, and will
allow for more accurate timekeeping once the vsyscall code
is updated to use the shifted nanosecond remainder.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c |  170 ++++++++++++++++++++++++++++-----------------
 1 files changed, 107 insertions(+), 63 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 403c2a0..5434c6e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -39,8 +39,11 @@ struct timekeeper {
 	/* Raw nano seconds accumulated per NTP interval. */
 	u32	raw_interval;
 
-	/* Clock shifted nano seconds remainder not stored in xtime.tv_nsec. */
+	/* Current CLOCK_REALTIME time in seconds */
+	u64	xtime_sec;
+	/* Clock shifted nano seconds */
 	u64	xtime_nsec;
+
 	/* Difference between accumulated time and NTP time in ntp
 	 * shifted nano seconds. */
 	s64	ntp_error;
@@ -48,8 +51,6 @@ struct timekeeper {
 	 * ntp shifted nano seconds. */
 	int	ntp_error_shift;
 
-	/* The current time */
-	struct timespec xtime;
 	/*
 	 * wall_to_monotonic is what we need to add to xtime (or xtime corrected
 	 * for sub jiffie times) to get to monotonic time.  Monotonic is pegged
@@ -87,6 +88,36 @@ __cacheline_aligned_in_smp DEFINE_SEQLOCK(xtime_lock);
 int __read_mostly timekeeping_suspended;
 
 
+static inline void tk_normalize_xtime(struct timekeeper *tk)
+{
+	while (tk->xtime_nsec >= ((u64)NSEC_PER_SEC << tk->shift)) {
+		tk->xtime_nsec -= (u64)NSEC_PER_SEC << tk->shift;
+		tk->xtime_sec++;
+	}
+}
+
+static struct timespec tk_xtime(struct timekeeper *tk)
+{
+	struct timespec ts;
+
+	ts.tv_sec = tk->xtime_sec;
+	ts.tv_nsec = (long)(tk->xtime_nsec >> tk->shift);
+	return ts;
+}
+
+static void tk_set_xtime(struct timekeeper *tk, const struct timespec *ts)
+{
+	tk->xtime_sec = ts->tv_sec;
+	tk->xtime_nsec = ts->tv_nsec << tk->shift;
+}
+
+
+static void tk_xtime_add(struct timekeeper *tk, const struct timespec *ts)
+{
+	tk->xtime_sec += ts->tv_sec;
+	tk->xtime_nsec += ts->tv_nsec << tk->shift;
+}
+
 
 /**
  * timekeeper_setup_internals - Set up internals to use clocksource clock.
@@ -102,7 +133,9 @@ static void timekeeper_setup_internals(struct clocksource *clock)
 {
 	cycle_t interval;
 	u64 tmp, ntpinterval;
+	struct clocksource *old_clock;
 
+	old_clock = timekeeper.clock;
 	timekeeper.clock = clock;
 	clock->cycle_last = clock->read(clock);
 
@@ -124,7 +157,14 @@ static void timekeeper_setup_internals(struct clocksource *clock)
 	timekeeper.raw_interval =
 		((u64) interval * clock->mult) >> clock->shift;
 
-	timekeeper.xtime_nsec = 0;
+	 /* if changing clocks, convert xtime_nsec shift units */
+	if (old_clock) {
+		int shift_change = clock->shift - old_clock->shift;
+		if (shift_change < 0)
+			timekeeper.xtime_nsec >>= -shift_change;
+		else
+			timekeeper.xtime_nsec <<= shift_change;
+	}
 	timekeeper.shift = clock->shift;
 
 	timekeeper.ntp_error = 0;
@@ -143,6 +183,7 @@ static inline s64 timekeeping_get_ns(void)
 {
 	cycle_t cycle_now, cycle_delta;
 	struct clocksource *clock;
+	s64 nsec;
 
 	/* read clocksource: */
 	clock = timekeeper.clock;
@@ -151,9 +192,8 @@ static inline s64 timekeeping_get_ns(void)
 	/* calculate the delta since the last update_wall_time: */
 	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
-	/* return delta convert to nanoseconds using ntp adjusted mult. */
-	return clocksource_cyc2ns(cycle_delta, timekeeper.mult,
-				  timekeeper.shift);
+	nsec = cycle_delta * timekeeper.mult + timekeeper.xtime_nsec;
+	return nsec >> timekeeper.shift;
 }
 
 static inline s64 timekeeping_get_ns_raw(void)
@@ -175,11 +215,13 @@ static inline s64 timekeeping_get_ns_raw(void)
 /* must hold write on timekeeper.lock */
 static void timekeeping_update(bool clearntp)
 {
+	struct timespec xt;
 	if (clearntp) {
 		timekeeper.ntp_error = 0;
 		ntp_clear();
 	}
-	update_vsyscall(&timekeeper.xtime, &timekeeper.wall_to_monotonic,
+	xt = tk_xtime(&timekeeper);
+	update_vsyscall(&xt, &timekeeper.wall_to_monotonic,
 			 timekeeper.clock, timekeeper.mult);
 }
 
@@ -189,7 +231,7 @@ void timekeeping_leap_insert(int leapsecond)
 	unsigned long flags;
 
 	write_seqlock_irqsave(&timekeeper.lock, flags);
-	timekeeper.xtime.tv_sec += leapsecond;
+	timekeeper.xtime_sec += leapsecond;
 	timekeeper.wall_to_monotonic.tv_sec -= leapsecond;
 	timekeeping_update(false);
 	write_sequnlock_irqrestore(&timekeeper.lock, flags);
@@ -214,13 +256,12 @@ static void timekeeping_forward_now(void)
 	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 	clock->cycle_last = cycle_now;
 
-	nsec = clocksource_cyc2ns(cycle_delta, timekeeper.mult,
-				  timekeeper.shift);
+	timekeeper.xtime_nsec += cycle_delta * timekeeper.mult;
 
 	/* If arch requires, add in gettimeoffset() */
-	nsec += arch_gettimeoffset();
+	timekeeper.xtime_nsec += arch_gettimeoffset() << timekeeper.shift;
 
-	timespec_add_ns(&timekeeper.xtime, nsec);
+	tk_normalize_xtime(&timekeeper);
 
 	nsec = clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
 	timespec_add_ns(&timekeeper.raw_time, nsec);
@@ -235,15 +276,15 @@ static void timekeeping_forward_now(void)
 void getnstimeofday(struct timespec *ts)
 {
 	unsigned long seq;
-	s64 nsecs;
+	s64 nsecs = 0;
 
 	WARN_ON(timekeeping_suspended);
 
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
 
-		*ts = timekeeper.xtime;
-		nsecs = timekeeping_get_ns();
+		ts->tv_sec = timekeeper.xtime_sec;
+		ts->tv_nsec = timekeeping_get_ns();
 
 		/* If arch requires, add in gettimeoffset() */
 		nsecs += arch_gettimeoffset();
@@ -264,11 +305,10 @@ ktime_t ktime_get(void)
 
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
-		secs = timekeeper.xtime.tv_sec +
+		secs = timekeeper.xtime_sec +
 				timekeeper.wall_to_monotonic.tv_sec;
-		nsecs = timekeeper.xtime.tv_nsec +
+		nsecs = timekeeping_get_ns() +
 				timekeeper.wall_to_monotonic.tv_nsec;
-		nsecs += timekeeping_get_ns();
 		/* If arch requires, add in gettimeoffset() */
 		nsecs += arch_gettimeoffset();
 
@@ -293,22 +333,21 @@ void ktime_get_ts(struct timespec *ts)
 {
 	struct timespec tomono;
 	unsigned int seq;
-	s64 nsecs;
 
 	WARN_ON(timekeeping_suspended);
 
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
-		*ts = timekeeper.xtime;
+		ts->tv_sec = timekeeper.xtime_sec;
+		ts->tv_nsec = timekeeping_get_ns();
 		tomono = timekeeper.wall_to_monotonic;
-		nsecs = timekeeping_get_ns();
 		/* If arch requires, add in gettimeoffset() */
-		nsecs += arch_gettimeoffset();
+		ts->tv_nsec += arch_gettimeoffset();
 
 	} while (read_seqretry(&timekeeper.lock, seq));
 
 	set_normalized_timespec(ts, ts->tv_sec + tomono.tv_sec,
-				ts->tv_nsec + tomono.tv_nsec + nsecs);
+				ts->tv_nsec + tomono.tv_nsec);
 }
 EXPORT_SYMBOL_GPL(ktime_get_ts);
 
@@ -336,7 +375,8 @@ void getnstime_raw_and_real(struct timespec *ts_raw, struct timespec *ts_real)
 		seq = read_seqbegin(&timekeeper.lock);
 
 		*ts_raw = timekeeper.raw_time;
-		*ts_real = timekeeper.xtime;
+		ts_real->tv_sec = timekeeper.xtime_sec;
+		ts_real->tv_nsec = 0;
 
 		nsecs_raw = timekeeping_get_ns_raw();
 		nsecs_real = timekeeping_get_ns();
@@ -379,7 +419,7 @@ EXPORT_SYMBOL(do_gettimeofday);
  */
 int do_settimeofday(const struct timespec *tv)
 {
-	struct timespec ts_delta;
+	struct timespec ts_delta, xt;
 	unsigned long flags;
 
 	if ((unsigned long)tv->tv_nsec >= NSEC_PER_SEC)
@@ -389,12 +429,15 @@ int do_settimeofday(const struct timespec *tv)
 
 	timekeeping_forward_now();
 
-	ts_delta.tv_sec = tv->tv_sec - timekeeper.xtime.tv_sec;
-	ts_delta.tv_nsec = tv->tv_nsec - timekeeper.xtime.tv_nsec;
+	xt = tk_xtime(&timekeeper);
+	ts_delta.tv_sec = tv->tv_sec - xt.tv_sec;
+	ts_delta.tv_nsec = tv->tv_nsec - xt.tv_nsec;
+
 	timekeeper.wall_to_monotonic =
 			timespec_sub(timekeeper.wall_to_monotonic, ts_delta);
 
-	timekeeper.xtime = *tv;
+	tk_set_xtime(&timekeeper, tv);
+
 	timekeeping_update(true);
 
 	write_sequnlock_irqrestore(&timekeeper.lock, flags);
@@ -425,7 +468,8 @@ int timekeeping_inject_offset(struct timespec *ts)
 
 	timekeeping_forward_now();
 
-	timekeeper.xtime = timespec_add(timekeeper.xtime, *ts);
+
+	tk_xtime_add(&timekeeper, ts);
 	timekeeper.wall_to_monotonic =
 				timespec_sub(timekeeper.wall_to_monotonic, *ts);
 
@@ -601,14 +645,12 @@ void __init timekeeping_init(void)
 		clock->enable(clock);
 	timekeeper_setup_internals(clock);
 
-	timekeeper.xtime.tv_sec = now.tv_sec;
-	timekeeper.xtime.tv_nsec = now.tv_nsec;
+	tk_set_xtime(&timekeeper, &now);
 	timekeeper.raw_time.tv_sec = 0;
 	timekeeper.raw_time.tv_nsec = 0;
-	if (boot.tv_sec == 0 && boot.tv_nsec == 0) {
-		boot.tv_sec = timekeeper.xtime.tv_sec;
-		boot.tv_nsec = timekeeper.xtime.tv_nsec;
-	}
+	if (boot.tv_sec == 0 && boot.tv_nsec == 0)
+		boot = tk_xtime(&timekeeper);
+
 	set_normalized_timespec(&timekeeper.wall_to_monotonic,
 				-boot.tv_sec, -boot.tv_nsec);
 	timekeeper.total_sleep_time.tv_sec = 0;
@@ -634,7 +676,7 @@ static void __timekeeping_inject_sleeptime(struct timespec *delta)
 		return;
 	}
 
-	timekeeper.xtime = timespec_add(timekeeper.xtime, *delta);
+	tk_xtime_add(&timekeeper, delta);
 	timekeeper.wall_to_monotonic =
 			timespec_sub(timekeeper.wall_to_monotonic, *delta);
 	timekeeper.total_sleep_time = timespec_add(
@@ -731,7 +773,7 @@ static int timekeeping_suspend(void)
 	 * try to compensate so the difference in system time
 	 * and persistent_clock time stays close to constant.
 	 */
-	delta = timespec_sub(timekeeper.xtime, timekeeping_suspend_time);
+	delta = timespec_sub(tk_xtime(&timekeeper), timekeeping_suspend_time);
 	delta_delta = timespec_sub(delta, old_delta);
 	if (abs(delta_delta.tv_sec)  >= 2) {
 		/*
@@ -963,7 +1005,7 @@ static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 	timekeeper.xtime_nsec += timekeeper.xtime_interval << shift;
 	while (timekeeper.xtime_nsec >= nsecps) {
 		timekeeper.xtime_nsec -= nsecps;
-		timekeeper.xtime.tv_sec++;
+		timekeeper.xtime_sec++;
 		second_overflow();
 	}
 
@@ -997,6 +1039,7 @@ static void update_wall_time(void)
 	cycle_t offset;
 	int shift = 0, maxshift;
 	unsigned long flags;
+	s64 remainder;
 
 	write_seqlock_irqsave(&timekeeper.lock, flags);
 
@@ -1011,8 +1054,6 @@ static void update_wall_time(void)
 #else
 	offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
 #endif
-	timekeeper.xtime_nsec = (s64)timekeeper.xtime.tv_nsec <<
-						timekeeper.shift;
 
 	/*
 	 * With NO_HZ we may have to accumulate many cycle_intervals
@@ -1058,25 +1099,29 @@ static void update_wall_time(void)
 		timekeeper.ntp_error += neg << timekeeper.ntp_error_shift;
 	}
 
-
 	/*
-	 * Store full nanoseconds into xtime after rounding it up and
-	 * add the remainder to the error difference.
-	 */
-	timekeeper.xtime.tv_nsec = ((s64)timekeeper.xtime_nsec >>
-						timekeeper.shift) + 1;
-	timekeeper.xtime_nsec -= (s64)timekeeper.xtime.tv_nsec <<
-						timekeeper.shift;
-	timekeeper.ntp_error +=	timekeeper.xtime_nsec <<
-				timekeeper.ntp_error_shift;
+	* Store only full nanoseconds into xtime_nsec after rounding
+	* it up and add the remainder to the error difference.
+	* XXX - This is necessary to avoid small 1ns inconsistencies caused
+	* by truncating the remainder in vsyscalls. However, it causes
+	* additional work to be done in timekeeping_adjust(). Once
+	* the vsyscall implementations are converted to use xtime_nsec
+	* (shifted nanoseconds), this can be killed.
+	*/
+	remainder = timekeeper.xtime_nsec & ((1<<timekeeper.shift)-1);
+	timekeeper.xtime_nsec -= remainder;
+	timekeeper.xtime_nsec += 1<<timekeeper.shift;
+	timekeeper.ntp_error += remainder <<
+					timekeeper.ntp_error_shift;
 
 	/*
 	 * Finally, make sure that after the rounding
 	 * xtime.tv_nsec isn't larger then NSEC_PER_SEC
 	 */
-	if (unlikely(timekeeper.xtime.tv_nsec >= NSEC_PER_SEC)) {
-		timekeeper.xtime.tv_nsec -= NSEC_PER_SEC;
-		timekeeper.xtime.tv_sec++;
+	if (unlikely(timekeeper.xtime_nsec >=
+			((u64)NSEC_PER_SEC << timekeeper.shift))) {
+		timekeeper.xtime_nsec -= (u64)NSEC_PER_SEC << timekeeper.shift;
+		timekeeper.xtime_sec++;
 		second_overflow();
 	}
 
@@ -1125,21 +1170,20 @@ void get_monotonic_boottime(struct timespec *ts)
 {
 	struct timespec tomono, sleep;
 	unsigned int seq;
-	s64 nsecs;
 
 	WARN_ON(timekeeping_suspended);
 
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
-		*ts = timekeeper.xtime;
+		ts->tv_sec = timekeeper.xtime_sec;
+		ts->tv_nsec = timekeeping_get_ns();
 		tomono = timekeeper.wall_to_monotonic;
 		sleep = timekeeper.total_sleep_time;
-		nsecs = timekeeping_get_ns();
 
 	} while (read_seqretry(&timekeeper.lock, seq));
 
 	set_normalized_timespec(ts, ts->tv_sec + tomono.tv_sec + sleep.tv_sec,
-			ts->tv_nsec + tomono.tv_nsec + sleep.tv_nsec + nsecs);
+			ts->tv_nsec + tomono.tv_nsec + sleep.tv_nsec);
 }
 EXPORT_SYMBOL_GPL(get_monotonic_boottime);
 
@@ -1172,13 +1216,13 @@ EXPORT_SYMBOL_GPL(monotonic_to_bootbased);
 
 unsigned long get_seconds(void)
 {
-	return timekeeper.xtime.tv_sec;
+	return timekeeper.xtime_sec;
 }
 EXPORT_SYMBOL(get_seconds);
 
 struct timespec __current_kernel_time(void)
 {
-	return timekeeper.xtime;
+	return tk_xtime(&timekeeper);
 }
 
 struct timespec current_kernel_time(void)
@@ -1189,7 +1233,7 @@ struct timespec current_kernel_time(void)
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
 
-		now = timekeeper.xtime;
+		now = tk_xtime(&timekeeper);
 	} while (read_seqretry(&timekeeper.lock, seq));
 
 	return now;
@@ -1204,7 +1248,7 @@ struct timespec get_monotonic_coarse(void)
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
 
-		now = timekeeper.xtime;
+		now = tk_xtime(&timekeeper);
 		mono = timekeeper.wall_to_monotonic;
 	} while (read_seqretry(&timekeeper.lock, seq));
 
@@ -1239,7 +1283,7 @@ void get_xtime_and_monotonic_and_sleep_offset(struct timespec *xtim,
 
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
-		*xtim = timekeeper.xtime;
+		*xtim = tk_xtime(&timekeeper);
 		*wtom = timekeeper.wall_to_monotonic;
 		*sleep = timekeeper.total_sleep_time;
 	} while (read_seqretry(&timekeeper.lock, seq));
-- 
1.7.3.2.146.gca209



* [PATCH 2/9] time: Rework timekeeping functions to take timekeeper ptr as argument
  2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
  2012-03-02  7:12 ` [PATCH 1/9] time: Condense timekeeper.xtime into xtime_sec John Stultz
@ 2012-03-02  7:12 ` John Stultz
  2012-03-02  7:12 ` [PATCH 3/9] time: Split timekeeper lock into separate reader/writer locks John Stultz
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

As part of cleaning up the timekeeping code, this patch converts
a number of internal functions to take a timekeeper ptr as an
argument, so that the internal functions don't access the global
timekeeper structure directly. This allows for further optimizations
later.

This patch has been updated to use the timekeeper value more
consistently, by making sure it is always passed as an argument
to non-top-level functions.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c |  180 ++++++++++++++++++++++----------------------
 1 files changed, 90 insertions(+), 90 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5434c6e..74568ca 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -120,7 +120,7 @@ static void tk_xtime_add(struct timekeeper *tk, const struct timespec *ts)
 
 
 /**
- * timekeeper_setup_internals - Set up internals to use clocksource clock.
+ * tk_setup_internals - Set up internals to use clocksource clock.
  *
  * @clock:		Pointer to clocksource.
  *
@@ -129,14 +129,14 @@ static void tk_xtime_add(struct timekeeper *tk, const struct timespec *ts)
  *
  * Unless you're the timekeeping code, you should not be using this!
  */
-static void timekeeper_setup_internals(struct clocksource *clock)
+static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
 {
 	cycle_t interval;
 	u64 tmp, ntpinterval;
 	struct clocksource *old_clock;
 
-	old_clock = timekeeper.clock;
-	timekeeper.clock = clock;
+	old_clock = tk->clock;
+	tk->clock = clock;
 	clock->cycle_last = clock->read(clock);
 
 	/* Do the ns -> cycle conversion first, using original mult */
@@ -149,60 +149,60 @@ static void timekeeper_setup_internals(struct clocksource *clock)
 		tmp = 1;
 
 	interval = (cycle_t) tmp;
-	timekeeper.cycle_interval = interval;
+	tk->cycle_interval = interval;
 
 	/* Go back from cycles -> shifted ns */
-	timekeeper.xtime_interval = (u64) interval * clock->mult;
-	timekeeper.xtime_remainder = ntpinterval - timekeeper.xtime_interval;
-	timekeeper.raw_interval =
+	tk->xtime_interval = (u64) interval * clock->mult;
+	tk->xtime_remainder = ntpinterval - tk->xtime_interval;
+	tk->raw_interval =
 		((u64) interval * clock->mult) >> clock->shift;
 
 	 /* if changing clocks, convert xtime_nsec shift units */
 	if (old_clock) {
 		int shift_change = clock->shift - old_clock->shift;
 		if (shift_change < 0)
-			timekeeper.xtime_nsec >>= -shift_change;
+			tk->xtime_nsec >>= -shift_change;
 		else
-			timekeeper.xtime_nsec <<= shift_change;
+			tk->xtime_nsec <<= shift_change;
 	}
-	timekeeper.shift = clock->shift;
+	tk->shift = clock->shift;
 
-	timekeeper.ntp_error = 0;
-	timekeeper.ntp_error_shift = NTP_SCALE_SHIFT - clock->shift;
+	tk->ntp_error = 0;
+	tk->ntp_error_shift = NTP_SCALE_SHIFT - clock->shift;
 
 	/*
 	 * The timekeeper keeps its own mult values for the currently
 	 * active clocksource. These value will be adjusted via NTP
 	 * to counteract clock drifting.
 	 */
-	timekeeper.mult = clock->mult;
+	tk->mult = clock->mult;
 }
 
 /* Timekeeper helper functions. */
-static inline s64 timekeeping_get_ns(void)
+static inline s64 timekeeping_get_ns(struct timekeeper *tk)
 {
 	cycle_t cycle_now, cycle_delta;
 	struct clocksource *clock;
 	s64 nsec;
 
 	/* read clocksource: */
-	clock = timekeeper.clock;
+	clock = tk->clock;
 	cycle_now = clock->read(clock);
 
 	/* calculate the delta since the last update_wall_time: */
 	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 
-	nsec = cycle_delta * timekeeper.mult + timekeeper.xtime_nsec;
-	return nsec >> timekeeper.shift;
+	nsec = cycle_delta * tk->mult + tk->xtime_nsec;
+	return nsec >> tk->shift;
 }
 
-static inline s64 timekeeping_get_ns_raw(void)
+static inline s64 timekeeping_get_ns_raw(struct timekeeper *tk)
 {
 	cycle_t cycle_now, cycle_delta;
 	struct clocksource *clock;
 
 	/* read clocksource: */
-	clock = timekeeper.clock;
+	clock = tk->clock;
 	cycle_now = clock->read(clock);
 
 	/* calculate the delta since the last update_wall_time: */
@@ -213,16 +213,15 @@ static inline s64 timekeeping_get_ns_raw(void)
 }
 
 /* must hold write on timekeeper.lock */
-static void timekeeping_update(bool clearntp)
+static void timekeeping_update(struct timekeeper *tk, bool clearntp)
 {
 	struct timespec xt;
 	if (clearntp) {
-		timekeeper.ntp_error = 0;
+		tk->ntp_error = 0;
 		ntp_clear();
 	}
-	xt = tk_xtime(&timekeeper);
-	update_vsyscall(&xt, &timekeeper.wall_to_monotonic,
-			 timekeeper.clock, timekeeper.mult);
+	xt = tk_xtime(tk);
+	update_vsyscall(&xt, &tk->wall_to_monotonic, tk->clock, tk->mult);
 }
 
 
@@ -233,7 +232,7 @@ void timekeeping_leap_insert(int leapsecond)
 	write_seqlock_irqsave(&timekeeper.lock, flags);
 	timekeeper.xtime_sec += leapsecond;
 	timekeeper.wall_to_monotonic.tv_sec -= leapsecond;
-	timekeeping_update(false);
+	timekeeping_update(&timekeeper, false);
 	write_sequnlock_irqrestore(&timekeeper.lock, flags);
 
 }
@@ -245,26 +244,26 @@ void timekeeping_leap_insert(int leapsecond)
  * update_wall_time(). This is useful before significant clock changes,
  * as it avoids having to deal with this time offset explicitly.
  */
-static void timekeeping_forward_now(void)
+static void timekeeping_forward_now(struct timekeeper *tk)
 {
 	cycle_t cycle_now, cycle_delta;
 	struct clocksource *clock;
 	s64 nsec;
 
-	clock = timekeeper.clock;
+	clock = tk->clock;
 	cycle_now = clock->read(clock);
 	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
 	clock->cycle_last = cycle_now;
 
-	timekeeper.xtime_nsec += cycle_delta * timekeeper.mult;
+	tk->xtime_nsec += cycle_delta * tk->mult;
 
 	/* If arch requires, add in gettimeoffset() */
-	timekeeper.xtime_nsec += arch_gettimeoffset() << timekeeper.shift;
+	tk->xtime_nsec += arch_gettimeoffset() << tk->shift;
 
-	tk_normalize_xtime(&timekeeper);
+	tk_normalize_xtime(tk);
 
 	nsec = clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
-	timespec_add_ns(&timekeeper.raw_time, nsec);
+	timespec_add_ns(&tk->raw_time, nsec);
 }
 
 /**
@@ -284,7 +283,7 @@ void getnstimeofday(struct timespec *ts)
 		seq = read_seqbegin(&timekeeper.lock);
 
 		ts->tv_sec = timekeeper.xtime_sec;
-		ts->tv_nsec = timekeeping_get_ns();
+		ts->tv_nsec = timekeeping_get_ns(&timekeeper);
 
 		/* If arch requires, add in gettimeoffset() */
 		nsecs += arch_gettimeoffset();
@@ -307,7 +306,7 @@ ktime_t ktime_get(void)
 		seq = read_seqbegin(&timekeeper.lock);
 		secs = timekeeper.xtime_sec +
 				timekeeper.wall_to_monotonic.tv_sec;
-		nsecs = timekeeping_get_ns() +
+		nsecs = timekeeping_get_ns(&timekeeper) +
 				timekeeper.wall_to_monotonic.tv_nsec;
 		/* If arch requires, add in gettimeoffset() */
 		nsecs += arch_gettimeoffset();
@@ -339,7 +338,7 @@ void ktime_get_ts(struct timespec *ts)
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
 		ts->tv_sec = timekeeper.xtime_sec;
-		ts->tv_nsec = timekeeping_get_ns();
+		ts->tv_nsec = timekeeping_get_ns(&timekeeper);
 		tomono = timekeeper.wall_to_monotonic;
 		/* If arch requires, add in gettimeoffset() */
 		ts->tv_nsec += arch_gettimeoffset();
@@ -378,8 +377,8 @@ void getnstime_raw_and_real(struct timespec *ts_raw, struct timespec *ts_real)
 		ts_real->tv_sec = timekeeper.xtime_sec;
 		ts_real->tv_nsec = 0;
 
-		nsecs_raw = timekeeping_get_ns_raw();
-		nsecs_real = timekeeping_get_ns();
+		nsecs_raw = timekeeping_get_ns_raw(&timekeeper);
+		nsecs_real = timekeeping_get_ns(&timekeeper);
 
 		/* If arch requires, add in gettimeoffset() */
 		arch_offset = arch_gettimeoffset();
@@ -427,7 +426,7 @@ int do_settimeofday(const struct timespec *tv)
 
 	write_seqlock_irqsave(&timekeeper.lock, flags);
 
-	timekeeping_forward_now();
+	timekeeping_forward_now(&timekeeper);
 
 	xt = tk_xtime(&timekeeper);
 	ts_delta.tv_sec = tv->tv_sec - xt.tv_sec;
@@ -438,7 +437,7 @@ int do_settimeofday(const struct timespec *tv)
 
 	tk_set_xtime(&timekeeper, tv);
 
-	timekeeping_update(true);
+	timekeeping_update(&timekeeper, true);
 
 	write_sequnlock_irqrestore(&timekeeper.lock, flags);
 
@@ -466,14 +465,14 @@ int timekeeping_inject_offset(struct timespec *ts)
 
 	write_seqlock_irqsave(&timekeeper.lock, flags);
 
-	timekeeping_forward_now();
+	timekeeping_forward_now(&timekeeper);
 
 
 	tk_xtime_add(&timekeeper, ts);
 	timekeeper.wall_to_monotonic =
 				timespec_sub(timekeeper.wall_to_monotonic, *ts);
 
-	timekeeping_update(true);
+	timekeeping_update(&timekeeper, true);
 
 	write_sequnlock_irqrestore(&timekeeper.lock, flags);
 
@@ -495,10 +494,10 @@ static int change_clocksource(void *data)
 
 	new = (struct clocksource *) data;
 
-	timekeeping_forward_now();
+	timekeeping_forward_now(&timekeeper);
 	if (!new->enable || new->enable(new) == 0) {
 		old = timekeeper.clock;
-		timekeeper_setup_internals(new);
+		tk_setup_internals(&timekeeper, new);
 		if (old->disable)
 			old->disable(old);
 	}
@@ -548,7 +547,7 @@ void getrawmonotonic(struct timespec *ts)
 
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
-		nsecs = timekeeping_get_ns_raw();
+		nsecs = timekeeping_get_ns_raw(&timekeeper);
 		*ts = timekeeper.raw_time;
 
 	} while (read_seqretry(&timekeeper.lock, seq));
@@ -643,7 +642,7 @@ void __init timekeeping_init(void)
 	clock = clocksource_default_clock();
 	if (clock->enable)
 		clock->enable(clock);
-	timekeeper_setup_internals(clock);
+	tk_setup_internals(&timekeeper, clock);
 
 	tk_set_xtime(&timekeeper, &now);
 	timekeeper.raw_time.tv_sec = 0;
@@ -706,11 +705,11 @@ void timekeeping_inject_sleeptime(struct timespec *delta)
 
 	write_seqlock_irqsave(&timekeeper.lock, flags);
 
-	timekeeping_forward_now();
+	timekeeping_forward_now(&timekeeper);
 
 	__timekeeping_inject_sleeptime(delta);
 
-	timekeeping_update(true);
+	timekeeping_update(&timekeeper, true);
 
 	write_sequnlock_irqrestore(&timekeeper.lock, flags);
 
@@ -764,7 +763,7 @@ static int timekeeping_suspend(void)
 	read_persistent_clock(&timekeeping_suspend_time);
 
 	write_seqlock_irqsave(&timekeeper.lock, flags);
-	timekeeping_forward_now();
+	timekeeping_forward_now(&timekeeper);
 	timekeeping_suspended = 1;
 
 	/*
@@ -812,7 +811,8 @@ device_initcall(timekeeping_init_ops);
  * If the error is already larger, we look ahead even further
  * to compensate for late or lost adjustments.
  */
-static __always_inline int timekeeping_bigadjust(s64 error, s64 *interval,
+static __always_inline int timekeeping_bigadjust(struct timekeeper *tk,
+						 s64 error, s64 *interval,
 						 s64 *offset)
 {
 	s64 tick_error, i;
@@ -828,7 +828,7 @@ static __always_inline int timekeeping_bigadjust(s64 error, s64 *interval,
 	 * here.  This is tuned so that an error of about 1 msec is adjusted
 	 * within about 1 sec (or 2^20 nsec in 2^SHIFT_HZ ticks).
 	 */
-	error2 = timekeeper.ntp_error >> (NTP_SCALE_SHIFT + 22 - 2 * SHIFT_HZ);
+	error2 = tk->ntp_error >> (NTP_SCALE_SHIFT + 22 - 2 * SHIFT_HZ);
 	error2 = abs(error2);
 	for (look_ahead = 0; error2 > 0; look_ahead++)
 		error2 >>= 2;
@@ -837,8 +837,8 @@ static __always_inline int timekeeping_bigadjust(s64 error, s64 *interval,
 	 * Now calculate the error in (1 << look_ahead) ticks, but first
 	 * remove the single look ahead already included in the error.
 	 */
-	tick_error = ntp_tick_length() >> (timekeeper.ntp_error_shift + 1);
-	tick_error -= timekeeper.xtime_interval >> 1;
+	tick_error = ntp_tick_length() >> (tk->ntp_error_shift + 1);
+	tick_error -= tk->xtime_interval >> 1;
 	error = ((error - tick_error) >> look_ahead) + tick_error;
 
 	/* Finally calculate the adjustment shift value.  */
@@ -863,9 +863,9 @@ static __always_inline int timekeeping_bigadjust(s64 error, s64 *interval,
  * this is optimized for the most common adjustments of -1,0,1,
  * for other values we can do a bit more work.
  */
-static void timekeeping_adjust(s64 offset)
+static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
 {
-	s64 error, interval = timekeeper.cycle_interval;
+	s64 error, interval = tk->cycle_interval;
 	int adj;
 
 	/*
@@ -881,7 +881,7 @@ static void timekeeping_adjust(s64 offset)
 	 *
 	 * Note: It does not "save" on aggravation when reading the code.
 	 */
-	error = timekeeper.ntp_error >> (timekeeper.ntp_error_shift - 1);
+	error = tk->ntp_error >> (tk->ntp_error_shift - 1);
 	if (error > interval) {
 		/*
 		 * We now divide error by 4(via shift), which checks if
@@ -903,7 +903,8 @@ static void timekeeping_adjust(s64 offset)
 		if (likely(error <= interval))
 			adj = 1;
 		else
-			adj = timekeeping_bigadjust(error, &interval, &offset);
+			adj = timekeeping_bigadjust(tk, error, &interval,
+							&offset);
 	} else if (error < -interval) {
 		/* See comment above, this is just switched for the negative */
 		error >>= 2;
@@ -912,17 +913,17 @@ static void timekeeping_adjust(s64 offset)
 			interval = -interval;
 			offset = -offset;
 		} else
-			adj = timekeeping_bigadjust(error, &interval, &offset);
-	} else /* No adjustment needed */
+			adj = timekeeping_bigadjust(tk, error, &interval,
+							&offset);
+	} else
 		return;
 
-	WARN_ONCE(timekeeper.clock->maxadj &&
-			(timekeeper.mult + adj > timekeeper.clock->mult +
-						timekeeper.clock->maxadj),
+	WARN_ONCE(tk->clock->maxadj &&
+			(tk->mult + adj > tk->clock->mult + tk->clock->maxadj),
 			"Adjusting %s more then 11%% (%ld vs %ld)\n",
-			timekeeper.clock->name, (long)timekeeper.mult + adj,
-			(long)timekeeper.clock->mult +
-				timekeeper.clock->maxadj);
+			tk->clock->name, (long)tk->mult + adj,
+			(long)tk->clock->mult + tk->clock->maxadj);
+
 	/*
 	 * So the following can be confusing.
 	 *
@@ -972,11 +973,10 @@ static void timekeeping_adjust(s64 offset)
 	 *
 	 * XXX - TODO: Doc ntp_error calculation.
 	 */
-	timekeeper.mult += adj;
-	timekeeper.xtime_interval += interval;
-	timekeeper.xtime_nsec -= offset;
-	timekeeper.ntp_error -= (interval - offset) <<
-				timekeeper.ntp_error_shift;
+	tk->mult += adj;
+	tk->xtime_interval += interval;
+	tk->xtime_nsec -= offset;
+	tk->ntp_error -= (interval - offset) << tk->ntp_error_shift;
 }
 
 
@@ -989,41 +989,41 @@ static void timekeeping_adjust(s64 offset)
  *
  * Returns the unconsumed cycles.
  */
-static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
+static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,
+						int shift)
 {
-	u64 nsecps = (u64)NSEC_PER_SEC << timekeeper.shift;
+	u64 nsecps = (u64)NSEC_PER_SEC << tk->shift;
 	u64 raw_nsecs;
 
 	/* If the offset is smaller then a shifted interval, do nothing */
-	if (offset < timekeeper.cycle_interval<<shift)
+	if (offset < tk->cycle_interval<<shift)
 		return offset;
 
 	/* Accumulate one shifted interval */
-	offset -= timekeeper.cycle_interval << shift;
-	timekeeper.clock->cycle_last += timekeeper.cycle_interval << shift;
+	offset -= tk->cycle_interval << shift;
+	tk->clock->cycle_last += tk->cycle_interval << shift;
 
-	timekeeper.xtime_nsec += timekeeper.xtime_interval << shift;
-	while (timekeeper.xtime_nsec >= nsecps) {
-		timekeeper.xtime_nsec -= nsecps;
-		timekeeper.xtime_sec++;
+	tk->xtime_nsec += tk->xtime_interval << shift;
+	while (tk->xtime_nsec >= nsecps) {
+		tk->xtime_nsec -= nsecps;
+		tk->xtime_sec++;
 		second_overflow();
 	}
 
 	/* Accumulate raw time */
-	raw_nsecs = timekeeper.raw_interval << shift;
-	raw_nsecs += timekeeper.raw_time.tv_nsec;
+	raw_nsecs = tk->raw_interval << shift;
+	raw_nsecs += tk->raw_time.tv_nsec;
 	if (raw_nsecs >= NSEC_PER_SEC) {
 		u64 raw_secs = raw_nsecs;
 		raw_nsecs = do_div(raw_secs, NSEC_PER_SEC);
-		timekeeper.raw_time.tv_sec += raw_secs;
+		tk->raw_time.tv_sec += raw_secs;
 	}
-	timekeeper.raw_time.tv_nsec = raw_nsecs;
+	tk->raw_time.tv_nsec = raw_nsecs;
 
 	/* Accumulate error between NTP and clock interval */
-	timekeeper.ntp_error += ntp_tick_length() << shift;
-	timekeeper.ntp_error -=
-	    (timekeeper.xtime_interval + timekeeper.xtime_remainder) <<
-				(timekeeper.ntp_error_shift + shift);
+	tk->ntp_error += ntp_tick_length() << shift;
+	tk->ntp_error -= (tk->xtime_interval + tk->xtime_remainder) <<
+						(tk->ntp_error_shift + shift);
 
 	return offset;
 }
@@ -1069,13 +1069,13 @@ static void update_wall_time(void)
 	maxshift = (64 - (ilog2(ntp_tick_length())+1)) - 1;
 	shift = min(shift, maxshift);
 	while (offset >= timekeeper.cycle_interval) {
-		offset = logarithmic_accumulation(offset, shift);
+		offset = logarithmic_accumulation(&timekeeper, offset, shift);
 		if(offset < timekeeper.cycle_interval<<shift)
 			shift--;
 	}
 
 	/* correct the clock when NTP error is too big */
-	timekeeping_adjust(offset);
+	timekeeping_adjust(&timekeeper, offset);
 
 	/*
 	 * Since in the loop above, we accumulate any amount of time
@@ -1125,7 +1125,7 @@ static void update_wall_time(void)
 		second_overflow();
 	}
 
-	timekeeping_update(false);
+	timekeeping_update(&timekeeper, false);
 
 out:
 	write_sequnlock_irqrestore(&timekeeper.lock, flags);
@@ -1176,7 +1176,7 @@ void get_monotonic_boottime(struct timespec *ts)
 	do {
 		seq = read_seqbegin(&timekeeper.lock);
 		ts->tv_sec = timekeeper.xtime_sec;
-		ts->tv_nsec = timekeeping_get_ns();
+		ts->tv_nsec = timekeeping_get_ns(&timekeeper);
 		tomono = timekeeper.wall_to_monotonic;
 		sleep = timekeeper.total_sleep_time;
 
-- 
1.7.3.2.146.gca209


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/9] time: Split timekeeper lock into separate reader/writer locks
  2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
  2012-03-02  7:12 ` [PATCH 1/9] time: Condense timekeeper.xtime into xtime_sec John Stultz
  2012-03-02  7:12 ` [PATCH 2/9] time: Rework timekeeping functions to take timekeeper ptr as argument John Stultz
@ 2012-03-02  7:12 ` John Stultz
  2012-03-02  7:12 ` [PATCH 4/9] time: Update timekeeper structure using a local shadow John Stultz
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC (permalink / raw
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

In order to reduce the lock hold time, split the timekeeper lock
into a writer lock, which serializes updates to the timekeeper
structure, and a reader sequence counter, which ensures readers
see a consistent version of the timekeeper.

This will allow us to reduce the lock wait time for readers, by
doing updates on a shadow copy of the timekeeper.

This patch has also been reworked to move the split locks out
of the timekeeper structure, so that we don't break lockdep
when updating from the shadow copy.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c |  116 +++++++++++++++++++++++++++-----------------
 1 files changed, 71 insertions(+), 45 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 74568ca..f9ee96c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -71,12 +71,16 @@ struct timekeeper {
 	/* The raw monotonic time for the CLOCK_MONOTONIC_RAW posix clock. */
 	struct timespec raw_time;
 
-	/* Seqlock for all timekeeper values */
-	seqlock_t lock;
 };
 
 static struct timekeeper timekeeper;
 
+/* Locks for timekeeper variable: */
+/* This seqcount serializes readers from updates */
+static seqcount_t timekeeper_rlock;
+/* This spinlock serializes updaters */
+static spinlock_t timekeeper_wlock;
+
 /*
  * This read-write spinlock protects us from races in SMP while
  * playing with xtime.
@@ -212,7 +216,7 @@ static inline s64 timekeeping_get_ns_raw(struct timekeeper *tk)
 	return clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
 }
 
-/* must hold write on timekeeper.lock */
+/* must hold write on timekeeper_wlock */
 static void timekeeping_update(struct timekeeper *tk, bool clearntp)
 {
 	struct timespec xt;
@@ -229,11 +233,13 @@ void timekeeping_leap_insert(int leapsecond)
 {
 	unsigned long flags;
 
-	write_seqlock_irqsave(&timekeeper.lock, flags);
+	spin_lock_irqsave(&timekeeper_wlock, flags);
+	write_seqcount_begin(&timekeeper_rlock);
 	timekeeper.xtime_sec += leapsecond;
 	timekeeper.wall_to_monotonic.tv_sec -= leapsecond;
 	timekeeping_update(&timekeeper, false);
-	write_sequnlock_irqrestore(&timekeeper.lock, flags);
+	write_seqcount_end(&timekeeper_rlock);
+	spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 }
 
@@ -280,7 +286,7 @@ void getnstimeofday(struct timespec *ts)
 	WARN_ON(timekeeping_suspended);
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 
 		ts->tv_sec = timekeeper.xtime_sec;
 		ts->tv_nsec = timekeeping_get_ns(&timekeeper);
@@ -288,7 +294,7 @@ void getnstimeofday(struct timespec *ts)
 		/* If arch requires, add in gettimeoffset() */
 		nsecs += arch_gettimeoffset();
 
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	timespec_add_ns(ts, nsecs);
 }
@@ -303,7 +309,7 @@ ktime_t ktime_get(void)
 	WARN_ON(timekeeping_suspended);
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 		secs = timekeeper.xtime_sec +
 				timekeeper.wall_to_monotonic.tv_sec;
 		nsecs = timekeeping_get_ns(&timekeeper) +
@@ -311,7 +317,7 @@ ktime_t ktime_get(void)
 		/* If arch requires, add in gettimeoffset() */
 		nsecs += arch_gettimeoffset();
 
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 	/*
 	 * Use ktime_set/ktime_add_ns to create a proper ktime on
 	 * 32-bit architectures without CONFIG_KTIME_SCALAR.
@@ -336,14 +342,14 @@ void ktime_get_ts(struct timespec *ts)
 	WARN_ON(timekeeping_suspended);
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 		ts->tv_sec = timekeeper.xtime_sec;
 		ts->tv_nsec = timekeeping_get_ns(&timekeeper);
 		tomono = timekeeper.wall_to_monotonic;
 		/* If arch requires, add in gettimeoffset() */
 		ts->tv_nsec += arch_gettimeoffset();
 
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	set_normalized_timespec(ts, ts->tv_sec + tomono.tv_sec,
 				ts->tv_nsec + tomono.tv_nsec);
@@ -371,7 +377,7 @@ void getnstime_raw_and_real(struct timespec *ts_raw, struct timespec *ts_real)
 	do {
 		u32 arch_offset;
 
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 
 		*ts_raw = timekeeper.raw_time;
 		ts_real->tv_sec = timekeeper.xtime_sec;
@@ -385,7 +391,7 @@ void getnstime_raw_and_real(struct timespec *ts_raw, struct timespec *ts_real)
 		nsecs_raw += arch_offset;
 		nsecs_real += arch_offset;
 
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	timespec_add_ns(ts_raw, nsecs_raw);
 	timespec_add_ns(ts_real, nsecs_real);
@@ -424,7 +430,8 @@ int do_settimeofday(const struct timespec *tv)
 	if ((unsigned long)tv->tv_nsec >= NSEC_PER_SEC)
 		return -EINVAL;
 
-	write_seqlock_irqsave(&timekeeper.lock, flags);
+	spin_lock_irqsave(&timekeeper_wlock, flags);
+	write_seqcount_begin(&timekeeper_rlock);
 
 	timekeeping_forward_now(&timekeeper);
 
@@ -439,7 +446,8 @@ int do_settimeofday(const struct timespec *tv)
 
 	timekeeping_update(&timekeeper, true);
 
-	write_sequnlock_irqrestore(&timekeeper.lock, flags);
+	write_seqcount_end(&timekeeper_rlock);
+	spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	/* signal hrtimers about time change */
 	clock_was_set();
@@ -463,7 +471,8 @@ int timekeeping_inject_offset(struct timespec *ts)
 	if ((unsigned long)ts->tv_nsec >= NSEC_PER_SEC)
 		return -EINVAL;
 
-	write_seqlock_irqsave(&timekeeper.lock, flags);
+	spin_lock_irqsave(&timekeeper_wlock, flags);
+	write_seqcount_begin(&timekeeper_rlock);
 
 	timekeeping_forward_now(&timekeeper);
 
@@ -474,7 +483,8 @@ int timekeeping_inject_offset(struct timespec *ts)
 
 	timekeeping_update(&timekeeper, true);
 
-	write_sequnlock_irqrestore(&timekeeper.lock, flags);
+	write_seqcount_end(&timekeeper_rlock);
+	spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	/* signal hrtimers about time change */
 	clock_was_set();
@@ -546,11 +556,11 @@ void getrawmonotonic(struct timespec *ts)
 	s64 nsecs;
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 		nsecs = timekeeping_get_ns_raw(&timekeeper);
 		*ts = timekeeper.raw_time;
 
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	timespec_add_ns(ts, nsecs);
 }
@@ -566,11 +576,11 @@ int timekeeping_valid_for_hres(void)
 	int ret;
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 
 		ret = timekeeper.clock->flags & CLOCK_SOURCE_VALID_FOR_HRES;
 
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	return ret;
 }
@@ -583,11 +593,11 @@ u64 timekeeping_max_deferment(void)
 	unsigned long seq;
 	u64 ret;
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 
 		ret = timekeeper.clock->max_idle_ns;
 
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	return ret;
 }
@@ -634,11 +644,13 @@ void __init timekeeping_init(void)
 	read_persistent_clock(&now);
 	read_boot_clock(&boot);
 
-	seqlock_init(&timekeeper.lock);
-
+	seqcount_init(&timekeeper_rlock);
+	spin_lock_init(&timekeeper_wlock);
 	ntp_init();
 
-	write_seqlock_irqsave(&timekeeper.lock, flags);
+	spin_lock_irqsave(&timekeeper_wlock, flags);
+	write_seqcount_begin(&timekeeper_rlock);
+
 	clock = clocksource_default_clock();
 	if (clock->enable)
 		clock->enable(clock);
@@ -654,7 +666,10 @@ void __init timekeeping_init(void)
 				-boot.tv_sec, -boot.tv_nsec);
 	timekeeper.total_sleep_time.tv_sec = 0;
 	timekeeper.total_sleep_time.tv_nsec = 0;
-	write_sequnlock_irqrestore(&timekeeper.lock, flags);
+
+	write_seqcount_end(&timekeeper_rlock);
+	spin_unlock_irqrestore(&timekeeper_wlock, flags);
+
 }
 
 /* time in seconds when suspend began */
@@ -703,7 +718,8 @@ void timekeeping_inject_sleeptime(struct timespec *delta)
 	if (!(ts.tv_sec == 0 && ts.tv_nsec == 0))
 		return;
 
-	write_seqlock_irqsave(&timekeeper.lock, flags);
+	spin_lock_irqsave(&timekeeper_wlock, flags);
+	write_seqcount_begin(&timekeeper_rlock);
 
 	timekeeping_forward_now(&timekeeper);
 
@@ -711,7 +727,8 @@ void timekeeping_inject_sleeptime(struct timespec *delta)
 
 	timekeeping_update(&timekeeper, true);
 
-	write_sequnlock_irqrestore(&timekeeper.lock, flags);
+	write_seqcount_end(&timekeeper_rlock);
+	spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	/* signal hrtimers about time change */
 	clock_was_set();
@@ -734,7 +751,8 @@ static void timekeeping_resume(void)
 
 	clocksource_resume();
 
-	write_seqlock_irqsave(&timekeeper.lock, flags);
+	spin_lock_irqsave(&timekeeper_wlock, flags);
+	write_seqcount_begin(&timekeeper_rlock);
 
 	if (timespec_compare(&ts, &timekeeping_suspend_time) > 0) {
 		ts = timespec_sub(ts, timekeeping_suspend_time);
@@ -744,7 +762,9 @@ static void timekeeping_resume(void)
 	timekeeper.clock->cycle_last = timekeeper.clock->read(timekeeper.clock);
 	timekeeper.ntp_error = 0;
 	timekeeping_suspended = 0;
-	write_sequnlock_irqrestore(&timekeeper.lock, flags);
+
+	write_seqcount_end(&timekeeper_rlock);
+	spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	touch_softlockup_watchdog();
 
@@ -762,7 +782,9 @@ static int timekeeping_suspend(void)
 
 	read_persistent_clock(&timekeeping_suspend_time);
 
-	write_seqlock_irqsave(&timekeeper.lock, flags);
+	spin_lock_irqsave(&timekeeper_wlock, flags);
+	write_seqcount_begin(&timekeeper_rlock);
+
 	timekeeping_forward_now(&timekeeper);
 	timekeeping_suspended = 1;
 
@@ -785,7 +807,9 @@ static int timekeeping_suspend(void)
 		timekeeping_suspend_time =
 			timespec_add(timekeeping_suspend_time, delta_delta);
 	}
-	write_sequnlock_irqrestore(&timekeeper.lock, flags);
+
+	write_seqcount_end(&timekeeper_rlock);
+	spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
 	clocksource_suspend();
@@ -1041,7 +1065,8 @@ static void update_wall_time(void)
 	unsigned long flags;
 	s64 remainder;
 
-	write_seqlock_irqsave(&timekeeper.lock, flags);
+	spin_lock_irqsave(&timekeeper_wlock, flags);
+	write_seqcount_begin(&timekeeper_rlock);
 
 	/* Make sure we're fully resumed: */
 	if (unlikely(timekeeping_suspended))
@@ -1128,7 +1153,8 @@ static void update_wall_time(void)
 	timekeeping_update(&timekeeper, false);
 
 out:
-	write_sequnlock_irqrestore(&timekeeper.lock, flags);
+	write_seqcount_end(&timekeeper_rlock);
+	spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 }
 
@@ -1174,13 +1200,13 @@ void get_monotonic_boottime(struct timespec *ts)
 	WARN_ON(timekeeping_suspended);
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 		ts->tv_sec = timekeeper.xtime_sec;
 		ts->tv_nsec = timekeeping_get_ns(&timekeeper);
 		tomono = timekeeper.wall_to_monotonic;
 		sleep = timekeeper.total_sleep_time;
 
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	set_normalized_timespec(ts, ts->tv_sec + tomono.tv_sec + sleep.tv_sec,
 			ts->tv_nsec + tomono.tv_nsec + sleep.tv_nsec);
@@ -1231,10 +1257,10 @@ struct timespec current_kernel_time(void)
 	unsigned long seq;
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 
 		now = tk_xtime(&timekeeper);
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	return now;
 }
@@ -1246,11 +1272,11 @@ struct timespec get_monotonic_coarse(void)
 	unsigned long seq;
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 
 		now = tk_xtime(&timekeeper);
 		mono = timekeeper.wall_to_monotonic;
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	set_normalized_timespec(&now, now.tv_sec + mono.tv_sec,
 				now.tv_nsec + mono.tv_nsec);
@@ -1282,11 +1308,11 @@ void get_xtime_and_monotonic_and_sleep_offset(struct timespec *xtim,
 	unsigned long seq;
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 		*xtim = tk_xtime(&timekeeper);
 		*wtom = timekeeper.wall_to_monotonic;
 		*sleep = timekeeper.total_sleep_time;
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 }
 
 /**
@@ -1298,9 +1324,9 @@ ktime_t ktime_get_monotonic_offset(void)
 	struct timespec wtom;
 
 	do {
-		seq = read_seqbegin(&timekeeper.lock);
+		seq = read_seqcount_begin(&timekeeper_rlock);
 		wtom = timekeeper.wall_to_monotonic;
-	} while (read_seqretry(&timekeeper.lock, seq));
+	} while (read_seqcount_retry(&timekeeper_rlock, seq));
 
 	return timespec_to_ktime(wtom);
 }
-- 
1.7.3.2.146.gca209



* [PATCH 4/9] time: Update timekeeper structure using a local shadow
  2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
                   ` (2 preceding siblings ...)
  2012-03-02  7:12 ` [PATCH 3/9] time: Split timekeeper lock into separate reader/writer locks John Stultz
@ 2012-03-02  7:12 ` John Stultz
  2012-03-02  7:38   ` Ingo Molnar
  2012-03-02  7:12 ` [PATCH 5/9] time: Shadow cycle_last in timekeeper structure John Stultz
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC (permalink / raw
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

Use a local shadow structure to update the timekeeper. This
will allow us to reduce the timekeeper_rlock hold time.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c |   43 +++++++++++++++++++++++--------------------
 1 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index f9ee96c..09460c1 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -74,6 +74,7 @@ struct timekeeper {
 };
 
 static struct timekeeper timekeeper;
+static struct timekeeper shadow_tk;
 
 /* Locks for timekeeper variable: */
 /* This seqcount serializes readers from updates */
@@ -1072,10 +1073,11 @@ static void update_wall_time(void)
 	if (unlikely(timekeeping_suspended))
 		goto out;
 
-	clock = timekeeper.clock;
+	shadow_tk = timekeeper;
+	clock = shadow_tk.clock;
 
 #ifdef CONFIG_ARCH_USES_GETTIMEOFFSET
-	offset = timekeeper.cycle_interval;
+	offset = shadow_tk.cycle_interval;
 #else
 	offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
 #endif
@@ -1088,19 +1090,19 @@ static void update_wall_time(void)
 	 * chunk in one go, and then try to consume the next smaller
 	 * doubled multiple.
 	 */
-	shift = ilog2(offset) - ilog2(timekeeper.cycle_interval);
+	shift = ilog2(offset) - ilog2(shadow_tk.cycle_interval);
 	shift = max(0, shift);
 	/* Bound shift to one less then what overflows tick_length */
 	maxshift = (64 - (ilog2(ntp_tick_length())+1)) - 1;
 	shift = min(shift, maxshift);
-	while (offset >= timekeeper.cycle_interval) {
-		offset = logarithmic_accumulation(&timekeeper, offset, shift);
-		if(offset < timekeeper.cycle_interval<<shift)
+	while (offset >= shadow_tk.cycle_interval) {
+		offset = logarithmic_accumulation(&shadow_tk, offset, shift);
+		if (offset < shadow_tk.cycle_interval<<shift)
 			shift--;
 	}
 
 	/* correct the clock when NTP error is too big */
-	timekeeping_adjust(&timekeeper, offset);
+	timekeeping_adjust(&shadow_tk, offset);
 
 	/*
 	 * Since in the loop above, we accumulate any amount of time
@@ -1118,10 +1120,10 @@ static void update_wall_time(void)
 	 * We'll correct this error next time through this function, when
 	 * xtime_nsec is not as small.
 	 */
-	if (unlikely((s64)timekeeper.xtime_nsec < 0)) {
-		s64 neg = -(s64)timekeeper.xtime_nsec;
-		timekeeper.xtime_nsec = 0;
-		timekeeper.ntp_error += neg << timekeeper.ntp_error_shift;
+	if (unlikely((s64)shadow_tk.xtime_nsec < 0)) {
+		s64 neg = -(s64)shadow_tk.xtime_nsec;
+		shadow_tk.xtime_nsec = 0;
+		shadow_tk.ntp_error += neg << shadow_tk.ntp_error_shift;
 	}
 
 	/*
@@ -1133,23 +1135,24 @@ static void update_wall_time(void)
 	* the vsyscall implementations are converted to use xtime_nsec
 	* (shifted nanoseconds), this can be killed.
 	*/
-	remainder = timekeeper.xtime_nsec & ((1<<timekeeper.shift)-1);
-	timekeeper.xtime_nsec -= remainder;
-	timekeeper.xtime_nsec += 1<<timekeeper.shift;
-	timekeeper.ntp_error += remainder <<
-					timekeeper.ntp_error_shift;
+	remainder = shadow_tk.xtime_nsec & ((1<<shadow_tk.shift)-1);
+	shadow_tk.xtime_nsec -= remainder;
+	shadow_tk.xtime_nsec += 1<<shadow_tk.shift;
+	shadow_tk.ntp_error += remainder << shadow_tk.ntp_error_shift;
 
 	/*
 	 * Finally, make sure that after the rounding
 	 * xtime.tv_nsec isn't larger then NSEC_PER_SEC
 	 */
-	if (unlikely(timekeeper.xtime_nsec >=
-			((u64)NSEC_PER_SEC << timekeeper.shift))) {
-		timekeeper.xtime_nsec -= (u64)NSEC_PER_SEC << timekeeper.shift;
-		timekeeper.xtime_sec++;
+	if (unlikely(shadow_tk.xtime_nsec >=
+				((u64)NSEC_PER_SEC << shadow_tk.shift))) {
+		shadow_tk.xtime_nsec -= (u64)NSEC_PER_SEC << shadow_tk.shift;
+		shadow_tk.xtime_sec++;
 		second_overflow();
 	}
 
+
+	timekeeper = shadow_tk;
 	timekeeping_update(&timekeeper, false);
 
 out:
-- 
1.7.3.2.146.gca209



* [PATCH 5/9] time: Shadow cycle_last in timekeeper structure
  2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
                   ` (3 preceding siblings ...)
  2012-03-02  7:12 ` [PATCH 4/9] time: Update timekeeper structure using a local shadow John Stultz
@ 2012-03-02  7:12 ` John Stultz
  2012-03-02  7:12 ` [PATCH 6/9] time: Reduce timekeeper read lock hold time John Stultz
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC (permalink / raw
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

The clocksource cycle_last value is problematic for working on
shadow copies of the timekeeper, because the clocksource is global.

Since it's mostly used only for timekeeping, move cycle_last into
the timekeeper. Unfortunately there are some uses of cycle_last
outside of timekeeping (such as tsc_read, which makes sure we
haven't migrated to a core whose TSC is behind the last read), so
we keep the clocksource cycle_last updated as well.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c |   23 ++++++++++++++---------
 1 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 09460c1..ed8cb51 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -29,7 +29,8 @@ struct timekeeper {
 	u32	mult;
 	/* The shift value of the current clocksource. */
 	int	shift;
-
+	/* cycle value at last accumulation point */
+	cycle_t cycle_last;
 	/* Number of clock cycles in one NTP interval. */
 	cycle_t cycle_interval;
 	/* Number of clock shifted nano seconds in one NTP interval. */
@@ -142,7 +143,8 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
 
 	old_clock = tk->clock;
 	tk->clock = clock;
-	clock->cycle_last = clock->read(clock);
+	tk->cycle_last = clock->read(clock);
+	clock->cycle_last = tk->cycle_last;
 
 	/* Do the ns -> cycle conversion first, using original mult */
 	tmp = NTP_INTERVAL_LENGTH;
@@ -195,7 +197,7 @@ static inline s64 timekeeping_get_ns(struct timekeeper *tk)
 	cycle_now = clock->read(clock);
 
 	/* calculate the delta since the last update_wall_time: */
-	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+	cycle_delta = (cycle_now - tk->cycle_last) & clock->mask;
 
 	nsec = cycle_delta * tk->mult + tk->xtime_nsec;
 	return nsec >> tk->shift;
@@ -211,7 +213,7 @@ static inline s64 timekeeping_get_ns_raw(struct timekeeper *tk)
 	cycle_now = clock->read(clock);
 
 	/* calculate the delta since the last update_wall_time: */
-	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
+	cycle_delta = (cycle_now - tk->cycle_last) & clock->mask;
 
 	/* return delta convert to nanoseconds. */
 	return clocksource_cyc2ns(cycle_delta, clock->mult, clock->shift);
@@ -259,8 +261,9 @@ static void timekeeping_forward_now(struct timekeeper *tk)
 
 	clock = tk->clock;
 	cycle_now = clock->read(clock);
-	cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
-	clock->cycle_last = cycle_now;
+	cycle_delta = (cycle_now - tk->cycle_last) & clock->mask;
+	tk->cycle_last = cycle_now;
+	tk->clock->cycle_last = cycle_now;
 
 	tk->xtime_nsec += cycle_delta * tk->mult;
 
@@ -760,7 +763,8 @@ static void timekeeping_resume(void)
 		__timekeeping_inject_sleeptime(&ts);
 	}
 	/* re-base the last cycle value */
-	timekeeper.clock->cycle_last = timekeeper.clock->read(timekeeper.clock);
+	timekeeper.cycle_last = timekeeper.clock->read(timekeeper.clock);
+	timekeeper.clock->cycle_last = timekeeper.cycle_last;
 	timekeeper.ntp_error = 0;
 	timekeeping_suspended = 0;
 
@@ -1026,7 +1030,7 @@ static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,
 
 	/* Accumulate one shifted interval */
 	offset -= tk->cycle_interval << shift;
-	tk->clock->cycle_last += tk->cycle_interval << shift;
+	tk->cycle_last += tk->cycle_interval << shift;
 
 	tk->xtime_nsec += tk->xtime_interval << shift;
 	while (tk->xtime_nsec >= nsecps) {
@@ -1079,7 +1083,7 @@ static void update_wall_time(void)
 #ifdef CONFIG_ARCH_USES_GETTIMEOFFSET
 	offset = shadow_tk.cycle_interval;
 #else
-	offset = (clock->read(clock) - clock->cycle_last) & clock->mask;
+	offset = (clock->read(clock) - shadow_tk.cycle_last) & clock->mask;
 #endif
 
 	/*
@@ -1153,6 +1157,7 @@ static void update_wall_time(void)
 
 
 	timekeeper = shadow_tk;
+	timekeeper.clock->cycle_last = timekeeper.cycle_last;
 	timekeeping_update(&timekeeper, false);
 
 out:
-- 
1.7.3.2.146.gca209



* [PATCH 6/9] time: Reduce timekeeper read lock hold time
  2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
                   ` (4 preceding siblings ...)
  2012-03-02  7:12 ` [PATCH 5/9] time: Shadow cycle_last in timekeeper structure John Stultz
@ 2012-03-02  7:12 ` John Stultz
  2012-03-02  7:12 ` [PATCH 7/9] time: Convert the timekeeper's wlock to a raw_spin_lock John Stultz
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC (permalink / raw
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

Now that timekeeper updates are done on a shadow copy,
we can shrink the read lock hold time to just the final update.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index ed8cb51..97982b1 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1071,7 +1071,6 @@ static void update_wall_time(void)
 	s64 remainder;
 
 	spin_lock_irqsave(&timekeeper_wlock, flags);
-	write_seqcount_begin(&timekeeper_rlock);
 
 	/* Make sure we're fully resumed: */
 	if (unlikely(timekeeping_suspended))
@@ -1155,13 +1154,13 @@ static void update_wall_time(void)
 		second_overflow();
 	}
 
-
+	write_seqcount_begin(&timekeeper_rlock);
 	timekeeper = shadow_tk;
 	timekeeper.clock->cycle_last = timekeeper.cycle_last;
 	timekeeping_update(&timekeeper, false);
+	write_seqcount_end(&timekeeper_rlock);
 
 out:
-	write_seqcount_end(&timekeeper_rlock);
 	spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 }
-- 
1.7.3.2.146.gca209



* [PATCH 7/9] time: Convert the timekeeper's wlock to a raw_spin_lock
  2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
                   ` (5 preceding siblings ...)
  2012-03-02  7:12 ` [PATCH 6/9] time: Reduce timekeeper read lock hold time John Stultz
@ 2012-03-02  7:12 ` John Stultz
  2012-03-02  7:12 ` [PATCH 8/9] time: Whitespace cleanups per Ingo's requests John Stultz
  2012-03-02  7:12 ` [PATCH 9/9] time: Explicitly use u32 instead of int for shift values John Stultz
  8 siblings, 0 replies; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC (permalink / raw
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

Convert the timekeeper wlock to a raw_spin_lock.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c |   36 ++++++++++++++++++------------------
 1 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 97982b1..2ceee24 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -81,7 +81,7 @@ static struct timekeeper shadow_tk;
 /* This seqcount serializes readers from updates */
 static seqcount_t timekeeper_rlock;
 /* This spinlock serializes updaters */
-static spinlock_t timekeeper_wlock;
+static raw_spinlock_t timekeeper_wlock;
 
 /*
  * This read-write spinlock protects us from races in SMP while
@@ -236,13 +236,13 @@ void timekeeping_leap_insert(int leapsecond)
 {
 	unsigned long flags;
 
-	spin_lock_irqsave(&timekeeper_wlock, flags);
+	raw_spin_lock_irqsave(&timekeeper_wlock, flags);
 	write_seqcount_begin(&timekeeper_rlock);
 	timekeeper.xtime_sec += leapsecond;
 	timekeeper.wall_to_monotonic.tv_sec -= leapsecond;
 	timekeeping_update(&timekeeper, false);
 	write_seqcount_end(&timekeeper_rlock);
-	spin_unlock_irqrestore(&timekeeper_wlock, flags);
+	raw_spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 }
 
@@ -434,7 +434,7 @@ int do_settimeofday(const struct timespec *tv)
 	if ((unsigned long)tv->tv_nsec >= NSEC_PER_SEC)
 		return -EINVAL;
 
-	spin_lock_irqsave(&timekeeper_wlock, flags);
+	raw_spin_lock_irqsave(&timekeeper_wlock, flags);
 	write_seqcount_begin(&timekeeper_rlock);
 
 	timekeeping_forward_now(&timekeeper);
@@ -451,7 +451,7 @@ int do_settimeofday(const struct timespec *tv)
 	timekeeping_update(&timekeeper, true);
 
 	write_seqcount_end(&timekeeper_rlock);
-	spin_unlock_irqrestore(&timekeeper_wlock, flags);
+	raw_spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	/* signal hrtimers about time change */
 	clock_was_set();
@@ -475,7 +475,7 @@ int timekeeping_inject_offset(struct timespec *ts)
 	if ((unsigned long)ts->tv_nsec >= NSEC_PER_SEC)
 		return -EINVAL;
 
-	spin_lock_irqsave(&timekeeper_wlock, flags);
+	raw_spin_lock_irqsave(&timekeeper_wlock, flags);
 	write_seqcount_begin(&timekeeper_rlock);
 
 	timekeeping_forward_now(&timekeeper);
@@ -488,7 +488,7 @@ int timekeeping_inject_offset(struct timespec *ts)
 	timekeeping_update(&timekeeper, true);
 
 	write_seqcount_end(&timekeeper_rlock);
-	spin_unlock_irqrestore(&timekeeper_wlock, flags);
+	raw_spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	/* signal hrtimers about time change */
 	clock_was_set();
@@ -649,10 +649,10 @@ void __init timekeeping_init(void)
 	read_boot_clock(&boot);
 
 	seqcount_init(&timekeeper_rlock);
-	spin_lock_init(&timekeeper_wlock);
+	raw_spin_lock_init(&timekeeper_wlock);
 	ntp_init();
 
-	spin_lock_irqsave(&timekeeper_wlock, flags);
+	raw_spin_lock_irqsave(&timekeeper_wlock, flags);
 	write_seqcount_begin(&timekeeper_rlock);
 
 	clock = clocksource_default_clock();
@@ -672,7 +672,7 @@ void __init timekeeping_init(void)
 	timekeeper.total_sleep_time.tv_nsec = 0;
 
 	write_seqcount_end(&timekeeper_rlock);
-	spin_unlock_irqrestore(&timekeeper_wlock, flags);
+	raw_spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 }
 
@@ -722,7 +722,7 @@ void timekeeping_inject_sleeptime(struct timespec *delta)
 	if (!(ts.tv_sec == 0 && ts.tv_nsec == 0))
 		return;
 
-	spin_lock_irqsave(&timekeeper_wlock, flags);
+	raw_spin_lock_irqsave(&timekeeper_wlock, flags);
 	write_seqcount_begin(&timekeeper_rlock);
 
 	timekeeping_forward_now(&timekeeper);
@@ -732,7 +732,7 @@ void timekeeping_inject_sleeptime(struct timespec *delta)
 	timekeeping_update(&timekeeper, true);
 
 	write_seqcount_end(&timekeeper_rlock);
-	spin_unlock_irqrestore(&timekeeper_wlock, flags);
+	raw_spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	/* signal hrtimers about time change */
 	clock_was_set();
@@ -755,7 +755,7 @@ static void timekeeping_resume(void)
 
 	clocksource_resume();
 
-	spin_lock_irqsave(&timekeeper_wlock, flags);
+	raw_spin_lock_irqsave(&timekeeper_wlock, flags);
 	write_seqcount_begin(&timekeeper_rlock);
 
 	if (timespec_compare(&ts, &timekeeping_suspend_time) > 0) {
@@ -769,7 +769,7 @@ static void timekeeping_resume(void)
 	timekeeping_suspended = 0;
 
 	write_seqcount_end(&timekeeper_rlock);
-	spin_unlock_irqrestore(&timekeeper_wlock, flags);
+	raw_spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	touch_softlockup_watchdog();
 
@@ -787,7 +787,7 @@ static int timekeeping_suspend(void)
 
 	read_persistent_clock(&timekeeping_suspend_time);
 
-	spin_lock_irqsave(&timekeeper_wlock, flags);
+	raw_spin_lock_irqsave(&timekeeper_wlock, flags);
 	write_seqcount_begin(&timekeeper_rlock);
 
 	timekeeping_forward_now(&timekeeper);
@@ -814,7 +814,7 @@ static int timekeeping_suspend(void)
 	}
 
 	write_seqcount_end(&timekeeper_rlock);
-	spin_unlock_irqrestore(&timekeeper_wlock, flags);
+	raw_spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
 	clocksource_suspend();
@@ -1070,7 +1070,7 @@ static void update_wall_time(void)
 	unsigned long flags;
 	s64 remainder;
 
-	spin_lock_irqsave(&timekeeper_wlock, flags);
+	raw_spin_lock_irqsave(&timekeeper_wlock, flags);
 
 	/* Make sure we're fully resumed: */
 	if (unlikely(timekeeping_suspended))
@@ -1161,7 +1161,7 @@ static void update_wall_time(void)
 	write_seqcount_end(&timekeeper_rlock);
 
 out:
-	spin_unlock_irqrestore(&timekeeper_wlock, flags);
+	raw_spin_unlock_irqrestore(&timekeeper_wlock, flags);
 
 }
 
-- 
1.7.3.2.146.gca209


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 8/9] time: Whitespace cleanups per Ingo's requests
  2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
                   ` (6 preceding siblings ...)
  2012-03-02  7:12 ` [PATCH 7/9] time: Convert the timekeeper's wlock to a raw_spin_lock John Stultz
@ 2012-03-02  7:12 ` John Stultz
  2012-03-02  7:12 ` [PATCH 9/9] time: Explicitly use u32 instead of int for shift values John Stultz
  8 siblings, 0 replies; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

Ingo noted a number of places where there is inconsistent
use of whitespace. This patch tries to address the main
culprits.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c |   35 ++++++++++++++++-------------------
 1 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 2ceee24..67983ff 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -24,34 +24,31 @@
 /* Structure holding internal timekeeping values. */
 struct timekeeper {
 	/* Current clocksource used for timekeeping. */
-	struct clocksource *clock;
+	struct clocksource	*clock;
 	/* NTP adjusted clock multiplier */
-	u32	mult;
+	u32			mult;
 	/* The shift value of the current clocksource. */
-	int	shift;
+	int			shift;
 	/* cycle value at last accumulation point */
-	cycle_t cycle_last;
+	cycle_t			cycle_last;
 	/* Number of clock cycles in one NTP interval. */
-	cycle_t cycle_interval;
+	cycle_t			cycle_interval;
 	/* Number of clock shifted nano seconds in one NTP interval. */
-	u64	xtime_interval;
+	u64			xtime_interval;
 	/* shifted nano seconds left over when rounding cycle_interval */
-	s64	xtime_remainder;
+	s64			xtime_remainder;
 	/* Raw nano seconds accumulated per NTP interval. */
-	u32	raw_interval;
-
+	u32			raw_interval;
 	/* Current CLOCK_REALTIME time in seconds */
-	u64	xtime_sec;
+	u64			xtime_sec;
 	/* Clock shifted nano seconds */
-	u64	xtime_nsec;
-
+	u64			xtime_nsec;
 	/* Difference between accumulated time and NTP time in ntp
 	 * shifted nano seconds. */
-	s64	ntp_error;
+	s64			ntp_error;
 	/* Shift conversion between clock shifted nano seconds and
 	 * ntp shifted nano seconds. */
-	int	ntp_error_shift;
-
+	int			ntp_error_shift;
 	/*
 	 * wall_to_monotonic is what we need to add to xtime (or xtime corrected
 	 * for sub jiffie times) to get to monotonic time.  Monotonic is pegged
@@ -66,12 +63,11 @@ struct timekeeper {
 	 * - wall_to_monotonic is no longer the boot time, getboottime must be
 	 * used instead.
 	 */
-	struct timespec wall_to_monotonic;
+	struct timespec		wall_to_monotonic;
 	/* time spent in suspend */
-	struct timespec total_sleep_time;
+	struct timespec		total_sleep_time;
 	/* The raw monotonic time for the CLOCK_MONOTONIC_RAW posix clock. */
-	struct timespec raw_time;
-
+	struct timespec		raw_time;
 };
 
 static struct timekeeper timekeeper;
@@ -596,6 +592,7 @@ u64 timekeeping_max_deferment(void)
 {
 	unsigned long seq;
 	u64 ret;
+
 	do {
 		seq = read_seqcount_begin(&timekeeper_rlock);
 
-- 
1.7.3.2.146.gca209



* [PATCH 9/9] time: Explicitly use u32 instead of int for shift values
  2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
                   ` (7 preceding siblings ...)
  2012-03-02  7:12 ` [PATCH 8/9] time: Whitespace cleanups per Ingo's requests John Stultz
@ 2012-03-02  7:12 ` John Stultz
  8 siblings, 0 replies; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:12 UTC
  To: lkml
  Cc: John Stultz, Ingo Molnar, Thomas Gleixner, Eric Dumazet,
	Richard Cochran

Ingo noted that using a u32 instead of int for shift values
would be better to make sure the compiler doesn't unnecessarily
use complex signed arithmetic.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 67983ff..60351a8 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -28,7 +28,7 @@ struct timekeeper {
 	/* NTP adjusted clock multiplier */
 	u32			mult;
 	/* The shift value of the current clocksource. */
-	int			shift;
+	u32			shift;
 	/* cycle value at last accumulation point */
 	cycle_t			cycle_last;
 	/* Number of clock cycles in one NTP interval. */
@@ -48,7 +48,7 @@ struct timekeeper {
 	s64			ntp_error;
 	/* Shift conversion between clock shifted nano seconds and
 	 * ntp shifted nano seconds. */
-	int			ntp_error_shift;
+	u32			ntp_error_shift;
 	/*
 	 * wall_to_monotonic is what we need to add to xtime (or xtime corrected
 	 * for sub jiffie times) to get to monotonic time.  Monotonic is pegged
@@ -1016,7 +1016,7 @@ static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
  * Returns the unconsumed cycles.
  */
 static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,
-						int shift)
+					u32 shift)
 {
 	u64 nsecps = (u64)NSEC_PER_SEC << tk->shift;
 	u64 raw_nsecs;
-- 
1.7.3.2.146.gca209



* Re: [PATCH 4/9] time: Update timekeeper structure using a local shadow
  2012-03-02  7:12 ` [PATCH 4/9] time: Update timekeeper structure using a local shadow John Stultz
@ 2012-03-02  7:38   ` Ingo Molnar
  2012-03-02  7:53     ` John Stultz
  0 siblings, 1 reply; 15+ messages in thread
From: Ingo Molnar @ 2012-03-02  7:38 UTC
  To: John Stultz; +Cc: lkml, Thomas Gleixner, Eric Dumazet, Richard Cochran


* John Stultz <john.stultz@linaro.org> wrote:

> Uses a local shadow structure to update the timekeeper. This
> will allow for reduced timekeeper.rlock hold time.
> 
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>  kernel/time/timekeeping.c |   43 +++++++++++++++++++++++--------------------
>  1 files changed, 23 insertions(+), 20 deletions(-)
> 
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index f9ee96c..09460c1 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -74,6 +74,7 @@ struct timekeeper {
>  };
>  
>  static struct timekeeper timekeeper;
> +static struct timekeeper shadow_tk;

Sigh.

As I said in the first round of review, it's fundamentally 
wrong to copy live fields like locks or the clocksource pointer 
around.

It's doubly wrong to do it in a global variable that no-one else 
but the copying function (update_wall_time()) is supposed to 
access.

There are over a dozen fields in 'struct timekeeper' - exactly 
which ones of them are used on this private copy, as 
update_wall_time() does the cycle accumulation and calls down 
into timekeeping_adjust()?

The right solution would be to separate timekeeping time state 
from global state:

struct timekeeper {
	spinlock_t		lock;

	struct time_state	time_state;
};

And then standardize the time calculation code on passing around 
not 'struct timekeeper *' but 'struct time_state *' ! Then you 
can have a local shadow copy of the global state:

	struct time_state time_state_copy;

and copy it from the global one and then pass it down to 
calculation functions.

This also gives the freedom to add other global state fields 
beyond the lock. (Right now the lock appears to be the only 
global state field - there might be more.)

Thanks,

	Ingo


* Re: [PATCH 4/9] time: Update timekeeper structure using a local shadow
  2012-03-02  7:38   ` Ingo Molnar
@ 2012-03-02  7:53     ` John Stultz
  2012-03-02  8:07       ` Ingo Molnar
  0 siblings, 1 reply; 15+ messages in thread
From: John Stultz @ 2012-03-02  7:53 UTC
  To: Ingo Molnar; +Cc: lkml, Thomas Gleixner, Eric Dumazet, Richard Cochran

On Fri, 2012-03-02 at 08:38 +0100, Ingo Molnar wrote:
> * John Stultz <john.stultz@linaro.org> wrote:
> 
> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > index f9ee96c..09460c1 100644
> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -74,6 +74,7 @@ struct timekeeper {
> >  };
> >  
> >  static struct timekeeper timekeeper;
> > +static struct timekeeper shadow_tk;
> 
> Sigh.
> 
> As I said it in the first round of review, it's fundamentally 
> wrong to copy live fields like locks or the clocksource pointer 
> around.

So I actually removed the locks out from the timekeeper structure to try
to address this concern.

> It's doubly wrong to do it in a global variable that no-one else 
> but the copying function (update_wall_time()) is supposed to 
> access.
> 
> There are over a dozen fields in 'struct timekeeper' - exactly 
> which ones of them are used on this private copy, as 
> update_wall_time() does the cycle accumulation and calls down 
> into timkeeping_adjust()?

Just about all of the timekeeper state is used and modified in
update_wall_time().

> The right solution would be to separate timekeeping time state 
> from global state:
> 
> struct timekeeper {
> 	spinlock_t		lock;
> 
> 	struct time_state	time_state;
> };
> 
> And then standardize the time calculation code on passing around 
> not 'struct timekeeper *' but 'struct time_state *' ! Then you 
> can have a local shadow copy of the global state:
> 
> 	struct time_state time_state_copy;
> 
> and copy it from the global one and then pass it down to 
> calculation functions.
> 
> This also gives the freedom to add other global state fields 
> beyond the lock. (Right now the lock appears to be the only 
> global state field - there might be more.)

So, just to be clear, you want me to push basically everything in the
timekeeper structure, except the lock (which would be re-added), into a
time_state sub-structure?

Sorry for being dense here (it's been a long day), but could you
clarify a bit more about the differences you're describing between
time-state and global-state wrt the timekeeper?

thanks
-john



* Re: [PATCH 4/9] time: Update timekeeper structure using a local shadow
  2012-03-02  7:53     ` John Stultz
@ 2012-03-02  8:07       ` Ingo Molnar
  0 siblings, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2012-03-02  8:07 UTC
  To: John Stultz; +Cc: lkml, Thomas Gleixner, Eric Dumazet, Richard Cochran


* John Stultz <john.stultz@linaro.org> wrote:

> On Fri, 2012-03-02 at 08:38 +0100, Ingo Molnar wrote:
> > * John Stultz <john.stultz@linaro.org> wrote:
> > 
> > > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > > index f9ee96c..09460c1 100644
> > > --- a/kernel/time/timekeeping.c
> > > +++ b/kernel/time/timekeeping.c
> > > @@ -74,6 +74,7 @@ struct timekeeper {
> > >  };
> > >  
> > >  static struct timekeeper timekeeper;
> > > +static struct timekeeper shadow_tk;
> > 
> > Sigh.
> > 
> > As I said it in the first round of review, it's fundamentally 
> > wrong to copy live fields like locks or the clocksource pointer 
> > around.
> 
> So I actually removed the locks out from the timekeeper 
> structure to try to address this concern.

The ->clock pointer is unused as well AFAICS [we pass in 
'offset'] - and that together with the lock is already two 
fields. It's 8 bytes copied back and forth unnecessarily, 
amongst other things.

But yes, moving those two fields out is an equivalent solution 
too - although I do agree with the whole clean-up direction of 
going away from standalone global variables and collecting those 
fields into a single structure. It's your call which one you 
prefer - but mixing the two types does not look clean to me.

> > It's doubly wrong to do it in a global variable that no-one 
> > else but the copying function (update_wall_time()) is 
> > supposed to access.
> > 
> > There are over a dozen fields in 'struct timekeeper' - 
> > exactly which ones of them are used on this private copy, as 
> > update_wall_time() does the cycle accumulation and calls 
> > down into timekeeping_adjust()?
> 
> Just about all of timekeeper state is used and modified in the 
> update_wall_time.
> 
> > The right solution would be to separate timekeeping time state 
> > from global state:
> > 
> > struct timekeeper {
> > 	spinlock_t		lock;
> > 
> > 	struct time_state	time_state;
> > };
> > 
> > And then standardize the time calculation code on passing around 
> > not 'struct timekeeper *' but 'struct time_state *' ! Then you 
> > can have a local shadow copy of the global state:
> > 
> > 	struct time_state time_state_copy;
> > 
> > and copy it from the global one and then pass it down to 
> > calculation functions.
> > 
> > This also gives the freedom to add other global state fields 
> > beyond the lock. (Right now the lock appears to be the only 
> > global state field - there might be more.)
> 
> So, just to be clear, you want me to push basically everything 
> in the timekeeper structure, except the lock (which would be 
> re-added), into a time_state sub-structure?

Moving the lock and any other field not used internally out of 
it is fine as well - plus not using a global shadow_copy but 
making it local to update_wall_time().

Thanks,

	Ingo


* Re: [PATCH 1/9] time: Condense timekeeper.xtime into xtime_sec
  2012-03-02  7:12 ` [PATCH 1/9] time: Condense timekeeper.xtime into xtime_sec John Stultz
@ 2012-03-02 21:13   ` Christoph Lameter
  2012-03-03  5:54     ` Richard Cochran
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Lameter @ 2012-03-02 21:13 UTC
  To: John Stultz
  Cc: lkml, Ingo Molnar, Thomas Gleixner, Eric Dumazet, Richard Cochran

On Thu, 1 Mar 2012, John Stultz wrote:

> +static inline void tk_normalize_xtime(struct timekeeper *tk)
> +{
> +	while (tk->xtime_nsec >= ((u64)NSEC_PER_SEC << tk->shift)) {
> +		tk->xtime_nsec -= (u64)NSEC_PER_SEC << tk->shift;
> +		tk->xtime_sec++;
> +	}
> +}

Could we avoid the loop?

y = ((u64)NSEC_PER_SEC << tk->shift));
tk->xtime_sec += tk->xtime_nsec / y;
tk->xtime_nsec %= y;



* Re: [PATCH 1/9] time: Condense timekeeper.xtime into xtime_sec
  2012-03-02 21:13   ` Christoph Lameter
@ 2012-03-03  5:54     ` Richard Cochran
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Cochran @ 2012-03-03  5:54 UTC
  To: Christoph Lameter
  Cc: John Stultz, lkml, Ingo Molnar, Thomas Gleixner, Eric Dumazet

On Fri, Mar 02, 2012 at 03:13:25PM -0600, Christoph Lameter wrote:
> On Thu, 1 Mar 2012, John Stultz wrote:
> 
> > +static inline void tk_normalize_xtime(struct timekeeper *tk)
> > +{
> > +	while (tk->xtime_nsec >= ((u64)NSEC_PER_SEC << tk->shift)) {
> > +		tk->xtime_nsec -= (u64)NSEC_PER_SEC << tk->shift;
> > +		tk->xtime_sec++;
> > +	}
> > +}
> 
> Could we avoid the loop?
> 
> y = ((u64)NSEC_PER_SEC << tk->shift));
> tk->xtime_sec += tk->xtime_nsec / y;
> tk->xtime_nsec %= y;

But the two divisions are more costly than addition, subtraction, and
shift.

(Normally the code loops just once.)

Richard


Thread overview: 15+ messages
2012-03-02  7:12 [PATCH 0/9] Reduce timekeeping lock hold time v2 John Stultz
2012-03-02  7:12 ` [PATCH 1/9] time: Condense timekeeper.xtime into xtime_sec John Stultz
2012-03-02 21:13   ` Christoph Lameter
2012-03-03  5:54     ` Richard Cochran
2012-03-02  7:12 ` [PATCH 2/9] time: Rework timekeeping functions to take timekeeper ptr as argument John Stultz
2012-03-02  7:12 ` [PATCH 3/9] time: Split timekeeper lock into separate reader/writer locks John Stultz
2012-03-02  7:12 ` [PATCH 4/9] time: Update timekeeper structure using a local shadow John Stultz
2012-03-02  7:38   ` Ingo Molnar
2012-03-02  7:53     ` John Stultz
2012-03-02  8:07       ` Ingo Molnar
2012-03-02  7:12 ` [PATCH 5/9] time: Shadow cycle_last in timekeeper structure John Stultz
2012-03-02  7:12 ` [PATCH 6/9] time: Reduce timekeeper read lock hold time John Stultz
2012-03-02  7:12 ` [PATCH 7/9] time: Convert the timekeeper's wlock to a raw_spin_lock John Stultz
2012-03-02  7:12 ` [PATCH 8/9] time: Whitespace cleanups per Ingo's requests John Stultz
2012-03-02  7:12 ` [PATCH 9/9] time: Explicitly use u32 instead of int for shift values John Stultz
