* [PATCH 0/2] rcu/nocb fix and optimization
@ 2022-10-10 22:39 Frederic Weisbecker
  2022-10-10 22:39 ` [PATCH 1/2] rcu: Fix missing nocb gp wake on rcu_barrier() Frederic Weisbecker
  2022-10-10 22:39 ` [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue Frederic Weisbecker
  0 siblings, 2 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2022-10-10 22:39 UTC
  To: Paul E . McKenney
  Cc: LKML, Frederic Weisbecker, Joel Fernandes, rcu, Neeraj Upadhyay,
	Uladzislau Rezki

I wanted to send this a few days ago but then I faced a TREE01 stall
(rcu_barrier() related) and I just couldn't remember whether it happened
before or after these patches... Finally looking at the log, I found the
head it was running on: e8c56bb5baa6969bf2d01b8619f22f5a71818497 (an old
RCU:dev head from more than a month ago, way before these patches). Phew!

Anyway I re-launched TREE01 many times on the latest RCU:dev, both before
and after these patches, and it has run for more than 600 hours both ways
without any issue...

Frederic Weisbecker (2):
  rcu: Fix missing nocb gp wake on rcu_barrier()
  rcu/nocb: Spare bypass locking upon normal enqueue

 kernel/rcu/tree.c      |  6 ++++++
 kernel/rcu/tree.h      |  1 +
 kernel/rcu/tree_nocb.h | 11 +++++++++--
 3 files changed, 16 insertions(+), 2 deletions(-)

-- 
2.25.1



* [PATCH 1/2] rcu: Fix missing nocb gp wake on rcu_barrier()
  2022-10-10 22:39 [PATCH 0/2] rcu/nocb fix and optimization Frederic Weisbecker
@ 2022-10-10 22:39 ` Frederic Weisbecker
  2022-10-11  2:01   ` Joel Fernandes
  2022-10-10 22:39 ` [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue Frederic Weisbecker
  1 sibling, 1 reply; 12+ messages in thread
From: Frederic Weisbecker @ 2022-10-10 22:39 UTC
  To: Paul E . McKenney; +Cc: LKML, Frederic Weisbecker, Joel Fernandes

Upon entraining a callback to a NOCB CPU, no further wake up is
issued on the corresponding nocb_gp kthread. As a result, the callback
and all the subsequent ones on that CPU may be ignored, at least until
an RCU_NOCB_WAKE_FORCE timer is ever armed or another NOCB CPU belonging
to the same group enqueues a callback on an empty queue.

Here is a possible bad scenario:

1) CPU 0 is NOCB unlike all other CPUs.
2) CPU 0 queues a callback.
3) The grace period related to that callback elapses.
4) The callback is moved to the done list (but is not invoked yet);
   there are no more pending callbacks for CPU 0.
5) CPU 1 calls rcu_barrier() and sends an IPI to CPU 0.
6) CPU 0 entrains the callback but doesn't wake up nocb_gp.
7) CPU 1 blocks forever, unless CPU 0 ever queues enough further
   callbacks to arm an RCU_NOCB_WAKE_FORCE timer.

Make sure the necessary wake up is produced whenever necessary.

Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Fixes: 5d6742b37727 ("rcu/nocb: Use rcu_segcblist for no-CBs CPUs")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree.c      | 6 ++++++
 kernel/rcu/tree.h      | 1 +
 kernel/rcu/tree_nocb.h | 5 +++++
 3 files changed, 12 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 96d678c9cfb6..025f59f6f97f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3914,6 +3914,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
 {
 	unsigned long gseq = READ_ONCE(rcu_state.barrier_sequence);
 	unsigned long lseq = READ_ONCE(rdp->barrier_seq_snap);
+	bool wake_nocb = false;
+	bool was_alldone = false;
 
 	lockdep_assert_held(&rcu_state.barrier_lock);
 	if (rcu_seq_state(lseq) || !rcu_seq_state(gseq) || rcu_seq_ctr(lseq) != rcu_seq_ctr(gseq))
@@ -3922,6 +3924,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
 	rdp->barrier_head.func = rcu_barrier_callback;
 	debug_rcu_head_queue(&rdp->barrier_head);
 	rcu_nocb_lock(rdp);
+	was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
 	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
 	if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
 		atomic_inc(&rcu_state.barrier_cpu_count);
@@ -3929,7 +3932,10 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
 		debug_rcu_head_unqueue(&rdp->barrier_head);
 		rcu_barrier_trace(TPS("IRQNQ"), -1, rcu_state.barrier_sequence);
 	}
+	wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist);
 	rcu_nocb_unlock(rdp);
+	if (wake_nocb)
+		wake_nocb_gp(rdp, false);
 	smp_store_release(&rdp->barrier_seq_snap, gseq);
 }
 
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index d4a97e40ea9c..925dd98f8b23 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -439,6 +439,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp);
 static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
 static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
 static void rcu_init_one_nocb(struct rcu_node *rnp);
+static bool wake_nocb_gp(struct rcu_data *rdp, bool force);
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 				  unsigned long j);
 static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index f77a6d7e1356..094fd454b6c3 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1558,6 +1558,11 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
 {
 }
 
+static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
+{
+	return false;
+}
+
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 				  unsigned long j)
 {
-- 
2.25.1



* [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue
  2022-10-10 22:39 [PATCH 0/2] rcu/nocb fix and optimization Frederic Weisbecker
  2022-10-10 22:39 ` [PATCH 1/2] rcu: Fix missing nocb gp wake on rcu_barrier() Frederic Weisbecker
@ 2022-10-10 22:39 ` Frederic Weisbecker
  2022-10-11  2:00   ` Joel Fernandes
  1 sibling, 1 reply; 12+ messages in thread
From: Frederic Weisbecker @ 2022-10-10 22:39 UTC
  To: Paul E . McKenney; +Cc: LKML, Frederic Weisbecker, Joel Fernandes

When a callback is to be enqueued to the normal queue and not the bypass
one, a flush to the bypass queue is always tried anyway. This attempt
involves locking the bypass lock unconditionally. Although it is
guaranteed not to be contended at this point, because only call_rcu()
can lock the bypass lock without holding the nocb lock, it's still not
free and the operation can easily be spared most of the time by just
checking if the bypass list is empty. The check is safe as nobody can
queue nor flush the bypass concurrently.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tree_nocb.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 094fd454b6c3..30c3d473ffd8 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -423,8 +423,10 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 		if (*was_alldone)
 			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 					    TPS("FirstQ"));
-		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
-		WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
+		if (rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
+			WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
+			WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
+		}
 		return false; // Caller must enqueue the callback.
 	}
 
-- 
2.25.1



* Re: [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue
  2022-10-10 22:39 ` [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue Frederic Weisbecker
@ 2022-10-11  2:00   ` Joel Fernandes
  2022-10-11  2:08     ` Joel Fernandes
  2022-10-11 19:21     ` Frederic Weisbecker
  0 siblings, 2 replies; 12+ messages in thread
From: Joel Fernandes @ 2022-10-11  2:00 UTC
  To: Frederic Weisbecker; +Cc: Paul E . McKenney, LKML

On Tue, Oct 11, 2022 at 12:39:56AM +0200, Frederic Weisbecker wrote:
> When a callback is to be enqueued to the normal queue and not the bypass
> one, a flush to the bypass queue is always tried anyway. This attempt
> involves locking the bypass lock unconditionally. Although it is
> guaranteed not to be contended at this point, because only call_rcu()
> can lock the bypass lock without holding the nocb lock, it's still not
> free and the operation can easily be spared most of the time by just
> checking if the bypass list is empty. The check is safe as nobody can
> queue nor flush the bypass concurrently.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> ---
>  kernel/rcu/tree_nocb.h | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 094fd454b6c3..30c3d473ffd8 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -423,8 +423,10 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>  		if (*was_alldone)
>  			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>  					    TPS("FirstQ"));
> -		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> -		WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> +		if (rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
> +			WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> +			WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> +		}
>  		return false; // Caller must enqueue the callback.
>  	}

Instead of this, since as you mentioned that the bypass lock is not contended
in this path, isn't it unnecessary to even check or attempt to acquire the
lock in call_rcu() path? So how about something like the following, or would
this not work for some reason?

Thanks.

---8<-----------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index ad8d4e52ae92..6235e72cca07 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3950,7 +3950,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
 	debug_rcu_head_queue(&rdp->barrier_head);
 	rcu_nocb_lock(rdp);
 	was_done = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
-	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
+	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false, false));
 	if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
 		atomic_inc(&rcu_state.barrier_cpu_count);
 	} else {
@@ -4379,7 +4379,7 @@ void rcutree_migrate_callbacks(int cpu)
 	my_rdp = this_cpu_ptr(&rcu_data);
 	my_rnp = my_rdp->mynode;
 	rcu_nocb_lock(my_rdp); /* irqs already disabled. */
-	WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies, false));
+	WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies, false, false));
 	raw_spin_lock_rcu_node(my_rnp); /* irqs already disabled. */
 	/* Leverage recent GPs and set GP for new callbacks. */
 	needwake = rcu_advance_cbs(my_rnp, rdp) ||
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 1d803d39f0d1..0adb8f97a56d 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -442,7 +442,7 @@ static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
 static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
 static void rcu_init_one_nocb(struct rcu_node *rnp);
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				  unsigned long j, bool lazy);
+				  unsigned long j, bool lazy, bool nolock);
 static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 				bool *was_alldone, unsigned long flags,
 				bool lazy);
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index c9a791407650..2164f5d79dec 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -328,7 +328,7 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
  * Note that this function always returns true if rhp is NULL.
  */
 static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_in,
-				     unsigned long j, bool lazy)
+				     unsigned long j, bool lazy, bool nolock)
 {
 	struct rcu_cblist rcl;
 	struct rcu_head *rhp = rhp_in;
@@ -359,7 +359,8 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_
 
 	rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
 	WRITE_ONCE(rdp->nocb_bypass_first, j);
-	rcu_nocb_bypass_unlock(rdp);
+	if (!nolock)
+		rcu_nocb_bypass_unlock(rdp);
 	return true;
 }
 
@@ -372,13 +373,14 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_
  * Note that this function always returns true if rhp is NULL.
  */
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				  unsigned long j, bool lazy)
+				  unsigned long j, bool lazy, bool nolock)
 {
 	if (!rcu_rdp_is_offloaded(rdp))
 		return true;
 	rcu_lockdep_assert_cblist_protected(rdp);
-	rcu_nocb_bypass_lock(rdp);
-	return rcu_nocb_do_flush_bypass(rdp, rhp, j, lazy);
+	if (!nolock)
+		rcu_nocb_bypass_lock(rdp);
+	return rcu_nocb_do_flush_bypass(rdp, rhp, j, lazy, nolock);
 }
 
 /*
@@ -391,7 +393,7 @@ static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j)
 	if (!rcu_rdp_is_offloaded(rdp) ||
 	    !rcu_nocb_bypass_trylock(rdp))
 		return;
-	WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j, false));
+	WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j, false, false));
 }
 
 /*
@@ -473,7 +475,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 					    TPS("FirstQ"));
 
-		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j, false));
+		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j, false, true));
 		WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
 		return false; // Caller must enqueue the callback.
 	}
@@ -487,7 +489,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 		rcu_nocb_lock(rdp);
 		*was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
 
-		if (!rcu_nocb_flush_bypass(rdp, rhp, j, lazy)) {
+		if (!rcu_nocb_flush_bypass(rdp, rhp, j, lazy, true)) {
 			if (*was_alldone)
 				trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
 						    TPS("FirstQ"));
@@ -1136,7 +1138,7 @@ static long rcu_nocb_rdp_deoffload(void *arg)
 	 * return false, which means that future calls to rcu_nocb_try_bypass()
 	 * will refuse to put anything into the bypass.
 	 */
-	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
+	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false, false));
 	/*
 	 * Start with invoking rcu_core() early. This way if the current thread
 	 * happens to preempt an ongoing call to rcu_core() in the middle,
@@ -1717,7 +1719,7 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
 }
 
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-				  unsigned long j, bool lazy)
+				  unsigned long j, bool lazy, bool nolock)
 {
 	return true;
 }


* Re: [PATCH 1/2] rcu: Fix missing nocb gp wake on rcu_barrier()
  2022-10-10 22:39 ` [PATCH 1/2] rcu: Fix missing nocb gp wake on rcu_barrier() Frederic Weisbecker
@ 2022-10-11  2:01   ` Joel Fernandes
  2022-10-11  7:25     ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Joel Fernandes @ 2022-10-11  2:01 UTC
  To: Frederic Weisbecker; +Cc: Paul E . McKenney, LKML

On Tue, Oct 11, 2022 at 12:39:55AM +0200, Frederic Weisbecker wrote:
> Upon entraining a callback to a NOCB CPU, no further wake up is
> issued on the corresponding nocb_gp kthread. As a result, the callback
> and all the subsequent ones on that CPU may be ignored, at least until
> an RCU_NOCB_WAKE_FORCE timer is ever armed or another NOCB CPU belonging
> to the same group enqueues a callback on an empty queue.
> 
> Here is a possible bad scenario:
> 
> 1) CPU 0 is NOCB unlike all other CPUs.
> 2) CPU 0 queues a callback.
> 3) The grace period related to that callback elapses.
> 4) The callback is moved to the done list (but is not invoked yet);
>    there are no more pending callbacks for CPU 0.
> 5) CPU 1 calls rcu_barrier() and sends an IPI to CPU 0.
> 6) CPU 0 entrains the callback but doesn't wake up nocb_gp.
> 7) CPU 1 blocks forever, unless CPU 0 ever queues enough further
>    callbacks to arm an RCU_NOCB_WAKE_FORCE timer.
> 
> Make sure the necessary wake up is produced whenever necessary.
> 
> Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> Fixes: 5d6742b37727 ("rcu/nocb: Use rcu_segcblist for no-CBs CPUs")
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>

And if Paul is taking this, I'll rebase and drop this patch from the lazy
series.

thanks,

 - Joel


> ---
>  kernel/rcu/tree.c      | 6 ++++++
>  kernel/rcu/tree.h      | 1 +
>  kernel/rcu/tree_nocb.h | 5 +++++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 96d678c9cfb6..025f59f6f97f 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3914,6 +3914,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
>  {
>  	unsigned long gseq = READ_ONCE(rcu_state.barrier_sequence);
>  	unsigned long lseq = READ_ONCE(rdp->barrier_seq_snap);
> +	bool wake_nocb = false;
> +	bool was_alldone = false;
>  
>  	lockdep_assert_held(&rcu_state.barrier_lock);
>  	if (rcu_seq_state(lseq) || !rcu_seq_state(gseq) || rcu_seq_ctr(lseq) != rcu_seq_ctr(gseq))
> @@ -3922,6 +3924,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
>  	rdp->barrier_head.func = rcu_barrier_callback;
>  	debug_rcu_head_queue(&rdp->barrier_head);
>  	rcu_nocb_lock(rdp);
> +	was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
>  	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
>  	if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
>  		atomic_inc(&rcu_state.barrier_cpu_count);
> @@ -3929,7 +3932,10 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
>  		debug_rcu_head_unqueue(&rdp->barrier_head);
>  		rcu_barrier_trace(TPS("IRQNQ"), -1, rcu_state.barrier_sequence);
>  	}
> +	wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist);
>  	rcu_nocb_unlock(rdp);
> +	if (wake_nocb)
> +		wake_nocb_gp(rdp, false);
>  	smp_store_release(&rdp->barrier_seq_snap, gseq);
>  }
>  
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index d4a97e40ea9c..925dd98f8b23 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -439,6 +439,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp);
>  static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
>  static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
>  static void rcu_init_one_nocb(struct rcu_node *rnp);
> +static bool wake_nocb_gp(struct rcu_data *rdp, bool force);
>  static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>  				  unsigned long j);
>  static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index f77a6d7e1356..094fd454b6c3 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1558,6 +1558,11 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
>  {
>  }
>  
> +static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
> +{
> +	return false;
> +}
> +
>  static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>  				  unsigned long j)
>  {
> -- 
> 2.25.1
> 


* Re: [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue
  2022-10-11  2:00   ` Joel Fernandes
@ 2022-10-11  2:08     ` Joel Fernandes
  2022-10-11 19:21     ` Frederic Weisbecker
  1 sibling, 0 replies; 12+ messages in thread
From: Joel Fernandes @ 2022-10-11  2:08 UTC
  To: Frederic Weisbecker; +Cc: Paul E . McKenney, LKML

On Mon, Oct 10, 2022 at 10:00 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Tue, Oct 11, 2022 at 12:39:56AM +0200, Frederic Weisbecker wrote:
> > When a callback is to be enqueued to the normal queue and not the bypass
> > one, a flush to the bypass queue is always tried anyway. This attempt
> > involves locking the bypass lock unconditionally. Although it is
> > guaranteed not to be contended at this point, because only call_rcu()
> > can lock the bypass lock without holding the nocb lock, it's still not
> > free and the operation can easily be spared most of the time by just
> > checking if the bypass list is empty. The check is safe as nobody can
> > queue nor flush the bypass concurrently.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >  kernel/rcu/tree_nocb.h | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index 094fd454b6c3..30c3d473ffd8 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -423,8 +423,10 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> >               if (*was_alldone)
> >                       trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
> >                                           TPS("FirstQ"));
> > -             WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > -             WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > +             if (rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
> > +                     WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > +                     WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > +             }
> >               return false; // Caller must enqueue the callback.
> >       }
>
> Instead of this, since as you mentioned that the bypass lock is not contended
> in this path, isn't it unnecessary to even check or attempt to acquire the
> lock in call_rcu() path? So how about something like the following, or would
> this not work for some reason?
>
> Thanks.
>
> ---8<-----------------------
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c

If this is too ugly, perhaps a new rcu_nocb_flush_bypass_locked()
function could be called from rcu_nocb_try_flush_bypass() while
keeping all other call sites as-is.
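
For concreteness, a minimal sketch of one way such a helper could be wired
up (illustrative only, not from any posted patch; the flush body is elided
and assumed to be today's rcu_nocb_do_flush_bypass() minus its final
rcu_nocb_bypass_unlock()):

static bool rcu_nocb_flush_bypass_locked(struct rcu_data *rdp,
					 struct rcu_head *rhp,
					 unsigned long j, bool lazy)
{
	rcu_lockdep_assert_cblist_protected(rdp);
	/*
	 * Elided: move ->nocb_bypass (and rhp, if any) into ->cblist,
	 * exactly as rcu_nocb_do_flush_bypass() does today, but without
	 * touching the bypass lock in either direction.
	 */
	return true;
}

static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
				  unsigned long j, bool lazy)
{
	bool ret;

	if (!rcu_rdp_is_offloaded(rdp))
		return true;
	/* Unchanged call sites keep the usual lock/unlock around the core. */
	rcu_nocb_bypass_lock(rdp);
	ret = rcu_nocb_flush_bypass_locked(rdp, rhp, j, lazy);
	rcu_nocb_bypass_unlock(rdp);
	return ret;
}

The call_rcu() fast path could then call the _locked variant directly,
relying on the "nobody can queue nor flush the bypass concurrently"
argument from the changelog, instead of threading a nolock flag through
the existing functions.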

thanks,

 - Joel


> index ad8d4e52ae92..6235e72cca07 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3950,7 +3950,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
>         debug_rcu_head_queue(&rdp->barrier_head);
>         rcu_nocb_lock(rdp);
>         was_done = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
> -       WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
> +       WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false, false));
>         if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
>                 atomic_inc(&rcu_state.barrier_cpu_count);
>         } else {
> @@ -4379,7 +4379,7 @@ void rcutree_migrate_callbacks(int cpu)
>         my_rdp = this_cpu_ptr(&rcu_data);
>         my_rnp = my_rdp->mynode;
>         rcu_nocb_lock(my_rdp); /* irqs already disabled. */
> -       WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies, false));
> +       WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies, false, false));
>         raw_spin_lock_rcu_node(my_rnp); /* irqs already disabled. */
>         /* Leverage recent GPs and set GP for new callbacks. */
>         needwake = rcu_advance_cbs(my_rnp, rdp) ||
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 1d803d39f0d1..0adb8f97a56d 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -442,7 +442,7 @@ static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
>  static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
>  static void rcu_init_one_nocb(struct rcu_node *rnp);
>  static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> -                                 unsigned long j, bool lazy);
> +                                 unsigned long j, bool lazy, bool nolock);
>  static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>                                 bool *was_alldone, unsigned long flags,
>                                 bool lazy);
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index c9a791407650..2164f5d79dec 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -328,7 +328,7 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
>   * Note that this function always returns true if rhp is NULL.
>   */
>  static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_in,
> -                                    unsigned long j, bool lazy)
> +                                    unsigned long j, bool lazy, bool nolock)
>  {
>         struct rcu_cblist rcl;
>         struct rcu_head *rhp = rhp_in;
> @@ -359,7 +359,8 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_
>
>         rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
>         WRITE_ONCE(rdp->nocb_bypass_first, j);
> -       rcu_nocb_bypass_unlock(rdp);
> +       if (!nolock)
> +               rcu_nocb_bypass_unlock(rdp);
>         return true;
>  }
>
> @@ -372,13 +373,14 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_
>   * Note that this function always returns true if rhp is NULL.
>   */
>  static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> -                                 unsigned long j, bool lazy)
> +                                 unsigned long j, bool lazy, bool nolock)
>  {
>         if (!rcu_rdp_is_offloaded(rdp))
>                 return true;
>         rcu_lockdep_assert_cblist_protected(rdp);
> -       rcu_nocb_bypass_lock(rdp);
> -       return rcu_nocb_do_flush_bypass(rdp, rhp, j, lazy);
> +       if (!nolock)
> +               rcu_nocb_bypass_lock(rdp);
> +       return rcu_nocb_do_flush_bypass(rdp, rhp, j, lazy, nolock);
>  }
>
>  /*
> @@ -391,7 +393,7 @@ static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j)
>         if (!rcu_rdp_is_offloaded(rdp) ||
>             !rcu_nocb_bypass_trylock(rdp))
>                 return;
> -       WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j, false));
> +       WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j, false, false));
>  }
>
>  /*
> @@ -473,7 +475,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>                         trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>                                             TPS("FirstQ"));
>
> -               WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j, false));
> +               WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j, false, true));
>                 WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
>                 return false; // Caller must enqueue the callback.
>         }
> @@ -487,7 +489,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>                 rcu_nocb_lock(rdp);
>                 *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
>
> -               if (!rcu_nocb_flush_bypass(rdp, rhp, j, lazy)) {
> +               if (!rcu_nocb_flush_bypass(rdp, rhp, j, lazy, true)) {
>                         if (*was_alldone)
>                                 trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>                                                     TPS("FirstQ"));
> @@ -1136,7 +1138,7 @@ static long rcu_nocb_rdp_deoffload(void *arg)
>          * return false, which means that future calls to rcu_nocb_try_bypass()
>          * will refuse to put anything into the bypass.
>          */
> -       WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
> +       WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false, false));
>         /*
>          * Start with invoking rcu_core() early. This way if the current thread
>          * happens to preempt an ongoing call to rcu_core() in the middle,
> @@ -1717,7 +1719,7 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
>  }
>
>  static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> -                                 unsigned long j, bool lazy)
> +                                 unsigned long j, bool lazy, bool nolock)
>  {
>         return true;
>  }


* Re: [PATCH 1/2] rcu: Fix missing nocb gp wake on rcu_barrier()
  2022-10-11  2:01   ` Joel Fernandes
@ 2022-10-11  7:25     ` Paul E. McKenney
  2022-10-11  7:33       ` Joel Fernandes
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2022-10-11  7:25 UTC
  To: Joel Fernandes; +Cc: Frederic Weisbecker, LKML

On Tue, Oct 11, 2022 at 02:01:23AM +0000, Joel Fernandes wrote:
> On Tue, Oct 11, 2022 at 12:39:55AM +0200, Frederic Weisbecker wrote:
> > Upon entraining a callback to a NOCB CPU, no further wake up is
> > issued on the corresponding nocb_gp kthread. As a result, the callback
> > and all the subsequent ones on that CPU may be ignored, at least until
> > an RCU_NOCB_WAKE_FORCE timer is ever armed or another NOCB CPU belonging
> > to the same group enqueues a callback on an empty queue.
> > 
> > Here is a possible bad scenario:
> > 
> > 1) CPU 0 is NOCB unlike all other CPUs.
> > 2) CPU 0 queues a callback.
> > 3) The grace period related to that callback elapses.
> > 4) The callback is moved to the done list (but is not invoked yet);
> >    there are no more pending callbacks for CPU 0.
> > 5) CPU 1 calls rcu_barrier() and sends an IPI to CPU 0.
> > 6) CPU 0 entrains the callback but doesn't wake up nocb_gp.
> > 7) CPU 1 blocks forever, unless CPU 0 ever queues enough further
> >    callbacks to arm an RCU_NOCB_WAKE_FORCE timer.
> > 
> > Make sure the necessary wake up is produced whenever necessary.
> > 
> > Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > Fixes: 5d6742b37727 ("rcu/nocb: Use rcu_segcblist for no-CBs CPUs")
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> 
> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> 
> And if Paul is taking this, I'll rebase and drop this patch from the lazy
> series.

Joel, could you please incorporate this into your series?  My internet
access is likely to be a bit iffy over the next few days.  Likely no
problem for email and the occasional test-system access, but best not
to take it for granted.  ;-)

							Thanx, Paul

> thanks,
> 
>  - Joel
> 
> 
> > ---
> >  kernel/rcu/tree.c      | 6 ++++++
> >  kernel/rcu/tree.h      | 1 +
> >  kernel/rcu/tree_nocb.h | 5 +++++
> >  3 files changed, 12 insertions(+)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 96d678c9cfb6..025f59f6f97f 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3914,6 +3914,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
> >  {
> >  	unsigned long gseq = READ_ONCE(rcu_state.barrier_sequence);
> >  	unsigned long lseq = READ_ONCE(rdp->barrier_seq_snap);
> > +	bool wake_nocb = false;
> > +	bool was_alldone = false;
> >  
> >  	lockdep_assert_held(&rcu_state.barrier_lock);
> >  	if (rcu_seq_state(lseq) || !rcu_seq_state(gseq) || rcu_seq_ctr(lseq) != rcu_seq_ctr(gseq))
> > @@ -3922,6 +3924,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
> >  	rdp->barrier_head.func = rcu_barrier_callback;
> >  	debug_rcu_head_queue(&rdp->barrier_head);
> >  	rcu_nocb_lock(rdp);
> > +	was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
> >  	WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
> >  	if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
> >  		atomic_inc(&rcu_state.barrier_cpu_count);
> > @@ -3929,7 +3932,10 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
> >  		debug_rcu_head_unqueue(&rdp->barrier_head);
> >  		rcu_barrier_trace(TPS("IRQNQ"), -1, rcu_state.barrier_sequence);
> >  	}
> > +	wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist);
> >  	rcu_nocb_unlock(rdp);
> > +	if (wake_nocb)
> > +		wake_nocb_gp(rdp, false);
> >  	smp_store_release(&rdp->barrier_seq_snap, gseq);
> >  }
> >  
> > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> > index d4a97e40ea9c..925dd98f8b23 100644
> > --- a/kernel/rcu/tree.h
> > +++ b/kernel/rcu/tree.h
> > @@ -439,6 +439,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp);
> >  static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
> >  static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
> >  static void rcu_init_one_nocb(struct rcu_node *rnp);
> > +static bool wake_nocb_gp(struct rcu_data *rdp, bool force);
> >  static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> >  				  unsigned long j);
> >  static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index f77a6d7e1356..094fd454b6c3 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -1558,6 +1558,11 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
> >  {
> >  }
> >  
> > +static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
> > +{
> > +	return false;
> > +}
> > +
> >  static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> >  				  unsigned long j)
> >  {
> > -- 
> > 2.25.1
> > 


* Re: [PATCH 1/2] rcu: Fix missing nocb gp wake on rcu_barrier()
  2022-10-11  7:25     ` Paul E. McKenney
@ 2022-10-11  7:33       ` Joel Fernandes
  0 siblings, 0 replies; 12+ messages in thread
From: Joel Fernandes @ 2022-10-11  7:33 UTC
  To: paulmck; +Cc: Frederic Weisbecker, LKML



> On Oct 11, 2022, at 3:25 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> On Tue, Oct 11, 2022 at 02:01:23AM +0000, Joel Fernandes wrote:
>>> On Tue, Oct 11, 2022 at 12:39:55AM +0200, Frederic Weisbecker wrote:
>>> Upon entraining a callback to a NOCB CPU, no further wake up is
>>> issued on the corresponding nocb_gp kthread. As a result, the callback
>>> and all the subsequent ones on that CPU may be ignored, at least until
>>> an RCU_NOCB_WAKE_FORCE timer is ever armed or another NOCB CPU belonging
>>> to the same group enqueues a callback on an empty queue.
>>> 
>>> Here is a possible bad scenario:
>>> 
>>> 1) CPU 0 is NOCB unlike all other CPUs.
>>> 2) CPU 0 queues a callback.
>>> 3) The grace period related to that callback elapses.
>>> 4) The callback is moved to the done list (but is not invoked yet);
>>>   there are no more pending callbacks for CPU 0.
>>> 5) CPU 1 calls rcu_barrier() and sends an IPI to CPU 0.
>>> 6) CPU 0 entrains the callback but doesn't wake up nocb_gp.
>>> 7) CPU 1 blocks forever, unless CPU 0 ever queues enough further
>>>   callbacks to arm an RCU_NOCB_WAKE_FORCE timer.
>>> 
>>> Make sure the necessary wake up is produced whenever necessary.
>>> 
>>> Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
>>> Fixes: 5d6742b37727 ("rcu/nocb: Use rcu_segcblist for no-CBs CPUs")
>>> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
>> 
>> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
>> 
>> And if Paul is taking this, I'll rebase and drop this patch from the lazy
>> series.
> 
> Joel, could you please incorporate this into your series?  My internet
> access is likely to be a bit iffy over the next few days.  Likely no
> problem for email and the occasional test-system access, but best not
> to take it for granted.  ;-)

Sure, I’ll do that. Thanks.

Fingers crossed on the internet ;-)

Thanks,

 - Joel


> 
>                            Thanx, Paul
> 
>> thanks,
>> 
>> - Joel
>> 
>> 
>>> ---
>>> kernel/rcu/tree.c      | 6 ++++++
>>> kernel/rcu/tree.h      | 1 +
>>> kernel/rcu/tree_nocb.h | 5 +++++
>>> 3 files changed, 12 insertions(+)
>>> 
>>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>>> index 96d678c9cfb6..025f59f6f97f 100644
>>> --- a/kernel/rcu/tree.c
>>> +++ b/kernel/rcu/tree.c
>>> @@ -3914,6 +3914,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
>>> {
>>>    unsigned long gseq = READ_ONCE(rcu_state.barrier_sequence);
>>>    unsigned long lseq = READ_ONCE(rdp->barrier_seq_snap);
>>> +    bool wake_nocb = false;
>>> +    bool was_alldone = false;
>>> 
>>>    lockdep_assert_held(&rcu_state.barrier_lock);
>>>    if (rcu_seq_state(lseq) || !rcu_seq_state(gseq) || rcu_seq_ctr(lseq) != rcu_seq_ctr(gseq))
>>> @@ -3922,6 +3924,7 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
>>>    rdp->barrier_head.func = rcu_barrier_callback;
>>>    debug_rcu_head_queue(&rdp->barrier_head);
>>>    rcu_nocb_lock(rdp);
>>> +    was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
>>>    WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
>>>    if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
>>>        atomic_inc(&rcu_state.barrier_cpu_count);
>>> @@ -3929,7 +3932,10 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
>>>        debug_rcu_head_unqueue(&rdp->barrier_head);
>>>        rcu_barrier_trace(TPS("IRQNQ"), -1, rcu_state.barrier_sequence);
>>>    }
>>> +    wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist);
>>>    rcu_nocb_unlock(rdp);
>>> +    if (wake_nocb)
>>> +        wake_nocb_gp(rdp, false);
>>>    smp_store_release(&rdp->barrier_seq_snap, gseq);
>>> }
>>> 
>>> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
>>> index d4a97e40ea9c..925dd98f8b23 100644
>>> --- a/kernel/rcu/tree.h
>>> +++ b/kernel/rcu/tree.h
>>> @@ -439,6 +439,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp);
>>> static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
>>> static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
>>> static void rcu_init_one_nocb(struct rcu_node *rnp);
>>> +static bool wake_nocb_gp(struct rcu_data *rdp, bool force);
>>> static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>>                  unsigned long j);
>>> static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
>>> index f77a6d7e1356..094fd454b6c3 100644
>>> --- a/kernel/rcu/tree_nocb.h
>>> +++ b/kernel/rcu/tree_nocb.h
>>> @@ -1558,6 +1558,11 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
>>> {
>>> }
>>> 
>>> +static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
>>> +{
>>> +    return false;
>>> +}
>>> +
>>> static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>>                  unsigned long j)
>>> {
>>> -- 
>>> 2.25.1
>>> 


* Re: [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue
  2022-10-11  2:00   ` Joel Fernandes
  2022-10-11  2:08     ` Joel Fernandes
@ 2022-10-11 19:21     ` Frederic Weisbecker
  2022-10-11 23:47       ` Joel Fernandes
  1 sibling, 1 reply; 12+ messages in thread
From: Frederic Weisbecker @ 2022-10-11 19:21 UTC
  To: Joel Fernandes; +Cc: Paul E . McKenney, LKML

On Tue, Oct 11, 2022 at 02:00:40AM +0000, Joel Fernandes wrote:
> On Tue, Oct 11, 2022 at 12:39:56AM +0200, Frederic Weisbecker wrote:
> > When a callback is to be enqueued to the normal queue and not the bypass
> > one, a flush to the bypass queue is always tried anyway. This attempt
> > involves locking the bypass lock unconditionally. Although it is
> > guaranteed not to be contended at this point, because only call_rcu()
> > can lock the bypass lock without holding the nocb lock, it's still not
> > free and the operation can easily be spared most of the time by just
> > checking if the bypass list is empty. The check is safe as nobody can
> > queue nor flush the bypass concurrently.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >  kernel/rcu/tree_nocb.h | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index 094fd454b6c3..30c3d473ffd8 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -423,8 +423,10 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> >  		if (*was_alldone)
> >  			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
> >  					    TPS("FirstQ"));
> > -		WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > -		WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > +		if (rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
> > +			WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > +			WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > +		}
> >  		return false; // Caller must enqueue the callback.
> >  	}
> 
> Instead of this, since as you mentioned that the bypass lock is not contended
> in this path, isn't it unnecessary to even check or attempt to acquire the
> lock in call_rcu() path? So how about something like the following, or would
> this not work for some reason?

You're right. But it's a bit error prone and it adds quite some code complication
just for a gain on a rare event (bypass is supposed to be flushed on rare
occasions by the caller).

Thanks.


* Re: [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue
  2022-10-11 19:21     ` Frederic Weisbecker
@ 2022-10-11 23:47       ` Joel Fernandes
  2022-10-12 10:23         ` Frederic Weisbecker
  0 siblings, 1 reply; 12+ messages in thread
From: Joel Fernandes @ 2022-10-11 23:47 UTC
  To: Frederic Weisbecker; +Cc: Paul E . McKenney, LKML

On Tue, Oct 11, 2022 at 3:21 PM Frederic Weisbecker <frederic@kernel.org> wrote:
>
> On Tue, Oct 11, 2022 at 02:00:40AM +0000, Joel Fernandes wrote:
> > On Tue, Oct 11, 2022 at 12:39:56AM +0200, Frederic Weisbecker wrote:
> > > When a callback is to be enqueued to the normal queue and not the bypass
> > > one, a flush to the bypass queue is always tried anyway. This attempt
> > > involves locking the bypass lock unconditionally. Although it is
> > > guaranteed not to be contended at this point, because only call_rcu()
> > > can lock the bypass lock without holding the nocb lock, it's still not
> > > free and the operation can easily be spared most of the time by just
> > > checking if the bypass list is empty. The check is safe as nobody can
> > > queue nor flush the bypass concurrently.
> > >
> > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > ---
> > >  kernel/rcu/tree_nocb.h | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > > index 094fd454b6c3..30c3d473ffd8 100644
> > > --- a/kernel/rcu/tree_nocb.h
> > > +++ b/kernel/rcu/tree_nocb.h
> > > @@ -423,8 +423,10 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> > >             if (*was_alldone)
> > >                     trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
> > >                                         TPS("FirstQ"));
> > > -           WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > > -           WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > > +           if (rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
> > > +                   WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > > +                   WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > > +           }
> > >             return false; // Caller must enqueue the callback.
> > >     }
> >
> > Instead of this, since as you mentioned that the bypass lock is not contended
> > in this path, isn't it unnecessary to even check or attempt to acquire the
> > lock in call_rcu() path? So how about something like the following, or would
> > this not work for some reason?
>
> You're right. But it's a bit error prone and it adds quite some code complication
> just for a gain on a rare event (bypass is supposed to be flushed on rare
> occasions by the caller).

But the "checking of whether to flush" which leads to "acquiring the
bypass lock first" , is not a rare event as you pointed out (can be
spared most of the time as you said). The alternative I proposed
removes the need for the frequent locking (which is another way of
implementing what you suggested).


* Re: [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue
  2022-10-11 23:47       ` Joel Fernandes
@ 2022-10-12 10:23         ` Frederic Weisbecker
  2022-10-12 14:49           ` Joel Fernandes
  0 siblings, 1 reply; 12+ messages in thread
From: Frederic Weisbecker @ 2022-10-12 10:23 UTC
  To: Joel Fernandes; +Cc: Paul E . McKenney, LKML

On Tue, Oct 11, 2022 at 07:47:07PM -0400, Joel Fernandes wrote:
> On Tue, Oct 11, 2022 at 3:21 PM Frederic Weisbecker <frederic@kernel.org> wrote:
> >
> > On Tue, Oct 11, 2022 at 02:00:40AM +0000, Joel Fernandes wrote:
> > > On Tue, Oct 11, 2022 at 12:39:56AM +0200, Frederic Weisbecker wrote:
> > > > When a callback is to be enqueued to the normal queue and not the bypass
> > > > one, a flush to the bypass queue is always tried anyway. This attempt
> > > > involves locking the bypass lock unconditionally. Although it is
> > > > guaranteed not to be contended at this point, because only call_rcu()
> > > > can lock the bypass lock without holding the nocb lock, it's still not
> > > > free and the operation can easily be spared most of the time by just
> > > > checking if the bypass list is empty. The check is safe as nobody can
> > > > queue nor flush the bypass concurrently.
> > > >
> > > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > > ---
> > > >  kernel/rcu/tree_nocb.h | 6 ++++--
> > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > > > index 094fd454b6c3..30c3d473ffd8 100644
> > > > --- a/kernel/rcu/tree_nocb.h
> > > > +++ b/kernel/rcu/tree_nocb.h
> > > > @@ -423,8 +423,10 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> > > >             if (*was_alldone)
> > > >                     trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
> > > >                                         TPS("FirstQ"));
> > > > -           WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > > > -           WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > > > +           if (rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
> > > > +                   WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > > > +                   WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > > > +           }
> > > >             return false; // Caller must enqueue the callback.
> > > >     }
> > >
> > > Instead of this, since as you mentioned that the bypass lock is not contended
> > > in this path, isn't it unnecessary to even check or attempt to acquire the
> > > lock in call_rcu() path? So how about something like the following, or would
> > > this not work for some reason?
> >
> > You're right. But it's a bit error prone and it adds quite some code complication
> > just for a gain on a rare event (bypass is supposed to be flushed on rare
> > occasions by the caller).
> 
> But the "checking of whether to flush" which leads to "acquiring the
> bypass lock first" , is not a rare event as you pointed out (can be
> spared most of the time as you said). The alternative I proposed
> removes the need for the frequent locking (which is another way of
> implementing what you suggested).

It's not rare as a whole but this quick-check patch addresses the fast path.
What you propose is to extend the API to also cover the other flushes in
rcu_nocb_try_bypass(), which are slower paths.

I think this makes the API more error prone (users may get it easily wrong)
and complicated for tiny, if measurable, gains.


* Re: [PATCH 2/2] rcu/nocb: Spare bypass locking upon normal enqueue
  2022-10-12 10:23         ` Frederic Weisbecker
@ 2022-10-12 14:49           ` Joel Fernandes
  0 siblings, 0 replies; 12+ messages in thread
From: Joel Fernandes @ 2022-10-12 14:49 UTC
  To: Frederic Weisbecker; +Cc: Paul E . McKenney, LKML

On Wed, Oct 12, 2022 at 12:23:58PM +0200, Frederic Weisbecker wrote:
> On Tue, Oct 11, 2022 at 07:47:07PM -0400, Joel Fernandes wrote:
> > On Tue, Oct 11, 2022 at 3:21 PM Frederic Weisbecker <frederic@kernel.org> wrote:
> > >
> > > On Tue, Oct 11, 2022 at 02:00:40AM +0000, Joel Fernandes wrote:
> > > > On Tue, Oct 11, 2022 at 12:39:56AM +0200, Frederic Weisbecker wrote:
> > > > > When a callback is to be enqueued to the normal queue and not the bypass
> > > > > one, a flush to the bypass queue is always tried anyway. This attempt
> > > > > involves locking the bypass lock unconditionally. Although it is
> > > > > guaranteed not to be contended at this point, because only call_rcu()
> > > > > can lock the bypass lock without holding the nocb lock, it's still not
> > > > > free and the operation can easily be spared most of the time by just
> > > > > checking if the bypass list is empty. The check is safe as nobody can
> > > > > queue nor flush the bypass concurrently.
> > > > >
> > > > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > > > ---
> > > > >  kernel/rcu/tree_nocb.h | 6 ++++--
> > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > > > > index 094fd454b6c3..30c3d473ffd8 100644
> > > > > --- a/kernel/rcu/tree_nocb.h
> > > > > +++ b/kernel/rcu/tree_nocb.h
> > > > > @@ -423,8 +423,10 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> > > > >             if (*was_alldone)
> > > > >                     trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
> > > > >                                         TPS("FirstQ"));
> > > > > -           WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > > > > -           WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > > > > +           if (rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
> > > > > +                   WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
> > > > > +                   WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
> > > > > +           }
> > > > >             return false; // Caller must enqueue the callback.
> > > > >     }
> > > >
> > > > Instead of this, since as you mentioned that the bypass lock is not contended
> > > > in this path, isn't it unnecessary to even check or attempt to acquire the
> > > > lock in call_rcu() path? So how about something like the following, or would
> > > > this not work for some reason?
> > >
> > > You're right. But it's a bit error prone and it adds quite some code complication
> > > just for a gain on a rare event (bypass is supposed to be flushed on rare
> > > occasions by the caller).
> > 
> > But the "checking of whether to flush" which leads to "acquiring the
> > bypass lock first" , is not a rare event as you pointed out (can be
> > spared most of the time as you said). The alternative I proposed
> > removes the need for the frequent locking (which is another way of
> > implementing what you suggested).
> 
> It's not rare as a whole but this quick-check patch addresses the fast path.
> What you propose is to extend the API to also cover the other flushes in
> rcu_nocb_try_bypass(), which are slower paths.

You can keep the same API though.

But there is also the unlock path which needs to be conditional, so I agree
it does complicate the code a bit more.

> I think this makes the API more error prone (users may get it easily wrong)
> and complicated for tiny, if measurable, gains.

Ok, fair point. So your original patch is good with me then. And nice
observation indeed.

thanks!

 - Joel


