LKML Archive mirror
* [PATCH RFC tip/core/rcu 0/2] Avoid sending IPIs to offline CPUs
@ 2017-10-14 17:51 Paul E. McKenney
  2017-10-14 17:51 ` [PATCH RFC tip/core/rcu 1/2] sched: Stop resched_cpu() from " Paul E. McKenney
  2017-10-14 17:51 ` [PATCH RFC tip/core/rcu 2/2] sched: Stop switched_to_rt() " Paul E. McKenney
From: Paul E. McKenney @ 2017-10-14 17:51 UTC
  To: linux-kernel; +Cc: mingo, peterz

Hello!

This RFC series contains a couple of small patches that avoid splats due
to resched_cpu() and rt_mutex_setprio() sending IPIs to offline CPUs.
They make the obvious (and thus perhaps inappropriate) changes to
avoid this.  Nevertheless, they do seem effective in rcutorture testing.
The patches are as follows:

1.	Stop resched_cpu() from sending IPIs to offline CPUs, unless
	that offline CPU happens to be the current CPU.  (This last
	proviso is required to preserve resched_cpu()'s unconditional
	semantics for expedited RCU grace periods.)  I am reasonably
	confident in this patch.

2.	Stop switched_to_rt() from sending IPIs to offline CPUs, in
	particular, when invoked via rt_mutex_lock().  This -looks-
	correct to me, but I am assuming that the fact that the current
	CPU is holding the target task's CPU's rq lock is preventing
	the to-be-boosted task from doing anything, and that a later
	migration of the target task will finalize the priority boosting.
	But there might be an odd corner case involving offlining an
	extremely heavily loaded CPU with lots of preempted tasks, one of
	which is blocking a high-priority real-time task somewhere else.

Note: The first patch depends on a patch intended for the upcoming
merge window, and this latter patch may be found here:
lkml.kernel.org/r/1507152575-11055-6-git-send-email-paulmck@linux.vnet.ibm.com

							Thanx, Paul

------------------------------------------------------------------------

 core.c |    3 ++-
 rt.c   |    2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)


* [PATCH RFC tip/core/rcu 1/2] sched: Stop resched_cpu() from sending IPIs to offline CPUs
  2017-10-14 17:51 [PATCH RFC tip/core/rcu 0/2] Avoid sending IPIs to offline CPUs Paul E. McKenney
@ 2017-10-14 17:51 ` Paul E. McKenney
  2017-10-14 17:51 ` [PATCH RFC tip/core/rcu 2/2] sched: Stop switched_to_rt() " Paul E. McKenney
From: Paul E. McKenney @ 2017-10-14 17:51 UTC
  To: linux-kernel; +Cc: mingo, peterz, Paul E. McKenney, Ingo Molnar

The rcutorture test suite occasionally provokes a splat due to invoking
resched_cpu() on an offline CPU:

WARNING: CPU: 2 PID: 8 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
Modules linked in:
CPU: 2 PID: 8 Comm: rcu_preempt Not tainted 4.14.0-rc4+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff902ede9daf00 task.stack: ffff96c50010c000
RIP: 0010:native_smp_send_reschedule+0x37/0x40
RSP: 0018:ffff96c50010fdb8 EFLAGS: 00010096
RAX: 000000000000002e RBX: ffff902edaab4680 RCX: 0000000000000003
RDX: 0000000080000003 RSI: 0000000000000000 RDI: 00000000ffffffff
RBP: ffff96c50010fdb8 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 00000000299f36ae R12: 0000000000000001
R13: ffffffff9de64240 R14: 0000000000000001 R15: ffffffff9de64240
FS:  0000000000000000(0000) GS:ffff902edfc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000f7d4c642 CR3: 000000001e0e2000 CR4: 00000000000006e0
Call Trace:
 resched_curr+0x8f/0x1c0
 resched_cpu+0x2c/0x40
 rcu_implicit_dynticks_qs+0x152/0x220
 force_qs_rnp+0x147/0x1d0
 ? sync_rcu_exp_select_cpus+0x450/0x450
 rcu_gp_kthread+0x5a9/0x950
 kthread+0x142/0x180
 ? force_qs_rnp+0x1d0/0x1d0
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x27/0x40
Code: 14 01 0f 92 c0 84 c0 74 14 48 8b 05 14 4f f4 00 be fd 00 00 00 ff 90 a0 00 00 00 5d c3 89 fe 48 c7 c7 38 89 ca 9d e8 e5 56 08 00 <0f> ff 5d c3 0f 1f 44 00 00 8b 05 52 9e 37 02 85 c0 75 38 55 48
---[ end trace 26df9e5df4bba4ac ]---

This splat cannot be generated by expedited grace periods because they
always invoke resched_cpu() on the current CPU, which is good because
expedited grace periods require that resched_cpu() unconditionally
succeed.  However, other parts of RCU can tolerate resched_cpu() acting
as a no-op, at least as long as it doesn't happen too often.

This commit therefore makes resched_cpu() invoke resched_curr() only if
the CPU is either online or is the current CPU.
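
For illustration only, the check can be modeled in user space.  The sketch
below is not kernel code: the model_* names, the four-CPU online map, and
the counters are hypothetical stand-ins, and the model ignores rq->lock
and the polling-idle handling in resched_curr().  It is meant to show that
a request aimed at an offline remote CPU becomes a no-op, while a request
aimed at the current CPU is still honored.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-ins for kernel state; CPU 2 is modeled as offline. */
static bool model_cpu_online[MODEL_NR_CPUS] = { true, true, false, true };
static int model_need_resched[MODEL_NR_CPUS];
static int model_ipis_sent[MODEL_NR_CPUS];

/* Models resched_curr(): flag the CPU, and IPI it only if it is remote. */
static void model_resched_curr(int cpu, int this_cpu)
{
	model_need_resched[cpu] = 1;
	if (cpu != this_cpu)
		model_ipis_sent[cpu]++;	/* stands in for smp_send_reschedule() */
}

/* Models the patched resched_cpu(); this_cpu plays smp_processor_id(). */
static void model_resched_cpu(int cpu, int this_cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	assert(this_cpu >= 0 && this_cpu < MODEL_NR_CPUS);

	/* Same condition as the patch: online, or the caller's own CPU. */
	if (model_cpu_online[cpu] || cpu == this_cpu)
		model_resched_curr(cpu, this_cpu);
}
```

In this model, model_resched_cpu(2, 0) leaves CPU 2 untouched, whereas
model_resched_cpu(2, 2) sets the need-resched flag without any IPI.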

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 kernel/sched/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6fb6bb5f3682..402f6da8c986 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -506,7 +506,8 @@ void resched_cpu(int cpu)
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&rq->lock, flags);
-	resched_curr(rq);
+	if (cpu_online(cpu) || cpu == smp_processor_id())
+		resched_curr(rq);
 	raw_spin_unlock_irqrestore(&rq->lock, flags);
 }
 
-- 
2.5.2


* [PATCH RFC tip/core/rcu 2/2] sched: Stop switched_to_rt() from sending IPIs to offline CPUs
  2017-10-14 17:51 [PATCH RFC tip/core/rcu 0/2] Avoid sending IPIs to offline CPUs Paul E. McKenney
  2017-10-14 17:51 ` [PATCH RFC tip/core/rcu 1/2] sched: Stop resched_cpu() from " Paul E. McKenney
@ 2017-10-14 17:51 ` Paul E. McKenney
From: Paul E. McKenney @ 2017-10-14 17:51 UTC
  To: linux-kernel; +Cc: mingo, peterz, Paul E. McKenney, Ingo Molnar

The rcutorture test suite occasionally provokes a splat when
rt_mutex_lock() needs to boost the priority of a task currently
sitting on a runqueue that belongs to an offline CPU:

WARNING: CPU: 0 PID: 12 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
Modules linked in:
CPU: 0 PID: 12 Comm: rcub/7 Not tainted 4.14.0-rc4+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff9ed3de5f8cc0 task.stack: ffffbbf80012c000
RIP: 0010:native_smp_send_reschedule+0x37/0x40
RSP: 0018:ffffbbf80012fd10 EFLAGS: 00010082
RAX: 000000000000002f RBX: ffff9ed3dd9cb300 RCX: 0000000000000004
RDX: 0000000080000004 RSI: 0000000000000086 RDI: 00000000ffffffff
RBP: ffffbbf80012fd10 R08: 000000000009da7a R09: 0000000000007b9d
R10: 0000000000000001 R11: ffffffffbb57c2cd R12: 000000000000000d
R13: ffff9ed3de5f8cc0 R14: 0000000000000061 R15: ffff9ed3ded59200
FS:  0000000000000000(0000) GS:ffff9ed3dea00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000080686f0 CR3: 000000001b9e0000 CR4: 00000000000006f0
Call Trace:
 resched_curr+0x61/0xd0
 switched_to_rt+0x8f/0xa0
 rt_mutex_setprio+0x25c/0x410
 task_blocks_on_rt_mutex+0x1b3/0x1f0
 rt_mutex_slowlock+0xa9/0x1e0
 rt_mutex_lock+0x29/0x30
 rcu_boost_kthread+0x127/0x3c0
 kthread+0x104/0x140
 ? rcu_report_unblock_qs_rnp+0x90/0x90
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x22/0x30
Code: f0 00 0f 92 c0 84 c0 74 14 48 8b 05 34 74 c5 00 be fd 00 00 00 ff 90 a0 00 00 00 5d c3 89 fe 48 c7 c7 a0 c6 fc b9 e8 d5 b5 06 00 <0f> ff 5d c3 0f 1f 44 00 00 8b 05 a2 d1 13 02 85 c0 75 38 55 48

However, the target task's priority has already been adjusted, so the
only purpose of switched_to_rt() invoking resched_curr() is to wake up
the CPU running some task that needs to be preempted by the boosted task.
Because that CPU is offline, the boosted task presumably must be migrated
to some other CPU, and that other CPU will undertake any needed
preemption at the time of migration.  Because the runqueue lock is held
when resched_curr() is invoked, we know that the boosted task cannot go
anywhere in the meantime, so it is not necessary to invoke resched_curr()
in this particular case.

This commit therefore makes switched_to_rt() refrain from invoking
resched_curr() when the target CPU is offline.
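
As a rough user-space illustration of the new condition (not kernel code),
the sketch below models only the final priority check in switched_to_rt().
The model_task and model_rq types, the online flag standing in for
cpu_online(cpu_of(rq)), and the resched_requests counter are all
hypothetical; the enclosing queued-task check and the push-task logic are
left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the kernel's structures. */
struct model_task {
	int prio;			/* lower value means higher priority */
};

struct model_rq {
	int cpu;
	bool online;			/* stands in for cpu_online(cpu_of(rq)) */
	struct model_task *curr;	/* task currently running on this CPU */
	int resched_requests;		/* counts modeled resched_curr() calls */
};

/* Models the patched priority check at the end of switched_to_rt(). */
static void model_switched_to_rt(struct model_rq *rq, struct model_task *p)
{
	assert(rq && rq->curr && p);

	/* Preempt only if p outranks curr and the runqueue's CPU is online. */
	if (p->prio < rq->curr->prio && rq->online)
		rq->resched_requests++;
}
```

With an offline runqueue, a boosted task that outranks the current task
produces no reschedule request in this model; with an online runqueue, the
same call produces exactly one.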

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 kernel/sched/rt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0af5ca9e3e3f..640eca709b57 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2291,7 +2291,7 @@ static void switched_to_rt(struct rq *rq, struct task_struct *p)
 		if (p->nr_cpus_allowed > 1 && rq->rt.overloaded)
 			queue_push_tasks(rq);
 #endif /* CONFIG_SMP */
-		if (p->prio < rq->curr->prio)
+		if (p->prio < rq->curr->prio && cpu_online(cpu_of(rq)))
 			resched_curr(rq);
 	}
 }
-- 
2.5.2

