* [PATCH v7 0/2] CPU hotplug: Fix the long-standing "IPI to offline CPU" issue
From: Srivatsa S. Bhat @ 2014-05-26 11:08 UTC
  To: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch
  Cc: mgorman, riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel, srivatsa.bhat


Hi,

There is a long-standing problem related to CPU hotplug which causes IPIs to
be delivered to offline CPUs, and the smp-call-function IPI handler code
prints out a warning whenever this is detected. Every once in a while this
(usually harmless) warning gets reported on LKML, but so far it has not been
completely fixed. Usually the solution involves finding out the IPI sender
and fixing it by adding appropriate synchronization with CPU hotplug.
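
For illustration, a minimal sketch (not part of this series) of what such a
sender-side fix typically looks like; the callback my_remote_func() and the
wrapper around it are hypothetical:

	#include <linux/cpu.h>
	#include <linux/smp.h>

	/* Hypothetical callback; runs on the target CPU in IPI context. */
	static void my_remote_func(void *info)
	{
	}

	static void my_send_ipi(int cpu, void *info)
	{
		/*
		 * Hold off CPU hotplug so that the target CPU cannot go
		 * offline between the cpu_online() check and the IPI.
		 */
		get_online_cpus();
		if (cpu_online(cpu))
			smp_call_function_single(cpu, my_remote_func, info, 1);
		put_online_cpus();
	}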

However, while going through one such internal bug report, I found that
there is a significant bug in the receiver side itself (more specifically,
in stop-machine) that can lead to this problem even when the sender code
is perfectly fine. This patchset handles that scenario to ensure that a
CPU doesn't go offline with callbacks still pending.

Patch 1 adds some additional debug code to the smp-call-function framework,
to help debug such issues easily.

Patch 2 adds a mechanism to flush any pending smp-call-function callbacks
queued on the CPU going offline (including those callbacks for which the
IPIs from the source CPUs might not have arrived in time at the outgoing CPU).
This ensures that a CPU never goes offline with work still pending. Also,
the warning condition in smp-call-function IPI handler code is modified to
trigger only if an IPI is received on an offline CPU *and* it still has
pending callbacks to execute, since that's the only remaining buggy scenario
after applying this patch.


In fact, I debugged the problem by using Patch 1, and found that the
payload of the IPI was always the block layer's trigger_softirq() function.
But I was not able to find anything wrong with the block layer code. That's
when I started looking at the stop-machine code and realized that there is
a race-window which makes the IPI _receiver_ the culprit, not the sender.
Patch 2 handles this scenario and hence should put an end to most of
the hard-to-debug IPI-to-offline-CPU issues.



Changes in v7:
* Modified the warning condition in smp-call-function IPI handler code, such
  that it triggers only if an offline CPU got an IPI *and* it still had pending
  callbacks to execute.
* Completely dropped the patch that modified the stop-machine code to
  introduce additional states to order the disabling of interrupts on various
  CPUs. This strict ordering is not necessary any more after the first change.
  Thanks to Frederic Weisbecker for suggesting this enhancement.

Changes in v6:
Modified Patch 3 to flush the pending callbacks from the CPU_DYING notifier
instead of directly from stop-machine, so that only the CPU hotplug path
runs this code, rather than every user of stop-machine. Suggested by
Peter Zijlstra.

Changes in v5:
Added Patch 3 to flush out any pending smp-call-function callbacks on the
outgoing CPU, as suggested by Frederic Weisbecker.

Changes in v4:
Rewrote a comment in Patch 2 and reorganized the code for better readability.

Changes in v3:
Rewrote patch 2 and split the MULTI_STOP_DISABLE_IRQ state into two:
MULTI_STOP_DISABLE_IRQ_INACTIVE and MULTI_STOP_DISABLE_IRQ_ACTIVE, and
used this framework to ensure that the CPU going offline always disables
its interrupts last. Suggested by Tejun Heo.

v1 and v2:
https://lkml.org/lkml/2014/5/6/474


 Srivatsa S. Bhat (2):
      smp: Print more useful debug info upon receiving IPI on an offline CPU
      CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline


 kernel/smp.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 60 insertions(+), 8 deletions(-)


Regards,
Srivatsa S. Bhat
IBM Linux Technology Center



* [PATCH v7 1/2] smp: Print more useful debug info upon receiving IPI on an offline CPU
From: Srivatsa S. Bhat @ 2014-05-26 11:08 UTC
  To: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch
  Cc: mgorman, riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel, srivatsa.bhat

Today the smp-call-function code just prints a warning if we get an IPI on
an offline CPU. That is sufficient to let us know that something went
wrong, but it is often very hard to debug exactly who sent the IPI and why
from this warning alone.

In most cases, we get the warning about the IPI to an offline CPU immediately
after the CPU going offline comes out of the stop-machine phase and re-enables
interrupts. Since all online CPUs participate in stop-machine, the information
regarding the sender of the IPI is already lost by the time we exit the
stop-machine loop. So even if we dump the stack on each CPU at this point,
we won't find anything useful since all of them will show the stack-trace of
the stopper thread. So we need a better way to figure out who sent the IPI and
why.

To achieve this, when we detect an IPI targeted at an offline CPU, loop through
the call-single-data linked list and print out the payload (i.e., the name
of the function which was supposed to be executed by the target CPU). This
gives us insight into who might have sent the IPI and helps us debug the
issue further.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/smp.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 06d574e..306f818 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -185,14 +185,26 @@ void generic_smp_call_function_single_interrupt(void)
 {
 	struct llist_node *entry;
 	struct call_single_data *csd, *csd_next;
+	static bool warned;
+
+	entry = llist_del_all(&__get_cpu_var(call_single_queue));
+	entry = llist_reverse_order(entry);
 
 	/*
 	 * Shouldn't receive this interrupt on a cpu that is not yet online.
 	 */
-	WARN_ON_ONCE(!cpu_online(smp_processor_id()));
+	if (unlikely(!cpu_online(smp_processor_id()) && !warned)) {
+		warned = true;
+		WARN(1, "IPI on offline CPU %d\n", smp_processor_id());
 
-	entry = llist_del_all(&__get_cpu_var(call_single_queue));
-	entry = llist_reverse_order(entry);
+		/*
+		 * We don't have to use the _safe() variant here
+		 * because we are not invoking the IPI handlers yet.
+		 */
+		llist_for_each_entry(csd, entry, llist)
+			pr_warn("IPI callback %pS sent to offline CPU\n",
+				csd->func);
+	}
 
 	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
 		csd->func(csd->info);



* [PATCH v7 2/2] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
From: Srivatsa S. Bhat @ 2014-05-26 11:08 UTC
  To: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch
  Cc: mgorman, riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel, srivatsa.bhat

During CPU offline, in stop-machine, we don't enforce any rule in the
_DISABLE_IRQ stage regarding the order in which the outgoing CPU and the other
CPUs disable their local interrupts. Hence, we can encounter a scenario as
depicted below, in which IPIs are sent by the other CPUs to the CPU going
offline (while it is *still* online), but the outgoing CPU notices them only
*after* it has gone offline.


              CPU 1                                         CPU 2
          (Online CPU)                               (CPU going offline)

       Enter _PREPARE stage                          Enter _PREPARE stage

                                                     Enter _DISABLE_IRQ stage


                                                   =
       Got a device interrupt,                     | Didn't notice the IPI
       and the interrupt handler                   | since interrupts were
       called smp_call_function()                  | disabled on this CPU.
       and sent an IPI to CPU 2.                   |
                                                   =


       Enter _DISABLE_IRQ stage


       Enter _RUN stage                              Enter _RUN stage

                                  =
       Busy loop with interrupts  |                  Invoke take_cpu_down()
       disabled.                  |                  and take CPU 2 offline
                                  =


       Enter _EXIT stage                             Enter _EXIT stage

       Re-enable interrupts                          Re-enable interrupts

                                                     The pending IPI is noted
                                                     immediately, but alas,
                                                     the CPU is offline at
                                                     this point.



This, of course, makes the smp-call-function IPI handler code unhappy, and it
complains about "receiving an IPI on an offline CPU".

However, if we look closely, we observe that the IPI was sent when CPU 2 was
still online, and hence it was perfectly legal for CPU 1 to send the IPI at
that point. Furthermore, receiving an IPI on an offline CPU is terrible only
if there were pending callbacks yet to be executed by that CPU (in other words,
it's a bug if the CPU went offline with work still pending).

So, fix this by flushing all the queued smp-call-function callbacks on the
outgoing CPU in the CPU_DYING stage[1], including those callbacks for which the
source CPU's IPIs might not have been received on the outgoing CPU yet. This
ensures that all pending IPI callbacks are run before the CPU goes completely
offline. Note that the outgoing CPU can still get IPIs from the other CPUs
just after it exits stop-machine, due to the scenario mentioned above; but
because we flush the callbacks before going offline, this will be completely
harmless.

Further, this solution guarantees that there will be pending callbacks
on an offline CPU *only if* the source CPU initiated the IPI-send-procedure
*after* the target CPU went offline, which clearly indicates a bug in the
sender code.

So, considering all this, teach the smp-call-function IPI handler code to
complain only if an offline CPU received an IPI *and* it still had pending
callbacks to execute, since that is the only buggy scenario.

There is another (somewhat theoretical) case where IPIs might arrive
late on the target CPU (possibly _after_ the CPU has gone offline), due to IPI
latencies in the hardware. But with this patch, even this scenario turns out
to be harmless, since we explicitly loop through the call_single_queue and
flush out any pending callbacks without waiting for the corresponding IPIs
to arrive.


[1]. The CPU_DYING part needs a little more explanation: by the time we
execute the CPU_DYING notifier callbacks, the CPU would have already been
marked offline. But we want to flush out the pending callbacks at this stage,
ignoring the fact that the CPU is offline. So restructure the IPI handler
code so that we can bypass the "is-cpu-offline?" check in this particular
case. (Of course, the right solution here is to fix CPU hotplug to mark the
CPU offline _after_ invoking the CPU_DYING notifiers, but that requires a
lot of auditing to ensure the change doesn't break any existing code;
hence let's go with the solution proposed above until that is done).
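
For context, a simplified sketch of the relevant ordering in take_cpu_down()
(abridged from kernel/cpu.c of this era, not verbatim):

	static int take_cpu_down(void *_param)
	{
		struct take_cpu_down_param *param = _param;
		int err;

		/*
		 * __cpu_disable() also removes this CPU from
		 * cpu_online_mask, so cpu_online() is already false...
		 */
		err = __cpu_disable();
		if (err < 0)
			return err;

		/* ...by the time the CPU_DYING notifiers run here. */
		cpu_notify(CPU_DYING | param->mod, param->hcpu);
		return 0;
	}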

Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/smp.c |   56 ++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 48 insertions(+), 8 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 306f818..5295388 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -29,6 +29,8 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_function_data, cfd_data);
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct llist_head, call_single_queue);
 
+static void flush_smp_call_function_queue(bool warn_cpu_offline);
+
 static int
 hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -52,6 +54,20 @@ hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu)
 	case CPU_UP_CANCELED:
 	case CPU_UP_CANCELED_FROZEN:
 
+	case CPU_DYING:
+	case CPU_DYING_FROZEN:
+		/*
+		 * The IPIs for the smp-call-function callbacks queued by other
+		 * CPUs might arrive late, either due to hardware latencies or
+		 * because this CPU disabled interrupts (inside stop-machine)
+		 * before the IPIs were sent. So flush out any pending callbacks
+		 * explicitly (without waiting for the IPIs to arrive), to
+		 * ensure that the outgoing CPU doesn't go offline with work
+		 * still pending.
+		 */
+		flush_smp_call_function_queue(false);
+		break;
+
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 		free_cpumask_var(cfd->cpumask);
@@ -177,23 +193,47 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
 	return 0;
 }
 
-/*
- * Invoked by arch to handle an IPI for call function single. Must be
- * called from the arch with interrupts disabled.
+/**
+ * generic_smp_call_function_single_interrupt - Execute SMP IPI callbacks
+ *
+ * Invoked by arch to handle an IPI for call function single.
+ * Must be called with interrupts disabled.
  */
 void generic_smp_call_function_single_interrupt(void)
 {
+	flush_smp_call_function_queue(true);
+}
+
+/**
+ * flush_smp_call_function_queue - Flush pending smp-call-function callbacks
+ *
+ * @warn_cpu_offline: If set to 'true', warn if callbacks were queued on an
+ * 		      offline CPU. Skip this check if set to 'false'.
+ *
+ * Flush any pending smp-call-function callbacks queued on this CPU. This is
+ * invoked by the generic IPI handler, as well as by a CPU about to go offline,
+ * to ensure that all pending IPI functions are run before it goes completely
+ * offline.
+ *
+ * Loop through the call_single_queue and run all the queued functions.
+ * Must be called with interrupts disabled.
+ */
+static void flush_smp_call_function_queue(bool warn_cpu_offline)
+{
+	struct llist_head *head;
 	struct llist_node *entry;
 	struct call_single_data *csd, *csd_next;
 	static bool warned;
 
-	entry = llist_del_all(&__get_cpu_var(call_single_queue));
+	WARN_ON(!irqs_disabled());
+
+	head = &__get_cpu_var(call_single_queue);
+	entry = llist_del_all(head);
 	entry = llist_reverse_order(entry);
 
-	/*
-	 * Shouldn't receive this interrupt on a cpu that is not yet online.
-	 */
-	if (unlikely(!cpu_online(smp_processor_id()) && !warned)) {
+	/* There shouldn't be any pending callbacks on an offline CPU. */
+	if (unlikely(warn_cpu_offline && !cpu_online(smp_processor_id()) &&
+		     !warned && !llist_empty(head))) {
 		warned = true;
 		WARN(1, "IPI on offline CPU %d\n", smp_processor_id());
 



* Re: [PATCH v7 2/2] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
From: Sasha Levin @ 2014-06-25 15:42 UTC
  To: Srivatsa S. Bhat, peterz, tglx, mingo, tj, rusty, akpm, fweisbec,
	hch
  Cc: mgorman, riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel, Dave Jones

On 05/26/2014 07:08 AM, Srivatsa S. Bhat wrote:
> During CPU offline, in stop-machine, we don't enforce any rule in the
> _DISABLE_IRQ stage regarding the order in which the outgoing CPU and the other
> CPUs disable their local interrupts. Hence, we can encounter a scenario as
> depicted below, in which IPIs are sent by the other CPUs to the CPU going
> offline (while it is *still* online), but the outgoing CPU notices them only
> *after* it has gone offline.
> [...]

Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel I've stumbled on the following spew:

[ 1982.600053] kernel BUG at kernel/irq_work.c:175!
[ 1982.600053] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 1982.600053] Dumping ftrace buffer:
[ 1982.600053]    (ftrace buffer empty)
[ 1982.600053] Modules linked in:
[ 1982.600053] CPU: 14 PID: 168 Comm: migration/14 Not tainted 3.16.0-rc2-next-20140624-sasha-00024-g332b58d #726
[ 1982.600053] task: ffff88036a5a3000 ti: ffff88036a5ac000 task.ti: ffff88036a5ac000
[ 1982.600053] RIP: irq_work_run (kernel/irq_work.c:175 (discriminator 1))
[ 1982.600053] RSP: 0000:ffff88036a5afbe0  EFLAGS: 00010046
[ 1982.600053] RAX: 0000000080000001 RBX: 0000000000000000 RCX: 0000000000000008
[ 1982.600053] RDX: 000000000000000e RSI: ffffffffaf9185fb RDI: 0000000000000000
[ 1982.600053] RBP: ffff88036a5afc08 R08: 0000000000099224 R09: 0000000000000000
[ 1982.600053] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88036afd8400
[ 1982.600053] R13: 0000000000000000 R14: ffffffffb0cf8120 R15: ffffffffb0cce5d0
[ 1982.600053] FS:  0000000000000000(0000) GS:ffff88036ae00000(0000) knlGS:0000000000000000
[ 1982.600053] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1982.600053] CR2: 00000000019485d0 CR3: 00000002c7c8f000 CR4: 00000000000006a0
[ 1982.600053] Stack:
[ 1982.600053]  ffffffffab20fbb5 0000000000000082 ffff88036afd8440 0000000000000000
[ 1982.600053]  0000000000000001 ffff88036a5afc28 ffffffffab20fca7 0000000000000000
[ 1982.600053]  00000000ffffffef ffff88036a5afc78 ffffffffab19c58e 000000000000000e
[ 1982.600053] Call Trace:
[ 1982.600053] ? flush_smp_call_function_queue (kernel/smp.c:263)
[ 1982.600053] hotplug_cfd (kernel/smp.c:81)
[ 1982.600053] notifier_call_chain (kernel/notifier.c:95)
[ 1982.600053] __raw_notifier_call_chain (kernel/notifier.c:395)
[ 1982.600053] __cpu_notify (kernel/cpu.c:202)
[ 1982.600053] cpu_notify (kernel/cpu.c:211)
[ 1982.600053] take_cpu_down (./arch/x86/include/asm/current.h:14 kernel/cpu.c:312)
[ 1982.600053] multi_cpu_stop (kernel/stop_machine.c:201)
[ 1982.600053] ? __stop_cpus (kernel/stop_machine.c:170)
[ 1982.600053] cpu_stopper_thread (kernel/stop_machine.c:474)
[ 1982.600053] ? put_lock_stats.isra.12 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
[ 1982.600053] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/paravirt.h:809 include/linux/spinlock_api_smp.h:160 kernel/locking/spinlock.c:191)
[ 1982.600053] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 1982.600053] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2557 kernel/locking/lockdep.c:2599)
[ 1982.600053] smpboot_thread_fn (kernel/smpboot.c:160)
[ 1982.600053] ? __smpboot_create_thread (kernel/smpboot.c:105)
[ 1982.600053] kthread (kernel/kthread.c:210)
[ 1982.600053] ? wait_for_completion (kernel/sched/completion.c:77 kernel/sched/completion.c:93 kernel/sched/completion.c:101 kernel/sched/completion.c:122)
[ 1982.600053] ? kthread_create_on_node (kernel/kthread.c:176)
[ 1982.600053] ret_from_fork (arch/x86/kernel/entry_64.S:349)
[ 1982.600053] ? kthread_create_on_node (kernel/kthread.c:176)
[ 1982.600053] Code: 00 00 00 00 e8 63 ff ff ff 48 83 c4 08 b8 01 00 00 00 5b 5d c3 b8 01 00 00 00 c3 90 65 8b 04 25 a0 da 00 00 a9 00 00 0f 00 75 09 <0f> 0b 0f 1f 80 00 00 00 00 55 48 89 e5 e8 2f ff ff ff 5d c3 66
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	00 00                	add    %al,(%rax)
   4:	e8 63 ff ff ff       	callq  0xffffffffffffff6c
   9:	48 83 c4 08          	add    $0x8,%rsp
   d:	b8 01 00 00 00       	mov    $0x1,%eax
  12:	5b                   	pop    %rbx
  13:	5d                   	pop    %rbp
  14:	c3                   	retq
  15:	b8 01 00 00 00       	mov    $0x1,%eax
  1a:	c3                   	retq
  1b:	90                   	nop
  1c:	65 8b 04 25 a0 da 00 	mov    %gs:0xdaa0,%eax
  23:	00
  24:	a9 00 00 0f 00       	test   $0xf0000,%eax
  29:	75 09                	jne    0x34
  2b:*	0f 0b                	ud2    		<-- trapping instruction
  2d:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
  34:	55                   	push   %rbp
  35:	48 89 e5             	mov    %rsp,%rbp
  38:	e8 2f ff ff ff       	callq  0xffffffffffffff6c
  3d:	5d                   	pop    %rbp
  3e:	c3                   	retq
  3f:	66                   	data16
	...

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
   9:	55                   	push   %rbp
   a:	48 89 e5             	mov    %rsp,%rbp
   d:	e8 2f ff ff ff       	callq  0xffffffffffffff41
  12:	5d                   	pop    %rbp
  13:	c3                   	retq
  14:	66                   	data16
	...
[ 1982.600053] RIP irq_work_run (kernel/irq_work.c:175 (discriminator 1))
[ 1982.600053]  RSP <ffff88036a5afbe0>


Thanks,
Sasha


* Re: [PATCH v7 2/2] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
From: Srivatsa S. Bhat @ 2014-06-25 16:59 UTC
  To: Sasha Levin
  Cc: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch, mgorman,
	riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel, Dave Jones

On 06/25/2014 09:12 PM, Sasha Levin wrote:
> On 05/26/2014 07:08 AM, Srivatsa S. Bhat wrote:
>> During CPU offline, in stop-machine, we don't enforce any rule in the
>> _DISABLE_IRQ stage regarding the order in which the outgoing CPU and the other
>> CPUs disable their local interrupts. Hence, we can encounter a scenario as
>> depicted below, in which IPIs are sent by the other CPUs to the CPU going
>> offline (while it is *still* online), but the outgoing CPU notices them only
>> *after* it has gone offline.
>>
[...]
> Hi all,
> 
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following spew:
> 

Thanks for the bug report. Please test if this patch fixes the problem
for you:

https://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git/commit/?h=timers/nohz&id=921d8b81281ecdca686369f52165d04fa3505bd7

Regards,
Srivatsa S. Bhat

> [...]


