kexec.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: James Gowans <jgowans@amazon.com>
To: <jgowans@amazon.com>, Eric Biederman <ebiederm@xmission.com>,
	"Sean Christopherson" <seanjc@google.com>
Cc: <kexec@lists.infradead.org>, <linux-kernel@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Marc Zyngier <maz@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Tony Luck <tony.luck@intel.com>, Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Chen-Yu Tsai <wens@csie.org>,
	Jernej Skrabec <jernej.skrabec@gmail.com>,
	Samuel Holland <samuel@sholland.org>,
	"Pavel Machek" <pavel@ucw.cz>, Sebastian Reichel <sre@kernel.org>,
	Orson Zhai <orsonzhai@gmail.com>, Alexander Graf <graf@amazon.de>,
	"Jan H . Schoenherr" <jschoenh@amazon.de>
Subject: [PATCH] kexec: do syscore_shutdown() in kernel_kexec
Date: Wed, 13 Dec 2023 08:40:04 +0200	[thread overview]
Message-ID: <20231213064004.2419447-1-jgowans@amazon.com> (raw)

syscore_shutdown() runs driver and module callbacks to get the system
into a state where it can be correctly shut down. In commit
6f389a8f1dd2 ("PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()")
syscore_shutdown() was removed from kernel_restart_prepare() and hence
got (incorrectly?) removed from the kexec flow. This was innocuous until
commit 6735150b6997 ("KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown")
changed the way that KVM registered its shutdown callbacks, switching from
reboot notifiers to syscore_ops.shutdown. As syscore_shutdown() is
missing from kexec, KVM's shutdown hook is not run and virtualisation is
left enabled on the boot CPU which results in triple faults when
switching to the new kernel on Intel x86 VT-x with VMXE enabled.

Fix this by adding syscore_shutdown() to the kexec sequence. In terms of
where to add it, it is being added after migrating the kexec task to the
boot CPU, but before APs are shut down. It is not totally clear if this
is the best place: in commit 6f389a8f1dd2 ("PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()")
it is stated that "syscore_ops operations should be carried with one
CPU on-line and interrupts disabled." APs are only offlined later in
machine_shutdown(), so this syscore_shutdown() is being run while APs
are still online. This seems to be the correct place as it matches where
syscore_shutdown() is run in the reboot and halt flows - they also run
it before APs are shut down. The assumption is that the commit message
in commit 6f389a8f1dd2 ("PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()")
is no longer valid.

KVM has been discussed here as it is what broke loudly by not having
syscore_shutdown() in kexec, but this change impacts more than just KVM;
all drivers/modules which register a syscore_ops.shutdown callback will
now be invoked in the kexec flow. Looking at some of them like x86 MCE
it is probably more correct to also shut these down during kexec.
Maintainers of all drivers which use syscore_ops.shutdown are added on
CC for visibility. They are:

arch/powerpc/platforms/cell/spu_base.c  .shutdown = spu_shutdown,
arch/x86/kernel/cpu/mce/core.c	        .shutdown = mce_syscore_shutdown,
arch/x86/kernel/i8259.c                 .shutdown = i8259A_shutdown,
drivers/irqchip/irq-i8259.c	        .shutdown = i8259A_shutdown,
drivers/irqchip/irq-sun6i-r.c	        .shutdown = sun6i_r_intc_shutdown,
drivers/leds/trigger/ledtrig-cpu.c	.shutdown = ledtrig_cpu_syscore_shutdown,
drivers/power/reset/sc27xx-poweroff.c	.shutdown = sc27xx_poweroff_shutdown,
kernel/irq/generic-chip.c	        .shutdown = irq_gc_shutdown,
virt/kvm/kvm_main.c	                .shutdown = kvm_shutdown,

This has been tested by doing a kexec on x86_64 and aarch64.

Fixes: 6735150b6997 ("KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown")

Signed-off-by: James Gowans <jgowans@amazon.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Samuel Holland <samuel@sholland.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Orson Zhai <orsonzhai@gmail.com>
Cc: Alexander Graf <graf@amazon.de>
Cc: Jan H. Schoenherr <jschoenh@amazon.de>
---
 kernel/kexec_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index be5642a4ec49..b926c4db8a91 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1254,6 +1254,7 @@ int kernel_kexec(void)
 		kexec_in_progress = true;
 		kernel_restart_prepare("kexec reboot");
 		migrate_to_reboot_cpu();
+		syscore_shutdown();
 
 		/*
 		 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
-- 
2.34.1


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

             reply	other threads:[~2023-12-13  6:41 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-13  6:40 James Gowans [this message]
2023-12-13 16:39 ` [PATCH] kexec: do syscore_shutdown() in kernel_kexec Eric W. Biederman
2023-12-18 12:41   ` Gowans, James
2024-01-09  6:59     ` Gowans, James
2023-12-19  4:22 ` Baoquan He
2023-12-19  7:41   ` Gowans, James
2023-12-19  8:26     ` bhe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231213064004.2419447-1-jgowans@amazon.com \
    --to=jgowans@amazon.com \
    --cc=arnd@arndb.de \
    --cc=bp@alien8.de \
    --cc=ebiederm@xmission.com \
    --cc=graf@amazon.de \
    --cc=jernej.skrabec@gmail.com \
    --cc=jschoenh@amazon.de \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=mingo@redhat.com \
    --cc=orsonzhai@gmail.com \
    --cc=pavel@ucw.cz \
    --cc=pbonzini@redhat.com \
    --cc=samuel@sholland.org \
    --cc=seanjc@google.com \
    --cc=sre@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=wens@csie.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).