All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation
@ 2021-05-31 14:05 Krzysztof Kozlowski
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 1/3] x86/kvm: Teardown PV features on boot CPU as well Krzysztof Kozlowski
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-05-31 14:05 UTC (permalink / raw
  To: stable; +Cc: Andrea Righi, Paolo Bonzini, Vitaly Kuznetsov,
	Krzysztof Kozlowski

Hi,

This is version 2 of a backport for v5.10.

Changes since v2:
1. Rebased v3 of v5.4 version, I kept numbering to be consistent.
2. The context in patch 1/3 had to be adjusted.


The first patch specifically fixes bug aftert 2nd resume:
  BUG: Bad page state in process dbus-daemon  pfn:18b01
  page:ffffea000062c040 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 compound_mapcount: -30591
  flags: 0xfffffc0078141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)
  raw: 000fffffc0078141 dead0000000002d0 dead000000000100 0000000000000000
  raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
  page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
  bad because of flags: 0x78141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)

Best regards,
Krzysztof

Vitaly Kuznetsov (3):
  x86/kvm: Teardown PV features on boot CPU as well
  x86/kvm: Disable kvmclock on all CPUs on shutdown
  x86/kvm: Disable all PV features on crash

 arch/x86/include/asm/kvm_para.h | 10 +---
 arch/x86/kernel/kvm.c           | 92 ++++++++++++++++++++++++---------
 arch/x86/kernel/kvmclock.c      | 26 +---------
 3 files changed, 72 insertions(+), 56 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v3 | stable v5.10 1/3] x86/kvm: Teardown PV features on boot CPU as well
  2021-05-31 14:05 [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation Krzysztof Kozlowski
@ 2021-05-31 14:05 ` Krzysztof Kozlowski
  2021-05-31 15:42   ` Paolo Bonzini
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 2/3] x86/kvm: Disable kvmclock on all CPUs on shutdown Krzysztof Kozlowski
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-05-31 14:05 UTC (permalink / raw
  To: stable; +Cc: Andrea Righi, Paolo Bonzini, Vitaly Kuznetsov,
	Krzysztof Kozlowski

From: Vitaly Kuznetsov <vkuznets@redhat.com>

commit 8b79feffeca28c5459458fe78676b081e87c93a4 upstream.

Various PV features (Async PF, PV EOI, steal time) work through memory
shared with hypervisor and when we restore from hibernation we must
properly teardown all these features to make sure hypervisor doesn't
write to stale locations after we jump to the previously hibernated kernel
(which can try to place anything there). For secondary CPUs the job is
already done by kvm_cpu_down_prepare(), register syscore ops to do
the same for boot CPU.

Krzysztof:
This fixes memory corruption visible after second resume from
hibernation:

  BUG: Bad page state in process dbus-daemon  pfn:18b01
  page:ffffea000062c040 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 compound_mapcount: -30591
  flags: 0xfffffc0078141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)
  raw: 000fffffc0078141 dead0000000002d0 dead000000000100 0000000000000000
  raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
  page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
  bad because of flags: 0x78141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210414123544.1060604-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
[krzysztof: Extend the commit message, adjust for v5.10 context]
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
---
 arch/x86/kernel/kvm.c | 57 +++++++++++++++++++++++++++++++------------
 1 file changed, 41 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 7f57ede3cb8e..6af3f9c3956c 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -26,6 +26,7 @@
 #include <linux/kprobes.h>
 #include <linux/nmi.h>
 #include <linux/swait.h>
+#include <linux/syscore_ops.h>
 #include <asm/timer.h>
 #include <asm/cpu.h>
 #include <asm/traps.h>
@@ -460,6 +461,25 @@ static bool pv_tlb_flush_supported(void)
 
 static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask);
 
+static void kvm_guest_cpu_offline(void)
+{
+	kvm_disable_steal_time();
+	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
+		wrmsrl(MSR_KVM_PV_EOI_EN, 0);
+	kvm_pv_disable_apf();
+	apf_task_wake_all();
+}
+
+static int kvm_cpu_online(unsigned int cpu)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	kvm_guest_cpu_init();
+	local_irq_restore(flags);
+	return 0;
+}
+
 #ifdef CONFIG_SMP
 
 static bool pv_ipi_supported(void)
@@ -587,31 +607,34 @@ static void __init kvm_smp_prepare_boot_cpu(void)
 	kvm_spinlock_init();
 }
 
-static void kvm_guest_cpu_offline(void)
+static int kvm_cpu_down_prepare(unsigned int cpu)
 {
-	kvm_disable_steal_time();
-	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
-		wrmsrl(MSR_KVM_PV_EOI_EN, 0);
-	kvm_pv_disable_apf();
-	apf_task_wake_all();
-}
+	unsigned long flags;
 
-static int kvm_cpu_online(unsigned int cpu)
-{
-	local_irq_disable();
-	kvm_guest_cpu_init();
-	local_irq_enable();
+	local_irq_save(flags);
+	kvm_guest_cpu_offline();
+	local_irq_restore(flags);
 	return 0;
 }
 
-static int kvm_cpu_down_prepare(unsigned int cpu)
+#endif
+
+static int kvm_suspend(void)
 {
-	local_irq_disable();
 	kvm_guest_cpu_offline();
-	local_irq_enable();
+
 	return 0;
 }
-#endif
+
+static void kvm_resume(void)
+{
+	kvm_cpu_online(raw_smp_processor_id());
+}
+
+static struct syscore_ops kvm_syscore_ops = {
+	.suspend	= kvm_suspend,
+	.resume		= kvm_resume,
+};
 
 static void kvm_flush_tlb_others(const struct cpumask *cpumask,
 			const struct flush_tlb_info *info)
@@ -681,6 +704,8 @@ static void __init kvm_guest_init(void)
 	kvm_guest_cpu_init();
 #endif
 
+	register_syscore_ops(&kvm_syscore_ops);
+
 	/*
 	 * Hard lockup detection is enabled by default. Disable it, as guests
 	 * can get false positives too easily, for example if the host is
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 | stable v5.10 2/3] x86/kvm: Disable kvmclock on all CPUs on shutdown
  2021-05-31 14:05 [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation Krzysztof Kozlowski
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 1/3] x86/kvm: Teardown PV features on boot CPU as well Krzysztof Kozlowski
@ 2021-05-31 14:05 ` Krzysztof Kozlowski
  2021-05-31 15:42   ` Paolo Bonzini
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 3/3] x86/kvm: Disable all PV features on crash Krzysztof Kozlowski
  2021-06-01  6:40 ` [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation Greg KH
  3 siblings, 1 reply; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-05-31 14:05 UTC (permalink / raw
  To: stable; +Cc: Andrea Righi, Paolo Bonzini, Vitaly Kuznetsov,
	Krzysztof Kozlowski

From: Vitaly Kuznetsov <vkuznets@redhat.com>

commit c02027b5742b5aa804ef08a4a9db433295533046 upstream.

Currenly, we disable kvmclock from machine_shutdown() hook and this
only happens for boot CPU. We need to disable it for all CPUs to
guard against memory corruption e.g. on restore from hibernate.

Note, writing '0' to kvmclock MSR doesn't clear memory location, it
just prevents hypervisor from updating the location so for the short
while after write and while CPU is still alive, the clock remains usable
and correct so we don't need to switch to some other clocksource.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210414123544.1060604-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
---
 arch/x86/include/asm/kvm_para.h | 4 ++--
 arch/x86/kernel/kvm.c           | 1 +
 arch/x86/kernel/kvmclock.c      | 5 +----
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 338119852512..9c56e0defd45 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -7,8 +7,6 @@
 #include <linux/interrupt.h>
 #include <uapi/asm/kvm_para.h>
 
-extern void kvmclock_init(void);
-
 #ifdef CONFIG_KVM_GUEST
 bool kvm_check_and_clear_guest_paused(void);
 #else
@@ -86,6 +84,8 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
 }
 
 #ifdef CONFIG_KVM_GUEST
+void kvmclock_init(void);
+void kvmclock_disable(void);
 bool kvm_para_available(void);
 unsigned int kvm_arch_para_features(void);
 unsigned int kvm_arch_para_hints(void);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 6af3f9c3956c..be1c42e663c6 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -468,6 +468,7 @@ static void kvm_guest_cpu_offline(void)
 		wrmsrl(MSR_KVM_PV_EOI_EN, 0);
 	kvm_pv_disable_apf();
 	apf_task_wake_all();
+	kvmclock_disable();
 }
 
 static int kvm_cpu_online(unsigned int cpu)
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 5ee705b44560..327a0de01066 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -221,11 +221,9 @@ static void kvm_crash_shutdown(struct pt_regs *regs)
 }
 #endif
 
-static void kvm_shutdown(void)
+void kvmclock_disable(void)
 {
 	native_write_msr(msr_kvm_system_time, 0, 0);
-	kvm_disable_steal_time();
-	native_machine_shutdown();
 }
 
 static void __init kvmclock_init_mem(void)
@@ -352,7 +350,6 @@ void __init kvmclock_init(void)
 #endif
 	x86_platform.save_sched_clock_state = kvm_save_sched_clock_state;
 	x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state;
-	machine_ops.shutdown  = kvm_shutdown;
 #ifdef CONFIG_KEXEC_CORE
 	machine_ops.crash_shutdown  = kvm_crash_shutdown;
 #endif
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v3 | stable v5.10 3/3] x86/kvm: Disable all PV features on crash
  2021-05-31 14:05 [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation Krzysztof Kozlowski
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 1/3] x86/kvm: Teardown PV features on boot CPU as well Krzysztof Kozlowski
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 2/3] x86/kvm: Disable kvmclock on all CPUs on shutdown Krzysztof Kozlowski
@ 2021-05-31 14:05 ` Krzysztof Kozlowski
  2021-05-31 15:42   ` Paolo Bonzini
  2021-06-01  6:40 ` [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation Greg KH
  3 siblings, 1 reply; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-05-31 14:05 UTC (permalink / raw
  To: stable; +Cc: Andrea Righi, Paolo Bonzini, Vitaly Kuznetsov,
	Krzysztof Kozlowski

From: Vitaly Kuznetsov <vkuznets@redhat.com>

commit 3d6b84132d2a57b5a74100f6923a8feb679ac2ce upstream.

Crash shutdown handler only disables kvmclock and steal time, other PV
features remain active so we risk corrupting memory or getting some
side-effects in kdump kernel. Move crash handler to kvm.c and unify
with CPU offline.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210414123544.1060604-5-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
---
 arch/x86/include/asm/kvm_para.h |  6 -----
 arch/x86/kernel/kvm.c           | 44 ++++++++++++++++++++++++---------
 arch/x86/kernel/kvmclock.c      | 21 ----------------
 3 files changed, 32 insertions(+), 39 deletions(-)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 9c56e0defd45..69299878b200 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -92,7 +92,6 @@ unsigned int kvm_arch_para_hints(void);
 void kvm_async_pf_task_wait_schedule(u32 token);
 void kvm_async_pf_task_wake(u32 token);
 u32 kvm_read_and_reset_apf_flags(void);
-void kvm_disable_steal_time(void);
 bool __kvm_handle_async_pf(struct pt_regs *regs, u32 token);
 
 DECLARE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
@@ -137,11 +136,6 @@ static inline u32 kvm_read_and_reset_apf_flags(void)
 	return 0;
 }
 
-static inline void kvm_disable_steal_time(void)
-{
-	return;
-}
-
 static __always_inline bool kvm_handle_async_pf(struct pt_regs *regs, u32 token)
 {
 	return false;
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index be1c42e663c6..7462b79c39de 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -38,6 +38,7 @@
 #include <asm/tlb.h>
 #include <asm/cpuidle_haltpoll.h>
 #include <asm/ptrace.h>
+#include <asm/reboot.h>
 #include <asm/svm.h>
 
 DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
@@ -375,6 +376,14 @@ static void kvm_pv_disable_apf(void)
 	pr_info("Unregister pv shared memory for cpu %d\n", smp_processor_id());
 }
 
+static void kvm_disable_steal_time(void)
+{
+	if (!has_steal_clock)
+		return;
+
+	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
+}
+
 static void kvm_pv_guest_cpu_reboot(void *unused)
 {
 	/*
@@ -417,14 +426,6 @@ static u64 kvm_steal_clock(int cpu)
 	return steal;
 }
 
-void kvm_disable_steal_time(void)
-{
-	if (!has_steal_clock)
-		return;
-
-	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
-}
-
 static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
 {
 	early_set_memory_decrypted((unsigned long) ptr, size);
@@ -461,13 +462,14 @@ static bool pv_tlb_flush_supported(void)
 
 static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask);
 
-static void kvm_guest_cpu_offline(void)
+static void kvm_guest_cpu_offline(bool shutdown)
 {
 	kvm_disable_steal_time();
 	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
 		wrmsrl(MSR_KVM_PV_EOI_EN, 0);
 	kvm_pv_disable_apf();
-	apf_task_wake_all();
+	if (!shutdown)
+		apf_task_wake_all();
 	kvmclock_disable();
 }
 
@@ -613,7 +615,7 @@ static int kvm_cpu_down_prepare(unsigned int cpu)
 	unsigned long flags;
 
 	local_irq_save(flags);
-	kvm_guest_cpu_offline();
+	kvm_guest_cpu_offline(false);
 	local_irq_restore(flags);
 	return 0;
 }
@@ -622,7 +624,7 @@ static int kvm_cpu_down_prepare(unsigned int cpu)
 
 static int kvm_suspend(void)
 {
-	kvm_guest_cpu_offline();
+	kvm_guest_cpu_offline(false);
 
 	return 0;
 }
@@ -637,6 +639,20 @@ static struct syscore_ops kvm_syscore_ops = {
 	.resume		= kvm_resume,
 };
 
+/*
+ * After a PV feature is registered, the host will keep writing to the
+ * registered memory location. If the guest happens to shutdown, this memory
+ * won't be valid. In cases like kexec, in which you install a new kernel, this
+ * means a random memory location will be kept being written.
+ */
+#ifdef CONFIG_KEXEC_CORE
+static void kvm_crash_shutdown(struct pt_regs *regs)
+{
+	kvm_guest_cpu_offline(true);
+	native_machine_crash_shutdown(regs);
+}
+#endif
+
 static void kvm_flush_tlb_others(const struct cpumask *cpumask,
 			const struct flush_tlb_info *info)
 {
@@ -705,6 +721,10 @@ static void __init kvm_guest_init(void)
 	kvm_guest_cpu_init();
 #endif
 
+#ifdef CONFIG_KEXEC_CORE
+	machine_ops.crash_shutdown = kvm_crash_shutdown;
+#endif
+
 	register_syscore_ops(&kvm_syscore_ops);
 
 	/*
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 327a0de01066..c4ac26333bc4 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -20,7 +20,6 @@
 #include <asm/hypervisor.h>
 #include <asm/mem_encrypt.h>
 #include <asm/x86_init.h>
-#include <asm/reboot.h>
 #include <asm/kvmclock.h>
 
 static int kvmclock __initdata = 1;
@@ -204,23 +203,6 @@ static void kvm_setup_secondary_clock(void)
 }
 #endif
 
-/*
- * After the clock is registered, the host will keep writing to the
- * registered memory location. If the guest happens to shutdown, this memory
- * won't be valid. In cases like kexec, in which you install a new kernel, this
- * means a random memory location will be kept being written. So before any
- * kind of shutdown from our side, we unregister the clock by writing anything
- * that does not have the 'enable' bit set in the msr
- */
-#ifdef CONFIG_KEXEC_CORE
-static void kvm_crash_shutdown(struct pt_regs *regs)
-{
-	native_write_msr(msr_kvm_system_time, 0, 0);
-	kvm_disable_steal_time();
-	native_machine_crash_shutdown(regs);
-}
-#endif
-
 void kvmclock_disable(void)
 {
 	native_write_msr(msr_kvm_system_time, 0, 0);
@@ -350,9 +332,6 @@ void __init kvmclock_init(void)
 #endif
 	x86_platform.save_sched_clock_state = kvm_save_sched_clock_state;
 	x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state;
-#ifdef CONFIG_KEXEC_CORE
-	machine_ops.crash_shutdown  = kvm_crash_shutdown;
-#endif
 	kvm_get_preset_lpj();
 
 	/*
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 | stable v5.10 1/3] x86/kvm: Teardown PV features on boot CPU as well
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 1/3] x86/kvm: Teardown PV features on boot CPU as well Krzysztof Kozlowski
@ 2021-05-31 15:42   ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2021-05-31 15:42 UTC (permalink / raw
  To: Krzysztof Kozlowski, stable; +Cc: Andrea Righi, Vitaly Kuznetsov

On 31/05/21 16:05, Krzysztof Kozlowski wrote:
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> commit 8b79feffeca28c5459458fe78676b081e87c93a4 upstream.
> 
> Various PV features (Async PF, PV EOI, steal time) work through memory
> shared with hypervisor and when we restore from hibernation we must
> properly teardown all these features to make sure hypervisor doesn't
> write to stale locations after we jump to the previously hibernated kernel
> (which can try to place anything there). For secondary CPUs the job is
> already done by kvm_cpu_down_prepare(), register syscore ops to do
> the same for boot CPU.
> 
> Krzysztof:
> This fixes memory corruption visible after second resume from
> hibernation:
> 
>    BUG: Bad page state in process dbus-daemon  pfn:18b01
>    page:ffffea000062c040 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 compound_mapcount: -30591
>    flags: 0xfffffc0078141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)
>    raw: 000fffffc0078141 dead0000000002d0 dead000000000100 0000000000000000
>    raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>    page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
>    bad because of flags: 0x78141(locked|error|workingset|writeback|head|mappedtodisk|reclaim)
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Message-Id: <20210414123544.1060604-3-vkuznets@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
> [krzysztof: Extend the commit message, adjust for v5.10 context]
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
> ---
>   arch/x86/kernel/kvm.c | 57 +++++++++++++++++++++++++++++++------------
>   1 file changed, 41 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 7f57ede3cb8e..6af3f9c3956c 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -26,6 +26,7 @@
>   #include <linux/kprobes.h>
>   #include <linux/nmi.h>
>   #include <linux/swait.h>
> +#include <linux/syscore_ops.h>
>   #include <asm/timer.h>
>   #include <asm/cpu.h>
>   #include <asm/traps.h>
> @@ -460,6 +461,25 @@ static bool pv_tlb_flush_supported(void)
>   
>   static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask);
>   
> +static void kvm_guest_cpu_offline(void)
> +{
> +	kvm_disable_steal_time();
> +	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
> +		wrmsrl(MSR_KVM_PV_EOI_EN, 0);
> +	kvm_pv_disable_apf();
> +	apf_task_wake_all();
> +}
> +
> +static int kvm_cpu_online(unsigned int cpu)
> +{
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	kvm_guest_cpu_init();
> +	local_irq_restore(flags);
> +	return 0;
> +}
> +
>   #ifdef CONFIG_SMP
>   
>   static bool pv_ipi_supported(void)
> @@ -587,31 +607,34 @@ static void __init kvm_smp_prepare_boot_cpu(void)
>   	kvm_spinlock_init();
>   }
>   
> -static void kvm_guest_cpu_offline(void)
> +static int kvm_cpu_down_prepare(unsigned int cpu)
>   {
> -	kvm_disable_steal_time();
> -	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
> -		wrmsrl(MSR_KVM_PV_EOI_EN, 0);
> -	kvm_pv_disable_apf();
> -	apf_task_wake_all();
> -}
> +	unsigned long flags;
>   
> -static int kvm_cpu_online(unsigned int cpu)
> -{
> -	local_irq_disable();
> -	kvm_guest_cpu_init();
> -	local_irq_enable();
> +	local_irq_save(flags);
> +	kvm_guest_cpu_offline();
> +	local_irq_restore(flags);
>   	return 0;
>   }
>   
> -static int kvm_cpu_down_prepare(unsigned int cpu)
> +#endif
> +
> +static int kvm_suspend(void)
>   {
> -	local_irq_disable();
>   	kvm_guest_cpu_offline();
> -	local_irq_enable();
> +
>   	return 0;
>   }
> -#endif
> +
> +static void kvm_resume(void)
> +{
> +	kvm_cpu_online(raw_smp_processor_id());
> +}
> +
> +static struct syscore_ops kvm_syscore_ops = {
> +	.suspend	= kvm_suspend,
> +	.resume		= kvm_resume,
> +};
>   
>   static void kvm_flush_tlb_others(const struct cpumask *cpumask,
>   			const struct flush_tlb_info *info)
> @@ -681,6 +704,8 @@ static void __init kvm_guest_init(void)
>   	kvm_guest_cpu_init();
>   #endif
>   
> +	register_syscore_ops(&kvm_syscore_ops);
> +
>   	/*
>   	 * Hard lockup detection is enabled by default. Disable it, as guests
>   	 * can get false positives too easily, for example if the host is
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 | stable v5.10 2/3] x86/kvm: Disable kvmclock on all CPUs on shutdown
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 2/3] x86/kvm: Disable kvmclock on all CPUs on shutdown Krzysztof Kozlowski
@ 2021-05-31 15:42   ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2021-05-31 15:42 UTC (permalink / raw
  To: Krzysztof Kozlowski, stable; +Cc: Andrea Righi, Vitaly Kuznetsov

On 31/05/21 16:05, Krzysztof Kozlowski wrote:
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> commit c02027b5742b5aa804ef08a4a9db433295533046 upstream.
> 
> Currenly, we disable kvmclock from machine_shutdown() hook and this
> only happens for boot CPU. We need to disable it for all CPUs to
> guard against memory corruption e.g. on restore from hibernate.
> 
> Note, writing '0' to kvmclock MSR doesn't clear memory location, it
> just prevents hypervisor from updating the location so for the short
> while after write and while CPU is still alive, the clock remains usable
> and correct so we don't need to switch to some other clocksource.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Message-Id: <20210414123544.1060604-4-vkuznets@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
> ---
>   arch/x86/include/asm/kvm_para.h | 4 ++--
>   arch/x86/kernel/kvm.c           | 1 +
>   arch/x86/kernel/kvmclock.c      | 5 +----
>   3 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
> index 338119852512..9c56e0defd45 100644
> --- a/arch/x86/include/asm/kvm_para.h
> +++ b/arch/x86/include/asm/kvm_para.h
> @@ -7,8 +7,6 @@
>   #include <linux/interrupt.h>
>   #include <uapi/asm/kvm_para.h>
>   
> -extern void kvmclock_init(void);
> -
>   #ifdef CONFIG_KVM_GUEST
>   bool kvm_check_and_clear_guest_paused(void);
>   #else
> @@ -86,6 +84,8 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
>   }
>   
>   #ifdef CONFIG_KVM_GUEST
> +void kvmclock_init(void);
> +void kvmclock_disable(void);
>   bool kvm_para_available(void);
>   unsigned int kvm_arch_para_features(void);
>   unsigned int kvm_arch_para_hints(void);
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 6af3f9c3956c..be1c42e663c6 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -468,6 +468,7 @@ static void kvm_guest_cpu_offline(void)
>   		wrmsrl(MSR_KVM_PV_EOI_EN, 0);
>   	kvm_pv_disable_apf();
>   	apf_task_wake_all();
> +	kvmclock_disable();
>   }
>   
>   static int kvm_cpu_online(unsigned int cpu)
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 5ee705b44560..327a0de01066 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -221,11 +221,9 @@ static void kvm_crash_shutdown(struct pt_regs *regs)
>   }
>   #endif
>   
> -static void kvm_shutdown(void)
> +void kvmclock_disable(void)
>   {
>   	native_write_msr(msr_kvm_system_time, 0, 0);
> -	kvm_disable_steal_time();
> -	native_machine_shutdown();
>   }
>   
>   static void __init kvmclock_init_mem(void)
> @@ -352,7 +350,6 @@ void __init kvmclock_init(void)
>   #endif
>   	x86_platform.save_sched_clock_state = kvm_save_sched_clock_state;
>   	x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state;
> -	machine_ops.shutdown  = kvm_shutdown;
>   #ifdef CONFIG_KEXEC_CORE
>   	machine_ops.crash_shutdown  = kvm_crash_shutdown;
>   #endif
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 | stable v5.10 3/3] x86/kvm: Disable all PV features on crash
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 3/3] x86/kvm: Disable all PV features on crash Krzysztof Kozlowski
@ 2021-05-31 15:42   ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2021-05-31 15:42 UTC (permalink / raw
  To: Krzysztof Kozlowski, stable; +Cc: Andrea Righi, Vitaly Kuznetsov

On 31/05/21 16:05, Krzysztof Kozlowski wrote:
> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> commit 3d6b84132d2a57b5a74100f6923a8feb679ac2ce upstream.
> 
> Crash shutdown handler only disables kvmclock and steal time, other PV
> features remain active so we risk corrupting memory or getting some
> side-effects in kdump kernel. Move crash handler to kvm.c and unify
> with CPU offline.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Message-Id: <20210414123544.1060604-5-vkuznets@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
> ---
>   arch/x86/include/asm/kvm_para.h |  6 -----
>   arch/x86/kernel/kvm.c           | 44 ++++++++++++++++++++++++---------
>   arch/x86/kernel/kvmclock.c      | 21 ----------------
>   3 files changed, 32 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
> index 9c56e0defd45..69299878b200 100644
> --- a/arch/x86/include/asm/kvm_para.h
> +++ b/arch/x86/include/asm/kvm_para.h
> @@ -92,7 +92,6 @@ unsigned int kvm_arch_para_hints(void);
>   void kvm_async_pf_task_wait_schedule(u32 token);
>   void kvm_async_pf_task_wake(u32 token);
>   u32 kvm_read_and_reset_apf_flags(void);
> -void kvm_disable_steal_time(void);
>   bool __kvm_handle_async_pf(struct pt_regs *regs, u32 token);
>   
>   DECLARE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
> @@ -137,11 +136,6 @@ static inline u32 kvm_read_and_reset_apf_flags(void)
>   	return 0;
>   }
>   
> -static inline void kvm_disable_steal_time(void)
> -{
> -	return;
> -}
> -
>   static __always_inline bool kvm_handle_async_pf(struct pt_regs *regs, u32 token)
>   {
>   	return false;
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index be1c42e663c6..7462b79c39de 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -38,6 +38,7 @@
>   #include <asm/tlb.h>
>   #include <asm/cpuidle_haltpoll.h>
>   #include <asm/ptrace.h>
> +#include <asm/reboot.h>
>   #include <asm/svm.h>
>   
>   DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
> @@ -375,6 +376,14 @@ static void kvm_pv_disable_apf(void)
>   	pr_info("Unregister pv shared memory for cpu %d\n", smp_processor_id());
>   }
>   
> +static void kvm_disable_steal_time(void)
> +{
> +	if (!has_steal_clock)
> +		return;
> +
> +	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
> +}
> +
>   static void kvm_pv_guest_cpu_reboot(void *unused)
>   {
>   	/*
> @@ -417,14 +426,6 @@ static u64 kvm_steal_clock(int cpu)
>   	return steal;
>   }
>   
> -void kvm_disable_steal_time(void)
> -{
> -	if (!has_steal_clock)
> -		return;
> -
> -	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
> -}
> -
>   static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
>   {
>   	early_set_memory_decrypted((unsigned long) ptr, size);
> @@ -461,13 +462,14 @@ static bool pv_tlb_flush_supported(void)
>   
>   static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask);
>   
> -static void kvm_guest_cpu_offline(void)
> +static void kvm_guest_cpu_offline(bool shutdown)
>   {
>   	kvm_disable_steal_time();
>   	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
>   		wrmsrl(MSR_KVM_PV_EOI_EN, 0);
>   	kvm_pv_disable_apf();
> -	apf_task_wake_all();
> +	if (!shutdown)
> +		apf_task_wake_all();
>   	kvmclock_disable();
>   }
>   
> @@ -613,7 +615,7 @@ static int kvm_cpu_down_prepare(unsigned int cpu)
>   	unsigned long flags;
>   
>   	local_irq_save(flags);
> -	kvm_guest_cpu_offline();
> +	kvm_guest_cpu_offline(false);
>   	local_irq_restore(flags);
>   	return 0;
>   }
> @@ -622,7 +624,7 @@ static int kvm_cpu_down_prepare(unsigned int cpu)
>   
>   static int kvm_suspend(void)
>   {
> -	kvm_guest_cpu_offline();
> +	kvm_guest_cpu_offline(false);
>   
>   	return 0;
>   }
> @@ -637,6 +639,20 @@ static struct syscore_ops kvm_syscore_ops = {
>   	.resume		= kvm_resume,
>   };
>   
> +/*
> + * After a PV feature is registered, the host will keep writing to the
> + * registered memory location. If the guest happens to shutdown, this memory
> + * won't be valid. In cases like kexec, in which you install a new kernel, this
> + * means a random memory location will be kept being written.
> + */
> +#ifdef CONFIG_KEXEC_CORE
> +static void kvm_crash_shutdown(struct pt_regs *regs)
> +{
> +	kvm_guest_cpu_offline(true);
> +	native_machine_crash_shutdown(regs);
> +}
> +#endif
> +
>   static void kvm_flush_tlb_others(const struct cpumask *cpumask,
>   			const struct flush_tlb_info *info)
>   {
> @@ -705,6 +721,10 @@ static void __init kvm_guest_init(void)
>   	kvm_guest_cpu_init();
>   #endif
>   
> +#ifdef CONFIG_KEXEC_CORE
> +	machine_ops.crash_shutdown = kvm_crash_shutdown;
> +#endif
> +
>   	register_syscore_ops(&kvm_syscore_ops);
>   
>   	/*
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 327a0de01066..c4ac26333bc4 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -20,7 +20,6 @@
>   #include <asm/hypervisor.h>
>   #include <asm/mem_encrypt.h>
>   #include <asm/x86_init.h>
> -#include <asm/reboot.h>
>   #include <asm/kvmclock.h>
>   
>   static int kvmclock __initdata = 1;
> @@ -204,23 +203,6 @@ static void kvm_setup_secondary_clock(void)
>   }
>   #endif
>   
> -/*
> - * After the clock is registered, the host will keep writing to the
> - * registered memory location. If the guest happens to shutdown, this memory
> - * won't be valid. In cases like kexec, in which you install a new kernel, this
> - * means a random memory location will be kept being written. So before any
> - * kind of shutdown from our side, we unregister the clock by writing anything
> - * that does not have the 'enable' bit set in the msr
> - */
> -#ifdef CONFIG_KEXEC_CORE
> -static void kvm_crash_shutdown(struct pt_regs *regs)
> -{
> -	native_write_msr(msr_kvm_system_time, 0, 0);
> -	kvm_disable_steal_time();
> -	native_machine_crash_shutdown(regs);
> -}
> -#endif
> -
>   void kvmclock_disable(void)
>   {
>   	native_write_msr(msr_kvm_system_time, 0, 0);
> @@ -350,9 +332,6 @@ void __init kvmclock_init(void)
>   #endif
>   	x86_platform.save_sched_clock_state = kvm_save_sched_clock_state;
>   	x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state;
> -#ifdef CONFIG_KEXEC_CORE
> -	machine_ops.crash_shutdown  = kvm_crash_shutdown;
> -#endif
>   	kvm_get_preset_lpj();
>   
>   	/*
> 

Acked-by: Paolo Bonzini <pbonzini@redhat.com>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation
  2021-05-31 14:05 [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation Krzysztof Kozlowski
                   ` (2 preceding siblings ...)
  2021-05-31 14:05 ` [PATCH v3 | stable v5.10 3/3] x86/kvm: Disable all PV features on crash Krzysztof Kozlowski
@ 2021-06-01  6:40 ` Greg KH
  2021-06-01  7:02   ` Krzysztof Kozlowski
  3 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2021-06-01  6:40 UTC (permalink / raw
  To: Krzysztof Kozlowski; +Cc: stable, Andrea Righi, Paolo Bonzini, Vitaly Kuznetsov

On Mon, May 31, 2021 at 04:05:23PM +0200, Krzysztof Kozlowski wrote:
> Hi,
> 
> This is version 2 of a backport for v5.10.
> 
> Changes since v2:
> 1. Rebased v3 of v5.4 version, I kept numbering to be consistent.
> 2. The context in patch 1/3 had to be adjusted.

I also need a 5.12.y version of this series before I can apply these to
older kernel releases.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation
  2021-06-01  6:40 ` [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation Greg KH
@ 2021-06-01  7:02   ` Krzysztof Kozlowski
  0 siblings, 0 replies; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-06-01  7:02 UTC (permalink / raw
  To: Greg KH; +Cc: stable, Andrea Righi, Paolo Bonzini, Vitaly Kuznetsov

On 01/06/2021 08:40, Greg KH wrote:
> On Mon, May 31, 2021 at 04:05:23PM +0200, Krzysztof Kozlowski wrote:
>> Hi,
>>
>> This is version 2 of a backport for v5.10.
>>
>> Changes since v2:
>> 1. Rebased v3 of v5.4 version, I kept numbering to be consistent.
>> 2. The context in patch 1/3 had to be adjusted.
> 
> I also need a 5.12.y version of this series before I can apply these to
> older kernel releases.

The cherry-pick from v5.10 would work although apply might fail due to
context difference. I'll sent shortly a v5.12 port.


Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-06-01  7:02 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-05-31 14:05 [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation Krzysztof Kozlowski
2021-05-31 14:05 ` [PATCH v3 | stable v5.10 1/3] x86/kvm: Teardown PV features on boot CPU as well Krzysztof Kozlowski
2021-05-31 15:42   ` Paolo Bonzini
2021-05-31 14:05 ` [PATCH v3 | stable v5.10 2/3] x86/kvm: Disable kvmclock on all CPUs on shutdown Krzysztof Kozlowski
2021-05-31 15:42   ` Paolo Bonzini
2021-05-31 14:05 ` [PATCH v3 | stable v5.10 3/3] x86/kvm: Disable all PV features on crash Krzysztof Kozlowski
2021-05-31 15:42   ` Paolo Bonzini
2021-06-01  6:40 ` [PATCH v3 | stable v5.10 0/3] x86/kvm: fixes for hibernation Greg KH
2021-06-01  7:02   ` Krzysztof Kozlowski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.