KVM Archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: kvm_vcpu_kick: Do not read potentially-stale vcpu->cpu
@ 2021-06-30  3:10 Venkatesh Srinivas
  2021-06-30 16:24 ` David Matlack
  0 siblings, 1 reply; 4+ messages in thread
From: Venkatesh Srinivas @ 2021-06-30  3:10 UTC (permalink / raw
  To: kvm, jmattson, seanjc, elver, dvyukov; +Cc: Venkatesh Srinivas

vcpu->cpu contains the last cpu a vcpu run/is running on;
kvm_vcpu_kick is used to 'kick' a guest vcpu by sending an IPI
to the last CPU; vcpu->cpu is updated on sched_in when a vcpu
moves to a new CPU, so it possible for the vcpu->cpu field to
be stale.

kvm_vcpu_kick also read vcpu->cpu early with a plain read,
while vcpu->cpu could be concurrently written. This caused
a data race, found by kcsan:

write to 0xffff8880274e8460 of 4 bytes by task 30718 on cpu 0:
 kvm_arch_vcpu_load arch/x86/kvm/x86.c:4018
 kvm_sched_in virt/kvm/kvm_main.c:4864

vs
 kvm_vcpu_kick virt/kvm/kvm_main.c:2909
 kvm_arch_async_page_present_queued arch/x86/kvm/x86.c:11287
 async_pf_execute virt/kvm/async_pf.c:79
 ...

Use a READ_ONCE to atomically read vcpu->cpu and avoid the
data race.

Found by kcsan & syzbot.

Signed-off-by: Venkatesh Srinivas <venkateshs@chromium.org>
---
 virt/kvm/kvm_main.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 46fb042837d2..0525f42afb7d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3058,16 +3058,18 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
  */
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 {
-	int me;
-	int cpu = vcpu->cpu;
+	int me, cpu;
 
 	if (kvm_vcpu_wake_up(vcpu))
 		return;
 
 	me = get_cpu();
-	if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
-		if (kvm_arch_vcpu_should_kick(vcpu))
+	if (kvm_arch_vcpu_should_kick(vcpu)) {
+		cpu = READ_ONCE(vcpu->cpu);
+		WARN_ON_ONCE(cpu == me);
+		if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
 			smp_send_reschedule(cpu);
+	}
 	put_cpu();
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: kvm_vcpu_kick: Do not read potentially-stale vcpu->cpu
  2021-06-30  3:10 [PATCH] KVM: kvm_vcpu_kick: Do not read potentially-stale vcpu->cpu Venkatesh Srinivas
@ 2021-06-30 16:24 ` David Matlack
  2021-07-09 17:47   ` Sean Christopherson
  0 siblings, 1 reply; 4+ messages in thread
From: David Matlack @ 2021-06-30 16:24 UTC (permalink / raw
  To: Venkatesh Srinivas; +Cc: kvm, jmattson, seanjc, elver, dvyukov

On Tue, Jun 29, 2021 at 08:10:37PM -0700, Venkatesh Srinivas wrote:
> vcpu->cpu contains the last cpu a vcpu run/is running on;
> kvm_vcpu_kick is used to 'kick' a guest vcpu by sending an IPI
> to the last CPU; vcpu->cpu is updated on sched_in when a vcpu
> moves to a new CPU, so it possible for the vcpu->cpu field to
> be stale.
> 
> kvm_vcpu_kick also read vcpu->cpu early with a plain read,
> while vcpu->cpu could be concurrently written. This caused
> a data race, found by kcsan:
> 
> write to 0xffff8880274e8460 of 4 bytes by task 30718 on cpu 0:
>  kvm_arch_vcpu_load arch/x86/kvm/x86.c:4018
>  kvm_sched_in virt/kvm/kvm_main.c:4864
> 
> vs
>  kvm_vcpu_kick virt/kvm/kvm_main.c:2909
>  kvm_arch_async_page_present_queued arch/x86/kvm/x86.c:11287
>  async_pf_execute virt/kvm/async_pf.c:79
>  ...
> 
> Use a READ_ONCE to atomically read vcpu->cpu and avoid the
> data race.
> 
> Found by kcsan & syzbot.
> 
> Signed-off-by: Venkatesh Srinivas <venkateshs@chromium.org>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  virt/kvm/kvm_main.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 46fb042837d2..0525f42afb7d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3058,16 +3058,18 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
>   */
>  void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
>  {
> -	int me;
> -	int cpu = vcpu->cpu;
> +	int me, cpu;
>  
>  	if (kvm_vcpu_wake_up(vcpu))
>  		return;
>  
>  	me = get_cpu();
> -	if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> -		if (kvm_arch_vcpu_should_kick(vcpu))
> +	if (kvm_arch_vcpu_should_kick(vcpu)) {
> +		cpu = READ_ONCE(vcpu->cpu);
> +		WARN_ON_ONCE(cpu == me);

nit: A comment here may be useful to explain the interaction with
kvm_arch_vcpu_should_kick(). Took me a minute to understand why you
added the warning.

> +		if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
>  			smp_send_reschedule(cpu);
> +	}
>  	put_cpu();
>  }
>  EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: kvm_vcpu_kick: Do not read potentially-stale vcpu->cpu
  2021-06-30 16:24 ` David Matlack
@ 2021-07-09 17:47   ` Sean Christopherson
  2021-07-13  1:45     ` Venkatesh Srinivas
  0 siblings, 1 reply; 4+ messages in thread
From: Sean Christopherson @ 2021-07-09 17:47 UTC (permalink / raw
  To: David Matlack; +Cc: Venkatesh Srinivas, kvm, jmattson, elver, dvyukov

On Wed, Jun 30, 2021, David Matlack wrote:
> On Tue, Jun 29, 2021 at 08:10:37PM -0700, Venkatesh Srinivas wrote:
> > vcpu->cpu contains the last cpu a vcpu run/is running on;
> > kvm_vcpu_kick is used to 'kick' a guest vcpu by sending an IPI
> > to the last CPU; vcpu->cpu is updated on sched_in when a vcpu
> > moves to a new CPU, so it possible for the vcpu->cpu field to
> > be stale.

This fails to document the actual bug being fixed, and why the fix is correct.
The fact that vcpu->cpu may be stale is itself not a bug, e.g. even with this
patch, the IPI can be sent to the "wrong", i.e. smp_send_reschedule() can still
consume a stale vcpu->cpu.

The bug that's being fixed is that grabbing the potentially-stale vcpu->cpu
_before_ kvm_arch_vcpu_should_kick() can cause KVM to send an IPI to the wrong
CPU _and_ let the vCPU run longer than intended.  The fix is correct because KVM
doesn't truly care about sending an IPI to the correct pCPU, it only cares about
about kicking the pCPU out of the guest.  If the vCPU has exited and been loaded
on a different pCPU after kvm_arch_vcpu_should_kick(), then its mission has been
accomplished and the IPI (to the wrong pCPU) is truly spurious.

> > kvm_vcpu_kick also read vcpu->cpu early with a plain read,
> > while vcpu->cpu could be concurrently written. This caused
> > a data race, found by kcsan:
> > 
> > write to 0xffff8880274e8460 of 4 bytes by task 30718 on cpu 0:
> >  kvm_arch_vcpu_load arch/x86/kvm/x86.c:4018
> >  kvm_sched_in virt/kvm/kvm_main.c:4864
> > 
> > vs
> >  kvm_vcpu_kick virt/kvm/kvm_main.c:2909
> >  kvm_arch_async_page_present_queued arch/x86/kvm/x86.c:11287
> >  async_pf_execute virt/kvm/async_pf.c:79
> >  ...
> > 
> > Use a READ_ONCE to atomically read vcpu->cpu and avoid the
> > data race.
> > 
> > Found by kcsan & syzbot.
> > 
> > Signed-off-by: Venkatesh Srinivas <venkateshs@chromium.org>

Suggested-by: Sean Christopherson <seanjc@google.com>

> Reviewed-by: David Matlack <dmatlack@google.com>
> 
> > ---
> >  virt/kvm/kvm_main.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 46fb042837d2..0525f42afb7d 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -3058,16 +3058,18 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
> >   */
> >  void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
> >  {
> > -	int me;
> > -	int cpu = vcpu->cpu;
> > +	int me, cpu;
> >  
> >  	if (kvm_vcpu_wake_up(vcpu))
> >  		return;
> >  
> >  	me = get_cpu();
> > -	if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> > -		if (kvm_arch_vcpu_should_kick(vcpu))
> > +	if (kvm_arch_vcpu_should_kick(vcpu)) {
> > +		cpu = READ_ONCE(vcpu->cpu);
> > +		WARN_ON_ONCE(cpu == me);
> 
> nit: A comment here may be useful to explain the interaction with
> kvm_arch_vcpu_should_kick(). Took me a minute to understand why you
> added the warning.

Agreed.

> > +		if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> >  			smp_send_reschedule(cpu);
> > +	}
> >  	put_cpu();
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
> > -- 
> > 2.30.2
> > 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: kvm_vcpu_kick: Do not read potentially-stale vcpu->cpu
  2021-07-09 17:47   ` Sean Christopherson
@ 2021-07-13  1:45     ` Venkatesh Srinivas
  0 siblings, 0 replies; 4+ messages in thread
From: Venkatesh Srinivas @ 2021-07-13  1:45 UTC (permalink / raw
  To: Sean Christopherson; +Cc: David Matlack, kvm, Jim Mattson, elver, dvyukov

On Fri, Jul 9, 2021 at 10:47 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Jun 30, 2021, David Matlack wrote:
> > On Tue, Jun 29, 2021 at 08:10:37PM -0700, Venkatesh Srinivas wrote:
> > > vcpu->cpu contains the last cpu a vcpu run/is running on;
> > > kvm_vcpu_kick is used to 'kick' a guest vcpu by sending an IPI
> > > to the last CPU; vcpu->cpu is updated on sched_in when a vcpu
> > > moves to a new CPU, so it possible for the vcpu->cpu field to
> > > be stale.
>
> This fails to document the actual bug being fixed, and why the fix is correct.
> The fact that vcpu->cpu may be stale is itself not a bug, e.g. even with this
> patch, the IPI can be sent to the "wrong", i.e. smp_send_reschedule() can still
> consume a stale vcpu->cpu.
>
> The bug that's being fixed is that grabbing the potentially-stale vcpu->cpu
> _before_ kvm_arch_vcpu_should_kick() can cause KVM to send an IPI to the wrong
> CPU _and_ let the vCPU run longer than intended.  The fix is correct because KVM
> doesn't truly care about sending an IPI to the correct pCPU, it only cares about
> about kicking the pCPU out of the guest.  If the vCPU has exited and been loaded
> on a different pCPU after kvm_arch_vcpu_should_kick(), then its mission has been
> accomplished and the IPI (to the wrong pCPU) is truly spurious.

Will improve the comment for V2.

> > > kvm_vcpu_kick also read vcpu->cpu early with a plain read,
> > > while vcpu->cpu could be concurrently written. This caused
> > > a data race, found by kcsan:
> > >
> > > write to 0xffff8880274e8460 of 4 bytes by task 30718 on cpu 0:
> > >  kvm_arch_vcpu_load arch/x86/kvm/x86.c:4018
> > >  kvm_sched_in virt/kvm/kvm_main.c:4864
> > >
> > > vs
> > >  kvm_vcpu_kick virt/kvm/kvm_main.c:2909
> > >  kvm_arch_async_page_present_queued arch/x86/kvm/x86.c:11287
> > >  async_pf_execute virt/kvm/async_pf.c:79
> > >  ...
> > >
> > > Use a READ_ONCE to atomically read vcpu->cpu and avoid the
> > > data race.
> > >
> > > Found by kcsan & syzbot.
> > >
> > > Signed-off-by: Venkatesh Srinivas <venkateshs@chromium.org>
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
>
> > Reviewed-by: David Matlack <dmatlack@google.com>
> >
> > > ---
> > >  virt/kvm/kvm_main.c | 10 ++++++----
> > >  1 file changed, 6 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > index 46fb042837d2..0525f42afb7d 100644
> > > --- a/virt/kvm/kvm_main.c
> > > +++ b/virt/kvm/kvm_main.c
> > > @@ -3058,16 +3058,18 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
> > >   */
> > >  void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
> > >  {
> > > -   int me;
> > > -   int cpu = vcpu->cpu;
> > > +   int me, cpu;
> > >
> > >     if (kvm_vcpu_wake_up(vcpu))
> > >             return;
> > >
> > >     me = get_cpu();
> > > -   if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> > > -           if (kvm_arch_vcpu_should_kick(vcpu))
> > > +   if (kvm_arch_vcpu_should_kick(vcpu)) {
> > > +           cpu = READ_ONCE(vcpu->cpu);
> > > +           WARN_ON_ONCE(cpu == me);
> >
> > nit: A comment here may be useful to explain the interaction with
> > kvm_arch_vcpu_should_kick(). Took me a minute to understand why you
> > added the warning.
>
> Agreed.

Will do for V2

> > > +           if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> > >                     smp_send_reschedule(cpu);
> > > +   }
> > >     put_cpu();
> > >  }
> > >  EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
> > > --
> > > 2.30.2
> > >

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-07-13  1:45 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-06-30  3:10 [PATCH] KVM: kvm_vcpu_kick: Do not read potentially-stale vcpu->cpu Venkatesh Srinivas
2021-06-30 16:24 ` David Matlack
2021-07-09 17:47   ` Sean Christopherson
2021-07-13  1:45     ` Venkatesh Srinivas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).