Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu

KVM Archive mirror
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Leonardo Bras <leobras@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	 Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <quic_neeraju@quicinc.com>,
	 Joel Fernandes <joel@joelfernandes.org>,
	Josh Triplett <josh@joshtriplett.org>,
	 Boqun Feng <boqun.feng@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	 Zqiang <qiang.zhang1211@gmail.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	kvm@vger.kernel.org,  linux-kernel@vger.kernel.org,
	rcu@vger.kernel.org
Subject: Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu
Date: Wed, 8 May 2024 08:35:03 -0700	[thread overview]
Message-ID: <Zjubp36yHVf01C16@google.com> (raw)
In-Reply-To: <ac66bb23-2955-41bf-b1f0-85adcc4628a0@paulmck-laptop>

On Tue, May 07, 2024, Paul E. McKenney wrote:
> On Tue, May 07, 2024 at 05:08:54PM -0700, Sean Christopherson wrote:
> > > > This is admittedly a bit indirect, but then again this is Linux-kernel
> > > > RCU that we are talking about.
> > > > 
> > > > > And I'm arguing that, since the @user check isn't bombproof, there's no reason to
> > > > > try to harden against every possible edge case in an equivalent @guest check,
> > > > > because it's unnecessary for kernel safety, thanks to the guardrails.
> > > > 
> > > > And the same argument above would also apply to an equivalent check for
> > > > execution in guest mode at the time of the interrupt.
> > > 
> > > This is partly why I was off in the weeds.  KVM cannot guarantee that the
> > > interrupt that leads to rcu_pending() actually interrupted the guest.  And the
> > > original patch didn't help at all, because a time-based check doesn't come
> > > remotely close to the guarantees that the @user check provides.
> 
> Nothing in the registers from the interrupted context permits that
> determination?

No, because the interrupt/call chain that reaches rcu_pending() actually originates
in KVM host code, not guest code.  I.e. the eventual IRET will return control to
KVM, not to the guest.

On AMD, the interrupt quite literally interrupts the host, not the guest.  AMD
CPUs don't actually acknowledge/consume the physical interrupt when the guest is
running, the CPU simply generates a VM-Exit that says "there's an interrupt pending".
It's up to software, i.e. KVM, to enable IRQs and handle (all!) pending interrupts.

Intel CPUs have a mode where the CPU fully acknowledges the interrupt and reports
the exact vector that caused the VM-Exit, but it's still up to software to invoke
the interrupt handler, i.e. the interrupt trampolines through KVM.

And before handling/forwarding the interrupt, KVM exits its quiescent state,
leaves its no-instrumention region, invokes tracepoitnes, etc.  So even my PF_VCPU
idea is _very_ different than the user/idle scenarios, where the interrupt really
truly does original from an extended quiescent state.

> > > > But if we do need RCU to be more aggressive about treating guest execution as
> > > > an RCU quiescent state within the host, that additional check would be an
> > > > excellent way of making that happen.
> > > 
> > > It's not clear to me that being more agressive is warranted.  If my understanding
> > > of the existing @user check is correct, we _could_ achieve similar functionality
> > > for vCPU tasks by defining a rule that KVM must never enter an RCU critical section
> > > with PF_VCPU set and IRQs enabled, and then rcu_pending() could check PF_VCPU.
> > > On x86, this would be relatively straightforward (hack-a-patch below), but I've
> > > no idea what it would look like on other architectures.
> 
> At first glance, this looks plausible.  I would guess that a real patch
> would have to be architecture dependent, and that could simply involve
> a Kconfig option (perhaps something like CONFIG_RCU_SENSE_GUEST), so
> that the check you add to rcu_pending is conditioned on something like
> IS_ENABLED(CONFIG_RCU_SENSE_GUEST).
> 
> There would also need to be a similar check in rcu_sched_clock_irq(),
> or maybe in rcu_flavor_sched_clock_irq(), to force a call to rcu_qs()
> in this situation.
> 
> > > But the value added isn't entirely clear to me, probably because I'm still missing
> > > something.  KVM will have *very* recently called __ct_user_exit(CONTEXT_GUEST) to
> > > note the transition from guest to host kernel.  Why isn't that a sufficient hook
> > > for RCU to infer grace period completion?
> 
> Agreed, unless we are sure we need the change, we should not make it.

+1.  And your comments about tracepoints, instrumentions, etc. makes me think
that trying to force the issue with PF_VCPU would be a bad idea.

     prev parent reply	other threads:[~2024-05-08 15:35 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-28 17:19 [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu Leonardo Bras
2024-03-28 17:19 ` [RFC PATCH v1 1/2] kvm: Implement guest_exit_last_time() Leonardo Bras
2024-03-28 17:19 ` [RFC PATCH v1 2/2] rcu: Ignore RCU in nohz_full cpus if it was running a guest recently Leonardo Bras
2024-04-01 15:52   ` Paul E. McKenney
2024-04-01 20:21 ` [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu Sean Christopherson
2024-04-05 13:45   ` Marcelo Tosatti
2024-04-05 14:42     ` Sean Christopherson
2024-04-06  0:03       ` Paul E. McKenney
2024-04-08 17:16         ` Sean Christopherson
2024-04-08 18:42           ` Paul E. McKenney
2024-04-08 20:06             ` Sean Christopherson
2024-04-08 21:02               ` Paul E. McKenney
2024-04-08 21:56                 ` Sean Christopherson
2024-04-08 22:35                   ` Paul E. McKenney
2024-04-08 23:06                     ` Sean Christopherson
2024-04-08 23:20                       ` Paul E. McKenney
2024-04-10  2:39           ` Marcelo Tosatti
2024-04-15 19:47           ` Marcelo Tosatti
2024-04-15 21:29             ` Sean Christopherson
2024-04-16 12:36               ` Marcelo Tosatti
2024-04-16 14:07                 ` Sean Christopherson
2024-04-17 16:14                   ` Marcelo Tosatti
2024-04-17 17:22                     ` Sean Christopherson
2024-05-03 20:44                       ` Leonardo Bras
2024-05-06 18:47                         ` Marcelo Tosatti
2024-05-07 18:05                           ` Sean Christopherson
2024-05-07 22:36                             ` Leonardo Bras
2024-05-03 18:42   ` Leonardo Bras
2024-05-03 19:09     ` Leonardo Bras
2024-05-03 21:29     ` Sean Christopherson
2024-05-03 22:00       ` Leonardo Bras
2024-05-03 22:00       ` Paul E. McKenney
2024-05-07 17:55         ` Sean Christopherson
2024-05-07 19:15           ` Paul E. McKenney
2024-05-07 21:00             ` Sean Christopherson
2024-05-07 21:37               ` Paul E. McKenney
2024-05-07 23:47                 ` Sean Christopherson
2024-05-08  0:08                   ` Sean Christopherson
2024-05-08  2:51                     ` Leonardo Bras
2024-05-08  3:22                       ` Paul E. McKenney
2024-05-08  6:19                         ` Leonardo Bras
2024-05-08 14:01                           ` Sean Christopherson
2024-05-09  3:32                             ` Paul E. McKenney
2024-05-09  8:16                               ` Leonardo Bras
2024-05-09 10:14                                 ` Leonardo Bras
2024-05-09 23:45                                   ` Paul E. McKenney
2024-05-10 16:06                                     ` Leonardo Bras
2024-05-10 16:21                                       ` Paul E. McKenney
2024-05-10 17:12                                         ` Leonardo Bras
2024-05-10 17:41                                           ` Paul E. McKenney
2024-05-10 19:50                                             ` Leonardo Bras
2024-05-10 21:15                                               ` Leonardo Bras
2024-05-10 21:38                                                 ` Paul E. McKenney
2024-05-09 22:41                                 ` Paul E. McKenney
2024-05-09 23:07                                   ` Leonardo Bras Soares Passos
2024-05-11  2:08                             ` Leonardo Bras
2024-05-08  3:20                     ` Paul E. McKenney
2024-05-08  4:04                       ` Paul E. McKenney
2024-05-08 14:36                         ` Paul E. McKenney
2024-05-08 15:35                       ` Sean Christopherson [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zjubp36yHVf01C16@google.com \
    --to=seanjc@google.com \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kvm@vger.kernel.org \
    --cc=leobras@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mtosatti@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=qiang.zhang1211@gmail.com \
    --cc=quic_neeraju@quicinc.com \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).