KVM Archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	Leonardo Bras <leobras@redhat.com>,
	 Paolo Bonzini <pbonzini@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	 Neeraj Upadhyay <quic_neeraju@quicinc.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	 Josh Triplett <josh@joshtriplett.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	 Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	 Lai Jiangshan <jiangshanlai@gmail.com>,
	Zqiang <qiang.zhang1211@gmail.com>,
	kvm@vger.kernel.org,  linux-kernel@vger.kernel.org,
	rcu@vger.kernel.org
Subject: Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu
Date: Mon, 8 Apr 2024 14:56:29 -0700	[thread overview]
Message-ID: <ZhRoDfoz-YqsGhIB@google.com> (raw)
In-Reply-To: <44eb0d36-7454-41e7-9a16-ce92a88e568c@paulmck-laptop>

On Mon, Apr 08, 2024, Paul E. McKenney wrote:
> On Mon, Apr 08, 2024 at 01:06:00PM -0700, Sean Christopherson wrote:
> > On Mon, Apr 08, 2024, Paul E. McKenney wrote:
> > > > > > +	if (vcpu->wants_to_run)
> > > > > > +		context_tracking_guest_start_run_loop();
> > > > > 
> > > > > At this point, if this is a nohz_full CPU, it will no longer report
> > > > > quiescent states until the grace period is at least one second old.
> > > > 
> > > > I don't think I follow the "will no longer report quiescent states" issue.  Are
> > > > you saying that this would prevent guest_context_enter_irqoff() from reporting
> > > > that the CPU is entering a quiescent state?  If so, that's an issue that would
> > > > need to be resolved regardless of what heuristic we use to determine whether or
> > > > not a CPU is likely to enter a KVM guest.
> > > 
> > > Please allow me to start over.  Are interrupts disabled at this point,
> > 
> > Nope, IRQs are enabled.
> > 
> > Oof, I'm glad you asked, because I was going to say that there's one exception,
> > kvm_sched_in(), which is KVM's notifier for when a preempted task/vCPU is scheduled
> > back in.  But I forgot that kvm_sched_{in,out}() don't use vcpu_{load,put}(),
> > i.e. would need explicit calls to context_tracking_guest_{stop,start}_run_loop().
> > 
> > > and, if so, will they remain disabled until the transfer of control to
> > > the guest has become visible to RCU via the context-tracking code?
> > > 
> > > Or has the context-tracking code already made the transfer of control
> > > to the guest visible to RCU?
> > 
> > Nope.  The call to __ct_user_enter(CONTEXT_GUEST) or rcu_virt_note_context_switch()
> > happens later, just before the actual VM-Enter.  And that call does happen with
> > IRQs disabled (and IRQs stay disabled until the CPU enters the guest).
> 
> OK, then we can have difficulties with long-running interrupts hitting
> this range of code.  It is unfortunately not unheard-of for interrupts
> plus trailing softirqs to run for tens of seconds, even minutes.

Ah, and if that occurs, *and* KVM is slow to re-enter the guest, then there will
be a massive lag before the CPU gets back into a quiescent state.

> One counter-argument is that that softirq would take scheduling-clock
> interrupts, and would eventually make rcu_core() run.

Considering that this behavior would be unique to nohz_full CPUs, how much
responsibility does RCU have to ensure a sane setup?  E.g. if a softirq runs for
multiple seconds on a nohz_full CPU whose primary role is to run a KVM vCPU, then
whatever real-time workaround the vCPU is running is already doomed.

> But does a rcu_sched_clock_irq() from a guest OS have its "user"
> argument set?

No, and it shouldn't, at least not on x86 (I assume other architectures are
similar, but I don't actually no for sure).

On x86, the IRQ that the kernel sees comes looks like it comes from host kernel
code.  And on AMD (SVM), the IRQ doesn't just "look" like it came from host kernel,
the IRQ really does get vectored/handled in the host kernel.  Intel CPUs have a
performance optimization where the IRQ gets "eaten" as part of the VM-Exit, and
so KVM synthesizes a stack frame and does a manual CALL to invoke the IRQ handler.

And that's just for IRQs that actually arrive while the guest is running.  IRQs
arrive while KVM is active, e.g. running its large vcpu_run(), are "pure" host
IRQs.

  reply	other threads:[~2024-04-08 21:56 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-28 17:19 [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu Leonardo Bras
2024-03-28 17:19 ` [RFC PATCH v1 1/2] kvm: Implement guest_exit_last_time() Leonardo Bras
2024-03-28 17:19 ` [RFC PATCH v1 2/2] rcu: Ignore RCU in nohz_full cpus if it was running a guest recently Leonardo Bras
2024-04-01 15:52   ` Paul E. McKenney
2024-04-01 20:21 ` [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu Sean Christopherson
2024-04-05 13:45   ` Marcelo Tosatti
2024-04-05 14:42     ` Sean Christopherson
2024-04-06  0:03       ` Paul E. McKenney
2024-04-08 17:16         ` Sean Christopherson
2024-04-08 18:42           ` Paul E. McKenney
2024-04-08 20:06             ` Sean Christopherson
2024-04-08 21:02               ` Paul E. McKenney
2024-04-08 21:56                 ` Sean Christopherson [this message]
2024-04-08 22:35                   ` Paul E. McKenney
2024-04-08 23:06                     ` Sean Christopherson
2024-04-08 23:20                       ` Paul E. McKenney
2024-04-10  2:39           ` Marcelo Tosatti
2024-04-15 19:47           ` Marcelo Tosatti
2024-04-15 21:29             ` Sean Christopherson
2024-04-16 12:36               ` Marcelo Tosatti
2024-04-16 14:07                 ` Sean Christopherson
2024-04-17 16:14                   ` Marcelo Tosatti
2024-04-17 17:22                     ` Sean Christopherson
2024-05-03 20:44                       ` Leonardo Bras
2024-05-06 18:47                         ` Marcelo Tosatti
2024-05-07 18:05                           ` Sean Christopherson
2024-05-07 22:36                             ` Leonardo Bras
2024-05-03 18:42   ` Leonardo Bras
2024-05-03 19:09     ` Leonardo Bras
2024-05-03 21:29     ` Sean Christopherson
2024-05-03 22:00       ` Leonardo Bras
2024-05-03 22:00       ` Paul E. McKenney
2024-05-07 17:55         ` Sean Christopherson
2024-05-07 19:15           ` Paul E. McKenney
2024-05-07 21:00             ` Sean Christopherson
2024-05-07 21:37               ` Paul E. McKenney
2024-05-07 23:47                 ` Sean Christopherson
2024-05-08  0:08                   ` Sean Christopherson
2024-05-08  2:51                     ` Leonardo Bras
2024-05-08  3:22                       ` Paul E. McKenney
2024-05-08  6:19                         ` Leonardo Bras
2024-05-08 14:01                           ` Sean Christopherson
2024-05-09  3:32                             ` Paul E. McKenney
2024-05-09  8:16                               ` Leonardo Bras
2024-05-09 10:14                                 ` Leonardo Bras
2024-05-09 23:45                                   ` Paul E. McKenney
2024-05-10 16:06                                     ` Leonardo Bras
2024-05-10 16:21                                       ` Paul E. McKenney
2024-05-10 17:12                                         ` Leonardo Bras
2024-05-10 17:41                                           ` Paul E. McKenney
2024-05-10 19:50                                             ` Leonardo Bras
2024-05-10 21:15                                               ` Leonardo Bras
2024-05-10 21:38                                                 ` Paul E. McKenney
2024-05-09 22:41                                 ` Paul E. McKenney
2024-05-09 23:07                                   ` Leonardo Bras Soares Passos
2024-05-11  2:08                             ` Leonardo Bras
2024-05-08  3:20                     ` Paul E. McKenney
2024-05-08  4:04                       ` Paul E. McKenney
2024-05-08 14:36                         ` Paul E. McKenney
2024-05-08 15:35                       ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZhRoDfoz-YqsGhIB@google.com \
    --to=seanjc@google.com \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kvm@vger.kernel.org \
    --cc=leobras@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mtosatti@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=qiang.zhang1211@gmail.com \
    --cc=quic_neeraju@quicinc.com \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).