From: Joel Fernandes <joel@joelfernandes.org>
To: paulmck@kernel.org
Cc: Frederic Weisbecker <frederic@kernel.org>, rcu@vger.kernel.org
Subject: Re: [BUG] Random intermittent boost failures (Was Re: [BUG] TREE04..)
Date: Fri, 15 Sep 2023 17:14:44 -0400
Message-ID: <CAEXW_YRbFetLk1psOjjdXL2JiuKTcXs1T3J8V0YQDZ41ZewwDg@mail.gmail.com>
In-Reply-To: <d8f43d7a-3e42-4be7-a8c9-dcbff2ca87ef@paulmck-laptop>

On Fri, Sep 15, 2023 at 12:57 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
[...]
> > > > > On the other hand, I came up with a real fix [1] and I am currently testing it.
> > > > > This is to fix a live lock between RT push and CPU hotplug's
> > > > > select_fallback_rq()-induced push. I am not sure if the fix works but I have
> > > > > some faith based on what I'm seeing in traces. Fingers crossed. I also feel
> > > > > the real fix is needed to prevent these issues even if we're able to hide it
> > > > > by halving the total rcutorture boost threads.
> > > >
> > > > So that fixed it without any changes to RCU. Below is the updated patch also
> > > > for the archives. Though I'm rewriting it slightly differently and testing
> > > > that more. The main thing I am doing in the new patch is I find that RT
> > > > should not select !cpu_active() CPUs since those have the scheduler turned
> > > > off. Though checking for cpu_dying() also works. I could not find any
> > > > instance where cpu_dying() != cpu_active() but there could be a tiny window
> > > > where that is true. Anyway, I'll make some noise with scheduler folks once I
> > > > have the new version of the patch tested.
> > > >
> > > > Also, halving the number of RT boost threads makes the failure less likely
> > > > but does not eliminate it. That is not too surprising, since the issue may
> > > > not be related to too many RT threads but rather to a lockup between
> > > > hotplug and RT.
> > >
> > > Again, looks promising!  When I get the non-RCU -rcu stuff moved to
> > > v6.6-rc1 and appropriately branched and tested, I will give it a go on
> > > the test setup here.
> >
> > Thanks a lot, and I have enclosed a simpler updated patch below which also
> > similarly shows very good results. This is the one I would like to test
> > more and send to scheduler folks. I'll send it out once I have it tested more
> > and also possibly after seeing your results (I am on vacation next week so
> > there's time).
>
> Much nicer!  This is just on current mainline, correct?

Yes, correct. I also applied it cleanly to all stable kernels for my
test rigs. Only 5.10 had a little merge conflict but it was trivially
fixed.

thanks,

 - Joel

Thread overview: 23+ messages
2023-09-10 20:14 [BUG] Random intermittent boost failures (Was Re: [BUG] TREE04..) Joel Fernandes
2023-09-10 21:16 ` Paul E. McKenney
2023-09-10 23:37   ` Joel Fernandes
2023-09-11  2:27     ` Joel Fernandes
2023-09-11  8:16       ` Paul E. McKenney
2023-09-11 13:17         ` Joel Fernandes
2023-09-11 13:49           ` Paul E. McKenney
2023-09-11 16:18             ` Joel Fernandes
2023-09-13 20:30         ` Joel Fernandes
2023-09-14 11:11           ` Paul E. McKenney
2023-09-14 13:13             ` Joel Fernandes
2023-09-14 15:23               ` Paul E. McKenney
2023-09-14 18:56                 ` Joel Fernandes
2023-09-14 21:53                   ` Joel Fernandes
2023-09-15  0:13                     ` Joel Fernandes
2023-09-15 11:33                       ` Joel Fernandes
2023-09-15 14:53                         ` Paul E. McKenney
2023-09-15 16:37                           ` Joel Fernandes
2023-09-15 16:57                             ` Paul E. McKenney
2023-09-15 21:14                               ` Joel Fernandes [this message]
2023-09-18  6:05                             ` Paul E. McKenney
2023-09-15 14:48                       ` Paul E. McKenney
2023-09-15 14:45                     ` Paul E. McKenney
