oe-lkp.lists.linux.dev archive mirror
From: Luis Chamberlain <mcgrof@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: kernel test robot <oliver.sang@intel.com>,
	Daniel Gomez <da.gomez@samsung.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	linux-kernel@vger.kernel.org,
	"gost.dev@samsung.com" <gost.dev@samsung.com>,
	Pankaj Raghav <p.raghav@samsung.com>
Subject: Re: [PATCH 1/2] test_xarray: add tests for advanced multi-index use
Date: Wed, 14 Feb 2024 18:15:24 -0800
Message-ID: <Zc1zvBpL6x1kpsfk@bombadil.infradead.org>
In-Reply-To: <ZbQW3PRAIw8e7m0m@casper.infradead.org>

On Fri, Jan 26, 2024 at 08:32:28PM +0000, Matthew Wilcox wrote:
> On Fri, Jan 26, 2024 at 12:04:44PM -0800, Luis Chamberlain wrote:
> > > We have a perfectly good system for "relaxing":
> > > 
> > >         xas_for_each_marked(&xas, page, end, PAGECACHE_TAG_DIRTY) {
> > >                 xas_set_mark(&xas, PAGECACHE_TAG_TOWRITE);
> > >                 if (++tagged % XA_CHECK_SCHED)
> > >                         continue;
> > > 
> > >                 xas_pause(&xas);
> > >                 xas_unlock_irq(&xas);
> > >                 cond_resched();
> > >                 xas_lock_irq(&xas);
> > >         }
> > 
> > And yet we can get a soft lockup with order 20 (1,048,576 entries).
> > Granted, busy looping over 1 million entries is insane, but it seems
> > the existing code may not be enough to avoid the soft lockup. Also,
> > cond_resched() may eventually be removed [0].
> 
> what?  you're in charge of when you sleep.  you can do this:
> 
> unsigned iter = 0;
> rcu_read_lock();
> xas_for_each(...) {
> 	...
> 	if (iter++ % XA_CHECK_SCHED)
> 		continue;
> 	xas_pause();
> 	rcu_read_unlock();
> 	rcu_read_lock();
> }
> rcu_read_unlock();
> 
> and that will get rid of the rcu warnings.  right?

The RCU splat is long gone as of my last iteration, which is now merged
in linux-next. What's left is just a soft lockup of over 22 seconds when
you disable preemption, enable RCU proving, and use 2 vCPUs. This can
happen, for instance, if we loop over test_get_entry() instead of using
the xas_for_each() API. Here we avoid xas_for_each() on purpose, since
part of the selftest's job is to not trust the xarray API but to test it.

In the simplest case, for instance, this is used:

check_xa_multi_store_adv_add(xa, base, order, &some_val);

for (i = 0; i < nrpages; i++)
	XA_BUG_ON(xa, test_get_entry(xa, base + i) != &some_val);

test_get_entry() does the RCU locking for us. So while I agree that
xas_for_each*() is the best choice if you are using the xarray API, here
we want to not trust the xarray API and prove it instead. What do you
think about something like this? It does fix the soft lockup.

diff --git a/lib/test_xarray.c b/lib/test_xarray.c
index d4e55b4867dc..ac162025cc59 100644
--- a/lib/test_xarray.c
+++ b/lib/test_xarray.c
@@ -781,6 +781,7 @@ static noinline void *test_get_entry(struct xarray *xa, unsigned long index)
 {
 	XA_STATE(xas, xa, index);
 	void *p;
+	static unsigned int i = 0;
 
 	rcu_read_lock();
 repeat:
@@ -790,6 +791,17 @@ static noinline void *test_get_entry(struct xarray *xa, unsigned long index)
 		goto repeat;
 	rcu_read_unlock();
 
+	/*
+	 * This is not part of the page cache, this selftest is pretty
+	 * aggressive and does not want to trust the xarray API but rather
+	 * test it, and for order 20 (4 GiB block size) we can loop over
+	 * over a million entries which can cause a soft lockup. Page cache
+	 * APIs won't be stupid, proper page cache APIs loop over the proper
+	 * order so when using a larger order we skip shared entries.
+	 */
+	if (++i % XA_CHECK_SCHED == 0)
+		schedule();
+
 	return p;
 }
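
For anyone who wants to see the arithmetic behind the periodic yield
without booting a kernel, here is a rough userspace sketch of the same
pattern. It is not the selftest code: sched_yield() stands in for
schedule(), a plain pointer array stands in for the xarray, and the names
get_entry(), walk_entries(), nr_yields and CHECK_SCHED are made up for
illustration. CHECK_SCHED is assumed to be 4096 to mirror XA_CHECK_SCHED.

```c
#include <assert.h>
#include <sched.h>
#include <stdlib.h>

/* Assumption: chosen to mirror the kernel's XA_CHECK_SCHED. */
#define CHECK_SCHED 4096u

/* Hypothetical bookkeeping so the number of yields can be observed. */
static unsigned long nr_yields;

/*
 * Hypothetical stand-in for test_get_entry(): fetch one slot, then
 * give up the CPU on every CHECK_SCHED-th call.
 */
static void *get_entry(void **slots, unsigned long index)
{
	/* Persists across calls, like the static 'i' in the patch. */
	static unsigned int calls;
	void *p = slots[index];

	if (++calls % CHECK_SCHED == 0) {
		sched_yield();	/* userspace substitute for schedule() */
		nr_yields++;
	}

	return p;
}

/*
 * Fill 2^order slots with one value and read each of them back, as the
 * simple loop in the selftest does. Returns how many times the walk
 * yielded, or -1 if the allocation failed.
 */
static long walk_entries(unsigned int order)
{
	static int some_val;
	unsigned long nrpages = 1UL << order;
	unsigned long before = nr_yields;
	unsigned long i;
	void **slots = malloc(nrpages * sizeof(*slots));

	if (!slots)
		return -1;

	for (i = 0; i < nrpages; i++)
		slots[i] = &some_val;

	for (i = 0; i < nrpages; i++) {
		void *p = get_entry(slots, i);

		assert(p == &some_val);	/* plays the role of XA_BUG_ON() */
	}

	free(slots);
	return (long)(nr_yields - before);
}
```

With these assumptions, a first call to walk_entries(20) walks 1,048,576
slots and yields 1,048,576 / 4096 = 256 times. Because the call counter
is static, it is shared by every caller and keeps counting across walks,
so the yield points of a later walk depend on how many lookups came
before it.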
 
