Linux-Raid Archives mirror
From: tada keisuke <keisuke1.tada@kioxia.com>
To: Paul E Luse <paul.e.luse@linux.intel.com>
Cc: Yu Kuai <yukuai1@huaweicloud.com>,
	"song@kernel.org" <song@kernel.org>,
	"yukuai (C)" <yukuai3@huawei.com>,
	"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Luse, Paul E" <paul.e.luse@intel.com>
Subject: RE: [PATCH v2 08/11] md: add atomic mode switching in RAID 1/10
Date: Thu, 9 May 2024 06:46:39 +0000
Message-ID: <170ff038967543228fef9d0868332620@kioxia.com>
In-Reply-To: <20240425021224.6419ee2c@peluse-desk5>

> > > > > Hi,
> > > > >
> > > > > On 2024/04/18 13:44, tada keisuke wrote:
> > > > > > This patch depends on patch 07.
> > > > > >
> > > > > > All rdevs running in RAID 1/10 switch nr_pending to atomic
> > > > > > mode. The value of nr_pending is read in a normal operation
> > > > > > (choose_best_rdev()). Therefore, nr_pending must always be
> > > > > > consistent.
> > > > > >
> > > > > > Signed-off-by: Keisuke TADA <keisuke1.tada@kioxia.com>
> > > > > > Signed-off-by: Toshifumi OHTAKE <toshifumi.ootake@kioxia.com>
> > > > > > ---
> > > > > >   drivers/md/md.h     | 14 ++++++++++++++
> > > > > >   drivers/md/raid1.c  |  7 +++++++
> > > > > >   drivers/md/raid10.c |  4 ++++
> > > > > >   3 files changed, 25 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/md/md.h b/drivers/md/md.h
> > > > > > index ab09e312c9bb..57b09b567ffa 100644
> > > > > > --- a/drivers/md/md.h
> > > > > > +++ b/drivers/md/md.h
> > > > > > @@ -236,6 +236,20 @@ static inline unsigned long
> > > > > > nr_pending_read(struct md_rdev *rdev) return
> > > > > > atomic_long_read(&rdev->nr_pending.data->count); }
> > > > > >
> > > > > > +static inline bool nr_pending_is_percpu_mode(struct md_rdev
> > > > > > *rdev) +{
> > > > > > +	unsigned long __percpu *percpu_count;
> > > > > > +
> > > > > > +	return __ref_is_percpu(&rdev->nr_pending,
> > > > > > &percpu_count); +}
> > > > > > +
> > > > > > +static inline bool nr_pending_is_atomic_mode(struct md_rdev
> > > > > > *rdev) +{
> > > > > > +	unsigned long __percpu *percpu_count;
> > > > > > +
> > > > > > +	return !__ref_is_percpu(&rdev->nr_pending,
> > > > > > &percpu_count); +}
> > > > > > +
> > > > > >   static inline int is_badblock(struct md_rdev *rdev,
> > > > > > sector_t s, int sectors, sector_t *first_bad, int
> > > > > > *bad_sectors) {
> > > > > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > > > > > index 12318fb15a88..c38ae13aadab 100644
> > > > > > --- a/drivers/md/raid1.c
> > > > > > +++ b/drivers/md/raid1.c
> > > > > > @@ -784,6 +784,7 @@ static int choose_best_rdev(struct r1conf
> > > > > > *conf, struct r1bio *r1_bio) if (ctl.readable_disks++ == 1)
> > > > > >   			set_bit(R1BIO_FailFast,
> > > > > > &r1_bio->state);
> > > > > >
> > > > > > +
> > > > > > WARN_ON_ONCE(nr_pending_is_percpu_mode(rdev)); pending =
> > > > > > nr_pending_read(rdev); dist = abs(r1_bio->sector -
> > > > > > conf->mirrors[disk].head_position);
> > > > > > @@ -1930,6 +1931,7 @@ static int raid1_add_disk(struct mddev
> > > > > > *mddev, struct md_rdev *rdev) if (err)
> > > > > >   				return err;
> > > > > >
> > > > > > +
> > > > > > percpu_ref_switch_to_atomic_sync(&rdev->nr_pending);
> > > > > > raid1_add_conf(conf, rdev, mirror, false); /* As all devices
> > > > > > are equivalent, we don't need a full recovery
> > > > > >   			 * if this was recently any drive
> > > > > > of the array @@ -1949,6 +1951,7 @@ static int
> > > > > > raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
> > > > > > set_bit(Replacement, &rdev->flags); raid1_add_conf(conf,
> > > > > > rdev, repl_slot, true); err = 0;
> > > > > > +
> > > > > > percpu_ref_switch_to_atomic_sync(&rdev->nr_pending);
> > > > >
> > > > > I don't understand the point here. 'nr_pending' is used whenever
> > > > > the rdev is issuing IO, and it is always used in atomic mode, so
> > > > > there is no difference.
> > > > >
> > > > > Considering that 'nr_pending' must be read from the IO fast path,
> > > > > using it as an atomic is something we must accept. Unless someone
> > > > > comes up with a plan to avoid reading the 'inflight' counter from
> > > > > the fast path, as the generic block layer does, switching to
> > > > > percpu_ref is not ok to me for now.
> >
> > The main purpose of this patchset is to improve RAID5 performance.
> > In the current RAID 1/10 design, nr_pending is intentionally kept in
> > atomic mode at all times because it must be read in the IO fast
> > path. Unless the way nr_pending is read changes, I believe that this
> > patchset is a reasonable design, and RAID1 performance is about the
> > same as with atomic_t before this patchset was applied. Paul's
> > results also show that.
> >
> > Best Regards,
> > Keisuke
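
The behaviour described above can be sketched in user space as follows.
This is a simplified, single-threaded model and not the kernel
implementation: struct model_ref, MODEL_CPUS and all model_*() helpers are
hypothetical. As I understand it, the real
percpu_ref_switch_to_atomic_sync() also waits for an RCU grace period
before the per-CPU counts are folded into the shared counter, which this
model leaves out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_CPUS 4	/* hypothetical CPU count for this model */

/* Hypothetical stand-in for a reference counter with two modes. */
struct model_ref {
	bool percpu_mode;		/* true: updates go to percpu[] */
	long percpu[MODEL_CPUS];	/* per-CPU deltas, may be negative */
	atomic_long count;		/* shared counter used in atomic mode */
};

static void model_get(struct model_ref *r, int cpu)
{
	if (r->percpu_mode)
		r->percpu[cpu]++;
	else
		atomic_fetch_add(&r->count, 1);
}

static void model_put(struct model_ref *r, int cpu)
{
	if (r->percpu_mode)
		r->percpu[cpu]--;
	else
		atomic_fetch_sub(&r->count, 1);
}

/* Fold all per-CPU deltas into the shared counter, then flip the mode. */
static void model_switch_to_atomic(struct model_ref *r)
{
	if (!r->percpu_mode)
		return;
	for (int cpu = 0; cpu < MODEL_CPUS; cpu++) {
		atomic_fetch_add(&r->count, r->percpu[cpu]);
		r->percpu[cpu] = 0;
	}
	r->percpu_mode = false;
}

/* Counterpart of the nr_pending_is_atomic_mode() idea in this model. */
static bool model_is_atomic_mode(const struct model_ref *r)
{
	return !r->percpu_mode;
}

/* Reading is only meaningful once the counter is in atomic mode. */
static long model_read(struct model_ref *r)
{
	assert(model_is_atomic_mode(r));
	return atomic_load(&r->count);
}
```

After model_switch_to_atomic() every get and put goes to the shared
counter, so model_read() always sees a consistent value. That is the
property the RAID 1/10 read path depends on.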
> 
> I only tested RAID1, and I do believe that simpler is better, so I would
> prefer not to change the RAID1 code.  I can run some RAID5 tests on
> this as well, unless you already have wide-sweeping results? I would love
> to see more RAID5 performance improvements.  Shushu has another RAID5 perf
> patch out there that I think has very good potential; it would be
> good if you could take a look at that one.
> 
> -Paul

We are planning to measure the performance of RAID5 using SSDs.

Keisuke

Thread overview: 7+ messages
2024-04-18  5:44 [PATCH v2 08/11] md: add atomic mode switching in RAID 1/10 tada keisuke
2024-04-18  6:39 ` Yu Kuai
2024-04-16 14:38   ` Paul E Luse
2024-04-16 22:41     ` Paul E Luse
2024-04-26  8:01       ` tada keisuke
2024-04-25  9:12         ` Paul E Luse
2024-05-09  6:46           ` tada keisuke [this message]
