Linux-HyperV Archive mirror
From: Michael Kelley <mhklinux@outlook.com>
To: "axboe@kernel.dk" <axboe@kernel.dk>, "hch@lst.de" <hch@lst.de>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>
Subject: Merging raw block device writes
Date: Sat, 25 Nov 2023 17:38:28 +0000
Message-ID: <SN6PR02MB41575884C4898B59615B496AD4BFA@SN6PR02MB4157.namprd02.prod.outlook.com>

About 18 months ago, I raised an issue with merging of direct raw block
device writes. [1]  At the time, nobody offered any input on the issue.
To recap, when multiple direct write requests are in-flight to a raw block
device with I/O scheduler "none", and requests must wait for budget
(i.e., the device is SCSI), write requests don't go on a blk-mq software
queue, and no merging is done.  Direct read requests that must wait for
budget *do* go on a software queue and merges happen.
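
One way to check whether merges are happening is to sample the
per-device counters in /sys/block/<dev>/stat before and after a test
run; the second and sixth fields count merged reads and merged writes.
Below is a minimal Python sketch, assuming the field order given in
Documentation/block/stat.rst.  The helper names are made up for
illustration and are not part of any existing tool.

```python
# Names of the first 11 fields of /sys/block/<dev>/stat, in order.
# Newer kernels append discard and flush fields; they are ignored here.
FIELDS = (
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
)


def parse_block_stat(text):
    """Turn the single whitespace-separated stat line into a dict."""
    values = [int(v) for v in text.split()]
    return dict(zip(FIELDS, values))


def read_block_stat(dev):
    """Read the counters for a device name such as 'sda' (hypothetical)."""
    with open(f"/sys/block/{dev}/stat") as f:
        return parse_block_stat(f.read())


def merge_delta(before, after):
    """Merged requests accumulated between two samples."""
    return {k: after[k] - before[k] for k in ("read_merges", "write_merges")}
```

Taking a sample with read_block_stat() before and after a direct write
workload and passing both to merge_delta() should show whether
write_merges moved at all during the run.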

Recently, I noticed that the problem has been fixed in the latest
upstream kernel, and I had time to do further investigation on the
issue.  Bisecting shows the problem first occurred in 5.16-rc1 with
commit dc5fc361d891 from Jens Axboe.  This commit actually prevents
merging of both reads and writes.  But reads were indirectly fixed in
commit 54a88eb838d3 from Pavel Begunkov, also in 5.16-rc1, so
the read problem never occurred in a release.  There's no mention
of merging in either commit message, so I suspect the effect on
merging was unintentional in both cases.  In 5.16, blkdev_read_iter()
does not create a plug list, while blkdev_write_iter() does.  But the
lower level __blkdev_direct_IO() creates a plug list for both reads
and writes, which is why commit dc5fc361d891 broke both.  Then
commit 54a88eb838d3 bypassed __blkdev_direct_IO() in most
cases, and the new path does not create a plug list.  So reads
typically proceed without a plug list, and the merging can happen.
Writes still don't merge because of the plug list in the higher level
blkdev_write_iter().

The situation stayed that way until 6.5-rc1 when commit
712c7364655f from Christoph removed the plug list from
blkdev_write_iter().  Again, there's no mention of merging in the
commit message, so fixing the merge problem may be happenstance.

Hyper-V guests and the Azure cloud have a particular interest here
because Hyper-V guests use SCSI as the standard interface to virtual
disks.  Azure cloud disks can be throttled to a limited number of IOPS,
so the number of in-flight I/Os can be relatively high, and
merging can be beneficial to staying within the throttle
limits.  On the flip side, this problem hasn't generated complaints
over the last 18 months that I'm aware of, though that may be more
because commercial distros haven't been running 5.16 or later kernels
until relatively recently.
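
To illustrate why merging helps under an IOPS throttle, here is a toy
Python model.  It is not the blk-mq merge logic: it only coalesces a
request onto the previous one when the two are sector-contiguous (a
back merge), and the max_sectors cap of 1024 is an arbitrary assumption
rather than a real device limit.

```python
def merge_contiguous(requests, max_sectors=1024):
    """Coalesce (start_sector, nr_sectors) requests in submission order.

    A request is folded into the previous one only if it begins exactly
    where the previous one ends and the combined size stays within
    max_sectors.  Anything else starts a new request.
    """
    merged = []
    for start, length in requests:
        if merged:
            last_start, last_len = merged[-1]
            contiguous = last_start + last_len == start
            if contiguous and last_len + length <= max_sectors:
                merged[-1] = (last_start, last_len + length)
                continue
        merged.append((start, length))
    return merged
```

In this model, eight sequential 8-sector writes that are all waiting
for budget collapse into a single 64-sector request, so they cost one
I/O against the throttle instead of eight.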

In any case, the 6.5 kernel fixes the problem, at least in the
common cases where there's no plug list.  But I still wonder if
there's a latent problem with the original commit dc5fc361d891
that should be looked at by someone with more blk-mq expertise
than I have.

Michael

[1] https://lore.kernel.org/linux-block/PH0PR21MB3025A7D1326A92A4B8BDB5FED7B59@PH0PR21MB3025.namprd21.prod.outlook.com/

