From: Max Reitz <mreitz@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, qemu-block@nongnu.org
Subject: Re: [PATCH v3 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2
Date: Tue, 15 Jun 2021 18:32:34 +0200 [thread overview]
Message-ID: <6f4ce577-a2d2-e63a-15ae-a248e141634a@redhat.com> (raw)
In-Reply-To: <20210603133722.218465-5-pbonzini@redhat.com>
On 03.06.21 15:37, Paolo Bonzini wrote:
> bs->sg is only true for character devices, but block devices can also
> be used with scsi-block and scsi-generic. Unfortunately BLKSECTGET
> returns bytes in an int for /dev/sgN devices, and sectors in a short
> for block devices, so account for that in the code.
>
> The maximum transfer also need not be a power of 2 (for example I have
> seen disks with 1280 KiB maximum transfer) so there's no need to pass
> the result through pow2floor.
>
> This patch reintroduces the logic that was removed in commit 867eccfed8
> ("file-posix: Use max transfer length/segment count only for SCSI
> passthrough", 2019-07-12). The removal was motivated by the performance
> regression when using a block device's max_transfer with kernel I/O;
> the new, separate max_hw_transfer limit avoids reintroducing the same
> regression.
(And the fact that the result of hdev_get_max_segments() is no longer
used to cap max_transfer, but is just stored in max_iov instead.)
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 40 ++++++++++++++++++++++------------------
> 1 file changed, 22 insertions(+), 18 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index f55f92d0f5..1439293f63 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1149,22 +1149,27 @@ static void raw_reopen_abort(BDRVReopenState *state)
> s->reopen_state = NULL;
> }
>
> -static int sg_get_max_transfer_length(int fd)
> +static int hdev_get_max_hw_transfer(int fd, struct stat *st)
> {
> #ifdef BLKSECTGET
> - int max_bytes = 0;
> -
> - if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
> - return max_bytes;
> + if (S_ISBLK(st->st_mode)) {
> + unsigned short max_sectors = 0;
> + if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
> + return max_sectors * 512;
> + }
> } else {
> - return -errno;
> + int max_bytes = 0;
> + if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
> + return max_bytes;
> + }
> }
> + return -errno;
> #else
> return -ENOSYS;
> #endif
> }
>
> -static int sg_get_max_segments(int fd)
> +static int hdev_get_max_segments(int fd, struct stat *st)
OK, now I see why in patch 1 you treated `st` as a pointer. Still, needs
to be `st.` in patch 1, and changed to `st->` in this patch only.
> {
> #ifdef CONFIG_LINUX
> char buf[32];
> @@ -1173,12 +1178,6 @@ static int sg_get_max_segments(int fd)
> int ret;
> int sysfd = -1;
> long max_segments;
> - struct stat st;
> -
> - if (fstat(fd, &st)) {
> - ret = -errno;
> - goto out;
> - }
>
> if (S_ISCHR(st->st_mode)) {
> if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> @@ -1192,7 +1191,7 @@ static int sg_get_max_segments(int fd)
> }
>
> sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> - major(st.st_rdev), minor(st.st_rdev));
> + major(st->st_rdev), minor(st->st_rdev));
> sysfd = open(sysfspath, O_RDONLY);
> if (sysfd == -1) {
> ret = -errno;
> @@ -1229,15 +1228,20 @@ out:
> static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
> {
> BDRVRawState *s = bs->opaque;
> + struct stat st;
> +
> + if (fstat(s->fd, &st)) {
> + return;
> + }
First, I think if we ignore an error, there should be a comment
explaining why that’s OK.
It is OK here because inquiring transfer limits is best-effort.
However, the alignment probes below the following block are completely
independent of `st`. Therefore, I don’t think we should skip them if
`fstat()` failed here; only the `if (bs->sg || ...)` block should be
skipped.
Max
>
> - if (bs->sg) {
> - int ret = sg_get_max_transfer_length(s->fd);
> + if (bs->sg || S_ISBLK(st.st_mode)) {
> + int ret = hdev_get_max_hw_transfer(s->fd, &st);
>
> if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
> - bs->bl.max_hw_transfer = pow2floor(ret);
> + bs->bl.max_hw_transfer = ret;
> }
>
> - ret = sg_get_max_segments(s->fd);
> + ret = hdev_get_max_segments(s->fd, &st);
> if (ret > 0) {
> bs->bl.max_iov = ret;
> }
next prev parent reply other threads:[~2021-06-15 16:33 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-03 13:37 [PATCH v3 0/7] block: file-posix queue Paolo Bonzini
2021-06-03 13:37 ` [PATCH v3 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
2021-06-07 5:39 ` Vladimir Sementsov-Ogievskiy
2021-06-07 6:16 ` Vladimir Sementsov-Ogievskiy
2021-06-07 7:23 ` Vladimir Sementsov-Ogievskiy
2021-06-15 15:58 ` Max Reitz
2021-06-03 13:37 ` [PATCH v3 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
2021-06-15 16:06 ` Max Reitz
2021-06-15 16:20 ` Max Reitz
2021-06-15 17:41 ` Paolo Bonzini
2021-06-03 13:37 ` [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
2021-06-03 17:33 ` Eric Blake
2021-06-04 7:21 ` Paolo Bonzini
2021-06-15 16:18 ` Max Reitz
2021-06-16 13:18 ` Paolo Bonzini
2021-06-16 13:46 ` Max Reitz
2021-06-17 9:59 ` Paolo Bonzini
2021-06-03 13:37 ` [PATCH v3 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
2021-06-15 16:32 ` Max Reitz [this message]
2021-06-03 13:37 ` [PATCH v3 5/7] block: feature detection for host block support Paolo Bonzini
2021-06-15 16:40 ` Max Reitz
2021-06-03 13:37 ` [PATCH v3 6/7] block: check for sys/disk.h Paolo Bonzini
2021-06-15 16:42 ` Max Reitz
2021-06-03 13:37 ` [PATCH v3 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use Paolo Bonzini
2021-06-15 16:50 ` Max Reitz
2021-06-15 16:57 ` Daniel P. Berrangé
2021-06-07 13:52 ` [PATCH v3 0/7] block: file-posix queue Maxim Levitsky
2021-06-10 13:40 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6f4ce577-a2d2-e63a-15ae-a248e141634a@redhat.com \
--to=mreitz@redhat.com \
--cc=kwolf@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).