Linux-Block Archive mirror
* [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue
@ 2023-01-16 16:01 Jens Axboe
  2023-01-16 17:11 ` Michael Kelley (LINUX)
  2023-01-16 17:51 ` Christoph Hellwig
  0 siblings, 2 replies; 7+ messages in thread
From: Jens Axboe @ 2023-01-16 16:01 UTC
  To: linux-block@vger.kernel.org; +Cc: Michael Kelley

If we're doing a large IO request which needs to be split into multiple
bios for issue, then we can run into the same situation as the below
marked commit fixes - parts will complete just fine, one or more parts
will fail to allocate a request. This will result in a partially
completed read or write request, where the caller gets EAGAIN even though
parts of the IO completed just fine.

Do the same for large bios as we do for splits - fail a NOWAIT request
with EAGAIN. This isn't technically fixing an issue in the below marked
patch, but for stable purposes, we should have either none of them or
both.

This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")

Cc: stable@vger.kernel.org # 5.15+
Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
Link: https://github.com/axboe/liburing/issues/766
Reported-and-tested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

diff --git a/block/fops.c b/block/fops.c
index 50d245e8c913..a03cb732c2a7 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -368,6 +368,14 @@ static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 			return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
 		return __blkdev_direct_IO_async(iocb, iter, nr_pages);
 	}
+	/*
+	 * We're doing more than a bio worth of IO (> 256 pages), and we
+	 * cannot guarantee that one of the sub bios will not fail getting
+	 * issued FOR NOWAIT as error results are coalesced across all of
+	 * them. Be safe and ask for a retry of this from blocking context.
+	 */
+	if (iocb->ki_flags & IOCB_NOWAIT)
+		return -EAGAIN;
 	return __blkdev_direct_IO(iocb, iter, bio_max_segs(nr_pages));
 }
 
-- 
Jens Axboe
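
For readers following along from user space, here is a minimal sketch (not part of the patch) of what "ask for a retry of this from blocking context" means for a caller. It assumes a glibc that exposes preadv2() and RWF_NOWAIT; the helper name read_nowait_then_block() is hypothetical. io_uring performs an equivalent punt internally, so this only illustrates the pattern.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: try a non-blocking read first, then retry from
 * blocking context if the kernel returns EAGAIN, or if the file does
 * not support RWF_NOWAIT at all (EOPNOTSUPP).
 */
static ssize_t read_nowait_then_block(int fd, void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t ret;

	ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
	if (ret >= 0)
		return ret;
	if (errno != EAGAIN && errno != EOPNOTSUPP)
		return -1;	/* real error, errno is already set */

	/* Blocking retry: with flags == 0 this behaves like preadv(). */
	return preadv2(fd, &iov, 1, off, 0);
}
```

With the patch applied, a NOWAIT request either completes as a whole or fails with EAGAIN before anything is issued, so a retry like this one never re-reads data that already completed.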



* RE: [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue
  2023-01-16 16:01 [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue Jens Axboe
@ 2023-01-16 17:11 ` Michael Kelley (LINUX)
  2023-01-16 17:27   ` Jens Axboe
  2023-01-16 17:51 ` Christoph Hellwig
  1 sibling, 1 reply; 7+ messages in thread
From: Michael Kelley (LINUX) @ 2023-01-16 17:11 UTC
  To: Jens Axboe, linux-block@vger.kernel.org

From: Jens Axboe <axboe@kernel.dk> Sent: Monday, January 16, 2023 8:02 AM
> 
> If we're doing a large IO request which needs to be split into multiple
> bios for issue, then we can run into the same situation as the below
> marked commit fixes - parts will complete just fine, one or more parts
> will fail to allocate a request. This will result in a partially
> completed read or write request, where the caller gets EAGAIN even though
> parts of the IO completed just fine.
> 
> Do the same for large bios as we do for splits - fail a NOWAIT request
> with EAGAIN. This isn't technically fixing an issue in the below marked
> patch, but for stable purposes, we should have either none of them or
> both.
> 
> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
> 
> Cc: stable@vger.kernel.org # 5.15+
> Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
> Link: https://github.com/axboe/liburing/issues/766
> Reported-and-tested-by: Michael Kelley <mikelley@microsoft.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> ---
> 
> diff --git a/block/fops.c b/block/fops.c
> index 50d245e8c913..a03cb732c2a7 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -368,6 +368,14 @@ static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>  			return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>  		return __blkdev_direct_IO_async(iocb, iter, nr_pages);
>  	}
> +	/*
> +	 * We're doing more than a bio worth of IO (> 256 pages), and we
> +	 * cannot guarantee that one of the sub bios will not fail getting
> +	 * issued FOR NOWAIT as error results are coalesced across all of
> +	 * them. Be safe and ask for a retry of this from blocking context.
> +	 */
> +	if (iocb->ki_flags & IOCB_NOWAIT)
> +		return -EAGAIN;
>  	return __blkdev_direct_IO(iocb, iter, bio_max_segs(nr_pages));
>  }

A code observation:  __blkdev_direct_IO() has a test for IOCB_NOWAIT
that now can't happen, as this is the only place it is called.  But maybe it's
safer to leave the check in case of future code shuffling.

Michael


* Re: [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue
  2023-01-16 17:11 ` Michael Kelley (LINUX)
@ 2023-01-16 17:27   ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2023-01-16 17:27 UTC
  To: Michael Kelley (LINUX), linux-block@vger.kernel.org

On 1/16/23 10:11 AM, Michael Kelley (LINUX) wrote:
> From: Jens Axboe <axboe@kernel.dk> Sent: Monday, January 16, 2023 8:02 AM
>>
>> If we're doing a large IO request which needs to be split into multiple
>> bios for issue, then we can run into the same situation as the below
>> marked commit fixes - parts will complete just fine, one or more parts
>> will fail to allocate a request. This will result in a partially
>> completed read or write request, where the caller gets EAGAIN even though
>> parts of the IO completed just fine.
>>
>> Do the same for large bios as we do for splits - fail a NOWAIT request
>> with EAGAIN. This isn't technically fixing an issue in the below marked
>> patch, but for stable purposes, we should have either none of them or
>> both.
>>
>> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
>>
>> Cc: stable@vger.kernel.org # 5.15+
>> Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
>> Link: https://github.com/axboe/liburing/issues/766
>> Reported-and-tested-by: Michael Kelley <mikelley@microsoft.com>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> ---
>>
>> diff --git a/block/fops.c b/block/fops.c
>> index 50d245e8c913..a03cb732c2a7 100644
>> --- a/block/fops.c
>> +++ b/block/fops.c
>> @@ -368,6 +368,14 @@ static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>>  			return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>>  		return __blkdev_direct_IO_async(iocb, iter, nr_pages);
>>  	}
>> +	/*
>> +	 * We're doing more than a bio worth of IO (> 256 pages), and we
>> +	 * cannot guarantee that one of the sub bios will not fail getting
>> +	 * issued FOR NOWAIT as error results are coalesced across all of
>> +	 * them. Be safe and ask for a retry of this from blocking context.
>> +	 */
>> +	if (iocb->ki_flags & IOCB_NOWAIT)
>> +		return -EAGAIN;
>>  	return __blkdev_direct_IO(iocb, iter, bio_max_segs(nr_pages));
>>  }
> 
> A code observation:  __blkdev_direct_IO() has a test for IOCB_NOWAIT
> that now can't happen, as this is the only place it is called.  But maybe it's
> safer to leave the check in case of future code shuffling.

I think we should just keep it, or it will get missed later on. I am
pondering how we could make this better, but it's a bit more involved.

-- 
Jens Axboe




* Re: [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue
  2023-01-16 16:01 [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue Jens Axboe
  2023-01-16 17:11 ` Michael Kelley (LINUX)
@ 2023-01-16 17:51 ` Christoph Hellwig
  2023-01-16 18:03   ` Jens Axboe
  1 sibling, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2023-01-16 17:51 UTC
  To: Jens Axboe; +Cc: linux-block@vger.kernel.org, Michael Kelley

On Mon, Jan 16, 2023 at 09:01:37AM -0700, Jens Axboe wrote:
> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")

Can we sort the NULL vs ERR_PTR thing there first?

> +	/*
> +	 * We're doing more than a bio worth of IO (> 256 pages), and we
> +	 * cannot guarantee that one of the sub bios will not fail getting
> +	 * issued FOR NOWAIT as error results are coalesced across all of
> +	 * them. Be safe and ask for a retry of this from blocking context.
> +	 */
> +	if (iocb->ki_flags & IOCB_NOWAIT)
> +		return -EAGAIN;
>  	return __blkdev_direct_IO(iocb, iter, bio_max_segs(nr_pages));

If the I/O is to a huge page we could easily end up with a single
bio here.
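
To make the arithmetic behind this remark concrete, here is a small sketch that is not from the thread. It assumes 4 KiB base pages and a 2 MiB huge page (typical on x86-64, but an assumption here); the names BASE_PAGE_SIZE, HUGE_PAGE_SIZE, base_pages() and early_check_punts() are hypothetical. Only the 256-page limit comes from the patch comment.

```c
#include <stddef.h>

#define BIO_MAX_VECS	256u		/* per-bio limit named in the patch comment */
#define BASE_PAGE_SIZE	4096u		/* assumption: 4 KiB base pages */
#define HUGE_PAGE_SIZE	(2u << 20)	/* assumption: 2 MiB huge page */

/* Number of base pages spanned by a page-aligned buffer of len bytes. */
static unsigned int base_pages(size_t len)
{
	return (unsigned int)((len + BASE_PAGE_SIZE - 1) / BASE_PAGE_SIZE);
}

/* The early check in the patch: more than one bio's worth of pages? */
static int early_check_punts(size_t len)
{
	return base_pages(len) > BIO_MAX_VECS;
}
```

Under these assumptions a single 2 MiB huge page counts as 512 base pages, so the early check returns -EAGAIN for a NOWAIT request even though the physically contiguous buffer could be described by one segment in one bio.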


* Re: [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue
  2023-01-16 17:51 ` Christoph Hellwig
@ 2023-01-16 18:03   ` Jens Axboe
  2023-01-16 18:15     ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2023-01-16 18:03 UTC
  To: Christoph Hellwig; +Cc: linux-block@vger.kernel.org, Michael Kelley

On 1/16/23 10:51 AM, Christoph Hellwig wrote:
> On Mon, Jan 16, 2023 at 09:01:37AM -0700, Jens Axboe wrote:
>> This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
> 
> Can we sort the NULL vs ERR_PTR thing there first?

Which thing?

>> +	/*
>> +	 * We're doing more than a bio worth of IO (> 256 pages), and we
>> +	 * cannot guarantee that one of the sub bios will not fail getting
>> +	 * issued FOR NOWAIT as error results are coalesced across all of
>> +	 * them. Be safe and ask for a retry of this from blocking context.
>> +	 */
>> +	if (iocb->ki_flags & IOCB_NOWAIT)
>> +		return -EAGAIN;
>>  	return __blkdev_direct_IO(iocb, iter, bio_max_segs(nr_pages));
> 
> If the I/O is to a huge page we could easily end up with a single
> bio here.

True - we can push the decision making further down potentially, but
honestly not sure it's worth the effort.

-- 
Jens Axboe




* Re: [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue
  2023-01-16 18:03   ` Jens Axboe
@ 2023-01-16 18:15     ` Jens Axboe
  2023-01-16 18:30       ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2023-01-16 18:15 UTC
  To: Christoph Hellwig; +Cc: linux-block@vger.kernel.org, Michael Kelley

On 1/16/23 11:03 AM, Jens Axboe wrote:
>>> +	/*
>>> +	 * We're doing more than a bio worth of IO (> 256 pages), and we
>>> +	 * cannot guarantee that one of the sub bios will not fail getting
>>> +	 * issued FOR NOWAIT as error results are coalesced across all of
>>> +	 * them. Be safe and ask for a retry of this from blocking context.
>>> +	 */
>>> +	if (iocb->ki_flags & IOCB_NOWAIT)
>>> +		return -EAGAIN;
>>>  	return __blkdev_direct_IO(iocb, iter, bio_max_segs(nr_pages));
>>
>> If the I/O is to a huge page we could easily end up with a single
>> bio here.
> 
> True - we can push the decision making further down potentially, but
> honestly not sure it's worth the effort.

And even for page merges too, fwiw. We could probably do something like
the below (totally untested), downside there would be that we've already
mapped and allocated a bio at that point.


diff --git a/block/fops.c b/block/fops.c
index a03cb732c2a7..859361011e43 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -221,6 +221,14 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 			bio_endio(bio);
 			break;
 		}
+		if (iocb->ki_flags & IOCB_NOWAIT) {
+			if (iov_iter_count(iter)) {
+				bio_release_pages(bio, false);
+				bio_put(bio);
+				return -EAGAIN;
+			}
+			bio->bi_opf |= REQ_NOWAIT;
+		}
 
 		if (is_read) {
 			if (dio->flags & DIO_SHOULD_DIRTY)
@@ -228,9 +236,6 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 		} else {
 			task_io_account_write(bio->bi_iter.bi_size);
 		}
-		if (iocb->ki_flags & IOCB_NOWAIT)
-			bio->bi_opf |= REQ_NOWAIT;
-
 		dio->size += bio->bi_iter.bi_size;
 		pos += bio->bi_iter.bi_size;
 

-- 
Jens Axboe



* Re: [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue
  2023-01-16 18:15     ` Jens Axboe
@ 2023-01-16 18:30       ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2023-01-16 18:30 UTC
  To: Christoph Hellwig; +Cc: linux-block@vger.kernel.org, Michael Kelley

On 1/16/23 11:15 AM, Jens Axboe wrote:
> On 1/16/23 11:03 AM, Jens Axboe wrote:
>>>> +	/*
>>>> +	 * We're doing more than a bio worth of IO (> 256 pages), and we
>>>> +	 * cannot guarantee that one of the sub bios will not fail getting
>>>> +	 * issued FOR NOWAIT as error results are coalesced across all of
>>>> +	 * them. Be safe and ask for a retry of this from blocking context.
>>>> +	 */
>>>> +	if (iocb->ki_flags & IOCB_NOWAIT)
>>>> +		return -EAGAIN;
>>>>  	return __blkdev_direct_IO(iocb, iter, bio_max_segs(nr_pages));
>>>
>>> If the I/O is to a huge page we could easily end up with a single
>>> bio here.
>>
>> True - we can push the decision making further down potentially, but
>> honestly not sure it's worth the effort.
> 
> And even for page merges too, fwiw. We could probably do something like
> the below (totally untested), downside there would be that we've already
> mapped and allocated a bio at that point.

Was missing a plug finish, but apart from that it works in testing.
The question is whether we end up punting anyway in the majority of
cases, in which case it's slower than it was before. If we end up
skipping some -EAGAINs, then it'd be better. Even without huge pages, I
see fio runs that have a 1:1 ratio between them (i.e. we always end up
punting anyway), and cases where we now do zero punting. This must be
down to memory layout - if we can successfully merge pages in a vec,
then we don't punt.

I'm leaning towards the below likely being the more optimal fix, even
with a worse worst case behavior of punting anyway and now allocating
and mapping data twice.

diff --git a/block/fops.c b/block/fops.c
index a03cb732c2a7..1a371f50cb13 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -221,6 +221,15 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 			bio_endio(bio);
 			break;
 		}
+		if (iocb->ki_flags & IOCB_NOWAIT) {
+			if (iov_iter_count(iter)) {
+				bio_release_pages(bio, false);
+				bio_put(bio);
+				blk_finish_plug(&plug);
+				return -EAGAIN;
+			}
+			bio->bi_opf |= REQ_NOWAIT;
+		}
 
 		if (is_read) {
 			if (dio->flags & DIO_SHOULD_DIRTY)
@@ -228,9 +237,6 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 		} else {
 			task_io_account_write(bio->bi_iter.bi_size);
 		}
-		if (iocb->ki_flags & IOCB_NOWAIT)
-			bio->bi_opf |= REQ_NOWAIT;
-
 		dio->size += bio->bi_iter.bi_size;
 		pos += bio->bi_iter.bi_size;
 

-- 
Jens Axboe
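
The trade-off between the two checks can be sketched with a user-space toy model (not kernel code, and not from the thread). It assumes that physically contiguous pages merge into one segment and that a bio holds at most 256 segments; it ignores every other limit the real block layer applies. The names pages_in_first_bio(), early_punt() and late_punt() are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

#define BIO_MAX_VECS 256	/* per-bio segment limit assumed by this model */

/*
 * Toy model: pfn[] holds the physical frame number backing each page of
 * the buffer. A page whose frame directly follows the previous one is
 * merged into the current segment. Returns how many pages fit in one bio.
 */
static size_t pages_in_first_bio(const unsigned long *pfn, size_t nr_pages)
{
	size_t i, segs = 0;

	for (i = 0; i < nr_pages; i++) {
		int merges = i > 0 && pfn[i] == pfn[i - 1] + 1;

		if (!merges && segs == BIO_MAX_VECS)
			break;		/* bio is full, data remains */
		if (!merges)
			segs++;
	}
	return i;
}

/* Early check (first patch): decide on the page count alone. */
static int early_punt(size_t nr_pages, int nowait)
{
	return (nowait && nr_pages > BIO_MAX_VECS) ? -EAGAIN : 0;
}

/* Late check (follow-up diff): punt only if data remains after one bio. */
static int late_punt(const unsigned long *pfn, size_t nr_pages, int nowait)
{
	size_t filled = pages_in_first_bio(pfn, nr_pages);

	return (nowait && filled < nr_pages) ? -EAGAIN : 0;
}
```

In this model a 512-page buffer with fully contiguous frames is punted by the early check but not by the late one, while a fully scattered 512-page buffer is punted by both. The second case is the worst case described above, where the late check has already allocated and mapped a bio before returning -EAGAIN.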

