Linux-XFS Archive mirror
 help / color / mirror / Atom feed
* RFCv2: fix fatal signal handling in __blkdev_issue_discard
@ 2024-03-12 14:45 Christoph Hellwig
  2024-03-12 14:45 ` [PATCH 1/5] block: move discard checks into the ioctl handler Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Christoph Hellwig @ 2024-03-12 14:45 UTC (permalink / raw
  To: Jens Axboe, Chandan Babu R, Keith Busch
  Cc: linux-block, linux-nvme, linux-xfs

Hi all,

this tries to address the block for-next oops Chandan reported on XFS.
I can't actually reproduce it unfortunately, but this series should
sort it out by movign the fatal_signal_pending check out of all but
the ioctl path.  The write_zeroes and secure_erase path will need
similar treatment eventually.

Tested with blktests and the xfstests discard group for xfs only.

Changes since v1:
 - open code the fatal signal logic in the ioctl handler
 - better bio-level helpers
 - drop the file system cleanups for now

Diffstat:
 block/bio.c         |   48 +++++++++++++++++++++++----
 block/blk-lib.c     |   90 +++++++++++++++-------------------------------------
 block/blk.h         |    1 
 block/ioctl.c       |   35 +++++++++++++++++---
 include/linux/bio.h |    4 ++
 5 files changed, 102 insertions(+), 76 deletions(-)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/5] block: move discard checks into the ioctl handler
  2024-03-12 14:45 RFCv2: fix fatal signal handling in __blkdev_issue_discard Christoph Hellwig
@ 2024-03-12 14:45 ` Christoph Hellwig
  2024-03-12 22:12   ` Keith Busch
  2024-03-13 15:40   ` Keith Busch
  2024-03-12 14:45 ` [PATCH 2/5] block: add a bio_chain_and_submit helper Christoph Hellwig
  2024-03-12 14:45 ` [PATCH 3/5] block: add a blk_alloc_discard_bio helper Christoph Hellwig
  2 siblings, 2 replies; 13+ messages in thread
From: Christoph Hellwig @ 2024-03-12 14:45 UTC (permalink / raw
  To: Jens Axboe, Chandan Babu R, Keith Busch
  Cc: linux-block, linux-nvme, linux-xfs

Most bio operations get basic sanity checking in submit_bio and anything
more complicated than that is done in the callers.  Discards are a bit
different from that in that a lot of checking is done in
__blkdev_issue_discard, and the specific errnos for that are returned
to userspace.  Move the checks that require specific errnos to the ioctl
handler instead and replace the existing kernel sector alignment check
with the actual alignment check based on the logical block size. This
leaves jut the basic sanity checking in submit_bio for the other
submitters of discards and introduces two changes in behavior:

 1) the logical block size alignment check of the start and len is lost
    for non-ioctl callers.
    This matches what is done for other operations including reads and
    writes.  We should probably verify this for all bios, but for now
    make discards match the normal flow.
 2) for non-ioctl callers all errors are reported on I/O completion now
    instead of synchronously.  Callers in general mostly ignore or log
    errors so this will actually simplify the code once cleaned up

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-lib.c | 20 --------------------
 block/ioctl.c   | 13 +++++++++----
 2 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index dc8e35d0a51d6d..50923508a32466 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -59,26 +59,6 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
 {
 	struct bio *bio = *biop;
-	sector_t bs_mask;
-
-	if (bdev_read_only(bdev))
-		return -EPERM;
-	if (!bdev_max_discard_sectors(bdev))
-		return -EOPNOTSUPP;
-
-	/* In case the discard granularity isn't set by buggy device driver */
-	if (WARN_ON_ONCE(!bdev_discard_granularity(bdev))) {
-		pr_err_ratelimited("%pg: Error: discard_granularity is 0.\n",
-				   bdev);
-		return -EOPNOTSUPP;
-	}
-
-	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
-	if ((sector | nr_sects) & bs_mask)
-		return -EINVAL;
-
-	if (!nr_sects)
-		return -EINVAL;
 
 	while (nr_sects) {
 		sector_t req_sects =
diff --git a/block/ioctl.c b/block/ioctl.c
index 0c76137adcaaa5..57c8171fda93c5 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -95,6 +95,8 @@ static int compat_blkpg_ioctl(struct block_device *bdev,
 static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
 		unsigned long arg)
 {
+	sector_t bs_mask = (bdev_logical_block_size(bdev) >> SECTOR_SHIFT) - 1;
+	sector_t sector, nr_sects;
 	uint64_t range[2];
 	uint64_t start, len;
 	struct inode *inode = bdev->bd_inode;
@@ -105,18 +107,21 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
 
 	if (!bdev_max_discard_sectors(bdev))
 		return -EOPNOTSUPP;
+	if (bdev_read_only(bdev))
+		return -EPERM;
 
 	if (copy_from_user(range, (void __user *)arg, sizeof(range)))
 		return -EFAULT;
 
 	start = range[0];
 	len = range[1];
+	sector = start >> SECTOR_SHIFT;
+	nr_sects = len >> SECTOR_SHIFT;
 
-	if (start & 511)
+	if (!nr_sects)
 		return -EINVAL;
-	if (len & 511)
+	if ((sector | nr_sects) & bs_mask)
 		return -EINVAL;
-
 	if (start + len > bdev_nr_bytes(bdev))
 		return -EINVAL;
 
@@ -124,7 +129,7 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
 	err = truncate_bdev_range(bdev, mode, start, start + len - 1);
 	if (err)
 		goto fail;
-	err = blkdev_issue_discard(bdev, start >> 9, len >> 9, GFP_KERNEL);
+	err = blkdev_issue_discard(bdev, sector, nr_sects, GFP_KERNEL);
 fail:
 	filemap_invalidate_unlock(inode->i_mapping);
 	return err;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/5] block: add a bio_chain_and_submit helper
  2024-03-12 14:45 RFCv2: fix fatal signal handling in __blkdev_issue_discard Christoph Hellwig
  2024-03-12 14:45 ` [PATCH 1/5] block: move discard checks into the ioctl handler Christoph Hellwig
@ 2024-03-12 14:45 ` Christoph Hellwig
  2024-03-12 14:51   ` Keith Busch
  2024-03-12 14:45 ` [PATCH 3/5] block: add a blk_alloc_discard_bio helper Christoph Hellwig
  2 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2024-03-12 14:45 UTC (permalink / raw
  To: Jens Axboe, Chandan Babu R, Keith Busch
  Cc: linux-block, linux-nvme, linux-xfs

This is basically blk_next_bio just with the bio allocation moved
to the caller to allow for more flexible bio handling in the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bio.c         | 28 ++++++++++++++++++++--------
 include/linux/bio.h |  1 +
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index d24420ed1c4c6f..32ff538b29e564 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -345,18 +345,30 @@ void bio_chain(struct bio *bio, struct bio *parent)
 }
 EXPORT_SYMBOL(bio_chain);
 
-struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
-		unsigned int nr_pages, blk_opf_t opf, gfp_t gfp)
+/**
+ * bio_chain_and_submit - submit a bio after chaining it to another one
+ * @prev: bio to chain and submit
+ * @new: bio to chain to
+ *
+ * If @prev is non-NULL, chain it to @new and submit it.
+ *
+ * Return: @new.
+ */
+struct bio *bio_chain_and_submit(struct bio *prev, struct bio *new)
 {
-	struct bio *new = bio_alloc(bdev, nr_pages, opf, gfp);
-
-	if (bio) {
-		bio_chain(bio, new);
-		submit_bio(bio);
+	if (prev) {
+		bio_chain(prev, new);
+		submit_bio(prev);
 	}
-
 	return new;
 }
+EXPORT_SYMBOL_GPL(bio_chain_and_submit);
+
+struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
+		unsigned int nr_pages, blk_opf_t opf, gfp_t gfp)
+{
+	return bio_chain_and_submit(bio, bio_alloc(bdev, nr_pages, opf, gfp));
+}
 EXPORT_SYMBOL_GPL(blk_next_bio);
 
 static void bio_alloc_rescue(struct work_struct *work)
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 875d792bffff82..643d61b7cb82f7 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -824,5 +824,6 @@ static inline void bio_clear_polled(struct bio *bio)
 
 struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
 		unsigned int nr_pages, blk_opf_t opf, gfp_t gfp);
+struct bio *bio_chain_and_submit(struct bio *prev, struct bio *new);
 
 #endif /* __LINUX_BIO_H */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/5] block: add a blk_alloc_discard_bio helper
  2024-03-12 14:45 RFCv2: fix fatal signal handling in __blkdev_issue_discard Christoph Hellwig
  2024-03-12 14:45 ` [PATCH 1/5] block: move discard checks into the ioctl handler Christoph Hellwig
  2024-03-12 14:45 ` [PATCH 2/5] block: add a bio_chain_and_submit helper Christoph Hellwig
@ 2024-03-12 14:45 ` Christoph Hellwig
  2 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2024-03-12 14:45 UTC (permalink / raw
  To: Jens Axboe, Chandan Babu R, Keith Busch
  Cc: linux-block, linux-nvme, linux-xfs

Factor out a helper from __blkdev_issue_discard that chews off as much as
possible from a discard range and allocates a bio for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-lib.c     | 58 ++++++++++++++++++++++++++-------------------
 include/linux/bio.h |  3 +++
 2 files changed, 37 insertions(+), 24 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 50923508a32466..fd97f4dd34e7f4 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -55,36 +55,46 @@ static void await_bio_chain(struct bio *bio)
 	blk_wait_io(&done);
 }
 
+struct bio *blk_alloc_discard_bio(struct block_device *bdev,
+		sector_t *sector, sector_t *nr_sects, gfp_t gfp_mask)
+{
+	sector_t bio_sects = min(*nr_sects, bio_discard_limit(bdev, *sector));
+	struct bio *bio;
+
+	if (WARN_ON_ONCE(!(gfp_mask & __GFP_RECLAIM)))
+		return NULL;
+	if (!bio_sects)
+		return NULL;
+
+	bio = bio_alloc(bdev, 0, REQ_OP_DISCARD, gfp_mask);
+	bio->bi_iter.bi_sector = *sector;
+	bio->bi_iter.bi_size = bio_sects << SECTOR_SHIFT;
+	*sector += bio_sects;
+	*nr_sects -= bio_sects;
+	/*
+	 * We can loop for a long time in here if someone does full device
+	 * discards (like mkfs).  Be nice and allow us to schedule out to avoid
+	 * softlocking if preempt is disabled.
+	 */
+	cond_resched();
+	return bio;
+}
+
 int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, struct bio **biop)
 {
-	struct bio *bio = *biop;
-
-	while (nr_sects) {
-		sector_t req_sects =
-			min(nr_sects, bio_discard_limit(bdev, sector));
+	struct bio *bio;
 
-		bio = blk_next_bio(bio, bdev, 0, REQ_OP_DISCARD, gfp_mask);
-		bio->bi_iter.bi_sector = sector;
-		bio->bi_iter.bi_size = req_sects << 9;
-		sector += req_sects;
-		nr_sects -= req_sects;
-
-		/*
-		 * We can loop for a long time in here, if someone does
-		 * full device discards (like mkfs). Be nice and allow
-		 * us to schedule out to avoid softlocking if preempt
-		 * is disabled.
-		 */
-		cond_resched();
-		if (fatal_signal_pending(current)) {
-			await_bio_chain(bio);
-			return -EINTR;
-		}
+	while (!fatal_signal_pending(current)) {
+		bio = blk_alloc_discard_bio(bdev, &sector, &nr_sects, gfp_mask);
+		if (!bio)
+			return 0;
+		*biop = bio_chain_and_submit(*biop, bio);
 	}
 
-	*biop = bio;
-	return 0;
+	if (*biop)
+		await_bio_chain(*biop);
+	return -EINTR;
 }
 EXPORT_SYMBOL(__blkdev_issue_discard);
 
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 643d61b7cb82f7..74138c815657fd 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -826,4 +826,7 @@ struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
 		unsigned int nr_pages, blk_opf_t opf, gfp_t gfp);
 struct bio *bio_chain_and_submit(struct bio *prev, struct bio *new);
 
+struct bio *blk_alloc_discard_bio(struct block_device *bdev,
+		sector_t *sector, sector_t *nr_sects, gfp_t gfp_mask);
+
 #endif /* __LINUX_BIO_H */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/5] block: add a bio_chain_and_submit helper
  2024-03-12 14:48 RFCv2: fix fatal signal handling in __blkdev_issue_discard Christoph Hellwig
@ 2024-03-12 14:48 ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2024-03-12 14:48 UTC (permalink / raw
  To: Jens Axboe, Chandan Babu R, Keith Busch
  Cc: linux-block, linux-nvme, linux-xfs

This is basically blk_next_bio just with the bio allocation moved
to the caller to allow for more flexible bio handling in the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bio.c         | 28 ++++++++++++++++++++--------
 include/linux/bio.h |  1 +
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index d24420ed1c4c6f..32ff538b29e564 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -345,18 +345,30 @@ void bio_chain(struct bio *bio, struct bio *parent)
 }
 EXPORT_SYMBOL(bio_chain);
 
-struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
-		unsigned int nr_pages, blk_opf_t opf, gfp_t gfp)
+/**
+ * bio_chain_and_submit - submit a bio after chaining it to another one
+ * @prev: bio to chain and submit
+ * @new: bio to chain to
+ *
+ * If @prev is non-NULL, chain it to @new and submit it.
+ *
+ * Return: @new.
+ */
+struct bio *bio_chain_and_submit(struct bio *prev, struct bio *new)
 {
-	struct bio *new = bio_alloc(bdev, nr_pages, opf, gfp);
-
-	if (bio) {
-		bio_chain(bio, new);
-		submit_bio(bio);
+	if (prev) {
+		bio_chain(prev, new);
+		submit_bio(prev);
 	}
-
 	return new;
 }
+EXPORT_SYMBOL_GPL(bio_chain_and_submit);
+
+struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
+		unsigned int nr_pages, blk_opf_t opf, gfp_t gfp)
+{
+	return bio_chain_and_submit(bio, bio_alloc(bdev, nr_pages, opf, gfp));
+}
 EXPORT_SYMBOL_GPL(blk_next_bio);
 
 static void bio_alloc_rescue(struct work_struct *work)
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 875d792bffff82..643d61b7cb82f7 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -824,5 +824,6 @@ static inline void bio_clear_polled(struct bio *bio)
 
 struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
 		unsigned int nr_pages, blk_opf_t opf, gfp_t gfp);
+struct bio *bio_chain_and_submit(struct bio *prev, struct bio *new);
 
 #endif /* __LINUX_BIO_H */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/5] block: add a bio_chain_and_submit helper
  2024-03-12 14:45 ` [PATCH 2/5] block: add a bio_chain_and_submit helper Christoph Hellwig
@ 2024-03-12 14:51   ` Keith Busch
  2024-03-12 21:06     ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Keith Busch @ 2024-03-12 14:51 UTC (permalink / raw
  To: Christoph Hellwig
  Cc: Jens Axboe, Chandan Babu R, linux-block, linux-nvme, linux-xfs

On Tue, Mar 12, 2024 at 08:45:28AM -0600, Christoph Hellwig wrote:
> +struct bio *bio_chain_and_submit(struct bio *prev, struct bio *new)
>  {
> -	struct bio *new = bio_alloc(bdev, nr_pages, opf, gfp);
> -
> -	if (bio) {
> -		bio_chain(bio, new);
> -		submit_bio(bio);
> +	if (prev) {
> +		bio_chain(prev, new);
> +		submit_bio(prev);
>  	}
> -
>  	return new;
>  }
> +EXPORT_SYMBOL_GPL(bio_chain_and_submit);
> +
> +struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
> +		unsigned int nr_pages, blk_opf_t opf, gfp_t gfp)
> +{
> +	return bio_chain_and_submit(bio, bio_alloc(bdev, nr_pages, opf, gfp));
> +}

I realize you're not changing any behavior here, but I want to ask, is
bio_alloc() always guaranteed to return a valid bio? It sure looks like
it can return NULL under some uncommon conditions, but I can't find
anyone checking the result. So I guess it's safe?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/5] block: add a bio_chain_and_submit helper
  2024-03-12 14:51   ` Keith Busch
@ 2024-03-12 21:06     ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2024-03-12 21:06 UTC (permalink / raw
  To: Keith Busch
  Cc: Christoph Hellwig, Jens Axboe, Chandan Babu R, linux-block,
	linux-nvme, linux-xfs

On Tue, Mar 12, 2024 at 08:51:35AM -0600, Keith Busch wrote:
> > +
> > +struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
> > +		unsigned int nr_pages, blk_opf_t opf, gfp_t gfp)
> > +{
> > +	return bio_chain_and_submit(bio, bio_alloc(bdev, nr_pages, opf, gfp));
> > +}
> 
> I realize you're not changing any behavior here, but I want to ask, is
> bio_alloc() always guaranteed to return a valid bio? It sure looks like
> it can return NULL under some uncommon conditions, but I can't find
> anyone checking the result. So I guess it's safe?

bio_alloc can only fail if we don't wait for allocations, that is if
__GFP_DIRECT_RECLAIM isn't set.

We could an assert here.  Or work on killing the gfp_flags argument
and just add a bio_alloc_nowait for the few cases that need it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] block: move discard checks into the ioctl handler
  2024-03-12 14:45 ` [PATCH 1/5] block: move discard checks into the ioctl handler Christoph Hellwig
@ 2024-03-12 22:12   ` Keith Busch
  2024-03-12 22:31     ` Christoph Hellwig
  2024-03-13 15:40   ` Keith Busch
  1 sibling, 1 reply; 13+ messages in thread
From: Keith Busch @ 2024-03-12 22:12 UTC (permalink / raw
  To: Christoph Hellwig
  Cc: Jens Axboe, Chandan Babu R, linux-block, linux-nvme, linux-xfs

On Tue, Mar 12, 2024 at 08:45:27AM -0600, Christoph Hellwig wrote:
> @@ -95,6 +95,8 @@ static int compat_blkpg_ioctl(struct block_device *bdev,
>  static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
>  		unsigned long arg)
>  {
> +	sector_t bs_mask = (bdev_logical_block_size(bdev) >> SECTOR_SHIFT) - 1;
> +	sector_t sector, nr_sects;
>  	uint64_t range[2];
>  	uint64_t start, len;
>  	struct inode *inode = bdev->bd_inode;
> @@ -105,18 +107,21 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
>  
>  	if (!bdev_max_discard_sectors(bdev))
>  		return -EOPNOTSUPP;
> +	if (bdev_read_only(bdev))
> +		return -EPERM;
>  
>  	if (copy_from_user(range, (void __user *)arg, sizeof(range)))
>  		return -EFAULT;
>  
>  	start = range[0];
>  	len = range[1];
> +	sector = start >> SECTOR_SHIFT;
> +	nr_sects = len >> SECTOR_SHIFT;
>  
> -	if (start & 511)
> +	if (!nr_sects)
>  		return -EINVAL;
> -	if (len & 511)
> +	if ((sector | nr_sects) & bs_mask)
>  		return -EINVAL;
> -
>  	if (start + len > bdev_nr_bytes(bdev))
>  		return -EINVAL;

Maybe you want to shift lower bytes out of consideration, but it is
different, right? For example, if I call this ioctl with start=5 and
len=555, it would return EINVAL, but your change would let it succeed
the same as if start=0, len=512.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] block: move discard checks into the ioctl handler
  2024-03-12 22:12   ` Keith Busch
@ 2024-03-12 22:31     ` Christoph Hellwig
  2024-03-13  1:22       ` Keith Busch
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2024-03-12 22:31 UTC (permalink / raw
  To: Keith Busch
  Cc: Christoph Hellwig, Jens Axboe, Chandan Babu R, linux-block,
	linux-nvme, linux-xfs

On Tue, Mar 12, 2024 at 04:12:54PM -0600, Keith Busch wrote:
> > +	if (!nr_sects)
> >  		return -EINVAL;
> > +	if ((sector | nr_sects) & bs_mask)
> >  		return -EINVAL;
> > -
> >  	if (start + len > bdev_nr_bytes(bdev))
> >  		return -EINVAL;
> 
> Maybe you want to shift lower bytes out of consideration, but it is
> different, right? For example, if I call this ioctl with start=5 and
> len=555, it would return EINVAL, but your change would let it succeed
> the same as if start=0, len=512.

We did the same before, just down in __blkdev_issue_discard instead of
in the ioctl handler.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] block: move discard checks into the ioctl handler
  2024-03-12 22:31     ` Christoph Hellwig
@ 2024-03-13  1:22       ` Keith Busch
  0 siblings, 0 replies; 13+ messages in thread
From: Keith Busch @ 2024-03-13  1:22 UTC (permalink / raw
  To: Christoph Hellwig
  Cc: Jens Axboe, Chandan Babu R, linux-block, linux-nvme, linux-xfs

On Tue, Mar 12, 2024 at 11:31:31PM +0100, Christoph Hellwig wrote:
> On Tue, Mar 12, 2024 at 04:12:54PM -0600, Keith Busch wrote:
> > > +	if (!nr_sects)
> > >  		return -EINVAL;
> > > +	if ((sector | nr_sects) & bs_mask)
> > >  		return -EINVAL;
> > > -
> > >  	if (start + len > bdev_nr_bytes(bdev))
> > >  		return -EINVAL;
> > 
> > Maybe you want to shift lower bytes out of consideration, but it is
> > different, right? For example, if I call this ioctl with start=5 and
> > len=555, it would return EINVAL, but your change would let it succeed
> > the same as if start=0, len=512.
> 
> We did the same before, just down in __blkdev_issue_discard instead of
> in the ioctl handler.

Here's an example program demonstrating the difference:

discard-test.c:
---
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>

int main(int argc, char **argv)
{
	uint64_t range[2];
	int fd;

	if (argc < 2)
	        return -1;

	fd = open(argv[1], O_RDWR);
	if (fd < 0)
	        return fd;

	range[0] = 5;
	range[1] = 555;
	ioctl(fd, BLKDISCARD, &range);
	perror("BLKDISCARD");

	return 0;
}
--

Before:

 # ./discard-test /dev/nvme0n1
 BLKDISCARD: Invalid argument

After:

 # ./discard-test /dev/nvme0n1
 BLKDISCARD: Success

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] block: move discard checks into the ioctl handler
  2024-03-12 14:45 ` [PATCH 1/5] block: move discard checks into the ioctl handler Christoph Hellwig
  2024-03-12 22:12   ` Keith Busch
@ 2024-03-13 15:40   ` Keith Busch
  2024-03-13 20:06     ` Christoph Hellwig
  1 sibling, 1 reply; 13+ messages in thread
From: Keith Busch @ 2024-03-13 15:40 UTC (permalink / raw
  To: Christoph Hellwig
  Cc: Jens Axboe, Chandan Babu R, linux-block, linux-nvme, linux-xfs

On Tue, Mar 12, 2024 at 08:45:27AM -0600, Christoph Hellwig wrote:
> @@ -95,6 +95,8 @@ static int compat_blkpg_ioctl(struct block_device *bdev,
>  static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
>  		unsigned long arg)
>  {
> +	sector_t bs_mask = (bdev_logical_block_size(bdev) >> SECTOR_SHIFT) - 1;
> +	sector_t sector, nr_sects;
>  	uint64_t range[2];
>  	uint64_t start, len;
>  	struct inode *inode = bdev->bd_inode;
> @@ -105,18 +107,21 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
>  
>  	if (!bdev_max_discard_sectors(bdev))
>  		return -EOPNOTSUPP;
> +	if (bdev_read_only(bdev))
> +		return -EPERM;
>  
>  	if (copy_from_user(range, (void __user *)arg, sizeof(range)))
>  		return -EFAULT;
>  
>  	start = range[0];
>  	len = range[1];
> +	sector = start >> SECTOR_SHIFT;
> +	nr_sects = len >> SECTOR_SHIFT;
>  
> -	if (start & 511)
> +	if (!nr_sects)
>  		return -EINVAL;
> -	if (len & 511)
> +	if ((sector | nr_sects) & bs_mask)
>  		return -EINVAL;
> -
>  	if (start + len > bdev_nr_bytes(bdev))
>  		return -EINVAL;
>  
> @@ -124,7 +129,7 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
>  	err = truncate_bdev_range(bdev, mode, start, start + len - 1);
>  	if (err)
>  		goto fail;
> -	err = blkdev_issue_discard(bdev, start >> 9, len >> 9, GFP_KERNEL);
> +	err = blkdev_issue_discard(bdev, sector, nr_sects, GFP_KERNEL);
>  fail:
>  	filemap_invalidate_unlock(inode->i_mapping);
>  	return err;
> -- 

The incremental change I think you want atop this patch to keep the
previous behavior:

-- >8 --
diff --git b/block/ioctl.c a/block/ioctl.c
index 57c8171fda93c..e14388548ab97 100644
--- b/block/ioctl.c
+++ a/block/ioctl.c
@@ -95,7 +95,7 @@ static int compat_blkpg_ioctl(struct block_device *bdev,
 static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
 		unsigned long arg)
 {
-	sector_t bs_mask = (bdev_logical_block_size(bdev) >> SECTOR_SHIFT) - 1;
+	sector_t mask = bdev_logical_block_size(bdev) - 1;
 	sector_t sector, nr_sects;
 	uint64_t range[2];
 	uint64_t start, len;
@@ -120,7 +120,7 @@ static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
 
 	if (!nr_sects)
 		return -EINVAL;
-	if ((sector | nr_sects) & bs_mask)
+	if ((start | len) & mask)
 		return -EINVAL;
 	if (start + len > bdev_nr_bytes(bdev))
 		return -EINVAL;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] block: move discard checks into the ioctl handler
  2024-03-13 15:40   ` Keith Busch
@ 2024-03-13 20:06     ` Christoph Hellwig
  2024-03-13 20:08       ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2024-03-13 20:06 UTC (permalink / raw
  To: Keith Busch
  Cc: Christoph Hellwig, Jens Axboe, Chandan Babu R, linux-block,
	linux-nvme, linux-xfs

On Wed, Mar 13, 2024 at 09:40:21AM -0600, Keith Busch wrote:
> The incremental change I think you want atop this patch to keep the
> previous behavior:

Ah, yes, thanks.  Can you submit your reproducer to blktests or at
least throw a license on it that allows me to wire it up?

Also I'm going to wait for more comments on the approach in this
series before resending it, but we really should get a fix in in
the next days for this regression.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] block: move discard checks into the ioctl handler
  2024-03-13 20:06     ` Christoph Hellwig
@ 2024-03-13 20:08       ` Jens Axboe
  0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2024-03-13 20:08 UTC (permalink / raw
  To: Christoph Hellwig, Keith Busch
  Cc: Chandan Babu R, linux-block, linux-nvme, linux-xfs

On 3/13/24 2:06 PM, Christoph Hellwig wrote:
> Also I'm going to wait for more comments on the approach in this
> series before resending it, but we really should get a fix in in
> the next days for this regression.

Yes please, sooner rather than later. Too much fallout this merge
window, and I've got other items that should go out soon too.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-03-13 20:08 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-12 14:45 RFCv2: fix fatal signal handling in __blkdev_issue_discard Christoph Hellwig
2024-03-12 14:45 ` [PATCH 1/5] block: move discard checks into the ioctl handler Christoph Hellwig
2024-03-12 22:12   ` Keith Busch
2024-03-12 22:31     ` Christoph Hellwig
2024-03-13  1:22       ` Keith Busch
2024-03-13 15:40   ` Keith Busch
2024-03-13 20:06     ` Christoph Hellwig
2024-03-13 20:08       ` Jens Axboe
2024-03-12 14:45 ` [PATCH 2/5] block: add a bio_chain_and_submit helper Christoph Hellwig
2024-03-12 14:51   ` Keith Busch
2024-03-12 21:06     ` Christoph Hellwig
2024-03-12 14:45 ` [PATCH 3/5] block: add a blk_alloc_discard_bio helper Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2024-03-12 14:48 RFCv2: fix fatal signal handling in __blkdev_issue_discard Christoph Hellwig
2024-03-12 14:48 ` [PATCH 2/5] block: add a bio_chain_and_submit helper Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).