[PATCH] ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate

Linux-ext4 Archive mirror
 help / color / mirror / Atom feed

* [PATCH] ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
@ 2022-08-31  7:46 Zhang Yi
  2022-08-31 10:36 ` Jan Kara
  2022-09-30  3:19 ` Theodore Ts'o
  0 siblings, 2 replies; 3+ messages in thread
From: Zhang Yi @ 2022-08-31  7:46 UTC (permalink / raw
  To: linux-ext4; +Cc: tytso, adilger.kernel, jack, openglfreak, yi.zhang, yukuai3

Recently we notice that ext4 filesystem occasionally fail to read
metadata from disk and report error message, but the disk and block
layer looks fine. After analyse, we lockon commit 88dbcbb3a484
("blkdev: avoid migration stalls for blkdev pages"). It provide a
migration method for the bdev, we could move page that has buffers
without extra users now, but it lock the buffers on the page, which
breaks the fragile metadata read operation on ext4 filesystem,
ext4_read_bh_lock() was copied from ll_rw_block(), it depends on the
assumption of that locked buffer means it is under IO. So it just
trylock the buffer and skip submit IO if it lock failed, after
wait_on_buffer() we conclude IO error because the buffer is not
uptodate.

This issue could be easily reproduced by add some delay just after
buffer_migrate_lock_buffers() in __buffer_migrate_folio() and do
fsstress on ext4 filesystem.

  EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193:
  comm fsstress: reading directory lblock 0
  EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334:
  comm fsstress: reading directory lblock 0

Fix it by removing the trylock logic in ext4_read_bh_lock(), just lock
the buffer and submit IO if it's not uptodate, and also leave over
readahead helper.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/super.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9a66abcca1a8..629bba3fa99a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -205,19 +205,12 @@ int ext4_read_bh(struct buffer_head *bh, blk_opf_t op_flags, bh_end_io_t *end_io

 int ext4_read_bh_lock(struct buffer_head *bh, blk_opf_t op_flags, bool wait)
 {
-	if (trylock_buffer(bh)) {
-		if (wait)
-			return ext4_read_bh(bh, op_flags, NULL);
+	lock_buffer(bh);
+	if (!wait) {
 		ext4_read_bh_nowait(bh, op_flags, NULL);
 		return 0;
 	}
-	if (wait) {
-		wait_on_buffer(bh);
-		if (buffer_uptodate(bh))
-			return 0;
-		return -EIO;
-	}
-	return 0;
+	return ext4_read_bh(bh, op_flags, NULL);
 }

 /*
@@ -264,7 +257,8 @@ void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block)
 	struct buffer_head *bh = sb_getblk_gfp(sb, block, 0);

 	if (likely(bh)) {
-		ext4_read_bh_lock(bh, REQ_RAHEAD, false);
+		if (trylock_buffer(bh))
+			ext4_read_bh_nowait(bh, REQ_RAHEAD, NULL);
 		brelse(bh);
 	}
 }
-- 
2.31.1

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
  2022-08-31  7:46 [PATCH] ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate Zhang Yi
@ 2022-08-31 10:36 ` Jan Kara
  2022-09-30  3:19 ` Theodore Ts'o
  1 sibling, 0 replies; 3+ messages in thread
From: Jan Kara @ 2022-08-31 10:36 UTC (permalink / raw
  To: Zhang Yi; +Cc: linux-ext4, tytso, adilger.kernel, jack, openglfreak, yukuai3

On Wed 31-08-22 15:46:29, Zhang Yi wrote:
> Recently we notice that ext4 filesystem occasionally fail to read
> metadata from disk and report error message, but the disk and block
> layer looks fine. After analyse, we lockon commit 88dbcbb3a484
> ("blkdev: avoid migration stalls for blkdev pages"). It provide a
> migration method for the bdev, we could move page that has buffers
> without extra users now, but it lock the buffers on the page, which
> breaks the fragile metadata read operation on ext4 filesystem,
> ext4_read_bh_lock() was copied from ll_rw_block(), it depends on the
> assumption of that locked buffer means it is under IO. So it just
> trylock the buffer and skip submit IO if it lock failed, after
> wait_on_buffer() we conclude IO error because the buffer is not
> uptodate.
> 
> This issue could be easily reproduced by add some delay just after
> buffer_migrate_lock_buffers() in __buffer_migrate_folio() and do
> fsstress on ext4 filesystem.
> 
>   EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193:
>   comm fsstress: reading directory lblock 0
>   EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334:
>   comm fsstress: reading directory lblock 0
> 
> Fix it by removing the trylock logic in ext4_read_bh_lock(), just lock
> the buffer and submit IO if it's not uptodate, and also leave over
> readahead helper.
> 
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Thanks for the fix. It looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/super.c | 16 +++++-----------
>  1 file changed, 5 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 9a66abcca1a8..629bba3fa99a 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -205,19 +205,12 @@ int ext4_read_bh(struct buffer_head *bh, blk_opf_t op_flags, bh_end_io_t *end_io
>  
>  int ext4_read_bh_lock(struct buffer_head *bh, blk_opf_t op_flags, bool wait)
>  {
> -	if (trylock_buffer(bh)) {
> -		if (wait)
> -			return ext4_read_bh(bh, op_flags, NULL);
> +	lock_buffer(bh);
> +	if (!wait) {
>  		ext4_read_bh_nowait(bh, op_flags, NULL);
>  		return 0;
>  	}
> -	if (wait) {
> -		wait_on_buffer(bh);
> -		if (buffer_uptodate(bh))
> -			return 0;
> -		return -EIO;
> -	}
> -	return 0;
> +	return ext4_read_bh(bh, op_flags, NULL);
>  }
>  
>  /*
> @@ -264,7 +257,8 @@ void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block)
>  	struct buffer_head *bh = sb_getblk_gfp(sb, block, 0);
>  
>  	if (likely(bh)) {
> -		ext4_read_bh_lock(bh, REQ_RAHEAD, false);
> +		if (trylock_buffer(bh))
> +			ext4_read_bh_nowait(bh, REQ_RAHEAD, NULL);
>  		brelse(bh);
>  	}
>  }
> -- 
> 2.31.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
  2022-08-31  7:46 [PATCH] ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate Zhang Yi
  2022-08-31 10:36 ` Jan Kara
@ 2022-09-30  3:19 ` Theodore Ts'o
  1 sibling, 0 replies; 3+ messages in thread
From: Theodore Ts'o @ 2022-09-30  3:19 UTC (permalink / raw
  To: linux-ext4, Zhang Yi
  Cc: Theodore Ts'o, adilger.kernel, openglfreak, jack, yukuai3

On Wed, 31 Aug 2022 15:46:29 +0800, Zhang Yi wrote:
> Recently we notice that ext4 filesystem occasionally fail to read
> metadata from disk and report error message, but the disk and block
> layer looks fine. After analyse, we lockon commit 88dbcbb3a484
> ("blkdev: avoid migration stalls for blkdev pages"). It provide a
> migration method for the bdev, we could move page that has buffers
> without extra users now, but it lock the buffers on the page, which
> breaks the fragile metadata read operation on ext4 filesystem,
> ext4_read_bh_lock() was copied from ll_rw_block(), it depends on the
> assumption of that locked buffer means it is under IO. So it just
> trylock the buffer and skip submit IO if it lock failed, after
> wait_on_buffer() we conclude IO error because the buffer is not
> uptodate.
> 
> [...]

Applied, thanks!

[1/1] ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
      commit: 07e3f4273ff54a4c891f05e9f0f0c842b46578a7

Best regards,
-- 
Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-09-30  3:21 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-08-31  7:46 [PATCH] ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate Zhang Yi
2022-08-31 10:36 ` Jan Kara
2022-09-30  3:19 ` Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).