From: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> To: Daeho Jeong <daeho43@gmail.com> Cc: Yi Zhang <yi.zhang@redhat.com>, linux-block <linux-block@vger.kernel.org>, Jaegeuk Kim <jaegeuk@kernel.org>, Bart Van Assche <bvanassche@acm.org>, "linux-f2fs-devel@lists.sourceforge.net" <linux-f2fs-devel@lists.sourceforge.net> Subject: Re: [f2fs-dev] [bug report]WARNING: CPU: 22 PID: 44011 at fs/iomap/iter.c:51 iomap_iter+0x32b observed with blktests zbd/010 Date: Tue, 19 Mar 2024 10:56:11 +0000 [thread overview] Message-ID: <xuuuw57eh4wwbthqe22n6n32m3ejkzzlkpwcicob6h7wfobkt6@j3znemafupbv> (raw) In-Reply-To: <CACOAw_ydoaJTDu1eu+Lasz4AHReHqT5Pgd9a1h5Y4E8y_Hh-8A@mail.gmail.com> On Mar 18, 2024 / 14:12, Daeho Jeong wrote: > On Sun, Mar 17, 2024 at 10:49 PM Shinichiro Kawasaki via > Linux-f2fs-devel <linux-f2fs-devel@lists.sourceforge.net> wrote: > > > > I confirmed that the trigger commit is dbf8e63f48af as Yi reported. I took a > > look in the commit, but it looks fine to me. So I thought the cause is not > > in the commit diff. > > > > I found the WARN is printed when the f2fs is set up with multiple devices, > > and read requests are mapped to the very first block of the second device in the > > direct read path. In this case, f2fs_map_blocks() and f2fs_map_blocks_cached() > > modify map->m_pblk as the physical block address from each block device. It > > becomes zero when it is mapped to the first block of the device. However, > > f2fs_iomap_begin() assumes that map->m_pblk is the physical block address of the > > whole f2fs, across the all block devices. It compares map->m_pblk against > > NULL_ADDR == 0, then go into the unexpected branch and sets the invalid > > iomap->length. The WARN catches the invalid iomap->length. > > > > This WARN is printed even for non-zoned block devices, by following steps. > > > > - Create two (non-zoned) null_blk devices memory backed with 128MB size each: > > nullb0 and nullb1. > > # mkfs.f2fs /dev/nullb0 -c /dev/nullb1 > > # mount -t f2fs /dev/nullb0 "${mount_dir}" > > # dd if=/dev/zero of="${mount_dir}/test.dat" bs=1M count=192 > > # dd if="${mount_dir}/test.dat" of=/dev/null bs=1M count=192 iflag=direct > > > > I created a fix candidate patch [1]. It modifies f2fs_iomap_begin() to handle > > map->m_pblk as the physical block address from each device start, not the > > address of whole f2fs. I confirmed it avoids the WARN. > > > > But I'm not so sure if the fix is good enough. map->m_pblk has dual meanings. > > Sometimes it holds the physical block address of each device, and sometimes > > the address of the whole f2fs. I'm not sure what is the condition for > > map->m_pblk to have which meaning. I guess F2FS_GET_BLOCK_DIO flag is the > > condition, but f2fs_map_blocks_cached() does not ensure it. > > > > Also, I noticed that map->m_pblk is referred to in other functions below, and > > not sure if they need the similar change as I did for f2fs_iomap_begin(). > > > > f2fs_fiemap() > > f2fs_read_single_page() > > f2fs_bmap() > > check_swap_activate() > > > > I would like to hear advices from f2fs experts for the fix. > > > > > > [1] > > > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c > > index 26e317696b33..5232223a69e5 100644 > > --- a/fs/f2fs/data.c > > +++ b/fs/f2fs/data.c > > @@ -1569,6 +1569,7 @@ static bool f2fs_map_blocks_cached(struct inode *inode, > > int bidx = f2fs_target_device_index(sbi, map->m_pblk); > > struct f2fs_dev_info *dev = &sbi->devs[bidx]; > > > > + map->m_multidev_dio = true; > > map->m_bdev = dev->bdev; > > map->m_pblk -= dev->start_blk; > > map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk); > > @@ -4211,9 +4212,11 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t offset, loff_t length, > > unsigned int flags, struct iomap *iomap, > > struct iomap *srcmap) > > { > > + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); > > struct f2fs_map_blocks map = {}; > > pgoff_t next_pgofs = 0; > > - int err; > > + block_t pblk; > > + int err, i; > > > > map.m_lblk = bytes_to_blks(inode, offset); > > map.m_len = bytes_to_blks(inode, offset + length - 1) - map.m_lblk + 1; > > @@ -4239,12 +4242,17 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t offset, loff_t length, > > * We should never see delalloc or compressed extents here based on > > * prior flushing and checks. > > */ > > - if (WARN_ON_ONCE(map.m_pblk == NEW_ADDR)) > > + pblk = map.m_pblk; > > + if (map.m_multidev_dio && map.m_flags & F2FS_MAP_MAPPED) > > + for (i = 0; i < sbi->s_ndevs; i++) > > + if (FDEV(i).bdev == map.m_bdev) > > + pblk += FDEV(i).start_blk; > > + if (WARN_ON_ONCE(pblk == NEW_ADDR)) > > return -EINVAL; > > - if (WARN_ON_ONCE(map.m_pblk == COMPRESS_ADDR)) > > + if (WARN_ON_ONCE(pblk == COMPRESS_ADDR)) > > return -EINVAL; > > > > - if (map.m_pblk != NULL_ADDR) { > > + if (pblk != NULL_ADDR) { > > I feel like we should check only whether the block is really mapped or > not by checking F2FS_MAP_MAPPED field without changing the pblk, since > "0" pblk for the secondary device should remain 0 if it's the correct > value. My intent was to keep the physical block address from the secondary device start in map.m_pblk. So "0" for the secondeary device is kept in map.m_pblk. Having said that, I'm not sure that this modification covers all conditions. I think your comment above is similar as Chao's comment. Will respond to it. > > > iomap->length = blks_to_bytes(inode, map.m_len); > > iomap->type = IOMAP_MAPPED; > > iomap->flags |= IOMAP_F_MERGED; > > > > _______________________________________________ > > Linux-f2fs-devel mailing list > > Linux-f2fs-devel@lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
WARNING: multiple messages have this Message-ID (diff)
From: Shinichiro Kawasaki via Linux-f2fs-devel <linux-f2fs-devel@lists.sourceforge.net> To: Daeho Jeong <daeho43@gmail.com> Cc: linux-block <linux-block@vger.kernel.org>, Jaegeuk Kim <jaegeuk@kernel.org>, Yi Zhang <yi.zhang@redhat.com>, Bart Van Assche <bvanassche@acm.org>, "linux-f2fs-devel@lists.sourceforge.net" <linux-f2fs-devel@lists.sourceforge.net> Subject: Re: [f2fs-dev] [bug report]WARNING: CPU: 22 PID: 44011 at fs/iomap/iter.c:51 iomap_iter+0x32b observed with blktests zbd/010 Date: Tue, 19 Mar 2024 10:56:11 +0000 [thread overview] Message-ID: <xuuuw57eh4wwbthqe22n6n32m3ejkzzlkpwcicob6h7wfobkt6@j3znemafupbv> (raw) In-Reply-To: <CACOAw_ydoaJTDu1eu+Lasz4AHReHqT5Pgd9a1h5Y4E8y_Hh-8A@mail.gmail.com> On Mar 18, 2024 / 14:12, Daeho Jeong wrote: > On Sun, Mar 17, 2024 at 10:49 PM Shinichiro Kawasaki via > Linux-f2fs-devel <linux-f2fs-devel@lists.sourceforge.net> wrote: > > > > I confirmed that the trigger commit is dbf8e63f48af as Yi reported. I took a > > look in the commit, but it looks fine to me. So I thought the cause is not > > in the commit diff. > > > > I found the WARN is printed when the f2fs is set up with multiple devices, > > and read requests are mapped to the very first block of the second device in the > > direct read path. In this case, f2fs_map_blocks() and f2fs_map_blocks_cached() > > modify map->m_pblk as the physical block address from each block device. It > > becomes zero when it is mapped to the first block of the device. However, > > f2fs_iomap_begin() assumes that map->m_pblk is the physical block address of the > > whole f2fs, across the all block devices. It compares map->m_pblk against > > NULL_ADDR == 0, then go into the unexpected branch and sets the invalid > > iomap->length. The WARN catches the invalid iomap->length. > > > > This WARN is printed even for non-zoned block devices, by following steps. > > > > - Create two (non-zoned) null_blk devices memory backed with 128MB size each: > > nullb0 and nullb1. > > # mkfs.f2fs /dev/nullb0 -c /dev/nullb1 > > # mount -t f2fs /dev/nullb0 "${mount_dir}" > > # dd if=/dev/zero of="${mount_dir}/test.dat" bs=1M count=192 > > # dd if="${mount_dir}/test.dat" of=/dev/null bs=1M count=192 iflag=direct > > > > I created a fix candidate patch [1]. It modifies f2fs_iomap_begin() to handle > > map->m_pblk as the physical block address from each device start, not the > > address of whole f2fs. I confirmed it avoids the WARN. > > > > But I'm not so sure if the fix is good enough. map->m_pblk has dual meanings. > > Sometimes it holds the physical block address of each device, and sometimes > > the address of the whole f2fs. I'm not sure what is the condition for > > map->m_pblk to have which meaning. I guess F2FS_GET_BLOCK_DIO flag is the > > condition, but f2fs_map_blocks_cached() does not ensure it. > > > > Also, I noticed that map->m_pblk is referred to in other functions below, and > > not sure if they need the similar change as I did for f2fs_iomap_begin(). > > > > f2fs_fiemap() > > f2fs_read_single_page() > > f2fs_bmap() > > check_swap_activate() > > > > I would like to hear advices from f2fs experts for the fix. > > > > > > [1] > > > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c > > index 26e317696b33..5232223a69e5 100644 > > --- a/fs/f2fs/data.c > > +++ b/fs/f2fs/data.c > > @@ -1569,6 +1569,7 @@ static bool f2fs_map_blocks_cached(struct inode *inode, > > int bidx = f2fs_target_device_index(sbi, map->m_pblk); > > struct f2fs_dev_info *dev = &sbi->devs[bidx]; > > > > + map->m_multidev_dio = true; > > map->m_bdev = dev->bdev; > > map->m_pblk -= dev->start_blk; > > map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk); > > @@ -4211,9 +4212,11 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t offset, loff_t length, > > unsigned int flags, struct iomap *iomap, > > struct iomap *srcmap) > > { > > + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); > > struct f2fs_map_blocks map = {}; > > pgoff_t next_pgofs = 0; > > - int err; > > + block_t pblk; > > + int err, i; > > > > map.m_lblk = bytes_to_blks(inode, offset); > > map.m_len = bytes_to_blks(inode, offset + length - 1) - map.m_lblk + 1; > > @@ -4239,12 +4242,17 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t offset, loff_t length, > > * We should never see delalloc or compressed extents here based on > > * prior flushing and checks. > > */ > > - if (WARN_ON_ONCE(map.m_pblk == NEW_ADDR)) > > + pblk = map.m_pblk; > > + if (map.m_multidev_dio && map.m_flags & F2FS_MAP_MAPPED) > > + for (i = 0; i < sbi->s_ndevs; i++) > > + if (FDEV(i).bdev == map.m_bdev) > > + pblk += FDEV(i).start_blk; > > + if (WARN_ON_ONCE(pblk == NEW_ADDR)) > > return -EINVAL; > > - if (WARN_ON_ONCE(map.m_pblk == COMPRESS_ADDR)) > > + if (WARN_ON_ONCE(pblk == COMPRESS_ADDR)) > > return -EINVAL; > > > > - if (map.m_pblk != NULL_ADDR) { > > + if (pblk != NULL_ADDR) { > > I feel like we should check only whether the block is really mapped or > not by checking F2FS_MAP_MAPPED field without changing the pblk, since > "0" pblk for the secondary device should remain 0 if it's the correct > value. My intent was to keep the physical block address from the secondary device start in map.m_pblk. So "0" for the secondeary device is kept in map.m_pblk. Having said that, I'm not sure that this modification covers all conditions. I think your comment above is similar as Chao's comment. Will respond to it. > > > iomap->length = blks_to_bytes(inode, map.m_len); > > iomap->type = IOMAP_MAPPED; > > iomap->flags |= IOMAP_F_MERGED; > > > > _______________________________________________ > > Linux-f2fs-devel mailing list > > Linux-f2fs-devel@lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
next prev parent reply other threads:[~2024-03-19 10:56 UTC|newest] Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top 2024-02-18 16:58 [bug report]WARNING: CPU: 22 PID: 44011 at fs/iomap/iter.c:51 iomap_iter+0x32b observed with blktests zbd/010 Yi Zhang 2024-02-18 16:58 ` [f2fs-dev] " Yi Zhang 2024-02-28 11:08 ` Shinichiro Kawasaki 2024-02-28 11:08 ` [f2fs-dev] " Shinichiro Kawasaki via Linux-f2fs-devel 2024-02-28 12:08 ` Yi Zhang 2024-02-28 12:08 ` [f2fs-dev] " Yi Zhang 2024-03-01 7:49 ` Yi Zhang 2024-03-01 16:33 ` Bart Van Assche 2024-03-01 16:33 ` [f2fs-dev] " Bart Van Assche 2024-03-12 2:52 ` Shinichiro Kawasaki 2024-03-12 2:52 ` [f2fs-dev] " Shinichiro Kawasaki via Linux-f2fs-devel 2024-03-12 4:57 ` Yi Zhang 2024-03-12 9:34 ` Shinichiro Kawasaki 2024-03-12 9:34 ` [f2fs-dev] " Shinichiro Kawasaki via Linux-f2fs-devel 2024-03-18 5:47 ` Shinichiro Kawasaki 2024-03-18 5:47 ` Shinichiro Kawasaki via Linux-f2fs-devel 2024-03-18 21:12 ` Daeho Jeong 2024-03-18 21:12 ` Daeho Jeong 2024-03-19 10:56 ` Shinichiro Kawasaki [this message] 2024-03-19 10:56 ` Shinichiro Kawasaki via Linux-f2fs-devel 2024-03-19 2:22 ` Chao Yu 2024-03-19 2:22 ` Chao Yu 2024-03-19 11:13 ` Shinichiro Kawasaki 2024-03-19 11:13 ` Shinichiro Kawasaki via Linux-f2fs-devel 2024-03-24 12:13 ` Chao Yu 2024-03-24 12:13 ` Chao Yu 2024-03-25 2:14 ` Shinichiro Kawasaki via Linux-f2fs-devel 2024-03-25 2:14 ` Shinichiro Kawasaki 2024-03-25 3:06 ` Chao Yu 2024-03-25 3:06 ` Chao Yu 2024-03-25 6:56 ` Shinichiro Kawasaki via Linux-f2fs-devel 2024-03-25 6:56 ` Shinichiro Kawasaki 2024-03-26 3:30 ` Chao Yu 2024-03-26 3:30 ` Chao Yu
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=xuuuw57eh4wwbthqe22n6n32m3ejkzzlkpwcicob6h7wfobkt6@j3znemafupbv \ --to=shinichiro.kawasaki@wdc.com \ --cc=bvanassche@acm.org \ --cc=daeho43@gmail.com \ --cc=jaegeuk@kernel.org \ --cc=linux-block@vger.kernel.org \ --cc=linux-f2fs-devel@lists.sourceforge.net \ --cc=yi.zhang@redhat.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.