From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: linux-btrfs@vger.kernel.org, Filipe Manana <fdmanana@suse.com>
Subject: Re: [PATCH] btrfs: remove not needed mod_start and mod_len from struct extent_map
Date: Wed, 3 Apr 2024 10:55:20 +0000 [thread overview]
Message-ID: <CAL3q7H7AM5wkgL=d5+7ohh25Q787f89NGKP_FoZdLKVaftH7_w@mail.gmail.com> (raw)
In-Reply-To: <000d9059-f896-429c-b69e-9b9910f6d421@gmx.com>
On Tue, Apr 2, 2024 at 10:53 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> 在 2024/4/3 08:13, Filipe Manana 写道:
> > On Tue, Apr 2, 2024 at 10:18 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> >>
> >>
> >>
> >> 在 2024/4/3 03:10, fdmanana@kernel.org 写道:
> >>> From: Filipe Manana <fdmanana@suse.com>
> >>>
> >>> The mod_start and mod_len fields of struct extent_map were introduced by
> >>> commit 4e2f84e63dc1 ("Btrfs: improve fsync by filtering extents that we
> >>> want") in order to avoid too low performance when fsyncing a file that
> >>> keeps getting extent maps merge, because it resulted in each fsync logging
> >>> again csum ranges that were already merged before.
> >>>
> >>> We don't need this anymore as extent maps in the modified list of extents
> >>> that get merged with other extents and once we log an extent map we remove
> >>> it from the list of modified extent maps.
> >>>
> >>> So remove the mod_start and mod_len fields from struct extent_map and use
> >>> instead the start and len fields when logging checksums in the fast fsync
> >>> path. This also makes EXTENT_FLAG_FILLING unused so remove it as well.
> >>>
> >>> Running the reproducer from the commit mentioned before, with a larger
> >>> number of extents and against a null block device, so that IO is fast
> >>> and we can better see any impact from searching checkums items and logging
> >>> them, gave the following results from dd:
> >>>
> >>> Before this change:
> >>>
> >>> 409600000 bytes (410 MB, 391 MiB) copied, 22.948 s, 17.8 MB/s
> >>>
> >>> After this change:
> >>>
> >>> 409600000 bytes (410 MB, 391 MiB) copied, 22.9997 s, 17.8 MB/s
> >>>
> >>> So no changes in throughput.
> >>> The test was done in a release kernel (non-debug, Debian's default kernel
> >>> config) and its steps are the following:
> >>>
> >>> $ mkfs.btrfs -f /dev/nullb0
> >>> $ mount /dev/sdb /mnt
> >>> $ dd if=/dev/zero of=/mnt/foobar bs=4k count=100000 oflag=sync
> >>> $ umount /mnt
> >>>
> >>> This also reduce the size of struct extent_map from 128 bytes down to 112
> >>> bytes, so now we can have 36 extents maps per 4K page instead of 32.
> >>>
> >>> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> >>
> >> Reviewed-by: Qu Wenruo <wqu@suse.com>
> >>
> >> Great we can start cleaning up the extent map members.
> >>
> >> Mind this patch to be included in my upcoming extent map cleaning series?
> >
> > Well, the second paragraph needs to be updated to:
> >
> > "We don't need this anymore as extent maps in the modified list of extents
> > are never merged with other extent maps and once we log an extent map we
> > remove it from the list of modified extent maps, so it's never logged twice."
> >
> > Plus s/reduce/reduces/ in the last paragraph of the changelog.
> >
> > Why does it need to be included in your series? Can't I just push this
> > to misc-next (with the updated changelog)?
> >
> > I've also been working on a large patchset for extent maps for quite
> > some time, which I hope it's ready in 1 to 2 weeks.
> > Several refactorings and a shrinker.
> >
> > What kind of cleanups do you have?
>
> My plan is to sync the em members with the file extent item members, so
> that there is no/less confusion on how to convert a file extent to
> extent map.
> Orig_start/block_start/block_len/orig_block_len to be replaced by
> disk_bytenr/disk_num_bytes/offset, and keep the old members which are
> already the more-or-less the same as the file extent items.
> (ram_bytes, start, len, generation)
Ok, I see. I'm not completely sure because orig_block_len is a max
value across splits.
It may work to always set it to block_len of the original extent that
was split, this is just from a quick look, so I may be wrong about it.
>
> Unfortunately this would result quite some name changes, and would
> definitely going to conflict with your changes.
That would be a huge change, touching lines everywhere due to member renames.
>
> My bad as I'm not aware of your em work, I'm totally fine to wait for
> your patch and re-base my work then.
I'm not going to change anything in the extent_map structure itself,
but all major operations like adding/removing/replacing/merging/etc
are being changed,
mostly changing function parameters and updating a percpu counter, and
some tiny change for fsync to avoid races and data loss with the
shrinker itself.
So all potential conflicts are likely simple to resolve.
I pushed the patch to for-next with the fixed changelog.
Thanks.
>
> Thanks,
> Qu
>
> >
> >
> >>
> >> Thanks,
> >> Qu
> >>> ---
> >>> fs/btrfs/extent_map.c | 18 ------------------
> >>> fs/btrfs/extent_map.h | 4 ----
> >>> fs/btrfs/inode.c | 4 +---
> >>> fs/btrfs/tree-log.c | 4 ++--
> >>> include/trace/events/btrfs.h | 3 +--
> >>> 5 files changed, 4 insertions(+), 29 deletions(-)
> >>>
> >>> diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
> >>> index 445f7716f1e2..471654cb65b0 100644
> >>> --- a/fs/btrfs/extent_map.c
> >>> +++ b/fs/btrfs/extent_map.c
> >>> @@ -252,8 +252,6 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em)
> >>> em->len += merge->len;
> >>> em->block_len += merge->block_len;
> >>> em->block_start = merge->block_start;
> >>> - em->mod_len = (em->mod_len + em->mod_start) - merge->mod_start;
> >>> - em->mod_start = merge->mod_start;
> >>> em->generation = max(em->generation, merge->generation);
> >>> em->flags |= EXTENT_FLAG_MERGED;
> >>>
> >>> @@ -271,7 +269,6 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em)
> >>> em->block_len += merge->block_len;
> >>> rb_erase_cached(&merge->rb_node, &tree->map);
> >>> RB_CLEAR_NODE(&merge->rb_node);
> >>> - em->mod_len = (merge->mod_start + merge->mod_len) - em->mod_start;
> >>> em->generation = max(em->generation, merge->generation);
> >>> em->flags |= EXTENT_FLAG_MERGED;
> >>> free_extent_map(merge);
> >>> @@ -300,7 +297,6 @@ int unpin_extent_cache(struct btrfs_inode *inode, u64 start, u64 len, u64 gen)
> >>> struct extent_map_tree *tree = &inode->extent_tree;
> >>> int ret = 0;
> >>> struct extent_map *em;
> >>> - bool prealloc = false;
> >>>
> >>> write_lock(&tree->lock);
> >>> em = lookup_extent_mapping(tree, start, len);
> >>> @@ -325,21 +321,9 @@ int unpin_extent_cache(struct btrfs_inode *inode, u64 start, u64 len, u64 gen)
> >>>
> >>> em->generation = gen;
> >>> em->flags &= ~EXTENT_FLAG_PINNED;
> >>> - em->mod_start = em->start;
> >>> - em->mod_len = em->len;
> >>> -
> >>> - if (em->flags & EXTENT_FLAG_FILLING) {
> >>> - prealloc = true;
> >>> - em->flags &= ~EXTENT_FLAG_FILLING;
> >>> - }
> >>>
> >>> try_merge_map(tree, em);
> >>>
> >>> - if (prealloc) {
> >>> - em->mod_start = em->start;
> >>> - em->mod_len = em->len;
> >>> - }
> >>> -
> >>> out:
> >>> write_unlock(&tree->lock);
> >>> free_extent_map(em);
> >>> @@ -361,8 +345,6 @@ static inline void setup_extent_mapping(struct extent_map_tree *tree,
> >>> int modified)
> >>> {
> >>> refcount_inc(&em->refs);
> >>> - em->mod_start = em->start;
> >>> - em->mod_len = em->len;
> >>>
> >>> ASSERT(list_empty(&em->list));
> >>>
> >>> diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
> >>> index c5a098c99cc6..10e9491865c9 100644
> >>> --- a/fs/btrfs/extent_map.h
> >>> +++ b/fs/btrfs/extent_map.h
> >>> @@ -30,8 +30,6 @@ enum {
> >>> ENUM_BIT(EXTENT_FLAG_PREALLOC),
> >>> /* Logging this extent */
> >>> ENUM_BIT(EXTENT_FLAG_LOGGING),
> >>> - /* Filling in a preallocated extent */
> >>> - ENUM_BIT(EXTENT_FLAG_FILLING),
> >>> /* This em is merged from two or more physically adjacent ems */
> >>> ENUM_BIT(EXTENT_FLAG_MERGED),
> >>> };
> >>> @@ -46,8 +44,6 @@ struct extent_map {
> >>> /* all of these are in bytes */
> >>> u64 start;
> >>> u64 len;
> >>> - u64 mod_start;
> >>> - u64 mod_len;
> >>> u64 orig_start;
> >>> u64 orig_block_len;
> >>> u64 ram_bytes;
> >>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> >>> index 3442dedff53d..c6f2b5d1dee1 100644
> >>> --- a/fs/btrfs/inode.c
> >>> +++ b/fs/btrfs/inode.c
> >>> @@ -7338,9 +7338,7 @@ static struct extent_map *create_io_em(struct btrfs_inode *inode, u64 start,
> >>> em->ram_bytes = ram_bytes;
> >>> em->generation = -1;
> >>> em->flags |= EXTENT_FLAG_PINNED;
> >>> - if (type == BTRFS_ORDERED_PREALLOC)
> >>> - em->flags |= EXTENT_FLAG_FILLING;
> >>> - else if (type == BTRFS_ORDERED_COMPRESSED)
> >>> + if (type == BTRFS_ORDERED_COMPRESSED)
> >>> extent_map_set_compression(em, compress_type);
> >>>
> >>> ret = btrfs_replace_extent_map_range(inode, em, true);
> >>> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> >>> index 472918a5bc73..d9777649e170 100644
> >>> --- a/fs/btrfs/tree-log.c
> >>> +++ b/fs/btrfs/tree-log.c
> >>> @@ -4574,8 +4574,8 @@ static int log_extent_csums(struct btrfs_trans_handle *trans,
> >>> struct btrfs_root *csum_root;
> >>> u64 csum_offset;
> >>> u64 csum_len;
> >>> - u64 mod_start = em->mod_start;
> >>> - u64 mod_len = em->mod_len;
> >>> + u64 mod_start = em->start;
> >>> + u64 mod_len = em->len;
> >>> LIST_HEAD(ordered_sums);
> >>> int ret = 0;
> >>>
> >>> diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
> >>> index 90b0222390e5..766cfd48386c 100644
> >>> --- a/include/trace/events/btrfs.h
> >>> +++ b/include/trace/events/btrfs.h
> >>> @@ -277,8 +277,7 @@ DEFINE_EVENT(btrfs__inode, btrfs_inode_evict,
> >>> { EXTENT_FLAG_COMPRESS_LZO, "COMPRESS_LZO" },\
> >>> { EXTENT_FLAG_COMPRESS_ZSTD, "COMPRESS_ZSTD" },\
> >>> { EXTENT_FLAG_PREALLOC, "PREALLOC" },\
> >>> - { EXTENT_FLAG_LOGGING, "LOGGING" },\
> >>> - { EXTENT_FLAG_FILLING, "FILLING" })
> >>> + { EXTENT_FLAG_LOGGING, "LOGGING" })
> >>>
> >>> TRACE_EVENT_CONDITION(btrfs_get_extent,
> >>>
prev parent reply other threads:[~2024-04-03 10:55 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-02 16:40 [PATCH] btrfs: remove not needed mod_start and mod_len from struct extent_map fdmanana
2024-04-02 21:18 ` Qu Wenruo
2024-04-02 21:43 ` Filipe Manana
2024-04-02 21:53 ` Qu Wenruo
2024-04-03 10:55 ` Filipe Manana [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAL3q7H7AM5wkgL=d5+7ohh25Q787f89NGKP_FoZdLKVaftH7_w@mail.gmail.com' \
--to=fdmanana@kernel.org \
--cc=fdmanana@suse.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=quwenruo.btrfs@gmx.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).