Linux-ext4 Archive mirror
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: Anand Jain <anand.jain@oracle.com>,
	linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [PATCH RFC 4/4] btrfs-progs: convert: support ext2 unwritten file data extents
Date: Fri, 3 May 2024 19:07:35 +0930	[thread overview]
Message-ID: <48787c70-1abf-433e-ad3f-9e212237f9a5@suse.com> (raw)
In-Reply-To: <6d2a19ced4551bfcf2a5d841921af7f84c4ea950.1714722726.git.anand.jain@oracle.com>



在 2024/5/3 18:38, Anand Jain 写道:
> This patch, along with the dependent patches below, adds support for
> ext4 unwritten file extents as preallocated file extent in btrfs.
> 
>   btrfs-progs: convert: refactor ext2_create_file_extents add argument ext2_inode
>   btrfs-progs: convert: struct blk_iterate_data, add ext2-specific file inode pointers
>   btrfs-progs: convert: refactor __btrfs_record_file_extent to add a prealloc flag
> 
> The patch is developed with POV of portability with the current
> e2fsprogs library.
> 
> Testcase:
> 
>       $ dd if=/dev/urandom of=/mnt/test/foo bs=4K count=1 conv=fsync status=none
>       $ dd if=/dev/urandom of=/mnt/test/foo bs=4K count=2 conv=fsync seek=1 status=none
>       $ xfs_io -f -c 'falloc -k 12K 12K' /mnt/test/foo
>       $ dd if=/dev/zero of=/mnt/test/foo bs=4K count=1 conv=fsync seek=6 status=none
> 
>       $ filefrag -v /mnt/test/foo
>       Filesystem type is: ef53
>       File size of /mnt/test/foo is 28672 (7 blocks of 4096 bytes)
> 	 ext:     logical_offset:        physical_offset: length:   expected: flags:
> 	   0:        0..       0:      33280..     33280:      1:
> 	   1:        1..       2:      33792..     33793:      2:      33281:
> 	   2:        3..       5:      33281..     33283:      3:      33794: unwritten
> 	   3:        6..       6:      33794..     33794:      1:      33284: last,eof
> 
>       $ sha256sum /mnt/test/foo
>       18619b678a5c207a971a0aa931604f48162e307c57ecdec450d5f095fe9f32c7  /mnt/test/foo
> 
>     Convert and compare the checksum
> 
>     Before:
> 
>       $ filefrag -v /mnt/test/foo
>       Filesystem type is: 9123683e
>       File size of /mnt/test/foo is 28672 (7 blocks of 4096 bytes)
>        ext:     logical_offset:        physical_offset: length:   expected: flags:
>        0:        0..       0:      33280..     33280:      1:             shared
>        1:        1..       2:      33792..     33793:      2:      33281: shared
>        2:        3..       6:      33281..     33284:      4:      33794: last,shared,eof
>       /mnt/test/foo: 3 extents found
> 
>       $ sha256sum /mnt/test/foo
>       6874a1733e5785682210d69c07f256f684cf5433c7235ed29848b4a4d52030e0  /mnt/test/foo
> 
>     After:
> 
>       $ filefrag -v /mnt/test/foo
>       Filesystem type is: 9123683e
>       File size of /mnt/test/foo is 28672 (7 blocks of 4096 bytes)
> 	 ext:     logical_offset:        physical_offset: length:   expected: flags:
> 	   0:        0..       0:      33280..     33280:      1:             shared
> 	   1:        1..       2:      33792..     33793:      2:      33281: shared
> 	   2:        3..       5:      33281..     33283:      3:      33794: unwritten,shared
> 	   3:        6..       6:      33794..     33794:      1:      33284: last,shared,eof
>       /mnt/test/foo: 4 extents found
> 
>       $ sha256sum /mnt/test/foo
>       18619b678a5c207a971a0aa931604f48162e307c57ecdec450d5f095fe9f32c7  /mnt/test/foo
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> RFC: Limited tested. Is there a ready file or test case available to
> exercise the feature?
> 
>   convert/source-fs.c | 49 ++++++++++++++++++++++++++++++++++++++++++++-
>   convert/source-fs.h |  1 +
>   2 files changed, 49 insertions(+), 1 deletion(-)
> 
> diff --git a/convert/source-fs.c b/convert/source-fs.c
> index 9039b0e66758..647ea1f29060 100644
> --- a/convert/source-fs.c
> +++ b/convert/source-fs.c
> @@ -239,6 +239,45 @@ fail:
>   	return ret;
>   }
>   
> +int find_prealloc(struct blk_iterate_data *data, int index, bool *prealloc)

This function is called for every file extent we're going to create.
I'm not a big fan of doing so many lookup.

My question is, is this the only way to determine the flag of the data 
extent?

My instinct says there should be a straight forward way to determine if 
a file extent is preallocated or not, just like what we do in our file 
extent items.
Thus during the ext2fs_block_iterate2(), there should be some way to 
tell if a block is preallocated or not.

Thus adding ext4 ML to get some feedback.

Thanks,
Qu

> +{
> +	ext2_extent_handle_t handle;
> +	struct ext2fs_extent extent;
> +
> +	if (ext2fs_extent_open2(data->ext2_fs, data->ext2_ino,
> +				data->ext2_inode, &handle)) {
> +		error("ext2fs_extent_open2 failed, inode %d", data->ext2_ino);
> +		return -EINVAL;
> +	}
> +
> +	if (ext2fs_extent_goto2(handle, 0, index)) {
> +		error("ext2fs_extent_goto2 failed, inode %d num_blocks %llu",
> +		       data->ext2_ino, data->num_blocks);
> +		ext2fs_extent_free(handle);
> +		return -EINVAL;
> +	}
> +
> +	memset(&extent, 0, sizeof(struct ext2fs_extent));
> +	if (ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT, &extent)) {
> +		error("ext2fs_extent_get EXT2_EXTENT_CURRENT failed inode %d",
> +		       data->ext2_ino);
> +		ext2fs_extent_free(handle);
> +		return -EINVAL;
> +	}
> +
> +	if (extent.e_pblk != data->disk_block) {
> +	error("inode %d index %d found wrong extent e_pblk %llu wanted disk_block %llu",
> +		       data->ext2_ino, index, extent.e_pblk, data->disk_block);
> +		ext2fs_extent_free(handle);
> +		return -EINVAL;
> +	}
> +
> +	if (extent.e_flags & EXT2_EXTENT_FLAGS_UNINIT)
> +		*prealloc = true;
> +
> +	return 0;
> +}
> +
>   /*
>    * Record a file extent in original filesystem into btrfs one.
>    * The special point is, old disk_block can point to a reserved range.
> @@ -257,6 +296,7 @@ int record_file_blocks(struct blk_iterate_data *data,
>   	u64 old_disk_bytenr = disk_block * sectorsize;
>   	u64 num_bytes = num_blocks * sectorsize;
>   	u64 cur_off = old_disk_bytenr;
> +	int index = data->first_block;
>   
>   	/* Hole, pass it to record_file_extent directly */
>   	if (old_disk_bytenr == 0)
> @@ -276,6 +316,12 @@ int record_file_blocks(struct blk_iterate_data *data,
>   		u64 extent_num_bytes;
>   		u64 real_disk_bytenr;
>   		u64 cur_len;
> +		bool prealloc = false;
> +
> +		if (find_prealloc(data, index, &prealloc)) {
> +			data->errcode = -1;
> +			return -EINVAL;
> +		}
>   
>   		key.objectid = data->convert_ino;
>   		key.type = BTRFS_EXTENT_DATA_KEY;
> @@ -317,11 +363,12 @@ int record_file_blocks(struct blk_iterate_data *data,
>   		ret = btrfs_record_file_extent(data->trans, data->root,
>   					data->objectid, data->inode, file_pos,
>   					real_disk_bytenr, cur_len,
> -					false);
> +					prealloc);
>   		if (ret < 0)
>   			break;
>   		cur_off += cur_len;
>   		file_pos += cur_len;
> +		index++;
>   
>   		/*
>   		 * No need to care about csum
> diff --git a/convert/source-fs.h b/convert/source-fs.h
> index 0e71e79eddcc..3922c444de10 100644
> --- a/convert/source-fs.h
> +++ b/convert/source-fs.h
> @@ -156,5 +156,6 @@ int read_disk_extent(struct btrfs_root *root, u64 bytenr,
>   		            u32 num_bytes, char *buffer);
>   int record_file_blocks(struct blk_iterate_data *data,
>   			      u64 file_block, u64 disk_block, u64 num_blocks);
> +int find_prealloc(struct blk_iterate_data *data, int index, bool *prealloc);
>   
>   #endif

       reply	other threads:[~2024-05-03  9:37 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1714722726.git.anand.jain@oracle.com>
     [not found] ` <6d2a19ced4551bfcf2a5d841921af7f84c4ea950.1714722726.git.anand.jain@oracle.com>
2024-05-03  9:37   ` Qu Wenruo [this message]
2024-05-03 12:25     ` [PATCH RFC 4/4] btrfs-progs: convert: support ext2 unwritten file data extents Anand Jain
2024-05-03 22:23       ` Qu Wenruo
2024-05-03 23:27         ` Anand Jain
2024-05-04  0:06           ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48787c70-1abf-433e-ad3f-9e212237f9a5@suse.com \
    --to=wqu@suse.com \
    --cc=anand.jain@oracle.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).