From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Naohiro Aota <Naohiro.Aota@wdc.com>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v2 6/8] btrfs-progs: support byte length for zone resetting
Date: Thu, 16 May 2024 07:17:25 +0930
Message-ID: <16385ce4-fde3-4356-add1-dbe05d4f5052@gmx.com>
In-Reply-To: <bxcsi225e5cpmvod6yx764nihpml6dyizjtgslmqajmvtn26mq@l2yuxz4myrye>



On 2024/5/16 01:41, Naohiro Aota wrote:
> On Wed, May 15, 2024 at 08:29:55AM +0930, Qu Wenruo wrote:
>>
>>
>> On 2024/5/15 03:52, Naohiro Aota wrote:
>>> Even with "mkfs.btrfs -b", mkfs.btrfs resets all the zones on the device.
>>> Limit the reset target within the specified length.
>>>
>>> Also, we need to check that there is no active zone outside of the FS
>>> range. If there is one, btrfs fails to meet the active zone limit properly.
>>
>> Could you explain in more detail why an active zone *outside* of the fs
>> range is a problem?
>>
>> It's pretty intuitive to treat such active zones outside the fs range
>> as non-existent, so they should not cause much of a problem (until we
>> want to expand the fs, etc.).
>>
>> They should just act like the data beyond the fs range on traditional
>> devices, which we have never really bothered about.
>
> A zoned device may have an upper limit on the number of active zones, so
> you cannot write into zones beyond that limit at the same time.
>
> https://zonedstorage.io/docs/introduction/zns#zone-resources-limits

Oh, I forgot about the active zone limits.

>
> So, if we have an active zone outside the FS, btrfs cannot utilize all the
> active zones for it. In the worst case, if you have an active zone limit =
> 8 and 5 zones are already used outside the FS, we cannot maintain the
> minimum necessary 4 active zones: superblock, data, metadata, and system
> block group.
>
> Technically, we can scan all the device zones to count active zones and try
> to live with the rest. But, I don't see a clear use case for that.
>
> However ... I just noticed we do so because the current mount code never
> checks btrfs_device->total_bytes. The minimum active zone requirement
> check is broken for the "-b" case, though.

A new series for the kernel would be great.

>
> I believe mandating no active zones outside the FS both at mkfs and mount
> time is a clean way to go unless there is a request with a good reason.

Yeah, requiring no active zones sounds very reasonable now.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

>
>> Thanks,
>> Qu
>>
>>>
>>> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
>>> ---
>>>    common/device-utils.c | 17 ++++++++++++-----
>>>    kernel-shared/zoned.c | 23 ++++++++++++++++++++++-
>>>    kernel-shared/zoned.h |  7 ++++---
>>>    3 files changed, 38 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/common/device-utils.c b/common/device-utils.c
>>> index 86942e0c7041..7df7d9ce39d8 100644
>>> --- a/common/device-utils.c
>>> +++ b/common/device-utils.c
>>> @@ -254,16 +254,23 @@ int btrfs_prepare_device(int fd, const char *file, u64 *byte_count_ret,
>>>
>>>    		if (!zinfo->emulated) {
>>>    			if (opflags & PREP_DEVICE_VERBOSE)
>>> -				printf("Resetting device zones %s (%u zones) ...\n",
>>> -				       file, zinfo->nr_zones);
>>> +				printf("Resetting device zones %s (%llu zones) ...\n",
>>> +				       file, byte_count / zinfo->zone_size);
>>>    			/*
>>>    			 * We cannot ignore zone reset errors for a zoned block
>>>    			 * device as this could result in the inability to write
>>>    			 * to non-empty sequential zones of the device.
>>>    			 */
>>> -			if (btrfs_reset_all_zones(fd, zinfo)) {
>>> -				error("zoned: failed to reset device '%s' zones: %m",
>>> -				      file);
>>> +			ret = btrfs_reset_zones(fd, zinfo, byte_count);
>>> +			if (ret) {
>>> +				if (ret == EBUSY) {
>>> +					error("zoned: device '%s' contains an active zone outside of the FS range",
>>> +					      file);
>>> +					error("zoned: btrfs needs full control of active zones");
>>> +				} else {
>>> +					error("zoned: failed to reset device '%s' zones: %m",
>>> +					      file);
>>> +				}
>>>    				goto err;
>>>    			}
>>>    		}
>>> diff --git a/kernel-shared/zoned.c b/kernel-shared/zoned.c
>>> index fb1e1388804e..b4244966ca36 100644
>>> --- a/kernel-shared/zoned.c
>>> +++ b/kernel-shared/zoned.c
>>> @@ -395,16 +395,24 @@ static int report_zones(int fd, const char *file,
>>>     * Discard blocks in the zones of a zoned block device. Process this with zone
>>>     * size granularity so that blocks in conventional zones are discarded using
>>>     * discard_range and blocks in sequential zones are reset though a zone reset.
>>> + *
>>> + * We need to ensure that zones outside of the FS are not active, so that
>>> + * the FS can use all the active zones. Return EBUSY if there is an active
>>> + * zone.
>>>     */
>>> -int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo)
>>> +int btrfs_reset_zones(int fd, struct btrfs_zoned_device_info *zinfo, u64 byte_count)
>>>    {
>>>    	unsigned int i;
>>>    	int ret = 0;
>>>
>>>    	ASSERT(zinfo);
>>> +	ASSERT(IS_ALIGNED(byte_count, zinfo->zone_size));
>>>
>>>    	/* Zone size granularity */
>>>    	for (i = 0; i < zinfo->nr_zones; i++) {
>>> +		if (byte_count == 0)
>>> +			break;
>>> +
>>>    		if (zinfo->zones[i].type == BLK_ZONE_TYPE_CONVENTIONAL) {
>>>    			ret = device_discard_blocks(fd,
>>>    					     zinfo->zones[i].start << SECTOR_SHIFT,
>>> @@ -419,7 +427,20 @@ int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo)
>>>
>>>    		if (ret)
>>>    			return ret;
>>> +
>>> +		byte_count -= zinfo->zone_size;
>>>    	}
>>> +	for (; i < zinfo->nr_zones; i++) {
>>> +		const enum blk_zone_cond cond = zinfo->zones[i].cond;
>>> +
>>> +		if (zinfo->zones[i].type == BLK_ZONE_TYPE_CONVENTIONAL)
>>> +			continue;
>>> +		if (cond == BLK_ZONE_COND_IMP_OPEN ||
>>> +		    cond == BLK_ZONE_COND_EXP_OPEN ||
>>> +		    cond == BLK_ZONE_COND_CLOSED)
>>> +			return EBUSY;
>>> +	}
>>> +
>>>    	return fsync(fd);
>>>    }
>>>
>>> diff --git a/kernel-shared/zoned.h b/kernel-shared/zoned.h
>>> index 6eba86d266bf..2bf24cbba62a 100644
>>> --- a/kernel-shared/zoned.h
>>> +++ b/kernel-shared/zoned.h
>>> @@ -149,7 +149,7 @@ bool btrfs_redirty_extent_buffer_for_zoned(struct btrfs_fs_info *fs_info,
>>>    					   u64 start, u64 end);
>>>    int btrfs_reset_chunk_zones(struct btrfs_fs_info *fs_info, u64 devid,
>>>    			    u64 offset, u64 length);
>>> -int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo);
>>> +int btrfs_reset_zones(int fd, struct btrfs_zoned_device_info *zinfo, u64 byte_count);
>>>    int zero_zone_blocks(int fd, struct btrfs_zoned_device_info *zinfo, off_t start,
>>>    		     size_t len);
>>>    int btrfs_wipe_temporary_sb(struct btrfs_fs_devices *fs_devices);
>>> @@ -203,8 +203,9 @@ static inline int btrfs_reset_chunk_zones(struct btrfs_fs_info *fs_info,
>>>    	return 0;
>>>    }
>>>
>>> -static inline int btrfs_reset_all_zones(int fd,
>>> -					struct btrfs_zoned_device_info *zinfo)
>>> +static inline int btrfs_reset_zones(int fd,
>>> +				    struct btrfs_zoned_device_info *zinfo,
>>> +				    u64 byte_count)
>>>    {
>>>    	return -EOPNOTSUPP;
>>>    }
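
The active-zone arithmetic discussed in this thread (a limit of 8, with 5
zones already active outside the FS, leaves fewer than the 4 zones btrfs
needs) can be sketched as below. This is only an illustration, not code from
the patch or from btrfs-progs: every sketch_* name is hypothetical, the zone
conditions are simplified stand-ins for the kernel's BLK_ZONE_COND_* values,
and treating a limit of 0 as "no limit" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for zone conditions (hypothetical names). */
enum sketch_zone_cond {
	SKETCH_COND_EMPTY,
	SKETCH_COND_IMP_OPEN,
	SKETCH_COND_EXP_OPEN,
	SKETCH_COND_CLOSED,
	SKETCH_COND_FULL,
};

struct sketch_zone {
	bool conventional;		/* conventional zones are never "active" */
	enum sketch_zone_cond cond;
};

/* Minimum from the thread: superblock, data, metadata, system block group. */
#define SKETCH_MIN_ACTIVE_ZONES 4

/* A sequential zone counts as active when it is open or closed. */
static bool sketch_zone_is_active(const struct sketch_zone *z)
{
	if (z->conventional)
		return false;
	return z->cond == SKETCH_COND_IMP_OPEN ||
	       z->cond == SKETCH_COND_EXP_OPEN ||
	       z->cond == SKETCH_COND_CLOSED;
}

/* Count active zones whose index is at or beyond the end of the FS range. */
static unsigned int sketch_count_active_outside(const struct sketch_zone *zones,
						size_t nr_zones,
						size_t first_outside)
{
	unsigned int count = 0;
	size_t i;

	assert(zones != NULL || nr_zones == 0);

	for (i = first_outside; i < nr_zones; i++) {
		if (sketch_zone_is_active(&zones[i]))
			count++;
	}
	return count;
}

/*
 * Check whether the zones left over for the FS still cover the minimum.
 * Assumption of this sketch: max_active == 0 means the device has no limit.
 */
static bool sketch_can_meet_minimum(unsigned int max_active,
				    unsigned int active_outside)
{
	if (max_active == 0)
		return true;
	if (active_outside >= max_active)
		return false;
	return max_active - active_outside >= SKETCH_MIN_ACTIVE_ZONES;
}
```

With the numbers from the discussion, sketch_can_meet_minimum(8, 5) is false,
since only 3 zones remain. The patch itself takes the stricter route agreed
on above and rejects any active zone outside the FS range instead of
budgeting around it.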
