Linux-bcache Archive mirror
From: Coly Li <colyli@suse.de>
To: mingzhe.zou@easystack.cn
Cc: linux-bcache@vger.kernel.org, zoumingzhe@qq.com
Subject: Re: [PATCH] bcache: limit multiple flash devices size
Date: Thu, 15 Sep 2022 23:23:29 +0800
Message-ID: <DC88819F-BB84-4766-A8FB-8637B6686D5F@suse.de>
In-Reply-To: <20220914060657.22102-1-mingzhe.zou@easystack.cn>



> On Sep 14, 2022, at 14:06, mingzhe.zou@easystack.cn wrote:
> 
> From: mingzhe <mingzhe.zou@easystack.cn>
> 
> Bcache allows multiple flash devices to be created on the same cache.
> We can create multiple flash devices whose total size is larger than
> the cache device's actual size.
> ```
> [root@zou ~]# lsblk /dev/vdd
> NAME       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> vdd        252:48   0  100G  0 disk
> [root@zou ~]# echo 50G > /sys/block/vdd/bcache/set/flash_vol_create
> [root@zou ~]# lsblk /dev/vdd
> NAME       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> vdd        252:48   0  100G  0 disk
> └─bcache1  251:128  0   50G  0 disk
> [root@zou ~]# echo 50G > /sys/block/vdd/bcache/set/flash_vol_create
> [root@zou ~]# lsblk /dev/vdd
> NAME       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> vdd        252:48   0  100G  0 disk
> ├─bcache2  251:256  0   50G  0 disk
> └─bcache1  251:128  0   50G  0 disk
> [root@zou ~]# echo 50G > /sys/block/vdd/bcache/set/flash_vol_create
> [root@zou ~]# lsblk /dev/vdd
> NAME       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> vdd        252:48   0  100G  0 disk
> ├─bcache3  251:256  0   50G  0 disk
> ├─bcache2  251:256  0   50G  0 disk
> └─bcache1  251:128  0   50G  0 disk
> ```
> 
> This patch limits the total size of multiple flash devices. Once no
> free space is left, creating a new flash device fails with an error.
> ```
> [root@zou ~]# lsblk /dev/vdd
> NAME       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> vdd        252:48   0  100G  0 disk
> [root@zou ~]# echo 50G > /sys/block/vdd/bcache/set/flash_vol_create
> [root@zou ~]# lsblk /dev/vdd
> NAME       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> vdd        252:48   0  100G  0 disk
> └─bcache1  251:128  0   50G  0 disk
> [root@zou ~]# echo 50G > /sys/block/vdd/bcache/set/flash_vol_create
> [root@zou ~]# lsblk /dev/vdd
> NAME       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> vdd        252:48   0  100G  0 disk
> ├─bcache2  251:256  0 39.9G  0 disk
> └─bcache1  251:128  0   50G  0 disk
> [root@zou ~]# echo 50G > /sys/block/vdd/bcache/set/flash_vol_create
> -bash: echo: write error: Invalid argument
> [root@zou ~]# lsblk /dev/vdd
> NAME       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
> vdd        252:48   0  100G  0 disk
> ├─bcache2  251:256  0 39.9G  0 disk
> └─bcache1  251:128  0   50G  0 disk
> ```
> 
> Signed-off-by: mingzhe <mingzhe.zou@easystack.cn>
> ---
> drivers/md/bcache/super.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 214a384dc1d7..e019cfd793eb 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -1581,13 +1581,20 @@ static int flash_devs_run(struct cache_set *c)
> 
> static inline sector_t flash_dev_max_sectors(struct cache_set *c)
> {
> +	sector_t sectors;
> +	struct uuid_entry *u;
> 	size_t avail_nbuckets;
> 	struct cache *ca = c->cache;
> 	size_t first_bucket = ca->sb.first_bucket;
> 	size_t njournal_buckets = ca->sb.njournal_buckets;
> 
> 	avail_nbuckets = c->nbuckets - first_bucket - njournal_buckets;
> -	return bucket_to_sector(c, avail_nbuckets / 100 * FLASH_DEV_AVAILABLE_RATIO);
> +	sectors = bucket_to_sector(c, avail_nbuckets / 100 * FLASH_DEV_AVAILABLE_RATIO);
> +
> +	for (u = c->uuids; u < c->uuids + c->nr_uuids && sectors > 0; u++)
> +		if (UUID_FLASH_ONLY(u))
> +			sectors -= min(u->sectors, sectors);
> +	return sectors;

The value returned from flash_dev_max_sectors() reflects the number of buckets that are not allocated to flash devices. But that is not always the number of free buckets that can be allocated to a new flash device, because some of the buckets might be allocated to btree nodes or hold cached dirty data. Although that space might shrink eventually, we should always avoid using up all the free buckets.

Therefore, the exact number of free buckets should be calculated, and there is no cheap method to do it.

There is a variable cache_set->avail_nbuckets for the currently available buckets, but it is only updated after gc completes, so it is not an up-to-date value. So this value is always <= the real number of available buckets. That is to say, if the size of the flash device being created > (cache_set->avail_nbuckets - reserved_buckets), the creation fails even though there might be enough free buckets for it. This is very likely to happen because cache_set->avail_nbuckets is not refreshed frequently.


> }
> 
> int bch_flash_dev_create(struct cache_set *c, uint64_t size)
> @@ -1612,6 +1619,10 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size)
> 
> 	SET_UUID_FLASH_ONLY(u, 1);
> 	u->sectors = min(flash_dev_max_sectors(c), size >> 9);
> +	if (!u->sectors) {
> +		pr_err("Can't create volume, no free space");
> +		return -EINVAL;
> +	}


The idea is good, but the current code doesn't solve the target problem, and I don't have a better solution in mind yet...


Thanks.

Coly Li

> 
> 	bch_uuid_write(c);
> 
> -- 
> 2.17.1
> 

