From: Hans Holmberg <Hans.Holmberg@wdc.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	Zorro Lang <zlang@redhat.com>
Cc: "Zorro Lang" <zlang@kernel.org>,
	"Darrick J. Wong" <djwong@kernel.org>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	"Damien Le Moal" <Damien.LeMoal@wdc.com>,
	"Matias Bjørling" <Matias.Bjorling@wdc.com>,
	"Naohiro Aota" <Naohiro.Aota@wdc.com>, "hch@lst.de" <hch@lst.de>,
	"fstests@vger.kernel.org" <fstests@vger.kernel.org>,
	"Jaegeuk Kim" <jaegeuk@kernel.org>,
	"bvanassche@acm.org" <bvanassche@acm.org>,
	"daeho43@gmail.com" <daeho43@gmail.com>,
	"Boris Burkov" <boris@bur.io>
Subject: Re: [PATCH] generic: add gc stress test
Date: Tue, 14 May 2024 08:02:00 +0000
Message-ID: <98c90900-b016-429d-a32b-268ac5163fd7@wdc.com>
In-Reply-To: <b8723562-d154-4171-836c-6194cfd708a5@gmx.com>

On 2024-05-13 09:33, Qu Wenruo wrote:
> 
> 
> On 2024/5/13 02:26, Johannes Thumshirn wrote:
>> [ +CC Boris ]
> [...]
>>> I was surprised to see the failure for btrfs on a conventional block
>>> device, but have not dug into it. I suspect/assume it's the same root
>>> cause as the issue Johannes is looking into when using a zoned block
>>> device as backing storage.
>>>
>>> I debugged that a bit with Johannes, and noticed that if I manually
>>> kick btrfs rebalancing after each write via sysfs, the test progresses
>>> further (but super slow).
>>>
>>> So *I think* that btrfs needs to:
>>>
>>> * tune the triggering of gc to kick in way before available free space
>>>       runs out
>>> * start slowing down / blocking writes when reclaim pressure is high to
>>>       avoid premature -ENOSPC:es.
>>
>> Yes both Boris and I are working on different solutions to the GC
>> problem. But apart from that, I have the feeling that using stat to
>> check on the available space is not the best idea.
> 
> Although my previous workaround (fill to 100%, then delete 5%) is not
> going to be feasible for zoned devices, what about the two-run solution
> below?
> 
> - The first run to fill the whole fs until ENOSPC
>     Then calculate how many bytes we have really written. (du?)
> 
> - Recreate the fs and fill to 95% of above number and start the test
> 
> But with this workaround, I'm not 100% sure this is a good idea for all
> filesystems.
> 
> AFAIK ext4/xfs sometimes can under-report the available space (aka,
> reporting no available bytes, but can still write new data).
> 
> If we always go ENOSPC to calculate the real available space, it may
> cause too much pressure.
> 
> And it may be a good idea for us btrfs guys to implement a similar
> under-reporting available space behavior?

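For reference, the two-run idea quoted above could be sketched roughly
as below. This is only an illustration, not code from the patch: the
helper names (bytes_until_enospc, second_run_target) and the
write_chunk callback are hypothetical, and how the file system gets
recreated between the runs is left out.

```python
import errno


def bytes_until_enospc(write_chunk, chunk_size=1 << 20):
    """First run: call write_chunk(n) until it fails with ENOSPC.

    write_chunk is a hypothetical callback that tries to write up to n
    bytes and returns how many bytes it actually wrote. Returns the
    total number of bytes written before ENOSPC was hit.
    """
    total = 0
    while True:
        try:
            written = write_chunk(chunk_size)
        except OSError as err:
            if err.errno == errno.ENOSPC:
                return total
            raise  # any other error is a real failure
        if written <= 0:
            # No progress and no error: stop rather than loop forever.
            return total
        total += written


def second_run_target(measured_bytes, percent=95):
    """Second run: fill a fresh fs to `percent` of the measured bytes."""
    return measured_bytes * percent // 100
```

In a real run, write_chunk would wrap something like os.write() on a
file in the scratch mount; here it is kept abstract so the logic can
be exercised without filling a disk.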

My thoughts on this:

This test is not designed for testing how much data we can write to
a file system, so it would be fine to decrease fill_percent to allow
for a bit of fuzziness. It would make the test longer to run though.
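
As a rough illustration of what lowering fill_percent means, here is a
minimal sketch that assumes the fill target is derived from the
available space reported by statvfs. The function names are
hypothetical and not taken from the patch.

```python
import os


def target_from_available(available_bytes, fill_percent):
    """Bytes to write so that fill_percent of the available space is used."""
    if not 0 <= fill_percent <= 100:
        raise ValueError("fill_percent must be between 0 and 100")
    return available_bytes * fill_percent // 100


def fill_target_bytes(path, fill_percent):
    """Compute the fill target from what statvfs reports for `path`.

    f_bavail counts blocks available to unprivileged users and
    f_frsize is the block size those counts are expressed in.
    """
    st = os.statvfs(path)
    return target_from_available(st.f_bavail * st.f_frsize, fill_percent)
```

The result is only as good as the reported available space, so a
lower fill_percent buys headroom against inexact reporting and
nothing more.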

BUT that does not work around the btrfs issue(s). While experimenting, I
tried decreasing fill_percent to something like 70 and btrfs still
-ENOSPC:ed. It's the fragmentation, and the fact that reclaim does not
happen fast enough, that causes writes to fail (I believe; Johannes &
Boris know better).

Also, how are users supposed to know how much data they can store if
stat does not tell them that with some degree of certainty?

Space accounting for full copy-on-write file systems is a Hard
Problem (tm), especially if metadata is also fully copy on write, but
that should not stop us from trying to do it right :)


Thanks,
Hans


> 
> Thanks,
> Qu
>>
>>> It's a pretty nasty problem, as potentially any write could -ENOSPC
>>> long before the reported available space runs out when a workload
>>> ends up fragmenting the disk and write pressure is high.
>>
>>
> 

