FSTests Archive mirror
From: Hans Holmberg <Hans.Holmberg@wdc.com>
To: Zorro Lang <zlang@redhat.com>
Cc: "Zorro Lang" <zlang@kernel.org>,
	"Darrick J. Wong" <djwong@kernel.org>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	"Damien Le Moal" <Damien.LeMoal@wdc.com>,
	"Matias Bjørling" <Matias.Bjorling@wdc.com>,
	"Naohiro Aota" <Naohiro.Aota@wdc.com>,
	"Johannes Thumshirn" <Johannes.Thumshirn@wdc.com>,
	"hch@lst.de" <hch@lst.de>,
	"fstests@vger.kernel.org" <fstests@vger.kernel.org>,
	"Jaegeuk Kim" <jaegeuk@kernel.org>,
	"bvanassche@acm.org" <bvanassche@acm.org>,
	"daeho43@gmail.com" <daeho43@gmail.com>
Subject: Re: [PATCH] generic: add gc stress test
Date: Sat, 11 May 2024 13:08:36 +0000
Message-ID: <9c38fffc-72e9-4766-a9d0-ef90411df6f2@wdc.com>
In-Reply-To: <20240508085135.gwo3wiaqwhptdkju@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com>

On 2024-05-08 10:51, Zorro Lang wrote:
> On Wed, May 08, 2024 at 07:08:01AM +0000, Hans Holmberg wrote:
>> On 2024-04-17 16:50, Hans Holmberg wrote:
>>> On 2024-04-17 16:07, Zorro Lang wrote:
>>>> On Wed, Apr 17, 2024 at 01:21:39PM +0000, Hans Holmberg wrote:
>>>>> On 2024-04-17 14:43, Zorro Lang wrote:
>>>>>> On Tue, Apr 16, 2024 at 11:54:37AM -0700, Darrick J. Wong wrote:
>>>>>>> On Tue, Apr 16, 2024 at 09:07:43AM +0000, Hans Holmberg wrote:
>>>>>>>> +Zorro (doh!)
>>>>>>>>
>>>>>>>> On 2024-04-15 13:23, Hans Holmberg wrote:
>>>>>>>>> This test stresses garbage collection for file systems by first filling
>>>>>>>>> up a scratch mount to a specific usage point with files of random size,
>>>>>>>>> then doing overwrites in parallel with deletes to fragment the backing
>>>>>>>>> storage, forcing reclaim.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> Test results in my setup (kernel 6.8.0-rc4+)
>>>>>>>>> 	f2fs on zoned nullblk: pass (77s)
>>>>>>>>> 	f2fs on conventional nvme ssd: pass (13s)
>>>>>>>>> 	btrfs on zoned nullblk: fails (-ENOSPC)
>>>>>>>>> 	btrfs on conventional nvme ssd: fails (-ENOSPC)
>>>>>>>>> 	xfs on conventional nvme ssd: pass (8s)
>>>>>>>>>
>>>>>>>>> Johannes(cc) is working on the btrfs ENOSPC issue.
>>>>>>>>> 	
>>>>>>>>>       tests/generic/744     | 124 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>       tests/generic/744.out |   6 ++
>>>>>>>>>       2 files changed, 130 insertions(+)
>>>>>>>>>       create mode 100755 tests/generic/744
>>>>>>>>>       create mode 100644 tests/generic/744.out
>>>>>>>>>
>>>>>>>>> diff --git a/tests/generic/744 b/tests/generic/744
>>>>>>>>> new file mode 100755
>>>>>>>>> index 000000000000..2c7ab76bf8b1
>>>>>>>>> --- /dev/null
>>>>>>>>> +++ b/tests/generic/744
>>>>>>>>> @@ -0,0 +1,124 @@
>>>>>>>>> +#! /bin/bash
>>>>>>>>> +# SPDX-License-Identifier: GPL-2.0
>>>>>>>>> +# Copyright (c) 2024 Western Digital Corporation.  All Rights Reserved.
>>>>>>>>> +#
>>>>>>>>> +# FS QA Test No. 744
>>>>>>>>> +#
>>>>>>>>> +# Inspired by btrfs/273 and generic/015
>>>>>>>>> +#
>>>>>>>>> +# This test stresses garbage collection in file systems
>>>>>>>>> +# by first filling up a scratch mount to a specific usage point with
>>>>>>>>> +# files of random size, then doing overwrites in parallel with
>>>>>>>>> +# deletes to fragment the backing zones, forcing reclaim.
>>>>>>>>> +
>>>>>>>>> +. ./common/preamble
>>>>>>>>> +_begin_fstest auto
>>>>>>>>> +
>>>>>>>>> +# real QA test starts here
>>>>>>>>> +
>>>>>>>>> +_require_scratch
>>>>>>>>> +
>>>>>>>>> +# This test requires specific data space usage, skip if we have compression
>>>>>>>>> +# enabled.
>>>>>>>>> +_require_no_compress
>>>>>>>>> +
>>>>>>>>> +M=$((1024 * 1024))
>>>>>>>>> +min_fsz=$((1 * ${M}))
>>>>>>>>> +max_fsz=$((256 * ${M}))
>>>>>>>>> +bs=${M}
>>>>>>>>> +fill_percent=95
>>>>>>>>> +overwrite_percentage=20
>>>>>>>>> +seq=0
>>>>>>>>> +
>>>>>>>>> +_create_file() {
>>>>>>>>> +	local file_name=${SCRATCH_MNT}/data_$1
>>>>>>>>> +	local file_sz=$2
>>>>>>>>> +	local dd_extra=$3
>>>>>>>>> +
>>>>>>>>> +	POSIXLY_CORRECT=yes dd if=/dev/zero of=${file_name} \
>>>>>>>>> +		bs=${bs} count=$(( $file_sz / ${bs} )) \
>>>>>>>>> +		status=none $dd_extra  2>&1
>>>>>>>>> +
>>>>>>>>> +	status=$?
>>>>>>>>> +	if [ $status -ne 0 ]; then
>>>>>>>>> +		echo "Failed writing $file_name" >>$seqres.full
>>>>>>>>> +		exit
>>>>>>>>> +	fi
>>>>>>>>> +}
>>>>>>>
>>>>>>> I wonder, is there a particular reason for doing all these file
>>>>>>> operations with shell code instead of using fsstress to create and
>>>>>>> delete files to fill the fs and stress all the zone-gc code?  This test
>>>>>>> reminds me a lot of generic/476 but with more fork()ing.
>>>>>>
>>>>>> /me has the same confusion. Can this test cover more things than using
>>>>>> fsstress (to do reclaim test) ? Or does it uncover some known bugs which
>>>>>> other cases can't?
>>>>>
>>>>> ah, adding some more background is probably useful:
>>>>>
>>>>> I've been using this test to stress the crap out of the zoned xfs garbage
>>>>> collection / write throttling implementation for zoned rt subvolume
>>>>> support in xfs and it has found a number of issues during implementation
>>>>> that I did not reproduce by other means.
>>>>>
>>>>> I think it also has wider applicability as it triggers bugs in btrfs.
>>>>> f2fs passes without issues, but probably benefits from a quick smoke gc
>>>>> test as well. Discussed this with Bart and Daeho (now in cc) before
>>>>> submitting.
>>>>>
>>>>> Using fsstress would be cool, but as far as I can tell it cannot
>>>>> be told to operate at a specific file system usage point, which
>>>>> is a key thing for this test.
>>>>
>>>> As a random test case, if this case can be transformed to use fsstress to cover
>>>> same issues, that would be nice.
>>>>
>>>> But if as a regression test case, it has its particular test coverage, and the
>>>> issue it covered can't be reproduced by fsstress way, then let's work on this
>>>> bash script one.
>>>>
>>>> Any thoughts?
>>>
>>> Yeah, I think bash is preferable for this particular test case.
>>> Bash also makes it easy to hack for people's private uses.
>>>
>>> I use longer versions of this test (increasing overwrite_percentage)
>>> for weekly testing.
>>>
>>> If we need fsstress for reproducing any future gc bug we can add
>>> what's missing to it then.
>>>
>>> Does that make sense?
>>>
>>
>> Hey Zorro,
>>
>> Any remaining concerns for adding this test? I could run it across
>> more file systems (bcachefs could be interesting) and share the results
>> if need be.
> 
> Hi,
> 
> I remember you mentioned that btrfs fails this test, and I can reproduce it
> on btrfs [1] with a general disk. Have you figured out the reason? I don't
> want to give btrfs a test failure suddenly without a proper explanation :)
> If it's a test case issue, better to fix it for btrfs.


I was surprised to see the failure for btrfs on a conventional block
device, but have not dug into it. I suspect/assume it's the same root
cause as the issue Johannes is looking into when using a zoned block
device as backing storage.

I debugged that a bit with Johannes, and noticed that if I manually
kick btrfs rebalancing after each write via sysfs, the test progresses
further (but super slow).
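
For anyone who wants to try something similar, here is a rough sketch of
that kind of loop. Note that it is not what I ran: it swaps the sysfs knob
for the btrfs balance command line, and the function name, the
RECLAIM_CMD override and the usage=50 filter are all made up for
illustration.

```shell
#!/bin/bash
# Sketch only: write one file, then ask btrfs to compact data block
# groups that are at most 50% used. The helper name, RECLAIM_CMD and
# the 50% filter are assumptions, not part of generic/744.
#
#   $1: mount point   $2: file name   $3: file size in MiB
write_then_reclaim() {
	local mnt=$1
	local name=$2
	local size_mb=$3

	dd if=/dev/zero of="${mnt}/${name}" bs=1M count="${size_mb}" \
		status=none || return 1

	# RECLAIM_CMD is a hypothetical hook so the loop can be dry-run
	# on a non-btrfs mount; left unquoted on purpose so it word-splits.
	${RECLAIM_CMD:-btrfs balance start -dusage=50} "${mnt}"
}
```

Calling it as "RECLAIM_CMD=echo write_then_reclaim /mnt/scratch data_0 16"
does the write and only prints the mount point instead of balancing.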

So *I think* that btrfs needs to:

* tune the triggering of gc to kick in well before available free space
   runs out
* start slowing down / blocking writes when reclaim pressure is high to
   avoid premature -ENOSPC errors.

It's a pretty nasty problem, as potentially any write could fail with
-ENOSPC long before the reported available space runs out when a workload
ends up fragmenting the disk and write pressure is high.
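
To make the two points above a bit more concrete, here is a toy policy
sketch in bash. It is not btrfs code and does not describe how any file
system actually decides this; the function name and both thresholds are
invented for illustration.

```shell
#!/bin/bash
# Hypothetical policy sketch: pick an action from the free space level.
# Both thresholds are assumed tunables, not real btrfs parameters.
#
#   $1: free space in percent (0-100)
#   $2: start background reclaim below this percentage
#   $3: start throttling writers below this percentage
gc_policy() {
	local free_pct=$1
	local reclaim_below=$2
	local throttle_below=$3

	if [ "$free_pct" -lt "$throttle_below" ]; then
		# Reclaim cannot keep up: slow down or block writers
		# instead of returning -ENOSPC early.
		echo "throttle"
	elif [ "$free_pct" -lt "$reclaim_below" ]; then
		# Start gc early, while there is still room to move data.
		echo "reclaim"
	else
		echo "idle"
	fi
}

for free in 40 20 5; do
	echo "free=${free}% -> $(gc_policy "$free" 25 10)"
done
```

With the made-up thresholds 25 and 10 this prints idle, reclaim and
throttle for 40%, 20% and 5% free space. The gap between the two
thresholds is the room gc has to work in before writers have to wait.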


Thanks,
Hans (back from a couple of days away from email)



> 
> Thanks,
> Zorro
> 
> # ./check generic/744
> FSTYP         -- btrfs
> PLATFORM      -- Linux/x86_64 hp-dl380pg8-01 6.9.0-0.rc5.20240425gite88c4cfcb7b8.47.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Apr 25 14:21:52 UTC 2024
> MKFS_OPTIONS  -- /dev/sda4
> MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/sda4 /mnt/scratch
> 
> generic/744 115s ... [failed, exit status 1]- output mismatch (see /root/git/xfstests/results//generic/744.out.bad)
>      --- tests/generic/744.out   2024-05-08 16:11:14.476635417 +0800
>      +++ /root/git/xfstests/results//generic/744.out.bad 2024-05-08 16:46:03.617194377 +0800
>      @@ -2,5 +2,4 @@
>       Starting fillup using direct IO
>       Starting mixed write/delete test using direct IO
>       Starting mixed write/delete test using buffered IO
>      -Syncing
>      -Done, all good
>      +dd: error writing '/mnt/scratch/data_82': No space left on device
>      ...
>      (Run 'diff -u /root/git/xfstests/tests/generic/744.out /root/git/xfstests/results//generic/744.out.bad'  to see the entire diff)
> Ran: generic/744
> Failures: generic/744
> Failed 1 of 1 tests
> 
>>
>> Thanks,
>> Hans
> 
> 

