All the mail mirrored from lore.kernel.org
From: "Darrick J. Wong" <djwong@kernel.org>
To: cem@kernel.org, djwong@kernel.org
Cc: djwong@djwong.org, hch@lst.de, linux-xfs@vger.kernel.org
Subject: [GIT PULL 11/11] xfs_repair: support more than 4 billion records
Date: Wed, 17 Apr 2024 15:10:12 -0700	[thread overview]
Message-ID: <171339162291.1911630.9932999805644506997.stg-ugh@frogsfrogsfrogs> (raw)
In-Reply-To: <20240417220440.GB11948@frogsfrogsfrogs>

Hi Carlos,

Please pull this branch with changes for xfsprogs for 6.8.

As usual, I did a test-merge with the main upstream branch as of a few
minutes ago, and didn't see any conflicts.  Please let me know if you
encounter any problems.

The following changes since commit b3bcb8f0a8b5763defc09bc6d9a04da275ad780a:

xfs_repair: rebuild block mappings from rmapbt data (2024-04-17 14:06:28 -0700)

are available in the Git repository at:

https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git tags/repair-support-4bn-records-6.8_2024-04-17

for you to fetch changes up to 90ee2c3a94511da87929989a06199fd537c94db4:

xfs_repair: support more than INT_MAX block maps (2024-04-17 14:06:28 -0700)

----------------------------------------------------------------
xfs_repair: support more than 4 billion records [11/11]

I started looking through all the places where XFS has to deal with the
rc_refcount field of refcount records, and noticed that offline repair
doesn't handle the situation where there are more than 2^32 reverse
mappings in an AG, or where there are more than 2^32 owners of a
particular piece of AG space.  I've estimated that it would take several
months to produce a filesystem with this many records, but we really
ought to do better at handling them than crashing or (worse) not
crashing and writing out corrupt btrees due to integer truncation.
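The integer-truncation hazard described above can be sketched as follows.
This is a hypothetical illustration, not xfsprogs code; the function names
are made up for the example:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: a record count kept in memory as a 64-bit value
 * but stored into a 32-bit field silently drops the high bits once the
 * count crosses 2^32 -- the failure mode described above.
 */
static uint32_t store_count32(uint64_t nr_records)
{
	/* The cast truncates without any error report. */
	return (uint32_t)nr_records;
}

/* A defensive variant that refuses to truncate. */
static int store_count32_checked(uint64_t nr_records, uint32_t *out)
{
	if (nr_records > UINT32_MAX)
		return -1;	/* caller must handle the overflow */
	*out = (uint32_t)nr_records;
	return 0;
}
```

The second variant is the sort of bounds check that lets a tool fail
loudly instead of writing out a corrupt btree.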

Once I started using the bmap_inflate debugger command to create extreme
reflink scenarios, I noticed that the memory usage of xfs_repair was
astronomical.  I traced this to the fact that it allocates a
single huge block mapping array for all files on the system, even though
it only uses that array for data and attr forks that map metadata blocks
(e.g. directories, xattrs, symlinks) and does not use it for regular
data files.
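The fix described above amounts to allocating the map only for forks
that actually map metadata blocks.  A minimal sketch, with hypothetical
names (the real repair code keys off the inode format and fork type):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fork kinds for illustration only. */
enum fork_kind { FORK_DATA_REGFILE, FORK_DIR, FORK_SYMLINK, FORK_XATTR };

/*
 * Only forks that map metadata blocks (directories, xattrs, symlinks)
 * need an in-memory block map; regular file data forks do not, so the
 * allocation is skipped for them entirely.
 */
static bool fork_needs_blockmap(enum fork_kind kind)
{
	return kind != FORK_DATA_REGFILE;
}

static void *alloc_blockmap(enum fork_kind kind, size_t nr_extents)
{
	if (!fork_needs_blockmap(kind))
		return NULL;	/* nothing to track for file data */
	return calloc(nr_extents, sizeof(uint64_t));
}
```

On a filesystem dominated by regular files, skipping the per-file
allocation is what brings the memory footprint back down.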

So I got rid of the 2^31-1 limits on the block map array and turned off
the block mapping for regular data files.  This doesn't answer the
question of what to do if there are a lot of extents, but it kicks the
can down the road until someone creates a maximally sized xattr tree,
which so far nobody has stuck around long enough to complain about.
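Lifting the 2^31-1 limit boils down to widening the extent counters from
int to a 64-bit type.  A minimal sketch under that assumption (the
struct and function names here are illustrative, not the actual
repair/bmap.c interfaces):

```c
#include <stdint.h>

/*
 * Illustrative only: extent counters widened from int (2^31-1 ceiling)
 * to uint64_t so the map can grow past INT_MAX entries.
 */
struct blkmap64 {
	uint64_t	naexts;	/* allocated extent slots, formerly int */
	uint64_t	nexts;	/* extent slots in use, formerly int */
};

/* Grow the allocation geometrically; no INT_MAX clamp any more. */
static int blkmap64_grow(struct blkmap64 *map, uint64_t want)
{
	if (want <= map->naexts)
		return 0;
	while (map->naexts < want)
		map->naexts = map->naexts ? map->naexts * 2 : 16;
	return 0;	/* a real version would also reallocate storage */
}
```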

This has been running on the djcloud for months with no problems.  Enjoy!

Signed-off-by: Darrick J. Wong <djwong@kernel.org>

----------------------------------------------------------------
Darrick J. Wong (8):
xfs_db: add a bmbt inflation command
xfs_repair: slab and bag structs need to track more than 2^32 items
xfs_repair: support more than 2^32 rmapbt records per AG
xfs_repair: support more than 2^32 owners per physical block
xfs_repair: clean up lock resources
xfs_repair: constrain attr fork extent count
xfs_repair: don't create block maps for data files
xfs_repair: support more than INT_MAX block maps

db/Makefile       |  65 ++++++-
db/bmap_inflate.c | 551 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
db/command.c      |   1 +
db/command.h      |   1 +
man/man8/xfs_db.8 |  23 +++
repair/bmap.c     |  23 +--
repair/bmap.h     |   7 +-
repair/dinode.c   |  18 +-
repair/dir2.c     |   2 +-
repair/incore.c   |   9 +
repair/rmap.c     |  25 ++-
repair/rmap.h     |   4 +-
repair/slab.c     |  36 ++--
repair/slab.h     |  36 ++--
14 files changed, 725 insertions(+), 76 deletions(-)
create mode 100644 db/bmap_inflate.c



Thread overview: 14+ messages
2024-04-17 22:04 [GIT PULLBOMB] xfsprogs: catch us up to 6.8, at least Darrick J. Wong
2024-04-17 22:07 ` [GIT PULL 01/11] xfsprogs: packaging fixes for 6.7 Darrick J. Wong
2024-04-18  4:31   ` Christoph Hellwig
2024-04-22  9:56   ` Carlos Maiolino
2024-04-17 22:07 ` [GIT PULL 02/11] xfsprogs: minor " Darrick J. Wong
2024-04-17 22:08 ` [GIT PULL 03/11] xfsprogs: convert utilities to use new rt helpers Darrick J. Wong
2024-04-17 22:08 ` [GIT PULL 04/11] libxfs: sync with 6.8 Darrick J. Wong
2024-04-17 22:08 ` [GIT PULL 05/11] xfs_repair: faster btree bulkloading Darrick J. Wong
2024-04-17 22:08 ` [GIT PULL 06/11] xfsprogs: bug fixes for 6.8 Darrick J. Wong
2024-04-17 22:09 ` [GIT PULL 07/11] xfsprogs: fix log sector size detection Darrick J. Wong
2024-04-17 22:09 ` [GIT PULL 08/11] mkfs: scale shards on ssds Darrick J. Wong
2024-04-17 22:09 ` [GIT PULL 09/11] xfs_scrub: scan metadata files in parallel Darrick J. Wong
2024-04-17 22:09 ` [GIT PULL 10/11] xfs_repair: rebuild inode fork mappings Darrick J. Wong
2024-04-17 22:10 ` Darrick J. Wong [this message]
