All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: cem@kernel.org
To: linux-xfs@vger.kernel.org
Cc: djwong@kernel.org, hch@lst.de
Subject: [PATCH 18/67] xfs: automatic freeing of freshly allocated unwritten space
Date: Mon, 22 Apr 2024 18:25:40 +0200	[thread overview]
Message-ID: <20240422163832.858420-20-cem@kernel.org> (raw)
In-Reply-To: <20240422163832.858420-2-cem@kernel.org>

From: "Darrick J. Wong" <djwong@kernel.org>

Source kernel commit: e3042be36c343207b7af249a09f50b4e37e9fda4

As mentioned in the previous commit, online repair wants to allocate
space to write out a new metadata structure, and it also wants to hedge
against system crashes during repairs by logging (and later cancelling)
EFIs to free the space if we crash before committing the new data
structure.

Therefore, create a trio of functions to schedule automatic reaping of
freshly allocated unwritten space.  xfs_alloc_schedule_autoreap creates
a paused EFI representing the space we just allocated.  Once the
allocations are made and the autoreaps scheduled, we can start writing
to disk.

If the writes succeed, xfs_alloc_cancel_autoreap marks the EFI work
items as stale and unpauses the pending deferred work item.  Assuming
that's done in the same transaction that commits the new structure into
the filesystem, we guarantee that either the new object is fully
visible, or that all the space gets reclaimed.

If the writes succeed but only part of an extent was used, repair must
call the same _cancel_autoreap function to kill the first EFI and then
log a new EFI to free the unused space.  The first EFI is already

For full extents that aren't used, xfs_alloc_commit_autoreap will
unpause the EFI, which results in the space being freed during the next
_defer_finish cycle.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
---
 libxfs/defer_item.c |  10 +++--
 libxfs/xfs_alloc.c  | 104 ++++++++++++++++++++++++++++++++++++++++++--
 libxfs/xfs_alloc.h  |  12 +++++
 3 files changed, 119 insertions(+), 7 deletions(-)

diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 8731d1834..c4a16c7ce 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -101,7 +101,7 @@ xfs_extent_free_finish_item(
 	struct xfs_owner_info		oinfo = { };
 	struct xfs_extent_free_item	*xefi;
 	xfs_agblock_t			agbno;
-	int				error;
+	int				error = 0;
 
 	xefi = container_of(item, struct xfs_extent_free_item, xefi_list);
 
@@ -112,8 +112,12 @@ xfs_extent_free_finish_item(
 		oinfo.oi_flags |= XFS_OWNER_INFO_BMBT_BLOCK;
 
 	agbno = XFS_FSB_TO_AGBNO(tp->t_mountp, xefi->xefi_startblock);
-	error = xfs_free_extent(tp, xefi->xefi_pag, agbno,
-			xefi->xefi_blockcount, &oinfo, XFS_AG_RESV_NONE);
+
+	if (!(xefi->xefi_flags & XFS_EFI_CANCELLED)) {
+		error = xfs_free_extent(tp, xefi->xefi_pag, agbno,
+			xefi->xefi_blockcount, &oinfo,
+			XFS_AG_RESV_NONE);
+	}
 
 	/*
 	 * Don't free the XEFI if we need a new transaction to complete
diff --git a/libxfs/xfs_alloc.c b/libxfs/xfs_alloc.c
index 0a2404466..463381be7 100644
--- a/libxfs/xfs_alloc.c
+++ b/libxfs/xfs_alloc.c
@@ -2518,14 +2518,15 @@ xfs_defer_agfl_block(
  * Add the extent to the list of extents to be free at transaction end.
  * The list is maintained sorted (by block number).
  */
-int
-xfs_free_extent_later(
+static int
+xfs_defer_extent_free(
 	struct xfs_trans		*tp,
 	xfs_fsblock_t			bno,
 	xfs_filblks_t			len,
 	const struct xfs_owner_info	*oinfo,
 	enum xfs_ag_resv_type		type,
-	bool				skip_discard)
+	bool				skip_discard,
+	struct xfs_defer_pending	**dfpp)
 {
 	struct xfs_extent_free_item	*xefi;
 	struct xfs_mount		*mp = tp->t_mountp;
@@ -2573,10 +2574,105 @@ xfs_free_extent_later(
 			XFS_FSB_TO_AGBNO(tp->t_mountp, bno), len);
 
 	xfs_extent_free_get_group(mp, xefi);
-	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_FREE, &xefi->xefi_list);
+	*dfpp = xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_FREE, &xefi->xefi_list);
+	return 0;
+}
+
+int
+xfs_free_extent_later(
+	struct xfs_trans		*tp,
+	xfs_fsblock_t			bno,
+	xfs_filblks_t			len,
+	const struct xfs_owner_info	*oinfo,
+	enum xfs_ag_resv_type		type,
+	bool				skip_discard)
+{
+	struct xfs_defer_pending	*dontcare = NULL;
+
+	return xfs_defer_extent_free(tp, bno, len, oinfo, type, skip_discard,
+			&dontcare);
+}
+
+/*
+ * Set up automatic freeing of unwritten space in the filesystem.
+ *
+ * This function attached a paused deferred extent free item to the
+ * transaction.  Pausing means that the EFI will be logged in the next
+ * transaction commit, but the pending EFI will not be finished until the
+ * pending item is unpaused.
+ *
+ * If the system goes down after the EFI has been persisted to the log but
+ * before the pending item is unpaused, log recovery will find the EFI, fail to
+ * find the EFD, and free the space.
+ *
+ * If the pending item is unpaused, the next transaction commit will log an EFD
+ * without freeing the space.
+ *
+ * Caller must ensure that the tp, fsbno, len, oinfo, and resv flags of the
+ * @args structure are set to the relevant values.
+ */
+int
+xfs_alloc_schedule_autoreap(
+	const struct xfs_alloc_arg	*args,
+	bool				skip_discard,
+	struct xfs_alloc_autoreap	*aarp)
+{
+	int				error;
+
+	error = xfs_defer_extent_free(args->tp, args->fsbno, args->len,
+			&args->oinfo, args->resv, skip_discard, &aarp->dfp);
+	if (error)
+		return error;
+
+	xfs_defer_item_pause(args->tp, aarp->dfp);
 	return 0;
 }
 
+/*
+ * Cancel automatic freeing of unwritten space in the filesystem.
+ *
+ * Earlier, we created a paused deferred extent free item and attached it to
+ * this transaction so that we could automatically roll back a new space
+ * allocation if the system went down.  Now we want to cancel the paused work
+ * item by marking the EFI stale so we don't actually free the space, unpausing
+ * the pending item and logging an EFD.
+ *
+ * The caller generally should have already mapped the space into the ondisk
+ * filesystem.  If the reserved space was partially used, the caller must call
+ * xfs_free_extent_later to create a new EFI to free the unused space.
+ */
+void
+xfs_alloc_cancel_autoreap(
+	struct xfs_trans		*tp,
+	struct xfs_alloc_autoreap	*aarp)
+{
+	struct xfs_defer_pending	*dfp = aarp->dfp;
+	struct xfs_extent_free_item	*xefi;
+
+	if (!dfp)
+		return;
+
+	list_for_each_entry(xefi, &dfp->dfp_work, xefi_list)
+		xefi->xefi_flags |= XFS_EFI_CANCELLED;
+
+	xfs_defer_item_unpause(tp, dfp);
+}
+
+/*
+ * Commit automatic freeing of unwritten space in the filesystem.
+ *
+ * This unpauses an earlier _schedule_autoreap and commits to freeing the
+ * allocated space.  Call this if none of the reserved space was used.
+ */
+void
+xfs_alloc_commit_autoreap(
+	struct xfs_trans		*tp,
+	struct xfs_alloc_autoreap	*aarp)
+{
+	if (aarp->dfp)
+		xfs_defer_item_unpause(tp, aarp->dfp);
+}
+
 #ifdef DEBUG
 /*
  * Check if an AGF has a free extent record whose length is equal to
diff --git a/libxfs/xfs_alloc.h b/libxfs/xfs_alloc.h
index 6b95d1d8a..851cafbd6 100644
--- a/libxfs/xfs_alloc.h
+++ b/libxfs/xfs_alloc.h
@@ -255,6 +255,18 @@ void xfs_extent_free_get_group(struct xfs_mount *mp,
 #define XFS_EFI_SKIP_DISCARD	(1U << 0) /* don't issue discard */
 #define XFS_EFI_ATTR_FORK	(1U << 1) /* freeing attr fork block */
 #define XFS_EFI_BMBT_BLOCK	(1U << 2) /* freeing bmap btree block */
+#define XFS_EFI_CANCELLED	(1U << 3) /* dont actually free the space */
+
+struct xfs_alloc_autoreap {
+	struct xfs_defer_pending	*dfp;
+};
+
+int xfs_alloc_schedule_autoreap(const struct xfs_alloc_arg *args,
+		bool skip_discard, struct xfs_alloc_autoreap *aarp);
+void xfs_alloc_cancel_autoreap(struct xfs_trans *tp,
+		struct xfs_alloc_autoreap *aarp);
+void xfs_alloc_commit_autoreap(struct xfs_trans *tp,
+		struct xfs_alloc_autoreap *aarp);
 
 extern struct kmem_cache	*xfs_extfree_item_cache;
 
-- 
2.44.0


  parent reply	other threads:[~2024-04-22 16:39 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-22 16:25 [PATCH 00/67] libxfs: Sync to Linux 6.8 cem
2024-04-22 16:25 ` [PATCH 01/67] xfs: use xfs_defer_pending objects to recover intent items cem
2024-04-22 16:25 ` [PATCH 02/67] xfs: recreate work items when recovering " cem
2024-04-22 16:25 ` [PATCH 03/67] xfs: use xfs_defer_finish_one to finish recovered work items cem
2024-04-22 16:25 ` [PATCH 04/67] xfs: move ->iop_recover to xfs_defer_op_type cem
2024-04-22 16:25 ` [PATCH 05/67] xfs: hoist intent done flag setting to ->finish_item callsite cem
2024-04-22 16:25 ` [PATCH 06/67] xfs: hoist ->create_intent boilerplate to its callsite cem
2024-04-22 16:25 ` [PATCH 07/67] xfs: use xfs_defer_create_done for the relogging operation cem
2024-04-22 16:25 ` [PATCH 08/67] xfs: clean out XFS_LI_DIRTY setting boilerplate from ->iop_relog cem
2024-04-22 16:25 ` [PATCH 09/67] xfs: hoist xfs_trans_add_item calls to defer ops functions cem
2024-04-22 16:25 ` [PATCH 10/67] xfs: move ->iop_relog to struct xfs_defer_op_type cem
2024-04-22 16:25 ` [PATCH 11/67] xfs: make rextslog computation consistent with mkfs cem
2024-04-22 16:25 ` [PATCH 12/67] xfs: fix 32-bit truncation in xfs_compute_rextslog cem
2024-04-22 16:25 ` [PATCH 13/67] xfs: don't allow overly small or large realtime volumes cem
2024-04-22 16:25 ` [PATCH 14/67] xfs: elide ->create_done calls for unlogged deferred work cem
2024-04-22 16:25 ` [PATCH 15/67] xfs: don't append work items to logged xfs_defer_pending objects cem
2024-04-22 16:25 ` [PATCH 16/67] xfs: allow pausing of pending deferred work items cem
2024-04-22 16:25 ` [PATCH 17/67] xfs: remove __xfs_free_extent_later cem
2024-04-22 16:25 ` cem [this message]
2024-04-22 16:25 ` [PATCH 19/67] xfs: remove unused fields from struct xbtree_ifakeroot cem
2024-04-22 16:25 ` [PATCH 20/67] xfs: force small EFIs for reaping btree extents cem
2024-04-22 16:25 ` [PATCH 21/67] xfs: ensure logflagsp is initialized in xfs_bmap_del_extent_real cem
2024-04-22 16:25 ` [PATCH 22/67] xfs: update dir3 leaf block metadata after swap cem
2024-04-22 16:25 ` [PATCH 23/67] xfs: extract xfs_da_buf_copy() helper function cem
2024-04-22 16:25 ` [PATCH 24/67] xfs: move xfs_ondisk.h to libxfs/ cem
2024-04-22 16:25 ` [PATCH 25/67] xfs: consolidate the xfs_attr_defer_* helpers cem
2024-04-22 16:25 ` [PATCH 26/67] xfs: store an ops pointer in struct xfs_defer_pending cem
2024-04-22 16:25 ` [PATCH 27/67] xfs: pass the defer ops instead of type to xfs_defer_start_recovery cem
2024-04-22 16:25 ` [PATCH 28/67] xfs: pass the defer ops directly to xfs_defer_add cem
2024-04-22 16:25 ` [PATCH 29/67] xfs: force all buffers to be written during btree bulk load cem
2024-04-22 16:25 ` [PATCH 30/67] xfs: set XBF_DONE on newly formatted btree block that are ready for writing cem
2024-04-22 16:25 ` [PATCH 31/67] xfs: read leaf blocks when computing keys for bulkloading into node blocks cem
2024-04-22 16:25 ` [PATCH 32/67] xfs: move btree bulkload record initialization to ->get_record implementations cem
2024-04-22 16:25 ` [PATCH 33/67] xfs: constrain dirty buffers while formatting a staged btree cem
2024-04-22 16:25 ` [PATCH 34/67] xfs: repair free space btrees cem
2024-04-22 16:25 ` [PATCH 35/67] xfs: repair inode btrees cem
2024-04-22 16:25 ` [PATCH 36/67] xfs: repair refcount btrees cem
2024-04-22 16:25 ` [PATCH 37/67] xfs: dont cast to char * for XFS_DFORK_*PTR macros cem
2024-04-22 16:26 ` [PATCH 38/67] xfs: set inode sick state flags when we zap either ondisk fork cem
2024-04-22 16:26 ` [PATCH 39/67] xfs: zap broken inode forks cem
2024-04-22 16:26 ` [PATCH 40/67] xfs: repair inode fork block mapping data structures cem
2024-04-22 16:26 ` [PATCH 41/67] xfs: create a ranged query function for refcount btrees cem
2024-04-22 16:26 ` [PATCH 42/67] xfs: create a new inode fork block unmap helper cem
2024-04-22 16:26 ` [PATCH 43/67] xfs: improve dquot iteration for scrub cem
2024-04-22 16:26 ` [PATCH 44/67] xfs: add lock protection when remove perag from radix tree cem
2024-04-22 16:26 ` [PATCH 45/67] xfs: fix perag leak when growfs fails cem
2024-04-22 16:26 ` [PATCH 46/67] xfs: remove the xfs_alloc_arg argument to xfs_bmap_btalloc_accounting cem
2024-04-22 16:26 ` [PATCH 47/67] xfs: also use xfs_bmap_btalloc_accounting for RT allocations cem
2024-04-22 16:26 ` [PATCH 48/67] xfs: return -ENOSPC from xfs_rtallocate_* cem
2024-04-22 16:26 ` [PATCH 49/67] xfs: indicate if xfs_bmap_adjacent changed ap->blkno cem
2024-04-22 16:26 ` [PATCH 50/67] xfs: move xfs_rtget_summary to xfs_rtbitmap.c cem
2024-04-22 16:26 ` [PATCH 51/67] xfs: split xfs_rtmodify_summary_int cem
2024-04-22 16:26 ` [PATCH 52/67] xfs: remove rt-wrappers from xfs_format.h cem
2024-04-22 16:26 ` [PATCH 53/67] xfs: remove XFS_RTMIN/XFS_RTMAX cem
2024-04-22 16:26 ` [PATCH 54/67] xfs: make if_data a void pointer cem
2024-04-22 16:26 ` [PATCH 55/67] xfs: return if_data from xfs_idata_realloc cem
2024-04-22 16:26 ` [PATCH 56/67] xfs: move the xfs_attr_sf_lookup tracepoint cem
2024-04-22 16:26 ` [PATCH 57/67] xfs: simplify xfs_attr_sf_findname cem
2024-04-22 16:26 ` [PATCH 58/67] xfs: remove xfs_attr_shortform_lookup cem
2024-04-22 16:26 ` [PATCH 59/67] xfs: use xfs_attr_sf_findname in xfs_attr_shortform_getvalue cem
2024-04-22 16:26 ` [PATCH 60/67] xfs: remove struct xfs_attr_shortform cem
2024-04-22 16:26 ` [PATCH 61/67] xfs: remove xfs_attr_sf_hdr_t cem
2024-04-22 16:26 ` [PATCH 62/67] xfs: turn the XFS_DA_OP_REPLACE checks in xfs_attr_shortform_addname into asserts cem
2024-04-22 16:26 ` [PATCH 63/67] xfs: fix a use after free in xfs_defer_finish_recovery cem
2024-04-22 16:26 ` [PATCH 64/67] xfs: use the op name in trace_xlog_intent_recovery_failed cem
2024-04-22 16:26 ` [PATCH 65/67] xfs: fix backwards logic in xfs_bmap_alloc_account cem
2024-04-22 16:26 ` [PATCH 66/67] xfs: reset XFS_ATTR_INCOMPLETE filter on node removal cem
2024-04-22 16:26 ` [PATCH 67/67] xfs: remove conditional building of rt geometry validator functions cem
  -- strict thread matches above, loose matches on Subject: below --
2024-04-17 21:16 [PATCHSET 04/11] libxfs: sync with 6.8 Darrick J. Wong
2024-04-17 21:26 ` [PATCH 18/67] xfs: automatic freeing of freshly allocated unwritten space Darrick J. Wong
2024-03-26  2:55 [PATCHSET 02/18] libxfs: sync with 6.8 Darrick J. Wong
2024-03-26  3:07 ` [PATCH 18/67] xfs: automatic freeing of freshly allocated unwritten space Darrick J. Wong
2024-03-13  1:47 [PATCHSET 02/10] libxfs: sync with 6.8 Darrick J. Wong
2024-03-13  1:57 ` [PATCH 18/67] xfs: automatic freeing of freshly allocated unwritten space Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240422163832.858420-20-cem@kernel.org \
    --to=cem@kernel.org \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.