* bring back RT delalloc support v4
@ 2024-03-25  2:24 Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 01/11] xfs: make XFS_TRANS_LOWMODE match the other XFS_TRANS_ definitions Christoph Hellwig
                   ` (10 more replies)
  0 siblings, 11 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

Hi all,

this series adds back delalloc support for RT inodes, at least if the RT
extent size is a single file system block.  This shows really nice
performance improvements for workloads that frequently rewrite or append
to files, and reduces fragmentation for larger writes.  On other
workloads it sometimes shows small performance improvements or flat
performance.

Changes since v3:
 - fix a bug in splitting RT delalloc reservations
 - fix online repair of RT delalloc fscounters
 - adjust the freespace counter refactoring to play nicer with an
   upcoming series of mine

Changes since v2:
 - drop gratuitous changes to xfs_mod_freecounter/xfs_add_freecounter
   that caused a warning to be logged when it shouldn't.
 - track delalloc rtextents instead of rtblocks in xfs_mount
 - fix commit message spelling and typos
 
Changes since v1:
 - keep casting to int64_t for xfs_mod_delalloc
 - add a patch to clarify and assert that the block delta in
   xfs_trans_unreserve_and_mod_sb can only be positive

Diffstat:
 libxfs/xfs_ag.c           |    4 -
 libxfs/xfs_ag_resv.c      |   24 ++-------
 libxfs/xfs_ag_resv.h      |    2 
 libxfs/xfs_alloc.c        |    4 -
 libxfs/xfs_bmap.c         |  118 ++++++++++++++++++++++------------------------
 libxfs/xfs_rtbitmap.c     |   14 +++++
 libxfs/xfs_shared.h       |    6 +-
 scrub/fscounters.c        |    8 ++-
 scrub/fscounters.h        |    1 
 scrub/fscounters_repair.c |    3 -
 scrub/repair.c            |    5 -
 xfs_fsops.c               |   29 +++--------
 xfs_fsops.h               |    2 
 xfs_inode.c               |    3 -
 xfs_iomap.c               |   44 +++++++++++------
 xfs_iops.c                |    2 
 xfs_mount.c               |   85 +++++++++++++++++++--------------
 xfs_mount.h               |   36 ++++++++++----
 xfs_rtalloc.c             |    2 
 xfs_super.c               |   17 ++++--
 xfs_trace.h               |    1 
 xfs_trans.c               |   63 ++++++++++++------------
 22 files changed, 260 insertions(+), 213 deletions(-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 01/11] xfs: make XFS_TRANS_LOWMODE match the other XFS_TRANS_ definitions
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 02/11] xfs: move RT inode locking out of __xfs_bunmapi Christoph Hellwig
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

Commit bb7b1c9c5dd3 ("xfs: tag transactions that contain intent done
items") switched the XFS_TRANS_ definitions to be bit based and to use
comments above the definitions.  As XFS_TRANS_LOWMODE was last and has
a big fat comment, it was missed.  Switch it to the same style.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_shared.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index dfd61fa8332e1a..f35640ad3e7fe4 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -124,7 +124,6 @@ void	xfs_log_get_max_trans_res(struct xfs_mount *mp,
 #define XFS_TRANS_RES_FDBLKS		(1u << 6)
 /* Transaction contains an intent done log item */
 #define XFS_TRANS_HAS_INTENT_DONE	(1u << 7)
-
 /*
  * LOWMODE is used by the allocator to activate the lowspace algorithm - when
  * free space is running low the extent allocator may choose to allocate an
@@ -136,7 +135,7 @@ void	xfs_log_get_max_trans_res(struct xfs_mount *mp,
  * for free space from AG 0. If the correct transaction reservations have been
  * made then this algorithm will eventually find all the space it needs.
  */
-#define XFS_TRANS_LOWMODE	0x100	/* allocate in low space mode */
+#define XFS_TRANS_LOWMODE		(1u << 8)
 
 /*
  * Field values for xfs_trans_mod_sb.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 02/11] xfs: move RT inode locking out of __xfs_bunmapi
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 01/11] xfs: make XFS_TRANS_LOWMODE match the other XFS_TRANS_ definitions Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 03/11] xfs: block deltas in xfs_trans_unreserve_and_mod_sb must be positive Christoph Hellwig
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

__xfs_bunmapi is a bit of an odd place to lock the rtbitmap and rtsummary
inodes given that it is very high level code.  While this only looks ugly
right now, it will become a problem when supporting delayed allocations
for RT inodes as __xfs_bunmapi might end up deleting only delalloc extents
and thus never unlock the rt inodes.

Move the locking into xfs_rtfree_blocks instead (where it will also be
helpful once we support extfree items for RT allocations), and use a new
flag in the transaction to ensure they aren't locked twice.
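
As a usage illustration (hypothetical caller, a minimal sketch): with
the flag in place, a transaction that frees several RT extents only
takes the rtbitmap and rtsummary locks on the first call:

	/* sketch: both frees run under the same transaction tp */
	error = xfs_rtfree_blocks(tp, start1, len1);	/* locks + joins */
	if (!error)
		error = xfs_rtfree_blocks(tp, start2, len2);	/* no relock */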

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_bmap.c     | 10 ----------
 fs/xfs/libxfs/xfs_rtbitmap.c | 14 ++++++++++++++
 fs/xfs/libxfs/xfs_shared.h   |  3 +++
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 656c95a22f2e6d..5fb7b38921c9a3 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5414,16 +5414,6 @@ __xfs_bunmapi(
 	} else
 		cur = NULL;
 
-	if (isrt) {
-		/*
-		 * Synchronize by locking the bitmap inode.
-		 */
-		xfs_ilock(mp->m_rbmip, XFS_ILOCK_EXCL|XFS_ILOCK_RTBITMAP);
-		xfs_trans_ijoin(tp, mp->m_rbmip, XFS_ILOCK_EXCL);
-		xfs_ilock(mp->m_rsumip, XFS_ILOCK_EXCL|XFS_ILOCK_RTSUM);
-		xfs_trans_ijoin(tp, mp->m_rsumip, XFS_ILOCK_EXCL);
-	}
-
 	extno = 0;
 	while (end != (xfs_fileoff_t)-1 && end >= start &&
 	       (nexts == 0 || extno < nexts)) {
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index f246d6dbf4eca8..b8d395fa2448f3 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -1008,6 +1008,20 @@ xfs_rtfree_blocks(
 		return -EIO;
 	}
 
+	/*
+	 * Ensure the bitmap and summary inodes are locked before modifying
+	 * them.  We can get called multiple times per transaction, so record
+	 * the fact that they are locked in the transaction.
+	 */
+	if (!(tp->t_flags & XFS_TRANS_RTBITMAP_LOCKED)) {
+		tp->t_flags |= XFS_TRANS_RTBITMAP_LOCKED;
+
+		xfs_ilock(mp->m_rbmip, XFS_ILOCK_EXCL|XFS_ILOCK_RTBITMAP);
+		xfs_trans_ijoin(tp, mp->m_rbmip, XFS_ILOCK_EXCL);
+		xfs_ilock(mp->m_rsumip, XFS_ILOCK_EXCL|XFS_ILOCK_RTSUM);
+		xfs_trans_ijoin(tp, mp->m_rsumip, XFS_ILOCK_EXCL);
+	}
+
 	return xfs_rtfree_extent(tp, start, len);
 }
 
diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index f35640ad3e7fe4..34f104ed372c09 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -137,6 +137,9 @@ void	xfs_log_get_max_trans_res(struct xfs_mount *mp,
  */
 #define XFS_TRANS_LOWMODE		(1u << 8)
 
+/* Transaction has locked the rtbitmap and rtsum inodes */
+#define XFS_TRANS_RTBITMAP_LOCKED	(1u << 9)
+
 /*
  * Field values for xfs_trans_mod_sb.
  */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 03/11] xfs: block deltas in xfs_trans_unreserve_and_mod_sb must be positive
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 01/11] xfs: make XFS_TRANS_LOWMODE match the other XFS_TRANS_ definitions Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 02/11] xfs: move RT inode locking out of __xfs_bunmapi Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 04/11] xfs: split xfs_mod_freecounter Christoph Hellwig
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

And to make that clearer, rearrange the code a bit and add asserts
and a comment.
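
As a worked example (numbers invented for illustration): a transaction
that reserved 10 blocks and then allocated 3 of them ends up with
t_blk_res = 10 and t_fdblocks_delta = -3, so

	blkdelta = tp->t_blk_res + tp->t_fdblocks_delta;	/* 10 + (-3) = 7 */

puts the 7 unused blocks back.  blkdelta can never go negative because
allocations are only allowed against an existing reservation, which is
exactly what the new asserts check.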

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_trans.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 7350640059cc60..924b460229e951 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -594,28 +594,38 @@ xfs_trans_unreserve_and_mod_sb(
 {
 	struct xfs_mount	*mp = tp->t_mountp;
 	bool			rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
-	int64_t			blkdelta = 0;
-	int64_t			rtxdelta = 0;
+	int64_t			blkdelta = tp->t_blk_res;
+	int64_t			rtxdelta = tp->t_rtx_res;
 	int64_t			idelta = 0;
 	int64_t			ifreedelta = 0;
 	int			error;
 
-	/* calculate deltas */
-	if (tp->t_blk_res > 0)
-		blkdelta = tp->t_blk_res;
-	if ((tp->t_fdblocks_delta != 0) &&
-	    (xfs_has_lazysbcount(mp) ||
-	     (tp->t_flags & XFS_TRANS_SB_DIRTY)))
+	/*
+	 * Calculate the deltas.
+	 *
+	 * t_fdblocks_delta and t_frextents_delta can be positive or negative:
+	 *
+	 *  - positive values indicate blocks freed in the transaction.
+	 *  - negative values indicate blocks allocated in the transaction
+	 *
+	 * Negative values can only happen if the transaction has a block
+	 * reservation that covers the allocated block.  The end result is
+	 * that the calculated delta values must always be positive and we
+	 * can only put back previously allocated or reserved blocks here.
+	 */
+	ASSERT(tp->t_blk_res || tp->t_fdblocks_delta >= 0);
+	if (xfs_has_lazysbcount(mp) || (tp->t_flags & XFS_TRANS_SB_DIRTY)) {
 	        blkdelta += tp->t_fdblocks_delta;
+		ASSERT(blkdelta >= 0);
+	}
 
-	if (tp->t_rtx_res > 0)
-		rtxdelta = tp->t_rtx_res;
-	if ((tp->t_frextents_delta != 0) &&
-	    (tp->t_flags & XFS_TRANS_SB_DIRTY))
+	ASSERT(tp->t_rtx_res || tp->t_frextents_delta >= 0);
+	if (tp->t_flags & XFS_TRANS_SB_DIRTY) {
 		rtxdelta += tp->t_frextents_delta;
+		ASSERT(rtxdelta >= 0);
+	}
 
-	if (xfs_has_lazysbcount(mp) ||
-	     (tp->t_flags & XFS_TRANS_SB_DIRTY)) {
+	if (xfs_has_lazysbcount(mp) || (tp->t_flags & XFS_TRANS_SB_DIRTY)) {
 		idelta = tp->t_icount_delta;
 		ifreedelta = tp->t_ifree_delta;
 	}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 04/11] xfs: split xfs_mod_freecounter
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2024-03-25  2:24 ` [PATCH 03/11] xfs: block deltas in xfs_trans_unreserve_and_mod_sb must be positive Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25 23:41   ` Darrick J. Wong
  2024-03-25  2:24 ` [PATCH 05/11] xfs: reinstate RT support in xfs_bmapi_reserve_delalloc Christoph Hellwig
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

xfs_mod_freecounter has two entirely separate code paths for adding or
subtracting from the free counters.  Only the subtract case looks at the
rsvd flag and can return an error.

Split xfs_mod_freecounter into separate helpers for subtracting or
adding the freecounter, and remove all the impossible to reach error
handling for the addition case.
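
The resulting calling convention, in sketch form (based on the helpers
added below): only the subtract side can fail and needs error handling:

	/* subtracting free blocks may hit ENOSPC and must be checked */
	error = xfs_dec_fdblocks(mp, blocks, rsvd);
	if (error)
		return error;

	/* putting blocks back cannot fail, so there is no return value */
	xfs_add_fdblocks(mp, blocks);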

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_ag.c      |  4 +--
 fs/xfs/libxfs/xfs_ag_resv.c | 24 ++++---------
 fs/xfs/libxfs/xfs_ag_resv.h |  2 +-
 fs/xfs/libxfs/xfs_alloc.c   |  4 +--
 fs/xfs/libxfs/xfs_bmap.c    | 23 +++++++------
 fs/xfs/scrub/fscounters.c   |  2 +-
 fs/xfs/scrub/repair.c       |  5 +--
 fs/xfs/xfs_fsops.c          | 29 +++++-----------
 fs/xfs/xfs_fsops.h          |  2 +-
 fs/xfs/xfs_mount.c          | 67 +++++++++++++++++++------------------
 fs/xfs/xfs_mount.h          | 27 ++++++++++-----
 fs/xfs/xfs_super.c          |  6 +---
 fs/xfs/xfs_trace.h          |  1 -
 fs/xfs/xfs_trans.c          | 25 +++++---------
 14 files changed, 97 insertions(+), 124 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
index dc1873f76bffd5..189e3296fef6de 100644
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -963,9 +963,7 @@ xfs_ag_shrink_space(
 	 * Disable perag reservations so it doesn't cause the allocation request
 	 * to fail. We'll reestablish reservation before we return.
 	 */
-	error = xfs_ag_resv_free(pag);
-	if (error)
-		return error;
+	xfs_ag_resv_free(pag);
 
 	/* internal log shouldn't also show up in the free space btrees */
 	error = xfs_alloc_vextent_exact_bno(&args,
diff --git a/fs/xfs/libxfs/xfs_ag_resv.c b/fs/xfs/libxfs/xfs_ag_resv.c
index da1057bd0e6067..216423df939e5c 100644
--- a/fs/xfs/libxfs/xfs_ag_resv.c
+++ b/fs/xfs/libxfs/xfs_ag_resv.c
@@ -126,14 +126,13 @@ xfs_ag_resv_needed(
 }
 
 /* Clean out a reservation */
-static int
+static void
 __xfs_ag_resv_free(
 	struct xfs_perag		*pag,
 	enum xfs_ag_resv_type		type)
 {
 	struct xfs_ag_resv		*resv;
 	xfs_extlen_t			oldresv;
-	int				error;
 
 	trace_xfs_ag_resv_free(pag, type, 0);
 
@@ -149,30 +148,19 @@ __xfs_ag_resv_free(
 		oldresv = resv->ar_orig_reserved;
 	else
 		oldresv = resv->ar_reserved;
-	error = xfs_mod_fdblocks(pag->pag_mount, oldresv, true);
+	xfs_add_fdblocks(pag->pag_mount, oldresv);
 	resv->ar_reserved = 0;
 	resv->ar_asked = 0;
 	resv->ar_orig_reserved = 0;
-
-	if (error)
-		trace_xfs_ag_resv_free_error(pag->pag_mount, pag->pag_agno,
-				error, _RET_IP_);
-	return error;
 }
 
 /* Free a per-AG reservation. */
-int
+void
 xfs_ag_resv_free(
 	struct xfs_perag		*pag)
 {
-	int				error;
-	int				err2;
-
-	error = __xfs_ag_resv_free(pag, XFS_AG_RESV_RMAPBT);
-	err2 = __xfs_ag_resv_free(pag, XFS_AG_RESV_METADATA);
-	if (err2 && !error)
-		error = err2;
-	return error;
+	__xfs_ag_resv_free(pag, XFS_AG_RESV_RMAPBT);
+	__xfs_ag_resv_free(pag, XFS_AG_RESV_METADATA);
 }
 
 static int
@@ -216,7 +204,7 @@ __xfs_ag_resv_init(
 	if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_AG_RESV_FAIL))
 		error = -ENOSPC;
 	else
-		error = xfs_mod_fdblocks(mp, -(int64_t)hidden_space, true);
+		error = xfs_dec_fdblocks(mp, hidden_space, true);
 	if (error) {
 		trace_xfs_ag_resv_init_error(pag->pag_mount, pag->pag_agno,
 				error, _RET_IP_);
diff --git a/fs/xfs/libxfs/xfs_ag_resv.h b/fs/xfs/libxfs/xfs_ag_resv.h
index b74b210008ea7e..ff20ed93de7724 100644
--- a/fs/xfs/libxfs/xfs_ag_resv.h
+++ b/fs/xfs/libxfs/xfs_ag_resv.h
@@ -6,7 +6,7 @@
 #ifndef __XFS_AG_RESV_H__
 #define	__XFS_AG_RESV_H__
 
-int xfs_ag_resv_free(struct xfs_perag *pag);
+void xfs_ag_resv_free(struct xfs_perag *pag);
 int xfs_ag_resv_init(struct xfs_perag *pag, struct xfs_trans *tp);
 
 bool xfs_ag_resv_critical(struct xfs_perag *pag, enum xfs_ag_resv_type type);
diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 9da52e92172aba..6cb8b2ddc541b4 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -79,7 +79,7 @@ xfs_prealloc_blocks(
 }
 
 /*
- * The number of blocks per AG that we withhold from xfs_mod_fdblocks to
+ * The number of blocks per AG that we withhold from xfs_dec_fdblocks to
  * guarantee that we can refill the AGFL prior to allocating space in a nearly
  * full AG.  Although the space described by the free space btrees, the
  * blocks used by the freesp btrees themselves, and the blocks owned by the
@@ -89,7 +89,7 @@ xfs_prealloc_blocks(
  * until the fs goes down, we subtract this many AG blocks from the incore
  * fdblocks to ensure user allocation does not overcommit the space the
  * filesystem needs for the AGFLs.  The rmap btree uses a per-AG reservation to
- * withhold space from xfs_mod_fdblocks, so we do not account for that here.
+ * withhold space from xfs_dec_fdblocks, so we do not account for that here.
  */
 #define XFS_ALLOCBT_AGFL_RESERVE	4
 
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 5fb7b38921c9a3..240507cbe4db3e 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1983,10 +1983,11 @@ xfs_bmap_add_extent_delay_real(
 	}
 
 	/* adjust for changes in reserved delayed indirect blocks */
-	if (da_new != da_old) {
-		ASSERT(state == 0 || da_new < da_old);
-		error = xfs_mod_fdblocks(mp, (int64_t)(da_old - da_new),
-				false);
+	if (da_new < da_old) {
+		xfs_add_fdblocks(mp, da_old - da_new);
+	} else if (da_new > da_old) {
+		ASSERT(state == 0);
+		error = xfs_dec_fdblocks(mp, da_new - da_old, false);
 	}
 
 	xfs_bmap_check_leaf_extents(bma->cur, bma->ip, whichfork);
@@ -2688,8 +2689,8 @@ xfs_bmap_add_extent_hole_delay(
 	}
 	if (oldlen != newlen) {
 		ASSERT(oldlen > newlen);
-		xfs_mod_fdblocks(ip->i_mount, (int64_t)(oldlen - newlen),
-				 false);
+		xfs_add_fdblocks(ip->i_mount, oldlen - newlen);
+
 		/*
 		 * Nothing to do for disk quota accounting here.
 		 */
@@ -4108,11 +4109,11 @@ xfs_bmapi_reserve_delalloc(
 	indlen = (xfs_extlen_t)xfs_bmap_worst_indlen(ip, alen);
 	ASSERT(indlen > 0);
 
-	error = xfs_mod_fdblocks(mp, -((int64_t)alen), false);
+	error = xfs_dec_fdblocks(mp, alen, false);
 	if (error)
 		goto out_unreserve_quota;
 
-	error = xfs_mod_fdblocks(mp, -((int64_t)indlen), false);
+	error = xfs_dec_fdblocks(mp, indlen, false);
 	if (error)
 		goto out_unreserve_blocks;
 
@@ -4140,7 +4141,7 @@ xfs_bmapi_reserve_delalloc(
 	return 0;
 
 out_unreserve_blocks:
-	xfs_mod_fdblocks(mp, alen, false);
+	xfs_add_fdblocks(mp, alen);
 out_unreserve_quota:
 	if (XFS_IS_QUOTA_ON(mp))
 		xfs_quota_unreserve_blkres(ip, alen);
@@ -4926,7 +4927,7 @@ xfs_bmap_del_extent_delay(
 	ASSERT(got_endoff >= del_endoff);
 
 	if (isrt)
-		xfs_mod_frextents(mp, xfs_rtb_to_rtx(mp, del->br_blockcount));
+		xfs_add_frextents(mp, xfs_rtb_to_rtx(mp, del->br_blockcount));
 
 	/*
 	 * Update the inode delalloc counter now and wait to update the
@@ -5013,7 +5014,7 @@ xfs_bmap_del_extent_delay(
 	if (!isrt)
 		da_diff += del->br_blockcount;
 	if (da_diff) {
-		xfs_mod_fdblocks(mp, da_diff, false);
+		xfs_add_fdblocks(mp, da_diff);
 		xfs_mod_delalloc(mp, -da_diff);
 	}
 	return error;
diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c
index d310737c882367..6f465373aa2027 100644
--- a/fs/xfs/scrub/fscounters.c
+++ b/fs/xfs/scrub/fscounters.c
@@ -517,7 +517,7 @@ xchk_fscounters(
 
 	/*
 	 * If the filesystem is not frozen, the counter summation calls above
-	 * can race with xfs_mod_freecounter, which subtracts a requested space
+	 * can race with xfs_dec_freecounter, which subtracts a requested space
 	 * reservation from the counter and undoes the subtraction if that made
 	 * the counter go negative.  Therefore, it's possible to see negative
 	 * values here, and we should only flag that as a corruption if we
diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index f43dce771cdd26..6123e6c7ac7d67 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -963,9 +963,7 @@ xrep_reset_perag_resv(
 	ASSERT(sc->tp);
 
 	sc->flags &= ~XREP_RESET_PERAG_RESV;
-	error = xfs_ag_resv_free(sc->sa.pag);
-	if (error)
-		goto out;
+	xfs_ag_resv_free(sc->sa.pag);
 	error = xfs_ag_resv_init(sc->sa.pag, sc->tp);
 	if (error == -ENOSPC) {
 		xfs_err(sc->mp,
@@ -974,7 +972,6 @@ xrep_reset_perag_resv(
 		error = 0;
 	}
 
-out:
 	return error;
 }
 
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 83f708f62ed9f2..c211ea2b63c4dd 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -213,10 +213,8 @@ xfs_growfs_data_private(
 			struct xfs_perag	*pag;
 
 			pag = xfs_perag_get(mp, id.agno);
-			error = xfs_ag_resv_free(pag);
+			xfs_ag_resv_free(pag);
 			xfs_perag_put(pag);
-			if (error)
-				return error;
 		}
 		/*
 		 * Reserve AG metadata blocks. ENOSPC here does not mean there
@@ -385,14 +383,14 @@ xfs_reserve_blocks(
 	 */
 	if (mp->m_resblks > request) {
 		lcounter = mp->m_resblks_avail - request;
-		if (lcounter  > 0) {		/* release unused blocks */
+		if (lcounter > 0) {		/* release unused blocks */
 			fdblks_delta = lcounter;
 			mp->m_resblks_avail -= lcounter;
 		}
 		mp->m_resblks = request;
 		if (fdblks_delta) {
 			spin_unlock(&mp->m_sb_lock);
-			error = xfs_mod_fdblocks(mp, fdblks_delta, 0);
+			xfs_add_fdblocks(mp, fdblks_delta);
 			spin_lock(&mp->m_sb_lock);
 		}
 
@@ -428,9 +426,9 @@ xfs_reserve_blocks(
 		 */
 		fdblks_delta = min(free, delta);
 		spin_unlock(&mp->m_sb_lock);
-		error = xfs_mod_fdblocks(mp, -fdblks_delta, 0);
+		error = xfs_dec_fdblocks(mp, fdblks_delta, 0);
 		if (!error)
-			xfs_mod_fdblocks(mp, fdblks_delta, 0);
+			xfs_add_fdblocks(mp, fdblks_delta);
 		spin_lock(&mp->m_sb_lock);
 	}
 out:
@@ -556,24 +554,13 @@ xfs_fs_reserve_ag_blocks(
 /*
  * Free space reserved for per-AG metadata.
  */
-int
+void
 xfs_fs_unreserve_ag_blocks(
 	struct xfs_mount	*mp)
 {
 	xfs_agnumber_t		agno;
 	struct xfs_perag	*pag;
-	int			error = 0;
-	int			err2;
 
-	for_each_perag(mp, agno, pag) {
-		err2 = xfs_ag_resv_free(pag);
-		if (err2 && !error)
-			error = err2;
-	}
-
-	if (error)
-		xfs_warn(mp,
-	"Error %d freeing per-AG metadata reserve pool.", error);
-
-	return error;
+	for_each_perag(mp, agno, pag)
+		xfs_ag_resv_free(pag);
 }
diff --git a/fs/xfs/xfs_fsops.h b/fs/xfs/xfs_fsops.h
index 44457b0a059376..3e2f73bcf8314b 100644
--- a/fs/xfs/xfs_fsops.h
+++ b/fs/xfs/xfs_fsops.h
@@ -12,6 +12,6 @@ int xfs_reserve_blocks(struct xfs_mount *mp, uint64_t request);
 int xfs_fs_goingdown(struct xfs_mount *mp, uint32_t inflags);
 
 int xfs_fs_reserve_ag_blocks(struct xfs_mount *mp);
-int xfs_fs_unreserve_ag_blocks(struct xfs_mount *mp);
+void xfs_fs_unreserve_ag_blocks(struct xfs_mount *mp);
 
 #endif	/* __XFS_FSOPS_H__ */
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 7328034d42ed8d..575a3b98cdb514 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1129,16 +1129,44 @@ xfs_fs_writable(
 	return true;
 }
 
-/* Adjust m_fdblocks or m_frextents. */
+void
+xfs_add_freecounter(
+	struct xfs_mount	*mp,
+	struct percpu_counter	*counter,
+	uint64_t		delta)
+{
+	bool			has_resv_pool = (counter == &mp->m_fdblocks);
+	uint64_t		res_used;
+
+	/*
+	 * If the reserve pool is depleted, put blocks back into it first.
+	 * Most of the time the pool is full.
+	 */
+	if (!has_resv_pool || mp->m_resblks == mp->m_resblks_avail) {
+		percpu_counter_add(counter, delta);
+		return;
+	}
+
+	spin_lock(&mp->m_sb_lock);
+	res_used = mp->m_resblks - mp->m_resblks_avail;
+	if (res_used > delta) {
+		mp->m_resblks_avail += delta;
+	} else {
+		delta -= res_used;
+		mp->m_resblks_avail = mp->m_resblks;
+		percpu_counter_add(counter, delta);
+	}
+	spin_unlock(&mp->m_sb_lock);
+}
+
 int
-xfs_mod_freecounter(
+xfs_dec_freecounter(
 	struct xfs_mount	*mp,
 	struct percpu_counter	*counter,
-	int64_t			delta,
+	uint64_t		delta,
 	bool			rsvd)
 {
 	int64_t			lcounter;
-	long long		res_used;
 	uint64_t		set_aside = 0;
 	s32			batch;
 	bool			has_resv_pool;
@@ -1148,31 +1176,6 @@ xfs_mod_freecounter(
 	if (rsvd)
 		ASSERT(has_resv_pool);
 
-	if (delta > 0) {
-		/*
-		 * If the reserve pool is depleted, put blocks back into it
-		 * first. Most of the time the pool is full.
-		 */
-		if (likely(!has_resv_pool ||
-			   mp->m_resblks == mp->m_resblks_avail)) {
-			percpu_counter_add(counter, delta);
-			return 0;
-		}
-
-		spin_lock(&mp->m_sb_lock);
-		res_used = (long long)(mp->m_resblks - mp->m_resblks_avail);
-
-		if (res_used > delta) {
-			mp->m_resblks_avail += delta;
-		} else {
-			delta -= res_used;
-			mp->m_resblks_avail = mp->m_resblks;
-			percpu_counter_add(counter, delta);
-		}
-		spin_unlock(&mp->m_sb_lock);
-		return 0;
-	}
-
 	/*
 	 * Taking blocks away, need to be more accurate the closer we
 	 * are to zero.
@@ -1200,7 +1203,7 @@ xfs_mod_freecounter(
 	 */
 	if (has_resv_pool)
 		set_aside = xfs_fdblocks_unavailable(mp);
-	percpu_counter_add_batch(counter, delta, batch);
+	percpu_counter_add_batch(counter, -((int64_t)delta), batch);
 	if (__percpu_counter_compare(counter, set_aside,
 				     XFS_FDBLOCKS_BATCH) >= 0) {
 		/* we had space! */
@@ -1212,11 +1215,11 @@ xfs_mod_freecounter(
 	 * that took us to ENOSPC.
 	 */
 	spin_lock(&mp->m_sb_lock);
-	percpu_counter_add(counter, -delta);
+	percpu_counter_add(counter, delta);
 	if (!has_resv_pool || !rsvd)
 		goto fdblocks_enospc;
 
-	lcounter = (long long)mp->m_resblks_avail + delta;
+	lcounter = (long long)mp->m_resblks_avail - delta;
 	if (lcounter >= 0) {
 		mp->m_resblks_avail = lcounter;
 		spin_unlock(&mp->m_sb_lock);
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index e880aa48de68bb..d941437a0c7369 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -534,19 +534,30 @@ xfs_fdblocks_unavailable(
 	return mp->m_alloc_set_aside + atomic64_read(&mp->m_allocbt_blks);
 }
 
-int xfs_mod_freecounter(struct xfs_mount *mp, struct percpu_counter *counter,
-		int64_t delta, bool rsvd);
+int xfs_dec_freecounter(struct xfs_mount *mp, struct percpu_counter *counter,
+		uint64_t delta, bool rsvd);
+void xfs_add_freecounter(struct xfs_mount *mp, struct percpu_counter *counter,
+		uint64_t delta);
 
-static inline int
-xfs_mod_fdblocks(struct xfs_mount *mp, int64_t delta, bool reserved)
+static inline int xfs_dec_fdblocks(struct xfs_mount *mp, uint64_t delta,
+		bool reserved)
 {
-	return xfs_mod_freecounter(mp, &mp->m_fdblocks, delta, reserved);
+	return xfs_dec_freecounter(mp, &mp->m_fdblocks, delta, reserved);
 }
 
-static inline int
-xfs_mod_frextents(struct xfs_mount *mp, int64_t delta)
+static inline void xfs_add_fdblocks(struct xfs_mount *mp, uint64_t delta)
 {
-	return xfs_mod_freecounter(mp, &mp->m_frextents, delta, false);
+	xfs_add_freecounter(mp, &mp->m_fdblocks, delta);
+}
+
+static inline int xfs_dec_frextents(struct xfs_mount *mp, uint64_t delta)
+{
+	return xfs_dec_freecounter(mp, &mp->m_frextents, delta, false);
+}
+
+static inline void xfs_add_frextents(struct xfs_mount *mp, uint64_t delta)
+{
+	xfs_add_freecounter(mp, &mp->m_frextents, delta);
 }
 
 extern int	xfs_readsb(xfs_mount_t *, int);
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 6828c48b15e9bd..0afcb005a28fc1 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1874,11 +1874,7 @@ xfs_remount_ro(
 	xfs_inodegc_stop(mp);
 
 	/* Free the per-AG metadata reservation pool. */
-	error = xfs_fs_unreserve_ag_blocks(mp);
-	if (error) {
-		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
-		return error;
-	}
+	xfs_fs_unreserve_ag_blocks(mp);
 
 	/*
 	 * Before we sync the metadata, we need to free up the reserve block
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index aea97fc074f8de..1c5753f91c388e 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -3062,7 +3062,6 @@ DEFINE_AG_RESV_EVENT(xfs_ag_resv_free_extent);
 DEFINE_AG_RESV_EVENT(xfs_ag_resv_critical);
 DEFINE_AG_RESV_EVENT(xfs_ag_resv_needed);
 
-DEFINE_AG_ERROR_EVENT(xfs_ag_resv_free_error);
 DEFINE_AG_ERROR_EVENT(xfs_ag_resv_init_error);
 
 /* refcount tracepoint classes */
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 924b460229e951..b18d478e2c9e85 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -163,7 +163,7 @@ xfs_trans_reserve(
 	 * fail if the count would go below zero.
 	 */
 	if (blocks > 0) {
-		error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
+		error = xfs_dec_fdblocks(mp, blocks, rsvd);
 		if (error != 0)
 			return -ENOSPC;
 		tp->t_blk_res += blocks;
@@ -210,7 +210,7 @@ xfs_trans_reserve(
 	 * fail if the count would go below zero.
 	 */
 	if (rtextents > 0) {
-		error = xfs_mod_frextents(mp, -((int64_t)rtextents));
+		error = xfs_dec_frextents(mp, rtextents);
 		if (error) {
 			error = -ENOSPC;
 			goto undo_log;
@@ -234,7 +234,7 @@ xfs_trans_reserve(
 
 undo_blocks:
 	if (blocks > 0) {
-		xfs_mod_fdblocks(mp, (int64_t)blocks, rsvd);
+		xfs_add_fdblocks(mp, blocks);
 		tp->t_blk_res = 0;
 	}
 	return error;
@@ -593,12 +593,10 @@ xfs_trans_unreserve_and_mod_sb(
 	struct xfs_trans	*tp)
 {
 	struct xfs_mount	*mp = tp->t_mountp;
-	bool			rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
 	int64_t			blkdelta = tp->t_blk_res;
 	int64_t			rtxdelta = tp->t_rtx_res;
 	int64_t			idelta = 0;
 	int64_t			ifreedelta = 0;
-	int			error;
 
 	/*
 	 * Calculate the deltas.
@@ -631,10 +629,8 @@ xfs_trans_unreserve_and_mod_sb(
 	}
 
 	/* apply the per-cpu counters */
-	if (blkdelta) {
-		error = xfs_mod_fdblocks(mp, blkdelta, rsvd);
-		ASSERT(!error);
-	}
+	if (blkdelta)
+		xfs_add_fdblocks(mp, blkdelta);
 
 	if (idelta)
 		percpu_counter_add_batch(&mp->m_icount, idelta,
@@ -643,10 +639,8 @@ xfs_trans_unreserve_and_mod_sb(
 	if (ifreedelta)
 		percpu_counter_add(&mp->m_ifree, ifreedelta);
 
-	if (rtxdelta) {
-		error = xfs_mod_frextents(mp, rtxdelta);
-		ASSERT(!error);
-	}
+	if (rtxdelta)
+		xfs_add_frextents(mp, rtxdelta);
 
 	if (!(tp->t_flags & XFS_TRANS_SB_DIRTY))
 		return;
@@ -682,7 +676,6 @@ xfs_trans_unreserve_and_mod_sb(
 	 */
 	ASSERT(mp->m_sb.sb_imax_pct >= 0);
 	ASSERT(mp->m_sb.sb_rextslog >= 0);
-	return;
 }
 
 /* Add the given log item to the transaction's list of log items. */
@@ -1301,9 +1294,9 @@ xfs_trans_reserve_more_inode(
 		return 0;
 
 	/* Quota failed, give back the new reservation. */
-	xfs_mod_fdblocks(mp, dblocks, tp->t_flags & XFS_TRANS_RESERVE);
+	xfs_add_fdblocks(mp, dblocks);
 	tp->t_blk_res -= dblocks;
-	xfs_mod_frextents(mp, rtx);
+	xfs_add_frextents(mp, rtx);
 	tp->t_rtx_res -= rtx;
 	return error;
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 05/11] xfs: reinstate RT support in xfs_bmapi_reserve_delalloc
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2024-03-25  2:24 ` [PATCH 04/11] xfs: split xfs_mod_freecounter Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 06/11] xfs: cleanup fdblock/frextent accounting in xfs_bmap_del_extent_delay Christoph Hellwig
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

Allocate data blocks for RT inodes using xfs_dec_frextents.  While at
it, optimize the data device case by doing only a single
xfs_dec_fdblocks call for the extent itself and the indirect blocks.
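
A worked example of the new accounting (invented numbers: alen = 8 data
blocks, indlen = 2 indirect blocks):

	/* non-RT inode: one combined call instead of two */
	error = xfs_dec_fdblocks(mp, 8 + 2, false);

	/* RT inode: data blocks come from the rtextent pool ... */
	error = xfs_dec_frextents(mp, xfs_rtb_to_rtx(mp, 8));
	/* ... and only the 2 indirect blocks come from fdblocks */
	error = xfs_dec_fdblocks(mp, 2, false);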

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_bmap.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 240507cbe4db3e..1114e057e55783 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4067,6 +4067,7 @@ xfs_bmapi_reserve_delalloc(
 	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, whichfork);
 	xfs_extlen_t		alen;
 	xfs_extlen_t		indlen;
+	uint64_t		fdblocks;
 	int			error;
 	xfs_fileoff_t		aoff = off;
 
@@ -4109,14 +4110,18 @@ xfs_bmapi_reserve_delalloc(
 	indlen = (xfs_extlen_t)xfs_bmap_worst_indlen(ip, alen);
 	ASSERT(indlen > 0);
 
-	error = xfs_dec_fdblocks(mp, alen, false);
-	if (error)
-		goto out_unreserve_quota;
+	fdblocks = indlen;
+	if (XFS_IS_REALTIME_INODE(ip)) {
+		error = xfs_dec_frextents(mp, xfs_rtb_to_rtx(mp, alen));
+		if (error)
+			goto out_unreserve_quota;
+	} else {
+		fdblocks += alen;
+	}
 
-	error = xfs_dec_fdblocks(mp, indlen, false);
+	error = xfs_dec_fdblocks(mp, fdblocks, false);
 	if (error)
-		goto out_unreserve_blocks;
-
+		goto out_unreserve_frextents;
 
 	ip->i_delayed_blks += alen;
 	xfs_mod_delalloc(ip->i_mount, alen + indlen);
@@ -4140,8 +4145,9 @@ xfs_bmapi_reserve_delalloc(
 
 	return 0;
 
-out_unreserve_blocks:
-	xfs_add_fdblocks(mp, alen);
+out_unreserve_frextents:
+	if (XFS_IS_REALTIME_INODE(ip))
+		xfs_add_frextents(mp, xfs_rtb_to_rtx(mp, alen));
 out_unreserve_quota:
 	if (XFS_IS_QUOTA_ON(mp))
 		xfs_quota_unreserve_blkres(ip, alen);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 06/11] xfs: cleanup fdblock/frextent accounting in xfs_bmap_del_extent_delay
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2024-03-25  2:24 ` [PATCH 05/11] xfs: reinstate RT support in xfs_bmapi_reserve_delalloc Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 07/11] xfs: support RT inodes in xfs_mod_delalloc Christoph Hellwig
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

The code to account fdblocks and frextents in xfs_bmap_del_extent_delay
is a bit weird in that it accounts frextents before the iext tree
manipulations and fdblocks after them.  Given that the iext tree
manipulations currently cannot fail, that's not really a problem, but
it is still odd.  Move the frextent manipulation to the end, and use an
fdblocks variable to account for the unconditionally freed indirect
blocks and the data blocks that are only freed for !RT.  This prepares
for following updates in the area and already makes the code more
readable.

Also remove the !isrt assert given that this code clearly handles
rt extents correctly, and we'll soon reinstate delalloc support for
RT inodes.
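
As a worked example of the reordered accounting (invented numbers: an
RT inode with a one-block rtextent, del->br_blockcount = 8, da_old = 5,
da_new = 2):

	fdblocks = da_old - da_new;	/* 3 freed indirect blocks */
	/* the 8 data blocks go back to the rtextent pool ... */
	xfs_add_frextents(mp, 8);
	/* ... and only the indirect blocks go back to fdblocks */
	xfs_add_fdblocks(mp, fdblocks);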

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_bmap.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 1114e057e55783..94b4aad1989bec 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4917,6 +4917,7 @@ xfs_bmap_del_extent_delay(
 	xfs_fileoff_t		del_endoff, got_endoff;
 	xfs_filblks_t		got_indlen, new_indlen, stolen;
 	uint32_t		state = xfs_bmap_fork_to_state(whichfork);
+	uint64_t		fdblocks;
 	int			error = 0;
 	bool			isrt;
 
@@ -4932,15 +4933,11 @@ xfs_bmap_del_extent_delay(
 	ASSERT(got->br_startoff <= del->br_startoff);
 	ASSERT(got_endoff >= del_endoff);
 
-	if (isrt)
-		xfs_add_frextents(mp, xfs_rtb_to_rtx(mp, del->br_blockcount));
-
 	/*
 	 * Update the inode delalloc counter now and wait to update the
 	 * sb counters as we might have to borrow some blocks for the
 	 * indirect block accounting.
 	 */
-	ASSERT(!isrt);
 	error = xfs_quota_unreserve_blkres(ip, del->br_blockcount);
 	if (error)
 		return error;
@@ -5017,12 +5014,15 @@ xfs_bmap_del_extent_delay(
 
 	ASSERT(da_old >= da_new);
 	da_diff = da_old - da_new;
-	if (!isrt)
-		da_diff += del->br_blockcount;
-	if (da_diff) {
-		xfs_add_fdblocks(mp, da_diff);
-		xfs_mod_delalloc(mp, -da_diff);
-	}
+	fdblocks = da_diff;
+
+	if (isrt)
+		xfs_add_frextents(mp, xfs_rtb_to_rtx(mp, del->br_blockcount));
+	else
+		fdblocks += del->br_blockcount;
+
+	xfs_add_fdblocks(mp, fdblocks);
+	xfs_mod_delalloc(mp, -(int64_t)fdblocks);
 	return error;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 07/11] xfs: support RT inodes in xfs_mod_delalloc
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2024-03-25  2:24 ` [PATCH 06/11] xfs: cleanup fdblock/frextent accounting in xfs_bmap_del_extent_delay Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25 23:44   ` Darrick J. Wong
  2024-03-25  2:24 ` [PATCH 08/11] xfs: look at m_frextents in xfs_iomap_prealloc_size for RT allocations Christoph Hellwig
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

To prepare for re-enabling delalloc on RT devices, pass the data blocks
(which use the RT device when the inode sits on it) and the indirect
blocks (which don't) separately to xfs_mod_delalloc, and add a new
percpu counter to also track the RT delalloc blocks.
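
The new calling convention keeps the two deltas apart (sketch drawn
from the call sites changed below):

	/* new delalloc reservation: alen data blocks, indlen indirect */
	xfs_mod_delalloc(ip, alen, indlen);

	/* delalloc converted to a real allocation: data blocks only */
	xfs_mod_delalloc(ap->ip, -(int64_t)ap->length, 0);

For RT inodes the data delta is converted to rtextents and applied to
the new m_delalloc_rtextents counter, while the indirect delta always
hits m_delalloc_blks.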

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_bmap.c         | 12 ++++++------
 fs/xfs/scrub/fscounters.c        |  6 +++++-
 fs/xfs/scrub/fscounters.h        |  1 +
 fs/xfs/scrub/fscounters_repair.c |  3 ++-
 fs/xfs/xfs_mount.c               | 18 +++++++++++++++---
 fs/xfs/xfs_mount.h               |  9 ++++++++-
 fs/xfs/xfs_super.c               | 11 ++++++++++-
 7 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 94b4aad1989bec..cc250c33890bac 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1975,7 +1975,7 @@ xfs_bmap_add_extent_delay_real(
 	}
 
 	if (da_new != da_old)
-		xfs_mod_delalloc(mp, (int64_t)da_new - da_old);
+		xfs_mod_delalloc(bma->ip, 0, (int64_t)da_new - da_old);
 
 	if (bma->cur) {
 		da_new += bma->cur->bc_bmap.allocated;
@@ -2694,7 +2694,7 @@ xfs_bmap_add_extent_hole_delay(
 		/*
 		 * Nothing to do for disk quota accounting here.
 		 */
-		xfs_mod_delalloc(ip->i_mount, (int64_t)newlen - oldlen);
+		xfs_mod_delalloc(ip, 0, (int64_t)newlen - oldlen);
 	}
 }
 
@@ -3371,7 +3371,7 @@ xfs_bmap_alloc_account(
 		 * yet.
 		 */
 		if (ap->wasdel) {
-			xfs_mod_delalloc(ap->ip->i_mount, -(int64_t)ap->length);
+			xfs_mod_delalloc(ap->ip, -(int64_t)ap->length, 0);
 			return;
 		}
 
@@ -3395,7 +3395,7 @@ xfs_bmap_alloc_account(
 	xfs_trans_log_inode(ap->tp, ap->ip, XFS_ILOG_CORE);
 	if (ap->wasdel) {
 		ap->ip->i_delayed_blks -= ap->length;
-		xfs_mod_delalloc(ap->ip->i_mount, -(int64_t)ap->length);
+		xfs_mod_delalloc(ap->ip, -(int64_t)ap->length, 0);
 		fld = isrt ? XFS_TRANS_DQ_DELRTBCOUNT : XFS_TRANS_DQ_DELBCOUNT;
 	} else {
 		fld = isrt ? XFS_TRANS_DQ_RTBCOUNT : XFS_TRANS_DQ_BCOUNT;
@@ -4124,7 +4124,7 @@ xfs_bmapi_reserve_delalloc(
 		goto out_unreserve_frextents;
 
 	ip->i_delayed_blks += alen;
-	xfs_mod_delalloc(ip->i_mount, alen + indlen);
+	xfs_mod_delalloc(ip, alen, indlen);
 
 	got->br_startoff = aoff;
 	got->br_startblock = nullstartblock(indlen);
@@ -5022,7 +5022,7 @@ xfs_bmap_del_extent_delay(
 		fdblocks += del->br_blockcount;
 
 	xfs_add_fdblocks(mp, fdblocks);
-	xfs_mod_delalloc(mp, -(int64_t)fdblocks);
+	xfs_mod_delalloc(ip, -(int64_t)del->br_blockcount, -da_diff);
 	return error;
 }
 
diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c
index 6f465373aa2027..424fb9770f1920 100644
--- a/fs/xfs/scrub/fscounters.c
+++ b/fs/xfs/scrub/fscounters.c
@@ -412,6 +412,7 @@ xchk_fscount_count_frextents(
 	int			error;
 
 	fsc->frextents = 0;
+	fsc->frextents_delayed = 0;
 	if (!xfs_has_realtime(mp))
 		return 0;
 
@@ -423,6 +424,8 @@ xchk_fscount_count_frextents(
 		goto out_unlock;
 	}
 
+	fsc->frextents_delayed = percpu_counter_sum(&mp->m_delalloc_rtextents);
+
 out_unlock:
 	xfs_iunlock(sc->mp->m_rbmip, XFS_ILOCK_SHARED | XFS_ILOCK_RTBITMAP);
 	return error;
@@ -434,6 +437,7 @@ xchk_fscount_count_frextents(
 	struct xchk_fscounters	*fsc)
 {
 	fsc->frextents = 0;
+	fsc->frextents_delayed = 0;
 	return 0;
 }
 #endif /* CONFIG_XFS_RT */
@@ -593,7 +597,7 @@ xchk_fscounters(
 	}
 
 	if (!xchk_fscount_within_range(sc, frextents, &mp->m_frextents,
-			fsc->frextents)) {
+			fsc->frextents - fsc->frextents_delayed)) {
 		if (fsc->frozen)
 			xchk_set_corrupt(sc);
 		else
diff --git a/fs/xfs/scrub/fscounters.h b/fs/xfs/scrub/fscounters.h
index 461a13d25f4b38..bcf56e1c36f91c 100644
--- a/fs/xfs/scrub/fscounters.h
+++ b/fs/xfs/scrub/fscounters.h
@@ -12,6 +12,7 @@ struct xchk_fscounters {
 	uint64_t		ifree;
 	uint64_t		fdblocks;
 	uint64_t		frextents;
+	uint64_t		frextents_delayed;
 	unsigned long long	icount_min;
 	unsigned long long	icount_max;
 	bool			frozen;
diff --git a/fs/xfs/scrub/fscounters_repair.c b/fs/xfs/scrub/fscounters_repair.c
index 94cdb852bee462..210ebbcf3e1520 100644
--- a/fs/xfs/scrub/fscounters_repair.c
+++ b/fs/xfs/scrub/fscounters_repair.c
@@ -65,7 +65,8 @@ xrep_fscounters(
 	percpu_counter_set(&mp->m_icount, fsc->icount);
 	percpu_counter_set(&mp->m_ifree, fsc->ifree);
 	percpu_counter_set(&mp->m_fdblocks, fsc->fdblocks);
-	percpu_counter_set(&mp->m_frextents, fsc->frextents);
+	percpu_counter_set(&mp->m_frextents,
+		fsc->frextents - fsc->frextents_delayed);
 	mp->m_sb.sb_frextents = fsc->frextents;
 
 	return 0;
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 575a3b98cdb514..7430a3c7765be8 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -34,6 +34,7 @@
 #include "xfs_health.h"
 #include "xfs_trace.h"
 #include "xfs_ag.h"
+#include "xfs_rtbitmap.h"
 #include "scrub/stats.h"
 
 static DEFINE_MUTEX(xfs_uuid_table_mutex);
@@ -1400,9 +1401,20 @@ xfs_clear_incompat_log_features(
 #define XFS_DELALLOC_BATCH	(4096)
 void
 xfs_mod_delalloc(
-	struct xfs_mount	*mp,
-	int64_t			delta)
+	struct xfs_inode	*ip,
+	int64_t			data_delta,
+	int64_t			ind_delta)
 {
-	percpu_counter_add_batch(&mp->m_delalloc_blks, delta,
+	struct xfs_mount	*mp = ip->i_mount;
+
+	if (XFS_IS_REALTIME_INODE(ip)) {
+		percpu_counter_add_batch(&mp->m_delalloc_rtextents,
+				xfs_rtb_to_rtx(mp, data_delta),
+				XFS_DELALLOC_BATCH);
+		if (!ind_delta)
+			return;
+		data_delta = 0;
+	}
+	percpu_counter_add_batch(&mp->m_delalloc_blks, data_delta + ind_delta,
 			XFS_DELALLOC_BATCH);
 }
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index d941437a0c7369..0e8d7779c0a561 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -195,6 +195,12 @@ typedef struct xfs_mount {
 	 * extents or anything related to the rt device.
 	 */
 	struct percpu_counter	m_delalloc_blks;
+
+	/*
+	 * RT version of the above.
+	 */
+	struct percpu_counter	m_delalloc_rtextents;
+
 	/*
 	 * Global count of allocation btree blocks in use across all AGs. Only
 	 * used when perag reservation is enabled. Helps prevent block
@@ -577,6 +583,7 @@ struct xfs_error_cfg * xfs_error_get_cfg(struct xfs_mount *mp,
 void xfs_force_summary_recalc(struct xfs_mount *mp);
 int xfs_add_incompat_log_feature(struct xfs_mount *mp, uint32_t feature);
 bool xfs_clear_incompat_log_features(struct xfs_mount *mp);
-void xfs_mod_delalloc(struct xfs_mount *mp, int64_t delta);
+void xfs_mod_delalloc(struct xfs_inode *ip, int64_t data_delta,
+		int64_t ind_delta);
 
 #endif	/* __XFS_MOUNT_H__ */
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 0afcb005a28fc1..71732457583370 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1052,12 +1052,18 @@ xfs_init_percpu_counters(
 	if (error)
 		goto free_fdblocks;
 
-	error = percpu_counter_init(&mp->m_frextents, 0, GFP_KERNEL);
+	error = percpu_counter_init(&mp->m_delalloc_rtextents, 0, GFP_KERNEL);
 	if (error)
 		goto free_delalloc;
 
+	error = percpu_counter_init(&mp->m_frextents, 0, GFP_KERNEL);
+	if (error)
+		goto free_delalloc_rt;
+
 	return 0;
 
+free_delalloc_rt:
+	percpu_counter_destroy(&mp->m_delalloc_rtextents);
 free_delalloc:
 	percpu_counter_destroy(&mp->m_delalloc_blks);
 free_fdblocks:
@@ -1086,6 +1092,9 @@ xfs_destroy_percpu_counters(
 	percpu_counter_destroy(&mp->m_icount);
 	percpu_counter_destroy(&mp->m_ifree);
 	percpu_counter_destroy(&mp->m_fdblocks);
+	ASSERT(xfs_is_shutdown(mp) ||
+	       percpu_counter_sum(&mp->m_delalloc_rtextents) == 0);
+	percpu_counter_destroy(&mp->m_delalloc_rtextents);
 	ASSERT(xfs_is_shutdown(mp) ||
 	       percpu_counter_sum(&mp->m_delalloc_blks) == 0);
 	percpu_counter_destroy(&mp->m_delalloc_blks);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 08/11] xfs: look at m_frextents in xfs_iomap_prealloc_size for RT allocations
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2024-03-25  2:24 ` [PATCH 07/11] xfs: support RT inodes in xfs_mod_delalloc Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 09/11] xfs: rework splitting of indirect block reservations Christoph Hellwig
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

Add a check for files on the RT subvolume and use m_frextents instead
of m_fdblocks to adjust the preallocation size.
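
The throttling math itself is unchanged, only the counter differs for
RT inodes.  Roughly (illustrative numbers):

	/*
	 * freesp < 5% of the pool => shift = 2, < 4% => 3, < 3% => 4,
	 * < 2% => 5, < 1% => 6.  The caller scales the preallocation
	 * with alloc_blocks >>= shift, so e.g. a 64MB preallocation
	 * shrinks to 16MB below 5% free and to 1MB below 1% free.
	 */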

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_iomap.c | 42 ++++++++++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 4087af7f3c9f3f..e0c205bcf03404 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -28,6 +28,7 @@
 #include "xfs_dquot.h"
 #include "xfs_reflink.h"
 #include "xfs_health.h"
+#include "xfs_rtbitmap.h"
 
 #define XFS_ALLOC_ALIGN(mp, off) \
 	(((off) >> mp->m_allocsize_log) << mp->m_allocsize_log)
@@ -404,6 +405,29 @@ xfs_quota_calc_throttle(
 	}
 }
 
+static int64_t
+xfs_iomap_freesp(
+	struct percpu_counter	*counter,
+	uint64_t		low_space[XFS_LOWSP_MAX],
+	int			*shift)
+{
+	int64_t			freesp;
+
+	freesp = percpu_counter_read_positive(counter);
+	if (freesp < low_space[XFS_LOWSP_5_PCNT]) {
+		*shift = 2;
+		if (freesp < low_space[XFS_LOWSP_4_PCNT])
+			(*shift)++;
+		if (freesp < low_space[XFS_LOWSP_3_PCNT])
+			(*shift)++;
+		if (freesp < low_space[XFS_LOWSP_2_PCNT])
+			(*shift)++;
+		if (freesp < low_space[XFS_LOWSP_1_PCNT])
+			(*shift)++;
+	}
+	return freesp;
+}
+
 /*
  * If we don't have a user specified preallocation size, dynamically increase
  * the preallocation size as the size of the file grows.  Cap the maximum size
@@ -486,18 +510,12 @@ xfs_iomap_prealloc_size(
 	alloc_blocks = XFS_FILEOFF_MIN(roundup_pow_of_two(XFS_MAX_BMBT_EXTLEN),
 				       alloc_blocks);
 
-	freesp = percpu_counter_read_positive(&mp->m_fdblocks);
-	if (freesp < mp->m_low_space[XFS_LOWSP_5_PCNT]) {
-		shift = 2;
-		if (freesp < mp->m_low_space[XFS_LOWSP_4_PCNT])
-			shift++;
-		if (freesp < mp->m_low_space[XFS_LOWSP_3_PCNT])
-			shift++;
-		if (freesp < mp->m_low_space[XFS_LOWSP_2_PCNT])
-			shift++;
-		if (freesp < mp->m_low_space[XFS_LOWSP_1_PCNT])
-			shift++;
-	}
+	if (unlikely(XFS_IS_REALTIME_INODE(ip)))
+		freesp = xfs_rtx_to_rtb(mp, xfs_iomap_freesp(&mp->m_frextents,
+				mp->m_low_rtexts, &shift));
+	else
+		freesp = xfs_iomap_freesp(&mp->m_fdblocks, mp->m_low_space,
+				&shift);
 
 	/*
 	 * Check each quota to cap the prealloc size, provide a shift value to
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 09/11] xfs: rework splitting of indirect block reservations
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2024-03-25  2:24 ` [PATCH 08/11] xfs: look at m_frextents in xfs_iomap_prealloc_size for RT allocations Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25 23:55   ` Darrick J. Wong
  2024-03-25  2:24 ` [PATCH 10/11] xfs: stop the steal (of data blocks for RT indirect blocks) Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 11/11] xfs: reinstate delalloc for RT inodes (if sb_rextsize == 1) Christoph Hellwig
  10 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

Move the check whether we have enough indirect blocks and the stealing
of the deleted extent blocks out of xfs_bmap_split_indlen and into the
caller to prepare for handling delayed allocation of RT extents, where
blocks can't easily be stolen.
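
A worked example of the flow after the move (invented numbers, assuming
the unmapped range is at least two blocks): da_old = 4 and the two new
halves want got_indlen = 3 and new_indlen = 3, so da_new = 6:

	/* steal min(6 - 4, new.br_blockcount) = 2 freed data blocks */
	stolen = XFS_FILBLKS_MIN(da_new - da_old, new.br_blockcount);
	da_old += stolen;	/* now 6, no need to scale down */

Only if stealing still leaves a shortfall does xfs_bmap_split_indlen
get called to proportionally shrink both reservations.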

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_bmap.c | 38 ++++++++++++++++----------------------
 1 file changed, 16 insertions(+), 22 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index cc250c33890bac..dda25a21100836 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4829,31 +4829,17 @@ xfs_bmapi_remap(
  * ores == 1). The number of stolen blocks is returned. The availability and
  * subsequent accounting of stolen blocks is the responsibility of the caller.
  */
-static xfs_filblks_t
+static void
 xfs_bmap_split_indlen(
 	xfs_filblks_t			ores,		/* original res. */
 	xfs_filblks_t			*indlen1,	/* ext1 worst indlen */
-	xfs_filblks_t			*indlen2,	/* ext2 worst indlen */
-	xfs_filblks_t			avail)		/* stealable blocks */
+	xfs_filblks_t			*indlen2)	/* ext2 worst indlen */
 {
 	xfs_filblks_t			len1 = *indlen1;
 	xfs_filblks_t			len2 = *indlen2;
 	xfs_filblks_t			nres = len1 + len2; /* new total res. */
-	xfs_filblks_t			stolen = 0;
 	xfs_filblks_t			resfactor;
 
-	/*
-	 * Steal as many blocks as we can to try and satisfy the worst case
-	 * indlen for both new extents.
-	 */
-	if (ores < nres && avail)
-		stolen = XFS_FILBLKS_MIN(nres - ores, avail);
-	ores += stolen;
-
-	 /* nothing else to do if we've satisfied the new reservation */
-	if (ores >= nres)
-		return stolen;
-
 	/*
 	 * We can't meet the total required reservation for the two extents.
 	 * Calculate the percent of the overall shortage between both extents
@@ -4898,8 +4884,6 @@ xfs_bmap_split_indlen(
 
 	*indlen1 = len1;
 	*indlen2 = len2;
-
-	return stolen;
 }
 
 int
@@ -4915,7 +4899,7 @@ xfs_bmap_del_extent_delay(
 	struct xfs_bmbt_irec	new;
 	int64_t			da_old, da_new, da_diff = 0;
 	xfs_fileoff_t		del_endoff, got_endoff;
-	xfs_filblks_t		got_indlen, new_indlen, stolen;
+	xfs_filblks_t		got_indlen, new_indlen, stolen = 0;
 	uint32_t		state = xfs_bmap_fork_to_state(whichfork);
 	uint64_t		fdblocks;
 	int			error = 0;
@@ -4994,8 +4978,19 @@ xfs_bmap_del_extent_delay(
 		new_indlen = xfs_bmap_worst_indlen(ip, new.br_blockcount);
 
 		WARN_ON_ONCE(!got_indlen || !new_indlen);
-		stolen = xfs_bmap_split_indlen(da_old, &got_indlen, &new_indlen,
-						       del->br_blockcount);
+		/*
+		 * Steal as many blocks as we can to try and satisfy the worst
+		 * case indlen for both new extents.
+		 */
+		da_new = got_indlen + new_indlen;
+		if (da_new > da_old) {
+			stolen = XFS_FILBLKS_MIN(da_new - da_old,
+						 new.br_blockcount);
+			da_old += stolen;
+		}
+		if (da_new > da_old)
+			xfs_bmap_split_indlen(da_old, &got_indlen, &new_indlen);
+		da_new = got_indlen + new_indlen;
 
 		got->br_startblock = nullstartblock((int)got_indlen);
 
@@ -5007,7 +5002,6 @@ xfs_bmap_del_extent_delay(
 		xfs_iext_next(ifp, icur);
 		xfs_iext_insert(ip, icur, &new, state);
 
-		da_new = got_indlen + new_indlen - stolen;
 		del->br_blockcount -= stolen;
 		break;
 	}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 10/11] xfs: stop the steal (of data blocks for RT indirect blocks)
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2024-03-25  2:24 ` [PATCH 09/11] xfs: rework splitting of indirect block reservations Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  2024-03-25  2:24 ` [PATCH 11/11] xfs: reinstate delalloc for RT inodes (if sb_rextsize == 1) Christoph Hellwig
  10 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

When xfs_bmap_del_extent_delay has to split an indirect block
reservation it tries to steal blocks from the part that gets unmapped
to increase the indirect block reservation that now needs to cover two
extents instead of one.

This works perfectly fine on the data device, where the data and
indirect blocks come from the same pool.  It has no chance of working
when the inode sits on the RT device.  To support re-enabling delalloc
for inodes on the RT device, make this behavior conditional on not
being for rt extents.

Note that split of delalloc extents should only happen on writeback
failure, as for other kinds of hole punching we first write back all
data and thus convert the delalloc reservations covering the hole to
a real allocation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_bmap.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index dda25a21100836..16b0d76efd46ea 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4981,9 +4981,14 @@ xfs_bmap_del_extent_delay(
 		/*
 		 * Steal as many blocks as we can to try and satisfy the worst
 		 * case indlen for both new extents.
+		 *
+		 * However, we can't just steal reservations from the data
+		 * blocks if this is an RT inode as the data and metadata
+		 * blocks come from different pools.  We'll have to live with
+		 * under-filled indirect reservation in this case.
 		 */
 		da_new = got_indlen + new_indlen;
-		if (da_new > da_old) {
+		if (da_new > da_old && !isrt) {
 			stolen = XFS_FILBLKS_MIN(da_new - da_old,
 						 new.br_blockcount);
 			da_old += stolen;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 11/11] xfs: reinstate delalloc for RT inodes (if sb_rextsize == 1)
  2024-03-25  2:24 bring back RT delalloc support v4 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2024-03-25  2:24 ` [PATCH 10/11] xfs: stop the steal (of data blocks for RT indirect blocks) Christoph Hellwig
@ 2024-03-25  2:24 ` Christoph Hellwig
  10 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-25  2:24 UTC (permalink / raw)
  To: Chandan Babu R; +Cc: Darrick J. Wong, Dave Chinner, linux-xfs

Commit aff3a9edb708 ("xfs: Use preallocation for inodes with extsz
hints") disabled delayed allocation for all inodes with extent size
hints due to a data exposure problem.  It turns out we have since fixed
that data exposure problem by always creating unwritten extents for
delalloc conversions (because of other data exposure problems), but the
writeback path doesn't actually support extent size hints when
converting delalloc these days, which probably isn't a problem given
that people using the hints know what they get.

However, due to the way xfs_get_extsz_hint is implemented, it always
claims an extent size hint for RT inodes even if the RT extent size is
a single FSB.  Because of that, the above commit effectively disabled
delalloc support for RT inodes.

Switch xfs_get_extsz_hint to return 0 for this case and work around
that in a few places to reinstate delalloc support for RT inodes on
file systems with an sb_rextsize of 1.
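
The net effect on a file system with sb_rextsize == 1 and no per-inode
extent size hint, summarized (sketch of the touched call sites):

	xfs_get_extsz_hint(ip)		/* now returns 0 instead of 1 */
	xfs_buffered_write_iomap_begin	/* takes the delalloc path again */
	xfs_stat_blksize		/* the "?: 1" keeps reporting 1 FSB */
	xfs_bmap_rtalloc		/* falls back to align = 1 */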

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_inode.c   | 3 ++-
 fs/xfs/xfs_iomap.c   | 2 --
 fs/xfs/xfs_iops.c    | 2 +-
 fs/xfs/xfs_rtalloc.c | 2 ++
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index ea48774f6b76d3..aa62fe2ed76834 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -60,7 +60,8 @@ xfs_get_extsz_hint(
 		return 0;
 	if ((ip->i_diflags & XFS_DIFLAG_EXTSIZE) && ip->i_extsize)
 		return ip->i_extsize;
-	if (XFS_IS_REALTIME_INODE(ip))
+	if (XFS_IS_REALTIME_INODE(ip) &&
+	    ip->i_mount->m_sb.sb_rextsize > 1)
 		return ip->i_mount->m_sb.sb_rextsize;
 	return 0;
 }
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index e0c205bcf03404..f6fd9aed3a7f4b 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1000,8 +1000,6 @@ xfs_buffered_write_iomap_begin(
 		return xfs_direct_write_iomap_begin(inode, offset, count,
 				flags, iomap, srcmap);
 
-	ASSERT(!XFS_IS_REALTIME_INODE(ip));
-
 	error = xfs_qm_dqattach(ip);
 	if (error)
 		return error;
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 66f8c47642e884..62f91392b281dc 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -521,7 +521,7 @@ xfs_stat_blksize(
 	 * always return the realtime extent size.
 	 */
 	if (XFS_IS_REALTIME_INODE(ip))
-		return XFS_FSB_TO_B(mp, xfs_get_extsz_hint(ip));
+		return XFS_FSB_TO_B(mp, xfs_get_extsz_hint(ip) ? : 1);
 
 	/*
 	 * Allow large block sizes to be reported to userspace programs if the
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index e66f9bd5de5cff..7eb30ecf96718c 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1346,6 +1346,8 @@ xfs_bmap_rtalloc(
 	int			error;
 
 	align = xfs_get_extsz_hint(ap->ip);
+	if (!align)
+		align = 1;
 retry:
 	error = xfs_bmap_extsize_align(mp, &ap->got, &ap->prev,
 					align, 1, ap->eof, 0,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 04/11] xfs: split xfs_mod_freecounter
  2024-03-25  2:24 ` [PATCH 04/11] xfs: split xfs_mod_freecounter Christoph Hellwig
@ 2024-03-25 23:41   ` Darrick J. Wong
  2024-03-26  5:51     ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Darrick J. Wong @ 2024-03-25 23:41 UTC (permalink / raw
  To: Christoph Hellwig; +Cc: Chandan Babu R, Dave Chinner, linux-xfs

On Mon, Mar 25, 2024 at 10:24:04AM +0800, Christoph Hellwig wrote:
> xfs_mod_freecounter has two entirely separate code paths for adding to
> or subtracting from the free counters.  Only the subtract case looks at
> the rsvd flag and can return an error.
> 
> Split xfs_mod_freecounter into separate helpers for subtracting from or
> adding to the freecounter, and remove all the impossible-to-reach error
> handling for the addition case.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

I'm guessing that the difference between v3 and v4 is that you can more
easily integrate a freertx reserve pool now?

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/libxfs/xfs_ag.c      |  4 +--
>  fs/xfs/libxfs/xfs_ag_resv.c | 24 ++++---------
>  fs/xfs/libxfs/xfs_ag_resv.h |  2 +-
>  fs/xfs/libxfs/xfs_alloc.c   |  4 +--
>  fs/xfs/libxfs/xfs_bmap.c    | 23 +++++++------
>  fs/xfs/scrub/fscounters.c   |  2 +-
>  fs/xfs/scrub/repair.c       |  5 +--
>  fs/xfs/xfs_fsops.c          | 29 +++++-----------
>  fs/xfs/xfs_fsops.h          |  2 +-
>  fs/xfs/xfs_mount.c          | 67 +++++++++++++++++++------------------
>  fs/xfs/xfs_mount.h          | 27 ++++++++++-----
>  fs/xfs/xfs_super.c          |  6 +---
>  fs/xfs/xfs_trace.h          |  1 -
>  fs/xfs/xfs_trans.c          | 25 +++++---------
>  14 files changed, 97 insertions(+), 124 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
> index dc1873f76bffd5..189e3296fef6de 100644
> --- a/fs/xfs/libxfs/xfs_ag.c
> +++ b/fs/xfs/libxfs/xfs_ag.c
> @@ -963,9 +963,7 @@ xfs_ag_shrink_space(
>  	 * Disable perag reservations so it doesn't cause the allocation request
>  	 * to fail. We'll reestablish reservation before we return.
>  	 */
> -	error = xfs_ag_resv_free(pag);
> -	if (error)
> -		return error;
> +	xfs_ag_resv_free(pag);
>  
>  	/* internal log shouldn't also show up in the free space btrees */
>  	error = xfs_alloc_vextent_exact_bno(&args,
> diff --git a/fs/xfs/libxfs/xfs_ag_resv.c b/fs/xfs/libxfs/xfs_ag_resv.c
> index da1057bd0e6067..216423df939e5c 100644
> --- a/fs/xfs/libxfs/xfs_ag_resv.c
> +++ b/fs/xfs/libxfs/xfs_ag_resv.c
> @@ -126,14 +126,13 @@ xfs_ag_resv_needed(
>  }
>  
>  /* Clean out a reservation */
> -static int
> +static void
>  __xfs_ag_resv_free(
>  	struct xfs_perag		*pag,
>  	enum xfs_ag_resv_type		type)
>  {
>  	struct xfs_ag_resv		*resv;
>  	xfs_extlen_t			oldresv;
> -	int				error;
>  
>  	trace_xfs_ag_resv_free(pag, type, 0);
>  
> @@ -149,30 +148,19 @@ __xfs_ag_resv_free(
>  		oldresv = resv->ar_orig_reserved;
>  	else
>  		oldresv = resv->ar_reserved;
> -	error = xfs_mod_fdblocks(pag->pag_mount, oldresv, true);
> +	xfs_add_fdblocks(pag->pag_mount, oldresv);
>  	resv->ar_reserved = 0;
>  	resv->ar_asked = 0;
>  	resv->ar_orig_reserved = 0;
> -
> -	if (error)
> -		trace_xfs_ag_resv_free_error(pag->pag_mount, pag->pag_agno,
> -				error, _RET_IP_);
> -	return error;
>  }
>  
>  /* Free a per-AG reservation. */
> -int
> +void
>  xfs_ag_resv_free(
>  	struct xfs_perag		*pag)
>  {
> -	int				error;
> -	int				err2;
> -
> -	error = __xfs_ag_resv_free(pag, XFS_AG_RESV_RMAPBT);
> -	err2 = __xfs_ag_resv_free(pag, XFS_AG_RESV_METADATA);
> -	if (err2 && !error)
> -		error = err2;
> -	return error;
> +	__xfs_ag_resv_free(pag, XFS_AG_RESV_RMAPBT);
> +	__xfs_ag_resv_free(pag, XFS_AG_RESV_METADATA);
>  }
>  
>  static int
> @@ -216,7 +204,7 @@ __xfs_ag_resv_init(
>  	if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_AG_RESV_FAIL))
>  		error = -ENOSPC;
>  	else
> -		error = xfs_mod_fdblocks(mp, -(int64_t)hidden_space, true);
> +		error = xfs_dec_fdblocks(mp, hidden_space, true);
>  	if (error) {
>  		trace_xfs_ag_resv_init_error(pag->pag_mount, pag->pag_agno,
>  				error, _RET_IP_);
> diff --git a/fs/xfs/libxfs/xfs_ag_resv.h b/fs/xfs/libxfs/xfs_ag_resv.h
> index b74b210008ea7e..ff20ed93de7724 100644
> --- a/fs/xfs/libxfs/xfs_ag_resv.h
> +++ b/fs/xfs/libxfs/xfs_ag_resv.h
> @@ -6,7 +6,7 @@
>  #ifndef __XFS_AG_RESV_H__
>  #define	__XFS_AG_RESV_H__
>  
> -int xfs_ag_resv_free(struct xfs_perag *pag);
> +void xfs_ag_resv_free(struct xfs_perag *pag);
>  int xfs_ag_resv_init(struct xfs_perag *pag, struct xfs_trans *tp);
>  
>  bool xfs_ag_resv_critical(struct xfs_perag *pag, enum xfs_ag_resv_type type);
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 9da52e92172aba..6cb8b2ddc541b4 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -79,7 +79,7 @@ xfs_prealloc_blocks(
>  }
>  
>  /*
> - * The number of blocks per AG that we withhold from xfs_mod_fdblocks to
> + * The number of blocks per AG that we withhold from xfs_dec_fdblocks to
>   * guarantee that we can refill the AGFL prior to allocating space in a nearly
>   * full AG.  Although the space described by the free space btrees, the
>   * blocks used by the freesp btrees themselves, and the blocks owned by the
> @@ -89,7 +89,7 @@ xfs_prealloc_blocks(
>   * until the fs goes down, we subtract this many AG blocks from the incore
>   * fdblocks to ensure user allocation does not overcommit the space the
>   * filesystem needs for the AGFLs.  The rmap btree uses a per-AG reservation to
> - * withhold space from xfs_mod_fdblocks, so we do not account for that here.
> + * withhold space from xfs_dec_fdblocks, so we do not account for that here.
>   */
>  #define XFS_ALLOCBT_AGFL_RESERVE	4
>  
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 5fb7b38921c9a3..240507cbe4db3e 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -1983,10 +1983,11 @@ xfs_bmap_add_extent_delay_real(
>  	}
>  
>  	/* adjust for changes in reserved delayed indirect blocks */
> -	if (da_new != da_old) {
> -		ASSERT(state == 0 || da_new < da_old);
> -		error = xfs_mod_fdblocks(mp, (int64_t)(da_old - da_new),
> -				false);
> +	if (da_new < da_old) {
> +		xfs_add_fdblocks(mp, da_old - da_new);
> +	} else if (da_new > da_old) {
> +		ASSERT(state == 0);
> +		error = xfs_dec_fdblocks(mp, da_new - da_old, false);
>  	}
>  
>  	xfs_bmap_check_leaf_extents(bma->cur, bma->ip, whichfork);
> @@ -2688,8 +2689,8 @@ xfs_bmap_add_extent_hole_delay(
>  	}
>  	if (oldlen != newlen) {
>  		ASSERT(oldlen > newlen);
> -		xfs_mod_fdblocks(ip->i_mount, (int64_t)(oldlen - newlen),
> -				 false);
> +		xfs_add_fdblocks(ip->i_mount, oldlen - newlen);
> +
>  		/*
>  		 * Nothing to do for disk quota accounting here.
>  		 */
> @@ -4108,11 +4109,11 @@ xfs_bmapi_reserve_delalloc(
>  	indlen = (xfs_extlen_t)xfs_bmap_worst_indlen(ip, alen);
>  	ASSERT(indlen > 0);
>  
> -	error = xfs_mod_fdblocks(mp, -((int64_t)alen), false);
> +	error = xfs_dec_fdblocks(mp, alen, false);
>  	if (error)
>  		goto out_unreserve_quota;
>  
> -	error = xfs_mod_fdblocks(mp, -((int64_t)indlen), false);
> +	error = xfs_dec_fdblocks(mp, indlen, false);
>  	if (error)
>  		goto out_unreserve_blocks;
>  
> @@ -4140,7 +4141,7 @@ xfs_bmapi_reserve_delalloc(
>  	return 0;
>  
>  out_unreserve_blocks:
> -	xfs_mod_fdblocks(mp, alen, false);
> +	xfs_add_fdblocks(mp, alen);
>  out_unreserve_quota:
>  	if (XFS_IS_QUOTA_ON(mp))
>  		xfs_quota_unreserve_blkres(ip, alen);
> @@ -4926,7 +4927,7 @@ xfs_bmap_del_extent_delay(
>  	ASSERT(got_endoff >= del_endoff);
>  
>  	if (isrt)
> -		xfs_mod_frextents(mp, xfs_rtb_to_rtx(mp, del->br_blockcount));
> +		xfs_add_frextents(mp, xfs_rtb_to_rtx(mp, del->br_blockcount));
>  
>  	/*
>  	 * Update the inode delalloc counter now and wait to update the
> @@ -5013,7 +5014,7 @@ xfs_bmap_del_extent_delay(
>  	if (!isrt)
>  		da_diff += del->br_blockcount;
>  	if (da_diff) {
> -		xfs_mod_fdblocks(mp, da_diff, false);
> +		xfs_add_fdblocks(mp, da_diff);
>  		xfs_mod_delalloc(mp, -da_diff);
>  	}
>  	return error;
> diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c
> index d310737c882367..6f465373aa2027 100644
> --- a/fs/xfs/scrub/fscounters.c
> +++ b/fs/xfs/scrub/fscounters.c
> @@ -517,7 +517,7 @@ xchk_fscounters(
>  
>  	/*
>  	 * If the filesystem is not frozen, the counter summation calls above
> -	 * can race with xfs_mod_freecounter, which subtracts a requested space
> +	 * can race with xfs_dec_freecounter, which subtracts a requested space
>  	 * reservation from the counter and undoes the subtraction if that made
>  	 * the counter go negative.  Therefore, it's possible to see negative
>  	 * values here, and we should only flag that as a corruption if we
> diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
> index f43dce771cdd26..6123e6c7ac7d67 100644
> --- a/fs/xfs/scrub/repair.c
> +++ b/fs/xfs/scrub/repair.c
> @@ -963,9 +963,7 @@ xrep_reset_perag_resv(
>  	ASSERT(sc->tp);
>  
>  	sc->flags &= ~XREP_RESET_PERAG_RESV;
> -	error = xfs_ag_resv_free(sc->sa.pag);
> -	if (error)
> -		goto out;
> +	xfs_ag_resv_free(sc->sa.pag);
>  	error = xfs_ag_resv_init(sc->sa.pag, sc->tp);
>  	if (error == -ENOSPC) {
>  		xfs_err(sc->mp,
> @@ -974,7 +972,6 @@ xrep_reset_perag_resv(
>  		error = 0;
>  	}
>  
> -out:
>  	return error;
>  }
>  
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 83f708f62ed9f2..c211ea2b63c4dd 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -213,10 +213,8 @@ xfs_growfs_data_private(
>  			struct xfs_perag	*pag;
>  
>  			pag = xfs_perag_get(mp, id.agno);
> -			error = xfs_ag_resv_free(pag);
> +			xfs_ag_resv_free(pag);
>  			xfs_perag_put(pag);
> -			if (error)
> -				return error;
>  		}
>  		/*
>  		 * Reserve AG metadata blocks. ENOSPC here does not mean there
> @@ -385,14 +383,14 @@ xfs_reserve_blocks(
>  	 */
>  	if (mp->m_resblks > request) {
>  		lcounter = mp->m_resblks_avail - request;
> -		if (lcounter  > 0) {		/* release unused blocks */
> +		if (lcounter > 0) {		/* release unused blocks */
>  			fdblks_delta = lcounter;
>  			mp->m_resblks_avail -= lcounter;
>  		}
>  		mp->m_resblks = request;
>  		if (fdblks_delta) {
>  			spin_unlock(&mp->m_sb_lock);
> -			error = xfs_mod_fdblocks(mp, fdblks_delta, 0);
> +			xfs_add_fdblocks(mp, fdblks_delta);
>  			spin_lock(&mp->m_sb_lock);
>  		}
>  
> @@ -428,9 +426,9 @@ xfs_reserve_blocks(
>  		 */
>  		fdblks_delta = min(free, delta);
>  		spin_unlock(&mp->m_sb_lock);
> -		error = xfs_mod_fdblocks(mp, -fdblks_delta, 0);
> +		error = xfs_dec_fdblocks(mp, fdblks_delta, 0);
>  		if (!error)
> -			xfs_mod_fdblocks(mp, fdblks_delta, 0);
> +			xfs_add_fdblocks(mp, fdblks_delta);
>  		spin_lock(&mp->m_sb_lock);
>  	}
>  out:
> @@ -556,24 +554,13 @@ xfs_fs_reserve_ag_blocks(
>  /*
>   * Free space reserved for per-AG metadata.
>   */
> -int
> +void
>  xfs_fs_unreserve_ag_blocks(
>  	struct xfs_mount	*mp)
>  {
>  	xfs_agnumber_t		agno;
>  	struct xfs_perag	*pag;
> -	int			error = 0;
> -	int			err2;
>  
> -	for_each_perag(mp, agno, pag) {
> -		err2 = xfs_ag_resv_free(pag);
> -		if (err2 && !error)
> -			error = err2;
> -	}
> -
> -	if (error)
> -		xfs_warn(mp,
> -	"Error %d freeing per-AG metadata reserve pool.", error);
> -
> -	return error;
> +	for_each_perag(mp, agno, pag)
> +		xfs_ag_resv_free(pag);
>  }
> diff --git a/fs/xfs/xfs_fsops.h b/fs/xfs/xfs_fsops.h
> index 44457b0a059376..3e2f73bcf8314b 100644
> --- a/fs/xfs/xfs_fsops.h
> +++ b/fs/xfs/xfs_fsops.h
> @@ -12,6 +12,6 @@ int xfs_reserve_blocks(struct xfs_mount *mp, uint64_t request);
>  int xfs_fs_goingdown(struct xfs_mount *mp, uint32_t inflags);
>  
>  int xfs_fs_reserve_ag_blocks(struct xfs_mount *mp);
> -int xfs_fs_unreserve_ag_blocks(struct xfs_mount *mp);
> +void xfs_fs_unreserve_ag_blocks(struct xfs_mount *mp);
>  
>  #endif	/* __XFS_FSOPS_H__ */
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 7328034d42ed8d..575a3b98cdb514 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1129,16 +1129,44 @@ xfs_fs_writable(
>  	return true;
>  }
>  
> -/* Adjust m_fdblocks or m_frextents. */
> +void
> +xfs_add_freecounter(
> +	struct xfs_mount	*mp,
> +	struct percpu_counter	*counter,
> +	uint64_t		delta)
> +{
> +	bool			has_resv_pool = (counter == &mp->m_fdblocks);
> +	uint64_t		res_used;
> +
> +	/*
> +	 * If the reserve pool is depleted, put blocks back into it first.
> +	 * Most of the time the pool is full.
> +	 */
> +	if (!has_resv_pool || mp->m_resblks == mp->m_resblks_avail) {
> +		percpu_counter_add(counter, delta);
> +		return;
> +	}
> +
> +	spin_lock(&mp->m_sb_lock);
> +	res_used = mp->m_resblks - mp->m_resblks_avail;
> +	if (res_used > delta) {
> +		mp->m_resblks_avail += delta;
> +	} else {
> +		delta -= res_used;
> +		mp->m_resblks_avail = mp->m_resblks;
> +		percpu_counter_add(counter, delta);
> +	}
> +	spin_unlock(&mp->m_sb_lock);
> +}
> +
>  int
> -xfs_mod_freecounter(
> +xfs_dec_freecounter(
>  	struct xfs_mount	*mp,
>  	struct percpu_counter	*counter,
> -	int64_t			delta,
> +	uint64_t		delta,
>  	bool			rsvd)
>  {
>  	int64_t			lcounter;
> -	long long		res_used;
>  	uint64_t		set_aside = 0;
>  	s32			batch;
>  	bool			has_resv_pool;
> @@ -1148,31 +1176,6 @@ xfs_mod_freecounter(
>  	if (rsvd)
>  		ASSERT(has_resv_pool);
>  
> -	if (delta > 0) {
> -		/*
> -		 * If the reserve pool is depleted, put blocks back into it
> -		 * first. Most of the time the pool is full.
> -		 */
> -		if (likely(!has_resv_pool ||
> -			   mp->m_resblks == mp->m_resblks_avail)) {
> -			percpu_counter_add(counter, delta);
> -			return 0;
> -		}
> -
> -		spin_lock(&mp->m_sb_lock);
> -		res_used = (long long)(mp->m_resblks - mp->m_resblks_avail);
> -
> -		if (res_used > delta) {
> -			mp->m_resblks_avail += delta;
> -		} else {
> -			delta -= res_used;
> -			mp->m_resblks_avail = mp->m_resblks;
> -			percpu_counter_add(counter, delta);
> -		}
> -		spin_unlock(&mp->m_sb_lock);
> -		return 0;
> -	}
> -
>  	/*
>  	 * Taking blocks away, need to be more accurate the closer we
>  	 * are to zero.
> @@ -1200,7 +1203,7 @@ xfs_mod_freecounter(
>  	 */
>  	if (has_resv_pool)
>  		set_aside = xfs_fdblocks_unavailable(mp);
> -	percpu_counter_add_batch(counter, delta, batch);
> +	percpu_counter_add_batch(counter, -((int64_t)delta), batch);
>  	if (__percpu_counter_compare(counter, set_aside,
>  				     XFS_FDBLOCKS_BATCH) >= 0) {
>  		/* we had space! */
> @@ -1212,11 +1215,11 @@ xfs_mod_freecounter(
>  	 * that took us to ENOSPC.
>  	 */
>  	spin_lock(&mp->m_sb_lock);
> -	percpu_counter_add(counter, -delta);
> +	percpu_counter_add(counter, delta);
>  	if (!has_resv_pool || !rsvd)
>  		goto fdblocks_enospc;
>  
> -	lcounter = (long long)mp->m_resblks_avail + delta;
> +	lcounter = (long long)mp->m_resblks_avail - delta;
>  	if (lcounter >= 0) {
>  		mp->m_resblks_avail = lcounter;
>  		spin_unlock(&mp->m_sb_lock);
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index e880aa48de68bb..d941437a0c7369 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -534,19 +534,30 @@ xfs_fdblocks_unavailable(
>  	return mp->m_alloc_set_aside + atomic64_read(&mp->m_allocbt_blks);
>  }
>  
> -int xfs_mod_freecounter(struct xfs_mount *mp, struct percpu_counter *counter,
> -		int64_t delta, bool rsvd);
> +int xfs_dec_freecounter(struct xfs_mount *mp, struct percpu_counter *counter,
> +		uint64_t delta, bool rsvd);
> +void xfs_add_freecounter(struct xfs_mount *mp, struct percpu_counter *counter,
> +		uint64_t delta);
>  
> -static inline int
> -xfs_mod_fdblocks(struct xfs_mount *mp, int64_t delta, bool reserved)
> +static inline int xfs_dec_fdblocks(struct xfs_mount *mp, uint64_t delta,
> +		bool reserved)
>  {
> -	return xfs_mod_freecounter(mp, &mp->m_fdblocks, delta, reserved);
> +	return xfs_dec_freecounter(mp, &mp->m_fdblocks, delta, reserved);
>  }
>  
> -static inline int
> -xfs_mod_frextents(struct xfs_mount *mp, int64_t delta)
> +static inline void xfs_add_fdblocks(struct xfs_mount *mp, uint64_t delta)
>  {
> -	return xfs_mod_freecounter(mp, &mp->m_frextents, delta, false);
> +	xfs_add_freecounter(mp, &mp->m_fdblocks, delta);
> +}
> +
> +static inline int xfs_dec_frextents(struct xfs_mount *mp, uint64_t delta)
> +{
> +	return xfs_dec_freecounter(mp, &mp->m_frextents, delta, false);
> +}
> +
> +static inline void xfs_add_frextents(struct xfs_mount *mp, uint64_t delta)
> +{
> +	xfs_add_freecounter(mp, &mp->m_frextents, delta);
>  }
>  
>  extern int	xfs_readsb(xfs_mount_t *, int);
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 6828c48b15e9bd..0afcb005a28fc1 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1874,11 +1874,7 @@ xfs_remount_ro(
>  	xfs_inodegc_stop(mp);
>  
>  	/* Free the per-AG metadata reservation pool. */
> -	error = xfs_fs_unreserve_ag_blocks(mp);
> -	if (error) {
> -		xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> -		return error;
> -	}
> +	xfs_fs_unreserve_ag_blocks(mp);
>  
>  	/*
>  	 * Before we sync the metadata, we need to free up the reserve block
> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> index aea97fc074f8de..1c5753f91c388e 100644
> --- a/fs/xfs/xfs_trace.h
> +++ b/fs/xfs/xfs_trace.h
> @@ -3062,7 +3062,6 @@ DEFINE_AG_RESV_EVENT(xfs_ag_resv_free_extent);
>  DEFINE_AG_RESV_EVENT(xfs_ag_resv_critical);
>  DEFINE_AG_RESV_EVENT(xfs_ag_resv_needed);
>  
> -DEFINE_AG_ERROR_EVENT(xfs_ag_resv_free_error);
>  DEFINE_AG_ERROR_EVENT(xfs_ag_resv_init_error);
>  
>  /* refcount tracepoint classes */
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index 924b460229e951..b18d478e2c9e85 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -163,7 +163,7 @@ xfs_trans_reserve(
>  	 * fail if the count would go below zero.
>  	 */
>  	if (blocks > 0) {
> -		error = xfs_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
> +		error = xfs_dec_fdblocks(mp, blocks, rsvd);
>  		if (error != 0)
>  			return -ENOSPC;
>  		tp->t_blk_res += blocks;
> @@ -210,7 +210,7 @@ xfs_trans_reserve(
>  	 * fail if the count would go below zero.
>  	 */
>  	if (rtextents > 0) {
> -		error = xfs_mod_frextents(mp, -((int64_t)rtextents));
> +		error = xfs_dec_frextents(mp, rtextents);
>  		if (error) {
>  			error = -ENOSPC;
>  			goto undo_log;
> @@ -234,7 +234,7 @@ xfs_trans_reserve(
>  
>  undo_blocks:
>  	if (blocks > 0) {
> -		xfs_mod_fdblocks(mp, (int64_t)blocks, rsvd);
> +		xfs_add_fdblocks(mp, blocks);
>  		tp->t_blk_res = 0;
>  	}
>  	return error;
> @@ -593,12 +593,10 @@ xfs_trans_unreserve_and_mod_sb(
>  	struct xfs_trans	*tp)
>  {
>  	struct xfs_mount	*mp = tp->t_mountp;
> -	bool			rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
>  	int64_t			blkdelta = tp->t_blk_res;
>  	int64_t			rtxdelta = tp->t_rtx_res;
>  	int64_t			idelta = 0;
>  	int64_t			ifreedelta = 0;
> -	int			error;
>  
>  	/*
>  	 * Calculate the deltas.
> @@ -631,10 +629,8 @@ xfs_trans_unreserve_and_mod_sb(
>  	}
>  
>  	/* apply the per-cpu counters */
> -	if (blkdelta) {
> -		error = xfs_mod_fdblocks(mp, blkdelta, rsvd);
> -		ASSERT(!error);
> -	}
> +	if (blkdelta)
> +		xfs_add_fdblocks(mp, blkdelta);
>  
>  	if (idelta)
>  		percpu_counter_add_batch(&mp->m_icount, idelta,
> @@ -643,10 +639,8 @@ xfs_trans_unreserve_and_mod_sb(
>  	if (ifreedelta)
>  		percpu_counter_add(&mp->m_ifree, ifreedelta);
>  
> -	if (rtxdelta) {
> -		error = xfs_mod_frextents(mp, rtxdelta);
> -		ASSERT(!error);
> -	}
> +	if (rtxdelta)
> +		xfs_add_frextents(mp, rtxdelta);
>  
>  	if (!(tp->t_flags & XFS_TRANS_SB_DIRTY))
>  		return;
> @@ -682,7 +676,6 @@ xfs_trans_unreserve_and_mod_sb(
>  	 */
>  	ASSERT(mp->m_sb.sb_imax_pct >= 0);
>  	ASSERT(mp->m_sb.sb_rextslog >= 0);
> -	return;
>  }
>  
>  /* Add the given log item to the transaction's list of log items. */
> @@ -1301,9 +1294,9 @@ xfs_trans_reserve_more_inode(
>  		return 0;
>  
>  	/* Quota failed, give back the new reservation. */
> -	xfs_mod_fdblocks(mp, dblocks, tp->t_flags & XFS_TRANS_RESERVE);
> +	xfs_add_fdblocks(mp, dblocks);
>  	tp->t_blk_res -= dblocks;
> -	xfs_mod_frextents(mp, rtx);
> +	xfs_add_frextents(mp, rtx);
>  	tp->t_rtx_res -= rtx;
>  	return error;
>  }
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread
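
To make the split calling convention concrete, a minimal sketch; the
caller and the block counts are invented, only xfs_dec_fdblocks() and
xfs_add_fdblocks() come from the patch:

	/* Sketch only: hypothetical caller. */
	int
	example_reserve_blocks(
		struct xfs_mount	*mp)
	{
		int			error;

		/* Subtracting can fail and takes the reserve pool flag. */
		error = xfs_dec_fdblocks(mp, 100, false);
		if (error)
			return error;	/* -ENOSPC, nothing to unwind */

		/*
		 * Adding never fails.  If the reserve pool is depleted,
		 * say m_resblks == 8192 and m_resblks_avail == 8152, the
		 * first 40 returned blocks refill the pool and only the
		 * remaining 60 go back into the m_fdblocks counter.
		 */
		xfs_add_fdblocks(mp, 100);
		return 0;
	}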

* Re: [PATCH 07/11] xfs: support RT inodes in xfs_mod_delalloc
  2024-03-25  2:24 ` [PATCH 07/11] xfs: support RT inodes in xfs_mod_delalloc Christoph Hellwig
@ 2024-03-25 23:44   ` Darrick J. Wong
  0 siblings, 0 replies; 17+ messages in thread
From: Darrick J. Wong @ 2024-03-25 23:44 UTC (permalink / raw
  To: Christoph Hellwig; +Cc: Chandan Babu R, Dave Chinner, linux-xfs

On Mon, Mar 25, 2024 at 10:24:07AM +0800, Christoph Hellwig wrote:
> To prepare for re-enabling delalloc on RT devices, pass the data blocks
> (which use the RT device when the inode sits on it) and the indirect
> blocks (which don't) to xfs_mod_delalloc separately, and add a new
> percpu counter to also track the RT delalloc blocks.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks good!
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/libxfs/xfs_bmap.c         | 12 ++++++------
>  fs/xfs/scrub/fscounters.c        |  6 +++++-
>  fs/xfs/scrub/fscounters.h        |  1 +
>  fs/xfs/scrub/fscounters_repair.c |  3 ++-
>  fs/xfs/xfs_mount.c               | 18 +++++++++++++++---
>  fs/xfs/xfs_mount.h               |  9 ++++++++-
>  fs/xfs/xfs_super.c               | 11 ++++++++++-
>  7 files changed, 47 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 94b4aad1989bec..cc250c33890bac 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -1975,7 +1975,7 @@ xfs_bmap_add_extent_delay_real(
>  	}
>  
>  	if (da_new != da_old)
> -		xfs_mod_delalloc(mp, (int64_t)da_new - da_old);
> +		xfs_mod_delalloc(bma->ip, 0, (int64_t)da_new - da_old);
>  
>  	if (bma->cur) {
>  		da_new += bma->cur->bc_bmap.allocated;
> @@ -2694,7 +2694,7 @@ xfs_bmap_add_extent_hole_delay(
>  		/*
>  		 * Nothing to do for disk quota accounting here.
>  		 */
> -		xfs_mod_delalloc(ip->i_mount, (int64_t)newlen - oldlen);
> +		xfs_mod_delalloc(ip, 0, (int64_t)newlen - oldlen);
>  	}
>  }
>  
> @@ -3371,7 +3371,7 @@ xfs_bmap_alloc_account(
>  		 * yet.
>  		 */
>  		if (ap->wasdel) {
> -			xfs_mod_delalloc(ap->ip->i_mount, -(int64_t)ap->length);
> +			xfs_mod_delalloc(ap->ip, -(int64_t)ap->length, 0);
>  			return;
>  		}
>  
> @@ -3395,7 +3395,7 @@ xfs_bmap_alloc_account(
>  	xfs_trans_log_inode(ap->tp, ap->ip, XFS_ILOG_CORE);
>  	if (ap->wasdel) {
>  		ap->ip->i_delayed_blks -= ap->length;
> -		xfs_mod_delalloc(ap->ip->i_mount, -(int64_t)ap->length);
> +		xfs_mod_delalloc(ap->ip, -(int64_t)ap->length, 0);
>  		fld = isrt ? XFS_TRANS_DQ_DELRTBCOUNT : XFS_TRANS_DQ_DELBCOUNT;
>  	} else {
>  		fld = isrt ? XFS_TRANS_DQ_RTBCOUNT : XFS_TRANS_DQ_BCOUNT;
> @@ -4124,7 +4124,7 @@ xfs_bmapi_reserve_delalloc(
>  		goto out_unreserve_frextents;
>  
>  	ip->i_delayed_blks += alen;
> -	xfs_mod_delalloc(ip->i_mount, alen + indlen);
> +	xfs_mod_delalloc(ip, alen, indlen);
>  
>  	got->br_startoff = aoff;
>  	got->br_startblock = nullstartblock(indlen);
> @@ -5022,7 +5022,7 @@ xfs_bmap_del_extent_delay(
>  		fdblocks += del->br_blockcount;
>  
>  	xfs_add_fdblocks(mp, fdblocks);
> -	xfs_mod_delalloc(mp, -(int64_t)fdblocks);
> +	xfs_mod_delalloc(ip, -(int64_t)del->br_blockcount, -da_diff);
>  	return error;
>  }
>  
> diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c
> index 6f465373aa2027..424fb9770f1920 100644
> --- a/fs/xfs/scrub/fscounters.c
> +++ b/fs/xfs/scrub/fscounters.c
> @@ -412,6 +412,7 @@ xchk_fscount_count_frextents(
>  	int			error;
>  
>  	fsc->frextents = 0;
> +	fsc->frextents_delayed = 0;
>  	if (!xfs_has_realtime(mp))
>  		return 0;
>  
> @@ -423,6 +424,8 @@ xchk_fscount_count_frextents(
>  		goto out_unlock;
>  	}
>  
> +	fsc->frextents_delayed = percpu_counter_sum(&mp->m_delalloc_rtextents);
> +
>  out_unlock:
>  	xfs_iunlock(sc->mp->m_rbmip, XFS_ILOCK_SHARED | XFS_ILOCK_RTBITMAP);
>  	return error;
> @@ -434,6 +437,7 @@ xchk_fscount_count_frextents(
>  	struct xchk_fscounters	*fsc)
>  {
>  	fsc->frextents = 0;
> +	fsc->frextents_delayed = 0;
>  	return 0;
>  }
>  #endif /* CONFIG_XFS_RT */
> @@ -593,7 +597,7 @@ xchk_fscounters(
>  	}
>  
>  	if (!xchk_fscount_within_range(sc, frextents, &mp->m_frextents,
> -			fsc->frextents)) {
> +			fsc->frextents - fsc->frextents_delayed)) {
>  		if (fsc->frozen)
>  			xchk_set_corrupt(sc);
>  		else
> diff --git a/fs/xfs/scrub/fscounters.h b/fs/xfs/scrub/fscounters.h
> index 461a13d25f4b38..bcf56e1c36f91c 100644
> --- a/fs/xfs/scrub/fscounters.h
> +++ b/fs/xfs/scrub/fscounters.h
> @@ -12,6 +12,7 @@ struct xchk_fscounters {
>  	uint64_t		ifree;
>  	uint64_t		fdblocks;
>  	uint64_t		frextents;
> +	uint64_t		frextents_delayed;
>  	unsigned long long	icount_min;
>  	unsigned long long	icount_max;
>  	bool			frozen;
> diff --git a/fs/xfs/scrub/fscounters_repair.c b/fs/xfs/scrub/fscounters_repair.c
> index 94cdb852bee462..210ebbcf3e1520 100644
> --- a/fs/xfs/scrub/fscounters_repair.c
> +++ b/fs/xfs/scrub/fscounters_repair.c
> @@ -65,7 +65,8 @@ xrep_fscounters(
>  	percpu_counter_set(&mp->m_icount, fsc->icount);
>  	percpu_counter_set(&mp->m_ifree, fsc->ifree);
>  	percpu_counter_set(&mp->m_fdblocks, fsc->fdblocks);
> -	percpu_counter_set(&mp->m_frextents, fsc->frextents);
> +	percpu_counter_set(&mp->m_frextents,
> +		fsc->frextents - fsc->frextents_delayed);
>  	mp->m_sb.sb_frextents = fsc->frextents;
>  
>  	return 0;
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 575a3b98cdb514..7430a3c7765be8 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -34,6 +34,7 @@
>  #include "xfs_health.h"
>  #include "xfs_trace.h"
>  #include "xfs_ag.h"
> +#include "xfs_rtbitmap.h"
>  #include "scrub/stats.h"
>  
>  static DEFINE_MUTEX(xfs_uuid_table_mutex);
> @@ -1400,9 +1401,20 @@ xfs_clear_incompat_log_features(
>  #define XFS_DELALLOC_BATCH	(4096)
>  void
>  xfs_mod_delalloc(
> -	struct xfs_mount	*mp,
> -	int64_t			delta)
> +	struct xfs_inode	*ip,
> +	int64_t			data_delta,
> +	int64_t			ind_delta)
>  {
> -	percpu_counter_add_batch(&mp->m_delalloc_blks, delta,
> +	struct xfs_mount	*mp = ip->i_mount;
> +
> +	if (XFS_IS_REALTIME_INODE(ip)) {
> +		percpu_counter_add_batch(&mp->m_delalloc_rtextents,
> +				xfs_rtb_to_rtx(mp, data_delta),
> +				XFS_DELALLOC_BATCH);
> +		if (!ind_delta)
> +			return;
> +		data_delta = 0;
> +	}
> +	percpu_counter_add_batch(&mp->m_delalloc_blks, data_delta + ind_delta,
>  			XFS_DELALLOC_BATCH);
>  }
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index d941437a0c7369..0e8d7779c0a561 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -195,6 +195,12 @@ typedef struct xfs_mount {
>  	 * extents or anything related to the rt device.
>  	 */
>  	struct percpu_counter	m_delalloc_blks;
> +
> +	/*
> +	 * RT version of the above.
> +	 */
> +	struct percpu_counter	m_delalloc_rtextents;
> +
>  	/*
>  	 * Global count of allocation btree blocks in use across all AGs. Only
>  	 * used when perag reservation is enabled. Helps prevent block
> @@ -577,6 +583,7 @@ struct xfs_error_cfg * xfs_error_get_cfg(struct xfs_mount *mp,
>  void xfs_force_summary_recalc(struct xfs_mount *mp);
>  int xfs_add_incompat_log_feature(struct xfs_mount *mp, uint32_t feature);
>  bool xfs_clear_incompat_log_features(struct xfs_mount *mp);
> -void xfs_mod_delalloc(struct xfs_mount *mp, int64_t delta);
> +void xfs_mod_delalloc(struct xfs_inode *ip, int64_t data_delta,
> +		int64_t ind_delta);
>  
>  #endif	/* __XFS_MOUNT_H__ */
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 0afcb005a28fc1..71732457583370 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1052,12 +1052,18 @@ xfs_init_percpu_counters(
>  	if (error)
>  		goto free_fdblocks;
>  
> -	error = percpu_counter_init(&mp->m_frextents, 0, GFP_KERNEL);
> +	error = percpu_counter_init(&mp->m_delalloc_rtextents, 0, GFP_KERNEL);
>  	if (error)
>  		goto free_delalloc;
>  
> +	error = percpu_counter_init(&mp->m_frextents, 0, GFP_KERNEL);
> +	if (error)
> +		goto free_delalloc_rt;
> +
>  	return 0;
>  
> +free_delalloc_rt:
> +	percpu_counter_destroy(&mp->m_delalloc_rtextents);
>  free_delalloc:
>  	percpu_counter_destroy(&mp->m_delalloc_blks);
>  free_fdblocks:
> @@ -1086,6 +1092,9 @@ xfs_destroy_percpu_counters(
>  	percpu_counter_destroy(&mp->m_icount);
>  	percpu_counter_destroy(&mp->m_ifree);
>  	percpu_counter_destroy(&mp->m_fdblocks);
> +	ASSERT(xfs_is_shutdown(mp) ||
> +	       percpu_counter_sum(&mp->m_delalloc_rtextents) == 0);
> +	percpu_counter_destroy(&mp->m_delalloc_rtextents);
>  	ASSERT(xfs_is_shutdown(mp) ||
>  	       percpu_counter_sum(&mp->m_delalloc_blks) == 0);
>  	percpu_counter_destroy(&mp->m_delalloc_blks);
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread
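
A short worked example of the reworked xfs_mod_delalloc interface; the
deltas are invented, and the inode is assumed to be an RT inode on a
file system with sb_rextsize == 1:

	/* Reserve 16 delalloc data blocks plus 2 indirect blocks: */
	xfs_mod_delalloc(ip, 16, 2);
	/*
	 * For the RT inode this adds xfs_rtb_to_rtx(mp, 16) == 16 to
	 * m_delalloc_rtextents, because the data blocks live on the RT
	 * device, and 2 to m_delalloc_blks, because indirect blocks are
	 * always allocated from the data device.  For a data device
	 * inode the same call would add all 18 blocks to
	 * m_delalloc_blks.
	 */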

* Re: [PATCH 09/11] xfs: rework splitting of indirect block reservations
  2024-03-25  2:24 ` [PATCH 09/11] xfs: rework splitting of indirect block reservations Christoph Hellwig
@ 2024-03-25 23:55   ` Darrick J. Wong
  2024-03-26  5:51     ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Darrick J. Wong @ 2024-03-25 23:55 UTC (permalink / raw
  To: Christoph Hellwig; +Cc: Chandan Babu R, Dave Chinner, linux-xfs

On Mon, Mar 25, 2024 at 10:24:09AM +0800, Christoph Hellwig wrote:
> Move the check for whether we have enough indirect blocks, and the
> stealing of the deleted extent's blocks, out of xfs_bmap_split_indlen
> and into the caller, to prepare for handling delayed allocation of RT
> extents whose blocks can't easily be stolen.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 38 ++++++++++++++++----------------------
>  1 file changed, 16 insertions(+), 22 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index cc250c33890bac..dda25a21100836 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -4829,31 +4829,17 @@ xfs_bmapi_remap(
>   * ores == 1). The number of stolen blocks is returned. The availability and
>   * subsequent accounting of stolen blocks is the responsibility of the caller.
>   */
> -static xfs_filblks_t
> +static void
>  xfs_bmap_split_indlen(
>  	xfs_filblks_t			ores,		/* original res. */
>  	xfs_filblks_t			*indlen1,	/* ext1 worst indlen */
> -	xfs_filblks_t			*indlen2,	/* ext2 worst indlen */
> -	xfs_filblks_t			avail)		/* stealable blocks */
> +	xfs_filblks_t			*indlen2)	/* ext2 worst indlen */
>  {
>  	xfs_filblks_t			len1 = *indlen1;
>  	xfs_filblks_t			len2 = *indlen2;
>  	xfs_filblks_t			nres = len1 + len2; /* new total res. */
> -	xfs_filblks_t			stolen = 0;
>  	xfs_filblks_t			resfactor;
>  
> -	/*
> -	 * Steal as many blocks as we can to try and satisfy the worst case
> -	 * indlen for both new extents.
> -	 */
> -	if (ores < nres && avail)
> -		stolen = XFS_FILBLKS_MIN(nres - ores, avail);
> -	ores += stolen;
> -
> -	 /* nothing else to do if we've satisfied the new reservation */
> -	if (ores >= nres)
> -		return stolen;
> -
>  	/*
>  	 * We can't meet the total required reservation for the two extents.
>  	 * Calculate the percent of the overall shortage between both extents
> @@ -4898,8 +4884,6 @@ xfs_bmap_split_indlen(
>  
>  	*indlen1 = len1;
>  	*indlen2 = len2;
> -
> -	return stolen;
>  }
>  
>  int
> @@ -4915,7 +4899,7 @@ xfs_bmap_del_extent_delay(
>  	struct xfs_bmbt_irec	new;
>  	int64_t			da_old, da_new, da_diff = 0;
>  	xfs_fileoff_t		del_endoff, got_endoff;
> -	xfs_filblks_t		got_indlen, new_indlen, stolen;
> +	xfs_filblks_t		got_indlen, new_indlen, stolen = 0;
>  	uint32_t		state = xfs_bmap_fork_to_state(whichfork);
>  	uint64_t		fdblocks;
>  	int			error = 0;
> @@ -4994,8 +4978,19 @@ xfs_bmap_del_extent_delay(
>  		new_indlen = xfs_bmap_worst_indlen(ip, new.br_blockcount);
>  
>  		WARN_ON_ONCE(!got_indlen || !new_indlen);
> -		stolen = xfs_bmap_split_indlen(da_old, &got_indlen, &new_indlen,
> -						       del->br_blockcount);
> +		/*
> +		 * Steal as many blocks as we can to try and satisfy the worst
> +		 * case indlen for both new extents.
> +		 */
> +		da_new = got_indlen + new_indlen;
> +		if (da_new > da_old) {
> +			stolen = XFS_FILBLKS_MIN(da_new - da_old,
> +						 new.br_blockcount);

Huh.  We used to pass del->blockcount as one of the constraints on the
stolen block count.  Why pass new.br_blockcount instead?

--D

> +			da_old += stolen;
> +		}
> +		if (da_new > da_old)
> +			xfs_bmap_split_indlen(da_old, &got_indlen, &new_indlen);
> +		da_new = got_indlen + new_indlen;
>  
>  		got->br_startblock = nullstartblock((int)got_indlen);
>  
> @@ -5007,7 +5002,6 @@ xfs_bmap_del_extent_delay(
>  		xfs_iext_next(ifp, icur);
>  		xfs_iext_insert(ip, icur, &new, state);
>  
> -		da_new = got_indlen + new_indlen - stolen;
>  		del->br_blockcount -= stolen;
>  		break;
>  	}
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread
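
A worked example of the reordered steal-then-split logic, with invented
numbers for a hole punched in the middle of a delalloc extent:

	/*
	 *	da_old            = 4	old indirect reservation
	 *	got_indlen        = 3	worst case indlen, left extent
	 *	new_indlen        = 3	worst case indlen, right extent
	 *	new.br_blockcount = 5	data blocks in the new right extent
	 *
	 * da_new = 3 + 3 = 6 exceeds da_old, so the caller steals
	 * XFS_FILBLKS_MIN(6 - 4, 5) = 2 blocks (the new.br_blockcount
	 * bound is exactly what the review above questions), making
	 * da_old = 6.  da_new no longer exceeds da_old, so
	 * xfs_bmap_split_indlen() is skipped and both extents keep their
	 * worst case reservation.  With nothing available to steal,
	 * xfs_bmap_split_indlen() would instead shrink got_indlen and
	 * new_indlen proportionally to fit the remaining da_old.
	 */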

* Re: [PATCH 04/11] xfs: split xfs_mod_freecounter
  2024-03-25 23:41   ` Darrick J. Wong
@ 2024-03-26  5:51     ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-26  5:51 UTC (permalink / raw
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Chandan Babu R, Dave Chinner, linux-xfs

On Mon, Mar 25, 2024 at 04:41:53PM -0700, Darrick J. Wong wrote:
> On Mon, Mar 25, 2024 at 10:24:04AM +0800, Christoph Hellwig wrote:
> > xfs_mod_freecounter has two entirely separate code paths for adding or
> > subtracting from the free counters.  Only the subtract case looks at the
> > rsvd flag and can return an error.
> > 
> > Split xfs_mod_freecounter into separate helpers for subtracting or
> > adding the freecounter, and remove all the impossible to reach error
> > handling for the addition case.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> I'm guessing that the difference between v3 and v4 is that you can more
> easily integrate a freertx reserve pool now?

Yes.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 09/11] xfs: rework splitting of indirect block reservations
  2024-03-25 23:55   ` Darrick J. Wong
@ 2024-03-26  5:51     ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2024-03-26  5:51 UTC (permalink / raw
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Chandan Babu R, Dave Chinner, linux-xfs

On Mon, Mar 25, 2024 at 04:55:43PM -0700, Darrick J. Wong wrote:
> > +		/*
> > +		 * Steal as many blocks as we can to try and satisfy the worst
> > +		 * case indlen for both new extents.
> > +		 */
> > +		da_new = got_indlen + new_indlen;
> > +		if (da_new > da_old) {
> > +			stolen = XFS_FILBLKS_MIN(da_new - da_old,
> > +						 new.br_blockcount);
> 
> Huh.  We used to pass del->blockcount as one of the constraints on the
> stolen block count.  Why pass new.br_blockcount instead?

.. we probably shouldn't.  Let me take another closer look.


^ permalink raw reply	[flat|nested] 17+ messages in thread

