From: Zhang Yi <yi.zhang@huaweicloud.com>
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org,
	djwong@kernel.org, hch@infradead.org, brauner@kernel.org,
	david@fromorbit.com, chandanbabu@kernel.org, jack@suse.cz,
	yi.zhang@huawei.com, yi.zhang@huaweicloud.com,
	chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: [PATCH 3/3] xfs: correct the zeroing truncate range
Date: Wed, 15 May 2024 10:28:29 +0800
Message-ID: <20240515022829.2455554-4-yi.zhang@huaweicloud.com>
In-Reply-To: <20240515022829.2455554-1-yi.zhang@huaweicloud.com>
From: Zhang Yi <yi.zhang@huawei.com>
When truncating a realtime file to a shorter, unaligned size,
xfs_setattr_size() only flushes the EOF page before zeroing, and
xfs_truncate_page() only zeroes the EOF block. This could expose
stale data since 943bc0882ceb ("iomap: don't increase i_size if it's not
a write operation").
Suppose sb_rextsize is bigger than one block and a realtime inode
contains a long enough written extent. If we truncate to an unaligned
offset in the middle of this extent, xfs_itruncate_extents() could
split the extent and align its tail to sb_rextsize, so more than one
block may remain between the new end of the file and the end of the
extent. Since xfs_truncate_page() only zeroes the trailing portion of
the i_blocksize() block, it may leave behind blocks containing stale
data, which could be exposed if a later append write extends the file
far enough.
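To make the arithmetic above concrete, here is a minimal user-space sketch (not kernel code). The names byte_range, round_up_u64() and unzeroed_range() are hypothetical helpers introduced only for illustration, and the sketch assumes that realtime extents are aligned to sb_rextsize in file-offset space and that the realtime extent size in bytes is a multiple of the block size.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open byte range [start, end). */
struct byte_range {
	uint64_t start;
	uint64_t end;
};

/* Round x up to the next multiple of y; an aligned x is returned as-is. */
static uint64_t round_up_u64(uint64_t x, uint64_t y)
{
	assert(y > 0);
	return (x + y - 1) / y * y;
}

/*
 * Bytes past the new EOF that remain inside the last realtime extent
 * but are not covered when only the tail of the EOF block is zeroed.
 * An empty range (start == end) means nothing is left behind.
 *
 * Assumption: rextbytes is a multiple of blocksize.
 */
static struct byte_range unzeroed_range(uint64_t newsize, uint64_t blocksize,
					uint64_t rextbytes)
{
	struct byte_range r;

	r.start = round_up_u64(newsize, blocksize);
	r.end = round_up_u64(newsize, rextbytes);
	return r;
}
```

With 4096-byte blocks and a realtime extent of 4 blocks (16384 bytes), truncating to 5000 bytes zeroes only [5000, 8192), so unzeroed_range(5000, 4096, 16384) yields [8192, 16384). When the realtime extent is a single block, the range is empty.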
xfs_truncate_page() should flush and zero out the entire rtextsize range,
and make sure the entire zeroed range has been flushed to disk before
the inode size is updated.
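The effect of widening the zeroing granule can be checked with a small in-memory model. This is only a sketch under stated assumptions: one realtime extent is represented as a byte array, BLOCK_BYTES and REXT_BLOCKS are made-up sizes, and zero_tail(), stale_bytes() and simulate_truncate() are hypothetical helpers, not XFS or iomap functions. It models the zeroing step only, not the flushes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up geometry for the model: 4 blocks per realtime extent. */
enum {
	BLOCK_BYTES = 512,
	REXT_BLOCKS = 4,
	REXT_BYTES = BLOCK_BYTES * REXT_BLOCKS,
};

/* Round x up to the next multiple of y; an aligned x is returned as-is. */
static size_t round_up_sz(size_t x, size_t y)
{
	return (x + y - 1) / y * y;
}

/* Zero from pos up to the next multiple of granule. */
static void zero_tail(unsigned char *ext, size_t pos, size_t granule)
{
	size_t end = round_up_sz(pos, granule);

	memset(ext + pos, 0, end - pos);
}

/* Count bytes at or past pos that still hold non-zero (stale) data. */
static size_t stale_bytes(const unsigned char *ext, size_t pos)
{
	size_t n = 0;

	for (size_t i = pos; i < REXT_BYTES; i++)
		n += (ext[i] != 0);
	return n;
}

/*
 * Fill one extent with a stale pattern, truncate to pos while zeroing
 * with the given granule, and report how many stale bytes survive.
 */
static size_t simulate_truncate(size_t pos, size_t granule)
{
	unsigned char ext[REXT_BYTES];

	assert(pos < REXT_BYTES);
	assert(granule > 0 && REXT_BYTES % granule == 0);
	memset(ext, 0xAA, sizeof(ext));
	zero_tail(ext, pos, granule);
	return stale_bytes(ext, pos);
}
```

In this model, simulate_truncate(700, BLOCK_BYTES) leaves 1024 stale bytes (the two blocks after the EOF block), while simulate_truncate(700, REXT_BYTES) leaves none, which is the behaviour the patch aims for by using the realtime extent size as the zeroing granule.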
Fixes: 943bc0882ceb ("iomap: don't increase i_size if it's not a write operation")
Reported-by: Chandan Babu R <chandanbabu@kernel.org>
Link: https://lore.kernel.org/linux-xfs/0b92a215-9d9b-3788-4504-a520778953c2@huaweicloud.com
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
fs/xfs/xfs_iomap.c | 35 +++++++++++++++++++++++++++++++----
fs/xfs/xfs_iops.c | 10 ----------
2 files changed, 31 insertions(+), 14 deletions(-)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 4958cc3337bc..fc379450fe74 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1466,12 +1466,39 @@ xfs_truncate_page(
 	loff_t			pos,
 	bool			*did_zero)
 {
+	struct xfs_mount	*mp = ip->i_mount;
 	struct inode		*inode = VFS_I(ip);
 	unsigned int		blocksize = i_blocksize(inode);
+	int			error;
+
+	if (XFS_IS_REALTIME_INODE(ip))
+		blocksize = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
+
+	/*
+	 * iomap won't detect a dirty page over an unwritten block (or a
+	 * cow block over a hole) and subsequently skips zeroing the
+	 * newly post-EOF portion of the page. Flush the new EOF to
+	 * convert the block before the pagecache truncate.
+	 */
+	error = filemap_write_and_wait_range(inode->i_mapping, pos,
+					     roundup_64(pos, blocksize));
+	if (error)
+		return error;
 
 	if (IS_DAX(inode))
-		return dax_truncate_page(inode, pos, blocksize, did_zero,
-					&xfs_dax_write_iomap_ops);
-	return iomap_truncate_page(inode, pos, blocksize, did_zero,
-				   &xfs_buffered_write_iomap_ops);
+		error = dax_truncate_page(inode, pos, blocksize, did_zero,
+					  &xfs_dax_write_iomap_ops);
+	else
+		error = iomap_truncate_page(inode, pos, blocksize, did_zero,
+					    &xfs_buffered_write_iomap_ops);
+	if (error)
+		return error;
+
+	/*
+	 * Write back path won't write dirty blocks post EOF folio,
+	 * flush the entire zeroed range before updating the inode
+	 * size.
+	 */
+	return filemap_write_and_wait_range(inode->i_mapping, pos,
+					    roundup_64(pos, blocksize));
 }
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 66f8c47642e8..baeeddf4a6bb 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -845,16 +845,6 @@ xfs_setattr_size(
 		error = xfs_zero_range(ip, oldsize, newsize - oldsize,
 				&did_zeroing);
 	} else {
-		/*
-		 * iomap won't detect a dirty page over an unwritten block (or a
-		 * cow block over a hole) and subsequently skips zeroing the
-		 * newly post-EOF portion of the page. Flush the new EOF to
-		 * convert the block before the pagecache truncate.
-		 */
-		error = filemap_write_and_wait_range(inode->i_mapping, newsize,
-						     newsize);
-		if (error)
-			return error;
 		error = xfs_truncate_page(ip, newsize, &did_zeroing);
 	}
 
--
2.39.2
Thread overview: 6+ messages
2024-05-15 2:28 [PATCH 0/3] iomap/xfs: fix stale data exposure when truncating realtime inodes Zhang Yi
2024-05-15 2:28 ` [PATCH 1/3] iomap: pass blocksize to iomap_truncate_page() Zhang Yi
2024-05-15 13:16 ` kernel test robot
2024-05-15 13:16 ` kernel test robot
2024-05-15 2:28 ` [PATCH 2/3] fsdax: pass blocksize to dax_truncate_page() Zhang Yi
2024-05-15 2:28 ` Zhang Yi [this message]