From: Jan Kara <jack@suse.cz>
To: Ye Bin <yebin@huaweicloud.com>
Cc: tytso@mit.edu, adilger.kernel@dilger.ca,
linux-ext4@vger.kernel.org, jack@suse.cz,
Ye Bin <yebin10@huawei.com>
Subject: Re: [PATCH v3] jbd2: avoid mount failed when commit block is partial submitted
Date: Mon, 29 Apr 2024 21:14:37 +0200 [thread overview]
Message-ID: <20240429191437.2pxfg43hgdmkgl2m@quack3> (raw)
In-Reply-To: <20240425064515.836633-1-yebin@huaweicloud.com>
On Thu 25-04-24 14:45:15, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
>
> We encountered a problem that the file system could not be mounted in
> the power-off scenario. The analysis of the file system mirror shows that
> only part of the data is written to the last commit block.
> The valid data of the commit block is concentrated in the first sector.
> However, the data of the entire block is involved in the checksum calculation.
> For different hardware, the minimum atomic unit may be different.
> If the checksum of a committed block is incorrect, clear the data except the
> 'commit_header' and then calculate the checksum. If the checkusm is correct,
> it is considered that the block is partially committed. However, if there are
> valid description/revoke blocks, it is considered that the data is abnormal
> and the log replay is stopped.
>
> Signed-off-by: Ye Bin <yebin10@huawei.com>
The patch as a fix looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
I'd just note that the maze of branches and gotos in do_one_pass() is
becoming very hard to follow so ideally we should come up with some
refactoring of the function so that it is easier to follow. But that's
definitely something for a separate patch.
Honza
> ---
> fs/jbd2/recovery.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 48 insertions(+)
>
> diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
> index 1f7664984d6e..594bf02a709f 100644
> --- a/fs/jbd2/recovery.c
> +++ b/fs/jbd2/recovery.c
> @@ -443,6 +443,27 @@ static int jbd2_commit_block_csum_verify(journal_t *j, void *buf)
> return provided == cpu_to_be32(calculated);
> }
>
> +static bool jbd2_commit_block_csum_verify_partial(journal_t *j, void *buf)
> +{
> + struct commit_header *h;
> + __be32 provided;
> + __u32 calculated;
> + void *tmpbuf;
> +
> + tmpbuf = kzalloc(j->j_blocksize, GFP_KERNEL);
> + if (!tmpbuf)
> + return false;
> +
> + memcpy(tmpbuf, buf, sizeof(struct commit_header));
> + h = tmpbuf;
> + provided = h->h_chksum[0];
> + h->h_chksum[0] = 0;
> + calculated = jbd2_chksum(j, j->j_csum_seed, tmpbuf, j->j_blocksize);
> + kfree(tmpbuf);
> +
> + return provided == cpu_to_be32(calculated);
> +}
> +
> static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
> journal_block_tag3_t *tag3,
> void *buf, __u32 sequence)
> @@ -479,6 +500,7 @@ static int do_one_pass(journal_t *journal,
> int descr_csum_size = 0;
> int block_error = 0;
> bool need_check_commit_time = false;
> + bool has_partial_commit = false;
> __u64 last_trans_commit_time = 0, commit_time;
>
> /*
> @@ -590,6 +612,14 @@ static int do_one_pass(journal_t *journal,
> next_log_block);
> }
>
> + if (pass == PASS_SCAN && has_partial_commit) {
> + pr_err("JBD2: Detect validate descriptor block %lu after incomplete commit block\n",
> + next_log_block);
> + err = -EFSBADCRC;
> + brelse(bh);
> + goto failed;
> + }
> +
> /* If it is a valid descriptor block, replay it
> * in pass REPLAY; if journal_checksums enabled, then
> * calculate checksums in PASS_SCAN, otherwise,
> @@ -810,6 +840,14 @@ static int do_one_pass(journal_t *journal,
> if (pass == PASS_SCAN &&
> !jbd2_commit_block_csum_verify(journal,
> bh->b_data)) {
> + if (jbd2_commit_block_csum_verify_partial(
> + journal,
> + bh->b_data)) {
> + pr_notice("JBD2: Find incomplete commit block in transaction %u block %lu\n",
> + next_commit_ID, next_log_block);
> + has_partial_commit = true;
> + goto chksum_ok;
> + }
> chksum_error:
> if (commit_time < last_trans_commit_time)
> goto ignore_crc_mismatch;
> @@ -824,6 +862,7 @@ static int do_one_pass(journal_t *journal,
> }
> }
> if (pass == PASS_SCAN) {
> + chksum_ok:
> last_trans_commit_time = commit_time;
> head_block = next_log_block;
> }
> @@ -843,6 +882,15 @@ static int do_one_pass(journal_t *journal,
> next_log_block);
> need_check_commit_time = true;
> }
> +
> + if (pass == PASS_SCAN && has_partial_commit) {
> + pr_err("JBD2: Detect validate revoke block %lu after incomplete commit block\n",
> + next_log_block);
> + err = -EFSBADCRC;
> + brelse(bh);
> + goto failed;
> + }
> +
> /* If we aren't in the REVOKE pass, then we can
> * just skip over this block. */
> if (pass != PASS_REVOKE) {
> --
> 2.31.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
prev parent reply other threads:[~2024-04-29 19:14 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-25 6:45 [PATCH v3] jbd2: avoid mount failed when commit block is partial submitted Ye Bin
2024-04-29 19:14 ` Jan Kara [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240429191437.2pxfg43hgdmkgl2m@quack3 \
--to=jack@suse.cz \
--cc=adilger.kernel@dilger.ca \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=yebin10@huawei.com \
--cc=yebin@huaweicloud.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).