Linux-Fsdevel Archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] jbd2: speed up jbd2_transaction_committed()
@ 2024-05-20 13:18 Zhang Yi
  2024-05-20 13:39 ` Jan Kara
  2024-05-23 12:30 ` Ritesh Harjani
  0 siblings, 2 replies; 3+ messages in thread
From: Zhang Yi @ 2024-05-20 13:18 UTC (permalink / raw
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	ritesh.list, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

jbd2_transaction_committed() is used to check whether a transaction with
the given tid has already committed, it holds j_state_lock in read mode
and check the tid of current running transaction and committing
transaction, but holding the j_state_lock is expensive.

We have already stored the sequence number of the most recently
committed transaction in journal t->j_commit_sequence, we could do this
check by comparing it with the given tid instead. If the given tid isn't
smaller than j_commit_sequence, we can ensure that the given transaction
has been committed. That way we could drop the expensive lock and
achieve about 10% ~ 20% performance gains in concurrent DIOs on may
virtual machine with 100G ramdisk.

fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
    -bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
    -group_reporting

Before:
  overwrite       IOPS=88.2k, BW=344MiB/s
  read            IOPS=95.7k, BW=374MiB/s
  rand overwrite  IOPS=98.7k, BW=386MiB/s
  randread        IOPS=102k, BW=397MiB/s

After:
  overwrite       IOPS=105k, BW=410MiB/s
  read            IOPS=112k, BW=436MiB/s
  rand overwrite  IOPS=104k, BW=404MiB/s
  randread        IOPS=111k, BW=432MiB/s

CC: Dave Chinner <david@fromorbit.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Link: https://lore.kernel.org/linux-ext4/ZjILCPNZRHeazSqV@dread.disaster.area/
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
v1->v2:
 - Add READ_ONCE and WRITE_ONCE to access ->j_commit_sequence
   concurrently.
 - Keep the jbd2_transaction_committed() helper.

 fs/jbd2/commit.c  |  2 +-
 fs/jbd2/journal.c | 12 +-----------
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 5e122586e06e..8244cab17688 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -1108,7 +1108,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
 
 	commit_transaction->t_state = T_COMMIT_CALLBACK;
 	J_ASSERT(commit_transaction == journal->j_committing_transaction);
-	journal->j_commit_sequence = commit_transaction->t_tid;
+	WRITE_ONCE(journal->j_commit_sequence, commit_transaction->t_tid);
 	journal->j_committing_transaction = NULL;
 	commit_time = ktime_to_ns(ktime_sub(ktime_get(), start_time));
 
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index b6c114c11b97..cc586e3c4ee1 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -789,17 +789,7 @@ EXPORT_SYMBOL(jbd2_fc_end_commit_fallback);
 /* Return 1 when transaction with given tid has already committed. */
 int jbd2_transaction_committed(journal_t *journal, tid_t tid)
 {
-	int ret = 1;
-
-	read_lock(&journal->j_state_lock);
-	if (journal->j_running_transaction &&
-	    journal->j_running_transaction->t_tid == tid)
-		ret = 0;
-	if (journal->j_committing_transaction &&
-	    journal->j_committing_transaction->t_tid == tid)
-		ret = 0;
-	read_unlock(&journal->j_state_lock);
-	return ret;
+	return tid_geq(READ_ONCE(journal->j_commit_sequence), tid);
 }
 EXPORT_SYMBOL(jbd2_transaction_committed);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] jbd2: speed up jbd2_transaction_committed()
  2024-05-20 13:18 [PATCH v2] jbd2: speed up jbd2_transaction_committed() Zhang Yi
@ 2024-05-20 13:39 ` Jan Kara
  2024-05-23 12:30 ` Ritesh Harjani
  1 sibling, 0 replies; 3+ messages in thread
From: Jan Kara @ 2024-05-20 13:39 UTC (permalink / raw
  To: Zhang Yi
  Cc: linux-ext4, linux-fsdevel, linux-kernel, tytso, adilger.kernel,
	jack, ritesh.list, yi.zhang, chengzhihao1, yukuai3

On Mon 20-05-24 21:18:31, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> jbd2_transaction_committed() is used to check whether a transaction with
> the given tid has already committed, it holds j_state_lock in read mode
> and check the tid of current running transaction and committing
> transaction, but holding the j_state_lock is expensive.
> 
> We have already stored the sequence number of the most recently
> committed transaction in journal t->j_commit_sequence, we could do this
> check by comparing it with the given tid instead. If the given tid isn't
> smaller than j_commit_sequence, we can ensure that the given transaction
> has been committed. That way we could drop the expensive lock and
> achieve about 10% ~ 20% performance gains in concurrent DIOs on may
> virtual machine with 100G ramdisk.
> 
> fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
>     -bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
>     -group_reporting
> 
> Before:
>   overwrite       IOPS=88.2k, BW=344MiB/s
>   read            IOPS=95.7k, BW=374MiB/s
>   rand overwrite  IOPS=98.7k, BW=386MiB/s
>   randread        IOPS=102k, BW=397MiB/s
> 
> After:
>   overwrite       IOPS=105k, BW=410MiB/s
>   read            IOPS=112k, BW=436MiB/s
>   rand overwrite  IOPS=104k, BW=404MiB/s
>   randread        IOPS=111k, BW=432MiB/s
> 
> CC: Dave Chinner <david@fromorbit.com>
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Link: https://lore.kernel.org/linux-ext4/ZjILCPNZRHeazSqV@dread.disaster.area/
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Thanks! The patch looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
> v1->v2:
>  - Add READ_ONCE and WRITE_ONCE to access ->j_commit_sequence
>    concurrently.
>  - Keep the jbd2_transaction_committed() helper.
> 
>  fs/jbd2/commit.c  |  2 +-
>  fs/jbd2/journal.c | 12 +-----------
>  2 files changed, 2 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index 5e122586e06e..8244cab17688 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -1108,7 +1108,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
>  
>  	commit_transaction->t_state = T_COMMIT_CALLBACK;
>  	J_ASSERT(commit_transaction == journal->j_committing_transaction);
> -	journal->j_commit_sequence = commit_transaction->t_tid;
> +	WRITE_ONCE(journal->j_commit_sequence, commit_transaction->t_tid);
>  	journal->j_committing_transaction = NULL;
>  	commit_time = ktime_to_ns(ktime_sub(ktime_get(), start_time));
>  
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index b6c114c11b97..cc586e3c4ee1 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -789,17 +789,7 @@ EXPORT_SYMBOL(jbd2_fc_end_commit_fallback);
>  /* Return 1 when transaction with given tid has already committed. */
>  int jbd2_transaction_committed(journal_t *journal, tid_t tid)
>  {
> -	int ret = 1;
> -
> -	read_lock(&journal->j_state_lock);
> -	if (journal->j_running_transaction &&
> -	    journal->j_running_transaction->t_tid == tid)
> -		ret = 0;
> -	if (journal->j_committing_transaction &&
> -	    journal->j_committing_transaction->t_tid == tid)
> -		ret = 0;
> -	read_unlock(&journal->j_state_lock);
> -	return ret;
> +	return tid_geq(READ_ONCE(journal->j_commit_sequence), tid);
>  }
>  EXPORT_SYMBOL(jbd2_transaction_committed);
>  
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] jbd2: speed up jbd2_transaction_committed()
  2024-05-20 13:18 [PATCH v2] jbd2: speed up jbd2_transaction_committed() Zhang Yi
  2024-05-20 13:39 ` Jan Kara
@ 2024-05-23 12:30 ` Ritesh Harjani
  1 sibling, 0 replies; 3+ messages in thread
From: Ritesh Harjani @ 2024-05-23 12:30 UTC (permalink / raw
  To: Zhang Yi, linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack,
	yi.zhang, yi.zhang, chengzhihao1, yukuai3

Zhang Yi <yi.zhang@huaweicloud.com> writes:

> From: Zhang Yi <yi.zhang@huawei.com>
>
> jbd2_transaction_committed() is used to check whether a transaction with
> the given tid has already committed, it holds j_state_lock in read mode
> and check the tid of current running transaction and committing
> transaction, but holding the j_state_lock is expensive.
>
> We have already stored the sequence number of the most recently
> committed transaction in journal t->j_commit_sequence, we could do this
> check by comparing it with the given tid instead. If the given tid isn't
> smaller than j_commit_sequence, we can ensure that the given transaction
> has been committed. That way we could drop the expensive lock and
> achieve about 10% ~ 20% performance gains in concurrent DIOs on may
> virtual machine with 100G ramdisk.
>
> fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
>     -bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
>     -group_reporting
>
> Before:
>   overwrite       IOPS=88.2k, BW=344MiB/s
>   read            IOPS=95.7k, BW=374MiB/s
>   rand overwrite  IOPS=98.7k, BW=386MiB/s
>   randread        IOPS=102k, BW=397MiB/s
>
> After:
>   overwrite       IOPS=105k, BW=410MiB/s
>   read            IOPS=112k, BW=436MiB/s
>   rand overwrite  IOPS=104k, BW=404MiB/s
>   randread        IOPS=111k, BW=432MiB/s

I was surprised to see that even the read and randread performance
is improved with this patch which should theoritically only impact write
workloads given based on such checks we are just setting IOMAP_F_DIRTY
flag to report to iomap. 

But then I came across these two patches [1] [2]. It seems the change
[1] to set IOMAP_F_DIRTY was initially only done for IOMAP_WRITE path.
But patch [2] moved some logic to fs-dax core and filesystems were left
to always just reports if there is any dirty metadata, irrespective of
reads or writes.

[1]: https://lore.kernel.org/all/20171101153648.30166-17-jack@suse.cz/
[2]: https://lore.kernel.org/all/151062258598.8554.8157038002895095232.stgit@dwillia2-desk3.amr.corp.intel.com/


Ohh - could this patch be that reason of peformance regression when ext4
DIO moved to iomap? Should we CC: stable to when ext4 DIO was moved to
iomap atleast - which I believe was v5.5?

Looks good to me.
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>

>
> CC: Dave Chinner <david@fromorbit.com>
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Link: https://lore.kernel.org/linux-ext4/ZjILCPNZRHeazSqV@dread.disaster.area/

aah. This link is helpful too to understand the context. Thanks!

> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> ---
> v1->v2:
>  - Add READ_ONCE and WRITE_ONCE to access ->j_commit_sequence
>    concurrently.
>  - Keep the jbd2_transaction_committed() helper.
>
>  fs/jbd2/commit.c  |  2 +-
>  fs/jbd2/journal.c | 12 +-----------
>  2 files changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index 5e122586e06e..8244cab17688 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -1108,7 +1108,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
>  
>  	commit_transaction->t_state = T_COMMIT_CALLBACK;
>  	J_ASSERT(commit_transaction == journal->j_committing_transaction);
> -	journal->j_commit_sequence = commit_transaction->t_tid;
> +	WRITE_ONCE(journal->j_commit_sequence, commit_transaction->t_tid);
>  	journal->j_committing_transaction = NULL;
>  	commit_time = ktime_to_ns(ktime_sub(ktime_get(), start_time));
>  
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index b6c114c11b97..cc586e3c4ee1 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -789,17 +789,7 @@ EXPORT_SYMBOL(jbd2_fc_end_commit_fallback);
>  /* Return 1 when transaction with given tid has already committed. */
>  int jbd2_transaction_committed(journal_t *journal, tid_t tid)
>  {
> -	int ret = 1;
> -
> -	read_lock(&journal->j_state_lock);
> -	if (journal->j_running_transaction &&
> -	    journal->j_running_transaction->t_tid == tid)
> -		ret = 0;
> -	if (journal->j_committing_transaction &&
> -	    journal->j_committing_transaction->t_tid == tid)
> -		ret = 0;
> -	read_unlock(&journal->j_state_lock);
> -	return ret;
> +	return tid_geq(READ_ONCE(journal->j_commit_sequence), tid);
>  }
>  EXPORT_SYMBOL(jbd2_transaction_committed);
>  
> -- 
> 2.39.2

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-05-23 12:49 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-20 13:18 [PATCH v2] jbd2: speed up jbd2_transaction_committed() Zhang Yi
2024-05-20 13:39 ` Jan Kara
2024-05-23 12:30 ` Ritesh Harjani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).