* [PATCH v3] dm-io: don't warn if flush takes too long time
@ 2024-04-17 9:05 Mikulas Patocka
2024-04-17 14:20 ` Jens Axboe
2024-04-17 14:50 ` Christoph Hellwig
0 siblings, 2 replies; 4+ messages in thread
From: Mikulas Patocka @ 2024-04-17 9:05 UTC
To: Mike Snitzer, Damien Le Moal
Cc: Guangwu Zhang, dm-devel, Jens Axboe, linux-block
A hung-task warning was reported when using dm-integrity on top of a loop
device on XFS on a rotational disk. The warning was triggered because
a flush on the loop device was too slow.

There's no easy way to reduce the latency, so I made a patch that shuts
the warning up.

There's already a function, blk_wait_io, that avoids the hung task warning.
This commit moves this function from block/blk.h to
include/linux/completion.h, renames it to wait_for_completion_long_io
(because it does not depend on the block layer at all) and uses it in
dm-io instead of wait_for_completion_io.

[ 1352.586981] INFO: task kworker/1:2:14820 blocked for more than 120 seconds.
[ 1352.593951] Not tainted 4.18.0-552.el8_10.x86_64 #1
[ 1352.599358] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1352.607202] Call Trace:
[ 1352.609670] __schedule+0x2d1/0x870
[ 1352.613173] ? update_load_avg+0x7e/0x710
[ 1352.617193] ? update_load_avg+0x7e/0x710
[ 1352.621214] schedule+0x55/0xf0
[ 1352.624371] schedule_timeout+0x281/0x320
[ 1352.628393] ? __schedule+0x2d9/0x870
[ 1352.632065] io_schedule_timeout+0x19/0x40
[ 1352.636176] wait_for_completion_io+0x96/0x100
[ 1352.640639] sync_io+0xcc/0x120 [dm_mod]
[ 1352.644592] dm_io+0x209/0x230 [dm_mod]
[ 1352.648436] ? bit_wait_timeout+0xa0/0xa0
[ 1352.652461] ? vm_next_page+0x20/0x20 [dm_mod]
[ 1352.656924] ? km_get_page+0x60/0x60 [dm_mod]
[ 1352.661298] dm_bufio_issue_flush+0xa0/0xd0 [dm_bufio]
[ 1352.666448] dm_bufio_write_dirty_buffers+0x1a0/0x1e0 [dm_bufio]
[ 1352.672462] dm_integrity_flush_buffers+0x32/0x140 [dm_integrity]
[ 1352.678567] ? lock_timer_base+0x67/0x90
[ 1352.682505] ? __timer_delete.part.36+0x5c/0x90
[ 1352.687050] integrity_commit+0x31a/0x330 [dm_integrity]
[ 1352.692368] ? __switch_to+0x10c/0x430
[ 1352.696131] process_one_work+0x1d3/0x390
[ 1352.700152] ? process_one_work+0x390/0x390
[ 1352.704348] worker_thread+0x30/0x390
[ 1352.708019] ? process_one_work+0x390/0x390
[ 1352.712214] kthread+0x134/0x150
[ 1352.715459] ? set_kthread_struct+0x50/0x50
[ 1352.719659] ret_from_fork+0x1f/0x40
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
---
 block/bio.c                |    2 +-
 block/blk-mq.c             |    2 +-
 block/blk.h                |   12 ------------
 drivers/md/dm-io.c         |    2 +-
 include/linux/completion.h |   17 +++++++++++++++++
 5 files changed, 20 insertions(+), 15 deletions(-)

Index: linux-2.6/block/blk.h
===================================================================
--- linux-2.6.orig/block/blk.h 2024-04-15 15:54:22.000000000 +0200
+++ linux-2.6/block/blk.h 2024-04-15 15:54:21.000000000 +0200
@@ -72,18 +72,6 @@ static inline int bio_queue_enter(struct
 	return __bio_queue_enter(q, bio);
 }
 
-static inline void blk_wait_io(struct completion *done)
-{
-	/* Prevent hang_check timer from firing at us during very long I/O */
-	unsigned long timeout = sysctl_hung_task_timeout_secs * HZ / 2;
-
-	if (timeout)
-		while (!wait_for_completion_io_timeout(done, timeout))
-			;
-	else
-		wait_for_completion_io(done);
-}
-
 #define BIO_INLINE_VECS 4
 struct bio_vec *bvec_alloc(mempool_t *pool, unsigned short *nr_vecs,
 		gfp_t gfp_mask);
Index: linux-2.6/drivers/md/dm-io.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-io.c 2024-04-15 15:54:22.000000000 +0200
+++ linux-2.6/drivers/md/dm-io.c 2024-04-15 15:54:21.000000000 +0200
@@ -450,7 +450,7 @@ static int sync_io(struct dm_io_client *
 
 	dispatch_io(opf, num_regions, where, dp, io, 1, ioprio);
 
-	wait_for_completion_io(&sio.wait);
+	wait_for_completion_long_io(&sio.wait);
 
 	if (error_bits)
 		*error_bits = sio.error_bits;
Index: linux-2.6/include/linux/completion.h
===================================================================
--- linux-2.6.orig/include/linux/completion.h 2024-04-15 15:54:22.000000000 +0200
+++ linux-2.6/include/linux/completion.h 2024-04-15 15:57:14.000000000 +0200
@@ -10,6 +10,7 @@
  */
 
 #include <linux/swait.h>
+#include <linux/sched/sysctl.h>
 
 /*
  * struct completion - structure used to maintain state for a "completion"
@@ -119,4 +120,20 @@ extern void complete(struct completion *
 extern void complete_on_current_cpu(struct completion *x);
 extern void complete_all(struct completion *);
 
+/**
+ * wait_for_completion_long_io - this is like wait_for_completion_io,
+ * but it doesn't warn if the wait takes too long.
+ */
+static inline void wait_for_completion_long_io(struct completion *done)
+{
+	/* Prevent hang_check timer from firing at us during very long I/O */
+	unsigned long timeout = sysctl_hung_task_timeout_secs * HZ / 2;
+
+	if (timeout)
+		while (!wait_for_completion_io_timeout(done, timeout))
+			;
+	else
+		wait_for_completion_io(done);
+}
+
 #endif
Index: linux-2.6/block/bio.c
===================================================================
--- linux-2.6.orig/block/bio.c 2024-03-30 20:07:03.000000000 +0100
+++ linux-2.6/block/bio.c 2024-04-15 15:55:13.000000000 +0200
@@ -1378,7 +1378,7 @@ int submit_bio_wait(struct bio *bio)
 	bio->bi_end_io = submit_bio_wait_endio;
 	bio->bi_opf |= REQ_SYNC;
 	submit_bio(bio);
-	blk_wait_io(&done);
+	wait_for_completion_long_io(&done);
 
 	return blk_status_to_errno(bio->bi_status);
 }
Index: linux-2.6/block/blk-mq.c
===================================================================
--- linux-2.6.orig/block/blk-mq.c 2024-03-30 20:07:03.000000000 +0100
+++ linux-2.6/block/blk-mq.c 2024-04-15 15:55:05.000000000 +0200
@@ -1407,7 +1407,7 @@ blk_status_t blk_execute_rq(struct reque
 	if (blk_rq_is_poll(rq))
 		blk_rq_poll_completion(rq, &wait.done);
 	else
-		blk_wait_io(&wait.done);
+		wait_for_completion_long_io(&wait.done);
 
 	return wait.ret;
 }
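
The chunked wait in wait_for_completion_long_io can be illustrated with a small userspace simulation. This is only a sketch under stated assumptions: SIM_HZ, struct sim_completion, sim_wait_timeout and sim_wait_long are hypothetical names that model HZ, a completion, wait_for_completion_io_timeout and the new helper with a simulated clock. Nothing here is kernel code, and the real hung-task detector is not modelled.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the kernel's HZ (jiffies per second). */
#define SIM_HZ 1000UL

/* Simulated completion: it is "done" once the clock reaches done_at. */
struct sim_completion {
	unsigned long now;           /* simulated clock, in jiffies */
	unsigned long done_at;       /* jiffy at which the I/O completes */
	unsigned long wakeups;       /* how many times the waiter woke up */
	unsigned long longest_sleep; /* longest single sleep, in jiffies */
};

/*
 * Model of a timed wait: returns 0 if the timeout expired first,
 * otherwise a nonzero value (the remaining jiffies, at least 1).
 */
unsigned long sim_wait_timeout(struct sim_completion *c, unsigned long timeout)
{
	unsigned long left = c->done_at > c->now ? c->done_at - c->now : 0;
	unsigned long slept = left < timeout ? left : timeout;

	assert(timeout > 0);
	c->now += slept;
	c->wakeups++;
	if (slept > c->longest_sleep)
		c->longest_sleep = slept;
	if (left > timeout)
		return 0; /* timed out, the I/O is still pending */
	return timeout - slept ? timeout - slept : 1;
}

/*
 * Model of the helper in the patch: sleep in chunks of half the
 * hung-task timeout; a timeout of 0 means the check is disabled, so
 * one unbounded wait is enough. Returns the longest single sleep.
 */
unsigned long sim_wait_long(struct sim_completion *c,
			    unsigned long hung_task_timeout_secs)
{
	unsigned long timeout = hung_task_timeout_secs * SIM_HZ / 2;

	if (timeout)
		while (!sim_wait_timeout(c, timeout))
			;
	else
		sim_wait_timeout(c, ULONG_MAX);
	return c->longest_sleep;
}
```

With a 120 second hung-task timeout and an I/O that takes 300 simulated seconds, sim_wait_long wakes up five times and never sleeps longer than 60 seconds at a stretch, which is the property the patch relies on to stay below the warning threshold.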
* Re: [PATCH v3] dm-io: don't warn if flush takes too long time
2024-04-17 9:05 [PATCH v3] dm-io: don't warn if flush takes too long time Mikulas Patocka
@ 2024-04-17 14:20 ` Jens Axboe
2024-04-17 16:31 ` Mikulas Patocka
2024-04-17 14:50 ` Christoph Hellwig
1 sibling, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2024-04-17 14:20 UTC
To: Mikulas Patocka, Mike Snitzer, Damien Le Moal
Cc: Guangwu Zhang, dm-devel, linux-block
On 4/17/24 3:05 AM, Mikulas Patocka wrote:
> There was reported hang warning when using dm-integrity on the top of loop
> device on XFS on a rotational disk. The warning was triggered because
> flush on the loop device was too slow.
>
> There's no easy way to reduce the latency, so I made a patch that shuts
> the warning up.
>
> There's already a function blk_wait_io that avoids the hung task warning.
> This commit moves this function from block/blk.h to
> include/linux/completion.h, renames it to wait_for_completion_long_io
> (because it is not dependent on the block layer at all) and uses it in
> dm-io instead of wait_for_completion_io.
Change looks fine to me, but while at it, let's just move it into
blk-core.c and make it public, no need for this function to be a static
inline.
--
Jens Axboe
* Re: [PATCH v3] dm-io: don't warn if flush takes too long time
2024-04-17 9:05 [PATCH v3] dm-io: don't warn if flush takes too long time Mikulas Patocka
2024-04-17 14:20 ` Jens Axboe
@ 2024-04-17 14:50 ` Christoph Hellwig
1 sibling, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2024-04-17 14:50 UTC
To: Mikulas Patocka
Cc: Mike Snitzer, Damien Le Moal, Guangwu Zhang, dm-devel, Jens Axboe,
linux-block, Peter Zijlstra, Thomas Gleixner, linux-kernel
> --- linux-2.6.orig/include/linux/completion.h 2024-04-15 15:54:22.000000000 +0200
> +++ linux-2.6/include/linux/completion.h 2024-04-15 15:57:14.000000000 +0200
> @@ -10,6 +10,7 @@
> */
>
> #include <linux/swait.h>
> +#include <linux/sched/sysctl.h>
If you're touching completion.h, you need to CC lkml and the people
who wrote/maintain it, even if we don't have a proper maintainer.

I don't think adding yet another include into it is a good idea.
As is this whole hack here. Please just add the proper TASK_STATE for
tasks that can legitimately sleep for very long times instead of
extending this hack again and again, just like I told Kent when he
messed with the timeout.
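
A toy model may make the alternative suggested here easier to picture: instead of waking the waiter periodically, the sleeper carries a marker in its task state and the hung-task check skips it. This is a sketch under assumptions only. SIM_TASK_LONG_SLEEP, struct sim_task and sim_should_warn are hypothetical names invented for illustration; no such state is claimed to exist in the kernel, and the real detector's logic is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical task-state bits for this model only. */
#define SIM_TASK_UNINTERRUPTIBLE 0x1u
#define SIM_TASK_LONG_SLEEP      0x2u /* hypothetical "may sleep very long" marker */

struct sim_task {
	unsigned int state;         /* combination of the bits above */
	unsigned long blocked_secs; /* time spent blocked without running */
};

/*
 * Model of a hung-task check: warn only for uninterruptible sleepers
 * that exceeded the timeout and did not declare a legitimately long
 * sleep. A timeout of 0 disables the check entirely.
 */
bool sim_should_warn(const struct sim_task *t, unsigned long timeout_secs)
{
	assert(t != NULL);
	if (!timeout_secs)
		return false;
	if (!(t->state & SIM_TASK_UNINTERRUPTIBLE))
		return false;
	if (t->state & SIM_TASK_LONG_SLEEP)
		return false;
	return t->blocked_secs > timeout_secs;
}
```

In this model a task blocked for 300 seconds triggers the warning with a 120 second timeout, while the same task with the hypothetical marker set does not, and it needs no periodic wakeups to stay quiet.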
>
> /*
> * struct completion - structure used to maintain state for a "completion"
> @@ -119,4 +120,20 @@ extern void complete(struct completion *
> extern void complete_on_current_cpu(struct completion *x);
> extern void complete_all(struct completion *);
>
> +/**
> + * wait_for_completion_long_io - this is like wait_for_completion_io,
> + * but it doesn't warn if the wait takes too long.
> + */
> +static inline void wait_for_completion_long_io(struct completion *done)
> +{
> + /* Prevent hang_check timer from firing at us during very long I/O */
> + unsigned long timeout = sysctl_hung_task_timeout_secs * HZ / 2;
> +
> + if (timeout)
> + while (!wait_for_completion_io_timeout(done, timeout))
> + ;
> + else
> + wait_for_completion_io(done);
> +}
> +
> #endif
> Index: linux-2.6/block/bio.c
> ===================================================================
> --- linux-2.6.orig/block/bio.c 2024-03-30 20:07:03.000000000 +0100
> +++ linux-2.6/block/bio.c 2024-04-15 15:55:13.000000000 +0200
> @@ -1378,7 +1378,7 @@ int submit_bio_wait(struct bio *bio)
> bio->bi_end_io = submit_bio_wait_endio;
> bio->bi_opf |= REQ_SYNC;
> submit_bio(bio);
> - blk_wait_io(&done);
> + wait_for_completion_long_io(&done);
>
> return blk_status_to_errno(bio->bi_status);
> }
> Index: linux-2.6/block/blk-mq.c
> ===================================================================
> --- linux-2.6.orig/block/blk-mq.c 2024-03-30 20:07:03.000000000 +0100
> +++ linux-2.6/block/blk-mq.c 2024-04-15 15:55:05.000000000 +0200
> @@ -1407,7 +1407,7 @@ blk_status_t blk_execute_rq(struct reque
> if (blk_rq_is_poll(rq))
> blk_rq_poll_completion(rq, &wait.done);
> else
> - blk_wait_io(&wait.done);
> + wait_for_completion_long_io(&wait.done);
>
> return wait.ret;
> }
>
>
---end quoted text---
* Re: [PATCH v3] dm-io: don't warn if flush takes too long time
2024-04-17 14:20 ` Jens Axboe
@ 2024-04-17 16:31 ` Mikulas Patocka
0 siblings, 0 replies; 4+ messages in thread
From: Mikulas Patocka @ 2024-04-17 16:31 UTC
To: Jens Axboe
Cc: Mike Snitzer, Damien Le Moal, Guangwu Zhang, dm-devel,
linux-block
On Wed, 17 Apr 2024, Jens Axboe wrote:
> On 4/17/24 3:05 AM, Mikulas Patocka wrote:
> > There was reported hang warning when using dm-integrity on the top of loop
> > device on XFS on a rotational disk. The warning was triggered because
> > flush on the loop device was too slow.
> >
> > There's no easy way to reduce the latency, so I made a patch that shuts
> > the warning up.
> >
> > There's already a function blk_wait_io that avoids the hung task warning.
> > This commit moves this function from block/blk.h to
> > include/linux/completion.h, renames it to wait_for_completion_long_io
> > (because it is not dependent on the block layer at all) and uses it in
> > dm-io instead of wait_for_completion_io.
>
> Change looks fine to me, but while at it, let's just move it into
> blk-core.c and make it public, no need for this function to be a static
> inline.
>
> --
> Jens Axboe
I think we should move it to kernel/sched/completion.c, because the
function has no dependency on the block layer.

I'll send a patch that does it.

Mikulas