From: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
To: "Jason-JH.Lin" <jason-jh.lin@mediatek.com>,
Chun-Kuang Hu <chunkuang.hu@kernel.org>,
Jassi Brar <jassisinghbrar@gmail.com>
Cc: Alexandre Mergnat <amergnat@baylibre.com>,
Jason-ch Chen <jason-ch.chen@mediatek.com>,
Johnson Wang <johnson.wang@mediatek.com>,
Singo Chang <singo.chang@mediatek.com>,
Nancy Lin <nancy.lin@mediatek.com>,
Shawn Sung <shawn.sung@mediatek.com>,
dri-devel@lists.freedesktop.org,
linux-mediatek@lists.infradead.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org,
Project_Global_Chrome_Upstream_Group@mediatek.com,
Fei Shao <fshao@chromium.org>
Subject: Re: [PATCH v2] mailbox: mtk-cmdq: Fix sleeping function called from invalid context
Date: Wed, 8 May 2024 14:44:33 +0200 [thread overview]
Message-ID: <5de7d6af-54ac-49b7-b5c2-fa9381310288@collabora.com> (raw)
In-Reply-To: <20240508095143.12023-1-jason-jh.lin@mediatek.com>
Il 08/05/24 11:51, Jason-JH.Lin ha scritto:
> When we run kernel with lockdebug option, we will get the BUG below:
> [ 106.692124] BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1164
> [ 106.692190] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 3616, name: kworker/u17:3
> [ 106.692226] preempt_count: 1, expected: 0
> [ 106.692254] RCU nest depth: 0, expected: 0
> [ 106.692282] INFO: lockdep is turned off.
> [ 106.692306] irq event stamp: 0
> [ 106.692331] hardirqs last enabled at (0): [<0000000000000000>] 0x0
> [ 106.692376] hardirqs last disabled at (0): [<ffffffee15d37fa0>] copy_process+0xc90/0x2ac0
> [ 106.692429] softirqs last enabled at (0): [<ffffffee15d37fc4>] copy_process+0xcb4/0x2ac0
> [ 106.692473] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 106.692513] CPU: 1 PID: 3616 Comm: kworker/u17:3 Not tainted 6.1.87-lockdep-14133-g26e933aca785 #1 6839942e1cf34914b0a366137843dd2366f52aa9
> [ 106.692556] Hardware name: Google Ciri sku0/unprovisioned board (DT)
> [ 106.692586] Workqueue: imgsys_runner imgsys_runner_func
> [ 106.692638] Call trace:
> [ 106.692662] dump_backtrace+0x100/0x120
> [ 106.692702] show_stack+0x20/0x2c
> [ 106.692737] dump_stack_lvl+0x84/0xb4
> [ 106.692775] dump_stack+0x18/0x48
> [ 106.692809] __might_resched+0x354/0x4c0
> [ 106.692847] __might_sleep+0x98/0xe4
> [ 106.692883] __pm_runtime_resume+0x70/0x124
> [ 106.692921] cmdq_mbox_send_data+0xe4/0xb1c
> [ 106.692964] msg_submit+0x194/0x2dc
> [ 106.693003] mbox_send_message+0x190/0x330
> [ 106.693043] imgsys_cmdq_sendtask+0x1618/0x2224
> [ 106.693082] imgsys_runner_func+0xac/0x11c
> [ 106.693118] process_one_work+0x638/0xf84
> [ 106.693158] worker_thread+0x808/0xcd0
> [ 106.693196] kthread+0x24c/0x324
> [ 106.693231] ret_from_fork+0x10/0x20
>
> We found that there is a spin_lock_irqsave protection in msg_submit()
> of mailbox.c and it is in the atomic context.
> So when cmdq driver calls pm_runtime_get_sync() in cmdq_mbox_send_data(),
> it will get this BUG report.
>
> 1. Change pm_runtime_get_sync() to pm_runtime_get() to avoid using sleep
> in atomic context.
> 2. Move clk_bulk_enable() outside cmdq_runtime_resume() to ensure GCE
> clocks are enabled before configuring GCE register.
> 3. Add used_count to avoid cmdq_runtime_suspend() being called before
> calling cmdq_runtime_resume().
>
> Fixes: 8afe816b0c99 ("mailbox: mtk-cmdq-mailbox: Implement Runtime PM with autosuspend")
> Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com>
> ---
> drivers/mailbox/mtk-cmdq-mailbox.c | 24 +++++++++++++++---------
> 1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c
> index 033aff11f87c..b50f42e69aab 100644
> --- a/drivers/mailbox/mtk-cmdq-mailbox.c
> +++ b/drivers/mailbox/mtk-cmdq-mailbox.c
> @@ -82,6 +82,7 @@ struct cmdq {
> const struct gce_plat *pdata;
> struct cmdq_thread *thread;
> struct clk_bulk_data clocks[CMDQ_GCE_NUM_MAX];
> + atomic_t used_count;
> bool suspended;
> };
>
> @@ -317,14 +318,21 @@ static int cmdq_runtime_resume(struct device *dev)
> {
> struct cmdq *cmdq = dev_get_drvdata(dev);
>
> - return clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks);
> + atomic_inc(&cmdq->used_count);
> + return 0;
> }
>
> static int cmdq_runtime_suspend(struct device *dev)
> {
> struct cmdq *cmdq = dev_get_drvdata(dev);
>
> + if (atomic_read(&cmdq->used_count) == 0) {
> + dev_warn(dev, "%s when used_count is 0!", __func__);
> + return -EINVAL;
> + }
> +
> clk_bulk_disable(cmdq->pdata->gce_num, cmdq->clocks);
> + atomic_dec(&cmdq->used_count);
> return 0;
> }
>
> @@ -392,9 +400,8 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
> /* Client should not flush new tasks if suspended. */
> WARN_ON(cmdq->suspended);
>
> - ret = pm_runtime_get_sync(cmdq->mbox.dev);
> - if (ret < 0)
> - return ret;
> + WARN_ON(pm_runtime_get(cmdq->mbox.dev) < 0);
That's a bit sketchy, and I'm afraid that this will break in some other ways.
We could - again - simply change the msg_submit() function in mailbox.c, so that
it takes into account that a driver may need PM done.
A low effort example (which may be good enough or not) is:
static void msg_submit(struct mbox_chan *chan)
{
unsigned count, idx;
unsigned long flags;
void *data;
int err;
if (chan->mbox->ops->pm_off) {
err = chan->mbox->ops->pm_on();
if (err)
return err;
}
spin_lock_irqsave(&chan->lock, flags);
if (!chan->msg_count || chan->active_req) {
err = -EBUSY;
goto exit;
}
......
exit:
spin_unlock_irqrestore(&chan->lock, flags);
if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
/* kick start the timer immediately to avoid delays */
spin_lock_irqsave(&chan->mbox->poll_hrt_lock, flags);
hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
spin_unlock_irqrestore(&chan->mbox->poll_hrt_lock, flags);
}
/* I guess setting it OFF can't fail anyway, so this would be a void function */
if (chan->mbox->ops->pm_off)
chan->mbox->ops->pm_off();
}
Then we can wire up the two functions in the MediaTek driver - where our mailbox
only simply needs pm_runtime_get_sync() and pm_runtime_put().
The reason why I'm suggesting a callback is that this would catch the case of
mailboxes that need "more complicated" flows for ON/OFF.
Jassi, do you like the idea?
If you do, I can eventually go for a real commit with better names than pm_off/on
if I can find any - so that then Jason can go on with his mtk-cmdq fix on top of
that.
Cheers,
Angelo
> + WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
>
> task = kzalloc(sizeof(*task), GFP_ATOMIC);
> if (!task) {
> @@ -465,7 +472,8 @@ static void cmdq_mbox_shutdown(struct mbox_chan *chan)
> struct cmdq_task *task, *tmp;
> unsigned long flags;
>
> - WARN_ON(pm_runtime_get_sync(cmdq->mbox.dev) < 0);
> + WARN_ON(pm_runtime_get(cmdq->mbox.dev) < 0);
> + WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
>
> spin_lock_irqsave(&thread->chan->lock, flags);
> if (list_empty(&thread->task_busy_list))
> @@ -507,11 +515,9 @@ static int cmdq_mbox_flush(struct mbox_chan *chan, unsigned long timeout)
> struct cmdq_task *task, *tmp;
> unsigned long flags;
> u32 enable;
> - int ret;
>
> - ret = pm_runtime_get_sync(cmdq->mbox.dev);
> - if (ret < 0)
> - return ret;
> + WARN_ON(pm_runtime_get(cmdq->mbox.dev) < 0);
> + WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));
>
> spin_lock_irqsave(&thread->chan->lock, flags);
> if (list_empty(&thread->task_busy_list))
next prev parent reply other threads:[~2024-05-08 12:44 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-08 9:51 [PATCH v2] mailbox: mtk-cmdq: Fix sleeping function called from invalid context Jason-JH.Lin
2024-05-08 12:44 ` AngeloGioacchino Del Regno [this message]
2024-05-08 20:36 ` Krzysztof Kozlowski
2024-05-10 3:34 ` Jason-JH Lin (林睿祥)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5de7d6af-54ac-49b7-b5c2-fa9381310288@collabora.com \
--to=angelogioacchino.delregno@collabora.com \
--cc=Project_Global_Chrome_Upstream_Group@mediatek.com \
--cc=amergnat@baylibre.com \
--cc=chunkuang.hu@kernel.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=fshao@chromium.org \
--cc=jason-ch.chen@mediatek.com \
--cc=jason-jh.lin@mediatek.com \
--cc=jassisinghbrar@gmail.com \
--cc=johnson.wang@mediatek.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mediatek@lists.infradead.org \
--cc=nancy.lin@mediatek.com \
--cc=shawn.sung@mediatek.com \
--cc=singo.chang@mediatek.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).