dri-devel Archive mirror
 help / color / mirror / Atom feed
From: Krzysztof Kozlowski <krzk@kernel.org>
To: "Jason-JH.Lin" <jason-jh.lin@mediatek.com>,
	Chun-Kuang Hu <chunkuang.hu@kernel.org>
Cc: AngeloGioacchino Del Regno
	<angelogioacchino.delregno@collabora.com>,
	Alexandre Mergnat <amergnat@baylibre.com>,
	Jason-ch Chen <jason-ch.chen@mediatek.com>,
	Johnson Wang <johnson.wang@mediatek.com>,
	Singo Chang <singo.chang@mediatek.com>,
	Nancy Lin <nancy.lin@mediatek.com>,
	Shawn Sung <shawn.sung@mediatek.com>,
	dri-devel@lists.freedesktop.org,
	linux-mediatek@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	Project_Global_Chrome_Upstream_Group@mediatek.com,
	Fei Shao <fshao@chromium.org>
Subject: Re: [PATCH v2] mailbox: mtk-cmdq: Fix sleeping function called from invalid context
Date: Wed, 8 May 2024 22:36:20 +0200	[thread overview]
Message-ID: <74767f1d-639a-4477-a38b-e5f15bf575df@kernel.org> (raw)
In-Reply-To: <20240508095143.12023-1-jason-jh.lin@mediatek.com>

On 08/05/2024 11:51, Jason-JH.Lin wrote:
> When we run kernel with lockdebug option, we will get the BUG below:
> [  106.692124] BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1164
> [  106.692190] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 3616, name: kworker/u17:3

Trim logs from irrelevant context. timestamps are useless his. Registers
as well. Addresses of stacktrace as well and better just decode it.


> [  106.692226] preempt_count: 1, expected: 0
> [  106.692254] RCU nest depth: 0, expected: 0
> [  106.692282] INFO: lockdep is turned off.
> [  106.692306] irq event stamp: 0
> [  106.692331] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> [  106.692376] hardirqs last disabled at (0): [<ffffffee15d37fa0>] copy_process+0xc90/0x2ac0
> [  106.692429] softirqs last  enabled at (0): [<ffffffee15d37fc4>] copy_process+0xcb4/0x2ac0
> [  106.692473] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [  106.692513] CPU: 1 PID: 3616 Comm: kworker/u17:3 Not tainted 6.1.87-lockdep-14133-g26e933aca785 #1 6839942e1cf34914b0a366137843dd2366f52aa9
> [  106.692556] Hardware name: Google Ciri sku0/unprovisioned board (DT)
> [  106.692586] Workqueue: imgsys_runner imgsys_runner_func
> [  106.692638] Call trace:
> [  106.692662]  dump_backtrace+0x100/0x120
> [  106.692702]  show_stack+0x20/0x2c
> [  106.692737]  dump_stack_lvl+0x84/0xb4
> [  106.692775]  dump_stack+0x18/0x48
> [  106.692809]  __might_resched+0x354/0x4c0
> [  106.692847]  __might_sleep+0x98/0xe4
> [  106.692883]  __pm_runtime_resume+0x70/0x124
> [  106.692921]  cmdq_mbox_send_data+0xe4/0xb1c
> [  106.692964]  msg_submit+0x194/0x2dc
> [  106.693003]  mbox_send_message+0x190/0x330
> [  106.693043]  imgsys_cmdq_sendtask+0x1618/0x2224
> [  106.693082]  imgsys_runner_func+0xac/0x11c
> [  106.693118]  process_one_work+0x638/0xf84
> [  106.693158]  worker_thread+0x808/0xcd0
> [  106.693196]  kthread+0x24c/0x324
> [  106.693231]  ret_from_fork+0x10/0x20
> 
> We found that there is a spin_lock_irqsave protection in msg_submit()
> of mailbox.c and it is in the atomic context.
> So when cmdq driver calls pm_runtime_get_sync() in cmdq_mbox_send_data(),
> it will get this BUG report.
> 
> 1. Change pm_runtime_get_sync() to pm_runtime_get() to avoid using sleep
>    in atomic context.
> 2. Move clk_bulk_enable() outside cmdq_runtime_resume() to ensure GCE
>    clocks are enabled before configuring GCE register.
> 3. Add used_count to avoid cmdq_runtime_suspend() being called before
>    calling cmdq_runtime_resume().
> 
> Fixes: 8afe816b0c99 ("mailbox: mtk-cmdq-mailbox: Implement Runtime PM with autosuspend")
> Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com>
> ---
>  drivers/mailbox/mtk-cmdq-mailbox.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c
> index 033aff11f87c..b50f42e69aab 100644
> --- a/drivers/mailbox/mtk-cmdq-mailbox.c
> +++ b/drivers/mailbox/mtk-cmdq-mailbox.c
> @@ -82,6 +82,7 @@ struct cmdq {
>  	const struct gce_plat	*pdata;
>  	struct cmdq_thread	*thread;
>  	struct clk_bulk_data	clocks[CMDQ_GCE_NUM_MAX];
> +	atomic_t		used_count;
>  	bool			suspended;
>  };
>  
> @@ -317,14 +318,21 @@ static int cmdq_runtime_resume(struct device *dev)
>  {
>  	struct cmdq *cmdq = dev_get_drvdata(dev);
>  
> -	return clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks);
> +	atomic_inc(&cmdq->used_count);

Do not implement own runtime PM counter...

> +	return 0;
>  }
>  
>  static int cmdq_runtime_suspend(struct device *dev)
>  {
>  	struct cmdq *cmdq = dev_get_drvdata(dev);
>  
> +	if (atomic_read(&cmdq->used_count) == 0) {
> +		dev_warn(dev, "%s when used_count is 0!", __func__);
> +		return -EINVAL;
> +	}
> +
>  	clk_bulk_disable(cmdq->pdata->gce_num, cmdq->clocks);
> +	atomic_dec(&cmdq->used_count);
>  	return 0;
>  }
>  
> @@ -392,9 +400,8 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
>  	/* Client should not flush new tasks if suspended. */
>  	WARN_ON(cmdq->suspended);
>  
> -	ret = pm_runtime_get_sync(cmdq->mbox.dev);
> -	if (ret < 0)
> -		return ret;
> +	WARN_ON(pm_runtime_get(cmdq->mbox.dev) < 0);
> +	WARN_ON(clk_bulk_enable(cmdq->pdata->gce_num, cmdq->clocks));

That's a no... First, you remove error handling. Second, you add WARN
and code should not have WARNs for error handling.

All this looks like terrible hacky workaround just to make warning go
away, without actually addressing real problem.

Best regards,
Krzysztof


  parent reply	other threads:[~2024-05-08 20:36 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-08  9:51 [PATCH v2] mailbox: mtk-cmdq: Fix sleeping function called from invalid context Jason-JH.Lin
2024-05-08 12:44 ` AngeloGioacchino Del Regno
2024-05-08 20:36 ` Krzysztof Kozlowski [this message]
2024-05-10  3:34   ` Jason-JH Lin (林睿祥)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=74767f1d-639a-4477-a38b-e5f15bf575df@kernel.org \
    --to=krzk@kernel.org \
    --cc=Project_Global_Chrome_Upstream_Group@mediatek.com \
    --cc=amergnat@baylibre.com \
    --cc=angelogioacchino.delregno@collabora.com \
    --cc=chunkuang.hu@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=fshao@chromium.org \
    --cc=jason-ch.chen@mediatek.com \
    --cc=jason-jh.lin@mediatek.com \
    --cc=johnson.wang@mediatek.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=nancy.lin@mediatek.com \
    --cc=shawn.sung@mediatek.com \
    --cc=singo.chang@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).