patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Stephen Boyd <sboyd@kernel.org>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Michael Turquette <mturquette@baylibre.com>,
	linux-kernel@vger.kernel.org, linux-clk@vger.kernel.org,
	patches@lists.linux.dev, linux-arm-msm@vger.kernel.org,
	Douglas Anderson <dianders@chromium.org>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Krzysztof Kozlowski <krzk@kernel.org>
Subject: Re: [PATCH v2 4/5] clk: Get runtime PM before walking tree during disable_unused
Date: Tue, 09 Apr 2024 04:55:16 -0700	[thread overview]
Message-ID: <306af49fd69707feccb640d02dc4fbc5.sboyd@kernel.org> (raw)
In-Reply-To: <CAPDyKFoGxxSzykQ-=o08LD_P_=8m=KRm4SHK_grBFgXNNv4OVw@mail.gmail.com>

Quoting Ulf Hansson (2024-04-09 03:32:04)
> 
> Apologies for not being able to review this, it got lost in my email
> filters. Looks like you manage to solve the locking order for the clk
> disable unused thing - great!
> 
> However I think the main problem we are seeing with these kind of
> locking issues is that we are holding a global lock while calling into
> pm_runtime_get|put*(). Similar problems have also been reported in the
> past. It's been on my todo list for quite some time to have a closer
> look, but I haven't reached it yet.
> 
> Without going into too much detail, let me just ask a related
> question. Would it not be possible to call pm_runtime_get/put() within
> the clock framework, without *always* keeping the clock prepare lock
> acquired? I assume a clock can't be unregistered, as long as there is
> reference taken for it, right? Wouldn't that be a sufficient guarantee
> that it's okay to runtime_resume|suspend its corresponding device?

The problem is that the clk tree is being walked with the prepare_lock
held and during that walk pm_runtime_get() is called on different
devices. We hold the prepare_lock while walking the tree because we
don't want the tree topology to change while walking. The prepare_lock
is also a topology lock.

If we could walk the tree, figure out what all clks will change, drop
the prepare_lock, pm_runtime_get() all of those devices, regrab the
prepare_lock, check the topology to make sure topology hasn't changed,
and then make all the clk_ops calls, it would work because we have
extracted the runtime PM calls out of the prepare_lock. Dropping and
regrabbing the lock is probably a bad idea though, because we may never
make progress because we're preempted by another task that changes the
topology.

I was also wondering if we get stuck again if the clk driver
implementing the clk_ops is on some bus like i2c or spi, that runtime PM
resumes the bus controller for register writes from the clk_ops, and
that resume function calls clk operations, and that happens in another
thread. Maybe that isn't a problem though because the runtime resume of
the device will also runtime resume the parent which is spi or i2c? 

Either way, it really seems like we need a per-clk prepare_lock. That
would let us know for sure the topology isn't changing while we walk the
tree and figure out what we're going to do. Anything that calls into the
clk framework again hopefully gets a different prepare lock for a
different clk.

BTW, I don't think lockdep is aware that runtime PM is waiting like
this for a parallel resume in another thread. Probably we need to add
annotations so that any locks held while waiting for the resume or
suspend to finish are chained to a pseudo read lock and the actual
resume/suspend operation is a pseudo write lock.

> 
> Or maybe I should just send a patch. :-)
> 

Sure! ;-)

  reply	other threads:[~2024-04-09 11:55 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-25 18:41 [PATCH v2 0/5] Fix a deadlock with clk_pm_runtime_get() Stephen Boyd
2024-03-25 18:41 ` [PATCH v2 1/5] clk: Remove prepare_lock hold assertion in __clk_release() Stephen Boyd
2024-04-08  2:35   ` Stephen Boyd
2024-03-25 18:41 ` [PATCH v2 2/5] clk: Don't hold prepare_lock when calling kref_put() Stephen Boyd
2024-04-08  2:35   ` Stephen Boyd
2024-03-25 18:41 ` [PATCH v2 3/5] clk: Initialize struct clk_core kref earlier Stephen Boyd
2024-04-08  2:36   ` Stephen Boyd
2024-03-25 18:41 ` [PATCH v2 4/5] clk: Get runtime PM before walking tree during disable_unused Stephen Boyd
2024-03-25 19:39   ` Doug Anderson
2024-04-08  2:36   ` Stephen Boyd
2024-04-09 10:32     ` Ulf Hansson
2024-04-09 11:55       ` Stephen Boyd [this message]
2024-03-25 18:41 ` [PATCH v2 5/5] clk: Get runtime PM before walking tree for clk_summary Stephen Boyd
2024-03-25 19:39   ` Doug Anderson
2024-04-08  2:37   ` Stephen Boyd
2024-03-25 18:44 ` [PATCH v2 0/5] Fix a deadlock with clk_pm_runtime_get() Stephen Boyd

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=306af49fd69707feccb640d02dc4fbc5.sboyd@kernel.org \
    --to=sboyd@kernel.org \
    --cc=dianders@chromium.org \
    --cc=krzk@kernel.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-clk@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=mturquette@baylibre.com \
    --cc=patches@lists.linux.dev \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).