From: Bjorn Helgaas <helgaas@kernel.org>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Javier Martinez Canillas <javierm@redhat.com>,
	linux-kernel@vger.kernel.org,
	Peter Robinson <pbrobinson@gmail.com>,
	Shawn Lin <shawn.lin@rock-chips.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Heiko Stuebner <heiko@sntech.de>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Rob Herring <robh@kernel.org>,
	linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org,
	linux-rockchip@lists.infradead.org
Subject: Re: [PATCH v2] PCI: rockchip: Avoid accessing PCIe registers with clocks gated
Date: Thu, 24 Jun 2021 18:28:41 -0500
Message-ID: <20210624232841.GA3579021@bjorn-Precision-5520>
In-Reply-To: <44c551d7-fee4-13cf-2929-6d2383dd5497@arm.com>

On Fri, Jun 25, 2021 at 12:18:48AM +0100, Robin Murphy wrote:
> On 2021-06-24 22:57, Bjorn Helgaas wrote:
> > On Tue, Jun 08, 2021 at 10:04:09AM +0200, Javier Martinez Canillas wrote:
> > > IRQ handlers that are registered for shared interrupts can be called at
> > > any time after they have been registered using the request_irq() function.
> > > 
> > > It's up to drivers to ensure that it's always safe for these to be called.
> > > 
> > > Both the "pcie-sys" and "pcie-client" interrupts are shared, but since
> > > their handlers are registered very early in the probe function, an error
> > > later can lead to these handlers being executed before all the required
> > > resources have been properly set up.
> > > 
> > > For example, the rockchip_pcie_read() function used by these IRQ handlers
> > > expects that some PCIe clocks will already be enabled, otherwise trying
> > > to access the PCIe registers causes the read to hang and never return.
> > 
> > The read *never* completes?  That might be a bit problematic because
> > it implies that we may not be able to recover from PCIe errors.  Most
> > controllers will time out eventually, log an error, and either
> > fabricate some data (typically ~0) to complete the CPU's read or cause
> > some kind of abort or machine check.
> > 
> > Just asking in case there's some controller configuration that should
> > be tweaked.
> 
> If I'm following correctly, that'll be a read transaction to the native side
> of the controller itself; it can't complete that read, or do anything else
> either, because it's clock-gated, and thus completely oblivious (it might be
> that if another CPU was able to enable the clocks then everything would
> carry on as normal, or it might end up totally deadlocking the SoC
> interconnect). I think it's safe to assume that in that state nothing of
> importance would be happening on the PCIe side, and even if it was we'd
> never get to know about it.

Oh, right, that makes sense.  I was thinking about the PCIe side, but
if the controller itself isn't working, of course we wouldn't get that
far.

I would expect that the CPU itself would have some kind of timeout for
the read, but that's far outside of the PCI world.

Bjorn
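
The ordering problem discussed in this thread can be sketched as a small
userspace model. This is a minimal sketch, not the driver code or the patch
itself: every name below (model_pcie, model_read, probe_handler_first,
probe_clocks_first, and so on) is hypothetical, and the hang that the real
hardware would exhibit is represented only by a flag. It assumes the two
facts stated in the quoted commit message: a handler on a shared interrupt
line may run as soon as it is registered, and the register read only
completes when the clocks are enabled.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_pcie {
	bool clocks_enabled;	/* stands in for the PCIe clocks being ungated */
	bool would_hang;	/* set when a register is read with clocks gated */
};

typedef void (*model_handler_t)(struct model_pcie *);

/* Handler currently installed on the (modelled) shared interrupt line. */
static model_handler_t shared_handler;

/* Models a register read: on real hardware a gated read never returns. */
static unsigned int model_read(struct model_pcie *p)
{
	if (!p->clocks_enabled) {
		p->would_hang = true;	/* record the hang instead of blocking */
		return 0;
	}
	return 0x1;
}

/* Models an IRQ handler that reads a controller register. */
static void model_irq_handler(struct model_pcie *p)
{
	(void)model_read(p);
}

/* Another device on the shared line can trigger the handler at any time. */
static void model_shared_irq_fires(struct model_pcie *p)
{
	if (shared_handler)
		shared_handler(p);
}

/* Problematic ordering: the handler is registered before clocks are on. */
static bool probe_handler_first(struct model_pcie *p)
{
	shared_handler = model_irq_handler;
	model_shared_irq_fires(p);	/* interrupt arrives while still gated */
	p->clocks_enabled = true;
	return !p->would_hang;
}

/* Safe ordering: clocks are enabled before the handler is registered. */
static bool probe_clocks_first(struct model_pcie *p)
{
	p->clocks_enabled = true;
	shared_handler = model_irq_handler;
	model_shared_irq_fires(p);
	return !p->would_hang;
}
```

In this model, probe_handler_first() reports failure because the handler
touches a register while the clocks are still gated, whereas
probe_clocks_first() succeeds for the same interrupt timing.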
