Linux-ide Archive mirror
 help / color / mirror / Atom feed
From: Phillip Susi <phill@thesusis.net>
To: Damien Le Moal <dlemoal@kernel.org>, linux-ide@vger.kernel.org
Cc: linux-scsi@vger.kernel.org,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	John Garry <john.g.garry@oracle.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>,
	Paul Ausbeck <paula@soe.ucsc.edu>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>,
	Joe Breuer <linux-kernel@jmbreuer.net>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Chia-Lin Kao <acelan.kao@canonical.com>
Subject: Re: [PATCH v8 04/23] scsi: sd: Differentiate system and runtime start/stop management
Date: Thu, 26 Oct 2023 17:21:44 -0400	[thread overview]
Message-ID: <877cn95bbr.fsf@vps.thesusis.net> (raw)
In-Reply-To: <52f44651-ce94-468f-a43b-02d512294fe4@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 1764 bytes --]

Damien Le Moal <dlemoal@kernel.org> writes:

> On 10/21/23 06:23, Phillip Susi wrote:
> Looks like a deadlock somewhere, likely with the device remove that you
> triggered with the "echo 1 > /sys/block/sdf/device/delete".
>
> Can you send the exact list of commands & events you executed to get to that
> point ? Also please share your kernel config.

I wrote auto to /sys/block/sd[ef]/device/power/config and 10000 to
/sys/block/sd[ef]/device/power/auto_suspend_delay_ms, and auto to the
control file of their corresponding ata8 port ( both drives are behind
an PMP in an eSATA dock on ata8 ).  As I said before, it the dmesg
showed that the ata port only suspended after BOTH drives had suspended,
which is as it should be.  The problem seems to be after I echo 1 >
/sys/block/sdf/device/delete, when this happens:

Oct 26 16:43:00 faldara kernel: ata8.15: failed to read PMP GSCR[0] (Emask=0x100)
Oct 26 16:43:00 faldara kernel: ata8.15: PMP revalidation failed (errno=-5)
Oct 26 16:43:05 faldara kernel: ata8.15: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 26 16:43:05 faldara kernel: ata8.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Oct 26 16:43:05 faldara kernel: ata8.01: SATA link up 3.0 Gbps (SStatus
123 SControl 320)

Then I get the hung task.  I reverted the PuiS patch that I have been
working on, and the hang no longer happens, however, the above errors do
still happen.  I think that is a problem in itself, and may or may not
be related to the hang.  I see no reason why the PMP revalidation should
fail after the two disks and the port are runtime pm suspended, but for
some reason, with my patch applied, that then leads to the hang.

Here is my git log showing your two patches applied on the upstream
kernel, plus my patch:


[-- Attachment #2: git log --]
[-- Type: text/plain, Size: 4224 bytes --]

commit bb5db8bb505171fd2b8b67c3f22b16d8355d2a8b
Author: Phillip Susi <phill@thesusis.net>
Date:   Mon Oct 16 17:03:51 2023 -0400

    Olibata: don't start disks on resume
    
    Disks with Power Up In Standby enabled that required the
    SET FEATURES command to start up were being issued the
    command during resume.  Suppress this until the disk
    is actually accessed.

commit 4f1a1a9da4ff833417fa520d097b3f07c20e3c5d
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Thu Oct 12 16:18:00 2023 +0900

    [PATCH 2/2] ata: libata-core: Improve ata_dev_power_set_active()
    
    Improve the function ata_dev_power_set_active() by having it do nothing
    for a disk that is already in the active power state. To do that,
    introduce the function ata_dev_power_is_active() to test the current
    power state of the disk and return true if the disk is in the PM0:
    active or PM1: idle state (0xff value for the count field of the CHECK
    POWER MODE command output).
    
    To preserve the existing behavior, if the CHECK POWER MODE command
    issued in ata_dev_power_is_active() fails, the drive is assumed to be in
    standby mode and false is returned.
    
    With this change, issuing the VERIFY command to access the disk media to
    spin it up becomes unnecessary most of the time during system resume as
    the port reset done by libata-eh on resume often result in the drive to
    spin-up (this behavior is not clearly defined by the ACS specifications
    and may thus vary between disk models).
    
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>

commit bb7e1fcd9e3a207820e4b828e9f4f02986942d8d
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Thu Oct 12 16:17:59 2023 +0900

    ata: libata-eh: Spinup disk on resume after revalidation
    
    Move the call to ata_dev_power_set_active() to transition a disk in
    standby power mode to the active power mode from
    ata_eh_revalidate_and_attach() before doing revalidation to the end of
    ata_eh_recover(), after the link speed for the device is reconfigured
    (if that was necessary). This is safer as this ensure that the VERIFY
    command executed to spinup the disk is executed with the drive properly
    reconfigured first.
    
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>

commit 727fb83765049981e342db4c5a8b51aca72201d8
Merge: 8cb1f10d8c4b 5c15c60e7be6
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Fri Oct 13 23:19:16 2023 -0700

    Merge tag 'input-for-v6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
    
    Pull input fixes from Dmitry Torokhov:
    
     - a reworked way for handling reset delay on SMBus-connected Synaptics
       touchpads (the original one, while being correct, uncovered an old
       bug in fallback to PS/2 code that was fixed separately; the new one
       however avoids having delay in serio port "fast" resume, and instead
       has the wait in the RMI4 code)
    
     - a fix for potential crashes when devices with Elan controllers (and
       Synaptics) fall back to PS/2 code. Can't be hit without the original
       patch above, but still good to have it fixed
    
     - a couple new device IDs in xpad Xbox driver
    
     - another quirk for Goodix driver to deal with stuff vendors put in
       ACPI tables
    
     - a fix for use-after-free on disconnect for powermate driver
    
     - a quirk to not initialize PS/2 mouse port on Fujitsu Lifebook E5411
       laptop as it makes keyboard not usable and the device uses
       hid-over-i2c touchpad anyways
    
    * tag 'input-for-v6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
      Input: powermate - fix use-after-free in powermate_config_complete
      Input: xpad - add PXN V900 support
      Input: synaptics-rmi4 - handle reset delay when using SMBus trsnsport
      Input: psmouse - fix fast_reconnect function for PS/2 mode
      Revert "Input: psmouse - add delay when deactivating for SMBus mode"
      Input: goodix - ensure int GPIO is in input for gpio_count == 1 && gpio_int_idx == 0 case
      Input: i8042 - add Fujitsu Lifebook E5411 to i8042 quirk table
      Input: xpad - add HyperX Clutch Gladiate Support

[-- Attachment #3: Type: text/plain, Size: 1047 bytes --]


And here is my patch, which basically checks for PuiS during system
resume, and either forces the disk to suspend or resume depending on
whether it is PuiS.  Since I have not even engaged in any system
suspend/resume for this test, this patch should not have any effect.

So far in my testing of this patch on my 3 internal drives that I have
enabled PuiS on, it appears to work in so much as after a
suspend/resume, the runtime status of the disks is suspended ( as long
as I have *enabled* runtime pm on them, otherwise not ).  Another
problem seems to be that while the disks are left suspended after system
resume, they very quickly are resumed due to begnign IO eminating from
either ext4 or udsisks2 that does not cause a drive to spin up when it
is suspended with hdparm -y.  This would be the case of either FLUSH
CASH or CHECK POWER CONDITION not causing the drive to spin itself up
when given the commands, but the runtime pm core does not know that
these commands can be completed without resuming the disk, so it resumes
them first.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0001-libata-don-t-start-disks-on-resume.patch --]
[-- Type: text/x-diff, Size: 5009 bytes --]

From 77a3511d4058e3afccc4ba745c8c6ad6f7ac07a3 Mon Sep 17 00:00:00 2001
From: Phillip Susi <psusi@ubuntu.com>
Date: Mon, 16 Dec 2013 18:30:55 -0500
Subject: [PATCH] libata: don't start disks on resume

Disks with Power Up In Standby enabled that required the
SET FEATURES command to start up were being issued the
command during resume.  Suppress this until the disk
is actually accessed.
---
 drivers/ata/libata-core.c | 25 +++++++++++++++++++++----
 drivers/ata/libata-eh.c   | 19 ++++++++++++++++++-
 drivers/ata/libata.h      |  1 +
 include/linux/libata.h    |  1 +
 4 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 7c261907a1d0..cd4718fe02e1 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1912,7 +1912,20 @@ int ata_dev_read_id(struct ata_device *dev, unsigned int *p_class,
 			goto err_out;
 	}
 
-	if (!tried_spinup && (id[2] == 0x37c8 || id[2] == 0x738c)) {
+	if (flags & ATA_READID_NOSTART && id[2] == 0x37c8)
+	{
+		/*
+		 * the drive has powered up in standby, and returned incomplete IDENTIFY info
+		 * so we can't revalidate it yet.  We have also been asked to avoid starting the
+		 * drive, so stop  here and leave the drive asleep and the EH pending, to be
+		 * restarted on later IO request
+		 */
+		dev->flags |= ATA_DFLAG_SLEEPING;
+		return -EAGAIN;
+	}
+
+	if (!(flags & ATA_READID_NOSTART) &&
+	    !tried_spinup && (id[2] == 0x37c8 || id[2] == 0x738c)) {
 		tried_spinup = 1;
 		/*
 		 * Drive powered-up in standby mode, and requires a specific
@@ -3956,6 +3969,8 @@ int ata_dev_revalidate(struct ata_device *dev, unsigned int new_class,
 
 	/* re-read ID */
 	rc = ata_dev_reread_id(dev, readid_flags);
+	if (rc == -EAGAIN)
+		return rc;
 	if (rc)
 		goto fail;
 
@@ -5172,6 +5187,10 @@ static void ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,
 		spin_lock_irqsave(ap->lock, flags);
 	}
 
+	/* on system resume, don't wake PuiS drives */
+	if (mesg.event == PM_EVENT_RESUME)
+		ehi_flags |= ATA_EHI_NOSTART;
+	
 	/* Request PM operation to EH */
 	ap->pm_mesg = mesg;
 	ap->pflags |= ATA_PFLAG_PM_PENDING;
@@ -5269,9 +5288,7 @@ static void ata_port_resume_async(struct ata_port *ap, pm_message_t mesg)
 static int ata_port_pm_resume(struct device *dev)
 {
 	ata_port_resume_async(to_ata_port(dev), PMSG_RESUME);
-	pm_runtime_disable(dev);
-	pm_runtime_set_active(dev);
-	pm_runtime_enable(dev);
+	printk(KERN_INFO "resume device: %p", dev);
 	return 0;
 }
 
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 09be31723a3c..e77805ba46b0 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -22,6 +22,7 @@
 #include <scsi/scsi_cmnd.h>
 #include <scsi/scsi_dbg.h>
 #include "../scsi/scsi_transport_api.h"
+#include <linux/pm_runtime.h>
 
 #include <linux/libata.h>
 
@@ -3042,6 +3043,8 @@ static int ata_eh_revalidate_and_attach(struct ata_link *link,
 
 		if (ehc->i.flags & ATA_EHI_DID_RESET)
 			readid_flags |= ATA_READID_POSTRESET;
+		if (ehc->i.flags & ATA_EHI_NOSTART)
+			readid_flags |= ATA_READID_NOSTART;
 
 		if ((action & ATA_EH_REVALIDATE) && ata_dev_enabled(dev)) {
 			WARN_ON(dev->class == ATA_DEV_PMP);
@@ -3071,9 +3074,23 @@ static int ata_eh_revalidate_and_attach(struct ata_link *link,
 			ata_eh_about_to_do(link, dev, ATA_EH_REVALIDATE);
 			rc = ata_dev_revalidate(dev, ehc->classes[dev->devno],
 						readid_flags);
-			if (rc)
+			if (rc == -EAGAIN) {
+				rc = 0; /* start required but suppressed, handle later */
+				printk(KERN_INFO "sdev: %p", &dev->sdev->sdev_dev);
+				pm_runtime_disable(&dev->sdev->sdev_dev);
+				pm_runtime_set_suspended(&dev->sdev->sdev_dev);
+				pm_runtime_enable(&dev->sdev->sdev_dev);
+
+				continue;
+			}
+			else if (rc)
 				goto err;
 
+			printk(KERN_INFO "sdev: %p", &dev->sdev->sdev_dev);
+			pm_runtime_disable(&dev->sdev->sdev_dev);
+			pm_runtime_set_active(&dev->sdev->sdev_dev);
+			pm_runtime_enable(&dev->sdev->sdev_dev);
+
 			ata_eh_done(link, dev, ATA_EH_REVALIDATE);
 
 			/* Configuration may have changed, reconfigure
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index 05ac80da8ebc..cc13777d2811 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -19,6 +19,7 @@
 enum {
 	/* flags for ata_dev_read_id() */
 	ATA_READID_POSTRESET	= (1 << 0), /* reading ID after reset */
+	ATA_READID_NOSTART	= (1 << 1), /* do not start drive */
 
 	/* selector for ata_down_xfermask_limit() */
 	ATA_DNXFER_PIO		= 0,	/* speed down PIO */
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 2a7d2af0ed80..77769351ab99 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -328,6 +328,7 @@ enum {
 
 	/* ata_eh_info->flags */
 	ATA_EHI_HOTPLUGGED	= (1 << 0),  /* could have been hotplugged */
+	ATA_EHI_NOSTART		= (1 << 1),  /* don't start the disk */
 	ATA_EHI_NO_AUTOPSY	= (1 << 2),  /* no autopsy */
 	ATA_EHI_QUIET		= (1 << 3),  /* be quiet */
 	ATA_EHI_NO_RECOVERY	= (1 << 4),  /* no recovery */
-- 
2.41.0


  reply	other threads:[~2023-10-26 21:21 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-27 14:18 [PATCH v8 00/23] Fix libata suspend/resume handling and code cleanup Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 01/23] ata: libata-core: Fix ata_port_request_pm() locking Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 02/23] ata: libata-core: Fix port and device removal Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 03/23] ata: libata-scsi: link ata port and scsi device Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 04/23] scsi: sd: Differentiate system and runtime start/stop management Damien Le Moal
2023-09-27 19:50   ` Martin K. Petersen
2023-10-10 13:09   ` Phillip Susi
2023-10-10 14:04     ` Damien Le Moal
2023-10-15 16:14   ` Phillip Susi
2023-10-15 22:44     ` Damien Le Moal
2023-10-16 12:39       ` Phillip Susi
2023-10-16 12:55         ` Damien Le Moal
2023-10-17 18:03           ` Phillip Susi
2023-10-17 23:32             ` Damien Le Moal
2023-10-20 19:00               ` Phillip Susi
2023-10-18  6:16             ` Damien Le Moal
2023-10-20 21:23               ` Phillip Susi
2023-10-23  5:51                 ` Damien Le Moal
2023-10-26 21:21                   ` Phillip Susi [this message]
2023-09-27 14:18 ` [PATCH v8 05/23] ata: libata-scsi: Disable scsi device manage_system_start_stop Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 06/23] scsi: Do not attempt to rescan suspended devices Damien Le Moal
2023-09-27 19:51   ` Martin K. Petersen
2023-09-27 14:18 ` [PATCH v8 07/23] ata: libata-scsi: Fix delayed scsi_rescan_device() execution Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 08/23] ata: libata-core: Do not register PM operations for SAS ports Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 09/23] scsi: sd: Do not issue commands to suspended disks on shutdown Damien Le Moal
2023-09-27 19:52   ` Martin K. Petersen
2023-09-27 14:18 ` [PATCH v8 10/23] ata: libata-core: Fix compilation warning in ata_dev_config_ncq() Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 11/23] ata: libata-eh: Fix compilation warning in ata_eh_link_report() Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 12/23] scsi: Remove scsi device no_start_on_resume flag Damien Le Moal
2023-09-27 19:52   ` Martin K. Petersen
2023-09-27 14:18 ` [PATCH v8 13/23] ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat() Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 14/23] ata: libata-core: Synchronize ata_port_detach() with hotplug Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 15/23] ata: libata-core: Detach a port devices on shutdown Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 16/23] ata: libata-core: Remove ata_port_suspend_async() Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 17/23] ata: libata-core: Remove ata_port_resume_async() Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 18/23] ata: libata-core: Do not poweroff runtime suspended ports Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 19/23] ata: libata-core: Do not resume " Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 20/23] ata: libata-sata: Improve ata_sas_slave_configure() Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 21/23] ata: libata-eh: Improve reset error messages Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 22/23] ata: libata-eh: Reduce "disable device" message verbosity Damien Le Moal
2023-09-27 14:18 ` [PATCH v8 23/23] ata: libata: Cleanup inline DMA helper functions Damien Le Moal
2023-10-02 23:39 ` [PATCH v8 00/23] Fix libata suspend/resume handling and code cleanup Phillip Susi
2023-10-03  0:27   ` Phillip Susi
2023-10-03  0:44     ` Damien Le Moal
2023-10-03 21:22       ` Phillip Susi
2023-10-03 23:46         ` Damien Le Moal
2023-10-04 21:01           ` Phillip Susi
2023-10-04 22:33             ` Damien Le Moal
2023-10-05 12:38               ` Phillip Susi
2023-10-03  0:32   ` Damien Le Moal
     [not found] <c0b086ab-dcd5-4b7b-b931-4d407dd7ad47()kernel!org>
2023-10-12 19:01 ` [PATCH v8 04/23] scsi: sd: Differentiate system and runtime start/stop management Phillip Susi
2023-10-13  0:57   ` Damien Le Moal
2023-10-13 14:36     ` Phillip Susi
2023-10-15 22:09       ` Damien Le Moal
2023-10-21 17:56         ` Phillip Susi
2023-10-23  5:49           ` Damien Le Moal
2023-11-03 18:05             ` Phillip Susi
2023-11-03 23:01               ` Phillip Susi
2023-11-06  2:32                 ` Damien Le Moal
2023-11-07 13:27                   ` Phillip Susi
2023-11-07 21:59                     ` Damien Le Moal
2023-11-08 22:07                       ` Phillip Susi
2023-11-06  3:00               ` Damien Le Moal
2023-11-07 13:45                 ` Phillip Susi
2023-11-07 21:48                   ` Phillip Susi
2023-11-07 23:11                     ` Damien Le Moal
2023-11-08 22:15                       ` Phillip Susi
2023-11-09 22:09                         ` Phillip Susi
2023-11-09 22:57                           ` Damien Le Moal
2023-11-10 16:41                             ` Phillip Susi
2023-11-10  0:43                           ` Damien Le Moal
2023-11-07 22:13                   ` Damien Le Moal
2023-11-08 22:25                     ` Phillip Susi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877cn95bbr.fsf@vps.thesusis.net \
    --to=phill@thesusis.net \
    --cc=acelan.kao@canonical.com \
    --cc=dlemoal@kernel.org \
    --cc=geert@linux-m68k.org \
    --cc=john.g.garry@oracle.com \
    --cc=kai.heng.feng@canonical.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@jmbreuer.net \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=paula@soe.ucsc.edu \
    --cc=rodrigo.vivi@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).