From: Kalle Valo <kvalo@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"Rafael J. Wysocki" <rafael@kernel.org>
Cc: x86@kernel.org, linux-pm@vger.kernel.org,
linux-kernel@vger.kernel.org, regressions@lists.linux.dev,
Jeff Johnson <quic_jjohnson@quicinc.com>
Subject: [regression] suspend stress test stalls within 30 minutes
Date: Sat, 11 May 2024 21:22:43 +0300
Message-ID: <87o79cjjik.fsf@kernel.org>
Hi,
I have a weird problem with suspend. Somewhere around v6.9-rc4 (I'm not
sure exactly) I started seeing our ath11k Wi-Fi driver suspend tests
fail randomly. I have been investigating this for some time, and now it
looks like it's somehow related to the CPU_MITIGATIONS Kconfig option
and has nothing to do with wireless.
The simplified test case I have is to run suspend and resume in a loop
like this (Wi-Fi modules are not loaded):
for i in {1..400}; do echo "rtcwake test $i" > /dev/kmsg; rtcwake -m mem -s 10; sleep 10; done
If CPU_MITIGATIONS is enabled I usually see suspend stalling within 30
minutes. If I disable CPU_MITIGATIONS using menuconfig I don't see the bug.
When the bug happens I see this in kernel.log and suspend stalls:
[ 361.716546] PM: suspend entry (deep)
[ 361.722558] Filesystems sync: 0.005 seconds
[ 624.222721] kworker/dying (2519) used greatest stack depth: 22240 bytes left
[ 633.897857] loop0: detected capacity change from 0 to 8
And if I don't do anything, nothing happens for several minutes. What is
really strange is that once I run 'sudo shutdown -h now', suspend
somehow immediately unstalls and continues, like this:
[ 847.631147] Freezing user space processes
[ 847.649590] Freezing user space processes completed (elapsed 0.016 seconds)
[ 847.650710] OOM killer disabled.
[ 847.651799] Freezing remaining freezable tasks
[ 847.654618] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 847.663757] printk: Suspending console(s) (use no_console_suspend to debug)
[ 847.710060] e1000e: EEE TX LPI TIMER: 00000011
[ 847.852370] ACPI: EC: interrupt blocked
[ 847.899416] ACPI: PM: Preparing to enter system sleep state S3
[ 847.933433] ACPI: EC: event blocked
[ 847.933437] ACPI: EC: EC stopped
[ 847.933441] ACPI: PM: Saving platform NVS memory
[ 847.933817] Disabling non-boot CPUs ...
And now the system goes into the suspend state as it should. If I press
the power button on the device, the system resumes and then shuts down
(as expected, because I ran the shutdown command). This behaviour is
consistent; I see it every time the suspend bug happens.
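The stall is visible in the log as a long gap between "PM: suspend entry"
and "Freezing user space processes" (roughly 486 seconds between the two
excerpts above). A minimal sketch for flagging such gaps in a saved log,
assuming dmesg-style "[ seconds.micros]" prefixes; the function name
find_suspend_stalls and the 60-second default threshold are illustrative
choices, not part of the test setup:

```shell
# find_suspend_stalls [THRESHOLD]: read kernel log lines on stdin and report
# every suspend attempt where the freezer started more than THRESHOLD seconds
# (default 60) after "PM: suspend entry". Name and default are illustrative.
find_suspend_stalls() {
    awk -v threshold="${1:-60}" '
        # Return the numeric timestamp from a "[ 361.716546] ..." prefix.
        function stamp(line,    s) {
            s = line
            sub(/^\[ */, "", s)
            sub(/\].*/, "", s)
            return s + 0
        }
        # Remember when the suspend attempt started.
        /PM: suspend entry/ { entry = stamp($0); pending = 1; next }
        # The first freezer message after the entry ends the measured gap.
        pending && /Freezing user space processes/ {
            gap = stamp($0) - entry
            if (gap > threshold)
                printf "stall: %.1f s between suspend entry at %.6f and freezer start\n", gap, entry
            pending = 0
        }
    '
}
```

For example, `dmesg | find_suspend_stalls 60` would scan the current kernel
log; a log in another timestamp format would need a different stamp() helper.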
The test setup is a several-year-old Intel NUC x86 system; more info
below.
Any recommendations on how I should debug this further? I tried to
bisect this earlier but that failed, most likely because I hadn't yet
realised that this is related to CPU_MITIGATIONS and might have messed
up the .config settings during the bisect.
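One way to guard against the .config drifting during a bisect is to check
the relevant options before each build. A minimal sketch, assuming the check
runs from a bisect script after the config has been regenerated;
check_bisect_config is a hypothetical helper name, and the option may not
exist under this name at every commit in the range:

```shell
# check_bisect_config FILE OPTION...: succeed only if every named option is
# set to =y in FILE. Option names are given without the CONFIG_ prefix.
# The helper name is hypothetical, not an existing kernel script.
check_bisect_config() {
    config_file=$1
    shift
    for opt in "$@"; do
        if ! grep -q "^CONFIG_${opt}=y\$" "$config_file"; then
            echo "CONFIG_${opt} is not =y in ${config_file}" >&2
            return 1
        fi
    done
    return 0
}
```

With `git bisect run`, a script can exit with status 125 when this check
fails, so that the commit is skipped rather than marked good or bad.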
Kalle
DMI: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021
Ubuntu 20.04.6 LTS (GNU/Linux 6.9.0-rc7+ x86_64)
systemd 245.4-4ubuntu3.23 running in system mode. (+PAM +AUDIT +SELINUX
+IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS
+ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2
default-hierarchy=hybrid)
I verified that I see this on the latest commit from Linus' tree:
cf87f46fd34d Merge tag 'drm-fixes-2024-05-11' of https://gitlab.freedesktop.org/drm/kernel
Here's the diff between broken and working .config:
$ diffconfig broken.config works.config
-CALL_PADDING y
-CALL_THUNKS y
-CALL_THUNKS_DEBUG n
-HAVE_CALL_THUNKS y
-MITIGATION_CALL_DEPTH_TRACKING y
-MITIGATION_GDS_FORCE y
-MITIGATION_IBPB_ENTRY y
-MITIGATION_IBRS_ENTRY y
-MITIGATION_PAGE_TABLE_ISOLATION y
-MITIGATION_RETHUNK y
-MITIGATION_RETPOLINE y
-MITIGATION_RFDS y
-MITIGATION_SLS y
-MITIGATION_SPECTRE_BHI y
-MITIGATION_SRSO y
-MITIGATION_UNRET_ENTRY y
-PREFIX_SYMBOLS y
CPU_MITIGATIONS y -> n