* [PATCH stable 5.4+] nvme: fix possible hang when ns scanning fails during error recovery
From: Sagi Grimberg @ 2020-05-06 23:14 UTC
To: stable; +Cc: Christoph Hellwig, Keith Busch

When the controller is reconnecting, the host fails I/O and admin
commands as the host cannot reach the controller. ns scanning may
revalidate namespaces during that period and it is wrong to remove
namespaces due to these failures as we may hang (see 205da2434301).

One command that may fail is nvme_identify_ns_descs. Since we return
success due to having ns descriptor list optional, we continue to
validate ns identifiers in nvme_revalidate_disk, obviously fail and
return -ENODEV to nvme_validate_ns, which will remove the namespace.

Exactly what we don't want to happen.

Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional")
Tested-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 31b7dcd791c2..8ce9b4fbc821 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1746,7 +1746,7 @@ static int nvme_report_ns_ids(struct nvme_ctrl *ctrl, unsigned int nsid,
 		if (ret)
 			dev_warn(ctrl->device,
 				 "Identify Descriptors failed (%d)\n", ret);
-		if (ret > 0)
+		if (ret > 0 && !(ret & NVME_SC_DNR))
 			ret = 0;
 	}
 	return ret;
--
2.20.1
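
To see what the one-line change in the hunk does to a returned status, the condition can be exercised outside the kernel. What follows is only a minimal sketch, not the driver code: `filter_descs_status` is a hypothetical helper that reproduces just the condition on the `+` line, the value 0x4000 for `NVME_SC_DNR` is an assumption taken from the kernel's include/linux/nvme.h, and the sample status values are made up for illustration.

```c
/* Assumed value of the "do not retry" bit (include/linux/nvme.h). */
#define NVME_SC_DNR 0x4000

/* Hypothetical sample inputs, chosen only for this illustration. */
#define SAMPLE_STATUS_NO_DNR 0x0002                  /* positive NVMe status, DNR clear */
#define SAMPLE_STATUS_DNR    (0x0002 | NVME_SC_DNR)  /* same status with DNR set */
#define SAMPLE_ERRNO         (-19)                   /* a negative errno such as -ENODEV */

/*
 * Hypothetical helper mirroring the condition on the '+' line of the hunk:
 * a positive status with the DNR bit clear is turned into 0; every other
 * value is returned unchanged.
 */
static int filter_descs_status(int ret)
{
	if (ret > 0 && !(ret & NVME_SC_DNR))
		ret = 0;
	return ret;
}
```

With the condition as posted, `filter_descs_status(SAMPLE_STATUS_NO_DNR)` yields 0, so the caller carries on as if Identify Descriptors had succeeded, while `filter_descs_status(SAMPLE_STATUS_DNR)` and `filter_descs_status(SAMPLE_ERRNO)` hand the original value back to the caller. Before the change, every positive status was turned into 0.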
* Re: [PATCH stable 5.4+] nvme: fix possible hang when ns scanning fails during error recovery
From: Greg KH @ 2020-05-07 7:03 UTC
To: Sagi Grimberg; +Cc: stable, Christoph Hellwig, Keith Busch
On Wed, May 06, 2020 at 04:14:51PM -0700, Sagi Grimberg wrote:
> When the controller is reconnecting, the host fails I/O and admin
> commands as the host cannot reach the controller. ns scanning may
> revalidate namespaces during that period and it is wrong to remove
> namespaces due to these failures as we may hang (see 205da2434301).
>
> One command that may fail is nvme_identify_ns_descs. Since we return
> success due to having ns descriptor list optional, we continue to
> validate ns identifiers in nvme_revalidate_disk, obviously fail and
> return -ENODEV to nvme_validate_ns, which will remove the namespace.
>
> Exactly what we don't want to happen.
>
> Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional")
> Tested-by: Anton Eidelman <anton@lightbitslabs.com>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
> drivers/nvme/host/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

What is the git commit id of this patch in Linus's tree?

And why sign-off on a patch twice with a blank line?

thanks,

greg k-h
* Re: [PATCH stable 5.4+] nvme: fix possible hang when ns scanning fails during error recovery
From: Sagi Grimberg @ 2020-05-08 0:45 UTC
To: Greg KH; +Cc: stable, Christoph Hellwig, Keith Busch
>> When the controller is reconnecting, the host fails I/O and admin
>> commands as the host cannot reach the controller. ns scanning may
>> revalidate namespaces during that period and it is wrong to remove
>> namespaces due to these failures as we may hang (see 205da2434301).
>>
>> One command that may fail is nvme_identify_ns_descs. Since we return
>> success due to having ns descriptor list optional, we continue to
>> validate ns identifiers in nvme_revalidate_disk, obviously fail and
>> return -ENODEV to nvme_validate_ns, which will remove the namespace.
>>
>> Exactly what we don't want to happen.
>>
>> Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional")
>> Tested-by: Anton Eidelman <anton@lightbitslabs.com>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>> ---
>> drivers/nvme/host/core.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> What is the git commit id of this patch in Linus's tree?
>
> And why sign-off on a patch twice with a blank line?

I'll resend, Greg. Sorry for the inconvenience.