All the mail mirrored from lore.kernel.org
* [PATCH stable 5.4+] nvme: fix possible hang when ns scanning fails during error recovery
@ 2020-05-06 23:14 Sagi Grimberg
  2020-05-07  7:03 ` Greg KH
  0 siblings, 1 reply; 3+ messages in thread
From: Sagi Grimberg @ 2020-05-06 23:14 UTC
  To: stable; +Cc: Christoph Hellwig, Keith Busch

When the controller is reconnecting, the host fails I/O and admin
commands as the host cannot reach the controller. ns scanning may
revalidate namespaces during that period and it is wrong to remove
namespaces due to these failures as we may hang (see 205da2434301).

One command that may fail is nvme_identify_ns_descs. Since we return
success due to having ns descriptor list optional, we continue to
validate ns identifiers in nvme_revalidate_disk, obviously fail and
return -ENODEV to nvme_validate_ns, which will remove the namespace.

Exactly what we don't want to happen.

Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional")
Tested-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 31b7dcd791c2..8ce9b4fbc821 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1746,7 +1746,7 @@ static int nvme_report_ns_ids(struct nvme_ctrl *ctrl, unsigned int nsid,
 		if (ret)
 			dev_warn(ctrl->device,
 				 "Identify Descriptors failed (%d)\n", ret);
-		if (ret > 0)
+		if (ret > 0 && !(ret & NVME_SC_DNR))
 			ret = 0;
 	}
 	return ret;
-- 
2.20.1
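
For readers following the one-line change, here is a minimal standalone sketch of the status filtering the patched hunk performs. The helper name filter_ns_descs_status is hypothetical, and the NVME_SC_DNR value is assumed to match the kernel's include/linux/nvme.h; this shows only the arithmetic on ret, not the actual driver code path.

```c
#include <errno.h>

/* Assumption: mirrors NVME_SC_DNR ("Do Not Retry") from include/linux/nvme.h. */
#define NVME_SC_DNR 0x4000

/*
 * Hypothetical helper, not a kernel function: applies the same check the
 * patched nvme_report_ns_ids() applies to the Identify Descriptors result.
 *
 *   ret == 0              success, returned unchanged
 *   ret <  0              negative errno (e.g. -ENODEV), returned unchanged
 *   ret >  0, DNR clear   NVMe status cleared to 0 (treated as non-fatal)
 *   ret >  0, DNR set     NVMe status returned unchanged to the caller
 */
int filter_ns_descs_status(int ret)
{
	if (ret > 0 && !(ret & NVME_SC_DNR))
		ret = 0;
	return ret;
}
```

Before the patch, the condition was just ret > 0, so every positive NVMe status was cleared to 0 regardless of the DNR bit.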



* Re: [PATCH stable 5.4+] nvme: fix possible hang when ns scanning fails during error recovery
  2020-05-06 23:14 [PATCH stable 5.4+] nvme: fix possible hang when ns scanning fails during error recovery Sagi Grimberg
@ 2020-05-07  7:03 ` Greg KH
  2020-05-08  0:45   ` Sagi Grimberg
  0 siblings, 1 reply; 3+ messages in thread
From: Greg KH @ 2020-05-07  7:03 UTC
  To: Sagi Grimberg; +Cc: stable, Christoph Hellwig, Keith Busch

On Wed, May 06, 2020 at 04:14:51PM -0700, Sagi Grimberg wrote:
> When the controller is reconnecting, the host fails I/O and admin
> commands as the host cannot reach the controller. ns scanning may
> revalidate namespaces during that period and it is wrong to remove
> namespaces due to these failures as we may hang (see 205da2434301).
> 
> One command that may fail is nvme_identify_ns_descs. Since we return
> success due to having ns descriptor list optional, we continue to
> validate ns identifiers in nvme_revalidate_disk, obviously fail and
> return -ENODEV to nvme_validate_ns, which will remove the namespace.
> 
> Exactly what we don't want to happen.
> 
> Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional")
> Tested-by: Anton Eidelman <anton@lightbitslabs.com>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>  drivers/nvme/host/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

What is the git commit id of this patch in Linus's tree?

And why sign-off on a patch twice with a blank line?

thanks,

greg k-h


* Re: [PATCH stable 5.4+] nvme: fix possible hang when ns scanning fails during error recovery
  2020-05-07  7:03 ` Greg KH
@ 2020-05-08  0:45   ` Sagi Grimberg
  0 siblings, 0 replies; 3+ messages in thread
From: Sagi Grimberg @ 2020-05-08  0:45 UTC
  To: Greg KH; +Cc: stable, Christoph Hellwig, Keith Busch


>> When the controller is reconnecting, the host fails I/O and admin
>> commands as the host cannot reach the controller. ns scanning may
>> revalidate namespaces during that period and it is wrong to remove
>> namespaces due to these failures as we may hang (see 205da2434301).
>>
>> One command that may fail is nvme_identify_ns_descs. Since we return
>> success due to having ns descriptor list optional, we continue to
>> validate ns identifiers in nvme_revalidate_disk, obviously fail and
>> return -ENODEV to nvme_validate_ns, which will remove the namespace.
>>
>> Exactly what we don't want to happen.
>>
>> Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional")
>> Tested-by: Anton Eidelman <anton@lightbitslabs.com>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>> ---
>>   drivers/nvme/host/core.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> What is the git commit id of this patch in Linus's tree?
> 
> And why sign-off on a patch twice with a blank line?

I'll resend, Greg. Sorry for the inconvenience.
