All the mail mirrored from lore.kernel.org
From: Sagi Grimberg <sagi@grimberg.me>
To: "Engel, Amit" <Amit.Engel@Dell.com>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Cc: "Anner, Ran" <Ran.Anner@dell.com>, "Grupi, Elad" <Elad.Grupi@dell.com>
Subject: Re: nvme_tcp BUG: unable to handle kernel NULL pointer dereference at 0000000000000230
Date: Thu, 10 Jun 2021 13:03:03 -0700
Message-ID: <172e3128-d1dc-ca9a-c679-752b621e956f@grimberg.me>
In-Reply-To: <CO1PR19MB48859E1CCD596402D4007458EE369@CO1PR19MB4885.namprd19.prod.outlook.com>


> Correct, free_queue is being called (sock->sk becomes NULL) before restore_sock_calls
> 
> When restore_sock_calls is called, we fail on 'write_lock_bh(&sock->sk->sk_callback_lock)'
> 
> NULL pointer dereference at 0x230 → 560 decimal
> crash> struct sock -o
> struct sock {
>     [0] struct sock_common __sk_common;
>     ...
>     [560] rwlock_t sk_callback_lock;
> 
> stop queue in ctx2 does not really do anything since 'NVME_TCP_Q_LIVE' bit is already cleared (by ctx1).
> can you please explain how stop the queue before free helps to serialize ctx1 ?
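
The offset arithmetic quoted above can be checked with a small userspace sketch. This is only an illustration, assuming a hypothetical stand-in struct (`fake_sock`, with a padding array in place of everything that precedes the lock); it is not the kernel's `struct sock`, and the 560-byte offset is taken from the crash output in the report, not computed from kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sock: only the member offset matters. */
struct fake_sock {
	char before_lock[560];   /* models everything ahead of the lock */
	int  sk_callback_lock;   /* placeholder for rwlock_t */
};

/* 0x230 hex is 560 decimal, matching the "[560]" line from crash. */
static_assert(offsetof(struct fake_sock, sk_callback_lock) == 0x230,
	      "lock member is expected at offset 0x230");

/*
 * Address touched by &sk->sk_callback_lock for a given sk address.
 * With sk == NULL (address 0) this is the member offset itself,
 * which is why the fault address equals 0x230.
 */
static uintptr_t callback_lock_addr(uintptr_t sk_addr)
{
	return sk_addr + offsetof(struct fake_sock, sk_callback_lock);
}
```

Calling `callback_lock_addr(0)` yields the member offset, which is consistent with the reported fault address when sock->sk is NULL.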

What I understood from your description is:
1. ctx1 calls stop_queue, which calls kernel_sock_shutdown
2. ctx1 gets to just before restore_sock_calls
3. ctx2 is triggered from state_change, which schedules err_work
4. ctx2 does stop_queues
5. ctx2 calls destroy_queues, which does sock_release
6. ctx1 makes forward progress and accesses an already freed sk

Hence with the mutex protection, ctx2 will be serialized on step (4)
until ctx1 releases the mutex, so it can only get to step (5) after
ctx1 has completed step (6).

But maybe I'm not interpreting this correctly?
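
To make the ordering argument concrete, here is a minimal userspace sketch of the serialization using pthreads. It is not the driver code: `fake_queue`, `fake_sk`, `run_teardown` and the other names are hypothetical stand-ins, and it assumes the mutex is held across the whole stop path, including the restore of the socket callbacks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the driver's real types. */
struct fake_sk { int callbacks_restored; };

struct fake_queue {
	pthread_mutex_t lock;  /* models the proposed per-queue mutex */
	bool live;             /* models the NVME_TCP_Q_LIVE bit */
	struct fake_sk *sk;    /* models sock->sk; NULL once released */
	int bad_accesses;      /* restores attempted after the release */
};

static void restore_sock_calls(struct fake_queue *q)
{
	if (!q->sk) {          /* the kernel would oops at this point */
		q->bad_accesses++;
		return;
	}
	q->sk->callbacks_restored = 1;
}

/* Steps 1-2 and 6 for ctx1, step 4 for ctx2: all under the mutex. */
static void stop_queue(struct fake_queue *q)
{
	pthread_mutex_lock(&q->lock);
	if (q->live) {
		q->live = false;
		restore_sock_calls(q);
	}
	pthread_mutex_unlock(&q->lock);
}

/* Step 5: release the socket only after stop_queue has returned. */
static void free_queue(struct fake_queue *q)
{
	stop_queue(q);         /* blocks while another context is stopping */
	free(q->sk);
	q->sk = NULL;
}

static void *ctx1(void *arg) { stop_queue(arg); return NULL; }
static void *ctx2(void *arg) { free_queue(arg); return NULL; }

/* Runs both contexts once; returns the number of post-release restores. */
static int run_teardown(void)
{
	struct fake_queue q = { .live = true };
	pthread_t t1, t2;

	q.sk = calloc(1, sizeof(*q.sk));
	if (!q.sk)
		return -1;
	pthread_mutex_init(&q.lock, NULL);
	if (pthread_create(&t1, NULL, ctx1, &q))
		return -1;
	if (pthread_create(&t2, NULL, ctx2, &q)) {
		pthread_join(t1, NULL);
		return -1;
	}
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	pthread_mutex_destroy(&q.lock);
	assert(q.sk == NULL);  /* ctx2 always ends up releasing the socket */
	return q.bad_accesses;
}
```

Whichever context takes the mutex first clears the live flag and restores the callbacks; the other one finds the flag already cleared and does nothing, and the release in free_queue always happens after stop_queue has returned. In this model `run_teardown()` should therefore return 0 regardless of how the two threads interleave.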


