Re: [Linux kernel bug] general protection fault in disable_store

All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed

From: Sam Sun <samsun1006219@gmail.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org,
	 Greg KH <gregkh@linuxfoundation.org>,
	swboyd@chromium.org, ricardo@marliere.net,  hkallweit1@gmail.com,
	heikki.krogerus@linux.intel.com,  mathias.nyman@linux.intel.com,
	royluo@google.com,  syzkaller-bugs@googlegroups.com,
	xrivendell7@gmail.com
Subject: Re: [Linux kernel bug] general protection fault in disable_store
Date: Sat, 13 Apr 2024 00:26:07 +0800	[thread overview]
Message-ID: <CAEkJfYMjO+vMBGPcaLa51gjeKxFAJBrSa0t_iJUtauQD3DaK8w@mail.gmail.com> (raw)
In-Reply-To: <45e246ab-01e8-40b7-8ede-b47957df0d7b@rowland.harvard.edu>

On Fri, Apr 12, 2024 at 10:40 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> On Fri, Apr 12, 2024 at 09:08:12PM +0800, Sam Sun wrote:
> > Sorry for the mistake I made when debugging this bug. Now I have more
> > information about it. Disassembly of function disable_store() in the
> > latest upstream kernel is listed below.
> > ```
> > Dump of assembler code for function disable_store:
> >    ...
> >    0xffffffff86e907eb <+187>:   lea    -0x8(%r14),%r12
> >    0xffffffff86e907ef <+191>:   mov    (%rbx),%rax
> >    0xffffffff86e907f2 <+194>:   mov    %rax,0x20(%rsp)
> >    0xffffffff86e907f7 <+199>:   lea    -0xa8(%rax),%rdi
> >    0xffffffff86e907fe <+206>:   mov    %rdi,0x18(%rsp)
> >    0xffffffff86e90803 <+211>:   call   0xffffffff86e20220
> > <usb_hub_to_struct_hub>
> >    0xffffffff86e90808 <+216>:   mov    %rax,%rbx
> >    0xffffffff86e9080b <+219>:   shr    $0x3,%rax
> >    0xffffffff86e9080f <+223>:   movabs $0xdffffc0000000000,%rcx
> >    0xffffffff86e90819 <+233>:   cmpb   $0x0,(%rax,%rcx,1)
> >    0xffffffff86e9081d <+237>:   je     0xffffffff86e90827 <disable_store+247>
> >    0xffffffff86e9081f <+239>:   mov    %rbx,%rdi
> >    0xffffffff86e90822 <+242>:   call   0xffffffff81eeb0b0
> > <__asan_report_load8_noabort>
> >    0xffffffff86e90827 <+247>:   lea    0x60(%rsp),%rsi
> >    ...
> > ```
> > The cmpb in disable_store()<+233> is generated by KASAN to check the
> > shadow memory status. If equals 0, which means the load 8 is valid,
> > pass the KASAN check. However, this time rax is 0, so it first
> > triggers general protection fault, since 0xdffffc0000000000 is not a
> > valid address. rax contains the return address of function
> > usb_hub_to_struct_hub(), in this case is a NULL.
> >
> > In function usb_hub_to_struct_hub(), I checked hdev and its sub
> > domains, and they are not NULL. Is it possible that
> > usb_deauthorized_device() set
> > hdev->actconfig->interface[0]->dev.driver_data to NULL? I cannot
> > confirm that since every time I try to breakpoint the code it crashes
> > differently.
>
> I suspect the usb_hub_to_struct_hub() call is racing with the
> spinlock-protected region in hub_disconnect() (in hub.c).
>
> > If there is any other thing I could help, please let me know.
>
> Try the patch below.  It should eliminate that race, which hopefully
> will fix the problem.
>
> Alan Stern
>
>
>
> Index: usb-devel/drivers/usb/core/hub.c
> ===================================================================
> --- usb-devel.orig/drivers/usb/core/hub.c
> +++ usb-devel/drivers/usb/core/hub.c
> @@ -72,6 +72,9 @@
>   * change to USB_STATE_NOTATTACHED even when the semaphore isn't held. */
>  static DEFINE_SPINLOCK(device_state_lock);
>
> +/* Protect hdev->maxchild and hub's intfdata */
> +static DEFINE_SPINLOCK(hub_state_lock);
> +
>  /* workqueue to process hub events */
>  static struct workqueue_struct *hub_wq;
>  static void hub_event(struct work_struct *work);
> @@ -152,9 +155,13 @@ static inline char *portspeed(struct usb
>  /* Note that hdev or one of its children must be locked! */
>  struct usb_hub *usb_hub_to_struct_hub(struct usb_device *hdev)
>  {
> -       if (!hdev || !hdev->actconfig || !hdev->maxchild)
> -               return NULL;
> -       return usb_get_intfdata(hdev->actconfig->interface[0]);
> +       struct usb_hub *hub = NULL;
> +
> +       spin_lock_irq(&hub_state_lock);
> +       if (hdev && hdev->actconfig && hdev->maxchild)
> +               hub = usb_get_intfdata(hdev->actconfig->interface[0]);
> +       spin_unlock_irq(&hub_state_lock);
> +       return hub;
>  }
>
>  int usb_device_supports_lpm(struct usb_device *udev)
> @@ -1714,7 +1721,9 @@ static int hub_configure(struct usb_hub
>                         break;
>                 }
>         }
> +       spin_lock_irq(&hub_state_lock);
>         hdev->maxchild = i;
> +       spin_unlock_irq(&hub_state_lock);
>         for (i = 0; i < hdev->maxchild; i++) {
>                 struct usb_port *port_dev = hub->ports[i];
>
> @@ -1790,9 +1799,11 @@ static void hub_disconnect(struct usb_in
>
>         /* Avoid races with recursively_mark_NOTATTACHED() */
>         spin_lock_irq(&device_state_lock);
> +       spin_lock(&hub_state_lock);
>         port1 = hdev->maxchild;
>         hdev->maxchild = 0;
>         usb_set_intfdata(intf, NULL);
> +       spin_unlock(&hub_state_lock);
>         spin_unlock_irq(&device_state_lock);
>
>         for (; port1 > 0; --port1)
>

I applied this patch and tried to execute several times, no more
kernel core dump in my environment. I think this bug is fixed by the
patch. But I do have one more question about it. Since it is a data
race bug, it has reproducibility issues originally. How can I confirm
if a racy bug is fixed by test? This kind of bug might still have a
race window but is harder to trigger. Just curious, not for this
patch. I think this patch eliminates the racy window.

Best,
Yue

next prev parent reply	other threads:[~2024-04-12 16:26 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-11  6:52 [Linux kernel bug] general protection fault in disable_store Sam Sun
2024-04-11  6:58 ` Greg KH
2024-04-11  7:19   ` Sam Sun
2024-04-11 15:24 ` Alan Stern
2024-04-12 13:08   ` Sam Sun
2024-04-12 14:40     ` Alan Stern
2024-04-12 16:26       ` Sam Sun [this message]
2024-04-12 18:11         ` Alan Stern
2024-04-12 18:32           ` Alan Stern
2024-04-13  5:08           ` Sam Sun
2024-04-15 14:47             ` Alan Stern
2024-04-16  9:05               ` Sam Sun
2024-04-16 16:35                 ` Alan Stern
2024-04-17  7:39                   ` Sam Sun
2024-04-17 14:23                     ` Alan Stern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEkJfYMjO+vMBGPcaLa51gjeKxFAJBrSa0t_iJUtauQD3DaK8w@mail.gmail.com \
    --to=samsun1006219@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=heikki.krogerus@linux.intel.com \
    --cc=hkallweit1@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=mathias.nyman@linux.intel.com \
    --cc=ricardo@marliere.net \
    --cc=royluo@google.com \
    --cc=stern@rowland.harvard.edu \
    --cc=swboyd@chromium.org \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=xrivendell7@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.