dri-devel Archive mirror
 help / color / mirror / Atom feed
From: Kees Cook <keescook@chromium.org>
To: Bui Quang Minh <minhquangbui99@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>
Cc: syzbot <syzbot+045b454ab35fd82a35fb@syzkaller.appspotmail.com>,
	axboe@kernel.dk, brauner@kernel.org, io-uring@vger.kernel.org,
	jack@suse.cz, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com,
	viro@zeniv.linux.org.uk, "Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org,
	"Laura Abbott" <laura@labbott.name>
Subject: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)
Date: Fri, 3 May 2024 11:26:32 -0700	[thread overview]
Message-ID: <202405031110.6F47982593@keescook> (raw)
In-Reply-To: <7c41cf3c-2a71-4dbb-8f34-0337890906fc@gmail.com>

On Fri, May 03, 2024 at 06:54:22PM +0700, Bui Quang Minh wrote:
> [...]
> Root cause:
> AFAIK, eventpoll (epoll) does not increase the registered file's reference.
> To ensure the safety, when the registered file is deallocated in __fput,
> eventpoll_release is called to unregister the file from epoll. When calling
> poll on epoll, epoll will loop through registered files and call vfs_poll on
> these files. In the file's poll, file is guaranteed to be alive, however, as
> epoll does not increase the registered file's reference, the file may be
> dying
> and it's not safe the get the file for later use. And dma_buf_poll violates
> this. In the dma_buf_poll, it tries to get_file to use in the callback. This
> leads to a race where the dmabuf->file can be fput twice.
> 
> Here is the race occurs in the above proof-of-concept
> 
> close(dmabuf->file)
> __fput_sync (f_count == 1, last ref)
> f_count-- (f_count == 0 now)
> __fput
>                                     epoll_wait
>                                     vfs_poll(dmabuf->file)
>                                     get_file(dmabuf->file)(f_count == 1)
> eventpoll_release
> dmabuf->file deallocation
>                                     fput(dmabuf->file) (f_count == 1)
>                                     f_count--
>                                     dmabuf->file deallocation
> 
> I am not familiar with the dma_buf so I don't know the proper fix for the
> issue. About the rule that don't get the file for later use in poll callback
> of
> file, I wonder if it is there when only select/poll exist or just after
> epoll
> appears.
> 
> I hope the analysis helps us to fix the issue.

Thanks for doing this analysis! I suspect at least a start of a fix
would be this:

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..15e8f74ee0f2 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -267,9 +267,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 
 		if (events & EPOLLOUT) {
 			/* Paired with fput in dma_buf_poll_cb */
-			get_file(dmabuf->file);
-
-			if (!dma_buf_poll_add_cb(resv, true, dcb))
+			if (!atomic_long_inc_not_zero(&dmabuf->file) &&
+			    !dma_buf_poll_add_cb(resv, true, dcb))
 				/* No callback queued, wake up any other waiters */
 				dma_buf_poll_cb(NULL, &dcb->cb);
 			else
@@ -290,9 +289,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 
 		if (events & EPOLLIN) {
 			/* Paired with fput in dma_buf_poll_cb */
-			get_file(dmabuf->file);
-
-			if (!dma_buf_poll_add_cb(resv, false, dcb))
+			if (!atomic_long_inc_not_zero(&dmabuf->file) &&
+			    !dma_buf_poll_add_cb(resv, false, dcb))
 				/* No callback queued, wake up any other waiters */
 				dma_buf_poll_cb(NULL, &dcb->cb);
 			else


But this ends up leaving "active" non-zero, and at close time it runs
into:

        BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);

But the bottom line is that get_file() is unsafe to use in some places,
one of which appears to be in the poll handler. There are maybe some
other fragile places too, like in drivers/gpu/drm/vmwgfx/ttm_object.c:

static bool __must_check get_dma_buf_unless_doomed(struct dma_buf *dmabuf)
{
	return atomic_long_inc_not_zero(&dmabuf->file->f_count) != 0L;
}

Which I also note involves a dmabuf...

Due to this issue I've proposed fixing get_file() to detect pathological states:
https://lore.kernel.org/lkml/20240502222252.work.690-kees@kernel.org/

But that has run into some push-back. I'm hoping that seeing this epoll
example will help illustrate what needs fixing a little better.

I think the best current proposal is to just WARN sooner instead of a
full refcount_t implementation:


diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8dfd53b52744..e09107d0a3d6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1040,7 +1040,8 @@ struct file_handle {
 
 static inline struct file *get_file(struct file *f)
 {
-	atomic_long_inc(&f->f_count);
+	long prior = atomic_long_fetch_inc_relaxed(&f->f_count);
+	WARN_ONCE(!prior, "struct file::f_count incremented from zero; use-after-free condition present!\n");
 	return f;
 }
 


What's the right way to deal with the dmabuf situation? (And I suspect
it applies to get_dma_buf_unless_doomed() as well...)

-Kees

-- 
Kees Cook

  reply	other threads:[~2024-05-03 18:26 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <0000000000002d631f0615918f1e@google.com>
2024-05-03 11:54 ` [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove Bui Quang Minh
2024-05-03 18:26   ` Kees Cook [this message]
2024-05-03 18:49     ` get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove) Jens Axboe
2024-05-03 19:22       ` Kees Cook
2024-05-03 19:35         ` Jens Axboe
2024-05-03 19:59           ` Kees Cook
2024-05-03 20:28             ` Kees Cook
2024-05-03 21:11               ` Al Viro
2024-05-03 21:24                 ` Linus Torvalds
2024-05-03 21:30                   ` Al Viro
2024-05-06 17:46                   ` Stefan Metzmacher
2024-05-06 18:17                     ` Linus Torvalds
2024-05-08  8:47                       ` David Laight
2024-05-03 21:36                 ` Al Viro
2024-05-03 21:42                   ` Linus Torvalds
2024-05-03 21:53                     ` Al Viro
2024-05-06 12:23                       ` Daniel Vetter
2024-05-04  9:59             ` Christian Brauner
2024-05-03 21:11     ` [PATCH] epoll: try to be a _bit_ better about file lifetimes Linus Torvalds
2024-05-03 21:24       ` Al Viro
2024-05-03 21:33         ` Linus Torvalds
2024-05-03 21:45           ` Al Viro
2024-05-03 21:52             ` Linus Torvalds
2024-05-03 22:01               ` Al Viro
2024-05-03 22:07                 ` Al Viro
2024-05-03 23:16                   ` Linus Torvalds
2024-05-03 23:39                     ` Al Viro
2024-05-03 23:54                       ` Linus Torvalds
2024-05-04 10:44                       ` Christian Brauner
2024-05-03 22:46               ` Kees Cook
2024-05-03 23:03                 ` Al Viro
2024-05-03 23:23                   ` Kees Cook
2024-05-03 23:41                     ` Linus Torvalds
2024-05-04  9:19                       ` Christian Brauner
2024-05-06 12:37                       ` Daniel Vetter
2024-05-04  9:37           ` Christian Brauner
2024-05-04 15:32             ` Linus Torvalds
2024-05-04 15:40               ` Linus Torvalds
2024-05-04 15:53                 ` Linus Torvalds
2024-05-05 19:46                   ` Al Viro
2024-05-05 20:03                     ` Linus Torvalds
2024-05-05 20:30                       ` Al Viro
2024-05-05 20:53                         ` Linus Torvalds
2024-05-06 12:47                           ` Daniel Vetter
2024-05-06 14:46                             ` Christian Brauner
2024-05-07 10:58                               ` Daniel Vetter
2024-05-06 16:15                           ` Christian König
2024-05-05 10:50                 ` Christian Brauner
2024-05-05 16:46                   ` Linus Torvalds
2024-05-05 17:55                     ` [PATCH v2] epoll: be " Linus Torvalds
2024-05-05 18:04                       ` Jens Axboe
2024-05-05 20:01                       ` David Laight
2024-05-05 20:16                         ` Linus Torvalds
2024-05-05 20:12                     ` [PATCH] epoll: try to be a _bit_ " Al Viro
2024-05-06  8:45                     ` Christian Brauner
2024-05-06  9:26                       ` Christian Brauner
2024-05-06 14:19                         ` Christian Brauner
2024-05-07 21:02                       ` David Laight
2024-05-04 18:20               ` Linus Torvalds
2024-05-06 14:29                 ` [Linaro-mm-sig] " Christian König
2024-05-07 11:02                   ` Daniel Vetter
2024-05-07 16:46                     ` Linus Torvalds
2024-05-07 17:45                       ` Christian König
2024-05-08  7:51                         ` Michel Dänzer
2024-05-08  7:59                           ` Christian König
2024-05-08  8:23                         ` Christian Brauner
2024-05-08  9:10                           ` Christian König
2024-05-07 18:04                       ` Daniel Vetter
2024-05-07 19:07                         ` Linus Torvalds
2024-05-08  5:55                           ` Christian König
2024-05-08  8:32                             ` Daniel Vetter
2024-05-08 10:16                               ` Christian Brauner
2024-05-08  8:05                           ` Christian Brauner
2024-05-08 16:19                           ` Linus Torvalds
2024-05-08 17:14                             ` Linus Torvalds
2024-05-09 11:38                               ` Christian Brauner
2024-05-09 15:48                                 ` Linus Torvalds
2024-05-10  6:33                                   ` Christian Brauner
2024-05-08 10:08                   ` Christian Brauner
2024-05-08 15:45                     ` Daniel Vetter
2024-05-10 10:55                       ` Christian Brauner
2024-05-11 18:25                         ` David Laight
2024-05-05 17:31       ` Jens Axboe
2024-05-04  9:45     ` get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove) Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202405031110.6F47982593@keescook \
    --to=keescook@chromium.org \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=io-uring@vger.kernel.org \
    --cc=jack@suse.cz \
    --cc=laura@labbott.name \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=minhquangbui99@gmail.com \
    --cc=sumit.semwal@linaro.org \
    --cc=syzbot+045b454ab35fd82a35fb@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).