From: "Eric W. Biederman" <ebiederm@xmission.com>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: Olivier Langlois <olivier@trillion01.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	io-uring@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Jens Axboe <axboe@kernel.dk>, Oleg Nesterov <oleg@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [RFC] coredump: Do not interrupt dump for TIF_NOTIFY_SIGNAL
Date: Mon, 14 Mar 2022 18:58:28 -0500
Message-ID: <87ilsg13yz.fsf@email.froward.int.ebiederm.org>
In-Reply-To: <303f7772-eb31-5beb-2bd0-4278566591b0@gmail.com> (Pavel Begunkov's message of "Tue, 28 Dec 2021 11:24:56 +0000")

Pavel Begunkov <asml.silence@gmail.com> writes:

> On 12/24/21 19:52, Eric W. Biederman wrote:
>> Pavel Begunkov <asml.silence@gmail.com> writes:
> [...]
>>> FWIW, I worked it around in io_uring back then by breaking the
>>> dependency.
>> I am in the middle of untangling the dependencies between ptrace,
>> coredump, signal handling and maybe a few related things.
>
> Sounds great
>
>> Do folks have a reproducer I can look at?  Pavel especially if you have
>> something that reproduces on the current kernels.
>
> A syz reproducer was triggering it reliably, I'd try to revert the
> commit below and test:
> https://syzkaller.appspot.com/text?tag=ReproC&x=15d3600cb00000
>
> It should hang a task. Syzbot report for reference:
> https://syzkaller.appspot.com/bug?extid=27d62ee6f256b186883e
>
>
> commit 1d5f5ea7cb7d15b9fb1cc82673ebb054f02cd7d2
> Author: Pavel Begunkov <asml.silence@gmail.com>
> Date:   Fri Oct 29 13:11:33 2021 +0100
>
>     io-wq: remove worker to owner tw dependency
>
>     INFO: task iou-wrk-6609:6612 blocked for more than 143 seconds.
>           Not tainted 5.15.0-rc5-syzkaller #0
>     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>     task:iou-wrk-6609    state:D stack:27944 pid: 6612 ppid:  6526 flags:0x00004006
>     Call Trace:
>      context_switch kernel/sched/core.c:4940 [inline]
>      __schedule+0xb44/0x5960 kernel/sched/core.c:6287
>      schedule+0xd3/0x270 kernel/sched/core.c:6366
>      schedule_timeout+0x1db/0x2a0 kernel/time/timer.c:1857
>      do_wait_for_common kernel/sched/completion.c:85 [inline]
>      __wait_for_common kernel/sched/completion.c:106 [inline]
>      wait_for_common kernel/sched/completion.c:117 [inline]
>      wait_for_completion+0x176/0x280 kernel/sched/completion.c:138
>      io_worker_exit fs/io-wq.c:183 [inline]
>      io_wqe_worker+0x66d/0xc40 fs/io-wq.c:597
>      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
>      ...

Thank you very much for this.  There were some bugs elsewhere I had to
deal with so I am slower looking at this part of the code than I was
expecting.

I have now reproduced this with the commit reverted on current kernels
and the repro.c from the syzkaller report.  I am starting to look into
how this interacts with my planned code changes in this area.

In combination with my other planned changes I think all that needs to
happen in do_coredump is to clear TIF_NOTIFY_SIGNAL along with
TIF_SIGPENDING to prevent io_uring interaction problems.  But we will
see.

The deadlock you demonstrate here shows that it is definitely not enough
to clear TIF_NOTIFY_SIGNAL (without other changes) so that
signal_pending returns false, which I was hoping would be the case.
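
For anyone following along, here is a minimal user-space sketch of the
flag logic discussed above.  It is a toy model, not kernel code: the
struct, the MODEL_* constants, and the model_* functions are all
hypothetical names made up for illustration.  It assumes only that
signal_pending reports true when either TIF_SIGPENDING or
TIF_NOTIFY_SIGNAL is set, and that the dump path would clear both.  It
does not model the io-wq deadlock itself.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's per-thread flag bits. */
#define MODEL_TIF_SIGPENDING    (1u << 0)
#define MODEL_TIF_NOTIFY_SIGNAL (1u << 1)

/* Hypothetical toy task: only the flag word matters for this sketch. */
struct model_task {
	unsigned int flags;
};

/* Models signal_pending(): true if either flag bit is set. */
bool model_signal_pending(const struct model_task *t)
{
	return (t->flags &
		(MODEL_TIF_SIGPENDING | MODEL_TIF_NOTIFY_SIGNAL)) != 0;
}

/* Models a notification being queued for the task (assumption:
 * this is what sets the notify bit in the real kernel path). */
void model_queue_notify(struct model_task *t)
{
	t->flags |= MODEL_TIF_NOTIFY_SIGNAL;
}

/* Models clearing only the signal bit, leaving the notify bit alone. */
void model_clear_sigpending_only(struct model_task *t)
{
	t->flags &= ~MODEL_TIF_SIGPENDING;
}

/* Models the idea of clearing both bits at the start of the dump. */
void model_clear_for_coredump(struct model_task *t)
{
	t->flags &= ~(MODEL_TIF_SIGPENDING | MODEL_TIF_NOTIFY_SIGNAL);
}
```

In this model, a task with the notify bit set still reports a pending
signal after model_clear_sigpending_only(), and stops reporting one only
after model_clear_for_coredump().  Nothing stops a later
model_queue_notify() from setting the bit again, which is one reason
clearing alone may not be sufficient.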

Eric
