Linux-mm Archive mirror
 help / color / mirror / Atom feed
From: Jeff Xu <jeffxu@chromium.org>
To: "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 jeffxu@chromium.org, keescook@chromium.org, jannh@google.com,
	 sroettger@google.com, willy@infradead.org,
	gregkh@linuxfoundation.org,  torvalds@linux-foundation.org,
	usama.anjum@collabora.com, corbet@lwn.net,  surenb@google.com,
	merimus@google.com, rdunlap@infradead.org,  jeffxu@google.com,
	jorgelo@chromium.org, groeck@chromium.org,
	 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	 linux-mm@kvack.org, pedro.falcato@gmail.com,
	dave.hansen@intel.com,  linux-hardening@vger.kernel.org,
	deraadt@openbsd.org
Subject: Re: [PATCH v10 0/5] Introduce mseal
Date: Wed, 15 May 2024 10:18:17 -0700	[thread overview]
Message-ID: <CABi2SkXBpL8qdSMTwe5njWasqidsWDkhme6xw2_38JARrhPRwA@mail.gmail.com> (raw)
In-Reply-To: <d46qb2rkfsagw22u6ishgagsvsmqsu5nrmf5up5mhi6xrwolyt@6ir6g2v63of7>

On Tue, May 14, 2024 at 2:28 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Andrew Morton <akpm@linux-foundation.org> [240514 13:47]:
> > On Mon, 15 Apr 2024 16:35:19 +0000 jeffxu@chromium.org wrote:
> >
> > > This patchset proposes a new mseal() syscall for the Linux kernel.
> >
> > I have not moved this into mm-stable for a 6.10 merge.  Mainly because
> > of the total lack of Reviewed-by:s and Acked-by:s.
> >
> > The code appears to be stable enough for a merge.
> >
> > It's awkward that we're in conference this week, but I ask people to
> > give consideration to the desirability of moving mseal() into mainline
> > sometime over the next week, please.
>
> I have looked at this code a fair bit at this point, but I wanted to get
> some clarification on the fact that we now have mseal doing checks
> upfront while others fail in the middle.
>
> mbind:
>                 /*
>                  * If any vma in the range got policy other than MPOL_BIND
>                  * or MPOL_PREFERRED_MANY we return error. We don't reset
>                  * the home node for vmas we already updated before.
>                  */
>
>
> mlock:
> mlock will abort (through one path), when it sees a gap in mapped areas,
> but will not undo what it did so far.
>
> mlock_fixup can fail on vma_modify_flags(), but previous vmas are not
> updated.  This can fail due to allocation failures on the splitting of
> VMAs (or failed merging).  The allocations could happen before, but this
> is more work (but doable, no doubt).
>
> userfaultfd is... complicated.
>
> And even munmap() can fail and NOT undo the previous split(s).
> mmap.c:
>                         /*
>                          * If userfaultfd_unmap_prep returns an error the vmas
>                          * will remain split, but userland will get a
>                          * highly unexpected error anyway. This is no
>                          * different than the case where the first of the two
>                          * __split_vma fails, but we don't undo the first
>                          * split, despite we could. This is unlikely enough
>                          * failure that it's not worth optimizing it for.
>                          */
>
>
> But this one is different form the others..
> madvise:
>         /*
>          * If the interval [start,end) covers some unmapped address
>          * ranges, just ignore them, but return -ENOMEM at the end.
>          * - different from the way of handling in mlock etc.
>          */
>
Thanks.

The current mseal patch does up-front checking in two different situations:
1 when applying mseal()
   Checking for unallocated memory in the given memory range.

2 When checking mseal flag during mprotect/unmap/remap/mmap
  Checking mseal flag is placed ahead of the main business logic, and
treated the same as input arguments check.

> Either we are planning to clean this up and do what we can up-front, or
> just move the mseal check with the rest.  Otherwise we are making a
> larger mess with more technical dept for a single user, and I think this
> is not an acceptable trade-off.
>
The sealing use case  is different  from regular mm API and this
didn't create additional technical debt.  Please allow me to explain
those separately.

The main use case and threat model is that an attacker exploits a
vulnerability and has arbitrary write access to the process, and can
manipulate some arguments to syscalls from some threads. Placing the
checking of mseal flag ahead of mprotect main business logic is
stricter compared with doing it in-place. It is meant to be harder for
the attacker, e.g. blocking the  opportunistically attempt of munmap
by modifying the size argument.

The legit app code won't call mprotect/munmap on sealed memory.  It is
irrelevant for both precheck and in-place check approaches, from a
legit app code point of view.

The regular mm API (other than sealing)'s user-case is  for legit
code, a legit code knows the whole picture of memory mappings and is
unlikely to rely on the opportunist return.

The user-cases are different, I hope we look into the difference and
not force them into one direction.

About tech debt, code-wise , placing pre-check ahead of the main
business logic of mprotect/munmap APIs, reduces the size of code
change, and is easy to carry from release to release, or backporting.

But let's compare  the alternatives - doing it in-place without precheck.
- munmap
munmap calls arch_unmap(mm, start, end) ahead of main business logic,
the checking of sealing flags would need to be architect specific. In
addition, if arch_unmap return fails due to sealing, the code should
still proceed, till the main business logic fails again.

- mremap/mmap
The check of sealing would be scattered, e.g. checking the src address
range in-place, dest arrange in-place, unmap in-place, etc. The code
is complex and prone to error.

-mprotect/madvice
Easy to change to in-place.

- mseal
mseal() check unallocated memory in the given memory range in the
pre-check. Easy to change to in-place (same as mprotect)

The situation in munmap and mremap/mmap make in-place checks less desirable imo.

> Considering the benchmarks that were provided, performance arguments
> seem like they are not a concern.
>
Yes. Performance is not a factor in making a design choice on this.

> I want to know if we are planning to sort and move existing checks if we
> proceed with this change?
>
I would argue that we should not change the existing mm code. mseal is
new and no backward compatible problem. That is not the case for
mprotect and other mm api. E.g. if we were to change mprotect to add a
precheck for memory gap, some badly written application might break.

The 'atomic' approach is also really difficult to enforce to the whole
MM area, mseal() doesn't claim it is atomic. Most regular mm API might
go deeper in mm data structure to update page tables and HW, etc. The
rollback in handling those error cases, and performance cost. I'm not
sure if the benefit is worth the cost. However, atomicity is another
topic to discuss unrelated to mm sealing.  The current design of mm
sealing is due to its use case and practical coding reason.

Thanks
-Jeff

> Thanks,
> Liam


  reply	other threads:[~2024-05-15 17:18 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-15 16:35 [PATCH v10 0/5] Introduce mseal jeffxu
2024-04-15 16:35 ` [PATCH v10 1/5] mseal: Wire up mseal syscall jeffxu
2024-04-15 18:12   ` Muhammad Usama Anjum
2024-04-15 18:21     ` Linus Torvalds
2024-04-15 19:06       ` Jeff Xu
2024-04-15 16:35 ` [PATCH v10 2/5] mseal: add " jeffxu
2024-04-16 14:59   ` Liam R. Howlett
2024-04-16 15:17     ` Jann Horn
2024-04-16 16:42     ` Theo de Raadt
2024-04-15 16:35 ` [PATCH v10 3/5] selftest mm/mseal memory sealing jeffxu
2024-04-15 18:32   ` Muhammad Usama Anjum
2024-04-15 20:27     ` Jeff Xu
2024-04-16  0:34       ` Kees Cook
2024-05-02 11:24   ` Ryan Roberts
2024-05-02 15:18     ` Jeff Xu
2024-05-02 22:39     ` Jeff Xu
2024-05-03  8:30       ` Ryan Roberts
2024-04-15 16:35 ` [PATCH v10 4/5] mseal:add documentation jeffxu
2024-04-15 16:35 ` [PATCH v10 5/5] selftest mm/mseal read-only elf memory segment jeffxu
2024-04-16 15:13 ` [PATCH v10 0/5] Introduce mseal Liam R. Howlett
2024-04-16 19:40   ` Jeff Xu
2024-04-18 20:19     ` Suren Baghdasaryan
2024-04-19  1:22       ` Jeff Xu
2024-04-19 14:57         ` Suren Baghdasaryan
2024-04-19 15:14           ` Jeff Xu
2024-04-19 16:54             ` Suren Baghdasaryan
2024-04-19 17:59         ` Pedro Falcato
2024-04-20  1:23           ` Jeff Xu
2024-05-14 17:46 ` Andrew Morton
2024-05-14 19:52   ` Kees Cook
2024-05-23 23:32     ` Kees Cook
2024-05-23 23:54       ` Andrew Morton
2024-05-24 15:19         ` Linus Torvalds
2024-05-14 20:59   ` Jonathan Corbet
2024-05-14 21:28     ` Matthew Wilcox
2024-05-14 22:48       ` Theo de Raadt
2024-05-14 23:01         ` Andrew Morton
2024-05-14 23:47           ` Theo de Raadt
2024-05-15  2:58             ` Willy Tarreau
2024-05-15  3:36               ` Linus Torvalds
2024-05-15  4:14                 ` Linus Torvalds
2024-05-15  6:14                   ` Willy Tarreau
2024-05-15  0:43         ` Linus Torvalds
2024-05-15  0:57           ` Theo de Raadt
2024-05-15  1:20             ` Linus Torvalds
2024-05-15  1:47               ` Theo de Raadt
2024-05-15  2:28                 ` Linus Torvalds
2024-05-15  2:42                   ` Theo de Raadt
2024-05-15  4:53                     ` Liam R. Howlett
2024-05-14 21:28   ` Liam R. Howlett
2024-05-15 17:18     ` Jeff Xu [this message]
2024-05-15 22:19       ` Liam R. Howlett
2024-05-16  0:59         ` Jeff Xu
2024-05-21 16:00           ` Liam R. Howlett
2024-05-21 20:55             ` Jeff Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABi2SkXBpL8qdSMTwe5njWasqidsWDkhme6xw2_38JARrhPRwA@mail.gmail.com \
    --to=jeffxu@chromium.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@intel.com \
    --cc=deraadt@openbsd.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=groeck@chromium.org \
    --cc=jannh@google.com \
    --cc=jeffxu@google.com \
    --cc=jorgelo@chromium.org \
    --cc=keescook@chromium.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=merimus@google.com \
    --cc=pedro.falcato@gmail.com \
    --cc=rdunlap@infradead.org \
    --cc=sroettger@google.com \
    --cc=surenb@google.com \
    --cc=torvalds@linux-foundation.org \
    --cc=usama.anjum@collabora.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).