All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "Stas Sergeev" <stsp2@yandex.ru>,
	"Aleksa Sarai" <cyphar@cyphar.com>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	linux-kernel@vger.kernel.org,
	"Stefan Metzmacher" <metze@samba.org>,
	"Eric Biederman" <ebiederm@xmission.com>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Andy Lutomirski" <luto@kernel.org>, "Jan Kara" <jack@suse.cz>,
	"Jeff Layton" <jlayton@kernel.org>,
	"Chuck Lever" <chuck.lever@oracle.com>,
	"Alexander Aring" <alex.aring@gmail.com>,
	"David Laight" <David.Laight@aculab.com>,
	linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Christian Göttsche" <cgzones@googlemail.com>
Subject: Re: [PATCH v5 0/3] implement OA2_CRED_INHERIT flag for openat2()
Date: Mon, 29 Apr 2024 11:12:39 +0200	[thread overview]
Message-ID: <20240429-donnerstag-behilflich-a083311d8e00@brauner> (raw)
In-Reply-To: <CALCETrUL3zXAX94CpcQYwj1omwO+=-1Li+J7Bw2kpAw4d7nsyw@mail.gmail.com>

On Sun, Apr 28, 2024 at 09:41:20AM -0700, Andy Lutomirski wrote:
> > On Apr 26, 2024, at 6:39 AM, Stas Sergeev <stsp2@yandex.ru> wrote:
> > This patch-set implements the OA2_CRED_INHERIT flag for openat2() syscall.
> > It is needed to perform an open operation with the creds that were in
> > effect when the dir_fd was opened, if the dir was opened with O_CRED_ALLOW
> > flag. This allows the process to pre-open some dirs and switch eUID
> > (and other UIDs/GIDs) to the less-privileged user, while still retaining
> > the possibility to open/create files within the pre-opened directory set.
> >
> 
> I’ve been contemplating this, and I want to propose a different solution.
> 
> First, the problem Stas is solving is quite narrow and doesn’t
> actually need kernel support: if I want to write a user program that
> sandboxes itself, I have at least three solutions already.  I can make
> a userns and a mountns; I can use landlock; and I can have a separate
> process that brokers filesystem access using SCM_RIGHTS.
> 
> But what if I want to run a container, where the container can access
> a specific host directory, and the contained application is not aware
> of the exact technology being used?  I recently started using
> containers in anger in a production setting, and “anger” was
> definitely the right word: binding part of a filesystem in is
> *miserable*.  Getting the DAC rules right is nasty.  LSMs are worse.

Nowadays it's extremely simple due tue open_tree(OPEN_TREE_CLONE) and
move_mount(). I rewrote the bind-mount logic in systemd based on that
and util-linux uses that as well now.
https://brauner.io/2023/02/28/mounting-into-mount-namespaces.html

> Podman’s “bind,relabel” feature is IMO utterly disgusting.  I think I
> actually gave up on making one of my use cases work on a Fedora
> system.
> 
> Here’s what I wanted to do, logically, in production: pick a host
> directory, pick a host *principal* (UID, GID, label, etc), and have
> the *entire container* access the directory as that principal. This is
> what happens automatically if I run the whole container as a userns
> with only a single UID mapped, but I don’t really want to do that for
> a whole variety and of reasons.

You're describing idmapped mounts for the most part which are upstream
and are used in exactly that way by a lot of userspace.

> 
> So maybe reimagining Stas’ feature a bit can actually solve this
> problem.  Instead of a special dirfd, what if there was a special
> subtree (in the sense of open_tree) that captures a set of creds and
> does all opens inside the subtree using those creds?

That would mean override creds in the VFS layer when accessing a
specific subtree which is a terrible idea imho. Not just because it will
quickly become a potential dos when you do that with a lot of subtrees
it will also have complex interactions with overlayfs.

> 
> This isn’t a fully formed proposal, but I *think* it should be
> generally fairly safe for even an unprivileged user to clone a subtree
> with a specific flag set to do this. Maybe a capability would be
> needed (CAP_CAPTURE_CREDS?), but it would be nice to allow delegating
> this to a daemon if a privilege is needed, and getting the API right
> might be a bit tricky.
> 
> Then two different things could be done:
> 
> 1. The subtree could be used unmounted or via /proc magic links. This
> would be for programs that are aware of this interface.
> 
> 2. The subtree could be mounted, and accessed through the mount would
> use the captured creds.
> 
> (Hmm. What would a new open_tree() pointing at this special subtree do?)
> 
> 
> With all this done, if userspace wired it up, a container user could
> do something like:
> 
> —bind-capture-creds source=dest
> 
> And the contained program would access source *as the user who started
> the container*, and this would just work without relabeling or
> fiddling with owner uids or gids or ACLs, and it would continue to
> work even if the container has multiple dynamically allocated subuids
> mapped (e.g. one for “root” and one for the actual application).
> 
> Bonus points for the ability to revoke the creds in an already opened
> subtree. Or even for the creds to automatically revoke themselves when
> the opener exits (or maybe when a specific cred-pinning fd goes away).
> 
> (This should work for single files as well as for directories.)
> 
> New LSM hooks or extensions of existing hooks might be needed to make
> LSMs comfortable with this.
> 
> What do you all think?

I think the problem you're describing is already mostly solved.

  parent reply	other threads:[~2024-04-29  9:12 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-26 13:33 [PATCH v5 0/3] implement OA2_CRED_INHERIT flag for openat2() Stas Sergeev
2024-04-26 13:33 ` [PATCH v5 1/3] fs: reorganize path_openat() Stas Sergeev
2024-04-26 13:33 ` [PATCH v5 2/3] open: add O_CRED_ALLOW flag Stas Sergeev
2024-04-27  2:12   ` kernel test robot
2024-04-26 13:33 ` [PATCH v5 3/3] openat2: add OA2_CRED_INHERIT flag Stas Sergeev
2024-04-28 16:41 ` [PATCH v5 0/3] implement OA2_CRED_INHERIT flag for openat2() Andy Lutomirski
2024-04-28 17:39   ` stsp
2024-04-28 19:15     ` stsp
2024-04-28 20:19     ` Andy Lutomirski
2024-04-28 21:14       ` stsp
2024-04-28 21:30         ` Andy Lutomirski
2024-04-28 22:12           ` stsp
2024-04-29  1:12             ` stsp
2024-04-29  9:12   ` Christian Brauner [this message]
2024-05-06  7:13   ` Aleksa Sarai
2024-05-06 17:29     ` Andy Lutomirski
2024-05-06 17:34       ` Andy Lutomirski
2024-05-06 19:34       ` David Laight
2024-05-06 21:53         ` Andy Lutomirski
2024-05-07  7:42       ` Christian Brauner
2024-05-07 20:38         ` Andy Lutomirski
2024-05-08  7:32           ` Christian Brauner
2024-05-08 17:30             ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240429-donnerstag-behilflich-a083311d8e00@brauner \
    --to=brauner@kernel.org \
    --cc=David.Laight@aculab.com \
    --cc=alex.aring@gmail.com \
    --cc=cgzones@googlemail.com \
    --cc=chuck.lever@oracle.com \
    --cc=cyphar@cyphar.com \
    --cc=ebiederm@xmission.com \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=metze@samba.org \
    --cc=pbonzini@redhat.com \
    --cc=serge@hallyn.com \
    --cc=stsp2@yandex.ru \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.