From: Sargun Dhillon <sargun-GaZTRHToo+CzQB+pC5nmwQ@public.gmane.org>
To: Alexei Starovoitov
	<alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>,
	netdev <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Linux Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	"David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>,
	Lorenzo Colitti <lorenzo-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH net-next 0/3] eBPF Seccomp filters
Date: Fri, 16 Feb 2018 10:39:24 -0800
Message-ID: <CAMp4zn-6bKkc539fihgVBbdYW9_ps4ARG_hKwWvcVG0x-hR=7Q__48696.2332194254$1518806317$gmane$org@mail.gmail.com>
In-Reply-To: <20180215043027.zssmhvfdn7iz3rlz-+o4/htvd0TCa6kscz5V53/3mLCh9rsb+VpNB7YpNyf8@public.gmane.org>

On Wed, Feb 14, 2018 at 8:30 PM, Alexei Starovoitov
<alexei.starovoitov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Wed, Feb 14, 2018 at 10:32:22AM -0700, Tycho Andersen wrote:
>> > >
>> > > What's the reason for adding eBPF support? seccomp shouldn't need it,
>> > > and it only makes the code more complex. I'd rather stick with cBPF
>> > > until we have an overwhelmingly good reason to use eBPF as a "native"
>> > > seccomp filter language.
>> > >
>> >
>> > I can think of two fairly strong use cases for eBPF's ability to call
>> > functions: logging and Tycho's user notifier thing.
>>
>> Worth noting that there is one additional thing that I didn't
>> implement, but which would be nice and is probably not possible with
>> eBPF (at least, not without a bunch of additional infrastructure):
>> passing fds back to the tracee from the manager if you intercept
>> socket(), or accept() or something.
>>
>> This could again be accomplished via other means, though it would be a
>> lot nicer to have a primitive for it.
>
> There is the bpf_perf_event_output() interface, which allows streaming
> arbitrary data from the kernel into user space via the perf ring buffer.
> User space can epoll on it. We use this in both tracing and networking
> for notifications and streaming data transfers.
> I suspect this can be used for 'logging' too, since it's cheap and fast.
>
> Specifically for android we added bpf_lpm, cookie/uid helpers,
> and read-only maps.
> Lorenzo,
> there was a claim in this thread that bpf is disabled on android.
> Can you please clarify ?
> If it's actually disabled and there is no intent to enable it,
> I'd rather not add any more android specific features to bpf.
>
> What I think is important to understand is that BPF goes through
> very active development. The verifier is constantly getting smarter.
> There is work to add bounded loops, lock/unlock, get/put tracking,
> global/percpu variables, dynamic linking and so on.
> Most of the features are available to root only and unpriv
> has very limited set. Like getting bpf_perf_event_output() to work
> for unpriv will likely require additional verifier work.
>
> So all cool bits will not be usable by seccomp+eBPF and unpriv
> on day one. It's not a lot of work either, but once it's done
> I'd hate to see arguments against adding more verifier features
> just because eBPF is used by seccomp/landlock/other_security_thing.
>
> Also I think the argument that seccomp+eBPF will be faster than
> seccomp+cBPF is a weak one. I bet kpti on/off makes no difference
> under seccomp, since _all_ syscalls are already slow for sandboxed app.
> Instead of making seccomp 5% faster with eBPF, I think it's
> worth looking into extending LSM hooks to cover all syscalls and
> have programmable (bpf or whatever) filtering applied per syscall.
> Like we can have a white list syscall table covered by lsm hooks
> and any other syscall will get into old seccomp-style
> filtering category automatically.
> lsm+bpf would need to follow process hierarchy. It shouldn't be
> a runtime check at syscall entry either, but compile time
> extra branch in SYSCALL_DEFINE for non-whitelisted syscalls.
> There are bunch of other things to figure out, but I think
> the perf win will be bigger than replacing cBPF with eBPF in
> existing seccomp.
>
Given this test program:
for (i = 10; i < 99999999; i++) syscall(__NR_getpid);

If I implement an eBPF filter using PROG_ARRAYs and tail calls, the
numbers are:
eBPF, JIT:       12.3% slower than native
eBPF, no JIT:    13.6% slower than native
seccomp, JIT:    17.6% slower than native
seccomp, no JIT: 37% slower than native

This is using libseccomp to generate the standard seccomp (cBPF)
program. There's no reasonable way for our workload to know which
syscalls come "earlier", so we can't take advantage of that
optimization. Potentially, libseccomp could be smarter about ordering
cases (using ranges) and use an O(log(n)) search algorithm, but both of
these are micro-optimizations whose cost still scales with the number
of syscalls and per-syscall rules. The nicety of using a PROG_ARRAY is
that adding additional filters (syscalls) comes at no cost, whereas
there's a tradeoff any time you add another rule to a traditional
seccomp filter.
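
To make the scaling argument concrete, here is a rough userspace
analogy. It is not kernel code and not what the patch set does: the
names (linear_filter, indexed_filter, handler_fn, TABLE_SLOTS, the
verdict enum) are all made up for illustration. The first function
models a filter that compares the syscall number against each rule in
turn; the second models a PROG_ARRAY-style dispatch where the syscall
number selects a slot directly.

```c
#include <stddef.h>

/* Hypothetical verdicts; real seccomp filters return SECCOMP_RET_* values. */
enum verdict { VERDICT_KILL = 0, VERDICT_ALLOW = 1 };

/*
 * Linear model: one comparison per rule until a match is found, similar
 * in spirit to a chain of cBPF "jump if equal" instructions. The number
 * of comparisons performed is reported through *comparisons.
 */
static enum verdict linear_filter(const int *allowed, size_t n, int nr,
                                  size_t *comparisons)
{
    *comparisons = 0;
    for (size_t i = 0; i < n; i++) {
        (*comparisons)++;
        if (allowed[i] == nr)
            return VERDICT_ALLOW;
    }
    return VERDICT_KILL; /* no rule matched */
}

/*
 * Indexed model: the syscall number picks a slot directly, loosely like
 * a tail call into a PROG_ARRAY. An empty or out-of-range slot falls
 * through to the default verdict, so adding a handler does not add work
 * for any other syscall.
 */
#define TABLE_SLOTS 512 /* assumed upper bound on syscall numbers */

typedef enum verdict (*handler_fn)(void);

static enum verdict allow_handler(void)
{
    return VERDICT_ALLOW;
}

static enum verdict indexed_filter(const handler_fn *table, int nr)
{
    if (nr < 0 || nr >= TABLE_SLOTS || table[nr] == NULL)
        return VERDICT_KILL;
    return table[nr]();
}
```

With a four-entry allow list, a lookup for the last entry (or for a
syscall that is not listed) costs four comparisons in the linear model,
while the indexed model does a single bounds check and slot load
regardless of how many slots are populated.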

This was tested on an Amazon M4.16XL instance running with PCID and KPTI.
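
For anyone who wants to collect comparable numbers, a minimal timing
harness might look like the sketch below. It is an assumption about how
such a measurement could be set up, not the exact program used for the
figures above: time_getpid_loop and slowdown_percent are hypothetical
helper names, and installing the filter under test (cBPF or eBPF) before
the second run is left out.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Issue `iterations` raw getpid syscalls and return the elapsed
 * wall-clock time in nanoseconds, measured with CLOCK_MONOTONIC.
 */
static uint64_t time_getpid_loop(long iterations)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getpid);
    clock_gettime(CLOCK_MONOTONIC, &end);

    /* Do the subtraction in signed 64-bit so a smaller tv_nsec is fine. */
    int64_t ns = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
               + (int64_t)(end.tv_nsec - start.tv_nsec);
    return (uint64_t)ns;
}

/*
 * Express a filtered run relative to an unfiltered ("native") baseline,
 * as a percentage slowdown.
 */
static double slowdown_percent(uint64_t baseline_ns, uint64_t filtered_ns)
{
    return ((double)filtered_ns - (double)baseline_ns)
           / (double)baseline_ns * 100.0;
}
```

The idea would be to call time_getpid_loop() once without any filter,
once more after loading the filter under test, and feed both results to
slowdown_percent(). Several repetitions per configuration are advisable,
since a single run is noisy.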
