From: Raman Shukhau <ramasha@meta.com>
To: Martin KaFai Lau <martin.lau@linux.dev>
Cc: "bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"ast@kernel.org" <ast@kernel.org>,
"andrii@kernel.org" <andrii@kernel.org>,
"daniel@iogearbox.net" <daniel@iogearbox.net>
Subject: Re: [PATCH bpf-next 1/1] Fix for bpf_sysctl_set_new_value
Date: Thu, 16 May 2024 21:16:50 +0000 [thread overview]
Message-ID: <0A9C587D-A524-4206-BDBB-C27515606DB4@fb.com> (raw)
In-Reply-To: <ca8136e0-5d2a-402b-ad03-cc8a218affd4@linux.dev>
> btw, I am curious what is missed in the test_sysctl.c that didn't catch the return value case?
Test didn’t check new sysctl value, only if return code is successful. In this case it silently ignores new value.
> From looking at how new_updated is set, my understanding is new_len cannot be 0 here. just want to double check.
bpf_sysctl_set_new_value checks that new_len is not 0, otherwise returns EINVAL
> On May 7, 2024, at 4:20 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
>
> On 5/4/24 3:23 AM, Raman Shukhau wrote:
>> Noticed that call to bpf_sysctl_set_new_value doesn't change final value
>> of the parameter, when called from cgroup/syscall bpf handler. No error
>> thrown in this case, new value is simply ignored and original value, sent
>> to sysctl, is set. Example (see test added to this change for BPF handler
>> logic):
>> sysctl -w net.ipv4.ip_local_reserved_ports = 11111
>> ... cgroup/syscal handler call bpf_sysctl_set_new_value and set 22222
>> sysctl net.ipv4.ip_local_reserved_ports
>> ... returns 11111
>> On investigation I found 2 things that needs to be changed:
>> * return value check
>> * new_len provided by bpf back to sysctl. proc_sys_call_handler expects
>> this value NOT to include \0 symbol, e.g. if user do
>
> Thanks for the report and the patch.
>
> This patch is changing a few things (1 fix, 1 improvement, 1 test).
>
> Separate these individual changes into its own patch. Patch 1 fixes the return value. Patch 2 improves the '\0' and *pcount situation. Patch 3 adds the test.
>
> btw, I am curious what is missed in the test_sysctl.c that didn't catch the return value case?
>
>> ```
>> open("/proc/sys/net/ipv4/ip_local_reserved_ports", ...)
>> write(fd, "11111", sizeof("22222"))
>> ```
>> or `echo -n "11111" > /proc/sys/net/ipv4/ip_local_reserved_ports`
>> or `sysctl -w net.ipv4.ip_local_reserved_ports=11111
>> proc_sys_call_handler receives count equal to `5`. To make it consistent
>> with bpf_sysctl_set_new_value, this change also adjust `new_len` with
>> `-1`, if `\0` passed as last character. Alternatively, using
>> `sizeof("11111") - 1` in BPF handler should work, but it might not be
>> obvious and spark confusion. Note: if incorrect count is used, sysctl
>> returns EINVAL to the user.
>> Signed-off-by: Raman Shukhau <ramasha@fb.com>
>> ---
>> kernel/bpf/cgroup.c | 7 ++-
>> .../bpf/progs/test_sysctl_overwrite.c | 47 +++++++++++++++++++
>> tools/testing/selftests/bpf/test_sysctl.c | 35 +++++++++++++-
>> 3 files changed, 85 insertions(+), 4 deletions(-)
>> create mode 100644 tools/testing/selftests/bpf/progs/test_sysctl_overwrite.c
>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
>> index 8ba73042a239..23736aed1b53 100644
>> --- a/kernel/bpf/cgroup.c
>> +++ b/kernel/bpf/cgroup.c
>> @@ -1739,10 +1739,13 @@ int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head,
>> kfree(ctx.cur_val);
>> - if (ret == 1 && ctx.new_updated) {
>> + if (ret == 0 && ctx.new_updated) {
>> kfree(*buf);
>> *buf = ctx.new_val;
>> - *pcount = ctx.new_len;
>> + if (!(*buf)[ctx.new_len])
>> + *pcount = ctx.new_len - 1;
>
> From looking at how new_updated is set, my understanding is new_len cannot be 0 here. just want to double check.
>
>
>> + else
>> + *pcount = ctx.new_len;
>> } else {
>> kfree(ctx.new_val);
>> }
>> diff --git a/tools/testing/selftests/bpf/progs/test_sysctl_overwrite.c b/tools/testing/selftests/bpf/progs/test_sysctl_overwrite.c
>> new file mode 100644
>> index 000000000000..e44b429fcfc1
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/test_sysctl_overwrite.c
>> @@ -0,0 +1,47 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (c) 2019 Facebook
>> +
>> +#include <string.h>
>> +
>> +#include <linux/bpf.h>
>> +
>> +#include <bpf/bpf_helpers.h>
>> +
>> +#include "bpf_compiler.h"
>> +
>> +static const char sysctl_value[] = "31337";
>> +static const char sysctl_name[] = "net/ipv4/ip_local_reserved_ports";
>> +static __always_inline int is_expected_name(struct bpf_sysctl *ctx)
>> +{
>> + unsigned char i;
>> + char name[sizeof(sysctl_name)];
>> + int ret;
>> +
>> + memset(name, 0, sizeof(name));
>> + ret = bpf_sysctl_get_name(ctx, name, sizeof(name), 0);
>> + if (ret < 0 || ret != sizeof(sysctl_name) - 1)
>> + return 0;
>> +
>> + __pragma_loop_unroll_full
>> + for (i = 0; i < sizeof(sysctl_name); ++i)
>> + if (name[i] != sysctl_name[i])
>
> bpf_strncmp() should be useful here.
>
>> + return 0;
>> +
>> + return 1;
>> +}
>> +
>> +SEC("cgroup/sysctl")
>> +int test_value_overwrite(struct bpf_sysctl *ctx)
>> +{
>> + if (!ctx->write)
>> + return 1;
>> +
>> + if (!is_expected_name(ctx))
>> + return 0;
>> +
>> + if (bpf_sysctl_set_new_value(ctx, sysctl_value, sizeof(sysctl_value)) == 0)
>> + return 1;
>> + return 0;
>> +}
>> +
>> +char _license[] SEC("license") = "GPL";
>> diff --git a/tools/testing/selftests/bpf/test_sysctl.c b/tools/testing/selftests/bpf/test_sysctl.c
>> index bcdbd27f22f0..dfa479861d3a 100644
>> --- a/tools/testing/selftests/bpf/test_sysctl.c
>> +++ b/tools/testing/selftests/bpf/test_sysctl.c
>> @@ -35,6 +35,7 @@ struct sysctl_test {
>> int seek;
>> const char *newval;
>> const char *oldval;
>> + const char *updval;
>> enum {
>> LOAD_REJECT,
>> ATTACH_REJECT,
>> @@ -1395,6 +1396,16 @@ static struct sysctl_test tests[] = {
>> .open_flags = O_RDONLY,
>> .result = SUCCESS,
>> },
>> + {
>> + "C prog: override write to ip_local_reserved_ports",
>> + .prog_file = "./test_sysctl_overwrite.bpf.o",
>
> test_sysctl.c is not run in bpf CI. It is not very useful to extend this test further. Lets take this chance to create a new progs/cgrp_sysctl.c test that will be exercised by ./test_progs in bpf CI. Then it can use the newer skel open_and_load also.
>
> Not asking to to migrate the existing tests in test_sysctl.c to the new progs/cgrp_sysctl.c in this patch set. The new cgrp_sysctl.c can only have the tests that exercise the changes in this patch set. However, it will be useful if progs/cgrp_sysctl.c can be bootstrapped in a way that the future test_sysctl.c migration will be easier. I also wouldn't worry too much on the existing raw insns tests in test_sysctl.c for now. They will need to be moved to either C or bpf asm in the future.
>
> pw-bot: cr
>
>> + .attach_type = BPF_CGROUP_SYSCTL,
>> + .sysctl = "net/ipv4/ip_local_reserved_ports",
>> + .open_flags = O_RDWR,
>> + .newval = "11111",
>> + .updval = "31337",
>> + .result = SUCCESS,
>> + },
>> };
>> static size_t probe_prog_length(const struct bpf_insn *fp)
>> @@ -1520,13 +1531,33 @@ static int access_sysctl(const char *sysctl_path,
>> log_err("Read value %s != %s", buf, test->oldval);
>> goto err;
>> }
>> - } else if (test->open_flags == O_WRONLY) {
>> + } else if (test->open_flags == O_WRONLY || test->open_flags == O_RDWR) {
>> if (!test->newval) {
>> log_err("New value for sysctl is not set");
>> goto err;
>> }
>> - if (write(fd, test->newval, strlen(test->newval)) == -1)
>> + if (write(fd, test->newval, strlen(test->newval)) == -1) {
>> + log_err("Unable to write sysctl value");
>> goto err;
>> + }
>> + if (test->open_flags == O_RDWR) {
>> + char buf[128];
>> +
>> + if (!test->updval) {
>> + log_err("Expected value for sysctl is not set");
>> + goto err;
>> + }
>> +
>> + lseek(fd, 0, SEEK_SET);
>> + if (read(fd, buf, sizeof(buf)) == -1) {
>> + log_err("Unable to read updated value");
>> + goto err;
>> + }
>> + if (strncmp(buf, test->updval, strlen(test->updval))) {
>> + log_err("Overwritten value %s != %s", buf, test->updval);
>> + goto err;
>> + }
>> + }
>> } else {
>> log_err("Unexpected sysctl access: neither read nor write");
>> goto err;
>
prev parent reply other threads:[~2024-05-16 21:16 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-04 10:23 [PATCH bpf-next 0/1] Fix for bpf_sysctl_set_new_value Raman Shukhau
2024-05-04 10:23 ` [PATCH bpf-next 1/1] " Raman Shukhau
2024-05-07 23:20 ` Martin KaFai Lau
2024-05-16 21:16 ` Raman Shukhau [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0A9C587D-A524-4206-BDBB-C27515606DB4@fb.com \
--to=ramasha@meta.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=martin.lau@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).