From: Raman Shukhau <ramasha@meta.com>
To: Martin KaFai Lau <martin.lau@linux.dev>
Cc: "bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"ast@kernel.org" <ast@kernel.org>,
	"andrii@kernel.org" <andrii@kernel.org>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>
Subject: Re: [PATCH bpf-next 1/1] Fix for bpf_sysctl_set_new_value
Date: Thu, 16 May 2024 21:16:50 +0000
Message-ID: <0A9C587D-A524-4206-BDBB-C27515606DB4@fb.com>
In-Reply-To: <ca8136e0-5d2a-402b-ad03-cc8a218affd4@linux.dev>

> btw, I am curious what is missed in the test_sysctl.c that didn't catch the return value case?

The test didn’t check the resulting sysctl value, only that the return code indicated success. In this case the new value is silently ignored.
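
A minimal sketch of the kind of check that was missing: write a value, then read the file back and compare it with the expected result instead of trusting the return code of write() alone. The helper name write_and_verify and its return convention are hypothetical and are not part of test_sysctl.c.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of test_sysctl.c): write newval to path,
 * then read the file back and compare the result against expected.
 * Returns 0 on match, 1 on mismatch, -1 on I/O error.
 */
static int write_and_verify(const char *path, const char *newval,
			    const char *expected)
{
	char buf[128] = {0};
	ssize_t n;
	int ret = -1;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	/* A successful write alone says nothing about the stored value. */
	if (write(fd, newval, strlen(newval)) < 0)
		goto out;

	/* Rewind and read back what was actually stored. */
	if (lseek(fd, 0, SEEK_SET) < 0)
		goto out;
	n = read(fd, buf, sizeof(buf) - 1);
	if (n < 0)
		goto out;

	/* Prefix compare, so a trailing newline in the file is tolerated. */
	ret = strncmp(buf, expected, strlen(expected)) ? 1 : 0;
out:
	close(fd);
	return ret;
}
```

With a BPF handler attached that overrides the write, one would expect something like write_and_verify(path, "11111", "31337") to return 0 once the override takes effect, and 1 while the new value is still being ignored.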


> From looking at how new_updated is set, my understanding is new_len cannot be 0 here. just want to double check.

bpf_sysctl_set_new_value checks that new_len is not 0; otherwise it returns -EINVAL.
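
A minimal sketch of why the non-zero guarantee matters for the trailing '\0' adjustment: subtracting 1 is only safe when the length is at least 1. The helper count_without_nul is hypothetical and is not the kernel code. It inspects the last of the new_len bytes, which is an assumption about the intent and differs from the indexing used in the quoted diff.

```c
#include <stddef.h>

/*
 * Hypothetical helper illustrating the idea of the proposed change:
 * return the byte count to pass on, dropping one trailing '\0' if the
 * last of the new_len bytes is a terminator.
 * The caller must guarantee new_len > 0, otherwise new_len - 1 wraps.
 */
static size_t count_without_nul(const char *val, size_t new_len)
{
	if (val[new_len - 1] == '\0')
		return new_len - 1;
	return new_len;
}
```

Under this sketch, passing sizeof("31337") and passing sizeof("31337") - 1 both yield a count of 5, which is what the description says proc_sys_call_handler expects.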



> On May 7, 2024, at 4:20 PM, Martin KaFai Lau <martin.lau@linux.dev> wrote:
> 
> 
> On 5/4/24 3:23 AM, Raman Shukhau wrote:
>> Noticed that a call to bpf_sysctl_set_new_value doesn't change the final value
>> of the parameter when called from a cgroup/sysctl bpf handler. No error is
>> thrown in this case; the new value is simply ignored and the original value, sent
>> to sysctl, is set. Example (see test added to this change for BPF handler
>> logic):
>> sysctl -w net.ipv4.ip_local_reserved_ports=11111
>> ... cgroup/sysctl handler calls bpf_sysctl_set_new_value and sets 22222
>> sysctl net.ipv4.ip_local_reserved_ports
>> ... returns 11111
>> On investigation I found 2 things that need to be changed:
>> * return value check
>> * new_len provided by bpf back to sysctl. proc_sys_call_handler expects
>>   this value NOT to include \0 symbol, e.g. if the user does
> 
> Thanks for the report and the patch.
> 
> This patch is changing a few things (1 fix, 1 improvement, 1 test).
> 
> Please separate these individual changes into their own patches. Patch 1 fixes the return value. Patch 2 improves the '\0' and *pcount situation. Patch 3 adds the test.
> 
> btw, I am curious what is missed in the test_sysctl.c that didn't catch the return value case?
> 
>> 	```
>>   open("/proc/sys/net/ipv4/ip_local_reserved_ports", ...)
>>   write(fd, "11111", sizeof("22222"))
>>   ```
>>   or `echo -n "11111" > /proc/sys/net/ipv4/ip_local_reserved_ports`
>>   or `sysctl -w net.ipv4.ip_local_reserved_ports=11111`
>>   proc_sys_call_handler receives count equal to `5`. To make it consistent
>>   with bpf_sysctl_set_new_value, this change also adjusts `new_len` by
>>   `-1` if `\0` is passed as the last character. Alternatively, using
>>   `sizeof("11111") - 1` in the BPF handler should work, but it might not be
>>   obvious and could cause confusion. Note: if an incorrect count is used, sysctl
>>   returns EINVAL to the user.
>> Signed-off-by: Raman Shukhau <ramasha@fb.com>
>> ---
>>  kernel/bpf/cgroup.c                           |  7 ++-
>>  .../bpf/progs/test_sysctl_overwrite.c         | 47 +++++++++++++++++++
>>  tools/testing/selftests/bpf/test_sysctl.c     | 35 +++++++++++++-
>>  3 files changed, 85 insertions(+), 4 deletions(-)
>>  create mode 100644 tools/testing/selftests/bpf/progs/test_sysctl_overwrite.c
>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
>> index 8ba73042a239..23736aed1b53 100644
>> --- a/kernel/bpf/cgroup.c
>> +++ b/kernel/bpf/cgroup.c
>> @@ -1739,10 +1739,13 @@ int __cgroup_bpf_run_filter_sysctl(struct ctl_table_header *head,
>>    	kfree(ctx.cur_val);
>>  -	if (ret == 1 && ctx.new_updated) {
>> +	if (ret == 0 && ctx.new_updated) {
>>  		kfree(*buf);
>>  		*buf = ctx.new_val;
>> -		*pcount = ctx.new_len;
>> +		if (!(*buf)[ctx.new_len])
>> +			*pcount = ctx.new_len - 1;
> 
> From looking at how new_updated is set, my understanding is new_len cannot be 0 here. just want to double check.
> 
> 
>> +		else
>> +			*pcount = ctx.new_len;
>>  	} else {
>>  		kfree(ctx.new_val);
>>  	}
>> diff --git a/tools/testing/selftests/bpf/progs/test_sysctl_overwrite.c b/tools/testing/selftests/bpf/progs/test_sysctl_overwrite.c
>> new file mode 100644
>> index 000000000000..e44b429fcfc1
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/progs/test_sysctl_overwrite.c
>> @@ -0,0 +1,47 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (c) 2019 Facebook
>> +
>> +#include <string.h>
>> +
>> +#include <linux/bpf.h>
>> +
>> +#include <bpf/bpf_helpers.h>
>> +
>> +#include "bpf_compiler.h"
>> +
>> +static const char sysctl_value[] = "31337";
>> +static const char sysctl_name[] = "net/ipv4/ip_local_reserved_ports";
>> +static __always_inline int is_expected_name(struct bpf_sysctl *ctx)
>> +{
>> +	unsigned char i;
>> +	char name[sizeof(sysctl_name)];
>> +	int ret;
>> +
>> +	memset(name, 0, sizeof(name));
>> +	ret = bpf_sysctl_get_name(ctx, name, sizeof(name), 0);
>> +	if (ret < 0 || ret != sizeof(sysctl_name) - 1)
>> +		return 0;
>> +
>> +	__pragma_loop_unroll_full
>> +	for (i = 0; i < sizeof(sysctl_name); ++i)
>> +		if (name[i] != sysctl_name[i])
> 
> bpf_strncmp() should be useful here.
> 
>> +			return 0;
>> +
>> +	return 1;
>> +}
>> +
>> +SEC("cgroup/sysctl")
>> +int test_value_overwrite(struct bpf_sysctl *ctx)
>> +{
>> +	if (!ctx->write)
>> +		return 1;
>> +
>> +	if (!is_expected_name(ctx))
>> +		return 0;
>> +
>> +	if (bpf_sysctl_set_new_value(ctx, sysctl_value, sizeof(sysctl_value)) == 0)
>> +		return 1;
>> +	return 0;
>> +}
>> +
>> +char _license[] SEC("license") = "GPL";
>> diff --git a/tools/testing/selftests/bpf/test_sysctl.c b/tools/testing/selftests/bpf/test_sysctl.c
>> index bcdbd27f22f0..dfa479861d3a 100644
>> --- a/tools/testing/selftests/bpf/test_sysctl.c
>> +++ b/tools/testing/selftests/bpf/test_sysctl.c
>> @@ -35,6 +35,7 @@ struct sysctl_test {
>>  	int seek;
>>  	const char *newval;
>>  	const char *oldval;
>> +	const char *updval;
>>  	enum {
>>  		LOAD_REJECT,
>>  		ATTACH_REJECT,
>> @@ -1395,6 +1396,16 @@ static struct sysctl_test tests[] = {
>>  		.open_flags = O_RDONLY,
>>  		.result = SUCCESS,
>>  	},
>> +	{
>> +		"C prog: override write to ip_local_reserved_ports",
>> +		.prog_file = "./test_sysctl_overwrite.bpf.o",
> 
> test_sysctl.c is not run in bpf CI. It is not very useful to extend this test further. Let's take this chance to create a new progs/cgrp_sysctl.c test that will be exercised by ./test_progs in bpf CI. Then it can use the newer skel open_and_load also.
> 
> Not asking to migrate the existing tests in test_sysctl.c to the new progs/cgrp_sysctl.c in this patch set. The new cgrp_sysctl.c can have only the tests that exercise the changes in this patch set. However, it will be useful if progs/cgrp_sysctl.c can be bootstrapped in a way that makes the future test_sysctl.c migration easier. I also wouldn't worry too much about the existing raw insns tests in test_sysctl.c for now. They will need to be moved to either C or bpf asm in the future.
> 
> pw-bot: cr
> 
>> +		.attach_type = BPF_CGROUP_SYSCTL,
>> +		.sysctl = "net/ipv4/ip_local_reserved_ports",
>> +		.open_flags = O_RDWR,
>> +		.newval = "11111",
>> +		.updval = "31337",
>> +		.result = SUCCESS,
>> +	},
>>  };
>>    static size_t probe_prog_length(const struct bpf_insn *fp)
>> @@ -1520,13 +1531,33 @@ static int access_sysctl(const char *sysctl_path,
>>  			log_err("Read value %s != %s", buf, test->oldval);
>>  			goto err;
>>  		}
>> -	} else if (test->open_flags == O_WRONLY) {
>> +	} else if (test->open_flags == O_WRONLY || test->open_flags == O_RDWR) {
>>  		if (!test->newval) {
>>  			log_err("New value for sysctl is not set");
>>  			goto err;
>>  		}
>> -		if (write(fd, test->newval, strlen(test->newval)) == -1)
>> +		if (write(fd, test->newval, strlen(test->newval)) == -1) {
>> +			log_err("Unable to write sysctl value");
>>  			goto err;
>> +		}
>> +		if (test->open_flags == O_RDWR) {
>> +			char buf[128];
>> +
>> +			if (!test->updval) {
>> +				log_err("Expected value for sysctl is not set");
>> +				goto err;
>> +			}
>> +
>> +			lseek(fd, 0, SEEK_SET);
>> +			if (read(fd, buf, sizeof(buf)) == -1) {
>> +				log_err("Unable to read updated value");
>> +				goto err;
>> +			}
>> +			if (strncmp(buf, test->updval, strlen(test->updval))) {
>> +				log_err("Overwritten value %s != %s", buf, test->updval);
>> +				goto err;
>> +			}
>> +		}
>>  	} else {
>>  		log_err("Unexpected sysctl access: neither read nor write");
>>  		goto err;
> 


Thread overview: 4+ messages
2024-05-04 10:23 [PATCH bpf-next 0/1] Fix for bpf_sysctl_set_new_value Raman Shukhau
2024-05-04 10:23 ` [PATCH bpf-next 1/1] " Raman Shukhau
2024-05-07 23:20   ` Martin KaFai Lau
2024-05-16 21:16     ` Raman Shukhau [this message]
