Containers Archive mirror
 help / color / mirror / Atom feed
From: Giuseppe Scrivano <gscrivan@redhat.com>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: linux-fsdevel@vger.kernel.org,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, viro@zeniv.linux.org.uk
Subject: Re: [PATCH 1/2] fs, close_range: add flag CLOSE_RANGE_CLOEXEC
Date: Wed, 14 Oct 2020 00:45:00 +0200	[thread overview]
Message-ID: <87imbdrbir.fsf@redhat.com> (raw)
In-Reply-To: <20201013205427.clvqno24ctwxbuyv@wittgenstein> (Christian Brauner's message of "Tue, 13 Oct 2020 22:54:27 +0200")

Christian Brauner <christian.brauner@ubuntu.com> writes:

> On Tue, Oct 13, 2020 at 04:06:08PM +0200, Giuseppe Scrivano wrote:
>
> Hey Guiseppe,
>
> Thanks for the patch!
>
>> When the flag CLOSE_RANGE_CLOEXEC is set, close_range doesn't
>> immediately close the files but it sets the close-on-exec bit.
>
> Hm, please expand on the use-cases a little here so people know where
> and how this is useful. Keeping the rationale for a change in the commit
> log is really important.
>
>> 
>> Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
>> ---
>
>>  fs/file.c                        | 56 ++++++++++++++++++++++----------
>>  include/uapi/linux/close_range.h |  3 ++
>>  2 files changed, 42 insertions(+), 17 deletions(-)
>> 
>> diff --git a/fs/file.c b/fs/file.c
>> index 21c0893f2f1d..ad4ebee41e09 100644
>> --- a/fs/file.c
>> +++ b/fs/file.c
>> @@ -672,6 +672,17 @@ int __close_fd(struct files_struct *files, unsigned fd)
>>  }
>>  EXPORT_SYMBOL(__close_fd); /* for ksys_close() */
>>  
>> +static unsigned int __get_max_fds(struct files_struct *cur_fds)
>> +{
>> +	unsigned int max_fds;
>> +
>> +	rcu_read_lock();
>> +	/* cap to last valid index into fdtable */
>> +	max_fds = files_fdtable(cur_fds)->max_fds;
>> +	rcu_read_unlock();
>> +	return max_fds;
>> +}
>> +
>>  /**
>>   * __close_range() - Close all file descriptors in a given range.
>>   *
>> @@ -683,27 +694,23 @@ EXPORT_SYMBOL(__close_fd); /* for ksys_close() */
>>   */
>>  int __close_range(unsigned fd, unsigned max_fd, unsigned int flags)
>>  {
>> -	unsigned int cur_max;
>> +	unsigned int cur_max = UINT_MAX;
>>  	struct task_struct *me = current;
>>  	struct files_struct *cur_fds = me->files, *fds = NULL;
>>  
>> -	if (flags & ~CLOSE_RANGE_UNSHARE)
>> +	if (flags & ~(CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC))
>>  		return -EINVAL;
>>  
>>  	if (fd > max_fd)
>>  		return -EINVAL;
>>  
>> -	rcu_read_lock();
>> -	cur_max = files_fdtable(cur_fds)->max_fds;
>> -	rcu_read_unlock();
>> -
>> -	/* cap to last valid index into fdtable */
>> -	cur_max--;
>> -
>>  	if (flags & CLOSE_RANGE_UNSHARE) {
>>  		int ret;
>>  		unsigned int max_unshare_fds = NR_OPEN_MAX;
>>  
>> +		/* cap to last valid index into fdtable */
>> +		cur_max = __get_max_fds(cur_fds) - 1;
>> +
>>  		/*
>>  		 * If the requested range is greater than the current maximum,
>>  		 * we're closing everything so only copy all file descriptors
>> @@ -724,16 +731,31 @@ int __close_range(unsigned fd, unsigned max_fd, unsigned int flags)
>>  			swap(cur_fds, fds);
>>  	}
>>  
>> -	max_fd = min(max_fd, cur_max);
>> -	while (fd <= max_fd) {
>> -		struct file *file;
>> +	if (flags & CLOSE_RANGE_CLOEXEC) {
>> +		struct fdtable *fdt;
>>  
>> -		file = pick_file(cur_fds, fd++);
>> -		if (!file)
>> -			continue;
>> +		spin_lock(&cur_fds->file_lock);
>> +		fdt = files_fdtable(cur_fds);
>> +		cur_max = fdt->max_fds - 1;
>> +		max_fd = min(max_fd, cur_max);
>> +		while (fd <= max_fd)
>> +			__set_close_on_exec(fd++, fdt);
>> +		spin_unlock(&cur_fds->file_lock);
>> +	} else {
>> +		/* Initialize cur_max if needed.  */
>> +		if (cur_max == UINT_MAX)
>> +			cur_max = __get_max_fds(cur_fds) - 1;
>
> The separation between how cur_fd is retrieved in the two branches makes
> the code more difficult to follow imho. Unless there's a clear reason
> why you've done it that way I would think that something like the patch
> I appended below might be a little clearer and easier to maintain(?).

Thanks for the review!

I've opted for this version as in the flags=CLOSE_RANGE_CLOEXEC case we
can read max_fds directly from the fds table and avoid doing it from the
RCU critical section as well.  I'll change it in favor of more readable
code.

Giuseppe

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers

  parent reply	other threads:[~2020-10-13 22:45 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-13 14:06 [PATCH 0/2] fs, close_range: add flag CLOSE_RANGE_CLOEXEC Giuseppe Scrivano
2020-10-13 14:06 ` [PATCH 1/2] " Giuseppe Scrivano
2020-10-13 20:54   ` Christian Brauner
2020-10-13 21:04     ` Rasmus Villemoes
2020-10-13 21:22       ` Christian Brauner
2020-10-13 22:45     ` Giuseppe Scrivano [this message]
2020-10-13 21:09   ` Al Viro
2020-10-13 21:32     ` Christian Brauner
2020-10-13 21:49     ` Rasmus Villemoes
2020-10-13 14:06 ` [PATCH 2/2] selftests: add tests for CLOSE_RANGE_CLOEXEC Giuseppe Scrivano
2020-10-13 15:22   ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87imbdrbir.fsf@redhat.com \
    --to=gscrivan@redhat.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=containers@lists.linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).