From: Miklos Szeredi <mszeredi@redhat.com>
To: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	linux-man@vger.kernel.org, linux-security-module@vger.kernel.org,
	Karel Zak <kzak@redhat.com>, Ian Kent <raven@themaw.net>,
	David Howells <dhowells@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <christian@brauner.io>,
	Amir Goldstein <amir73il@gmail.com>,
	Matthew House <mattlloydhouse@gmail.com>,
	Florian Weimer <fweimer@redhat.com>,
	Arnd Bergmann <arnd@arndb.de>
Subject: [PATCH v4 0/6] querying mount attributes
Date: Wed, 25 Oct 2023 16:01:58 +0200
Message-ID: <20231025140205.3586473-1-mszeredi@redhat.com>

Implement mount querying syscalls agreed on at LSF/MM 2023.

Features:

 - statx-like want/got mask
 - allows returning ASCII strings (fs type, root, mount point)
 - returned buffer is relocatable (no pointers)

Still missing:

 - man pages
 - kselftest

Please find the test utility at the end of this mail.

  Usage: statmnt [-l|-r] [-u] (mnt_id|path)

Git tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git#statmount-v4

Changes v3..v4:

 - incorporate patch moving list of mounts to an rbtree
 - wire up syscalls for all archs
 - add LISTMOUNT_RECURSIVE (depth first iteration of mount tree)
 - add LSMT_ROOT (list root instead of a specific mount ID)
 - list_for_each_entry_del() moved to a separate patchset

Changes v1..v3:

 - rename statmnt(2) -> statmount(2)
 - rename listmnt(2) -> listmount(2)
 - make ABI 32bit compatible by passing 64bit args in a struct (tested on
   i386 and x32)
 - only accept new 64bit mount IDs
 - fix compile on !CONFIG_PROC_FS
 - call security_sb_statfs() in both syscalls
 - make lookup_mnt_in_ns() static
 - add LISTMOUNT_UNREACHABLE flag to listmount() to explicitly ask for
   listing unreachable mounts
 - remove .sb_opts
 - remove subtype from .fs_type
 - return the number of bytes used (including strings) in .size
 - rename .mountpoint -> .mnt_point
 - point strings by an offset against the char[] flexible array member at
   the end of the struct.
   E.g. printf("fs_type: %s\n", st->str + st->fs_type);
 - don't save string lengths
 - extend spare space in struct statmnt (complete size is now 512 bytes)

Miklos Szeredi (6):
  add unique mount ID
  mounts: keep list of mounts in an rbtree
  namespace: extract show_path() helper
  add statmount(2) syscall
  add listmount(2) syscall
  wire up syscalls for statmount/listmount

 arch/alpha/kernel/syscalls/syscall.tbl      |   3 +
 arch/arm/tools/syscall.tbl                  |   3 +
 arch/arm64/include/asm/unistd32.h           |   4 +
 arch/ia64/kernel/syscalls/syscall.tbl       |   3 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   3 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   3 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   3 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   3 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   3 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   3 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   3 +
 arch/s390/kernel/syscalls/syscall.tbl       |   3 +
 arch/sh/kernel/syscalls/syscall.tbl         |   3 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   3 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   3 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   3 +
 fs/internal.h                               |   2 +
 fs/mount.h                                  |  27 +-
 fs/namespace.c                              | 573 ++++++++++++++++----
 fs/pnode.c                                  |   2 +-
 fs/proc_namespace.c                         |  13 +-
 fs/stat.c                                   |   9 +-
 include/linux/mount.h                       |   5 +-
 include/linux/syscalls.h                    |   8 +
 include/uapi/asm-generic/unistd.h           |   8 +-
 include/uapi/linux/mount.h                  |  65 +++
 include/uapi/linux/stat.h                   |   1 +
 28 files changed, 635 insertions(+), 129 deletions(-)

--
2.41.0
=== statmnt.c ===
#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/param.h>
#include <err.h>

/*
 * Structure for getting mount/superblock/filesystem info with statmount(2).
 *
 * The interface is similar to statx(2): individual fields or groups can be
 * selected with the @mask argument of statmount().  Kernel will set the @mask
 * field according to the supported fields.
 *
 * If string fields are selected, then the caller needs to pass a buffer that
 * has space after the fixed part of the structure.  Nul terminated strings are
 * copied there and offsets relative to @str are stored in the relevant fields.
 * If the buffer is too small, then EOVERFLOW is returned.  The actually used
 * size is returned in @size.
 */
struct statmnt {
	__u32 size;		/* Total size, including strings */
	__u32 __spare1;
	__u64 mask;		/* What results were written */
	__u32 sb_dev_major;	/* Device ID */
	__u32 sb_dev_minor;
	__u64 sb_magic;		/* ..._SUPER_MAGIC */
	__u32 sb_flags;		/* MS_{RDONLY,SYNCHRONOUS,DIRSYNC,LAZYTIME} */
	__u32 fs_type;		/* [str] Filesystem type */
	__u64 mnt_id;		/* Unique ID of mount */
	__u64 mnt_parent_id;	/* Unique ID of parent (for root == mnt_id) */
	__u32 mnt_id_old;	/* Reused IDs used in proc/.../mountinfo */
	__u32 mnt_parent_id_old;
	__u64 mnt_attr;		/* MOUNT_ATTR_... */
	__u64 mnt_propagation;	/* MS_{SHARED,SLAVE,PRIVATE,UNBINDABLE} */
	__u64 mnt_peer_group;	/* ID of shared peer group */
	__u64 mnt_master;	/* Mount receives propagation from this ID */
	__u64 propagate_from;	/* Propagation from in current namespace */
	__u32 mnt_root;		/* [str] Root of mount relative to root of fs */
	__u32 mnt_point;	/* [str] Mountpoint relative to current root */
	__u64 __spare2[50];
	char str[];		/* Variable size part containing strings */
};

/*
 * To be used on the kernel ABI only for passing 64bit arguments to statmount(2)
 */
struct __mount_arg {
	__u64 mnt_id;
	__u64 request_mask;
};

/*
 * @mask bits for statmount(2)
 */
#define STMT_SB_BASIC		0x00000001U	/* Want/got sb_... */
#define STMT_MNT_BASIC		0x00000002U	/* Want/got mnt_... */
#define STMT_PROPAGATE_FROM	0x00000004U	/* Want/got propagate_from */
#define STMT_MNT_ROOT		0x00000008U	/* Want/got mnt_root */
#define STMT_MNT_POINT		0x00000010U	/* Want/got mnt_point */
#define STMT_FS_TYPE		0x00000020U	/* Want/got fs_type */

/* listmount(2) flags */
#define LISTMOUNT_UNREACHABLE	0x01	/* List unreachable mounts too */
#define LISTMOUNT_RECURSIVE	0x02	/* List a mount tree */

/*
 * Special @mnt_id values that can be passed to listmount
 */
#define LSMT_ROOT		0xffffffffffffffff	/* root mount */

#ifdef __alpha__
#define __NR_statmount	564
#define __NR_listmount	565
#else
#define __NR_statmount	454
#define __NR_listmount	455
#endif

#define STATX_MNT_ID_UNIQUE	0x00004000U	/* Want/got extended stx_mount_id */

static void free_if_neq(void *p, const void *q)
{
	if (p != q)
		free(p);
}

static struct statmnt *statmount(uint64_t mnt_id, uint64_t mask, unsigned int flags)
{
	struct __mount_arg arg = {
		.mnt_id = mnt_id,
		.request_mask = mask,
	};
	union {
		struct statmnt m;
		char s[4096];
	} buf;
	struct statmnt *ret, *mm = &buf.m;
	size_t bufsize = sizeof(buf);

	while (syscall(__NR_statmount, &arg, mm, bufsize, flags) == -1) {
		free_if_neq(mm, &buf.m);
		if (errno != EOVERFLOW)
			return NULL;
		bufsize = MAX(1 << 15, bufsize << 1);
		mm = malloc(bufsize);
		if (!mm)
			return NULL;
	}
	ret = malloc(mm->size);
	if (ret)
		memcpy(ret, mm, mm->size);
	free_if_neq(mm, &buf.m);

	return ret;
}

static int listmount(uint64_t mnt_id, uint64_t **listp, unsigned int flags)
{
	struct __mount_arg arg = {
		.mnt_id = mnt_id,
	};
	uint64_t buf[512];
	size_t bufsize = sizeof(buf);
	uint64_t *ret, *ll = buf;
	long len;

	while ((len = syscall(__NR_listmount, &arg, ll,
			      bufsize / sizeof(buf[0]), flags)) == -1) {
		free_if_neq(ll, buf);
		if (errno != EOVERFLOW)
			return -1;
		bufsize = MAX(1 << 15, bufsize << 1);
		ll = malloc(bufsize);
		if (!ll)
			return -1;
	}
	bufsize = len * sizeof(buf[0]);
	ret = malloc(bufsize);
	if (!ret) {
		/* Don't leak the heap-allocated temporary buffer on failure */
		free_if_neq(ll, buf);
		return -1;
	}

	*listp = ret;
	memcpy(ret, ll, bufsize);
	free_if_neq(ll, buf);

	return len;
}

int main(int argc, char *argv[])
{
	struct statmnt *st;
	char *end;
	int res;
	int list = 0;
	int flags = 0;
	uint64_t mask = STMT_SB_BASIC | STMT_MNT_BASIC | STMT_PROPAGATE_FROM |
			STMT_MNT_ROOT | STMT_MNT_POINT | STMT_FS_TYPE;
	uint64_t mnt_id;
	int opt;

	for (;;) {
		opt = getopt(argc, argv, "lru");
		if (opt == -1)
			break;
		switch (opt) {
		case 'r':
			flags |= LISTMOUNT_RECURSIVE;
			/* fallthrough */
		case 'l':
			list = 1;
			break;
		case 'u':
			flags |= LISTMOUNT_UNREACHABLE;
			break;
		default:
			errx(1, "usage: %s [-l|-r] [-u] (mnt_id|path)", argv[0]);
		}
	}
	if (optind >= argc) {
		if (!list)
			errx(1, "missing mnt_id or path");
		else
			mnt_id = -1LL;
	} else {
		const char *arg = argv[optind];

		mnt_id = strtoll(arg, &end, 0);
		if (!mnt_id || *end != '\0') {
			struct statx sx;

			res = statx(AT_FDCWD, arg, 0, STATX_MNT_ID_UNIQUE, &sx);
			if (res == -1)
				err(1, "%s", arg);

			if (!(sx.stx_mask & (STATX_MNT_ID | STATX_MNT_ID_UNIQUE)))
				errx(1, "Sorry, no mount ID");

			mnt_id = sx.stx_mnt_id;
		}
	}

	if (list) {
		uint64_t *list;
		int num, i;

		res = listmount(mnt_id, &list, flags);
		if (res == -1)
			err(1, "listmnt(0x%llx)", (unsigned long long) mnt_id);

		num = res;
		for (i = 0; i < num; i++) {
			printf("0x%llx", (unsigned long long) list[i]);
			st = statmount(list[i], STMT_MNT_POINT, 0);
			if (!st) {
				printf("\t[%s]\n", strerror(errno));
			} else {
				printf("\t%s\n", (st->mask & STMT_MNT_POINT) ? st->str + st->mnt_point : "???");
			}
			free(st);
		}
		free(list);

		return 0;
	}

	st = statmount(mnt_id, mask, 0);
	if (!st)
		err(1, "statmnt(0x%llx)", (unsigned long long) mnt_id);

	printf("size: %u\n", st->size);
	printf("mask: 0x%llx\n", st->mask);
	if (st->mask & STMT_SB_BASIC) {
		printf("sb_dev_major: %u\n", st->sb_dev_major);
		printf("sb_dev_minor: %u\n", st->sb_dev_minor);
		printf("sb_magic: 0x%llx\n", st->sb_magic);
		printf("sb_flags: 0x%08x\n", st->sb_flags);
	}
	if (st->mask & STMT_MNT_BASIC) {
		printf("mnt_id: 0x%llx\n", st->mnt_id);
		printf("mnt_parent_id: 0x%llx\n", st->mnt_parent_id);
		printf("mnt_id_old: %u\n", st->mnt_id_old);
		printf("mnt_parent_id_old: %u\n", st->mnt_parent_id_old);
		printf("mnt_attr: 0x%08llx\n", st->mnt_attr);
		printf("mnt_propagation: %s%s%s%s\n",
		       st->mnt_propagation & MS_SHARED ? "shared," : "",
		       st->mnt_propagation & MS_SLAVE ? "slave," : "",
		       st->mnt_propagation & MS_UNBINDABLE ? "unbindable," : "",
		       st->mnt_propagation & MS_PRIVATE ? "private" : "");
		printf("mnt_peer_group: %llu\n", st->mnt_peer_group);
		printf("mnt_master: %llu\n", st->mnt_master);
	}
	if (st->mask & STMT_PROPAGATE_FROM)
		printf("propagate_from: %llu\n", st->propagate_from);
	if (st->mask & STMT_MNT_ROOT)
		printf("mnt_root: %u <%s>\n", st->mnt_root, st->str + st->mnt_root);
	if (st->mask & STMT_MNT_POINT)
		printf("mnt_point: %u <%s>\n", st->mnt_point, st->str + st->mnt_point);
	if (st->mask & STMT_FS_TYPE)
		printf("fs_type: %u <%s>\n", st->fs_type, st->str + st->fs_type);
	free(st);

	return 0;
}