[PATCH -V11 0/9] Generic name to handle and open by handle syscalls


* [PATCH -V11 0/9] Generic name to handle and open by handle syscalls
@ 2010-05-20  7:35 Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 1/9] exportfs: Return the minimum required handle size Aneesh Kumar K.V
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel

Hi,

The patches below implement open-by-handle support using exportfs
operations. This allows a user space application to map a file name to a
file handle and later open the file using that handle. This should be usable
by a userspace NFS server [1] and a 9P server [2]. XFS already supports this
with the ioctls XFS_IOC_PATH_TO_HANDLE and XFS_IOC_OPEN_BY_HANDLE.

[1] http://nfs-ganesha.sourceforge.net/
[2] http://thread.gmane.org/gmane.comp.emulators.qemu/68992

Changes from V10:
a) Missed an stg refresh before sending out the patchset. Sending the
   updated patchset.

Changes from V9:
a) Fix compile errors with CONFIG_EXPORTFS not defined
b) Return -EOPNOTSUPP if file system doesn't support fh_to_dentry exportfs callback.

Changes from V8:
a)  exportfs_decode_fh now returns -ESTALE if export operations are not defined.
b)  drop get_fsid super_operations. Instead use superblock to store uuid.

Changes from V7:
a) open_by_handle now uses mountdirfd to identify the vfsmount.
b) We don't validate the UUID passed as part of the file handle in open_by_handle.
   The UUID is provided as part of the file handle so that userspace can
   use the kernel-returned handle as is. It also helps in finding the 16-byte
   filesystem UUID in userspace without using file system specific libraries to
   read the file system superblock. If a particular file system doesn't support
   a UUID or any form of unique id, this field in the file handle will be zero filled.
c) Drop the freadlink syscall. Instead use readlinkat with a NULL pathname to
   read the target of the link that the fd refers to. This is similar to
   sys_utimensat.
d) Instead of open-coding all the open flag related checks, use helper functions.
   Added finish_open_by_handle, similar to finish_open.
e) Fix may_open to not return ELOOP for a symlink when called from handle open.
   open(2) still returns the error as expected.

Changes from V6:
a) Add uuid to vfsmount lookup and drop uuid to superblock lookup
b) Return -EOPNOTSUPP in sys_name_to_handle if the file system returned uuid
   doesn't give the same vfsmount on lookup. This ensures that we fail
   sys_name_to_handle when multiple file systems return the same UUID.

Changes from V5:
a) added sys_name_to_handle_at syscall which takes AT_SYMLINK_NOFOLLOW flag 
   instead of two syscalls sys_name_to_handle and sys_lname_to_handle.
b) addressed review comments from Neil Brown
c) rebased to b91ce4d14a21fc04d165be30319541e0f9204f15
d) Add compat_sys_open_by_handle

Changes from V4:
a) Changed the syscall arguments so that we don't need compat syscalls
   as suggested by Christoph
c) Added two new syscalls, sys_lname_to_handle and sys_freadlink, to work with
   symlinks
d) Changed open_by_handle to work with all file types
e) Add ext3 support

Changes from V3:
a) Code cleanup suggested by Andreas
b) x86_64 syscall support
c) add compat syscall

Changes from V2:
a) Support system wide unique handle.

Changes from v1:
a) handle size is now specified in bytes
b) returns -EOVERFLOW if the handle size is too small
c) dropped open_handle syscall and added open_by_handle_at syscall
   open_by_handle_at takes mount_fd as the directory fd of the mount point
   containing the file
e) handle will only be unique within a given file system. So for an NFS server
   exporting multiple file systems, the NFS server will have to internally track
   the mount point to which a file handle belongs. We should be able to do that
   much more easily than expecting the kernel to give a system-wide unique file
   handle. A system-wide unique file handle would need much larger changes to the
   exportfs or VFS interface, and I was not sure whether we really need to do
   that in the kernel or in user space
f) open_handle_at now only checks for the DAC_OVERRIDE capability
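
As a rough illustration of item e) above (not part of the patch set): a
userspace server could keep a small table that maps a file system identifier
to the directory fd of its mount point, and pass that fd as the first argument
of open_by_handle_at. The sketch below assumes the server keys the table on
the 16-byte UUID that later versions of this series place in the handle. The
names export_entry, export_table, export_table_add and export_table_lookup are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FS_UUID_LEN 16
#define MAX_EXPORTS 8

/* Hypothetical per-export record kept by a userspace server. */
struct export_entry {
	unsigned char uuid[FS_UUID_LEN];	/* fsid copied from a handle */
	int mountdirfd;				/* fd opened on the mount point */
};

struct export_table {
	struct export_entry entries[MAX_EXPORTS];
	size_t count;
};

/* Record an export; returns 0 on success, -1 if the table is full. */
static int export_table_add(struct export_table *t,
			    const unsigned char uuid[FS_UUID_LEN],
			    int mountdirfd)
{
	assert(t != NULL && uuid != NULL);
	if (t->count == MAX_EXPORTS)
		return -1;
	memcpy(t->entries[t->count].uuid, uuid, FS_UUID_LEN);
	t->entries[t->count].mountdirfd = mountdirfd;
	t->count++;
	return 0;
}

/* Return the mount directory fd for a handle's fsid, or -1 if unknown. */
static int export_table_lookup(const struct export_table *t,
			       const unsigned char uuid[FS_UUID_LEN])
{
	size_t i;

	for (i = 0; i < t->count; i++)
		if (memcmp(t->entries[i].uuid, uuid, FS_UUID_LEN) == 0)
			return t->entries[i].mountdirfd;
	return -1;
}
```

A linear scan is enough for a handful of exports; a server with many exports
would probably want a hash table instead.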

Example program (x86_32; x86_64 would need different syscall numbers):
-------
cc <source.c> -luuid
--------
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
#include <uuid/uuid.h>

struct uuid {
	unsigned char uuid[16];
};

struct file_handle {
        int handle_size;
        int handle_type;
	struct uuid fsid;
        unsigned char handle[0];
};

#define AT_FDCWD		-100
#define AT_SYMLINK_FOLLOW	0x400

static int name_to_handle(const char *name, struct file_handle  *fh)
{
	return syscall(338, AT_FDCWD, name, fh, AT_SYMLINK_FOLLOW);
}

static int lname_to_handle(const char *name, struct file_handle  *fh)
{
	return syscall(338, AT_FDCWD, name, fh, 0);
}

static int open_by_handle(int mountfd, struct file_handle *fh,  int flags)
{
	return syscall(339, mountfd, fh, flags);
}

#define BUFSZ 100
int main(int argc, char *argv[])
{
        int fd;
        int ret;
	int mountfd;
	int handle_sz;
	struct stat bufstat;
        char buf[BUFSZ];
	char uuid[37];	/* uuid_unparse() writes 36 characters plus a NUL */
        struct file_handle *fh = NULL;
	if (argc != 3 ) {
		printf("Usage: %s <filename> <mount-dir-name>\n", argv[0]);
		exit(1);
	}
again:
	if (fh && fh->handle_size) {
		handle_sz = fh->handle_size;
		free(fh);
		fh = malloc(sizeof(struct file_handle) + handle_sz);
		fh->handle_size = handle_sz;
	} else {
		fh = malloc(sizeof(struct file_handle));
		fh->handle_size = 0;
	}
        errno  = 0;
        ret = lname_to_handle(argv[1], fh);
        if (ret && errno == EOVERFLOW) {
		printf("Found the handle size needed to be %d\n", fh->handle_size);
		goto again;
        } else if (ret) {
                perror("Error:");
		exit(1);
	}
	uuid_unparse(fh->fsid.uuid, uuid);
	printf("UUID:%s\n", uuid);
	printf("Waiting for input");
	getchar();
	mountfd = open(argv[2], O_RDONLY | O_DIRECTORY);
	if (mountfd < 0) {
                perror("Error:");
                exit(1);
        }
        fd = open_by_handle(mountfd, fh, O_RDWR);
        if (fd < 0) {
                perror("Error:");
                exit(1);
        }
	printf("Reading the content now \n");
	fstat(fd, &bufstat);
	ret = S_ISLNK(bufstat.st_mode);
	if (ret) {
		memset(buf, 0 , BUFSZ);
		readlinkat(fd, NULL, buf, BUFSZ);
		printf("%s is a symlink pointing to %s\n", argv[1], buf);
	}
        memset(buf, 0 , BUFSZ);
	while (1) {
		ret = read(fd, buf, BUFSZ -1);
		if (ret <= 0)
			break;
		buf[ret] = '\0';
                printf("%s", buf);
                memset(buf, 0 , BUFSZ);
        }
        return 0;
}

-aneesh


* [PATCH -V11 1/9] exportfs: Return the minimum required handle size
  2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
@ 2010-05-20  7:35 ` Aneesh Kumar K.V
  2010-05-21 22:15   ` J. Bruce Fields
  2010-05-20  7:35 ` [PATCH -V11 2/9] vfs: Add name to file handle conversion support Aneesh Kumar K.V
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
	Aneesh Kumar K.V

The exportfs encode handle function should return the minimum required
handle size. This lets the caller discover the handle size by passing a
handle size of 0 in the first call and then repeating the call with
the returned handle size value.
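
The two-step probe described above can be sketched in plain userspace C. This
is only a model of the convention, not kernel code: mock_encode_fh is a
hypothetical stand-in for a file system's encode_fh callback, MOCK_FH_WORDS is
an assumed handle length, and encode_with_probe is a hypothetical caller.
Lengths are in 32-bit words, as in the callbacks touched by this patch.

```c
#include <assert.h>
#include <stdlib.h>

#define FH_TOO_SMALL 255	/* value the callbacks return for "no room" */
#define MOCK_FH_WORDS 3		/* assumed handle length, in 32-bit words */

/*
 * Hypothetical stand-in for an encode_fh callback. On entry *max_len is
 * the buffer capacity in 32-bit words; if that is too small, the required
 * length is written back and 255 is returned.
 */
static int mock_encode_fh(unsigned int *fh, int *max_len)
{
	int i;

	if (*max_len < MOCK_FH_WORDS) {
		*max_len = MOCK_FH_WORDS;
		return FH_TOO_SMALL;
	}
	for (i = 0; i < MOCK_FH_WORDS; i++)
		fh[i] = 0x1000u + (unsigned int)i;	/* dummy payload */
	*max_len = MOCK_FH_WORDS;
	return 1;	/* arbitrary handle type other than 255 */
}

/*
 * Probe with a zero-length buffer, then retry with the size reported back.
 * Returns a malloc()ed handle (caller frees) and stores its length in
 * *len_out, or returns NULL on failure.
 */
static unsigned int *encode_with_probe(int *len_out)
{
	int len = 0;
	unsigned int *fh;

	assert(len_out != NULL);
	if (mock_encode_fh(NULL, &len) != FH_TOO_SMALL)
		return NULL;	/* a zero-sized buffer should never fit */
	fh = malloc((size_t)len * sizeof(*fh));
	if (!fh)
		return NULL;
	if (mock_encode_fh(fh, &len) == FH_TOO_SMALL) {
		free(fh);
		return NULL;
	}
	*len_out = len;
	return fh;
}
```

The point of the change is the write-back in the "too small" branch: without
it the caller learns only that the buffer was too small, not how large it
needs to be.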

Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/btrfs/export.c             |    8 ++++++--
 fs/exportfs/expfs.c           |    9 +++++++--
 fs/fat/inode.c                |    4 +++-
 fs/fuse/inode.c               |    4 +++-
 fs/gfs2/export.c              |    8 ++++++--
 fs/isofs/export.c             |    8 ++++++--
 fs/ocfs2/export.c             |    8 +++++++-
 fs/reiserfs/inode.c           |    7 ++++++-
 fs/udf/namei.c                |    7 ++++++-
 fs/xfs/linux-2.6/xfs_export.c |    4 +++-
 mm/shmem.c                    |    4 +++-
 11 files changed, 56 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
index 951ef09..5f8ee5a 100644
--- a/fs/btrfs/export.c
+++ b/fs/btrfs/export.c
@@ -21,9 +21,13 @@ static int btrfs_encode_fh(struct dentry *dentry, u32 *fh, int *max_len,
 	int len = *max_len;
 	int type;
 
-	if ((len < BTRFS_FID_SIZE_NON_CONNECTABLE) ||
-	    (connectable && len < BTRFS_FID_SIZE_CONNECTABLE))
+	if (connectable && (len < BTRFS_FID_SIZE_CONNECTABLE)) {
+		*max_len = BTRFS_FID_SIZE_CONNECTABLE;
 		return 255;
+	} else if (len < BTRFS_FID_SIZE_NON_CONNECTABLE) {
+		*max_len = BTRFS_FID_SIZE_NON_CONNECTABLE;
+		return 255;
+	}
 
 	len  = BTRFS_FID_SIZE_NON_CONNECTABLE;
 	type = FILEID_BTRFS_WITHOUT_PARENT;
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index e9e1759..cfee0f0 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -319,9 +319,14 @@ static int export_encode_fh(struct dentry *dentry, struct fid *fid,
 	struct inode * inode = dentry->d_inode;
 	int len = *max_len;
 	int type = FILEID_INO32_GEN;
-	
-	if (len < 2 || (connectable && len < 4))
+
+	if (connectable && (len < 4)) {
+		*max_len = 4;
+		return 255;
+	} else if (len < 2) {
+		*max_len = 2;
 		return 255;
+	}
 
 	len = 2;
 	fid->i32.ino = inode->i_ino;
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 0ce143b..6f83bc7 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -738,8 +738,10 @@ fat_encode_fh(struct dentry *de, __u32 *fh, int *lenp, int connectable)
 	struct inode *inode =  de->d_inode;
 	u32 ipos_h, ipos_m, ipos_l;
 
-	if (len < 5)
+	if (len < 5) {
+		*lenp = 5;
 		return 255; /* no room */
+	}
 
 	ipos_h = MSDOS_I(inode)->i_pos >> 8;
 	ipos_m = (MSDOS_I(inode)->i_pos & 0xf0) << 24;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index ec14d19..beaea69 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -638,8 +638,10 @@ static int fuse_encode_fh(struct dentry *dentry, u32 *fh, int *max_len,
 	u64 nodeid;
 	u32 generation;
 
-	if (*max_len < len)
+	if (*max_len < len) {
+		*max_len = len;
 		return  255;
+	}
 
 	nodeid = get_fuse_inode(inode)->nodeid;
 	generation = inode->i_generation;
diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index c22c211..d022236 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -36,9 +36,13 @@ static int gfs2_encode_fh(struct dentry *dentry, __u32 *p, int *len,
 	struct super_block *sb = inode->i_sb;
 	struct gfs2_inode *ip = GFS2_I(inode);
 
-	if (*len < GFS2_SMALL_FH_SIZE ||
-	    (connectable && *len < GFS2_LARGE_FH_SIZE))
+	if (connectable && (*len < GFS2_LARGE_FH_SIZE)) {
+		*len = GFS2_LARGE_FH_SIZE;
 		return 255;
+	} else if (*len < GFS2_SMALL_FH_SIZE) {
+		*len = GFS2_SMALL_FH_SIZE;
+		return 255;
+	}
 
 	fh[0] = cpu_to_be32(ip->i_no_formal_ino >> 32);
 	fh[1] = cpu_to_be32(ip->i_no_formal_ino & 0xFFFFFFFF);
diff --git a/fs/isofs/export.c b/fs/isofs/export.c
index ed752cb..dd4687f 100644
--- a/fs/isofs/export.c
+++ b/fs/isofs/export.c
@@ -124,9 +124,13 @@ isofs_export_encode_fh(struct dentry *dentry,
 	 * offset of the inode and the upper 16 bits of fh32[1] to
 	 * hold the offset of the parent.
 	 */
-
-	if (len < 3 || (connectable && len < 5))
+	if (connectable && (len < 5)) {
+		*max_len = 5;
+		return 255;
+	} else if (len < 3) {
+		*max_len = 3;
 		return 255;
+	}
 
 	len = 3;
 	fh32[0] = ei->i_iget5_block;
diff --git a/fs/ocfs2/export.c b/fs/ocfs2/export.c
index 19ad145..250a347 100644
--- a/fs/ocfs2/export.c
+++ b/fs/ocfs2/export.c
@@ -201,8 +201,14 @@ static int ocfs2_encode_fh(struct dentry *dentry, u32 *fh_in, int *max_len,
 		   dentry->d_name.len, dentry->d_name.name,
 		   fh, len, connectable);
 
-	if (len < 3 || (connectable && len < 6)) {
+	if (connectable && (len < 6)) {
 		mlog(ML_ERROR, "fh buffer is too small for encoding\n");
+		*max_len = 6;
+		type = 255;
+		goto bail;
+	} else if (len < 3) {
+		mlog(ML_ERROR, "fh buffer is too small for encoding\n");
+		*max_len = 3;
 		type = 255;
 		goto bail;
 	}
diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
index dc2c65e..5fff1e2 100644
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -1588,8 +1588,13 @@ int reiserfs_encode_fh(struct dentry *dentry, __u32 * data, int *lenp,
 	struct inode *inode = dentry->d_inode;
 	int maxlen = *lenp;
 
-	if (maxlen < 3)
+	if (need_parent && (maxlen < 5)) {
+		*lenp = 5;
 		return 255;
+	} else if (maxlen < 3) {
+		*lenp = 3;
+		return 255;
+	}
 
 	data[0] = inode->i_ino;
 	data[1] = le32_to_cpu(INODE_PKEY(inode)->k_dir_id);
diff --git a/fs/udf/namei.c b/fs/udf/namei.c
index 7581602..37ce713 100644
--- a/fs/udf/namei.c
+++ b/fs/udf/namei.c
@@ -1360,8 +1360,13 @@ static int udf_encode_fh(struct dentry *de, __u32 *fh, int *lenp,
 	struct fid *fid = (struct fid *)fh;
 	int type = FILEID_UDF_WITHOUT_PARENT;
 
-	if (len < 3 || (connectable && len < 5))
+	if (connectable && (len < 5)) {
+		*lenp = 5;
+		return 255;
+	} else if (len < 3) {
+		*lenp = 3;
 		return 255;
+	}
 
 	*lenp = 3;
 	fid->udf.block = location.logicalBlockNum;
diff --git a/fs/xfs/linux-2.6/xfs_export.c b/fs/xfs/linux-2.6/xfs_export.c
index 846b75a..82c0553 100644
--- a/fs/xfs/linux-2.6/xfs_export.c
+++ b/fs/xfs/linux-2.6/xfs_export.c
@@ -81,8 +81,10 @@ xfs_fs_encode_fh(
 	 * seven combinations work.  The real answer is "don't use v2".
 	 */
 	len = xfs_fileid_length(fileid_type);
-	if (*max_len < len)
+	if (*max_len < len) {
+		*max_len = len;
 		return 255;
+	}
 	*max_len = len;
 
 	switch (fileid_type) {
diff --git a/mm/shmem.c b/mm/shmem.c
index eef4ebe..bbeda1c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2125,8 +2125,10 @@ static int shmem_encode_fh(struct dentry *dentry, __u32 *fh, int *len,
 {
 	struct inode *inode = dentry->d_inode;
 
-	if (*len < 3)
+	if (*len < 3) {
+		*len = 3;
 		return 255;
+	}
 
 	if (hlist_unhashed(&inode->i_hash)) {
 		/* Unfortunately insert_inode_hash is not idempotent,
-- 
1.7.1.78.g212f0



* [PATCH -V11 2/9] vfs: Add name to file handle conversion support
  2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 1/9] exportfs: Return the minimum required handle size Aneesh Kumar K.V
@ 2010-05-20  7:35 ` Aneesh Kumar K.V
  2010-05-21 22:15   ` J. Bruce Fields
  2010-05-20  7:35 ` [PATCH -V11 3/9] vfs: Add open by file handle support Aneesh Kumar K.V
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
	Aneesh Kumar K.V

This patch adds a new superblock field, unsigned char s_uuid[16],
to store the UUID of the file system. s_uuid is used to
identify the file system as part of the file_handle.
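
For illustration only (not part of this patch), here is how userspace might
inspect the 16 UUID bytes it gets back in a handle without linking against
libuuid. The helper names fs_uuid_is_unset and fs_uuid_format are
hypothetical. The zero check assumes the convention stated in the cover
letter: a file system without a UUID leaves the field zero filled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FS_UUID_LEN 16
#define UUID_STR_LEN 37	/* 32 hex digits + 4 dashes + NUL */

/* Returns 1 if the file system supplied no UUID (field is zero filled). */
static int fs_uuid_is_unset(const unsigned char uuid[FS_UUID_LEN])
{
	static const unsigned char zero[FS_UUID_LEN];

	return memcmp(uuid, zero, FS_UUID_LEN) == 0;
}

/* Format the 16 raw bytes in the canonical 8-4-4-4-12 hex layout. */
static void fs_uuid_format(const unsigned char uuid[FS_UUID_LEN],
			   char out[UUID_STR_LEN])
{
	size_t pos = 0;
	int i;

	assert(uuid != NULL && out != NULL);
	for (i = 0; i < FS_UUID_LEN; i++) {
		/* dashes go before bytes 4, 6, 8 and 10 */
		if (i == 4 || i == 6 || i == 8 || i == 10)
			out[pos++] = '-';
		pos += (size_t)snprintf(out + pos, UUID_STR_LEN - pos,
					"%02x", uuid[i]);
	}
}
```

Note that the output buffer is 37 bytes, not 36, because the formatted string
needs room for its terminating NUL.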

Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/exportfs/expfs.c      |    2 +-
 fs/open.c                |  100 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h       |   11 +++++
 include/linux/syscalls.h |    3 +
 4 files changed, 115 insertions(+), 1 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index cfee0f0..d103c31 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -352,7 +352,7 @@ int exportfs_encode_fh(struct dentry *dentry, struct fid *fid, int *max_len,
 	const struct export_operations *nop = dentry->d_sb->s_export_op;
 	int error;
 
-	if (nop->encode_fh)
+	if (nop && nop->encode_fh)
 		error = nop->encode_fh(dentry, fid->raw, max_len, connectable);
 	else
 		error = export_encode_fh(dentry, fid, max_len, connectable);
diff --git a/fs/open.c b/fs/open.c
index 74e5cd9..c4c2577 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -30,6 +30,7 @@
 #include <linux/falloc.h>
 #include <linux/fs_struct.h>
 #include <linux/ima.h>
+#include <linux/exportfs.h>
 
 #include "internal.h"
 
@@ -1206,3 +1207,102 @@ int nonseekable_open(struct inode *inode, struct file *filp)
 }
 
 EXPORT_SYMBOL(nonseekable_open);
+
+#ifdef CONFIG_EXPORTFS
+/* limit the handle size to some value */
+#define MAX_HANDLE_SZ 4096
+static long do_sys_name_to_handle(struct path *path,
+			struct file_handle __user *ufh)
+{
+	int retval;
+	int handle_size;
+	struct super_block *sb;
+	struct file_handle f_handle;
+	struct file_handle *handle = NULL;
+
+	if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
+		retval = -EFAULT;
+		goto err_out;
+	}
+	if (f_handle.handle_size > MAX_HANDLE_SZ) {
+		retval = -EINVAL;
+		goto err_out;
+	}
+	handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
+			GFP_KERNEL);
+	if (!handle) {
+		retval = -ENOMEM;
+		goto err_out;
+	}
+	handle_size = f_handle.handle_size;
+
+	/* we ask for a non connected handle */
+	retval = exportfs_encode_fh(path->dentry,
+				(struct fid *)handle->f_handle,
+				&handle_size,  0);
+	/* convert handle size to bytes */
+	handle_size *= sizeof(u32);
+	handle->handle_type = retval;
+	handle->handle_size = handle_size;
+	if (handle_size <= f_handle.handle_size) {
+		/* get the uuid */
+		sb = path->mnt->mnt_sb;
+		memcpy(handle->fs_uuid,
+			sb->s_uuid,
+			sizeof(handle->fs_uuid));
+		retval = 0;
+	} else {
+		/*
+		 * set the handle_size to zero so we copy only
+		 * non variable part of the file_handle
+		 */
+		handle_size = 0;
+		retval = -EOVERFLOW;
+	}
+	if (copy_to_user(ufh, handle,
+				sizeof(struct file_handle) + handle_size))
+		retval = -EFAULT;
+
+	kfree(handle);
+err_out:
+	return retval;
+}
+
+SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
+		struct file_handle __user *, handle, int, flag)
+{
+
+	int follow;
+	long ret = -EINVAL;
+	struct path path;
+
+	if ((flag & ~AT_SYMLINK_FOLLOW) != 0)
+		goto err_out;
+
+	follow = (flag & AT_SYMLINK_FOLLOW) ? LOOKUP_FOLLOW : 0;
+	ret = user_path_at(dfd, name, follow, &path);
+	if (ret)
+		goto err_out;
+	/*
+	 * We need to make sure whether the file system
+	 * supports decoding of the file handle
+	 */
+	if (!path.mnt->mnt_sb->s_export_op ||
+		!path.mnt->mnt_sb->s_export_op->fh_to_dentry) {
+		ret = -EOPNOTSUPP;
+		goto out_path;
+	}
+	ret = do_sys_name_to_handle(&path, handle);
+
+out_path:
+	path_put(&path);
+err_out:
+	return ret;
+}
+#else
+SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
+		struct file_handle __user *, handle, int, flag)
+{
+	return -ENOSYS;
+}
+#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 44f35ae..d428b1a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -948,6 +948,16 @@ struct file {
 	unsigned long f_mnt_write_state;
 #endif
 };
+
+struct file_handle {
+	int handle_size;
+	int handle_type;
+	/* File system UUID identifier */
+	u8 fs_uuid[16];
+	/* file identifier */
+	unsigned char f_handle[0];
+};
+
 extern spinlock_t files_lock;
 #define file_list_lock() spin_lock(&files_lock);
 #define file_list_unlock() spin_unlock(&files_lock);
@@ -1358,6 +1368,7 @@ struct super_block {
 	wait_queue_head_t	s_wait_unfrozen;
 
 	char s_id[32];				/* Informational name */
+	u8 s_uuid[16];				/* UUID */
 
 	void 			*s_fs_info;	/* Filesystem private info */
 	fmode_t			s_mode;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 057929b..d0deef0 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -61,6 +61,7 @@ struct robust_list_head;
 struct getcpu_cache;
 struct old_linux_dirent;
 struct perf_event_attr;
+struct file_handle;
 
 #include <linux/types.h>
 #include <linux/aio_abi.h>
@@ -846,5 +847,7 @@ asmlinkage long sys_mmap_pgoff(unsigned long addr, unsigned long len,
 			unsigned long prot, unsigned long flags,
 			unsigned long fd, unsigned long pgoff);
 asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
+asmlinkage long sys_name_to_handle_at(int dfd, const char __user *name,
+				struct file_handle __user *handle, int flag);
 
 #endif
-- 
1.7.1.78.g212f0



* [PATCH -V11 3/9] vfs: Add open by file handle support
  2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 1/9] exportfs: Return the minimum required handle size Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 2/9] vfs: Add name to file handle conversion support Aneesh Kumar K.V
@ 2010-05-20  7:35 ` Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 4/9] vfs: Allow handle based open on symlinks Aneesh Kumar K.V
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
	Aneesh Kumar K.V

Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/exportfs/expfs.c      |    2 +
 fs/namei.c               |   50 +++++++++++++
 fs/open.c                |  177 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h       |    3 +-
 include/linux/syscalls.h |    2 +
 5 files changed, 233 insertions(+), 1 deletions(-)

diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
index d103c31..e73c0ab 100644
--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -373,6 +373,8 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
 	/*
 	 * Try to get any dentry for the given file handle from the filesystem.
 	 */
+	if (!nop || !nop->fh_to_dentry)
+		return ERR_PTR(-ESTALE);
 	result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type);
 	if (!result)
 		result = ERR_PTR(-ESTALE);
diff --git a/fs/namei.c b/fs/namei.c
index b86b96f..21cf0a5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1556,6 +1556,56 @@ static int open_will_truncate(int flag, struct inode *inode)
 	return (flag & O_TRUNC);
 }
 
+struct file *finish_open_handle(struct path *path,
+			int open_flag, int acc_mode)
+{
+	int error;
+	struct file *filp;
+	int will_truncate;
+
+	will_truncate = open_will_truncate(open_flag, path->dentry->d_inode);
+	if (will_truncate) {
+		error = mnt_want_write(path->mnt);
+		if (error)
+			goto exit;
+	}
+	error = may_open(path, acc_mode, open_flag);
+	if (error) {
+		if (will_truncate)
+			mnt_drop_write(path->mnt);
+		goto exit;
+	}
+	filp = dentry_open(path->dentry, path->mnt, open_flag, current_cred());
+	if (!IS_ERR(filp)) {
+		error = ima_file_check(filp, acc_mode);
+		if (error) {
+			fput(filp);
+			filp = ERR_PTR(error);
+		}
+	}
+	if (!IS_ERR(filp)) {
+		if (will_truncate) {
+			error = handle_truncate(path);
+			if (error) {
+				fput(filp);
+				filp = ERR_PTR(error);
+			}
+		}
+	}
+	/*
+	 * It is now safe to drop the mnt write
+	 * because the filp has had a write taken
+	 * on its behalf.
+	 */
+	if (will_truncate)
+		mnt_drop_write(path->mnt);
+	return filp;
+
+exit:
+	path_put(path);
+	return ERR_PTR(error);
+}
+
 static struct file *finish_open(struct nameidata *nd,
 				int open_flag, int acc_mode)
 {
diff --git a/fs/open.c b/fs/open.c
index c4c2577..9f8a09a 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1306,3 +1306,180 @@ SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
 	return -ENOSYS;
 }
 #endif
+
+#ifdef CONFIG_EXPORTFS
+static struct vfsmount *get_vfsmount_from_fd(int fd)
+{
+	int fput_needed;
+	struct path *path;
+	struct file *filep;
+
+	if (fd == AT_FDCWD) {
+		struct fs_struct *fs = current->fs;
+		read_lock(&fs->lock);
+		path = &fs->pwd;
+		mntget(path->mnt);
+		read_unlock(&fs->lock);
+	} else {
+		filep = fget_light(fd, &fput_needed);
+		if (!filep)
+			return ERR_PTR(-EBADF);
+		path = &filep->f_path;
+		mntget(path->mnt);
+		fput_light(filep, fput_needed);
+	}
+	return path->mnt;
+}
+
+static int vfs_dentry_acceptable(void *context, struct dentry *dentry)
+{
+	return 1;
+}
+
+static struct path *handle_to_path(int mountdirfd, struct file_handle *handle)
+{
+	int retval;
+	int handle_size;
+	struct path *path;
+
+	path = kmalloc(sizeof(struct path), GFP_KERNEL);
+	if (!path)
+		return ERR_PTR(-ENOMEM);
+
+	path->mnt = get_vfsmount_from_fd(mountdirfd);
+	if (IS_ERR(path->mnt)) {
+		retval = PTR_ERR(path->mnt);
+		goto out_err;
+	}
+	/* change the handle size to multiple of sizeof(u32) */
+	handle_size = handle->handle_size >> 2;
+	path->dentry = exportfs_decode_fh(path->mnt,
+					(struct fid *)handle->f_handle,
+					handle_size, handle->handle_type,
+					vfs_dentry_acceptable, NULL);
+	if (IS_ERR(path->dentry)) {
+		retval = PTR_ERR(path->dentry);
+		goto out_mnt;
+	}
+	return path;
+out_mnt:
+	mntput(path->mnt);
+out_err:
+	kfree(path);
+	return ERR_PTR(retval);
+}
+
+static long do_sys_open_by_handle(int mountdirfd,
+				struct file_handle __user *ufh, int open_flag)
+{
+	int acc_mode;
+	int fd, retval = 0;
+	struct file *filp;
+	struct path *path;
+	struct file_handle f_handle;
+	struct file_handle *handle = NULL;
+
+	/* can't use O_CREAT with open_by_handle */
+	if (open_flag & O_CREAT) {
+		retval = -EINVAL;
+		goto out_err;
+	}
+	if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
+		retval = -EFAULT;
+		goto out_err;
+	}
+	if ((f_handle.handle_size > MAX_HANDLE_SZ) ||
+		(f_handle.handle_size <= 0)) {
+		retval =  -EINVAL;
+		goto out_err;
+	}
+	if (!capable(CAP_DAC_OVERRIDE)) {
+		retval = -EPERM;
+		goto out_err;
+	}
+	handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
+			GFP_KERNEL);
+	if (!handle) {
+		retval =  -ENOMEM;
+		goto out_err;
+	}
+	/* copy the full handle */
+	if (copy_from_user(handle, ufh,
+				sizeof(struct file_handle) +
+				f_handle.handle_size)) {
+		retval = -EFAULT;
+		goto out_handle;
+	}
+	path = handle_to_path(mountdirfd, handle);
+	if (IS_ERR(path)) {
+		retval = PTR_ERR(path);
+		goto out_handle;
+	}
+	/*
+	 * O_SYNC is implemented as __O_SYNC|O_DSYNC.  As many places only
+	 * check for O_DSYNC if the need any syncing at all we enforce it's
+	 * always set instead of having to deal with possibly weird behaviour
+	 * for malicious applications setting only __O_SYNC.
+	 */
+	if (open_flag & __O_SYNC)
+		open_flag |= O_DSYNC;
+
+	acc_mode = MAY_OPEN | ACC_MODE(open_flag);
+
+	/* O_TRUNC implies we need access checks for write permissions */
+	if (open_flag & O_TRUNC)
+		acc_mode |= MAY_WRITE;
+	/*
+	 * Allow the LSM permission hook to distinguish append
+	 * access from general write access.
+	 */
+	if (open_flag & O_APPEND)
+		acc_mode |= MAY_APPEND;
+
+	fd = get_unused_fd_flags(open_flag);
+	if (fd < 0) {
+		retval = fd;
+		goto out_path;
+	}
+	filp = finish_open_handle(path, open_flag, acc_mode);
+	if (IS_ERR(filp)) {
+		put_unused_fd(fd);
+		retval =  PTR_ERR(filp);
+	} else {
+		retval = fd;
+		fsnotify_open(filp->f_path.dentry);
+		fd_install(fd, filp);
+	}
+	kfree(path);
+	kfree(handle);
+	return retval;
+
+out_path:
+	path_put(path);
+	kfree(path);
+out_handle:
+	kfree(handle);
+out_err:
+	return retval;
+}
+
+SYSCALL_DEFINE3(open_by_handle_at, int, mountdirfd,
+		struct file_handle __user *, handle,
+		int, flags)
+{
+	long ret;
+
+	if (force_o_largefile())
+		flags |= O_LARGEFILE;
+
+	ret = do_sys_open_by_handle(mountdirfd, handle, flags);
+	return ret;
+}
+#else
+SYSCALL_DEFINE3(open_by_handle_at, int, mountdirfd,
+		struct file_handle __user *, handle,
+		int, flags)
+{
+	return -ENOSYS;
+}
+#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d428b1a..c30940c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2140,7 +2140,8 @@ extern int may_open(struct path *, int, int);
 
 extern int kernel_read(struct file *, loff_t, char *, unsigned long);
 extern struct file * open_exec(const char *);
- 
+extern struct file *finish_open_handle(struct path *, int, int);
+
 /* fs/dcache.c -- generic fs support functions */
 extern int is_subdir(struct dentry *, struct dentry *);
 extern int path_is_under(struct path *, struct path *);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index d0deef0..58b9702 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -849,5 +849,7 @@ asmlinkage long sys_mmap_pgoff(unsigned long addr, unsigned long len,
 asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
 asmlinkage long sys_name_to_handle_at(int dfd, const char __user *name,
 				struct file_handle __user *handle, int flag);
+asmlinkage long sys_open_by_handle_at(int mountdirfd,
+				struct file_handle __user *handle, int flags);
 
 #endif
-- 
1.7.1.78.g212f0



* [PATCH -V11 4/9] vfs: Allow handle based open on symlinks
  2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
                   ` (2 preceding siblings ...)
  2010-05-20  7:35 ` [PATCH -V11 3/9] vfs: Add open by file handle support Aneesh Kumar K.V
@ 2010-05-20  7:35 ` Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 5/9] vfs: Support null pathname in readlink Aneesh Kumar K.V
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
	Aneesh Kumar K.V

This patch updates may_open to allow handle-based open on symlinks.
The file handle based API uses the file descriptor returned from
open_by_handle_at to perform file system operations. To find the link
target name we need to get a file descriptor on the symlink.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c |   17 ++++++++++++++---
 1 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 21cf0a5..3a93c15 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1421,7 +1421,7 @@ int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
 	return error;
 }
 
-int may_open(struct path *path, int acc_mode, int flag)
+static int __may_open(struct path *path, int acc_mode, int flag, int handle)
 {
 	struct dentry *dentry = path->dentry;
 	struct inode *inode = dentry->d_inode;
@@ -1432,7 +1432,13 @@ int may_open(struct path *path, int acc_mode, int flag)
 
 	switch (inode->i_mode & S_IFMT) {
 	case S_IFLNK:
-		return -ELOOP;
+		/*
+		 * For file handle based open we should allow
+		 * open of symlink.
+		 */
+		if (!handle)
+			return -ELOOP;
+		break;
 	case S_IFDIR:
 		if (acc_mode & MAY_WRITE)
 			return -EISDIR;
@@ -1472,6 +1478,11 @@ int may_open(struct path *path, int acc_mode, int flag)
 	return break_lease(inode, flag);
 }
 
+int may_open(struct path *path, int acc_mode, int flag)
+{
+	return __may_open(path, acc_mode, flag, 0);
+}
+
 static int handle_truncate(struct path *path)
 {
 	struct inode *inode = path->dentry->d_inode;
@@ -1569,7 +1580,7 @@ struct file *finish_open_handle(struct path *path,
 		if (error)
 			goto exit;
 	}
-	error = may_open(path, acc_mode, open_flag);
+	error = __may_open(path, acc_mode, open_flag, 1);
 	if (error) {
 		if (will_truncate)
 			mnt_drop_write(path->mnt);
-- 
1.7.1.78.g212f0



* [PATCH -V11 5/9] vfs: Support null pathname in readlink
  2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
                   ` (3 preceding siblings ...)
  2010-05-20  7:35 ` [PATCH -V11 4/9] vfs: Allow handle based open on symlinks Aneesh Kumar K.V
@ 2010-05-20  7:35 ` Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 6/9] ext4: Copy fs UUID to superblock Aneesh Kumar K.V
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
	Aneesh Kumar K.V

From: NeilBrown <neilb@suse.de>

This makes it possible to use readlink to get the link target name
from a file descriptor that points to the link. This can be used
with the open_by_handle syscall, which returns a file descriptor for a
link. We can then use this file descriptor to get the target name.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/stat.c |   30 ++++++++++++++++++++++--------
 1 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/stat.c b/fs/stat.c
index c4ecd52..49b95a7 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -284,26 +284,40 @@ SYSCALL_DEFINE2(newfstat, unsigned int, fd, struct stat __user *, statbuf)
 SYSCALL_DEFINE4(readlinkat, int, dfd, const char __user *, pathname,
 		char __user *, buf, int, bufsiz)
 {
-	struct path path;
-	int error;
+	int error = 0;
+	struct path path, *pp;
+	struct file *file = NULL;
 
 	if (bufsiz <= 0)
 		return -EINVAL;
 
-	error = user_path_at(dfd, pathname, 0, &path);
+	if (pathname == NULL && dfd != AT_FDCWD) {
+		file = fget(dfd);
+
+		if (file)
+			pp = &file->f_path;
+		else
+			error = -EBADF;
+	} else {
+		error = user_path_at(dfd, pathname, 0, &path);
+		pp = &path;
+	}
 	if (!error) {
-		struct inode *inode = path.dentry->d_inode;
+		struct inode *inode = pp->dentry->d_inode;
 
 		error = -EINVAL;
 		if (inode->i_op->readlink) {
-			error = security_inode_readlink(path.dentry);
+			error = security_inode_readlink(pp->dentry);
 			if (!error) {
-				touch_atime(path.mnt, path.dentry);
-				error = inode->i_op->readlink(path.dentry,
+				touch_atime(pp->mnt, pp->dentry);
+				error = inode->i_op->readlink(pp->dentry,
 							      buf, bufsiz);
 			}
 		}
-		path_put(&path);
+		if (file)
+			fput(file);
+		else
+			path_put(&path);
 	}
 	return error;
 }
-- 
1.7.1.78.g212f0
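
For illustration, a userspace sketch of what this patch is meant to
enable: reading a link target through a descriptor that refers to the
symlink itself. The patch above proposes passing a NULL pathname. The
sketch below instead assumes the empty-pathname form that readlinkat(2)
documents for mainline Linux since 2.6.39, and it obtains the descriptor
with O_PATH | O_NOFOLLOW rather than through open_by_handle. The helper
name read_link_via_fd is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_PATH */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000 /* asm-generic value; fallback assumed for this sketch */
#endif

/*
 * Hypothetical helper: open the symlink itself (not its target) and
 * read the target name through the resulting descriptor.
 * Returns the number of bytes placed in buf (not NUL-terminated),
 * or -1 on failure.
 */
static ssize_t read_link_via_fd(const char *linkpath, char *buf, size_t bufsiz)
{
	ssize_t n;
	int fd = open(linkpath, O_PATH | O_NOFOLLOW);

	if (fd < 0)
		return -1;
	/* Empty pathname: operate on the link that fd refers to. */
	n = readlinkat(fd, "", buf, bufsiz);
	close(fd);
	return n;
}
```

As with readlink(2), the caller has to add the terminating NUL itself
if it wants to treat the result as a C string.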



* [PATCH -V11 6/9] ext4: Copy fs UUID to superblock
  2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
                   ` (4 preceding siblings ...)
  2010-05-20  7:35 ` [PATCH -V11 5/9] vfs: Support null pathname in readlink Aneesh Kumar K.V
@ 2010-05-20  7:35 ` Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 7/9] x86: Add new syscalls for x86_32 Aneesh Kumar K.V
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
	Aneesh Kumar K.V

This enables userspace applications to find the file system
UUID using the sys_name_to_handle_at syscall.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/ext4/super.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index e14d22c..02ba21b 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2828,6 +2828,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_qcop = &ext4_qctl_operations;
 	sb->dq_op = &ext4_quota_operations;
 #endif
+	memcpy(sb->s_uuid, es->s_uuid, sizeof(es->s_uuid));
+
 	INIT_LIST_HEAD(&sbi->s_orphan); /* unlinked but open files */
 	mutex_init(&sbi->s_orphan_lock);
 	mutex_init(&sbi->s_resize_lock);
-- 
1.7.1.78.g212f0



* [PATCH -V11 7/9] x86: Add new syscalls for x86_32
  2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
                   ` (5 preceding siblings ...)
  2010-05-20  7:35 ` [PATCH -V11 6/9] ext4: Copy fs UUID to superblock Aneesh Kumar K.V
@ 2010-05-20  7:35 ` Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 8/9] x86: Add new syscalls for x86_64 Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 9/9] ext3: Copy fs UUID to superblock Aneesh Kumar K.V
  8 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
	Aneesh Kumar K.V

This patch adds sys_name_to_handle_at and sys_open_by_handle_at
syscalls to x86_32

Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/x86/include/asm/unistd_32.h   |    4 +++-
 arch/x86/kernel/syscall_table_32.S |    2 ++
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/unistd_32.h b/arch/x86/include/asm/unistd_32.h
index beb9b5f..06890db 100644
--- a/arch/x86/include/asm/unistd_32.h
+++ b/arch/x86/include/asm/unistd_32.h
@@ -343,10 +343,12 @@
 #define __NR_rt_tgsigqueueinfo	335
 #define __NR_perf_event_open	336
 #define __NR_recvmmsg		337
+#define __NR_name_to_handle_at	338
+#define __NR_open_by_handle_at  339
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 338
+#define NR_syscalls 340
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index 8b37293..646717f 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -337,3 +337,5 @@ ENTRY(sys_call_table)
 	.long sys_rt_tgsigqueueinfo	/* 335 */
 	.long sys_perf_event_open
 	.long sys_recvmmsg
+	.long sys_name_to_handle_at
+	.long sys_open_by_handle_at	/* 339 */
-- 
1.7.1.78.g212f0



* [PATCH -V11 8/9] x86: Add new syscalls for x86_64
  2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
                   ` (6 preceding siblings ...)
  2010-05-20  7:35 ` [PATCH -V11 7/9] x86: Add new syscalls for x86_32 Aneesh Kumar K.V
@ 2010-05-20  7:35 ` Aneesh Kumar K.V
  2010-05-20  7:35 ` [PATCH -V11 9/9] ext3: Copy fs UUID to superblock Aneesh Kumar K.V
  8 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
	Aneesh Kumar K.V

Add sys_name_to_handle_at and sys_open_by_handle_at syscalls
for x86_64

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/x86/ia32/ia32entry.S        |    2 ++
 arch/x86/include/asm/unistd_64.h |    4 ++++
 fs/compat.c                      |   11 +++++++++++
 fs/open.c                        |    4 ++--
 include/linux/fs.h               |    1 +
 5 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index e790bc1..99f9623 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -842,4 +842,6 @@ ia32_sys_call_table:
 	.quad compat_sys_rt_tgsigqueueinfo	/* 335 */
 	.quad sys_perf_event_open
 	.quad compat_sys_recvmmsg
+	.quad sys_name_to_handle_at
+	.quad compat_sys_open_by_handle_at	/* 339 */
 ia32_syscall_end:
diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
index ff4307b..5146adf 100644
--- a/arch/x86/include/asm/unistd_64.h
+++ b/arch/x86/include/asm/unistd_64.h
@@ -663,6 +663,10 @@ __SYSCALL(__NR_rt_tgsigqueueinfo, sys_rt_tgsigqueueinfo)
 __SYSCALL(__NR_perf_event_open, sys_perf_event_open)
 #define __NR_recvmmsg				299
 __SYSCALL(__NR_recvmmsg, sys_recvmmsg)
+#define __NR_name_to_handle_at			300
+__SYSCALL(__NR_name_to_handle_at, sys_name_to_handle_at)
+#define __NR_open_by_handle_at			301
+__SYSCALL(__NR_open_by_handle_at, sys_open_by_handle_at)
 
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
diff --git a/fs/compat.c b/fs/compat.c
index 0544873..0cbff4d 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -2308,3 +2308,14 @@ asmlinkage long compat_sys_timerfd_gettime(int ufd,
 }
 
 #endif /* CONFIG_TIMERFD */
+
+/*
+ * Exactly like fs/open.c:sys_open_by_handle_at(), except that it
+ * doesn't set the O_LARGEFILE flag.
+ */
+asmlinkage long
+compat_sys_open_by_handle_at(int mountdirfd,
+			struct file_handle __user *handle, int flags)
+{
+	return do_sys_open_by_handle(mountdirfd, handle, flags);
+}
diff --git a/fs/open.c b/fs/open.c
index 9f8a09a..e9c4637 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1369,8 +1369,8 @@ out_err:
 	return ERR_PTR(retval);
 }
 
-static long do_sys_open_by_handle(int mountdirfd,
-				struct file_handle __user *ufh, int open_flag)
+long do_sys_open_by_handle(int mountdirfd,
+			struct file_handle __user *ufh, int open_flag)
 {
 	int acc_mode;
 	int fd, retval = 0;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c30940c..3fe241a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1929,6 +1929,7 @@ extern struct file * dentry_open(struct dentry *, struct vfsmount *, int,
 				 const struct cred *);
 extern int filp_close(struct file *, fl_owner_t id);
 extern char * getname(const char __user *);
+extern long do_sys_open_by_handle(int, struct file_handle __user *, int);
 
 /* fs/ioctl.c */
 
-- 
1.7.1.78.g212f0



* [PATCH -V11 9/9] ext3: Copy fs UUID to superblock.
  2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
                   ` (7 preceding siblings ...)
  2010-05-20  7:35 ` [PATCH -V11 8/9] x86: Add new syscalls for x86_64 Aneesh Kumar K.V
@ 2010-05-20  7:35 ` Aneesh Kumar K.V
  8 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2010-05-20  7:35 UTC (permalink / raw
  To: hch, viro, adilger, corbet, serue, neilb, hooanon05
  Cc: linux-fsdevel, sfrench, philippe.deniel, linux-kernel,
	Aneesh Kumar K.V

This enables userspace applications to get the file system UUID
using the sys_name_to_handle_at syscall.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/ext3/super.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index 1bee604..7c304e7 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -1928,6 +1928,7 @@ static int ext3_fill_super (struct super_block *sb, void *data, int silent)
 	sb->s_qcop = &ext3_qctl_operations;
 	sb->dq_op = &ext3_quota_operations;
 #endif
+	memcpy(sb->s_uuid, es->s_uuid, sizeof(es->s_uuid));
 	INIT_LIST_HEAD(&sbi->s_orphan); /* unlinked but open files */
 	mutex_init(&sbi->s_orphan_lock);
 	mutex_init(&sbi->s_resize_lock);
-- 
1.7.1.78.g212f0



* Re: [PATCH -V11 2/9] vfs: Add name to file handle conversion support
  2010-05-20  7:35 ` [PATCH -V11 2/9] vfs: Add name to file handle conversion support Aneesh Kumar K.V
@ 2010-05-21 22:15   ` J. Bruce Fields
  2010-05-22  9:04     ` Aneesh Kumar K. V
  0 siblings, 1 reply; 16+ messages in thread
From: J. Bruce Fields @ 2010-05-21 22:15 UTC (permalink / raw
  To: Aneesh Kumar K.V
  Cc: hch, viro, adilger, corbet, serue, neilb, hooanon05,
	linux-fsdevel, sfrench, philippe.deniel, linux-kernel

On Thu, May 20, 2010 at 01:05:31PM +0530, Aneesh Kumar K.V wrote:
> This patch adds a new superblock field, unsigned char s_uuid[16],
> to store the UUID mapping for the file system. The s_uuid[16] is used to
> identify the file system as part of file_handle.

Sorry, I lost track of the previous argument.  I thought that the
decision was to return just the file-identifying part of the filehandle
and let userspace tack on the uuid if that's what it wants?  That seems
the more flexible approach at least.

> 
> Acked-by: Serge Hallyn <serue@us.ibm.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/exportfs/expfs.c      |    2 +-
>  fs/open.c                |  100 ++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h       |   11 +++++
>  include/linux/syscalls.h |    3 +
>  4 files changed, 115 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> index cfee0f0..d103c31 100644
> --- a/fs/exportfs/expfs.c
> +++ b/fs/exportfs/expfs.c
> @@ -352,7 +352,7 @@ int exportfs_encode_fh(struct dentry *dentry, struct fid *fid, int *max_len,
>  	const struct export_operations *nop = dentry->d_sb->s_export_op;
>  	int error;
>  
> -	if (nop->encode_fh)
> +	if (nop && nop->encode_fh)

Is there any legitimate reason to call this with nop == NULL?  If not,
this should be a BUG() if we want to test for the case at all.

Some user documentation for the new interface would be helpful.  (The
sort of information that would go into the man page eventually, if not
actually in man page format.)

>  		error = nop->encode_fh(dentry, fid->raw, max_len, connectable);
>  	else
>  		error = export_encode_fh(dentry, fid, max_len, connectable);
> diff --git a/fs/open.c b/fs/open.c
> index 74e5cd9..c4c2577 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -30,6 +30,7 @@
>  #include <linux/falloc.h>
>  #include <linux/fs_struct.h>
>  #include <linux/ima.h>
> +#include <linux/exportfs.h>
>  
>  #include "internal.h"
>  
> @@ -1206,3 +1207,102 @@ int nonseekable_open(struct inode *inode, struct file *filp)
>  }
>  
>  EXPORT_SYMBOL(nonseekable_open);
> +
> +#ifdef CONFIG_EXPORTFS
> +/* limit the handle size to some value */
> +#define MAX_HANDLE_SZ 4096
> +static long do_sys_name_to_handle(struct path *path,
> +			struct file_handle __user *ufh)
> +{
> +	int retval;
> +	int handle_size;
> +	struct super_block *sb;
> +	struct file_handle f_handle;
> +	struct file_handle *handle = NULL;
> +
> +	if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
> +		retval = -EFAULT;
> +		goto err_out;
> +	}
> +	if (f_handle.handle_size > MAX_HANDLE_SZ) {
> +		retval = -EINVAL;
> +		goto err_out;
> +	}
> +	handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
> +			GFP_KERNEL);
> +	if (!handle) {
> +		retval = -ENOMEM;
> +		goto err_out;
> +	}
> +	handle_size = f_handle.handle_size;
> +
> +	/* we ask for a non connected handle */
> +	retval = exportfs_encode_fh(path->dentry,
> +				(struct fid *)handle->f_handle,
> +				&handle_size,  0);
> +	/* convert handle size to bytes */
> +	handle_size *= sizeof(u32);

I'm confused about units:

	- the max_len parameter is in 4-byte units.
	- So handle_size was also in 4-byte units before the above line.
	- So f_handle->handle_size was also in 4-byte units?  Then why
	  were we passing it to kmalloc?  And isn't that a confusing
	  convention for the length passed in from userspace?

> +	handle->handle_type = retval;

Don't we need to check for the special case retval == 255?

--b.

> +	handle->handle_size = handle_size;
> +	if (handle_size <= f_handle.handle_size) {
> +		/* get the uuid */
> +		sb = path->mnt->mnt_sb;
> +		memcpy(handle->fs_uuid,
> +			sb->s_uuid,
> +			sizeof(handle->fs_uuid));
> +		retval = 0;
> +	} else {
> +		/*
> +		 * set the handle_size to zero so we copy only
> +		 * non variable part of the file_handle
> +		 */
> +		handle_size = 0;
> +		retval = -EOVERFLOW;
> +	}
> +	if (copy_to_user(ufh, handle,
> +				sizeof(struct file_handle) + handle_size))
> +		retval = -EFAULT;
> +
> +	kfree(handle);
> +err_out:
> +	return retval;
> +}
> +
> +SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
> +		struct file_handle __user *, handle, int, flag)
> +{
> +
> +	int follow;
> +	long ret = -EINVAL;
> +	struct path path;
> +
> +	if ((flag & ~AT_SYMLINK_FOLLOW) != 0)
> +		goto err_out;
> +
> +	follow = (flag & AT_SYMLINK_FOLLOW) ? LOOKUP_FOLLOW : 0;
> +	ret = user_path_at(dfd, name, follow, &path);
> +	if (ret)
> +		goto err_out;
> +	/*
> +	 * We need to make sure whether the file system
> +	 * supports decoding of the file handle
> +	 */
> +	if (!path.mnt->mnt_sb->s_export_op ||
> +		!path.mnt->mnt_sb->s_export_op->fh_to_dentry) {
> +		ret = -EOPNOTSUPP;
> +		goto out_path;
> +	}
> +	ret = do_sys_name_to_handle(&path, handle);
> +
> +out_path:
> +	path_put(&path);
> +err_out:
> +	return ret;
> +}
> +#else
> +SYSCALL_DEFINE4(name_to_handle_at, int, dfd, const char __user *, name,
> +		struct file_handle __user *, handle, int, flag)
> +{
> +	return -ENOSYS;
> +}
> +#endif
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 44f35ae..d428b1a 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -948,6 +948,16 @@ struct file {
>  	unsigned long f_mnt_write_state;
>  #endif
>  };
> +
> +struct file_handle {
> +	int handle_size;
> +	int handle_type;
> +	/* File system UUID identifier */
> +	u8 fs_uuid[16];
> +	/* file identifier */
> +	unsigned char f_handle[0];
> +};
> +
>  extern spinlock_t files_lock;
>  #define file_list_lock() spin_lock(&files_lock);
>  #define file_list_unlock() spin_unlock(&files_lock);
> @@ -1358,6 +1368,7 @@ struct super_block {
>  	wait_queue_head_t	s_wait_unfrozen;
>  
>  	char s_id[32];				/* Informational name */
> +	u8 s_uuid[16];				/* UUID */
>  
>  	void 			*s_fs_info;	/* Filesystem private info */
>  	fmode_t			s_mode;
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 057929b..d0deef0 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -61,6 +61,7 @@ struct robust_list_head;
>  struct getcpu_cache;
>  struct old_linux_dirent;
>  struct perf_event_attr;
> +struct file_handle;
>  
>  #include <linux/types.h>
>  #include <linux/aio_abi.h>
> @@ -846,5 +847,7 @@ asmlinkage long sys_mmap_pgoff(unsigned long addr, unsigned long len,
>  			unsigned long prot, unsigned long flags,
>  			unsigned long fd, unsigned long pgoff);
>  asmlinkage long sys_old_mmap(struct mmap_arg_struct __user *arg);
> +asmlinkage long sys_name_to_handle_at(int dfd, const char __user *name,
> +				struct file_handle __user *handle, int flag);
>  
>  #endif
> -- 
> 1.7.1.78.g212f0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH -V11 1/9] exportfs: Return the minimum required handle size
  2010-05-20  7:35 ` [PATCH -V11 1/9] exportfs: Return the minimum required handle size Aneesh Kumar K.V
@ 2010-05-21 22:15   ` J. Bruce Fields
  2010-05-22  8:32     ` Aneesh Kumar K. V
  2010-05-22 15:27     ` Aneesh Kumar K. V
  0 siblings, 2 replies; 16+ messages in thread
From: J. Bruce Fields @ 2010-05-21 22:15 UTC (permalink / raw
  To: Aneesh Kumar K.V
  Cc: hch, viro, adilger, corbet, serue, neilb, hooanon05,
	linux-fsdevel, sfrench, philippe.deniel, linux-kernel

On Thu, May 20, 2010 at 01:05:30PM +0530, Aneesh Kumar K.V wrote:
> The exportfs encode handle function should return the minimum required
> handle size. This helps the user find the handle size by passing a
> handle size of 0 in the first call and then repeating the call with
> the returned handle size value.

The encode_fh() interface is a little confusing.  (Not your fault,
really, mainly it's the return value (and the special use of 255) that I
always find odd.)

But maybe it would help to have a little more documentation in the
export_encode_fh() kerneldoc comment and/or in
Documentation/filesystems/nfs/Exporting?

--b.
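
For reference, the convention being discussed can be modelled in a few
lines of plain C. This is only a toy sketch, not the kernel code: the
names toy_encode_fh and toy_encode_two_step and the TOY_* constants are
made up for illustration, and only the non-connectable case (two 4-byte
words) is modelled. It shows the 255 "no room" return value together
with the behaviour this patch adds, where *max_len is set to the
required length so the caller can retry.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_FILEID_NO_ROOM   255 /* special "buffer too small" return value */
#define TOY_FILEID_INO32_GEN 1   /* sketch-local stand-in for FILEID_INO32_GEN */

/*
 * Toy encoder: needs two 4-byte words (inode number, generation).
 * *max_len is in 4-byte words. On "no room" it reports the required
 * length through *max_len and leaves fh untouched.
 */
static int toy_encode_fh(uint32_t ino, uint32_t gen, uint32_t *fh, int *max_len)
{
	if (*max_len < 2) {
		*max_len = 2;
		return TOY_FILEID_NO_ROOM;
	}
	fh[0] = ino;
	fh[1] = gen;
	*max_len = 2;
	return TOY_FILEID_INO32_GEN;
}

/*
 * Two-step use: probe with length 0 to learn the required size, then
 * encode if the caller's buffer (cap_words) is large enough.
 * *used_words always ends up holding the required length.
 */
static int toy_encode_two_step(uint32_t ino, uint32_t gen, uint32_t *fh,
			       int cap_words, int *used_words)
{
	int len = 0;
	int type = toy_encode_fh(ino, gen, fh, &len);

	assert(type == TOY_FILEID_NO_ROOM); /* a zero-length probe never fits */
	*used_words = len;
	if (len > cap_words)
		return TOY_FILEID_NO_ROOM;
	return toy_encode_fh(ino, gen, fh, &len);
}
```

The return value carries the handle type on success and 255 on failure,
while the length travels separately through the in/out parameter, which
is the part that tends to confuse.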

> 
> Acked-by: Serge Hallyn <serue@us.ibm.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/btrfs/export.c             |    8 ++++++--
>  fs/exportfs/expfs.c           |    9 +++++++--
>  fs/fat/inode.c                |    4 +++-
>  fs/fuse/inode.c               |    4 +++-
>  fs/gfs2/export.c              |    8 ++++++--
>  fs/isofs/export.c             |    8 ++++++--
>  fs/ocfs2/export.c             |    8 +++++++-
>  fs/reiserfs/inode.c           |    7 ++++++-
>  fs/udf/namei.c                |    7 ++++++-
>  fs/xfs/linux-2.6/xfs_export.c |    4 +++-
>  mm/shmem.c                    |    4 +++-
>  11 files changed, 56 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
> index 951ef09..5f8ee5a 100644
> --- a/fs/btrfs/export.c
> +++ b/fs/btrfs/export.c
> @@ -21,9 +21,13 @@ static int btrfs_encode_fh(struct dentry *dentry, u32 *fh, int *max_len,
>  	int len = *max_len;
>  	int type;
>  
> -	if ((len < BTRFS_FID_SIZE_NON_CONNECTABLE) ||
> -	    (connectable && len < BTRFS_FID_SIZE_CONNECTABLE))
> +	if (connectable && (len < BTRFS_FID_SIZE_CONNECTABLE)) {
> +		*max_len = BTRFS_FID_SIZE_CONNECTABLE;
>  		return 255;
> +	} else if (len < BTRFS_FID_SIZE_NON_CONNECTABLE) {
> +		*max_len = BTRFS_FID_SIZE_NON_CONNECTABLE;
> +		return 255;
> +	}
>  
>  	len  = BTRFS_FID_SIZE_NON_CONNECTABLE;
>  	type = FILEID_BTRFS_WITHOUT_PARENT;
> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> index e9e1759..cfee0f0 100644
> --- a/fs/exportfs/expfs.c
> +++ b/fs/exportfs/expfs.c
> @@ -319,9 +319,14 @@ static int export_encode_fh(struct dentry *dentry, struct fid *fid,
>  	struct inode * inode = dentry->d_inode;
>  	int len = *max_len;
>  	int type = FILEID_INO32_GEN;
> -	
> -	if (len < 2 || (connectable && len < 4))
> +
> +	if (connectable && (len < 4)) {
> +		*max_len = 4;
> +		return 255;
> +	} else if (len < 2) {
> +		*max_len = 2;
>  		return 255;
> +	}
>  
>  	len = 2;
>  	fid->i32.ino = inode->i_ino;
> diff --git a/fs/fat/inode.c b/fs/fat/inode.c
> index 0ce143b..6f83bc7 100644
> --- a/fs/fat/inode.c
> +++ b/fs/fat/inode.c
> @@ -738,8 +738,10 @@ fat_encode_fh(struct dentry *de, __u32 *fh, int *lenp, int connectable)
>  	struct inode *inode =  de->d_inode;
>  	u32 ipos_h, ipos_m, ipos_l;
>  
> -	if (len < 5)
> +	if (len < 5) {
> +		*lenp = 5;
>  		return 255; /* no room */
> +	}
>  
>  	ipos_h = MSDOS_I(inode)->i_pos >> 8;
>  	ipos_m = (MSDOS_I(inode)->i_pos & 0xf0) << 24;
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index ec14d19..beaea69 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -638,8 +638,10 @@ static int fuse_encode_fh(struct dentry *dentry, u32 *fh, int *max_len,
>  	u64 nodeid;
>  	u32 generation;
>  
> -	if (*max_len < len)
> +	if (*max_len < len) {
> +		*max_len = len;
>  		return  255;
> +	}
>  
>  	nodeid = get_fuse_inode(inode)->nodeid;
>  	generation = inode->i_generation;
> diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
> index c22c211..d022236 100644
> --- a/fs/gfs2/export.c
> +++ b/fs/gfs2/export.c
> @@ -36,9 +36,13 @@ static int gfs2_encode_fh(struct dentry *dentry, __u32 *p, int *len,
>  	struct super_block *sb = inode->i_sb;
>  	struct gfs2_inode *ip = GFS2_I(inode);
>  
> -	if (*len < GFS2_SMALL_FH_SIZE ||
> -	    (connectable && *len < GFS2_LARGE_FH_SIZE))
> +	if (connectable && (*len < GFS2_LARGE_FH_SIZE)) {
> +		*len = GFS2_LARGE_FH_SIZE;
>  		return 255;
> +	} else if (*len < GFS2_SMALL_FH_SIZE) {
> +		*len = GFS2_SMALL_FH_SIZE;
> +		return 255;
> +	}
>  
>  	fh[0] = cpu_to_be32(ip->i_no_formal_ino >> 32);
>  	fh[1] = cpu_to_be32(ip->i_no_formal_ino & 0xFFFFFFFF);
> diff --git a/fs/isofs/export.c b/fs/isofs/export.c
> index ed752cb..dd4687f 100644
> --- a/fs/isofs/export.c
> +++ b/fs/isofs/export.c
> @@ -124,9 +124,13 @@ isofs_export_encode_fh(struct dentry *dentry,
>  	 * offset of the inode and the upper 16 bits of fh32[1] to
>  	 * hold the offset of the parent.
>  	 */
> -
> -	if (len < 3 || (connectable && len < 5))
> +	if (connectable && (len < 5)) {
> +		*max_len = 5;
> +		return 255;
> +	} else if (len < 3) {
> +		*max_len = 3;
>  		return 255;
> +	}
>  
>  	len = 3;
>  	fh32[0] = ei->i_iget5_block;
> diff --git a/fs/ocfs2/export.c b/fs/ocfs2/export.c
> index 19ad145..250a347 100644
> --- a/fs/ocfs2/export.c
> +++ b/fs/ocfs2/export.c
> @@ -201,8 +201,14 @@ static int ocfs2_encode_fh(struct dentry *dentry, u32 *fh_in, int *max_len,
>  		   dentry->d_name.len, dentry->d_name.name,
>  		   fh, len, connectable);
>  
> -	if (len < 3 || (connectable && len < 6)) {
> +	if (connectable && (len < 6)) {
>  		mlog(ML_ERROR, "fh buffer is too small for encoding\n");
> +		*max_len = 6;
> +		type = 255;
> +		goto bail;
> +	} else if (len < 3) {
> +		mlog(ML_ERROR, "fh buffer is too small for encoding\n");
> +		*max_len = 3;
>  		type = 255;
>  		goto bail;
>  	}
> diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c
> index dc2c65e..5fff1e2 100644
> --- a/fs/reiserfs/inode.c
> +++ b/fs/reiserfs/inode.c
> @@ -1588,8 +1588,13 @@ int reiserfs_encode_fh(struct dentry *dentry, __u32 * data, int *lenp,
>  	struct inode *inode = dentry->d_inode;
>  	int maxlen = *lenp;
>  
> -	if (maxlen < 3)
> +	if (need_parent && (maxlen < 5)) {
> +		*lenp = 5;
>  		return 255;
> +	} else if (maxlen < 3) {
> +		*lenp = 3;
> +		return 255;
> +	}
>  
>  	data[0] = inode->i_ino;
>  	data[1] = le32_to_cpu(INODE_PKEY(inode)->k_dir_id);
> diff --git a/fs/udf/namei.c b/fs/udf/namei.c
> index 7581602..37ce713 100644
> --- a/fs/udf/namei.c
> +++ b/fs/udf/namei.c
> @@ -1360,8 +1360,13 @@ static int udf_encode_fh(struct dentry *de, __u32 *fh, int *lenp,
>  	struct fid *fid = (struct fid *)fh;
>  	int type = FILEID_UDF_WITHOUT_PARENT;
>  
> -	if (len < 3 || (connectable && len < 5))
> +	if (connectable && (len < 5)) {
> +		*lenp = 5;
> +		return 255;
> +	} else if (len < 3) {
> +		*lenp = 3;
>  		return 255;
> +	}
>  
>  	*lenp = 3;
>  	fid->udf.block = location.logicalBlockNum;
> diff --git a/fs/xfs/linux-2.6/xfs_export.c b/fs/xfs/linux-2.6/xfs_export.c
> index 846b75a..82c0553 100644
> --- a/fs/xfs/linux-2.6/xfs_export.c
> +++ b/fs/xfs/linux-2.6/xfs_export.c
> @@ -81,8 +81,10 @@ xfs_fs_encode_fh(
>  	 * seven combinations work.  The real answer is "don't use v2".
>  	 */
>  	len = xfs_fileid_length(fileid_type);
> -	if (*max_len < len)
> +	if (*max_len < len) {
> +		*max_len = len;
>  		return 255;
> +	}
>  	*max_len = len;
>  
>  	switch (fileid_type) {
> diff --git a/mm/shmem.c b/mm/shmem.c
> index eef4ebe..bbeda1c 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2125,8 +2125,10 @@ static int shmem_encode_fh(struct dentry *dentry, __u32 *fh, int *len,
>  {
>  	struct inode *inode = dentry->d_inode;
>  
> -	if (*len < 3)
> +	if (*len < 3) {
> +		*len = 3;
>  		return 255;
> +	}
>  
>  	if (hlist_unhashed(&inode->i_hash)) {
>  		/* Unfortunately insert_inode_hash is not idempotent,
> -- 
> 1.7.1.78.g212f0
> 


* Re: [PATCH -V11 1/9] exportfs: Return the minimum required handle size
  2010-05-21 22:15   ` J. Bruce Fields
@ 2010-05-22  8:32     ` Aneesh Kumar K. V
  2010-05-22 15:27     ` Aneesh Kumar K. V
  1 sibling, 0 replies; 16+ messages in thread
From: Aneesh Kumar K. V @ 2010-05-22  8:32 UTC (permalink / raw
  To: J. Bruce Fields
  Cc: hch, viro, adilger, corbet, serue, neilb, hooanon05,
	linux-fsdevel, sfrench, philippe.deniel, linux-kernel

On Fri, 21 May 2010 18:15:16 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Thu, May 20, 2010 at 01:05:30PM +0530, Aneesh Kumar K.V wrote:
> > The exportfs encode handle function should return the minimum required
> > handle size. This helps the user find the handle size by passing a
> > handle size of 0 in the first call and then repeating the call with
> > the returned handle size value.
> 
> The encode_fh() interface is a little confusing.  (Not your fault,
> really, mainly it's the return value (and the special use of 255) that I
> always find odd.)
> 
> But maybe it would help to have a little more documentation in the
> export_encode_fh() kerneldoc comment and/or in
> Documentation/filesystems/nfs/Exporting?
> 


Will update in the next iteration.

-aneesh


* Re: [PATCH -V11 2/9] vfs: Add name to file handle conversion support
  2010-05-21 22:15   ` J. Bruce Fields
@ 2010-05-22  9:04     ` Aneesh Kumar K. V
  0 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K. V @ 2010-05-22  9:04 UTC (permalink / raw
  To: J. Bruce Fields
  Cc: hch, viro, adilger, corbet, serue, neilb, hooanon05,
	linux-fsdevel, sfrench, philippe.deniel, linux-kernel

On Fri, 21 May 2010 18:15:07 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Thu, May 20, 2010 at 01:05:31PM +0530, Aneesh Kumar K.V wrote:
> > This patch adds a new superblock field, unsigned char s_uuid[16],
> > to store the UUID mapping for the file system. The s_uuid[16] is used to
> > identify the file system as part of file_handle.
> 
> Sorry, I lost track of the previous argument.  I thought that the
> decision was to return just the file-identifying part of the filehandle
> and let userspace tack on the uuid if that's what it wants?  That seems
> the more flexible approach at least.

Having the UUID as part of the handle lets us use the handle directly
in userspace as a file identifier. In most cases the UUID should work as
an identifier for the file system. What has changed is that, even though
we add the UUID in the name-to-handle conversion, we don't use the UUID to
identify the file system in the handle-to-open path. Rather, userspace
should map the UUID, or whatever unique identifier it has decided to use
for the file system, to a mountdir fd and use that to identify the file
system in the sys_open_by_handle_at syscall.

The UUID is now part of the handle only as an easy way to get the 16-byte
unique identifier for the file system. Userspace applications such as an
NFS server still have to check, for their list of exported file systems,
whether the UUID is the correct unique identifier. If it is, they can
directly use the file handle returned from the syscall. If not, the NFS
server will have to find an alternative way to uniquely identify the file
system.

So instead of doing

sys_name_to_handle("/tmp/a", handle);
statfs("/tmp/a")
memcpy(handle.f_fsid, statfs.f_fsid);

we can now do 

sys_name_to_handle("/tmp/a", handle);
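
The userspace side of the UUID-to-mountdirfd mapping described above
could be as simple as a lookup table. The following is a minimal sketch
under that assumption. The struct mount_entry and lookup_mountdirfd
names are hypothetical and not part of this patch set. A real server
would fill the table from its export list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UUID_LEN 16 /* matches the 16-byte fs_uuid field in the handle */

/* Hypothetical table entry: one per exported file system. */
struct mount_entry {
	unsigned char uuid[UUID_LEN];
	int mountdirfd; /* fd opened on a directory of that mount */
};

/*
 * Return the mountdirfd registered for uuid, or -1 if the UUID is
 * unknown. The result is what would be passed as the first argument
 * of the open-by-handle call.
 */
static int lookup_mountdirfd(const struct mount_entry *tbl, size_t n,
			     const unsigned char uuid[UUID_LEN])
{
	size_t i;

	for (i = 0; i < n; i++)
		if (memcmp(tbl[i].uuid, uuid, UUID_LEN) == 0)
			return tbl[i].mountdirfd;
	return -1;
}
```

A linear scan is enough for a handful of exports. A server with many
exports might prefer a hash table keyed on the UUID.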


> 
> > 
> > Acked-by: Serge Hallyn <serue@us.ibm.com>
> > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > ---
> >  fs/exportfs/expfs.c      |    2 +-
> >  fs/open.c                |  100 ++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/fs.h       |   11 +++++
> >  include/linux/syscalls.h |    3 +
> >  4 files changed, 115 insertions(+), 1 deletions(-)
> > 
> > diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> > index cfee0f0..d103c31 100644
> > --- a/fs/exportfs/expfs.c
> > +++ b/fs/exportfs/expfs.c
> > @@ -352,7 +352,7 @@ int exportfs_encode_fh(struct dentry *dentry, struct fid *fid, int *max_len,
> >  	const struct export_operations *nop = dentry->d_sb->s_export_op;
> >  	int error;
> >  
> > -	if (nop->encode_fh)
> > +	if (nop && nop->encode_fh)
> 
> Is there any legitimate reason to call this with nop == NULL?  If not,
> this should be a BUG() if we want to test for the case at all.

For name_to_handle I have added a check to make sure we return
-EOPNOTSUPP.

       if (!path.mnt->mnt_sb->s_export_op ||
		!path.mnt->mnt_sb->s_export_op->fh_to_dentry) {
		ret = -EOPNOTSUPP;
		goto out_path;
	}

So maybe I can drop that nop == NULL check.

We want to retain the NULL check in exportfs_decode_fh because an
application can call open_by_handle with any mountdirfd, which may be for
a file system that doesn't support export operations.

> 
> Some user documentation for the new interface would be helpful.  (The
> sort of information that would go into the man page eventually, if not
> actually in man page format.)


Will add that in the next iteration.

> 
> >  		error = nop->encode_fh(dentry, fid->raw, max_len, connectable);
> >  	else
> >  		error = export_encode_fh(dentry, fid, max_len, connectable);
> > diff --git a/fs/open.c b/fs/open.c
> > index 74e5cd9..c4c2577 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -30,6 +30,7 @@
> >  #include <linux/falloc.h>
> >  #include <linux/fs_struct.h>
> >  #include <linux/ima.h>
> > +#include <linux/exportfs.h>
> >  
> >  #include "internal.h"
> >  
> > @@ -1206,3 +1207,102 @@ int nonseekable_open(struct inode *inode, struct file *filp)
> >  }
> >  
> >  EXPORT_SYMBOL(nonseekable_open);
> > +
> > +#ifdef CONFIG_EXPORTFS
> > +/* limit the handle size to some value */
> > +#define MAX_HANDLE_SZ 4096
> > +static long do_sys_name_to_handle(struct path *path,
> > +			struct file_handle __user *ufh)
> > +{
> > +	int retval;
> > +	int handle_size;
> > +	struct super_block *sb;
> > +	struct file_handle f_handle;
> > +	struct file_handle *handle = NULL;
> > +
> > +	if (copy_from_user(&f_handle, ufh, sizeof(struct file_handle))) {
> > +		retval = -EFAULT;
> > +		goto err_out;
> > +	}
> > +	if (f_handle.handle_size > MAX_HANDLE_SZ) {
> > +		retval = -EINVAL;
> > +		goto err_out;
> > +	}
> > +	handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_size,
> > +			GFP_KERNEL);
> > +	if (!handle) {
> > +		retval = -ENOMEM;
> > +		goto err_out;
> > +	}
> > +	handle_size = f_handle.handle_size;
> > +
> > +	/* we ask for a non connected handle */
> > +	retval = exportfs_encode_fh(path->dentry,
> > +				(struct fid *)handle->f_handle,
> > +				&handle_size,  0);
> > +	/* convert handle size to bytes */
> > +	handle_size *= sizeof(u32);
> 
> I'm confused about units:
> 
> 	- the max_len parameter is in 4-byte units.
> 	- So handle_size was also in 4-byte units before the above line.
> 	- So f_handle->handle_size was also in 4-byte units?  Then why
> 	  were we passing it to kmalloc?  And isn't that a confusing
> 	  convention for the length passed in from userspace?

The units passed from userspace should be in bytes. So yes, I missed a

handle_size = f_handle.handle_size >> 2;
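
To make the unit convention explicit, here is a small sketch of the two
conversions involved, assuming the size coming from userspace is in bytes
and the max_len argument of encode_fh counts 4-byte words. The helper
names are hypothetical and do not appear in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes supplied by the caller -> whole 4-byte words that can be offered
 * to encode_fh. Rounds down, so a 7-byte buffer offers only one word.
 */
static inline int bytes_to_fh_words(size_t bytes)
{
	return (int)(bytes >> 2);
}

/* Words reported back through max_len -> bytes to report to userspace. */
static inline size_t fh_words_to_bytes(int words)
{
	return (size_t)words * sizeof(uint32_t);
}
```

Keeping the byte-based value for kmalloc and the user-visible handle_size,
and converting only at the exportfs_encode_fh call, avoids mixing the two
units in one variable.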


> 
> > +	handle->handle_type = retval;
> 
> Don't we need to check for the special case retval == 255?

encode_fh returns retval == 255 when we don't have enough space to copy the handle, right?
That should be caught by

if (handle_size <= f_handle.handle_size)

Are there any other error conditions indicated by retval == 255? IIUC
encode_fh has two possible returns:

a) retval == 255 if handle_size is too small. I updated it to put the required
handle_size in max_len, so I capture this via the handle_size check.
b) Successfully copied the handle.

Did I miss any other return?
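
The two cases above can be sketched in userspace as follows. This is only
an illustration under stated assumptions: mock_encode_fh is a hypothetical
encoder that needs two 4-byte words and follows the convention discussed
here (required size written back to max_len, 255 returned when the buffer
is too small); mock_name_to_handle, its -1 return and the type value 1 are
likewise invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_FH_WORDS 2	/* hypothetical: this mock handle needs 2 words */
#define MOCK_FH_TYPE  1	/* hypothetical handle type */

/* max_len is in 4-byte words on input and on output. */
static int mock_encode_fh(uint32_t *fh, int *max_len)
{
	if (*max_len < MOCK_FH_WORDS) {
		*max_len = MOCK_FH_WORDS;	/* report the required size */
		return 255;			/* buffer too small */
	}
	fh[0] = 0x1234;	/* pretend inode number */
	fh[1] = 1;	/* pretend generation */
	*max_len = MOCK_FH_WORDS;
	return MOCK_FH_TYPE;
}

/*
 * buf_bytes is the caller's buffer size in bytes. *needed_bytes is always
 * set; returns 0 and fills *type on success, -1 if the buffer is too small.
 */
static int mock_name_to_handle(uint32_t *buf, size_t buf_bytes,
			       size_t *needed_bytes, int *type)
{
	int words = (int)(buf_bytes >> 2);
	int ret = mock_encode_fh(buf, &words);

	*needed_bytes = (size_t)words * sizeof(uint32_t);
	if (ret == 255 || *needed_bytes > buf_bytes)
		return -1;
	*type = ret;
	return 0;
}
```

A caller can pass a zero-sized buffer first, read the required size from
needed_bytes, and then repeat the call with a buffer of that size.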


Thanks for reviewing the patches

-aneesh


* Re: [PATCH -V11 1/9] exportfs: Return the minimum required handle size
  2010-05-21 22:15   ` J. Bruce Fields
  2010-05-22  8:32     ` Aneesh Kumar K. V
@ 2010-05-22 15:27     ` Aneesh Kumar K. V
  2010-05-22 21:44       ` Neil Brown
  1 sibling, 1 reply; 16+ messages in thread
From: Aneesh Kumar K. V @ 2010-05-22 15:27 UTC
  To: J. Bruce Fields
  Cc: hch, viro, adilger, corbet, serue, neilb, hooanon05,
	linux-fsdevel, sfrench, philippe.deniel, linux-kernel

On Fri, 21 May 2010 18:15:16 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> On Thu, May 20, 2010 at 01:05:30PM +0530, Aneesh Kumar K.V wrote:
> > The exportfs encode handle function should return the minimum required
> > handle size. This helps user to find out the handle size by passing 0
> > handle size in the first step and then redoing to the call again with
> > the returned handle size value.
> 
> The encode_fh() interface is a little confusing.  (Not your fault,
> really, mainly it's the return value (and the special use of 255) that I
> always find odd.)
> 
> But maybe it would help to have a little more documention in the
> export_encode_fh() kerneldoc comment and/or in
> Documentation/filesystems/nfs/Exporting?
> 

Kernel documentation says 

 * encode_fh:
 *    @encode_fh should store in the file handle fragment @fh (using at most
 *    @max_len bytes) information that can be used by @decode_fh to recover the
 *    file refered to by the &struct dentry @de.  If the @connectable flag is
 *    set, the encode_fh() should store sufficient information so that a good
 *    attempt can be made to find not only the file but also it's place in the
 *    filesystem.   This typically means storing a reference to de->d_parent in
 *    the filehandle fragment.  encode_fh() should return the number of bytes
 *    stored or a negative error code such as %-ENOSPC
 *

Clearly the file system encode_fh implementations are not returning the
documented values. Should I fix the kernel to follow the documentation, or
should the kernel documentation be fixed? I would prefer changing the code,
because the documented behaviour looks easier and clearer to follow than
returning the value 255.

-aneesh


* Re: [PATCH -V11 1/9] exportfs: Return the minimum required handle size
  2010-05-22 15:27     ` Aneesh Kumar K. V
@ 2010-05-22 21:44       ` Neil Brown
  0 siblings, 0 replies; 16+ messages in thread
From: Neil Brown @ 2010-05-22 21:44 UTC
  To: Aneesh Kumar K. V
  Cc: J. Bruce Fields, hch, viro, adilger, corbet, serue, hooanon05,
	linux-fsdevel, sfrench, philippe.deniel, linux-kernel

On Sat, 22 May 2010 20:57:50 +0530
"Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com> wrote:

> On Fri, 21 May 2010 18:15:16 -0400, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > On Thu, May 20, 2010 at 01:05:30PM +0530, Aneesh Kumar K.V wrote:
> > > The exportfs encode handle function should return the minimum required
> > > handle size. This helps user to find out the handle size by passing 0
> > > handle size in the first step and then redoing to the call again with
> > > the returned handle size value.
> > 
> > The encode_fh() interface is a little confusing.  (Not your fault,
> > really, mainly it's the return value (and the special use of 255) that I
> > always find odd.)
> > 
> > But maybe it would help to have a little more documention in the
> > export_encode_fh() kerneldoc comment and/or in
> > Documentation/filesystems/nfs/Exporting?
> > 
> 
> Kernel documentation says 
> 
>  * encode_fh:
>  *    @encode_fh should store in the file handle fragment @fh (using at most
>  *    @max_len bytes) information that can be used by @decode_fh to recover the
>  *    file refered to by the &struct dentry @de.  If the @connectable flag is
>  *    set, the encode_fh() should store sufficient information so that a good
>  *    attempt can be made to find not only the file but also it's place in the
>  *    filesystem.   This typically means storing a reference to de->d_parent in
>  *    the filehandle fragment.  encode_fh() should return the number of bytes
>  *    stored or a negative error code such as %-ENOSPC
>  *
> 
> Clearly the file system encode_fh is not returning the correct return
> values. Should i fix the kernel to follow the documentation or should
> the kernel documentation should be fixed. I would prefer code, because
> the documentation look more easy/clear to follow that returning value 255.
>

The documentation is wrong in that it never returns the number of bytes.
The number of bytes is stored back in the 'max_len' by-reference argument.
The return value is a 'type' which is stored in the 4th byte of the
filehandle.

Error return is by a magic type number (255) simply because it is easier if
this is stored temporarily in fb_fileid_type which is __u8.  However it
doesn't need to be stored there.
code like
		_fh_update(fhp, fhp->fh_export, dentry);
		if (fhp->fh_handle.fh_fileid_type == 255)
			return nfserr_opnotsupp;

could be changed to
		err = _fh_update(fhp, fhp->fh_export, dentry);
		if (err < 0)
			return nfserr_opnotsupp;


and _fh_update could be changed from
		fhp->fh_handle.fh_fileid_type =
			exportfs_encode_fh(dentry, fid, &maxsize, subtreecheck);
to
		type = exportfs_encode_fh(dentry, fid, &maxsize, subtreecheck);
		if (type == 255) type = -ENOSPC; /* temp until filesystems changed*/
		if (type > 0)
			fhp->fh_handle.fh_fileid_type = type;
		...
		return type;


And the documentation should be changed to report how the size is returned
and that the return value is a type, or an error.
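
As a self-contained illustration of the change suggested above, the sketch
below maps the magic type 255 to a negative errno and stores the type only
when it is valid. All mock_* names are hypothetical userspace stand-ins,
not nfsd code, and -ENOSPC is taken from the snippet above.

```c
#include <errno.h>
#include <stdint.h>

#define MOCK_FILEID_INVALID 255	/* magic "could not encode" type */

/* Hypothetical stand-in for the part of the file handle holding the type. */
struct mock_fh {
	uint8_t fileid_type;
};

/* Translate the legacy magic value into a conventional negative errno. */
static int mock_fileid_type_or_errno(int type)
{
	if (type == MOCK_FILEID_INVALID)
		return -ENOSPC;
	return type;
}

/*
 * Store the type in the one-byte field only when it is valid, and return
 * either the type or a negative error so callers can test for err < 0.
 */
static int mock_fh_update(struct mock_fh *fh, int raw_type)
{
	int type = mock_fileid_type_or_errno(raw_type);

	if (type > 0)
		fh->fileid_type = (uint8_t)type;
	return type;
}
```

With this shape the caller no longer needs to inspect the one-byte field
for 255; a negative return carries the error instead.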

NeilBrown


Thread overview: 16+ messages
2010-05-20  7:35 [PATCH -V11 0/9] Generic name to handle and open by handle syscalls Aneesh Kumar K.V
2010-05-20  7:35 ` [PATCH -V11 1/9] exportfs: Return the minimum required handle size Aneesh Kumar K.V
2010-05-21 22:15   ` J. Bruce Fields
2010-05-22  8:32     ` Aneesh Kumar K. V
2010-05-22 15:27     ` Aneesh Kumar K. V
2010-05-22 21:44       ` Neil Brown
2010-05-20  7:35 ` [PATCH -V11 2/9] vfs: Add name to file handle conversion support Aneesh Kumar K.V
2010-05-21 22:15   ` J. Bruce Fields
2010-05-22  9:04     ` Aneesh Kumar K. V
2010-05-20  7:35 ` [PATCH -V11 3/9] vfs: Add open by file handle support Aneesh Kumar K.V
2010-05-20  7:35 ` [PATCH -V11 4/9] vfs: Allow handle based open on symlinks Aneesh Kumar K.V
2010-05-20  7:35 ` [PATCH -V11 5/9] vfs: Support null pathname in readlink Aneesh Kumar K.V
2010-05-20  7:35 ` [PATCH -V11 6/9] ext4: Copy fs UUID to superblock Aneesh Kumar K.V
2010-05-20  7:35 ` [PATCH -V11 7/9] x86: Add new syscalls for x86_32 Aneesh Kumar K.V
2010-05-20  7:35 ` [PATCH -V11 8/9] x86: Add new syscalls for x86_64 Aneesh Kumar K.V
2010-05-20  7:35 ` [PATCH -V11 9/9] ext3: Copy fs UUID to superblock Aneesh Kumar K.V
