From: Amir Goldstein <amir73il@gmail.com>
To: Jonathan Gilbert <logic@deltaq.org>
Cc: linux-fsdevel@vger.kernel.org, Jan Kara <jack@suse.cz>
Subject: Re: fanotify and files being moved or deleted
Date: Tue, 21 May 2024 20:13:38 +0300
Message-ID: <CAOQ4uxhM-KTafejKZOFmE9+REpYXqVcv_72d67qL-j6yHUriEw@mail.gmail.com>
In-Reply-To: <CAPSOpYsZCw_HJhskzfe3L9OHBZHm0x=P0hDsiNuFB6Lz_huHzw@mail.gmail.com>

On Tue, May 21, 2024 at 7:04 PM Jonathan Gilbert <logic@deltaq.org> wrote:
>
> Hmm, okay. In earlier testing, I must have had a bug because I wasn't
> seeing filenames for FAN_MOVE or FAN_DELETE. But, my code is more
> robust now, and when I switch it to those events I do see filenames --
> but not paths. Looks like I can do the open_by_handle_at trick on the
> fd in the main FAN_MOVED_FROM, FAN_MOVED_TO and FAN_DELETE event and
> that'll give me the directory path and then I can combine it with the
> file name in the info structure?
>

Yes. That's the idea.
open_by_handle_at() with the parent's file handle is guaranteed to return
a fd with "connected" path (i.e. known path), unless that directory was deleted.

Note that you will be combining the *current* directory path with the *past*
filename, so you may get a path that never existed in reality, but, as you
wrote, fanotify is not meant for keeping historical records of the filesystem
namespace.
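
A minimal sketch of that combination, not a definitive implementation.
open_parent() and event_path() are hypothetical helper names, and reading
the /proc/self/fd symlink is just one assumed way to learn the current path
of the directory fd:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: open the parent directory from the file handle
 * found in the event's info record. open_by_handle_at() requires
 * CAP_DAC_READ_SEARCH, and mount_fd must refer to the right filesystem.
 */
int open_parent(int mount_fd, struct file_handle *fh)
{
    return open_by_handle_at(mount_fd, fh, O_RDONLY | O_DIRECTORY);
}

/*
 * Hypothetical helper: write "<current path of dirfd>/<name>" into out.
 * Returns 0 on success, -1 on error or if out is too small.
 */
int event_path(int dirfd, const char *name, char *out, size_t outlen)
{
    char link[64];
    char dir[PATH_MAX];
    ssize_t n;
    int len;

    assert(name != NULL && out != NULL);

    /* The symlink target is the directory's path as of right now. */
    snprintf(link, sizeof(link), "/proc/self/fd/%d", dirfd);
    n = readlink(link, dir, sizeof(dir) - 1);
    if (n < 0)
        return -1;
    dir[n] = '\0';

    /* Avoid a double slash when the directory is the root. */
    len = snprintf(out, outlen, "%s/%s",
                   strcmp(dir, "/") == 0 ? "" : dir, name);
    return (len < 0 || (size_t)len >= outlen) ? -1 : 0;
}
```

With dirfd = open_parent(mount_fd, handle) and the name taken from the
event's info record, event_path(dirfd, name, buf, sizeof(buf)) fills buf
with the combined path.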

> Are FAN_MOVED_FROM and FAN_MOVED_TO guaranteed to be emitted
> atomically, or is there a possibility they could be split up by other
> events? If so, could there be multiple overlapping
> FAN_MOVED_FROM/FAN_MOVED_TO pairs under the right circumstances??

You are looking for FAN_RENAME, the new event that combines
information from FAN_MOVED_FROM/FAN_MOVED_TO.

Unlike FAN_MOVED_FROM/FAN_MOVED_TO, FAN_RENAME cannot
be merged with other events like FAN_CREATE/FAN_DELETE because
it does not carry the same type of information.
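
As a rough sketch of how the two names might be pulled out of a single
FAN_RENAME event: the helper below walks the info records that follow the
event metadata and returns the name stored in the record of the requested
type. rename_name() is a hypothetical name; the record layout (a struct
file_handle followed by a NUL-terminated name) follows the example in
fanotify(7), and the fallback defines assume the uapi values for headers
older than Linux 5.17.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/fanotify.h>

/* Fallbacks for older headers (values as in linux/fanotify.h). */
#ifndef FAN_EVENT_INFO_TYPE_OLD_DFID_NAME
#define FAN_EVENT_INFO_TYPE_OLD_DFID_NAME 10
#endif
#ifndef FAN_EVENT_INFO_TYPE_NEW_DFID_NAME
#define FAN_EVENT_INFO_TYPE_NEW_DFID_NAME 12
#endif

/*
 * Hypothetical helper: scan info_len bytes of info records starting at
 * info and return the entry name from the first record whose type is
 * wanted, or NULL if there is no such record.
 */
const char *rename_name(const void *info, size_t info_len, int wanted)
{
    const char *p = info;
    const char *end = p + info_len;

    assert(info != NULL);

    while ((size_t)(end - p) >= sizeof(struct fanotify_event_info_header)) {
        const struct fanotify_event_info_header *hdr = (const void *)p;

        if (hdr->len < sizeof(*hdr) || hdr->len > (size_t)(end - p))
            break;  /* malformed record, stop scanning */

        if (hdr->info_type == wanted) {
            const struct fanotify_event_info_fid *fid = (const void *)p;
            const struct file_handle *fh = (const void *)fid->handle;

            /* The NUL-terminated name follows the handle bytes. */
            return (const char *)fh->f_handle + fh->handle_bytes;
        }
        p += hdr->len;
    }
    return NULL;
}
```

Calling it once with FAN_EVENT_INFO_TYPE_OLD_DFID_NAME and once with
FAN_EVENT_INFO_TYPE_NEW_DFID_NAME on the same event yields the old and the
new name without having to pair up two separate events.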

>
> One other thing I'm seeing is that in enumerating the mount table in
> order to mark things, I find multiple entries with the same fsid.
> These seem to be cases where an item _inside another mount_ has been
> used as the device for a mount. One example is /boot/grub, which is
> mounted from /boot/efi/grub, where /boot/efi is itself mounted from a
> physical device.

Yes, this is called a bind mount, which can be created using:

    mount --bind /boot/efi/grub /boot/grub

> When enumerating the mounts, both of these return the
> same fsid from fstatfs. There is at least one other with such a
> collision, though it does not appear in fstab. Both the root
> filesystem / and a filesystem mounted at
> /var/snap/firefox/common/host-unspell return the same fsid. Does this
> mean that there is simply a category of event that cannot be
> guaranteed to return the correct path, because the only identifying
> information, the fsid, isn't guaranteed to be unique? Or is there a
> way to resolve this?

That depends on how you are setting up your watches.
Are you setting up FAN_MARK_FILESYSTEM watches on all
mounted filesystems?

Note that not all filesystems support NFS export file handles,
so not all filesystems support being watched with FAN_REPORT_FID and
FAN_MARK_FILESYSTEM.

If, for example, you care about reconstructing changes under certain
paths (e.g. /home), you can keep an open mount_fd of that path when you
start watching it and keep it in a hash table with fsid as the key
(that is how fsnotifywatch does it [1]) and then use that mount_fd whenever
you want to decode the path from a parent file handle.
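
A minimal sketch of such a table, under stated assumptions: a small linear
array stands in for the hash table, remember_mount() and lookup_mount_fd()
are hypothetical names, and this is not how fsnotifywatch itself is
written. When two watched paths report the same fsid, the first one
registered is kept, so the order of registration decides which mount_fd is
used for decoding.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/vfs.h>
#include <unistd.h>

#define MAX_WATCHED 64

struct mount_entry {
    fsid_t fsid;    /* key: fsid as reported by fstatfs() */
    int mount_fd;   /* value: fd kept open for open_by_handle_at() */
};

struct mount_entry mounts[MAX_WATCHED];
int nr_mounts;

/*
 * Look up the mount_fd for an 8-byte fsid, e.g. the fsid field of an
 * event's fid info record. Returns -1 if the fsid is unknown.
 */
int lookup_mount_fd(const void *fsid)
{
    for (int i = 0; i < nr_mounts; i++) {
        if (memcmp(&mounts[i].fsid, fsid, sizeof(fsid_t)) == 0)
            return mounts[i].mount_fd;
    }
    return -1;
}

/*
 * Open path and remember the fd under its fsid. If the fsid is already
 * known (e.g. a bind mount of a watched filesystem), keep the existing
 * entry and return its fd. Returns -1 on error or when the table is full.
 */
int remember_mount(const char *path)
{
    struct statfs st;
    int fd, known;

    assert(path != NULL);

    fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;
    if (fstatfs(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    known = lookup_mount_fd(&st.f_fsid);
    if (known >= 0 || nr_mounts == MAX_WATCHED) {
        close(fd);
        return known;   /* existing fd, or -1 if the table is full */
    }

    mounts[nr_mounts].fsid = st.f_fsid;
    mounts[nr_mounts].mount_fd = fd;
    nr_mounts++;
    return fd;
}
```

A caller would invoke remember_mount() for each path as it starts watching
it, and later pass the fsid from an event to lookup_mount_fd() to pick the
mount_fd for open_by_handle_at().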

If /home is a bind mount from, say, /data/home/ and you are watching
both /home and /data, you will need to figure out that they are the same
underlying fs and use a mount_fd of /data.

Then, when you get a file handle of, say, /home/docs, it will be resolved
to the path /data/home/docs.

If you try to resolve a file handle of /data/archive using a mount_fd that
was opened from /home, the path you will observe is "/", so this will
not be useful for constructing history.

So the answer really depends on what exactly the requirements
of your tool are.

Thanks,
Amir.

[1] https://github.com/inotify-tools/inotify-tools/blob/master/libinotifytools/src/inotifytools.cpp#L1343
