QEMU-Devel Archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org, berrange@redhat.com, armbru@redhat.com,
	Claudio Fontana <cfontana@suse.de>, Jim Fehlig <jfehlig@suse.com>
Subject: Re: [PATCH 2/9] migration: Fix file migration with fdset
Date: Fri, 3 May 2024 17:04:11 -0400	[thread overview]
Message-ID: <ZjVRS6yT6n7_wb0V@x1n> (raw)
In-Reply-To: <87a5l6oejr.fsf@suse.de>

On Fri, May 03, 2024 at 04:56:08PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Fri, Apr 26, 2024 at 11:20:35AM -0300, Fabiano Rosas wrote:
> >> When the migration using the "file:" URI was implemented, I don't
> >> think any of us noticed that if you pass in a file name with the
> >> format "/dev/fdset/N", this allows a file descriptor to be passed in
> >> to QEMU and that behaves just like the "fd:" URI. So the "file:"
> >> support has been added without regard for the fdset part and we got
> >> some things wrong.
> >> 
> >> The first issue is that we should not truncate the migration file if
> >> we're allowing an fd + offset. We need to leave the file contents
> >> untouched.
> >
> > I'm wondering whether we can use fallocate() instead on the ranges so that
> > we always don't open() with O_TRUNC.  Before that..  could you remind me
> > why do we need to truncate in the first place?  I definitely missed
> > something else here too.
> 
> AFAIK, just to avoid any issues if the file is pre-existing. I don't see
> the difference between O_TRUNC and fallocate in this case.

Then, shall we avoid truncations at all, leaving all the feasibility to
user (also errors prone to make)?

> 
> >
> >> 
> >> The second issue is that there's an expectation that QEMU removes the
> >> fd after the migration has finished. That's what the "fd:" code
> >> does. Otherwise a second migration on the same VM could attempt to
> >> provide an fdset with the same name and QEMU would reject it.
> >
> > Let me check what we do when with "fd:" and when migration completes or
> > cancels.
> >
> > IIUC it's qio_channel_file_close() that does the final cleanup work on
> > e.g. to_dst_file, right?  Then there's qemu_close(), and it has:
> >
> >     /* Close fd that was dup'd from an fdset */
> >     fdset_id = monitor_fdset_dup_fd_find(fd);
> >     if (fdset_id != -1) {
> >         int ret;
> >
> >         ret = close(fd);
> >         if (ret == 0) {
> >             monitor_fdset_dup_fd_remove(fd);
> >         }
> >
> >         return ret;
> >     }
> >
> > Shouldn't this done the work already?
> 
> That removes the mon_fdset_fd_dup->fd, we want to remove the
> mon_fdset_fd->fd.

What I read so far is when we are removing the dup-fds, we'll do one more
thing:

monitor_fdset_dup_fd_find_remove():
                    if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
                        monitor_fdset_cleanup(mon_fdset);
                    }

It means if we removed all the dup-fds correctly, we should also remove the
whole fdset, which includes the ->fds, IIUC.

> 
> >
> > Off topic: I think this code is over complicated too, maybe I missed
> > something, but afaict we don't need monitor_fdset_dup_fd_find at all.. we
> > simply walk the list and remove stuff..  I attach a patch at the end that I
> > tried to clean that up, just in case there's early comments.  But we can
> > ignore that so we don't get side-tracked, and focus on the direct-io
> > issues.
> 
> Well, I'm not confident touching this code. This is more than a decade
> old, I have no idea what the original motivations were. The possible
> interactions with the user via command-line (-add-fd), QMP (add-fd) and
> the monitor lifetime make me confused. Not to mention the fdset part
> being plumbed into the guts of a widely used qemu_open_internal() that
> very misleadingly presents itself as just a wrapper for open().

If to make QEMU long live, we'll probably need to touch it at some
point.. or at least discuss about it and figure things out. We pay tech
debts like this when there's no good comment / docs to refer in this case,
then the earlier, perhaps also the better.. to try taking the stab, imho.

Definitely not a request to clean everything up. :) Let's see whether
others can chim in with better knowledge of the history.

> 
> >
> > Thanks,
> >
> > =======
> >
> > From 2f6b6d1224486d8ee830a7afe34738a07003b863 Mon Sep 17 00:00:00 2001
> > From: Peter Xu <peterx@redhat.com>
> > Date: Fri, 3 May 2024 11:27:20 -0400
> > Subject: [PATCH] monitor: Drop monitor_fdset_dup_fd_add()
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=UTF-8
> > Content-Transfer-Encoding: 8bit
> >
> > This function is not needed, one remove function should already work.
> > Clean it up.
> >
> > Here the code doesn't really care about whether we need to keep that dupfd
> > around if close() failed: when that happens something got very wrong,
> > keeping the dup_fd around the fdsets may not help that situation so far.
> >
> > Cc: Dr. David Alan Gilbert <dave@treblig.org>
> > Cc: Markus Armbruster <armbru@redhat.com>
> > Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Daniel P. Berrangé <berrange@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  include/monitor/monitor.h |  1 -
> >  monitor/fds.c             | 27 +++++----------------------
> >  stubs/fdset.c             |  5 -----
> >  util/osdep.c              | 15 +--------------
> >  4 files changed, 6 insertions(+), 42 deletions(-)
> >
> > diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
> > index 965f5d5450..fd9b3f538c 100644
> > --- a/include/monitor/monitor.h
> > +++ b/include/monitor/monitor.h
> > @@ -53,7 +53,6 @@ AddfdInfo *monitor_fdset_add_fd(int fd, bool has_fdset_id, int64_t fdset_id,
> >                                  const char *opaque, Error **errp);
> >  int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags);
> >  void monitor_fdset_dup_fd_remove(int dup_fd);
> > -int64_t monitor_fdset_dup_fd_find(int dup_fd);
> >  
> >  void monitor_register_hmp(const char *name, bool info,
> >                            void (*cmd)(Monitor *mon, const QDict *qdict));
> > diff --git a/monitor/fds.c b/monitor/fds.c
> > index d86c2c674c..d5aecfb70e 100644
> > --- a/monitor/fds.c
> > +++ b/monitor/fds.c
> > @@ -458,7 +458,7 @@ int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags)
> >  #endif
> >  }
> >  
> > -static int64_t monitor_fdset_dup_fd_find_remove(int dup_fd, bool remove)
> > +void monitor_fdset_dup_fd_remove(int dup_fd)
> >  {
> >      MonFdset *mon_fdset;
> >      MonFdsetFd *mon_fdset_fd_dup;
> > @@ -467,31 +467,14 @@ static int64_t monitor_fdset_dup_fd_find_remove(int dup_fd, bool remove)
> >      QLIST_FOREACH(mon_fdset, &mon_fdsets, next) {
> >          QLIST_FOREACH(mon_fdset_fd_dup, &mon_fdset->dup_fds, next) {
> >              if (mon_fdset_fd_dup->fd == dup_fd) {
> > -                if (remove) {
> > -                    QLIST_REMOVE(mon_fdset_fd_dup, next);
> > -                    g_free(mon_fdset_fd_dup);
> > -                    if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
> > -                        monitor_fdset_cleanup(mon_fdset);
> > -                    }
> > -                    return -1;
> > -                } else {
> > -                    return mon_fdset->id;
> > +                QLIST_REMOVE(mon_fdset_fd_dup, next);
> > +                g_free(mon_fdset_fd_dup);
> > +                if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
> > +                    monitor_fdset_cleanup(mon_fdset);
> >                  }
> >              }
> >          }
> >      }
> > -
> > -    return -1;
> > -}
> > -
> > -int64_t monitor_fdset_dup_fd_find(int dup_fd)
> > -{
> > -    return monitor_fdset_dup_fd_find_remove(dup_fd, false);
> > -}
> > -
> > -void monitor_fdset_dup_fd_remove(int dup_fd)
> > -{
> > -    monitor_fdset_dup_fd_find_remove(dup_fd, true);
> >  }
> >  
> >  int monitor_fd_param(Monitor *mon, const char *fdname, Error **errp)
> > diff --git a/stubs/fdset.c b/stubs/fdset.c
> > index d7c39a28ac..389e368a29 100644
> > --- a/stubs/fdset.c
> > +++ b/stubs/fdset.c
> > @@ -9,11 +9,6 @@ int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags)
> >      return -1;
> >  }
> >  
> > -int64_t monitor_fdset_dup_fd_find(int dup_fd)
> > -{
> > -    return -1;
> > -}
> > -
> >  void monitor_fdset_dup_fd_remove(int dupfd)
> >  {
> >  }
> > diff --git a/util/osdep.c b/util/osdep.c
> > index e996c4744a..2d9749d060 100644
> > --- a/util/osdep.c
> > +++ b/util/osdep.c
> > @@ -393,21 +393,8 @@ int qemu_open_old(const char *name, int flags, ...)
> >  
> >  int qemu_close(int fd)
> >  {
> > -    int64_t fdset_id;
> > -
> >      /* Close fd that was dup'd from an fdset */
> > -    fdset_id = monitor_fdset_dup_fd_find(fd);
> > -    if (fdset_id != -1) {
> > -        int ret;
> > -
> > -        ret = close(fd);
> > -        if (ret == 0) {
> > -            monitor_fdset_dup_fd_remove(fd);
> > -        }
> > -
> > -        return ret;
> > -    }
> > -
> > +    monitor_fdset_dup_fd_remove(fd);
> >      return close(fd);
> >  }
> >  
> > -- 
> > 2.44.0
> 

-- 
Peter Xu



  reply	other threads:[~2024-05-03 21:05 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-26 14:20 [PATCH 0/9] migration/mapped-ram: Add direct-io support Fabiano Rosas
2024-04-26 14:20 ` [PATCH 1/9] monitor: Honor QMP request for fd removal immediately Fabiano Rosas
2024-05-03 16:02   ` Peter Xu
2024-05-16 21:46     ` Fabiano Rosas
2024-05-08  7:17   ` Daniel P. Berrangé
2024-05-16 22:00     ` Fabiano Rosas
2024-05-17  7:33       ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 2/9] migration: Fix file migration with fdset Fabiano Rosas
2024-05-03 16:23   ` Peter Xu
2024-05-03 19:56     ` Fabiano Rosas
2024-05-03 21:04       ` Peter Xu [this message]
2024-05-03 21:31         ` Fabiano Rosas
2024-05-03 21:56           ` Peter Xu
2024-05-08  8:02     ` Daniel P. Berrangé
2024-05-08 12:49       ` Peter Xu
2024-05-08  8:00   ` Daniel P. Berrangé
2024-05-08 20:45     ` Fabiano Rosas
2024-04-26 14:20 ` [PATCH 3/9] tests/qtest/migration: Fix file migration offset check Fabiano Rosas
2024-05-03 16:47   ` Peter Xu
2024-05-03 20:36     ` Fabiano Rosas
2024-05-03 21:08       ` Peter Xu
2024-05-08  8:10       ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 4/9] migration: Add direct-io parameter Fabiano Rosas
2024-04-26 14:33   ` Markus Armbruster
2024-05-03 18:05   ` Peter Xu
2024-05-03 20:49     ` Fabiano Rosas
2024-05-03 21:16       ` Peter Xu
2024-05-14 14:10         ` Markus Armbruster
2024-05-14 17:57           ` Fabiano Rosas
2024-05-15  7:17             ` Markus Armbruster
2024-05-15 12:51               ` Fabiano Rosas
2024-05-08  8:25   ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 5/9] migration/multifd: Add direct-io support Fabiano Rosas
2024-05-03 18:29   ` Peter Xu
2024-05-03 20:54     ` Fabiano Rosas
2024-05-03 21:18       ` Peter Xu
2024-05-08  8:27   ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 6/9] tests/qtest/migration: Add tests for file migration with direct-io Fabiano Rosas
2024-05-03 18:38   ` Peter Xu
2024-05-03 21:05     ` Fabiano Rosas
2024-05-03 21:25       ` Peter Xu
2024-05-08  8:34   ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 7/9] monitor: fdset: Match against O_DIRECT Fabiano Rosas
2024-05-03 18:53   ` Peter Xu
2024-05-03 21:19     ` Fabiano Rosas
2024-05-03 22:16       ` Peter Xu
2024-04-26 14:20 ` [PATCH 8/9] migration: Add support for fdset with multifd + file Fabiano Rosas
2024-05-08  8:53   ` Daniel P. Berrangé
2024-05-08 18:23     ` Peter Xu
2024-05-08 20:39       ` Fabiano Rosas
2024-05-09  8:08         ` Daniel P. Berrangé
2024-05-17 22:43           ` Fabiano Rosas
2024-05-18  8:36             ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 9/9] tests/qtest/migration: Add a test for mapped-ram with passing of fds Fabiano Rosas
2024-05-08  8:56   ` Daniel P. Berrangé
2024-05-02 20:01 ` [PATCH 0/9] migration/mapped-ram: Add direct-io support Peter Xu
2024-05-02 20:34   ` Fabiano Rosas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZjVRS6yT6n7_wb0V@x1n \
    --to=peterx@redhat.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=cfontana@suse.de \
    --cc=farosas@suse.de \
    --cc=jfehlig@suse.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).