All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: "Wang, Lei" <lei4.wang@intel.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Wang, Wei W" <wei.w.wang@intel.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"farosas@suse.de" <farosas@suse.de>
Subject: Re: [PATCH] migration: Yield coroutine when receiving MIG_CMD_POSTCOPY_LISTEN
Date: Wed, 3 Apr 2024 16:35:35 +0800	[thread overview]
Message-ID: <c0607330-60af-4e0f-819e-4a22a38edd6d@intel.com> (raw)
In-Reply-To: <Zgx7DI4LXYrR_dk-@x1n>

On 4/3/2024 5:39, Peter Xu wrote:>>>>>
>>>>> I'm not 100% sure such thing (no matter here or moved into it, which
>>>>> does look cleaner) would work for us.
>>>>>
>>>>> The problem is I still don't yet see an ordering restricted on top of
>>>>> (1)
>>>>> accept() happens, and (2) receive LISTEN cmd here.  What happens if
>>>>> the
>>>>> accept() request is not yet received when reaching LISTEN?  Or is it
>>>>> always guaranteed the accept(fd) will always be polled here?
>>>>>
>>>>> For example, the source QEMU (no matter pre-7.2 or later) will always
>>>>> setup the preempt channel asynchrounously, then IIUC it can connect()
>>>>> after sending the whole chunk of packed data which should include this
>>>>> LISTEN.  I think it means it's not guaranteed this will 100% work, but
>>>>> maybe further reduce the possibility of the race.
>>>>
>>>> I think the following code:
>>>>
>>>> postcopy_start() ->
>>>> 	postcopy_preempt_establish_channel() ->
>>>> 		qemu_sem_wait(&s->postcopy_qemufile_src_sem);
>>>>
>>>> can guarantee that the connect() syscall is successful so the destination side
>>>> receives the connect() request before it loads the LISTEN command, otherwise
>>>> it won't post the sem:
>>>>
>>>> postcopy_preempt_send_channel_new() ->
>>>> 	postcopy_preempt_send_channel_done() ->
>>>>     		qemu_sem_post(&s->postcopy_qemufile_src_sem);
>>>>
>>>
>>> Yes. But as mentioned in another thread, connect() and accept() are async.
>>> So in theory accept() could still come later after the LISTEN command.
>>
>> IIUC accept() is the callback and will be triggered by the connect() event.
>>
>> The reason accept() is not called in the destination is the main loop doesn't
>> get a chance to handle other events (connect()), so if we can guarantee
>> connect() is before LISTEN, then when handling the LISTEN cmd, we yield to the
>> main loop and the connect() event is there, then we can handle it by calling the
>> accept():
>>
>> qio_net_listener_channel_func
>> 	qio_channel_socket_accept
>> 		qemu_accept
>> 			accept
>>
>> so it seems the case accept() comes alter after LISTEN is in our expectation?
> 
> The major thing uncertain to me is "accept() will return with a valid fd"
> on dest host is not guaranteed to order against anything.
> 
> For example, I won't be surprised if a kernel implementation provides an
> async model of "accept()" syscall, so that even if the other side returned
> with "connect()", the "accept()" can still fetch nothing if the async model
> will need a delay for the new channel to be finally delivered to the
> "accept()" thread context.  It just sounds like tricky to rely on such
> thing.
> 
> What I proposed below shouldn't rely on any kernel details on how accept()
> could be implemented, it simply waits for the fd to be created before doing
> anything else (including creating the preempt thread and process packed
> data).

Thanks for the detailed explanation!

> 
>>
>>>
>>>>>
>>>>> One right fix that I can think of is moving the sem_wait(&done) into
>>>>> the main thread too, so we wait for the sem _before_ reading the
>>>>> packed data, so there's no chance of fault.  However I don't think
>>>>> sem_wait() will be smart enough to yield when in a coroutine..  In the
>>>>> long term run I think we should really make migration loadvm to do
>>>>> work in the thread rather than the main thread.  I think it means we
>>>>> have one more example to be listed in this todo so that's preferred..
>>>>>
>>>>> https://wiki.qemu.org/ToDo/LiveMigration#Create_a_thread_for_migration
>>>>> _destination
>>>>>
>>>>> I attached such draft patch below, but I'm not sure it'll work.  Let
>>>>> me know how both of you think about it.
>>>>
>>>> Sadly it doesn't work, there is an unknown segfault.
> 
> Could you paste the stack of the segfault ("(gdb) thread apply all bt")?
> Or help to figure out what is wrong?
> 
> Since I cannot reproduce myself, I may need your help debugging this.  If
> you agree with what I said above and agree on such fix, please also feel
> free to go ahead and fix the segfault.

We should change the following line from

	while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done, 100)) {

to

	while (qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done, 100)) {

After that fix, test passed and no segfault.

Given that the test shows a yield to the main loop won't introduce much overhead
(<1ms), how about first yield unconditionally, then we enter the while loop to
wait for several ms and yield periodically?


  reply	other threads:[~2024-04-03  8:36 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-29  3:32 [PATCH] migration: Yield coroutine when receiving MIG_CMD_POSTCOPY_LISTEN Lei Wang
2024-03-29  8:54 ` Wang, Wei W
2024-04-01 16:13   ` Peter Xu
2024-04-01 17:17     ` Fabiano Rosas
2024-04-01 18:47       ` Peter Xu
2024-04-01 21:22         ` Fabiano Rosas
2024-04-02  6:55     ` Wang, Lei
2024-04-02  7:25       ` Wang, Wei W
2024-04-02  9:28         ` Wang, Lei
2024-04-02 21:39           ` Peter Xu
2024-04-03  8:35             ` Wang, Lei [this message]
2024-04-03 14:42               ` Peter Xu
2024-04-03 16:04                 ` Wang, Wei W
2024-04-03 16:33                   ` Peter Xu
2024-04-04 10:11                     ` Wang, Wei W
2024-04-02  7:20     ` Wang, Wei W
2024-04-02 21:43       ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c0607330-60af-4e0f-819e-4a22a38edd6d@intel.com \
    --to=lei4.wang@intel.com \
    --cc=farosas@suse.de \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=wei.w.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.