Subject: Re: A bug of Monitor Chardev ?
To: Daniel P. Berrangé, "Markus Armbruster"
Cc: chenjiashang@huawei.com, qemu-devel@nongnu.org, Peter Xu, "Gonglei (Arei)", pbonzini@redhat.com, marcandre.lureau@redhat.com
References: <87o8cgxxel.fsf@dusky.pond.sub.org>
From: "Longpeng (Mike, Cloud Infrastructure Service Product Dept.)"
Message-ID: <6528dfe1-7cc7-c530-a56f-06517a627cda@huawei.com>
Date: Wed, 9 Jun 2021 08:20:56 +0800

On 2021/6/8 23:37, Daniel P. Berrangé wrote:
> On Tue, Jun 08, 2021 at 04:07:30PM +0200, Markus Armbruster wrote:
>> "Longpeng (Mike, Cloud Infrastructure Service Product Dept.)"
>> writes:
>>
>>> We find a race during QEMU startup, which would cause the QEMU process to coredump.
>>>
>>>                                             |
>>>  [1] create MON chardev                     |
>>>      qemu_create_early_backends             |
>>>        chardev_init_func                    |
>>>                                             |
>>>  [2] create MON iothread                    |
>>>      qemu_create_late_backends              |
>>>        mon_init_func                        |
>>>          aio_bh_schedule-----------------------> monitor_qmp_setup_handlers_bh
>>>  [3] enter main loop                        |      tcp_chr_update_read_handler
>>>      (* A client comes in, e.g. Libvirt *)  |        update_ioc_handlers
>>>      tcp_chr_new_client                     |
>>>        update_ioc_handlers                  |
>>>                                             |
>>>  [4] create new hup_source                  |
>>>      s->hup_source = *PTR1*                 |
>>>      g_source_attach(s->hup_source)         |
>>>                                             | [5] remove_hup_source(*PTR1*)
>>>                                             |     (create new hup_source)
>>>                                             |     s->hup_source = *PTR2*
>>>  [6] g_source_attach_unlocked               |
>>>      *PTR1* is freed by [5]                 |
>>>
>>> Do you have any suggestion to fix this bug ? Thanks!
>>
>> Do we?  We talked, but I'm not sure we reached a conclusion.
>
> Seems like we ended up with two options.
>
>   1. A workaround for the current specific problem by rearranging
>      the initialization code in the monitor a little.
>
>   2. A design fix of splitting the chardev creation into two
>      parts, one creation, and one activation.
>
> The latter is significantly more work, but is a better long term bet IMHO.
> But what we really need is someone motivated to actually implement one of the
> two options.
>

How about the following implementation of option 1? We've tested it for
several weeks, and it works fine.

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index a484641..ecb3db9 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -722,6 +722,19 @@ static void tcp_chr_update_read_handler(Chardev *chr)
     update_ioc_handlers(s);
 }
 
+static void tcp_chr_disable_handler(Chardev *chr)
+{
+    SocketChardev *s = SOCKET_CHARDEV(chr);
+
+    if (s->listener && s->state == TCP_CHARDEV_STATE_DISCONNECTED) {
+        qio_net_listener_set_client_func_full(s->listener, NULL, NULL,
+                                              NULL, chr->gcontext);
+    }
+
+    remove_fd_in_watch(chr);
+    remove_hup_source(s);
+}
+
 static bool tcp_chr_is_connected(Chardev *chr)
 {
     SocketChardev *s = SOCKET_CHARDEV(chr);
@@ -1703,6 +1716,7 @@ static void char_socket_class_init(ObjectClass *oc, void *data)
     cc->chr_add_watch = tcp_chr_add_watch;
     cc->chr_set_reconnect_time = tcp_chr_set_reconnect_time;
     cc->chr_update_read_handler = tcp_chr_update_read_handler;
+    cc->chr_disable_handler = tcp_chr_disable_handler;
     cc->chr_is_connected = tcp_chr_is_connected;
     cc->chr_get_connect_id = tcp_chr_get_connect_id;
 
diff --git a/chardev/char.c b/chardev/char.c
index ff0a3cf..990fe4f 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -238,6 +238,15 @@ void qemu_chr_be_update_read_handlers(Chardev *s,
     }
 }
 
+void qemu_chr_be_disable_handlers(Chardev *s)
+{
+    ChardevClass *cc = CHARDEV_GET_CLASS(s);
+
+    if (cc->chr_disable_handler) {
+        cc->chr_disable_handler(s);
+    }
+}
+
 int qemu_chr_add_client(Chardev *s, int fd)
 {
     return CHARDEV_GET_CLASS(s)->chr_add_client ?
diff --git a/include/chardev/char.h b/include/chardev/char.h
index d1ec628..7a8c740 100644
--- a/include/chardev/char.h
+++ b/include/chardev/char.h
@@ -212,6 +212,8 @@ void qemu_chr_be_write_impl(Chardev *s, uint8_t *buf, int len);
 void qemu_chr_be_update_read_handlers(Chardev *s,
                                       GMainContext *context);
 
+void qemu_chr_be_disable_handlers(Chardev *s);
+
 /**
  * qemu_chr_be_event:
  * @event: the event to send
@@ -282,6 +284,7 @@ typedef struct ChardevClass {
     int (*chr_sync_read)(Chardev *s, const uint8_t *buf, int len);
     GSource *(*chr_add_watch)(Chardev *s, GIOCondition cond);
     void (*chr_update_read_handler)(Chardev *s);
+    void (*chr_disable_handler)(Chardev *s);
     int (*chr_ioctl)(Chardev *s, int cmd, void *arg);
     int (*get_msgfds)(Chardev *s, int* fds, int num);
     int (*set_msgfds)(Chardev *s, int *fds, int num);
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 9a69ae4..2c2248c 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -413,11 +413,13 @@ void monitor_init_qmp(Chardev *chr, bool pretty)
          * e.g. the chardev is in client mode, with wait=on.
          */
         remove_fd_in_watch(chr);
+
         /*
-         * We can't call qemu_chr_fe_set_handlers() directly here
-         * since chardev might be running in the monitor I/O
-         * thread.  Schedule a bottom half.
+         * Before scheduling the bottom half, clean up the handlers in the
+         * default context to avoid a race between the main thread and iothread.
          */
+        qemu_chr_be_disable_handlers(chr);
+
         aio_bh_schedule_oneshot(iothread_get_aio_context(mon_iothread),
                                 monitor_qmp_setup_handlers_bh, mon);
         /* The bottom half will add @mon to @mon_list */
-- 
1.8.3.1

> Regards,
> Daniel
>
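
To make the intended ordering easier to review, here is a minimal, single-threaded toy model of the idea behind the patch: detach the chardev's handlers from the default context first, and only then let the monitor I/O thread install its own. This is only a sketch, not QEMU code. Every name in it (ToyChardev, ToyChardevClass, the CTX_* ids and the toy_* functions) is hypothetical, and plain integer ids stand in for GMainContext pointers and GSource objects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context ids; QEMU uses GMainContext pointers instead. */
enum { CTX_NONE = -1, CTX_MAIN = 0, CTX_IOTHREAD = 1 };

typedef struct ToyChardev ToyChardev;

/* Optional per-class hook, modelled on chr_disable_handler in the patch. */
typedef struct ToyChardevClass {
    void (*disable_handler)(ToyChardev *chr); /* NULL if unsupported */
} ToyChardevClass;

struct ToyChardev {
    const ToyChardevClass *cc;
    int listener_ctx; /* context that may accept a new client */
    int hup_ctx;      /* context that owns the hang-up source */
};

/* Socket-like backend: detach everything from its current context. */
static void toy_socket_disable_handler(ToyChardev *chr)
{
    chr->listener_ctx = CTX_NONE;
    chr->hup_ctx = CTX_NONE;
}

static const ToyChardevClass toy_socket_class = {
    .disable_handler = toy_socket_disable_handler,
};

/* Generic wrapper: call the hook only when the backend provides one. */
static void toy_chr_disable_handlers(ToyChardev *chr)
{
    if (chr->cc->disable_handler) {
        chr->cc->disable_handler(chr);
    }
}

/* True if code running in @ctx could still touch the chardev's sources. */
static bool toy_ctx_can_touch(const ToyChardev *chr, int ctx)
{
    return chr->listener_ctx == ctx || chr->hup_ctx == ctx;
}

/* Stand-in for the bottom half: re-attach the handlers in @ctx. */
static void toy_setup_handlers(ToyChardev *chr, int ctx)
{
    assert(ctx != CTX_NONE);
    chr->listener_ctx = ctx;
    chr->hup_ctx = ctx;
}

/* Walk through the ordering; returns 0 when every step holds. */
static int toy_demo(void)
{
    /* Early backend creation attaches the listener in the main context. */
    ToyChardev chr = { &toy_socket_class, CTX_MAIN, CTX_NONE };

    if (!toy_ctx_can_touch(&chr, CTX_MAIN)) {
        return 1;
    }
    /* Detach first, so a client arriving now cannot reach the sources. */
    toy_chr_disable_handlers(&chr);
    if (toy_ctx_can_touch(&chr, CTX_MAIN)) {
        return 2;
    }
    /* Only then let the I/O thread take ownership. */
    toy_setup_handlers(&chr, CTX_IOTHREAD);
    if (toy_ctx_can_touch(&chr, CTX_MAIN) ||
        !toy_ctx_can_touch(&chr, CTX_IOTHREAD)) {
        return 3;
    }
    return 0;
}
```

Calling toy_demo() walks through the three steps in order. The model does not reproduce the race itself; it only shows the invariant the patch aims for, namely that the main context no longer holds any handler by the time the bottom half runs in the iothread.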