virtio-comment.lists.oasis-open.org archive mirror
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: Jason Wang <jasowang@redhat.com>
Cc: "Zhu, Lingshan" <lingshan.zhu@intel.com>,
	"virtio-comment@lists.oasis-open.org"
	<virtio-comment@lists.oasis-open.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Parav Pandit <parav@nvidia.com>,
	Yui Washizu <yui.washidu@gmail.com>
Subject: Re: [virtio-comment] About the plan of Admin Queue
Date: Wed, 2 Aug 2023 14:55:50 +0800
Message-ID: <1690959350.2261508-2-xuanzhuo@linux.alibaba.com>
In-Reply-To: <CACGkMEsxOwyytv7+Ox4nGTCRKgatUbx6w2rrJqh3f9bfkYEs=A@mail.gmail.com>

On Wed, 2 Aug 2023 14:53:28 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Wed, Aug 2, 2023 at 2:37 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Wed, 2 Aug 2023 14:15:55 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > On Wed, Aug 2, 2023 at 2:01 PM Yui Washizu <yui.washidu@gmail.com> wrote:
> > > >
> > > >
> > > > On 2023/07/27 17:28, Jason Wang wrote:
> > > > > On Thu, Jul 27, 2023 at 4:20 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > >> On Thu, 27 Jul 2023 16:03:56 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > > >>> On Thu, Jul 27, 2023 at 2:23 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > > > >>>> On Thu, 27 Jul 2023 14:17:53 +0800, "Zhu, Lingshan" <lingshan.zhu@intel.com> wrote:
> > > > >>>>>
> > > > >>>>> On 7/27/2023 2:09 PM, Xuan Zhuo wrote:
> > > > >>>>>> On Thu, 27 Jul 2023 11:56:32 +0800, "Zhu, Lingshan" <lingshan.zhu@intel.com> wrote:
> > > > >>>>>>> On 7/27/2023 10:30 AM, Xuan Zhuo wrote:
> > > > >>>>>>>> On Mon, 3 Jul 2023 12:29:32 +0800, "Zhu, Lingshan" <lingshan.zhu@intel.com> wrote:
> > > > >>>>>>>>> On 6/30/2023 7:35 PM, Parav Pandit wrote:
> > > > >>>>>>>>>>> From: virtio-comment@lists.oasis-open.org <virtio-comment@lists.oasis-
> > > > >>>>>>>>>>> open.org> On Behalf Of Zhu, Lingshan
> > > > >>>>>>>>>>> Sent: Friday, June 30, 2023 6:33 AM
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Can we let the DPU notify the driver to create a new device from the
> > > > >>>>>>>>>>> backend?
> > > > >>>>>>>>>> Yes, why not.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>>>>> The key point is who wants to create a new device.
> > > > >>>>>>>>>>>>> DPU can come with a certain number of pre-created ADIs, just make
> > > > >>>>>>>>>>>>> sure the orchestration SW is aware of their device IDs.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>> Clouds often need these devices to be created dynamically, frequently after the host OS has booted.
> > > > >>>>>>>>>> To be more generic, those devices need to be created and connected to the host regardless of the life cycle of the host.
> > > > >>>>>>>>>> Xuan partly explained it.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>>>> If you want the DPU to create ADIs at arbitrary times and notify the driver, I
> > > > >>>>>>>>>>>>> think we need an interrupt, e.g., re-use the config interrupt. But why would the DPU
> > > > >>>>>>>>>>>>> want to create and hot plug a device into a guest?
> > > > >>>>>>>>>>>>> Shall the host handle that, or should the DPU pre-create them and then expose them to bare-metal
> > > > >>>>>>>>>>>>> machines?
> > > > >>>>>>>>>>>> In your scenario, the supervisor is on the OS, which controls the DPU
> > > > >>>>>>>>>>>> to create new devices.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> In the cloud scenario, the vendor manager is in the DPU, and the
> > > > >>>>>>>>>>>> entire host is for users. Of course, there are situations where the
> > > > >>>>>>>>>>>> vendor manager is in the host. But for bare-metal machines, the host
> > > > >>>>>>>>>>>> belongs to the customer, and the vendor manager is only in the DPU.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> So when a customer buys a new NIC for the host, the vendor manager
> > > > >>>>>>>>>>>> will plug a device into the host from the DPU.
> > > > >>>>>>>>>>> I understand that once a customer orders a new NIC, you want to present the NIC
> > > > >>>>>>>>>>> to the host.
> > > > >>>>>>>>>>> However, you only own the DPU and the customer owns the host, which means
> > > > >>>>>>>>>>> this creation and hot plug must be transparent to the host, and there may not be
> > > > >>>>>>>>>>> a host driver to help handle an interrupt/probe.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>> That is OK. When the driver is loaded, it would query its child devices and probe them, if we strictly want to follow the SIOV model.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> However this is not PCI which has a tree/switch and can enumerate devices to
> > > > >>>>>>>>>>> the host by spanning the device across the PCI hierarchy.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>> That enumeration is triggered by the parent PCI device, and the PCI bridge and switch will also discover it.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> To address an ADI, there is only a device_id.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>> An SIOV device must have a unique identifier at the PCI bus level for sure.
> > > > >>>>>>>>>> I cannot say more about it in this forum due to other logistics issues.
> > > > >>>>>>>>>> But assume that there is a PCI-level unique identifier for the SIOV device that switches on the path will learn about.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> So, do you mind sharing how your DPU offloads the device model? What kind of
> > > > >>>>>>>>>>> device does your DPU provide to the host? Let's see whether the DPU can mediate this
> > > > >>>>>>>>>>> on its own.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>> It is virtio NIC, blk, and other virtio devices for us.
> > > > >>>>>>>>>> A DPU hotplugs a device; the host side either gets an interrupt or learns about it later when it explicitly queries.
> > > > >>>>>>>>>> There is no mediation per se here; it is just a DPU-based SIOV device like a regular PF.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> For non-virtio DPU devices, I implemented them in Linux for DPUs two years ago.
> > > > >>>>>>>>>> You might find a Linux reference model useful at [1].
> > > > >>>>>>>>>> A usage model already exists in one OS and is in use for non-virtio devices.
> > > > >>>>>>>>>> This certainly works without SIOV unique PCI device identifiers, because a DPU (non-host) managed SIOV device spec still does not exist.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> For virtio, I think we should wait for this piece to be defined and leverage that, instead of the virtio TC creating its own.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> [1] https://github.com/Mellanox/scalablefunctions/wiki
> > > > >>>>>>>>> Well, I see SF facing a similar challenge. I can add a command for the
> > > > >>>>>>>>> driver to query all existing SIOV ADIs of a device,
> > > > >>>>>>>>> and the device returns the ADIs' IDs and status. Does that look good, and does it work for you,
> > > > >>>>>>>>> @Xuan?
> > > > >>>>>>>> Could I have your plan for this?
> > > > >>>>>>>>
> > > > >>>>>>>> If you do not mind, I'd like to add a command to query a VF's info, such
> > > > >>>>>>>> as MAC, IP, etc.
> > > > >>>>>>> I think the query commands for SIOV are a little more complex, e.g., they
> > > > >>>>>>> need to report the device type and its scale (e.g., features, mq).
> > > > >>>>>>> There can be thousands of SIOV ADIs, and we don't want to flood the output.
> > > > >>>>>>>
> > > > >>>>>>> We have discussed implementing a config interrupt to report newly
> > > > >>>>>>> created / deleted
> > > > >>>>>>> ADIs on the DPU side, therefore there must be a cap that contains the related
> > > > >>>>>>> information.
> > > > >>>>>>> My rough approach to the process is:
> > > > >>>>>>> 1) a cap contains the total number of existing ADIs and the max dev id
> > > > >>>>>>> 2) the driver queries detailed information of a certain ADI or a bunch of
> > > > >>>>>>> ADIs in a [dev_id....dev_id2] range.
> > > > >>>>>> Yes, the Admin Queue can obtain the info of one or more specific devices.
> > > > >>>>>>
> > > > >>>>>>> I am not sure whether a NIC stores its IP
> > > > >>>>>> IP is another topic. I want the Admin Queue to manage the switch,
> > > > >>>>>> so that the switch knows the IP of every device, and the
> > > > >>>>>> Admin Queue will have the ability to configure the IP of the device inside the
> > > > >>>>>> switch.
> > > > >>>>> DPU onboard switch? OVS? Is it beyond the virtio spec?
> > > > >>>> YES.
> > > > >>> Adding Washizu.
> > > > >>>
> > > > >>> We can have a switch/dpa defined in the networking device for sure.
> > > > >> Yes, I think we should introduce that for SR-IOV, or for others.
> > > > > This should be a general one as a switch should be transport independent.
> > > > >
> > > > >> I would like to know who is doing this?
> > > > > Washizu, could you confirm if you want to do this or not?
> > > >
> > > >
> > > > Does this mean adding a switch definition to the virtio spec?
> > > >
> > > >
> > > > If so, it will be necessary for the implementation of my plan,
> > > >
> > > > but it may take time (probably several months?) to get started,
> > > >
> > > > as I'm currently working on another task (virtio-net SR-IOV feature in
> > > > qemu).
> > > >
> > > > Anyone is welcome to work on adding the switch definition in the meantime,
> > > >
> > > > it's completely fine with me.
> > > >
> > > > I think I'll work on that if no one has finished the work.
> > >
> > > OK, great. I think we need to start this by considering reusing one of
> > > the existing DPA specs. (For example, ofdpa? or any other?)
> >
> > We have a need in this area. I want to start this work now.
> >
> > Could you give me the link to the spec?
>
> I'm not sure this is the best one; we can hear from others. I mention
> it since it has emulation code implemented in QEMU (rocker
> switch).
>
> The spec is:
>
> https://docs.broadcom.com/doc/12378911

OK.

I will study it. But I think we can start with a simple model. For virtio-net,
we don't need a full-featured switch. I would mainly consider the SR-IOV case.

Thanks.
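
To make the "simple model" concrete, below is a rough sketch of the
learning-bridge behaviour that was suggested earlier in this thread as a
starter. It is only an illustration under stated assumptions: the class
name, the port numbering (port 0 as the uplink/PF side, the other ports as
VFs) and the flood-on-miss policy are hypothetical and are taken neither
from the virtio spec nor from any existing implementation.

```python
from typing import Dict, List


class LearningBridge:
    """Toy L2 learning bridge (hypothetical; port 0 is assumed to be the uplink)."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        # Forwarding database: MAC address -> port it was last seen on.
        self.fdb: Dict[str, int] = {}

    def forward(self, in_port: int, src_mac: str, dst_mac: str) -> List[int]:
        """Learn the source MAC, then return the list of egress ports."""
        if not 0 <= in_port < self.num_ports:
            raise ValueError("unknown ingress port")
        # Learning step: remember which port the sender sits behind.
        self.fdb[src_mac] = in_port
        out_port = self.fdb.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress port.
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            # Destination is behind the ingress port: nothing to forward.
            return []
        return [out_port]


bridge = LearningBridge(num_ports=3)
first = bridge.forward(1, "52:54:00:00:00:01", "52:54:00:00:00:02")   # miss: flood
reply = bridge.forward(2, "52:54:00:00:00:02", "52:54:00:00:00:01")   # hit
second = bridge.forward(1, "52:54:00:00:00:01", "52:54:00:00:00:02")  # hit
```

Such a model needs no per-VF IP configuration at all; steering by IP would
be an extension on top of it.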

>
> (not sure this is the recent one though).
>
> Thanks
>
> >
> > Thanks
> >
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > >
> > > >
> > > > >
> > > > >> Another question: @Jason, are you referring to a new device type or a
> > > > >> new virtio-net feature?
> > > > > Extending virtio-net should be fine, did you see any issues for this?
> > > > >
> > > > >>>> For SIOV, I think this is a MUST.
> > > > >>> A learning bridge would be fine as a starter. It's better not to
> > > > >>> couple new scalable capability with any device specific features.
> > > > >>>
> > > > >>>> Maybe you have one simple implementation.
> > > > >>>> But you have to solve the IP steering. So the admin queue should have the ability
> > > > >>>> to configure the IP steering.
> > > > >>> I think not. Those L2/LN tables/filters are networking specific.
> > > > >> Let us assume that there is a switch/bridge firstly.
> > > > >> The VFs may be passed to different VMs.
> > > > >>
> > > > >> I also think this is networking specific. But I want to configure
> > > > >> the IP for every VF from the PF.
> > > > > What do you mean by ip here (e.g who is the user for this ip?)
> > > > >
> > > > >> Because the user of the vf may be unreliable.
> > > > >> We need a manager to config the ip for every vf.
> > > > > Did you mean you're using a tunnel or not?
> > > > >
> > > > >>
> > > > >>> Control virtqueue is better than admin virtqueue here.
> > > > >> by cq?
> > > > >>
> > > > >> What case?
> > > > > We've already used control virtqueue for steering.
> > > > >
> > > > > Thanks
> > > > >
> > > > >> Thanks.
> > > > >>
> > > > >>> Thanks
> > > > >>>
> > > > >>>> Thanks.
> > > > >>>>
> > > > >>>>>>
> > > > >>>>>> Thanks.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>> Thanks
> > > > >>>>>>>> Thanks.
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>
> > > > >>>>>>>>> Thanks
> > > > >>>>>>>>>
> > > > >>>>>>>> This publicly archived list offers a means to provide input to the
> > > > >>>>>>>> OASIS Virtual I/O Device (VIRTIO) TC.
> > > > >>>>>>>>
> > > > >>>>>>>> In order to verify user consent to the Feedback License terms and
> > > > >>>>>>>> to minimize spam in the list archive, subscription is required
> > > > >>>>>>>> before posting.
> > > > >>>>>>>>
> > > > >>>>>>>> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > >>>>>>>> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > >>>>>>>> List help: virtio-comment-help@lists.oasis-open.org
> > > > >>>>>>>> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > >>>>>>>> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > >>>>>>>> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > >>>>>>>> Committee: https://www.oasis-open.org/committees/virtio/
> > > > >>>>>>>> Join OASIS: https://www.oasis-open.org/join/
> > > > >>>>>>>>
> > > >
> > >
> > >
> >
>

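The two-step ADI discovery outlined in the quoted discussion (a cap holding
the total number of existing ADIs and the max dev id, then range queries
for the details) could look roughly like the sketch below. Everything named
here is hypothetical: the field names, the FakeDevice stand-in for the
device side, and the batch size are assumptions for illustration only, and
no such command is defined in the virtio spec at this point.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class AdiInfo:
    dev_id: int
    device_type: str  # e.g. "net" or "blk" (hypothetical encoding)
    status: str


@dataclass
class AdiCap:
    total: int       # number of existing ADIs
    max_dev_id: int  # highest dev id currently in use, -1 if none


class FakeDevice:
    """Stand-in for the device side; keeps ADIs keyed by dev id."""

    def __init__(self, adis: List[AdiInfo]) -> None:
        self._adis: Dict[int, AdiInfo] = {a.dev_id: a for a in adis}

    def read_cap(self) -> AdiCap:
        return AdiCap(total=len(self._adis),
                      max_dev_id=max(self._adis, default=-1))

    def query_range(self, first: int, last: int) -> List[AdiInfo]:
        """Return the ADIs with first <= dev_id <= last, in id order."""
        return [self._adis[i] for i in range(first, last + 1)
                if i in self._adis]


def enumerate_adis(dev: FakeDevice, batch: int = 64) -> Iterator[AdiInfo]:
    """Step 1: read the cap. Step 2: walk the id space in bounded ranges."""
    cap = dev.read_cap()
    first = 0
    while first <= cap.max_dev_id:
        last = min(first + batch - 1, cap.max_dev_id)
        yield from dev.query_range(first, last)
        first = last + 1


dev = FakeDevice([AdiInfo(0, "net", "ok"),
                  AdiInfo(3, "blk", "ok"),
                  AdiInfo(130, "net", "ok")])
found_ids = [adi.dev_id for adi in enumerate_adis(dev, batch=64)]
```

Querying in bounded ranges keeps each reply small even with thousands of
ADIs, which is the output-flood concern raised in the quoted text.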


Thread overview: 49+ messages
2023-06-20  6:44 [virtio-comment] About the plan of Admin Queue Xuan Zhuo
2023-06-20  8:11 ` Zhu, Lingshan
2023-06-30  5:54   ` Xuan Zhuo
2023-06-30  6:19     ` Zhu, Lingshan
2023-06-30  7:46       ` Xuan Zhuo
2023-06-30  7:54         ` Zhu, Lingshan
2023-06-30  7:56           ` Xuan Zhuo
2023-06-30  8:32             ` Zhu, Lingshan
2023-06-30  9:07               ` Zhu, Lingshan
2023-06-30  9:14               ` Xuan Zhuo
2023-06-30 10:33                 ` Zhu, Lingshan
2023-06-30 11:35                   ` Parav Pandit
2023-07-03  4:29                     ` Zhu, Lingshan
2023-07-03  5:54                       ` Xuan Zhuo
2023-07-03  8:01                         ` Zhu, Lingshan
2023-07-03  8:21                           ` Xuan Zhuo
2023-07-03  8:23                             ` Zhu, Lingshan
2023-07-27  2:30                       ` Xuan Zhuo
2023-07-27  3:56                         ` Zhu, Lingshan
2023-07-27  6:09                           ` Xuan Zhuo
2023-07-27  6:17                             ` Zhu, Lingshan
2023-07-27  6:20                               ` Xuan Zhuo
2023-07-27  8:03                                 ` Jason Wang
2023-07-27  8:07                                   ` Xuan Zhuo
2023-07-27  8:28                                     ` Jason Wang
2023-07-27  8:30                                       ` Xuan Zhuo
2023-07-27  8:56                                         ` Jason Wang
2023-07-27  9:01                                           ` Xuan Zhuo
     [not found]                                       ` <aafe1885-0ec2-66ca-4511-f2606bc881ee@gmail.com>
2023-08-02  6:13                                         ` Xuan Zhuo
2023-08-02  6:15                                         ` Jason Wang
2023-08-02  6:34                                           ` Xuan Zhuo
2023-08-02  6:53                                             ` Jason Wang
2023-08-02  6:55                                               ` Xuan Zhuo [this message]
2023-07-28  6:09                                   ` Xuan Zhuo
2023-07-31  1:20                                     ` Jason Wang
2023-07-31  2:02                                       ` Parav Pandit
2023-07-03  8:10     ` Jason Wang
2023-07-03  8:20       ` Xuan Zhuo
2023-07-03 13:05         ` Michael S. Tsirkin
2023-07-03 13:06           ` Parav Pandit
2023-07-03 20:38           ` Parav Pandit
2023-07-04  3:48             ` Zhu, Lingshan
2023-07-04 12:11           ` Xuan Zhuo
2023-07-04 12:14           ` Xuan Zhuo
2023-07-04 13:15             ` Parav Pandit
2023-07-05  4:30               ` Xuan Zhuo
2023-07-05  4:35                 ` Parav Pandit
2023-07-05  4:36                   ` Xuan Zhuo
2023-07-05  4:38               ` Xuan Zhuo
