From: David Gibson <david@gibson.dropbear.id.au>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Jason Wang <jasowang@redhat.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	"Jiang, Dave" <dave.jiang@intel.com>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Jason Gunthorpe <jgg@nvidia.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"parav@mellanox.com" <parav@mellanox.com>,
	"Enrico Weigelt, metux IT consult" <lkml@metux.net>,
	Robin Murphy <robin.murphy@arm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Shenming Lu <lushenming@huawei.com>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	David Woodhouse <dwmw2@infradead.org>
Subject: Re: Plan for /dev/ioasid RFC v2
Date: Thu, 24 Jun 2021 14:23:36 +1000
Message-ID: <YNQIyP4RR0PmVtLo@yekko>
In-Reply-To: <20210617151452.08beadae.alex.williamson@redhat.com>


On Thu, Jun 17, 2021 at 03:14:52PM -0600, Alex Williamson wrote:
> On Thu, 17 Jun 2021 07:31:03 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
> > > From: Alex Williamson <alex.williamson@redhat.com>
> > > Sent: Thursday, June 17, 2021 3:40 AM
> > > On Wed, 16 Jun 2021 06:43:23 +0000
> > > "Tian, Kevin" <kevin.tian@intel.com> wrote:
> > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > Sent: Wednesday, June 16, 2021 12:12 AM
> > > > > On Tue, 15 Jun 2021 02:31:39 +0000
> > > > > "Tian, Kevin" <kevin.tian@intel.com> wrote:
> > > > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > > > Sent: Tuesday, June 15, 2021 12:28 AM
[snip]

> > > > > 3) A dual-function conventional PCI e1000 NIC where the functions are
> > > > >    grouped together due to shared RID.
> > > > >
> > > > > >    a) Repeat 2.a) and 2.b) such that we have valid, user-accessible
> > > > > >       devices in the same IOMMU context.
> > > > >
> > > > >    b) Function 1 is detached from the IOASID.
> > > > >
> > > > >    I think function 1 cannot be placed into a different IOMMU context
> > > > >    here, does the detach work?  What's the IOMMU context now?  
> > > >
> > > > Yes. Function 1 is back to block-DMA. Since both functions share RID,
> > > > essentially it implies function 0 is in block-DMA state too (though its
> > > > tracking state may not change yet) since the shared IOMMU context
> > > > > entry blocks DMA now. In IOMMU fd, function 0 is still attached to the
> > > > > IOASID, so the user still needs to do an explicit detach to clear the
> > > > > tracking state for function 0.
> > > >  
> > > > >
> > > > >    c) A new IOASID is alloc'd within the existing iommu_fd and function
> > > > >       1 is attached to the new IOASID.
> > > > >
> > > > >    Where, how, by whom does this fail?  
> > > >
> > > > No need to fail. It can succeed since doing so just hurts user's own foot.
> > > >
> > > > > The only question is how the user learns that a group of devices
> > > > > shares a RID and so avoids doing this. I'm curious how it is communicated
> > > > with today's VFIO mechanism. Yes the group-centric VFIO uAPI prevents
> > > > a group of devices from attaching to multiple IOMMU contexts, but
> > > > suppose we still need a way to tell the user to not do so. Especially
> > > > such knowledge would be also reflected in the virtual PCI topology
> > > > when the entire group is assigned to the guest which needs to know
> > > > this fact when vIOMMU is exposed. I haven't found time to investigate
> > > > it but suppose if such channel exists it could be reused, or in the worst
> > > > case we may have the new device capability interface to convey...  
> > > 
> > > No such channel currently exists, it's not an issue today, IOMMU
> > > context is group-based.  
> > 
> > Interesting... If such a group of devices is assigned to a guest, how does
> > QEMU decide the virtual PCI topology for them? Do they have the same
> > vRID or different ones?
> 
> That's the beauty of it, it doesn't matter how many RIDs exist in the
> group, or which devices have aliases, the group is the minimum
> granularity of a container where QEMU knows that a container provides
> a single address space.  Therefore a container must exist in a single
> address space in the PCI topology.  In a conventional or non-vIOMMU
> topology, the PCI address space is equivalent to the system memory
> address space.  When vIOMMU gets involved, multiple devices within the
> same group must exist in the same address space.  A vPCIe-to-PCI bridge
> can be used to create that shared address space.
> 
> I've referred to this as a limitation of type1, that we can't put
> devices within the same group into different address spaces, such as
> behind separate vRoot-Ports in a vIOMMU config, but really, who cares?
> As isolation support improves we see fewer multi-device groups, this
> scenario becomes the exception.  Buy better hardware to use the devices
> independently.

Also, that limitation is fundamental.  Groups in a guest must always
be the same as, or strictly bigger than, groups in the host, because if
the real hardware can't isolate them, then the virtual hardware certainly
can't, and the guest kernel shouldn't be given the impression that it
can separate them.
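
To make that constraint concrete, here is a minimal sketch of the check
I have in mind.  It is only an illustration, not code from the kernel,
VFIO or QEMU: the function name, the device labels and the group ids are
all hypothetical.  It takes two mappings from device to group id (one
for the host, one for the guest) and verifies that no host group is
split across more than one guest group.

```python
from typing import Dict, Hashable


def guest_groups_respect_host(host_group: Dict[str, Hashable],
                              guest_group: Dict[str, Hashable]) -> bool:
    """Return True if every host group lands inside a single guest group.

    host_group and guest_group map a (hypothetical) device label to the
    id of the group containing it on the host and in the guest.
    Guest groups may merge host groups, but must never split one.
    """
    # For each host group, remember which guest group we first saw it in.
    host_to_guest: Dict[Hashable, Hashable] = {}
    for dev, ggrp in guest_group.items():
        hgrp = host_group[dev]
        if host_to_guest.setdefault(hgrp, ggrp) != ggrp:
            # Two devices the host cannot isolate were placed in
            # different guest groups: the guest would be misled.
            return False
    return True


# Hypothetical example: a dual-function NIC sharing one host group.
host = {"e1000.0": 7, "e1000.1": 7, "nvme": 9}
same = {"e1000.0": 1, "e1000.1": 1, "nvme": 2}    # same grouping
merged = {"e1000.0": 1, "e1000.1": 1, "nvme": 1}  # bigger guest group
split = {"e1000.0": 1, "e1000.1": 2, "nvme": 3}   # host group split
```

With these inputs the first two guest layouts pass and the third is
rejected, which is the "same or bigger, never smaller" rule above.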

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
