From: Yi Liu <yi.l.liu@intel.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
Alex Williamson <alex.williamson@redhat.com>
Cc: "jgg@nvidia.com" <jgg@nvidia.com>,
"joro@8bytes.org" <joro@8bytes.org>,
"robin.murphy@arm.com" <robin.murphy@arm.com>,
"eric.auger@redhat.com" <eric.auger@redhat.com>,
"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
"Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
"Pan, Jacob jun" <jacob.jun.pan@intel.com>
Subject: Re: [PATCH v2 4/4] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl
Date: Thu, 18 Apr 2024 16:23:06 +0800
Message-ID: <53e06b21-a2ba-443a-b8a4-87c4826b0798@intel.com>
In-Reply-To: <BN9PR11MB527690902C6D7D479C16DB3C8C0E2@BN9PR11MB5276.namprd11.prod.outlook.com>

On 2024/4/18 08:21, Tian, Kevin wrote:
>> From: Alex Williamson <alex.williamson@redhat.com>
>> Sent: Thursday, April 18, 2024 4:26 AM
>>
>> On Wed, 17 Apr 2024 07:09:52 +0000
>> "Tian, Kevin" <kevin.tian@intel.com> wrote:
>>
>>>> From: Alex Williamson <alex.williamson@redhat.com>
>>>> Sent: Wednesday, April 17, 2024 1:57 AM
>>>>
>>>> On Fri, 12 Apr 2024 01:21:21 -0700
>>>> Yi Liu <yi.l.liu@intel.com> wrote:
>>>>
>>>>> + */
>>>>> +struct vfio_device_feature_pasid {
>>>>> + __u16 capabilities;
>>>>> +#define VFIO_DEVICE_PASID_CAP_EXEC (1 << 0)
>>>>> +#define VFIO_DEVICE_PASID_CAP_PRIV (1 << 1)
>>>>> + __u8 width;
>>>>> + __u8 __reserved;
>>>>> +};
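
For readers following the struct above, a minimal sketch of how userspace might decode it. This is only an illustration: the struct is mirrored locally with <stdint.h> types instead of being taken from a kernel header, the helper names are made up here, and it assumes `width` follows the PCIe "Max PASID Width" convention (2^width PASIDs), which the quoted hunk does not itself state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the struct proposed in the patch (illustration only). */
struct vfio_device_feature_pasid {
	uint16_t capabilities;
#define VFIO_DEVICE_PASID_CAP_EXEC	(1 << 0)
#define VFIO_DEVICE_PASID_CAP_PRIV	(1 << 1)
	uint8_t  width;
	uint8_t  __reserved;
};

/* Hypothetical helper: does the device report Execute Permission support? */
static bool pasid_supports_exec(const struct vfio_device_feature_pasid *f)
{
	return f->capabilities & VFIO_DEVICE_PASID_CAP_EXEC;
}

/* Hypothetical helper: does the device report Privileged Mode support? */
static bool pasid_supports_priv(const struct vfio_device_feature_pasid *f)
{
	return f->capabilities & VFIO_DEVICE_PASID_CAP_PRIV;
}

/*
 * Hypothetical helper: number of PASIDs, ASSUMING width means
 * "Max PASID Width". PCIe PASIDs are at most 20 bits wide, so a larger
 * value is treated as invalid and reported as 0.
 */
static uint32_t pasid_max_count(const struct vfio_device_feature_pasid *f)
{
	return f->width > 20 ? 0 : (uint32_t)1 << f->width;
}
```

A VMM would typically use such helpers to fill in the fields of the PASID capability it emulates for the guest.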
>>>>
>>>> Building on Kevin's comment on the cover letter, if we could describe
>>>> an offset for emulating a PASID capability, this seems like the place
>>>> we'd do it. I think we're not doing that because we'd like an in-band
>>>> mechanism for a device to report unused config space, such as a DVSEC
>>>> capability, so that it can be implemented on a physical device. As
>>>> noted in the commit log here, we'd also prefer not to bloat the kernel
>>>> with more device quirks.
>>>>
>>>> In an ideal world we might be able to jump start support of that DVSEC
>>>> option by emulating the DVSEC capability on top of the PASID capability
>>>> for PFs, but unfortunately the PASID capability is 8 bytes while the
>>>> DVSEC capability is at least 12 bytes, so we can't implement that
>>>> generically either.
>>>
>>> Yeah, that's a problem.
>>>
>>>>
>>>> I don't know there's any good solution here or whether there's actually
>>>> any value to the PASID capability on a PF, but do we need to consider
>>>> leaving a field+flag here to describe the offset for that scenario?
>>>
>>> Yes, I prefer this way.
>>>
>>>> Would we then allow variant drivers to take advantage of it? Does this
>>>> then turn into the quirk that we're trying to avoid in the kernel
>>>> rather than userspace and is that a problem? Thanks,
>>>>
>>>
>>> We don't want to proactively pursue quirks in the kernel.
>>>
>>> But if a variant driver exists for other reasons, I don't see why it
>>> should be prohibited from deciding an offset to ease the
>>> userspace. 😊
>>
>> At that point we've turned the corner into an arbitrary policy decision
>> that I can't defend. A "worthy" variant driver can implement something
>> through a side channel vfio API, but implementing that side channel
>> itself is not enough to justify a variant driver? It doesn't make
>> sense.
>>
>> Further, if we have a variant driver, why do we need a side channel for
>> the purpose of describing available config space when we expect devices
>> themselves to eventually describe the same through a DVSEC capability?
>> The purpose of enabling variant drivers is to enhance the functionality
>> of the device. Adding an emulated DVSEC capability seems like a valid
>> enhancement to justify a variant driver to me.
>>
>> So the more I think about it, it would be easy to add something here
>> that hints a location for an emulated PASID capability in the VMM, but
>> it would also be counterproductive to an end goal of having a DVSEC
>> capability that describes unused config space. The very narrow scope
>> where that side-band channel would be useful is an unknown PF device
>> which doesn't implement a DVSEC capability and without intervention
>> simply behaves as it always has, without PASID support.
>>
>> A vendor desiring such support can a) implement DVSEC in the hardware,
>> b) implement a variant driver emulating a DVSEC capability, or c)
>> directly modify the VMM to tell it where to place the PASID capability.
>> I also don't think we should exclude the possibility that b) could turn
>> into a shared variant driver that knows about multiple devices and has
>> a table of free config space for each. Option c) is only the last
>> resort if there's not already 12 bytes of contiguous, aligned free
>> space to place a DVSEC capability. That seems unlikely.
>
> or b) could be a table in vfio_pci_config.c, i.e. effectively making
> vfio-pci the shared variant driver.
>
>>
>> At some point we need to define the format and use of this DVSEC. Do
>> we allow (not require) one at every gap in config space that's at least
>> 12-bytes long and adjust the DVSEC Length to describe longer gaps, or do
>
> Does the PCI spec allow multiple capabilities of the same type to co-exist?

For DVSEC, yes. The PCIe spec says:

  "A single PCI Express Function or RCRB is permitted to contain multiple
  DVSEC Capability structures."

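
Since multiple DVSECs may co-exist, whoever consumes them has to walk the whole extended capability list rather than stop at the first match. A rough sketch of that walk over a snapshot of config space follows. The helper names are made up for illustration; the header layout (ID in bits 15:0, next pointer in bits 31:20, list starting at 0x100) and the DVSEC ID 0x23 are the standard PCIe ones.

```c
#include <stdint.h>

#define PCI_CFG_SPACE_EXP_SIZE	4096
#define PCI_EXT_CAP_START	0x100
#define PCI_EXT_CAP_ID_DVSEC	0x23

/* Read a little-endian dword from a config space snapshot. */
static uint32_t cfg_read32(const uint8_t *cfg, unsigned int off)
{
	return (uint32_t)cfg[off] | (uint32_t)cfg[off + 1] << 8 |
	       (uint32_t)cfg[off + 2] << 16 | (uint32_t)cfg[off + 3] << 24;
}

/* Write a little-endian dword (only used to build test images). */
static void cfg_write32(uint8_t *cfg, unsigned int off, uint32_t val)
{
	cfg[off] = val & 0xff;
	cfg[off + 1] = (val >> 8) & 0xff;
	cfg[off + 2] = (val >> 16) & 0xff;
	cfg[off + 3] = (val >> 24) & 0xff;
}

/* Count every DVSEC capability in a 4KB config space snapshot. */
static unsigned int count_dvsec(const uint8_t *cfg)
{
	unsigned int off = PCI_EXT_CAP_START, count = 0, ttl;

	/* Bound the walk so a malformed list cannot loop forever. */
	for (ttl = (PCI_CFG_SPACE_EXP_SIZE - PCI_EXT_CAP_START) / 8;
	     off && ttl; ttl--) {
		uint32_t hdr;

		if (off < PCI_EXT_CAP_START || (off & 3) ||
		    off > PCI_CFG_SPACE_EXP_SIZE - 4)
			break;

		hdr = cfg_read32(cfg, off);
		if (hdr == 0)			/* empty list */
			break;

		if ((hdr & 0xffff) == PCI_EXT_CAP_ID_DVSEC)
			count++;		/* keep going, more may follow */

		off = (hdr >> 20) & 0xffc;	/* next capability offset */
	}
	return count;
}
```

A real consumer would also check the DVSEC vendor ID and DVSEC ID of each match; those values are not defined yet for this use, so they are left out here.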
>> we use a single DVSEC to describe a table of ranges throughout extended
>> (maybe even conventional) config space? The former seems easier,
>
> this might be challenging as the table itself requires a contiguous
> large free block.
>
>> especially if we expect a device has a large block of free space,
>> enough for multiple emulated capabilities and described by a single
>> DVSEC. Thanks,
>>

This is a good point. The ATS and PRI capabilities do not exist on VFs
either, so they need to be emulated too.

>
> yes that sounds simpler.
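
To make the "one DVSEC per gap" option a bit more concrete, here is a rough sketch of how a shared variant driver or a table generator could find candidate gaps in extended config space. It is only an illustration under assumptions: the `cfg_range` type and `find_free_gaps()` are made-up names, the input list of used ranges is assumed to be sorted by start offset, and "free" just means "not covered by a known capability", which a real device may not guarantee.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT_CFG_START	0x100	/* start of extended config space */
#define EXT_CFG_END	0x1000	/* end of 4KB config space */
#define DVSEC_MIN_LEN	12	/* minimum DVSEC size discussed above */

struct cfg_range {
	uint16_t start;
	uint16_t len;
};

/*
 * Report every dword-aligned gap of at least DVSEC_MIN_LEN bytes between
 * the used ranges. 'used' must be sorted by start offset. Returns the
 * number of gaps stored in 'gaps' (at most max_gaps).
 */
static size_t find_free_gaps(const struct cfg_range *used, size_t n_used,
			     struct cfg_range *gaps, size_t max_gaps)
{
	unsigned int cursor = EXT_CFG_START;
	size_t n = 0, i;

	for (i = 0; i <= n_used; i++) {
		/* The last iteration checks the tail up to the end of config space. */
		unsigned int next = i < n_used ? used[i].start : EXT_CFG_END;
		unsigned int gap = (cursor + 3) & ~3u;	/* dword align */

		if (next >= gap + DVSEC_MIN_LEN && n < max_gaps) {
			gaps[n].start = (uint16_t)gap;
			/* This would become the DVSEC Length for the gap. */
			gaps[n].len = (uint16_t)((next - gap) & ~3u);
			n++;
		}

		if (i < n_used && used[i].start + (unsigned int)used[i].len > cursor)
			cursor = used[i].start + used[i].len;
	}
	return n;
}
```

For example, with capabilities at 0x100 (8 bytes), 0x110 (16 bytes) and 0x200 (64 bytes), the 8-byte hole at 0x108 is skipped because it is too small for a DVSEC, and the gaps at 0x120 and 0x240 are reported.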
--
Regards,
Yi Liu