KVM Archive mirror
 help / color / mirror / Atom feed
From: Yi Liu <yi.l.liu@intel.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
	Alex Williamson <alex.williamson@redhat.com>
Cc: "jgg@nvidia.com" <jgg@nvidia.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
	"Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
	"Pan, Jacob jun" <jacob.jun.pan@intel.com>
Subject: Re: [PATCH v2 4/4] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl
Date: Thu, 18 Apr 2024 16:23:06 +0800	[thread overview]
Message-ID: <53e06b21-a2ba-443a-b8a4-87c4826b0798@intel.com> (raw)
In-Reply-To: <BN9PR11MB527690902C6D7D479C16DB3C8C0E2@BN9PR11MB5276.namprd11.prod.outlook.com>

On 2024/4/18 08:21, Tian, Kevin wrote:
>> From: Alex Williamson <alex.williamson@redhat.com>
>> Sent: Thursday, April 18, 2024 4:26 AM
>>
>> On Wed, 17 Apr 2024 07:09:52 +0000
>> "Tian, Kevin" <kevin.tian@intel.com> wrote:
>>
>>>> From: Alex Williamson <alex.williamson@redhat.com>
>>>> Sent: Wednesday, April 17, 2024 1:57 AM
>>>>
>>>> On Fri, 12 Apr 2024 01:21:21 -0700
>>>> Yi Liu <yi.l.liu@intel.com> wrote:
>>>>
>>>>> + */
>>>>> +struct vfio_device_feature_pasid {
>>>>> +	__u16 capabilities;
>>>>> +#define VFIO_DEVICE_PASID_CAP_EXEC	(1 << 0)
>>>>> +#define VFIO_DEVICE_PASID_CAP_PRIV	(1 << 1)
>>>>> +	__u8 width;
>>>>> +	__u8 __reserved;
>>>>> +};
>>>>
>>>> Building on Kevin's comment on the cover letter, if we could describe
>>>> an offset for emulating a PASID capability, this seems like the place
>>>> we'd do it.  I think we're not doing that because we'd like an in-band
>>>> mechanism for a device to report unused config space, such as a DVSEC
>>>> capability, so that it can be implemented on a physical device.  As
>>>> noted in the commit log here, we'd also prefer not to bloat the kernel
>>>> with more device quirks.
>>>>
>>>> In an ideal world we might be able to jump start support of that DVSEC
>>>> option by emulating the DVSEC capability on top of the PASID capability
>>>> for PFs, but unfortunately the PASID capability is 8 bytes while the
>>>> DVSEC capability is at least 12 bytes, so we can't implement that
>>>> generically either.
>>>
>>> Yeah, that's a problem.
>>>
>>>>
>>>> I don't know there's any good solution here or whether there's actually
>>>> any value to the PASID capability on a PF, but do we need to consider
>>>> leaving a field+flag here to describe the offset for that scenario?
>>>
>>> Yes, I prefer to this way.
>>>
>>>> Would we then allow variant drivers to take advantage of it?  Does this
>>>> then turn into the quirk that we're trying to avoid in the kernel
>>>> rather than userspace and is that a problem?  Thanks,
>>>>
>>>
>>> We don't want to proactively pursue quirks in the kernel.
>>>
>>> But if a variant driver exists for other reasons, I don't see why it
>>> should be prohibited from deciding an offset to ease the
>>> userspace. 😊
>>
>> At that point we've turned the corner into an arbitrary policy decision
>> that I can't defend.  A "worthy" variant driver can implement something
>> through a side channel vfio API, but implementing that side channel
>> itself is not enough to justify a variant driver?  It doesn't make
>> sense.
>>
>> Further, if we have a variant driver, why do we need a side channel for
>> the purpose of describing available config space when we expect devices
>> themselves to eventually describe the same through a DVSEC capability?
>> The purpose of enabling variant drivers is to enhance the functionality
>> of the device.  Adding an emulated DVSEC capability seems like a valid
>> enhancement to justify a variant driver to me.
>>
>> So the more I think about it, it would be easy to add something here
>> that hints a location for an emulated PASID capability in the VMM, but
>> it would also be counterproductive to an end goal of having a DVSEC
>> capability that describes unused config space.  The very narrow scope
>> where that side-band channel would be useful is an unknown PF device
>> which doesn't implement a DVSEC capability and without intervention
>> simply behaves as it always has, without PASID support.
>>
>> A vendor desiring such support can a) implement DVSEC in the hardware,
>> b) implement a variant driver emulating a DVSEC capability, or c)
>> directly modify the VMM to tell it where to place the PASID capability.
>> I also don't think we should exclude the possibility that b) could turn
>> into a shared variant driver that knows about multiple devices and has
>> a table of free config space for each.  Option c) is only the last
>> resort if there's not already 12 bytes of contiguous, aligned free
>> space to place a DVSEC capability.  That seems unlikely.
> 
> or b) could be a table in vfio_pci_config.c i.e. kind of making vfio-pci
> as the shared variant driver.
> 
>>
>> At some point we need to define the format and use of this DVSEC.  Do
>> we allow (not require) one at every gap in config space that's at least
>> 12-bytes long and adjust the DVSEC Length to describe longer gaps, or do
> 
> Does PCI spec allows multiple same-type capabilities co-existing?

For DVSEC, it allows. Below is the sentence from PCIe spec.

A single PCI Express Function or RCRB is permitted to contain multiple 
DVSEC Capability structures


>> we use a single DVSEC to describe a table of ranges throughout extended
>> (maybe even conventional) config space?  The former seems easier,
> 
> this might be challenging as the table itself requires a contiguous
> large free block.
> 
>> especially if we expect a device has a large block of free space,
>> enough for multiple emulated capabilities and described by a single
>> DVSEC.  Thanks,
>>

this is a good point. The ATS and PRI capability do not exist in VF as
well. They need to be emulated.

> 
> yes that sounds simpler.

-- 
Regards,
Yi Liu

  reply	other threads:[~2024-04-18  8:19 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-12  8:21 [PATCH v2 0/4] vfio-pci support pasid attach/detach Yi Liu
2024-04-12  8:21 ` [PATCH v2 1/4] ida: Add ida_get_lowest() Yi Liu
2024-04-16 16:03   ` Alex Williamson
2024-04-18  7:02     ` Yi Liu
2024-04-18 16:23       ` Alex Williamson
2024-04-18 17:12         ` Jason Gunthorpe
2024-04-19 13:43           ` Yi Liu
2024-04-19 13:55             ` Alex Williamson
2024-04-19 14:00               ` Jason Gunthorpe
2024-04-23  7:19                 ` Yi Liu
2024-04-19 13:40         ` Yi Liu
2024-04-12  8:21 ` [PATCH v2 2/4] vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices Yi Liu
2024-04-16  9:01   ` Tian, Kevin
2024-04-16  9:24     ` Yi Liu
2024-04-16  9:47       ` Tian, Kevin
2024-04-18  7:04         ` Yi Liu
2024-04-23 12:43   ` Jason Gunthorpe
2024-04-24  0:33     ` Tian, Kevin
2024-04-24  4:48     ` Yi Liu
2024-04-12  8:21 ` [PATCH v2 3/4] vfio: Add VFIO_DEVICE_PASID_[AT|DE]TACH_IOMMUFD_PT Yi Liu
2024-04-16  9:13   ` Tian, Kevin
2024-04-16  9:36     ` Yi Liu
2024-04-23 12:45   ` Jason Gunthorpe
2024-04-12  8:21 ` [PATCH v2 4/4] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl Yi Liu
2024-04-16  9:40   ` Tian, Kevin
2024-04-16 17:57   ` Alex Williamson
2024-04-17  7:09     ` Tian, Kevin
2024-04-17 20:25       ` Alex Williamson
2024-04-18  0:21         ` Tian, Kevin
2024-04-18  8:23           ` Yi Liu [this message]
2024-04-18 16:34           ` Alex Williamson
2024-04-23 12:39   ` Jason Gunthorpe
2024-04-24  0:24     ` Tian, Kevin
2024-04-24 13:59       ` Jason Gunthorpe
2024-04-16  8:38 ` [PATCH v2 0/4] vfio-pci support pasid attach/detach Tian, Kevin
2024-04-16 17:50   ` Jason Gunthorpe
2024-04-17  7:16     ` Tian, Kevin
2024-04-17 12:20       ` Jason Gunthorpe
2024-04-17 23:02         ` Alex Williamson
2024-04-18  0:06           ` Tian, Kevin
2024-04-18  9:03             ` Yi Liu
2024-04-18 20:37               ` Alex Williamson
2024-04-19  5:52                 ` Tian, Kevin
2024-04-19 16:35                   ` Alex Williamson
2024-04-23  7:43                     ` Tian, Kevin
2024-04-23 12:01                       ` Jason Gunthorpe
2024-04-23 23:47                         ` Tian, Kevin
2024-04-24  0:12                           ` Jason Gunthorpe
2024-04-24  2:57                             ` Tian, Kevin
2024-04-24 12:29                               ` Baolu Lu
2024-04-24 14:04                               ` Jason Gunthorpe
2024-04-24  5:19                             ` Tian, Kevin
2024-04-24 14:15                               ` Jason Gunthorpe
2024-04-24 18:38                                 ` Alex Williamson
2024-04-24 18:45                                   ` Jason Gunthorpe
2024-04-24 18:24                             ` Alex Williamson
2024-04-24 18:36                               ` Jason Gunthorpe
2024-04-24 20:13                                 ` Alex Williamson
2024-04-26 14:11                                   ` Jason Gunthorpe
2024-04-26 20:13                                     ` Alex Williamson
2024-04-28  6:19                                       ` Tian, Kevin
2024-04-29  7:43                                         ` Yi Liu
2024-04-29 17:15                                         ` Jason Gunthorpe
2024-04-29 17:44                                       ` Jason Gunthorpe
2024-04-27  5:05                                     ` Christoph Hellwig
2024-04-25  9:26                               ` Yi Liu
2024-04-25 12:58                                 ` Alex Williamson
2024-04-26  9:01                                   ` Yi Liu
2024-04-19 13:59                 ` Jason Gunthorpe
2024-04-23  7:58                   ` Yi Liu
2024-04-23 12:05                     ` Jason Gunthorpe
2024-04-19 13:34           ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53e06b21-a2ba-443a-b8a4-87c4826b0798@intel.com \
    --to=yi.l.liu@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=chao.p.peng@linux.intel.com \
    --cc=eric.auger@redhat.com \
    --cc=iommu@lists.linux.dev \
    --cc=jacob.jun.pan@intel.com \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=nicolinc@nvidia.com \
    --cc=robin.murphy@arm.com \
    --cc=zhenzhong.duan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).