IOMMU Archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	"Liu, Yi L" <yi.l.liu@intel.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"robin.murphy@arm.com" <robin.murphy@arm.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	"baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
	"Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
	"Pan, Jacob jun" <jacob.jun.pan@intel.com>,
	"Cédric Le Goater" <clg@redhat.com>
Subject: Re: [PATCH v2 0/4] vfio-pci support pasid attach/detach
Date: Mon, 29 Apr 2024 14:44:42 -0300	[thread overview]
Message-ID: <20240429174442.GJ941030@nvidia.com> (raw)
In-Reply-To: <20240426141354.1f003b5f.alex.williamson@redhat.com>

On Fri, Apr 26, 2024 at 02:13:54PM -0600, Alex Williamson wrote:
> On Fri, 26 Apr 2024 11:11:17 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > On Wed, Apr 24, 2024 at 02:13:49PM -0600, Alex Williamson wrote:
> > 
> > > This is kind of an absurd example to portray as a ubiquitous problem.
> > > Typically the config space layout is a reflection of hardware whether
> > > the device supports migration or not.  
> > 
> > Er, all our HW has FW constructed config space. It changes with FW
> > upgrades. We change it during the life of the product. This has to be
> > considered..
> 
> So as I understand it, the concern is that you have firmware that
> supports migration, but it also openly hostile to the fundamental
> aspects of exposing a stable device ABI in support of migration.

Well, that makes it sound rude, but yes that is part of it.

mlx5 is tremendously FW defined. The FW can only cope with migration
in some limited cases today. Making that compatability bigger is
ongoing work.

Config space is one of the areas that has not been addressed.
Currently things are such that the FW won't support migration in
combinations that have different physical config space - so it is not
a problem.

But, in principle, it is an issue. AFAIK, the only complete solution
is for the hypervisor to fully synthesize a stable config space.

So, if we keep this in the kernel then I'd imagine the kernel will
need to grow some shared infrastructure to fully synthezise the config
space - not text file based, but basically the same as what I
described for the VMM.

> > But that won't be true if the kernel is making decisions. The config
> > space layout depends now on the kernel driver version too.
> 
> But in the cases where we support migration there's a device specific
> variant driver that supports that migration.  It's the job of that
> variant driver to not only export and import the device state, but also
> to provide a consistent ABI to the user, which includes the config
> space layout. 

Yes, we could do that, but I'm not sure how it will work in all cases.

> I don't understand why we'd say the device programming ABI itself
> falls within the purview of the device/variant driver, but PCI
> config space is defined by device specific code at a higher level.

The "device programming ABI" doesn't contain any *policy*. The layout
of the config space is like 50% policy. Especially when we start to
talk about standards defined migration. The standards will set the
"device programming ABI" and maybe even specify the migration
stream. They will not, and arguably can not, specify the config space.

Config space layout is substantially policy of the instance type. Even
little things like the vendor IDs can be meaningfully replaced in VMs.

> Regarding "if we accept that text file configuration should be
> something the VMM supports", I'm not on board with this yet, so
> applying it to PASID discussion seems premature.

Sure, I'm just explaining a way this could all fit together.

> We've developed variant drivers specifically to host the device specific
> aspects of migration support.  The requirement of a consistent config
> space layout is a problem that only exists relative to migration.  

Well, I wouldn't go quite so far. Arguably even non-migritable
instance types may want to adjust thier config space. Eg if I'm using
a DPU and I get a NVMe/Virtio PCI function I may want to scrub out
details from the config space to make it more general. Even without
migration.

This already happens today in places like VDPA which completely
replace the underlying config space in some cases.

I see it as a difference from a world of highly constrained "instance
types" and a more ad hoc world.

> is an issue that I would have considered the responsibility of the
> variant driver, which would likely expect a consistent interface from
> the hardware/firmware.  Why does hostile firmware suddenly make it the
> VMM's problem to provide a consistent ABI to the config space of the
> device rather than the variant driver?

It is not "hostile firmware"! It accepting that a significant aspect
of the config layout is actually policy.

Plus the standards limitations that mean we can't change the config
space on the fly make it pretty much impossible for the device to
acutally do anything to help here. Software must fix the config space.

> Obviously config maps are something that a VMM could do, but it also
> seems to impose a non-trivial burden that every VMM requires an
> implementation of a config space map and integration for each device
> rather than simply expecting the exposed config space of the device to
> be part of the migration ABI.  

Well, the flip is true to, it is alot of burden on every variant
device driver implement and on the kernel in general to manage config
space policy on behalf of the VMM.

My point is if the VMM is already going to be forced to manage config
space policy for other good reasons, are we sure we want to put a
bunch of stuff in the kernel that sometimes won't be used?

> Also this solution specifically only addresses config space
> compatibility without considering the more generic issue that a
> variant driver can expose different device personas.  A versioned
> persona and config space virtualization in the variant driver is a
> much more flexible solution.

It is addressed, the different personas would have their own text file
maps. The target VMM would have to load the right map. Shared common
code across all the variant drivers.

Jason

  parent reply	other threads:[~2024-04-29 17:44 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-12  8:21 [PATCH v2 0/4] vfio-pci support pasid attach/detach Yi Liu
2024-04-12  8:21 ` [PATCH v2 1/4] ida: Add ida_get_lowest() Yi Liu
2024-04-16 16:03   ` Alex Williamson
2024-04-18  7:02     ` Yi Liu
2024-04-18 16:23       ` Alex Williamson
2024-04-18 17:12         ` Jason Gunthorpe
2024-04-19 13:43           ` Yi Liu
2024-04-19 13:55             ` Alex Williamson
2024-04-19 14:00               ` Jason Gunthorpe
2024-04-23  7:19                 ` Yi Liu
2024-04-19 13:40         ` Yi Liu
2024-04-12  8:21 ` [PATCH v2 2/4] vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices Yi Liu
2024-04-16  9:01   ` Tian, Kevin
2024-04-16  9:24     ` Yi Liu
2024-04-16  9:47       ` Tian, Kevin
2024-04-18  7:04         ` Yi Liu
2024-04-23 12:43   ` Jason Gunthorpe
2024-04-24  0:33     ` Tian, Kevin
2024-04-24  4:48     ` Yi Liu
2024-04-12  8:21 ` [PATCH v2 3/4] vfio: Add VFIO_DEVICE_PASID_[AT|DE]TACH_IOMMUFD_PT Yi Liu
2024-04-16  9:13   ` Tian, Kevin
2024-04-16  9:36     ` Yi Liu
2024-04-23 12:45   ` Jason Gunthorpe
2024-04-12  8:21 ` [PATCH v2 4/4] vfio: Report PASID capability via VFIO_DEVICE_FEATURE ioctl Yi Liu
2024-04-16  9:40   ` Tian, Kevin
2024-04-16 17:57   ` Alex Williamson
2024-04-17  7:09     ` Tian, Kevin
2024-04-17 20:25       ` Alex Williamson
2024-04-18  0:21         ` Tian, Kevin
2024-04-18  8:23           ` Yi Liu
2024-04-18 16:34           ` Alex Williamson
2024-04-23 12:39   ` Jason Gunthorpe
2024-04-24  0:24     ` Tian, Kevin
2024-04-24 13:59       ` Jason Gunthorpe
2024-04-16  8:38 ` [PATCH v2 0/4] vfio-pci support pasid attach/detach Tian, Kevin
2024-04-16 17:50   ` Jason Gunthorpe
2024-04-17  7:16     ` Tian, Kevin
2024-04-17 12:20       ` Jason Gunthorpe
2024-04-17 23:02         ` Alex Williamson
2024-04-18  0:06           ` Tian, Kevin
2024-04-18  9:03             ` Yi Liu
2024-04-18 20:37               ` Alex Williamson
2024-04-19  5:52                 ` Tian, Kevin
2024-04-19 16:35                   ` Alex Williamson
2024-04-23  7:43                     ` Tian, Kevin
2024-04-23 12:01                       ` Jason Gunthorpe
2024-04-23 23:47                         ` Tian, Kevin
2024-04-24  0:12                           ` Jason Gunthorpe
2024-04-24  2:57                             ` Tian, Kevin
2024-04-24 12:29                               ` Baolu Lu
2024-04-24 14:04                               ` Jason Gunthorpe
2024-04-24  5:19                             ` Tian, Kevin
2024-04-24 14:15                               ` Jason Gunthorpe
2024-04-24 18:38                                 ` Alex Williamson
2024-04-24 18:45                                   ` Jason Gunthorpe
2024-04-24 18:24                             ` Alex Williamson
2024-04-24 18:36                               ` Jason Gunthorpe
2024-04-24 20:13                                 ` Alex Williamson
2024-04-26 14:11                                   ` Jason Gunthorpe
2024-04-26 20:13                                     ` Alex Williamson
2024-04-28  6:19                                       ` Tian, Kevin
2024-04-29  7:43                                         ` Yi Liu
2024-04-29 17:15                                         ` Jason Gunthorpe
2024-04-29 17:44                                       ` Jason Gunthorpe [this message]
2024-04-27  5:05                                     ` Christoph Hellwig
2024-04-25  9:26                               ` Yi Liu
2024-04-25 12:58                                 ` Alex Williamson
2024-04-26  9:01                                   ` Yi Liu
2024-04-19 13:59                 ` Jason Gunthorpe
2024-04-23  7:58                   ` Yi Liu
2024-04-23 12:05                     ` Jason Gunthorpe
2024-04-19 13:34           ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240429174442.GJ941030@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=chao.p.peng@linux.intel.com \
    --cc=clg@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=iommu@lists.linux.dev \
    --cc=jacob.jun.pan@intel.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=nicolinc@nvidia.com \
    --cc=robin.murphy@arm.com \
    --cc=yi.l.liu@intel.com \
    --cc=zhenzhong.duan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).