From: Jason Gunthorpe <jgg@nvidia.com>
To: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH v1 2/2] vfio/pci: Allow MMIO regions to be exported through dma-buf
Date: Tue, 7 May 2024 21:31:53 -0300
Message-ID: <20240508003153.GC4650@nvidia.com>
In-Reply-To: <IA0PR11MB718509BB8B56455710DB2033F8182@IA0PR11MB7185.namprd11.prod.outlook.com>

On Thu, May 02, 2024 at 07:50:36AM +0000, Kasireddy, Vivek wrote:
> Hi Jason,
> 
> > 
> > On Tue, Apr 30, 2024 at 04:24:50PM -0600, Alex Williamson wrote:
> > > > +static vm_fault_t vfio_pci_dma_buf_fault(struct vm_fault *vmf)
> > > > +{
> > > > +	struct vm_area_struct *vma = vmf->vma;
> > > > +	struct vfio_pci_dma_buf *priv = vma->vm_private_data;
> > > > +	pgoff_t pgoff = vmf->pgoff;
> > > > +
> > > > +	if (pgoff >= priv->nr_pages)
> > > > +		return VM_FAULT_SIGBUS;
> > > > +
> > > > +	return vmf_insert_pfn(vma, vmf->address,
> > > > +			      page_to_pfn(priv->pages[pgoff]));
> > > > +}
> > >
> > > How does this prevent the MMIO space from being mmap'd when disabled
> > > at the device?  How is the mmap revoked when the MMIO becomes disabled?
> > > Is it part of the move protocol?
> In this case, I think the importers that mmap'd the dmabuf need to be tracked
> separately and their VMA PTEs need to be zapped when MMIO access is revoked.

Which, as we know, is quite hard.
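
[Editor's note: to make the revoke concrete, here is a minimal sketch,
assuming the exporter keeps a "revoked" flag checked by the fault
handler and that all importer VMAs are backed by the dmabuf's own file;
both are assumptions of this example, not part of the posted patch:

static void vfio_pci_dma_buf_revoke(struct vfio_pci_dma_buf *priv)
{
	/*
	 * Make the fault handler refuse new faults first, then zap
	 * every PTE that vfio_pci_dma_buf_fault() already installed.
	 * Real code needs locking against concurrent faults, which is
	 * exactly the hard part being discussed.
	 */
	priv->revoked = true;	/* assumed field, tested in the fault path */
	unmap_mapping_range(priv->dmabuf->file->f_mapping, 0, 0, true);
}
]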

> > Yes, we should not have a mmap handler for dmabuf. vfio memory must be
> > mmapped in the normal way.
> Although optional, I think most dmabuf exporters (drm ones) provide a mmap
> handler. Otherwise, there is no easy way to provide CPU access (backup slow path)
> to the dmabuf for the importer.

Here we should not: there is no reason to, since VFIO already provides
a mmap mechanism itself. Anything using this API should just call the
native VFIO function instead of trying to mmap the DMABUF. Yes, it
will be inconvenient for the scatterlist case you have, but the
kernel-side implementation is much easier...
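
[Editor's note: for importers that do need CPU access, the native VFIO
path looks roughly like this from userspace. A sketch only: device_fd
is assumed to be an already-open VFIO device fd, and real code should
also check VFIO_REGION_INFO_FLAG_MMAP in info.flags:

#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/vfio.h>

static void *map_bar0(int device_fd)
{
	struct vfio_region_info info = {
		.argsz = sizeof(info),
		.index = VFIO_PCI_BAR0_REGION_INDEX,
	};

	if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &info) < 0)
		return MAP_FAILED;

	/* mmap the BAR through VFIO itself, not through the dmabuf fd */
	return mmap(NULL, info.size, PROT_READ | PROT_WRITE,
		    MAP_SHARED, device_fd, info.offset);
}
]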


> > > > +/**
> > > > + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
> > > > + * region selected.
> > > > + *
> > > > + * open_flags are the typical flags passed to open(2), eg O_RDWR,
> > > > + * O_CLOEXEC, etc. offset/length specify a slice of the region to
> > > > + * create the dmabuf from.
> > > > + * If both are 0 then the whole region is used.
> > > > + *
> > > > + * Return: The fd number on success, -1 and errno is set on failure.
> > > > + */
> > > > +#define VFIO_DEVICE_FEATURE_DMA_BUF 11
> > > > +
> > > > +struct vfio_region_p2p_area {
> > > > +	__u32	region_index;
> > > > +	__u32	__pad;
> > > > +	__u64	offset;
> > > > +	__u64	length;
> > > > +};
> > > > +
> > > > +struct vfio_device_feature_dma_buf {
> > > > +	__u32	open_flags;
> > > > +	__u32	nr_areas;
> > > > +	struct vfio_region_p2p_area p2p_areas[];
> > > > +};
> > 
> > Still have no clue what this p2p areas is. You want to create a dmabuf
> > out of a scatterlist? Why??

> Because the data associated with a buffer that needs to be shared can
> come from multiple ranges. I probably should have used the terms ranges
> or slices or chunks to make it more clear instead of p2p areas.

Yes, please pick a better name.
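
[Editor's note: purely as illustration, the quoted uapi with a neutral
name; these identifiers are invented for this example, not a proposal:

struct vfio_region_dma_range {
	__u32	region_index;
	__u32	__pad;
	__u64	offset;		/* byte offset into the region */
	__u64	length;		/* length of this slice, in bytes */
};

struct vfio_device_feature_dma_buf {
	__u32	open_flags;	/* O_RDWR, O_CLOEXEC, ... */
	__u32	nr_ranges;
	struct vfio_region_dma_range ranges[];
};
]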

> > I'm also not sure of the use of the pci_p2pdma family of functions, it
> > is a bold step to make struct pages, that isn't going to work in quite

> I guess things may have changed since the last discussion on this topic or
> maybe I misunderstood but I thought Christoph's suggestion was to use
> struct pages to populate the scatterlist instead of using DMA addresses
> and, I figured pci_p2pdma APIs can easily help with that.

It was, but it doesn't really work for VFIO. You can only create
struct pages for large MMIO spaces, and only on some architectures.

Requiring them will make VFIO unusable in a lot of places, and
requiring them just for DMABUF will make that feature unusable.

Creating them on first use, and then ignoring how broken that is, is
perhaps reasonable for now, but I'm unhappy to see the feature be so
poorly usable.
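
[Editor's note: a sketch of what create-on-first-use might look like,
assuming a pci-p2pdma backing and an invented p2p_pages_added flag.
The failure mode is the point: pci_p2pdma_add_resource() only works
where ZONE_DEVICE / device memory hotplug exist and the BAR slice is
suitably aligned, which not every architecture can provide:

static int vfio_pci_ensure_p2p_pages(struct vfio_pci_core_device *vdev,
				     int bar, size_t size, u64 offset)
{
	int ret;

	if (vdev->p2p_pages_added)	/* hypothetical flag */
		return 0;

	/* Fails on configs without ZONE_DEVICE / memory hotplug */
	ret = pci_p2pdma_add_resource(vdev->pdev, bar, size, offset);
	if (ret)
		return ret;

	vdev->p2p_pages_added = true;
	return 0;
}
]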
> 
> > a lot of cases. It is really hacky/wrong to immediately call
> > pci_alloc_p2pmem() to defeat the internal genalloc.

> In my use-case, I need to use all the pages from the pool and I don't see any
> better way to do it.

You have to fix the P2P API to work properly for this kind of use
case.

There should be no genalloc at all.
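
[Editor's note: for reference, the pattern being objected to looks
roughly like this (a sketch, not the patch's exact code): the whole BAR
is handed to the p2pdma genalloc pool and then immediately allocated
back out, so the allocator does nothing useful:

static void *grab_whole_bar(struct pci_dev *pdev, int bar,
			    size_t size, u64 offset)
{
	if (pci_p2pdma_add_resource(pdev, bar, size, offset))
		return NULL;

	/* Drain the internal genalloc pool dry in a single call */
	return pci_alloc_p2pmem(pdev, size);
}
]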

> > I'd rather we stick with the original design. Leon is working on DMA
> > API changes that should address half the issue.
>
> Ok, I'll keep an eye out for Leon's work.

It saves you from messing with the P2P stuff, but you get the other
issues back.

Jason

Thread overview:
2024-04-22  6:30 [PATCH v1 0/2] vfio/pci: Allow MMIO regions to be exported through dma-buf Vivek Kasireddy
2024-04-22  6:30 ` [PATCH v1 1/2] vfio: Export vfio device get and put registration helpers Vivek Kasireddy
2024-04-22  6:30 ` [PATCH v1 2/2] vfio/pci: Allow MMIO regions to be exported through dma-buf Vivek Kasireddy
2024-04-22 14:44   ` Zhu Yanjun
2024-04-30 22:24   ` Alex Williamson
2024-05-01 12:53     ` Jason Gunthorpe
2024-05-02  7:50       ` Kasireddy, Vivek
2024-05-02 12:57         ` Leon Romanovsky
2024-05-08  0:31         ` Jason Gunthorpe [this message]
2024-05-08  8:23           ` Daniel Vetter
