KVM Archive mirror
 help / color / mirror / Atom feed
From: Yan Zhao <yan.y.zhao@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: <kvm@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<x86@kernel.org>, <alex.williamson@redhat.com>,
	<kevin.tian@intel.com>, <iommu@lists.linux.dev>,
	<pbonzini@redhat.com>, <seanjc@google.com>,
	<dave.hansen@linux.intel.com>, <luto@kernel.org>,
	<peterz@infradead.org>, <tglx@linutronix.de>, <mingo@redhat.com>,
	<bp@alien8.de>, <hpa@zytor.com>, <corbet@lwn.net>,
	<joro@8bytes.org>, <will@kernel.org>, <robin.murphy@arm.com>,
	<baolu.lu@linux.intel.com>, <yi.l.liu@intel.com>
Subject: Re: [PATCH 5/5] iommufd: Flush CPU caches on DMA pages in non-coherent domains
Date: Mon, 20 May 2024 10:45:56 +0800	[thread overview]
Message-ID: <Zkq5ZL+saJbEkfBQ@yzhao56-desk.sh.intel.com> (raw)
In-Reply-To: <20240517170418.GA20229@nvidia.com>

On Fri, May 17, 2024 at 02:04:18PM -0300, Jason Gunthorpe wrote:
> On Thu, May 16, 2024 at 10:32:43AM +0800, Yan Zhao wrote:
> > On Wed, May 15, 2024 at 05:43:04PM -0300, Jason Gunthorpe wrote:
> > > On Wed, May 15, 2024 at 03:06:36PM +0800, Yan Zhao wrote:
> > > 
> > > > > So it has to be calculated on closer to a page by page basis (really a
> > > > > span by span basis) if flushing of that span is needed based on where
> > > > > the pages came from. Only pages that came from a hwpt that is
> > > > > non-coherent can skip the flushing.
> > > > Is area by area basis also good?
> > > > Isn't an area either not mapped to any domain or mapped into all domains?
> > > 
> > > Yes, this is what the span iterator turns into in the background, it
> > > goes area by area to cover things.
> > > 
> > > > But, yes, considering the limited number of non-coherent domains, it appears
> > > > more robust and clean to always flush for non-coherent domain in
> > > > iopt_area_fill_domain().
> > > > It eliminates the need to decide whether to retain the area flag during a split.
> > > 
> > > And flush for pin user pages, so you basically always flush because
> > > you can't tell where the pages came from.
> > As a summary, do you think it's good to flush in below way?
> > 
> > 1. in iopt_area_fill_domains(), flush before mapping a page into domains when
> >    iopt->noncoherent_domain_cnt > 0, no matter where the page is from.
> >    Record cache_flush_required in pages for unpin.
> > 2. in iopt_area_fill_domain(), pass in hwpt to check domain non-coherency.
> >    flush before mapping a page into a non-coherent domain, no matter where the
> >    page is from.
> >    Record cache_flush_required in pages for unpin.
> > 3. in batch_unpin(), flush if pages->cache_flush_required before
> >    unpin_user_pages.
> 
> It does not quite sound right, there should be no tracking in the
> pages of this stuff.
What's the downside of having tracking in the pages?

Lazily flush pages right before unpin pages is not only to save flush count
for performance, but also for some real problem we encountered. see below.

> 
> If pfn_reader_fill_span() does batch_from_domain() and
> the source domain's storage_domain is non-coherent then you can skip
> the flush. This is not pedantically perfect in skipping all flushes, but
> in practice it is probably good enough.
We don't know whether the source storage_domain is non-coherent since
area->storage_domain is of "struct iommu_domain".

Do you want to add a flag in "area", e.g. area->storage_domain_is_noncoherent,
and set this flag along side setting storage_domain?
(But looks this is not easy in iopt_area_fill_domains() as we don't have hwpt
there.)

> __iopt_area_unfill_domain() (and children) must flush after
> iopt_area_unmap_domain_range() if the area's domain is
> non-coherent. This is also not perfect, but probably good enough.
Do you mean flush after each iopt_area_unmap_domain_range() if the domain is
non-coherent?
The problem is that iopt_area_unmap_domain_range() knows only IOVA, the
IOVA->PFN relationship is not available without iommu_iova_to_phys() and
iommu_domain contains no coherency info.
Besides, when the non-coherent domain is a storage domain, we still need to do
the flush in batch_unpin(), right?
Then, with a more complex case, if the non-coherent domain is a storage domain,
and if some pages are still held in pages->access_itree when unfilling the
domain, should we get PFNs from pages->pinned_pfns and do the flush in
__iopt_area_unfill_domain()?
> 
> Doing better in both cases would require inspecting the areas under
> the used span to see what is there. This is not so easy.
My feeling is that checking non-coherency of target domain and save non-coherency
in pages might be the easiest way with least code change.


  reply	other threads:[~2024-05-20  2:46 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-07  6:18 [PATCH 0/5] Enforce CPU cache flush for non-coherent device assignment Yan Zhao
2024-05-07  6:19 ` [PATCH 1/5] x86/pat: Let pat_pfn_immune_to_uc_mtrr() check MTRR for untracked PAT range Yan Zhao
2024-05-07  8:26   ` Tian, Kevin
2024-05-07  9:12     ` Yan Zhao
2024-05-08 22:14       ` Alex Williamson
2024-05-09  3:36         ` Yan Zhao
2024-05-16  7:42       ` Tian, Kevin
2024-05-16 14:07         ` Sean Christopherson
2024-05-20  2:36           ` Tian, Kevin
2024-05-07  6:20 ` [PATCH 2/5] KVM: x86/mmu: Fine-grained check of whether a invalid & RAM PFN is MMIO Yan Zhao
2024-05-07  8:39   ` Tian, Kevin
2024-05-07  9:19     ` Yan Zhao
2024-05-07  6:20 ` [PATCH 3/5] x86/mm: Introduce and export interface arch_clean_nonsnoop_dma() Yan Zhao
2024-05-07  8:51   ` Tian, Kevin
2024-05-07  9:40     ` Yan Zhao
2024-05-20 14:07   ` Christoph Hellwig
2024-05-21 15:49     ` Jason Gunthorpe
2024-05-21 16:00       ` Jason Gunthorpe
2024-05-22  3:41         ` Yan Zhao
2024-05-28  6:37         ` Christoph Hellwig
2024-06-01 19:46           ` Jason Gunthorpe
2024-05-07  6:21 ` [PATCH 4/5] vfio/type1: Flush CPU caches on DMA pages in non-coherent domains Yan Zhao
2024-05-09 18:10   ` Alex Williamson
2024-05-10 10:31     ` Yan Zhao
2024-05-10 16:57       ` Alex Williamson
2024-05-13  7:11         ` Yan Zhao
2024-05-16  7:53           ` Tian, Kevin
2024-05-16  8:34           ` Tian, Kevin
2024-05-16 20:31             ` Alex Williamson
2024-05-17 17:11               ` Jason Gunthorpe
2024-05-20  2:52                 ` Tian, Kevin
2024-05-21 16:07                   ` Jason Gunthorpe
2024-05-21 16:21                     ` Alex Williamson
2024-05-21 16:34                       ` Jason Gunthorpe
2024-05-21 18:19                         ` Alex Williamson
2024-05-21 18:37                           ` Jason Gunthorpe
2024-05-22  6:24                             ` Tian, Kevin
2024-05-22 12:29                               ` Jason Gunthorpe
2024-05-22 14:43                                 ` Alex Williamson
2024-05-22 16:52                                   ` Jason Gunthorpe
2024-05-22 18:22                                     ` Alex Williamson
2024-05-22 23:26                                 ` Tian, Kevin
2024-05-22 23:32                                   ` Jason Gunthorpe
2024-05-22 23:40                                     ` Tian, Kevin
2024-05-23 14:58                                       ` Jason Gunthorpe
2024-05-23 22:47                                         ` Alex Williamson
2024-05-24  0:30                                           ` Tian, Kevin
2024-05-24 13:50                                           ` Jason Gunthorpe
2024-05-22  3:33                           ` Yan Zhao
2024-05-22  3:24                         ` Yan Zhao
2024-05-22 12:26                           ` Jason Gunthorpe
2024-05-24  3:07                             ` Yan Zhao
2024-05-16 20:50           ` Alex Williamson
2024-05-17  3:11             ` Yan Zhao
2024-05-17  4:44               ` Alex Williamson
2024-05-17  5:00                 ` Yan Zhao
2024-05-07  6:22 ` [PATCH 5/5] iommufd: " Yan Zhao
2024-05-09 14:13   ` Jason Gunthorpe
2024-05-10  8:03     ` Yan Zhao
2024-05-10 13:29       ` Jason Gunthorpe
2024-05-13  7:43         ` Yan Zhao
2024-05-14 15:11           ` Jason Gunthorpe
2024-05-15  7:06             ` Yan Zhao
2024-05-15 20:43               ` Jason Gunthorpe
2024-05-16  2:32                 ` Yan Zhao
2024-05-16  8:38                   ` Tian, Kevin
2024-05-16  9:48                     ` Yan Zhao
2024-05-17 17:04                   ` Jason Gunthorpe
2024-05-20  2:45                     ` Yan Zhao [this message]
2024-05-21 16:04                       ` Jason Gunthorpe
2024-05-22  3:17                         ` Yan Zhao
2024-05-22  6:29                           ` Yan Zhao
2024-05-22 17:01                             ` Jason Gunthorpe
2024-05-27  7:15                               ` Yan Zhao
2024-06-01 16:48                                 ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zkq5ZL+saJbEkfBQ@yzhao56-desk.sh.intel.com \
    --to=yan.y.zhao@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=iommu@lists.linux.dev \
    --cc=jgg@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=robin.murphy@arm.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).