Intel-GFX Archive mirror
 help / color / mirror / Atom feed
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>,
	Kenneth Graunke <kenneth.w.graunke@intel.com>,
	 Matt Roper <matthew.d.roper@intel.com>
Cc: intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	"Chery, Nanley G" <nanley.g.chery@intel.com>,
	"Saarinen, Jani" <jani.saarinen@intel.com>,
	"Souza, Jose" <jose.souza@intel.com>,
	"Mathew, Alwin" <alwin.mathew@intel.com>,
	"Zhang, Jianxun" <jianxun.zhang@intel.com>,
	"Syrjala, Ville" <ville.syrjala@linux.intel.com>,
	"Nikula, Jani" <jani.nikula@intel.com>
Subject: Re: [RFC PATCH 0/3] Introducing I915_FORMAT_MOD_4_TILED_XE2_CCS Modifier for Xe2
Date: Tue, 14 May 2024 18:51:09 +0200	[thread overview]
Message-ID: <208ba57378377f8815a7cf07b40ac9ee1ddaaa92.camel@linux.intel.com> (raw)
In-Reply-To: <171567875573.17841.1431042175236504579@jlahtine-mobl.ger.corp.intel.com>

On Tue, 2024-05-14 at 12:25 +0300, Joonas Lahtinen wrote:
> Quoting Kenneth Graunke (2024-05-11 03:58:34)
> > On Tuesday, May 7, 2024 3:56:57 PM PDT Matt Roper wrote:
> > > On Mon, May 06, 2024 at 09:52:35PM +0300, Juha-Pekka Heikkila
> > > wrote:
> > > > These patches introduce I915_FORMAT_MOD_4_TILED_XE2_CCS
> > > > modifier, which,
> > > > from the kernel's perspective, behaves similarly to 
> > `I915_FORMAT_MOD_4_TILED`.
> > > > This new modifier is primarily intended for user space to
> > > > effectively 
> > monitor
> > > > compression status, especially when dealing with a mix of
> > > > compressed and
> > > > uncompressed buffers.
> > > > 
> > > > The addition of this modifier facilitates user space in
> > > > managing 
> > compression
> > > > status, particularly when utilizing both compressed and
> > > > uncompressed 
> > buffers
> > > > concurrently. To leverage compression for these buffers, user
> > > > space
> > > > applications must configure the appropriate Page Attribute
> > > > Table (PAT) 
> > index.
> > > > Display engine will treat all Tile4 as if it were compressed
> > > > under all
> > > > circumstances on Xe2 architecture.
> > > 
> > > I may have missed some discussion about this, but I thought the
> > > previous
> > > consensus was that we didn't want/need new modifiers for
> > > compression on
> > > Xe2?  If a userspace client (or the display hardware) receives a
> > > buffer
> > > of unknown origin and unknown compression status, it's always
> > > fine to
> > > select a compressed PAT when binding the buffer to read since
> > > even for
> > > uncompressed buffers the CCS metadata will accurately reflect the
> > > compression status.  Unlike Xe1, where generating content without
> > > compression enabled would leave random garbage in the FlatCCS
> > > area, Xe2
> > > will set the corresponding FlatCCS to '0x0' for each block,
> > > indicating
> > > uncompressed data.
> > > 
> > > Can you explain more what the benefit of handling these modifiers
> > > explicitly is?
> > > 
> > > 
> > > Matt
> > 
> > Thanks, Matt!  I'm a bit late in getting up to speed with the Xe2
> > compression 
> > changes; this is really good information.
> > 
> > As I understand it...all blocks on the GPU behave in the way you
> > mentioned, 
> > where generating uncompressed data via the GPU will set FlatCCS =
> > 0, so you 
> > can assume a compressed PAT entry and everything works.
> > 
> > One snag is...I've heard that CPU access doesn't work that way. 
> > So, if you 
> > mmap a buffer on the CPU, and write data with the CPU, then I think
> > we're back 
> > to the "FlatCCS contains uninitialized garbage" case, where it's
> > unsafe to 
> > assume a compressed PAT.  And... we don't really know when sharing
> > buffers 
> > whether the other side is going to want to do CPU access.
> 
> I think the previous discussion has specifically happened in the
> context of
> dma-buf, so not only CPU but other GPUs/accelerators/decoders/devices
> in the
> system are also relevant.
> 
> It boils down to the fact that when exporting a dma-buf, one can't
> know it will
> be consumed only by the same GPU (or other device for that matter)
> unless there
> is an explicit negotiation between exporter and importers.
> 
> > It would be really nice to assume compression by default, though,
> > which got me 
> > thinking: if we mmap a buffer via DRM_XE_GEM_MMAP_OFFSET, could
> > xe.ko disable 
> > compression for us?  So, resolve any outstanding CCS data, and then
> > switch any 
> > PAT entries to uncompressed.  Mapping would block until that
> > resolve is done.  
> > It could leave compression off forever (once you CPU map a buffer,
> > it's never 
> > compressed again).  Or it could turn CCS back on when map count
> > reaches 0 (but 
> > frankly I'm not sure that's terribly important, and sounds more
> > complex).
> 
> This would only really work for a single device but the dma-buf is
> specifically targeting more generic sharing. There's no built-in
> mechanism
> to limit the sharing to subset of devices without explicit
> negotiation
> between the exporter and importers.
> 
> So I think the "by default" mode needs to be interoperable, and the
> explicit negotiation can then use less compatible formats given those
> FD
> are never passed to importers that were not part of the negotiation.
> 
> > As I understand it, at least on discrete GPUs, the kernel already
> > has to do 
> > something similar for eviction, when migrating BOs to system memory
> > (which 
> > doesn't support compression).  So this would be doing basically the
> > same 
> > "resolve and disable CCS" step the kernel can presumably already
> > do, but now 
> > on mmap as well.
> 
> So far the eviction strategy has been to copy both the backing store
> and
> compression bits in raw form. With Xe2 it would indeed be possible to
> do
> an implicit resolve IFF the buffer has not been shared to someone who
> doesn't
> understand compression and might have left garbage in the CCS bits.
> 
> When evicting in raw form, kernel doesn't have to know if the CCS
> bits
> are garbage or not on any given moment.
> 
> Regards, Joonas

Just a follow-up comment (TBC), IF we're going for the everything-is-
compressed approach, I think there are some considerations to be made:

dma-buf exports to foreign devices need to resolve at map_attachment
time. Foreign devices are all devices that can't interpret the
compressed content.
dma-buf mmaps need to resolve. IIRC Could be implemented in the
DMA_BUF_IOCTL_SYNC callbacks that wrap cpu access.
dma-buf imports from foreign devices need to never use compressed PAT
for writing. Should KMD enforce this? Implicitly? Explicitly? I don't
see how UMD could know whether the imported dma-buf is from a foreign
device.

For mmaps of buffer objects of local device a resolve is needed. Having
KMD do that on a pagefault-basis is definitely possible, but will most
likely be terribly inefficient. Better to leave that to UMD?

/Thomas

> 
> > 
> > What do you think?  Viable?  Crazy?  Have I missed something?
> > 
> > --Ken


      reply	other threads:[~2024-05-14 16:51 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-06 18:52 [RFC PATCH 0/3] Introducing I915_FORMAT_MOD_4_TILED_XE2_CCS Modifier for Xe2 Juha-Pekka Heikkila
2024-05-06 18:52 ` [RFC PATCH 1/3] drm/fourcc: define Intel Xe2 related tile4 ccs modifier Juha-Pekka Heikkila
2024-06-04 15:54   ` Chery, Nanley G
2024-05-06 18:52 ` [RFC PATCH 2/3] drm/xe/display: allow creation of case I915_FORMAT_MOD_4_TILED_XE2_CCS type framebuffer Juha-Pekka Heikkila
2024-05-07 16:07   ` Ville Syrjälä
2024-05-07 16:41     ` Juha-Pekka Heikkila
2024-05-06 18:52 ` [RFC PATCH 3/3] drm/i915/display: " Juha-Pekka Heikkila
2024-05-07 16:06   ` Ville Syrjälä
2024-05-07 16:43     ` Juha-Pekka Heikkila
2024-05-06 20:41 ` ✗ Fi.CI.CHECKPATCH: warning for Introducing I915_FORMAT_MOD_4_TILED_XE2_CCS Modifier for Xe2 Patchwork
2024-05-06 20:56 ` ✓ Fi.CI.BAT: success " Patchwork
2024-05-07  5:12 ` ✗ Fi.CI.IGT: failure " Patchwork
2024-05-07 22:56 ` [RFC PATCH 0/3] " Matt Roper
2024-05-08 13:08   ` Juha-Pekka Heikkila
2024-05-11  0:58   ` Kenneth Graunke
2024-05-14  9:25     ` Joonas Lahtinen
2024-05-14 16:51       ` Thomas Hellström [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=208ba57378377f8815a7cf07b40ac9ee1ddaaa92.camel@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=alwin.mathew@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jani.nikula@intel.com \
    --cc=jani.saarinen@intel.com \
    --cc=jianxun.zhang@intel.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=jose.souza@intel.com \
    --cc=juhapekka.heikkila@gmail.com \
    --cc=kenneth.w.graunke@intel.com \
    --cc=matthew.d.roper@intel.com \
    --cc=nanley.g.chery@intel.com \
    --cc=ville.syrjala@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).