dri-devel Archive mirror
From: Daniel Vetter <daniel@ffwll.ch>
To: "Christian König" <ckoenig.leichtzumerken@gmail.com>
Cc: "Marek Olšák" <maraeo@gmail.com>,
	"Michel Dänzer" <michel@daenzer.net>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	"Jason Ekstrand" <jason@jlekstrand.net>,
	"ML Mesa-dev" <mesa-dev@lists.freedesktop.org>
Subject: Re: [Mesa-dev] Linux Graphics Next: Userspace submission update
Date: Thu, 17 Jun 2021 18:48:28 +0200
Message-ID: <YMt83HMgDqvep9cN@phenom.ffwll.local>
In-Reply-To: <ebeabd65-9d8e-d364-a084-62bcdd7aa439@gmail.com>

On Mon, Jun 14, 2021 at 07:13:00PM +0200, Christian König wrote:
> As long as we can figure out who touched a certain sync object last, that
> would indeed work, yes.

Don't you need to know who will touch it next, i.e. who is holding up your
fence? Or maybe I'm just again totally confused.
-Daniel

> 
> Christian.
> 
> Am 14.06.21 um 19:10 schrieb Marek Olšák:
> > The call to the hw scheduler has a limitation on the size of all
> > parameters combined. I think we can only pass a 32-bit sequence number
> > and a ~16-bit global (per-GPU) syncobj handle in one call and not much
> > else.
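
To make the size limit above concrete, here is a minimal sketch of packing a sequence number and a syncobj handle into one parameter word. It is only an illustration: the "~16-bit" handle is assumed to be exactly 16 bits, and the names `pack_signal` and `unpack_signal` are hypothetical, not part of any real hw scheduler interface.

```python
SEQ_BITS = 32      # sequence number width stated in the mail
HANDLE_BITS = 16   # assumption: "~16-bit" taken as exactly 16 bits


def pack_signal(handle: int, seq: int) -> int:
    """Pack a global syncobj table index and a sequence number into one word."""
    if not 0 <= handle < (1 << HANDLE_BITS):
        raise ValueError("syncobj handle does not fit the handle field")
    if not 0 <= seq < (1 << SEQ_BITS):
        raise ValueError("sequence number does not fit the sequence field")
    return (handle << SEQ_BITS) | seq


def unpack_signal(word: int) -> tuple[int, int]:
    """Recover (handle, seq) from a packed word."""
    return word >> SEQ_BITS, word & ((1 << SEQ_BITS) - 1)
```

With these assumed widths the whole call payload is 48 bits, which leaves very little room for anything else, as the mail says.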
> > 
> > The syncobj handle can be an element index in a global (per-GPU) syncobj
> > table and it's read only for all processes with the exception of the
> > signal command. Syncobjs can either have per VMID write access flags for
> > the signal command (slow), or any process can write to any syncobjs and
> > only rely on the kernel checking the write log (fast).
> > 
> > In any case, we can execute the memory write in the queue engine and
> > only use the hw scheduler for logging, which would be perfect.
> > 
> > Marek
> > 
> > On Thu, Jun 10, 2021 at 12:33 PM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> > 
> >     Hi guys,
> > 
> >     maybe soften that a bit. Reading from the shared memory of the
> >     user fence is ok for everybody. What we need to take more care of
> >     is the writing side.
> > 
> >     So my current thinking is that we allow read only access, but
> >     writing a new sequence value needs to go through the scheduler/kernel.
> > 
> >     So when the CPU wants to signal a timeline fence it needs to call
> >     an IOCTL. When the GPU wants to signal the timeline fence it needs
> >     to hand that off to the hardware scheduler.
> > 
> >     If we lock up, the kernel can check with the hardware who did the
> >     last write and what value was written.
> > 
> >     That, together with an IOCTL to give out sequence numbers for
> >     implicit sync to applications, should be sufficient for the kernel
> >     to track who is responsible if something bad happens.
> > 
> >     In other words, when the hardware says that the shader wrote stuff
> >     like 0xdeadbeef, 0x0 or 0xffffffff into memory we kill the process
> >     that did that.
> > 
> >     If the hardware says that seq - 1 was written fine, but seq is
> >     missing, then the kernel blames whoever was supposed to write seq.
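
The two blame rules above can be sketched as a small function. This is a toy model under stated assumptions, not kernel code: `assigned` stands in for the kernel's record of which process was handed which sequence number, `write_log` for the hardware's ordered log of (process, value) writes, and the name `find_culprit` is hypothetical.

```python
def find_culprit(assigned, write_log):
    """Return the process to blame, or None if the timeline looks healthy.

    assigned:  dict mapping sequence number -> pid expected to write it
               (assumed to come from the kernel's sequence-number IOCTL)
    write_log: list of (pid, value) pairs in the order the hardware saw them
    """
    written = set()
    for pid, value in write_log:
        # Rule 1: a process wrote a value it was never handed
        # (garbage like 0xdeadbeef, or somebody else's number).
        if assigned.get(value) != pid:
            return pid
        written.add(value)

    # Rule 2: everything up to seq - 1 is fine but seq is missing,
    # so blame whoever was supposed to write seq.
    for seq in sorted(assigned):
        if seq not in written:
            return assigned[seq]
    return None
```

For example, with `assigned = {1: 100, 2: 200, 3: 100}`, a log of `[(100, 1), (200, 0xdeadbeef)]` blames process 200 under the first rule, and a log of just `[(100, 1)]` also blames 200, this time under the second rule.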
> > 
> >     Just piping the write through a privileged instance should be
> >     fine to make sure that we don't run into issues.
> > 
> >     Christian.
> > 
> >     Am 10.06.21 um 17:59 schrieb Marek Olšák:
> > >     Hi Daniel,
> > > 
> > >     We just talked about this whole topic internally and we came
> > >     to the conclusion that the hardware needs to understand sync
> > >     object handles and have high-level wait and signal operations in
> > >     the command stream. Sync objects will be backed by memory, but
> > >     they won't be readable or writable by processes directly. The
> > >     hardware will log all accesses to sync objects and will send the
> > >     log to the kernel periodically. The kernel will identify
> > >     malicious behavior.
> > > 
> > >     Example of a hardware command stream:
> > >     ...
> > >     // the sequence number is assigned by the kernel
> > >     ImplicitSyncWait(syncObjHandle, sequenceNumber);
> > >     Draw();
> > >     ImplicitSyncSignalWhenDone(syncObjHandle);
> > >     ...
> > > 
> > >     I'm afraid we have no other choice because of the TLB
> > >     invalidation overhead.
> > > 
> > >     Marek
> > > 
> > > 
> > >     On Wed, Jun 9, 2021 at 2:31 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > 
> > >         On Wed, Jun 09, 2021 at 03:58:26PM +0200, Christian König wrote:
> > >         > Am 09.06.21 um 15:19 schrieb Daniel Vetter:
> > >         > > [SNIP]
> > >         > > > Yeah, we call this the lightweight and the heavyweight tlb flush.
> > >         > > >
> > >         > > > The lightweight can be used when you are sure that you don't have any of the
> > >         > > > PTEs currently in flight in the 3D/DMA engine and you just need to
> > >         > > > invalidate the TLB.
> > >         > > >
> > >         > > > The heavyweight must be used when you need to invalidate the TLB *AND* make
> > >         > > > sure that no concurrent operation moves new stuff into the TLB.
> > >         > > >
> > >         > > > The problem is for this use case we have to use the heavyweight one.
> > >         > > Just for my own curiosity: So the lightweight flush is only for in-between
> > >         > > CS when you know access is idle? Or does that also not work if userspace
> > >         > > has a CS on a dma engine going at the same time because the tlb aren't
> > >         > > isolated enough between engines?
> > >         >
> > >         > More or less correct, yes.
> > >         >
> > >         > The problem is a lightweight flush only invalidates the TLB, but doesn't
> > >         > take care of entries which have been handed out to the different engines.
> > >         >
> > >         > In other words what can happen is the following:
> > >         >
> > >         > 1. Shader asks TLB to resolve address X.
> > >         > 2. TLB looks into its cache and can't find address X so it asks the walker
> > >         > to resolve.
> > >         > 3. Walker comes back with result for address X and TLB puts that into its
> > >         > cache and gives it to Shader.
> > >         > 4. Shader starts doing some operation using result for address X.
> > >         > 5. You send lightweight TLB invalidate and TLB throws away cached values for
> > >         > address X.
> > >         > 6. Shader happily still uses whatever the TLB gave to it in step 3 to
> > >         > access address X.
> > >         >
> > >         > See it like the shader has its own 1 entry L0 TLB cache which is not
> > >         > affected by the lightweight flush.
> > >         >
> > >         > The heavyweight flush on the other hand sends out a broadcast signal to
> > >         > everybody and only comes back when we are sure that an address is not in use
> > >         > any more.
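
The six steps and the "1 entry L0 TLB cache" picture above can be reproduced in a toy model. This is only a sketch under loud assumptions: the classes `Tlb` and `Shader` are hypothetical, the page table is a plain dict, and the heavyweight flush is modeled by simply making every engine drop its held translation, whereas the real one waits until the address is no longer in use.

```python
class Tlb:
    def __init__(self, page_table):
        self.page_table = page_table  # what the walker reads
        self.cache = {}               # the shared TLB cache
        self.engines = []             # engines that may hold translations

    def resolve(self, va):
        # Steps 2 and 3: on a miss, ask the walker and cache the result.
        if va not in self.cache:
            self.cache[va] = self.page_table[va]
        return self.cache[va]

    def light_flush(self):
        # Only the shared cache is invalidated; engines are not told.
        self.cache.clear()

    def heavy_flush(self):
        # Broadcast: engines also give up what was handed out to them.
        self.cache.clear()
        for engine in self.engines:
            engine.held = None


class Shader:
    def __init__(self, tlb):
        self.tlb = tlb
        self.held = None  # the one-entry "L0" copy from step 3
        tlb.engines.append(self)

    def access(self, va):
        if self.held is None or self.held[0] != va:
            self.held = (va, self.tlb.resolve(va))
        return self.held[1]


page_table = {"X": "old page"}
tlb = Tlb(page_table)
shader = Shader(tlb)

shader.access("X")            # steps 1 to 4
page_table["X"] = "new page"  # kernel changes the mapping
tlb.light_flush()             # step 5
stale = shader.access("X")    # step 6: still "old page"
tlb.heavy_flush()
fresh = shader.access("X")    # now "new page"
```

After the lightweight flush the shader keeps using the old translation, which is exactly the hazard described in step 6; only the heavyweight flush makes the new mapping visible to it.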
> > > 
> > >         Ah makes sense. On intel the shaders only operate in VA, everything goes
> > >         around as explicit async messages to IO blocks. So we don't have this, the
> > >         only difference in tlb flushes is between tlb flush in the IB and an mmio
> > >         one which is independent for anything currently being executed on an
> > >         engine.
> > >         -Daniel
> > >         --
> > >         Daniel Vetter
> > >         Software Engineer, Intel Corporation
> > >         http://blog.ffwll.ch
> > > 
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
