From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: "Marek Olšák" <maraeo@gmail.com>,
"Michel Dänzer" <michel@daenzer.net>,
dri-devel <dri-devel@lists.freedesktop.org>,
"Jason Ekstrand" <jason@jlekstrand.net>,
"ML Mesa-dev" <mesa-dev@lists.freedesktop.org>
Subject: Re: [Mesa-dev] Linux Graphics Next: Userspace submission update
Date: Fri, 4 Jun 2021 09:00:31 +0200
Message-ID: <0fbb1197-fa88-c474-09db-6daec13d3004@gmail.com>
In-Reply-To: <YLfZq5lAaR/RiFV8@phenom.ffwll.local>
On 02.06.21 at 21:19, Daniel Vetter wrote:
> On Wed, Jun 02, 2021 at 08:52:38PM +0200, Christian König wrote:
>>
>> On 02.06.21 at 20:48, Daniel Vetter wrote:
>>> On Wed, Jun 02, 2021 at 05:38:51AM -0400, Marek Olšák wrote:
>>>> On Wed, Jun 2, 2021 at 5:34 AM Marek Olšák <maraeo@gmail.com> wrote:
>>>>
>>>>> Yes, we can't break anything because we don't want to complicate things
>>>>> for us. It's pretty much all NAK'd already. We are trying to gather more
>>>>> knowledge and then make better decisions.
>>>>>
>>>>> The idea we are considering is that we'll expose memory-based sync objects
>>>>> to userspace for read only, and the kernel or hw will strictly control the
>>>>> memory writes to those sync objects. The hole in that idea is that
>>>>> userspace can decide not to signal a job, so even if userspace can't
>>>>> overwrite memory-based sync object states arbitrarily, it can still decide
>>>>> not to signal them, and then a future fence is born.
>>>>>
>>>> This would actually be treated as a GPU hang caused by that context, so it
>>>> should be fine.
>>> This is practically what I proposed already, except you're not doing it with
>>> dma_fence. And on the memory fence side this also doesn't actually give
>>> what you want for that compute model.
>>>
>>> This seems like a bit of a worst-of-both-worlds approach to me? Tons of work
>>> in the kernel to hide these not-dma_fence-but-almost, and still pain to
>>> actually drive the hardware like it should be for compute or direct
>>> display.
>>>
>>> Also maybe I've missed it, but I didn't see any replies to my suggestion
>>> how to fake the entire dma_fence stuff on top of new hw. Would be
>>> interesting to know what doesn't work there instead of amd folks going off
>>> into internal again and then coming back with another rfc from out of
>>> nowhere :-)
>> Well to be honest I would just push back on our hardware/firmware guys that
>> we need to keep kernel queues forever before going down that route.
> I looked again, and you said the model won't work because preemption is way
> too slow, even when the context is idle.
>
> I guess at that point I got maybe too fed up and just figured "not my
> problem", but if preempt is too slow as the unload fence, you can do it
> with pte removal and tlb shootdown too (that is hopefully not too slow,
> otherwise your hw is just garbage and won't even be fast for direct submit
> compute workloads).
Have you seen that one here:
https://www.spinics.net/lists/amd-gfx/msg63101.html :)
I've rejected it because I think polling for 6 seconds on a TLB flush
which can block interrupts as well is just madness.
>
> The only thing that you need to do when you use pte clearing + tlb
> shootdown instead of preemption as the unload fence for buffers that get
> moved is that if you get any gpu page fault, you don't serve that, but
> instead treat it as a tdr and shoot the context permanently.
>
> So summarizing the model I proposed:
>
> - you allow userspace to directly write into the ringbuffer, and also
> write the fences directly
>
> - actual submit is done by the kernel, using drm/scheduler. The kernel
> blindly trusts userspace to set up everything else, and even just wraps
> dma_fences around the userspace memory fences.
>
> - the only check is tdr. If a fence doesn't complete and a tdr fires, a) the
> kernel shoots the entire context and b) userspace recovers by setting up a
> new ringbuffer
>
> - memory management is done using ttm only, you still need to supply the
> buffer list (ofc that list includes the always present ones, so CS will
> only get the list of special buffers like today). If your hw can't trun
> gpu page faults and you ever get one we pull up the same old solution:
> Kernel shoots the entire context.
>
> The important thing is that from the gpu pov memory management works
> exactly like compute workload with direct submit, except that you just
> terminate the context on _any_ page fault, instead of only those that go
> somewhere where there's really no mapping and repair the others.
>
> Also I guess from reading the old thread this means you'd disable page
> fault retry because that is apparently also way too slow for anything.
>
> - memory management uses an unload fence. That unload fence waits for all
> userspace memory fences (represented as dma_fence) to complete, with
> maybe some fudge to busy-spin until we've reached the actual end of the
> ringbuffer (maybe you have an IB tail there after the memory fence write,
> we have that on intel hw), and it waits for the memory to get
> "unloaded". This is either preemption, or pte clearing + tlb shootdown,
> or whatever else your hw provides which is a) used for dynamic memory
> management b) fast enough for actual memory management.
>
> - any time a context dies we force-complete all its pending fences,
> in-order ofc
>
> So from hw pov this looks 99% like direct userspace submit, with the exact
> same mappings, command sequences and everything else. The only difference
> is that the ringbuffer head/tail updates happen from drm/scheduler, instead
> of directly from userspace.
>
> None of this stuff needs funny tricks where the kernel controls the
> writes to memory fences, or where you need kernel ringbuffers, or anything
> like this. Userspace is allowed to do anything stupid, the rules are
> guaranteed with:
>
> - we rely on the hw isolation features to work, but _exactly_ like compute
> direct submit would too
>
> - dying on any page fault captures memory management issues
>
> - dying (without kernel recovery, this is up to userspace if it cares) on
> any tdr makes sure fences complete still
>
>> That syncfile and all that Android stuff isn't working out of the box with
>> the new shiny user queue submission model (which in turn is mostly because
>> of Windows) already raised some eyebrows here.
> I think if you really want to make sure the current linux stack doesn't
> break the _only_ option you have is provide a ctx mode that allows
> dma_fence and drm/scheduler to be used like today.
Yeah, but I still can just tell our hw/fw guys that we really really
need to keep kernel queues or the whole Linux/Android infrastructure
needs to get a compatibility layer like you describe above.
> For everything else it sounds like you're a few years too late, because even
> just huge kernel changes won't happen in time. Much less rewriting
> userspace protocols.
Seconded, question is rather if we are going to start migrating at some
point or if we should keep pushing on our hw/fw guys.
Christian.
> -Daniel
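
For readers following along, the rules in the model quoted above (kernel
wraps a fence around each userspace memory fence, tdr or any page fault
shoots the context, pending fences are force-completed in order, and the
unload fence waits for both the fences and the memory unload) can be
illustrated with a small toy model. This is only a sketch in plain Python,
not kernel code: the names Fence, Context, memory_seqno, on_timeout,
on_page_fault and unload_done are hypothetical and do not correspond to
the dma_fence or drm/scheduler API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Fence:
    """Hypothetical kernel-side wrapper around one userspace memory fence."""
    seqno: int
    signaled: bool = False
    error: bool = False  # True if the fence was force-completed


@dataclass
class Context:
    """Toy submission context with an in-order list of pending fences."""
    memory_seqno: int = 0  # value userspace/hw wrote to the memory fence
    pending: List[Fence] = field(default_factory=list)
    dead: bool = False
    next_seqno: int = 1

    def submit(self) -> Fence:
        """Kernel-side submit: wrap the next memory fence value in a Fence."""
        if self.dead:
            raise RuntimeError("context was shot, set up a new one")
        fence = Fence(self.next_seqno)
        self.next_seqno += 1
        self.pending.append(fence)
        return fence

    def poll(self) -> None:
        """Signal, in order, every fence the memory fence value has reached."""
        while self.pending and self.pending[0].seqno <= self.memory_seqno:
            self.pending.pop(0).signaled = True

    def kill(self) -> None:
        """Shoot the context and force-complete pending fences in order."""
        self.dead = True
        for fence in self.pending:
            fence.signaled = True
            fence.error = True
        self.pending.clear()

    def on_timeout(self) -> None:
        """tdr: if anything is still pending after the timeout, kill."""
        self.poll()
        if self.pending:
            self.kill()

    def on_page_fault(self) -> None:
        """Any gpu page fault is treated like a tdr and never serviced."""
        self.kill()

    def unload_done(self, memory_unloaded: bool) -> bool:
        """Unload fence: all fences signaled and the hw unloaded the memory."""
        self.poll()
        return not self.pending and memory_unloaded


# Userspace signals only the first of three jobs, then the tdr fires.
ctx = Context()
first, second, third = ctx.submit(), ctx.submit(), ctx.submit()
ctx.memory_seqno = 1
ctx.on_timeout()
```

After the timeout the first fence has signaled normally, the other two are
signaled with the error flag set, and the context refuses further
submissions, which mirrors the "userspace recovers by setting up a new
ringbuffer" step.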