dri-devel Archive mirror
From: Mina Almasry <almasrymina@google.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: "Jason Gunthorpe" <jgg@ziepe.ca>,
	"Pavel Begunkov" <asml.silence@gmail.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org, bpf@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org,
	dri-devel@lists.freedesktop.org,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Ivan Kokshaysky" <ink@jurassic.park.msu.ru>,
	"Matt Turner" <mattst88@gmail.com>,
	"Thomas Bogendoerfer" <tsbogend@alpha.franken.de>,
	"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
	"Helge Deller" <deller@gmx.de>,
	"Andreas Larsson" <andreas@gaisler.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Masami Hiramatsu" <mhiramat@kernel.org>,
	"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Eduard Zingerman" <eddyz87@gmail.com>,
	"Song Liu" <song@kernel.org>,
	"Yonghong Song" <yonghong.song@linux.dev>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"KP Singh" <kpsingh@kernel.org>,
	"Stanislav Fomichev" <sdf@google.com>,
	"Hao Luo" <haoluo@google.com>, "Jiri Olsa" <jolsa@kernel.org>,
	"Steffen Klassert" <steffen.klassert@secunet.com>,
	"Herbert Xu" <herbert@gondor.apana.org.au>,
	"David Ahern" <dsahern@kernel.org>,
	"Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Amritha Nambiar" <amritha.nambiar@intel.com>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	"Alexander Mikhalitsyn" <alexander@mihalicyn.com>,
	"Kaiyuan Zhang" <kaiyuanz@google.com>,
	"Christian Brauner" <brauner@kernel.org>,
	"Simon Horman" <horms@kernel.org>,
	"David Howells" <dhowells@redhat.com>,
	"Florian Westphal" <fw@strlen.de>,
	"Yunsheng Lin" <linyunsheng@huawei.com>,
	"Kuniyuki Iwashima" <kuniyu@amazon.com>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Arseniy Krasnov" <avkrasnov@salutedevices.com>,
	"Aleksander Lobakin" <aleksander.lobakin@intel.com>,
	"Michael Lass" <bevan@bi-co.net>, "Jiri Pirko" <jiri@resnulli.us>,
	"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
	"Lorenzo Bianconi" <lorenzo@kernel.org>,
	"Richard Gobert" <richardbgobert@gmail.com>,
	"Sridhar Samudrala" <sridhar.samudrala@intel.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	"Johannes Berg" <johannes.berg@intel.com>,
	"Abel Wu" <wuyun.abel@bytedance.com>,
	"Breno Leitao" <leitao@debian.org>, "David Wei" <dw@davidwei.uk>,
	"Shailend Chand" <shailend@google.com>,
	"Harshitha Ramamurthy" <hramamurthy@google.com>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Jeroen de Borst" <jeroendb@google.com>,
	"Praveen Kaligineedi" <pkaligineedi@google.com>
Subject: Re: [RFC PATCH net-next v8 02/14] net: page_pool: create hooks for custom page providers
Date: Tue, 7 May 2024 09:42:05 -0700
Message-ID: <CAHS8izPH+sRLSiZ7vbrNtRdHrFEf8XQ61XAyHuxRSL9Jjy8YbQ@mail.gmail.com>
In-Reply-To: <ZjpVfPqGNfE5N4bl@infradead.org>

On Tue, May 7, 2024 at 9:24 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Tue, May 07, 2024 at 01:18:57PM -0300, Jason Gunthorpe wrote:
> > On Tue, May 07, 2024 at 05:05:12PM +0100, Pavel Begunkov wrote:
> > > > even in tree if you give them enough rope, and they should not have
> > > > that rope when the only sensible options are page/folio based kernel
> > > > memory (including large/huge folios) and dmabuf.
> > >
> > > I believe there is at least one deep confusion here, considering you
> > > previously mentioned Keith's pre-mapping patches. The "hooks" are not
> > > that about in what format you pass memory, it's arguably the least
> > > interesting part for page pool, more or less it'd circulate whatever
> > > is given. It's more of how to have a better control over buffer lifetime
> > > and implement a buffer pool passing data to users and empty buffers
> > > back.
> >
> > Isn't that more or less exactly what dmabuf is? Why do you need
> > another almost dma-buf thing for another project?
>
> That's the exact point I've been making since the last round of
> the series.  We don't need to reinvent dmabuf poorly in every
> subsystem, but instead fix the odd parts in it and make it suitable
> for everyone.
>


FWIW the change Christoph is requesting is straightforward from my
POV and doesn't really hurt the devmem use case. I'd basically remove
the ops and add an if statement in the slow path where the ops are
being used to alloc/free from dmabuf instead of alloc_pages().
Something like (very rough, doesn't compile):

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 92be1aaf18ccc..2cc986455bce6 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -557,8 +557,8 @@ netmem_ref page_pool_alloc_netmem(struct page_pool *pool, gfp_t gfp)
                return netmem;

        /* Slow-path: cache empty, do real allocation */
-       if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
-               netmem = pool->mp_ops->alloc_pages(pool, gfp);
+       if (page_pool_is_dmabuf(pool))
+               netmem = mp_dmabuf_devmem_alloc_pages();
        else
                netmem = __page_pool_alloc_pages_slow(pool, gfp);
        return netmem;


The folks that will be negatively impacted by this are
Jakub/Pavel/David. I think all were planning to extend the hooks for
io_uring or other memory types.

Pavel/David, AFAICT you have these options here (but maybe you can
think of more):

1. Align with devmem TCP to use udmabuf for your io_uring memory. I
think in the past you said it's a uapi you don't like, but in the face
of this pushback you may want to reconsider.

2. Follow the example of devmem TCP and add another if statement to
alloc from io_uring, so something like:

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 92be1aaf18ccc..3545bb82c7d05 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -557,8 +557,10 @@ netmem_ref page_pool_alloc_netmem(struct page_pool *pool, gfp_t gfp)
                return netmem;

        /* Slow-path: cache empty, do real allocation */
-       if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
-               netmem = pool->mp_ops->alloc_pages(pool, gfp);
+       if (page_pool_is_dmabuf(pool))
+               netmem = mp_dmabuf_devmem_alloc_pages();
+       else if (page_pool_is_io_uring(pool))
+               netmem = mp_io_uring_alloc_pages();
        else
                netmem = __page_pool_alloc_pages_slow(pool, gfp);
        return netmem;

Note that Christoph/Jason may not like you adding non-dmabuf io_uring
backing memory in the first place, so there may be pushback against
this approach.

3. Push back on the nack on this thread. It seems you're already
discussing this. I'll see what happens.
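
To make the trade-off between these options concrete, here is a
userspace mock (not kernel code) contrasting the ops-table dispatch
with the explicit branches. Every type, function, and return value
below is a hypothetical stand-in invented for illustration; only the
shape of the two slow paths follows the diffs above.

```c
#include <stddef.h>

/* Mock stand-ins; these are NOT the kernel's real definitions. */
typedef unsigned long netmem_ref;

enum pool_kind { POOL_PAGES, POOL_DMABUF, POOL_IO_URING };

struct page_pool;

/* Ops-table style: any provider can plug in through a function pointer. */
struct memory_provider_ops {
	netmem_ref (*alloc_pages)(struct page_pool *pool);
};

struct page_pool {
	enum pool_kind kind;                      /* used by the branch variant */
	const struct memory_provider_ops *mp_ops; /* NULL means plain pages */
};

/* Fake allocators returning distinct marker values so callers can be told apart. */
static netmem_ref alloc_pages_slow(struct page_pool *pool)
{
	(void)pool;
	return 0x1000;
}

static netmem_ref dmabuf_alloc(struct page_pool *pool)
{
	(void)pool;
	return 0x2000;
}

static netmem_ref io_uring_alloc(struct page_pool *pool)
{
	(void)pool;
	return 0x3000;
}

const struct memory_provider_ops dmabuf_ops = { .alloc_pages = dmabuf_alloc };

/* Variant A: open-ended, the pool calls whatever ops it was handed. */
netmem_ref alloc_via_ops(struct page_pool *pool)
{
	if (pool->mp_ops)
		return pool->mp_ops->alloc_pages(pool);
	return alloc_pages_slow(pool);
}

/* Variant B: closed set, every memory type is a visible branch in core code. */
netmem_ref alloc_via_branches(struct page_pool *pool)
{
	if (pool->kind == POOL_DMABUF)
		return dmabuf_alloc(pool);
	else if (pool->kind == POOL_IO_URING)
		return io_uring_alloc(pool);
	return alloc_pages_slow(pool);
}
```

Both variants pick the same allocator for a plain or dmabuf pool. The
difference is who gets to add a new memory type: with variant A anyone
who can set mp_ops, with variant B only a change to the core
allocation path, which is where the pushback in option 2 would land.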

To be honest, I think the GVE queue API has just been merged, so I'm
now unblocked on sending non-RFC versions of this work, and I'm hoping
to send the next version soon. I may apply these changes in the next
version for more discussion, or leave it as is and carry the nack
until the conversation converges.

-- 
Thanks,
Mina
