From: Yunsheng Lin <linyunsheng@huawei.com>
To: Mat Martineau <martineau@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>, <davem@davemloft.net>,
<kuba@kernel.org>, <netdev@vger.kernel.org>,
<linux-kernel@vger.kernel.org>,
Ayush Sawal <ayush.sawal@chelsio.com>,
Eric Dumazet <edumazet@google.com>,
Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
Jason Wang <jasowang@redhat.com>, Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
Valentin Schneider <vschneid@redhat.com>,
John Fastabend <john.fastabend@gmail.com>,
Jakub Sitnicki <jakub@cloudflare.com>,
David Ahern <dsahern@kernel.org>,
Matthieu Baerts <matttbe@kernel.org>,
Geliang Tang <geliang@kernel.org>,
Boris Pismenny <borisp@nvidia.com>, <bpf@vger.kernel.org>,
<mptcp@lists.linux.dev>
Subject: Re: [PATCH net-next v2 13/15] net: replace page_frag with page_frag_cache
Date: Fri, 19 Apr 2024 20:37:53 +0800 [thread overview]
Message-ID: <e1e7ebcf-8c7a-8ef4-c48a-3ddfa05d22bb@huawei.com> (raw)
In-Reply-To: <83991c67-8e4a-c287-b4a5-5dbba8835947@kernel.org>
On 2024/4/17 5:40, Mat Martineau wrote:
...
> I wasn't concerned as much about the direct cost of the inlined page_frag_alloc_commit() helper; rather, it was that we could make fewer prepare calls if the commit was deferred as long as possible. As we discussed above, I see now that the prepare is not expensive when there is more space available in the current frag.
>
>> Maybe what we could do is to do the prepare in an inline
>> helper instead of a function call when the cache has enough
>> space, so that we can avoid a function call as the old code
>> does; an inlined function has less overhead and is generally
>> faster than a function call.
>>
>> But that requires more refactoring, and as this patchset is
>> big enough already, I guess we can try it later if possible.
>
> A more generic (possible) optimization would be to inline some of page_frag_cache_refill(), but I'm not sure the code size tradeoff is worth it - would have to collect some data to find out for sure!
On my arm64 system, inlining some of page_frag_cache_refill() seems to result in a smaller kernel code size, as shown below. I have not done the performance test yet, but it seems odd that inlining leads to smaller code size here.
Without this patchset:
linyunsheng@ubuntu:~/ksize$ ./ksize.sh
Linux Kernel total | text data bss
--------------------------------------------------------------------------------
vmlinux 51974159 | 32238573 19087890 647696
With this patchset:
linyunsheng@ubuntu:~/ksize$ ./ksize.sh
Linux Kernel total | text data bss
--------------------------------------------------------------------------------
vmlinux 51970078 | 32234468 19087914 647696
With this patchset and below patch inlining some of page_frag_cache_refill():
linyunsheng@ubuntu:~/ksize$ ./ksize.sh
Linux Kernel total | text data bss
--------------------------------------------------------------------------------
vmlinux 51966078 | 32230468 19087914 647696
diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
index 529e7c040dad..3e1de20c8146 100644
--- a/include/linux/page_frag_cache.h
+++ b/include/linux/page_frag_cache.h
@@ -60,8 +60,33 @@ static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
void page_frag_cache_drain(struct page_frag_cache *nc);
void __page_frag_cache_drain(struct page *page, unsigned int count);
-void *page_frag_cache_refill(struct page_frag_cache *nc, unsigned int fragsz,
- gfp_t gfp_mask);
+void *__page_frag_cache_refill(struct page_frag_cache *nc, gfp_t gfp_mask);
+void *page_frag_cache_flush(struct page_frag_cache *nc, gfp_t gfp_mask);
+
+static inline void *page_frag_cache_refill(struct page_frag_cache *nc,
+ unsigned int fragsz, gfp_t gfp_mask)
+{
+ unsigned long size_mask;
+ void *va;
+
+ if (unlikely(!nc->va)) {
+ va = __page_frag_cache_refill(nc, gfp_mask);
+ if (likely(va))
+ return va;
+ }
+
+#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
+ /* if size can vary use size else just use PAGE_SIZE */
+ size_mask = nc->size_mask;
+#else
+ size_mask = PAGE_SIZE - 1;
+#endif
+
+ if (unlikely(nc->offset + fragsz > (size_mask + 1)))
+ return page_frag_cache_flush(nc, gfp_mask);
+
+ return (void *)((unsigned long)nc->va & ~size_mask);
+}
/**
* page_frag_alloc_va() - Alloc a page fragment.
diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
index 8b1d35aafcc1..74f643d472fb 100644
--- a/mm/page_frag_cache.c
+++ b/mm/page_frag_cache.c
@@ -18,8 +18,7 @@
#include <linux/page_frag_cache.h>
#include "internal.h"
-static bool __page_frag_cache_refill(struct page_frag_cache *nc,
- gfp_t gfp_mask)
+void *__page_frag_cache_refill(struct page_frag_cache *nc, gfp_t gfp_mask)
{
struct page *page = NULL;
gfp_t gfp = gfp_mask;
@@ -40,7 +39,7 @@ static bool __page_frag_cache_refill(struct page_frag_cache *nc,
if (unlikely(!page)) {
nc->va = NULL;
- return false;
+ return NULL;
}
nc->va = page_address(page);
@@ -57,8 +56,9 @@ static bool __page_frag_cache_refill(struct page_frag_cache *nc,
nc->pfmemalloc = page_is_pfmemalloc(page);
nc->offset = 0;
- return true;
+ return nc->va;
}
+EXPORT_SYMBOL(__page_frag_cache_refill);
/**
* page_frag_cache_drain - Drain the current page from page_frag cache.
@@ -83,20 +83,12 @@ void __page_frag_cache_drain(struct page *page, unsigned int count)
}
EXPORT_SYMBOL(__page_frag_cache_drain);
-void *page_frag_cache_refill(struct page_frag_cache *nc, unsigned int fragsz,
- gfp_t gfp_mask)
+void *page_frag_cache_flush(struct page_frag_cache *nc, gfp_t gfp_mask)
{
unsigned long size_mask;
- unsigned int offset;
struct page *page;
void *va;
- if (unlikely(!nc->va)) {
-refill:
- if (!__page_frag_cache_refill(nc, gfp_mask))
- return NULL;
- }
-
#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
/* if size can vary use size else just use PAGE_SIZE */
size_mask = nc->size_mask;
@@ -105,41 +97,23 @@ void *page_frag_cache_refill(struct page_frag_cache *nc, unsigned int fragsz,
#endif
va = (void *)((unsigned long)nc->va & ~size_mask);
- offset = nc->offset;
-
- if (unlikely(offset + fragsz > (size_mask + 1))) {
- page = virt_to_page(va);
-
- if (!page_ref_sub_and_test(page, nc->pagecnt_bias & size_mask))
- goto refill;
-
- if (unlikely(nc->pfmemalloc)) {
- free_unref_page(page, compound_order(page));
- goto refill;
- }
-
- /* OK, page count is 0, we can safely set it */
- set_page_count(page, size_mask);
- nc->pagecnt_bias |= size_mask;
-
- nc->offset = 0;
- if (unlikely(fragsz > (size_mask + 1))) {
- /*
- * The caller is trying to allocate a fragment
- * with fragsz > PAGE_SIZE but the cache isn't big
- * enough to satisfy the request, this may
- * happen in low memory conditions.
- * We don't release the cache page because
- * it could make memory pressure worse
- * so we simply return NULL here.
- */
- return NULL;
- }
+ page = virt_to_page(va);
+ if (!page_ref_sub_and_test(page, nc->pagecnt_bias & size_mask))
+ return __page_frag_cache_refill(nc, gfp_mask);
+
+ if (unlikely(nc->pfmemalloc)) {
+ free_unref_page(page, compound_order(page));
+ return __page_frag_cache_refill(nc, gfp_mask);
}
+ /* OK, page count is 0, we can safely set it */
+ set_page_count(page, size_mask);
+ nc->pagecnt_bias |= size_mask;
+ nc->offset = 0;
+
return va;
}
-EXPORT_SYMBOL(page_frag_cache_refill);
+EXPORT_SYMBOL(page_frag_cache_flush);
/*
* Frees a page fragment allocated out of either a compound or order 0 page.
>
> Thanks,
>
> Mat
>
> .
>