From: David Rientjes <rientjes@google.com>
To: Jianfeng Wang <jianfeng.w.wang@oracle.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, vbabka@suse.cz,
	 cl@linux.com, akpm@linux-foundation.org, penberg@kernel.org
Subject: Re: [PATCH v3 1/2] slub: introduce count_partial_free_approx()
Date: Fri, 19 Apr 2024 17:18:18 -0700 (PDT)
Message-ID: <3e5d2937-76ab-546b-9ce8-7e7140424278@google.com>
In-Reply-To: <20240419175611.47413-2-jianfeng.w.wang@oracle.com>

On Fri, 19 Apr 2024, Jianfeng Wang wrote:

> diff --git a/mm/slub.c b/mm/slub.c
> index 1bb2a93cf7b6..993cbbdd2b6c 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3213,6 +3213,43 @@ static inline bool free_debug_processing(struct kmem_cache *s,
>  #endif /* CONFIG_SLUB_DEBUG */
>  
>  #if defined(CONFIG_SLUB_DEBUG) || defined(SLAB_SUPPORTS_SYSFS)
> +#define MAX_PARTIAL_TO_SCAN 10000
> +
> +static unsigned long count_partial_free_approx(struct kmem_cache_node *n)
> +{
> +	unsigned long flags;
> +	unsigned long x = 0;
> +	struct slab *slab;
> +
> +	spin_lock_irqsave(&n->list_lock, flags);
> +	if (n->nr_partial <= MAX_PARTIAL_TO_SCAN) {
> +		list_for_each_entry(slab, &n->partial, slab_list)
> +			x += slab->objects - slab->inuse;
> +	} else {
> +		/*
> +		 * For a long list, approximate the total count of objects in
> +		 * it to meet the limit on the number of slabs to scan.
> +		 * Scan from both the list's head and tail for better accuracy.
> +		 */
> +		unsigned long scanned = 0;
> +
> +		list_for_each_entry(slab, &n->partial, slab_list) {
> +			x += slab->objects - slab->inuse;
> +			if (++scanned == MAX_PARTIAL_TO_SCAN / 2)
> +				break;
> +		}
> +		list_for_each_entry_reverse(slab, &n->partial, slab_list) {
> +			x += slab->objects - slab->inuse;
> +			if (++scanned == MAX_PARTIAL_TO_SCAN)
> +				break;
> +		}
> +		x = mult_frac(x, n->nr_partial, scanned);
> +		x = min(x, node_nr_objs(n));
> +	}
> +	spin_unlock_irqrestore(&n->list_lock, flags);
> +	return x;
> +}

Creative :)

Does the default value of MAX_PARTIAL_TO_SCAN work well in practice, while 
still being large enough to bias the estimate toward the actual value?

I can't think of a better way to avoid the disruption that very long 
partial lists cause.  If the actual value is needed, it will need to be 
read from the sysfs file for that slab cache.

It does raise the question of whether we want to extend slabinfo to indicate 
that some fields are approximations, however.  Adding a suffix such as 
" : approx" to a slab cache line may be helpful if the disparity in the 
estimates would actually make a difference in practice.

I have a hard time believing that this approximation will not be "close 
enough" for all practical purposes, given that the value could very well 
substantially change the instant after the iteration is done anyway.
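
For readers who want to see how close the estimate gets, here is a minimal
userspace sketch of the same sampling idea. It is not the kernel code: the
partial list is modelled as a plain array of per-slab free counts, and the
names count_free_exact(), count_free_approx(), free_objs and total_objs are
hypothetical stand-ins (total_objs plays the role of node_nr_objs(n)). The
scaling step uses a widened multiply where the patch uses mult_frac().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARTIAL_TO_SCAN 10000UL

/* Exact count: walk every modelled slab, as count_partial() would. */
static unsigned long count_free_exact(const unsigned int *free_objs,
				      unsigned long n)
{
	unsigned long x = 0;
	unsigned long i;

	for (i = 0; i < n; i++)
		x += free_objs[i];
	return x;
}

/*
 * Approximate count: for a long list, sample MAX_PARTIAL_TO_SCAN / 2
 * entries from the head and the same number from the tail, scale the
 * sampled sum by n / scanned, then cap it at total_objs.
 */
static unsigned long count_free_approx(const unsigned int *free_objs,
				       unsigned long n,
				       unsigned long total_objs)
{
	unsigned long x = 0, scanned = 0;
	unsigned long i;

	if (n <= MAX_PARTIAL_TO_SCAN)
		return count_free_exact(free_objs, n);

	/* Head half of the sample. */
	for (i = 0; i < n; i++) {
		x += free_objs[i];
		if (++scanned == MAX_PARTIAL_TO_SCAN / 2)
			break;
	}
	/* Tail half of the sample; n > MAX_PARTIAL_TO_SCAN, so no overlap. */
	for (i = n; i-- > 0; ) {
		x += free_objs[i];
		if (++scanned == MAX_PARTIAL_TO_SCAN)
			break;
	}

	assert(scanned > 0);
	/* Widen before multiplying so the intermediate product cannot wrap. */
	x = (unsigned long)((unsigned long long)x * n / scanned);
	return x < total_objs ? x : total_objs;
}
```

With a uniform list the estimate matches the exact count. If the free
objects are concentrated at the two ends of the list, the estimate
overshoots, and the final cap is what keeps it from exceeding the total
object count.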

So for that reason, this sounds good to me!

Acked-by: David Rientjes <rientjes@google.com>

> +
>  static unsigned long count_partial(struct kmem_cache_node *n,
>  					int (*get_count)(struct slab *))
>  {
> @@ -7089,7 +7126,7 @@ void get_slabinfo(struct kmem_cache *s, struct slabinfo *sinfo)
>  	for_each_kmem_cache_node(s, node, n) {
>  		nr_slabs += node_nr_slabs(n);
>  		nr_objs += node_nr_objs(n);
> -		nr_free += count_partial(n, count_free);
> +		nr_free += count_partial_free_approx(n);
>  	}
>  
>  	sinfo->active_objs = nr_objs - nr_free;


Thread overview: 7+ messages
2024-04-19 17:56 [PATCH v3 0/2] slub: introduce count_partial_free_approx() Jianfeng Wang
2024-04-19 17:56 ` [PATCH v3 1/2] " Jianfeng Wang
2024-04-20  0:18   ` David Rientjes [this message]
2024-04-22  7:49     ` Vlastimil Babka
2024-04-19 17:56 ` [PATCH v3 2/2] slub: use count_partial_free_approx() in slab_out_of_memory() Jianfeng Wang
2024-04-20  0:18   ` David Rientjes
2024-04-22  7:56 ` [PATCH v3 0/2] slub: introduce count_partial_free_approx() Vlastimil Babka
