* [RFC 0/8] ARM64 TLB logic revision for more control and enhanced diagnostics
@ 2023-12-04 19:15 Christoph Lameter
From: Christoph Lameter @ 2023-12-04 19:15 UTC
To: linux-arm-kernel
Cc: tokamoto@jp.fujitsu.com, qi.fuli@fujitsu.com, Takao Indoh,
Will Deacon
WARNING: This is a first draft of the patchset. It has been tested with
kernel compiles and may corrupt your memory.
This patchset is intended to help with debugging and scaling TLB operations
on ARM64. This is particularly desirable for ARM systems with large numbers
of cores, where the mesh is often flooded with snoop traffic related to
TLBI broadcasts.
The patchset adds the following features:
- Allow diagnostics via /proc/vmstat, as is already possible on x86 with the
CONFIG_DEBUG_TLBFLUSH option.
Some sample output:
cat /proc/vmstat
...
nr_tlb_remote_flush            554104   Flushes converted to IPIs with local flushing
nr_tlb_remote_flush_received  1312243   IPIs received to perform local flushing
nr_tlb_local_flush_all         141837   Local flush alls
nr_tlb_local_flush_range       571784   Local flush range
nr_tlb_local_flush_one        8011880   Local individual page flushes
nr_tlb_flush_all                28239   Flush alls through the mesh
nr_tlb_flush_range              13003   Flush range through the mesh
nr_tlb_flush_one                54764   Flush one through the mesh
nr_tlb_skipped                      0   Suppressed flush
- Track the cores that have used an address space. From that we can
compute the cpu weight (the number of cpus that have used the address
space) and decide how best to do the flushing when a flush is
required.
- Control the TLB flushing behavior via the kernel command line and also on a running system.
New kernel parameter tlb_mode=<tlb_mode>
New debugfs setting /sys/kernel/debug/tlb_mode
tlb_mode consists of a set of feature flags starting at bit 10. Bits 0-9
set the boundary for the cpu weight that leads to a mesh flush. If the
cpu weight is lower, then IPIs are sent instead, avoiding the mesh.
Feature flags:
Bit 10 = If the current cpu is the only one that has ever used an
address space then perform local invalidation.
This catches the majority of flushes on boot and
activities of typical single threaded Unixy processes.
Bit 11 = Enable TLB range. Various hardware has problems with TLB range.
This allows the kernel to recognize that TLB range
should not be used and an alternate method is to be
used to do the flushing.
Bit 12 = Suppress TLB flushes if the address space is unused.
If this bit is set and a flush is requested in an
unused address space then no flush will be performed
since there cannot be any TLB entries. If this bit is
not set then a mesh flush is performed (just to be sure).
- Autotunes the feature flags on bootup if the user has not specified tlb_mode.
Calculates an optimal balance between IPIs and mesh flushing
based on the number of cpus in the system. Always enables local
invalidation, and enables TLB range flushing if the processor features
indicate that the processor supports it.
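To make the tlb_mode layout above concrete, here is a minimal userspace sketch in Python. It is not code from the patchset. The bit positions come from the description above; every name, the treatment of an unused address space as cpu weight 0, and the order of the checks are assumptions made for illustration.

```python
# Hypothetical decoder for a tlb_mode value. Bit positions follow the
# description above; all names here are made up for illustration.
THRESHOLD_MASK = (1 << 10) - 1   # bits 0-9: cpu weight boundary for mesh flushes
FLAG_LOCAL_ONLY = 1 << 10        # bit 10: local invalidation for a single-cpu mm
FLAG_TLB_RANGE = 1 << 11         # bit 11: TLB range operations enabled
FLAG_SKIP_UNUSED = 1 << 12       # bit 12: suppress flushes for unused address spaces


def decode_tlb_mode(mode):
    """Split a tlb_mode integer into its threshold and feature flags."""
    return {
        "threshold": mode & THRESHOLD_MASK,
        "local_only": bool(mode & FLAG_LOCAL_ONLY),
        "tlb_range": bool(mode & FLAG_TLB_RANGE),
        "skip_unused": bool(mode & FLAG_SKIP_UNUSED),
    }


def choose_flush(mode, cpu_weight, current_cpu_is_only_user):
    """Return 'skip', 'local', 'ipi' or 'mesh' (assumed decision order)."""
    m = decode_tlb_mode(mode)
    if cpu_weight == 0:
        # Assumption: an unused address space has cpu weight 0.
        return "skip" if m["skip_unused"] else "mesh"
    if m["local_only"] and cpu_weight == 1 and current_cpu_is_only_user:
        return "local"
    if cpu_weight < m["threshold"]:
        # Below the boundary: IPIs with local flushing, avoiding the mesh.
        return "ipi"
    return "mesh"
```

For example, choose_flush(8 | FLAG_LOCAL_ONLY, 4, False) selects IPIs because the weight 4 is below the boundary 8, while a weight of 8 or more falls through to a mesh flush.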
We need a more detailed description but I hope this is enough to get started.
These issues were discussed before, in 2019, in a patchset that contains a
similar feature:
https://lore.kernel.org/linux-arm-kernel/20190617143255.10462-1-indou.takao@jp.fujitsu.com/
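For anyone who wants to compare runs, the nr_tlb_* counters from the sample /proc/vmstat output can be summarised with a short script. This is only a sketch: the counter names are the ones from the sample output above, the helper names are made up, and treating "mesh share" as mesh flushes divided by mesh plus local flushes is an assumption about how the counters relate, not something the patchset defines.

```python
# A few lines in the shape of the sample output above (name, value, note).
SAMPLE = """\
nr_tlb_local_flush_all 141837 Local flush alls
nr_tlb_local_flush_range 571784 Local flush range
nr_tlb_local_flush_one 8011880 Local individual page flushes
nr_tlb_flush_all 28239 Flush alls through the mesh
nr_tlb_flush_range 13003 Flush range through the mesh
nr_tlb_flush_one 54764 Flush one through the mesh
nr_tlb_skipped 0 Suppressed flush
"""

MESH_KEYS = ("nr_tlb_flush_all", "nr_tlb_flush_range", "nr_tlb_flush_one")
LOCAL_KEYS = ("nr_tlb_local_flush_all", "nr_tlb_local_flush_range",
              "nr_tlb_local_flush_one")


def tlb_counters(vmstat_text):
    """Collect the nr_tlb_* counters from /proc/vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        fields = line.split()
        # Real /proc/vmstat lines are "name value"; trailing notes are ignored.
        if (len(fields) >= 2 and fields[0].startswith("nr_tlb_")
                and fields[1].isdigit()):
            counters[fields[0]] = int(fields[1])
    return counters


def mesh_share(counters):
    """Rough fraction of flush operations that went through the mesh."""
    mesh = sum(counters.get(key, 0) for key in MESH_KEYS)
    local = sum(counters.get(key, 0) for key in LOCAL_KEYS)
    total = mesh + local
    return mesh / total if total else 0.0
```

On a kernel that exports these counters, the same helpers could be fed the contents of /proc/vmstat instead of SAMPLE.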
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [RFC 0/8] ARM64 TLB logic revision for more control and enhanced diagnostics
@ 2023-12-05 0:25 Christoph Lameter
From: Christoph Lameter @ 2023-12-05 0:25 UTC
To: linux-arm-kernel
Cc: tokamoto@jp.fujitsu.com, qi.fuli@fujitsu.com, Takao Indoh,
Will Deacon, Catalin Marinas, Jon Masters
* [RFC 0/8] ARM64 TLB logic revision for more control and enhanced diagnostics
@ 2023-12-07 3:54 Christoph Lameter
From: Christoph Lameter @ 2023-12-07 3:54 UTC
To: linux-arm-kernel
Cc: tokamoto@jp.fujitsu.com, qi.fuli@fujitsu.com, Takao Indoh,
Will Deacon, Catalin Marinas, Jon Masters
* [RFC 0/8] ARM64 TLB logic revision for more control and enhanced diagnostics
@ 2023-12-07 3:57 Christoph Lameter
From: Christoph Lameter @ 2023-12-07 3:57 UTC
To: linux-arm-kernel
Cc: tokamoto@jp.fujitsu.com, qi.fuli@fujitsu.com, Takao Indoh,
Will Deacon, Catalin Marinas, Jon Masters
* Re: [RFC 0/8] ARM64 TLB logic revision for more control and enhanced diagnostics
2023-12-07 3:54 Christoph Lameter
@ 2023-12-07 4:23 ` Christoph Lameter (Ampere)
From: Christoph Lameter (Ampere) @ 2023-12-07 4:23 UTC
To: linux-arm-kernel
Cc: tokamoto@jp.fujitsu.com, qi.fuli@fujitsu.com, Takao Indoh,
Will Deacon, Catalin Marinas, Jon Masters
The patches in the patchset were rejected by the infradead
mailer. I tried multiple times.
In the meantime, the patchset can be found on kernel.org:
https://git.kernel.org/pub/scm/linux/kernel/git/christoph/linux.git/log/?h=tlb
Could someone give me a clue as to why the mailing list does not take
patches from quilt anymore?
I got the following back from the mail server. Everything seems to check
out as far as I can see. I tried from multiple email addresses, but no luck.
From linux-arm-kernel-bounces+cl=gentwo.org@lists.infradead.org Wed Dec
6 19:57:29 2023
Return-Path: <linux-arm-kernel-bounces+cl=gentwo.org@lists.infradead.org>
X-Original-To: cl@gentwo.org
Delivered-To: cl@gentwo.org
Received: from bombadil.infradead.org (bombadil.infradead.org
[IPv6:2607:7c80:54:3::133])
by gentwo.org (Postfix) with ESMTPS id 4B49B48F4C
for <cl@gentwo.org>; Wed, 6 Dec 2023 19:57:29 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
d=lists.infradead.org; s=bombadil.20210309;
h=Sender:List-Id:Date:Message-ID:
To:From:Subject:Content-Transfer-Encoding:Content-Type:MIME-Version:Reply-To:
Cc:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender:
Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References:List-Help:
List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive;
bh=zWvnfDmYfVoivQxCjAVOhgbHxeaE+nz22oVa3ZjNTZE=;
b=SFPz6W4YOfzjphrltl0TcQqdFa
+scg6wXD/ngEjw6lLUJ06adn7Jk1WbG79r1ejLjWoWdQWFJSjkOYMl7Ti98m0OnmO6e2xOvTUuS5B
cqTPlIHyRx2vw7Zs68zVQdIsp3CyJYqwJQQkGsqF97ZdLCzrOcburj7MI1wDzRxpIMSXh3d9eBGOP
jMacUI6fgRYxge6KmSo4Pd3g0bm2lXSSQJPiRS7Wj5wdwXgXs8LOP+RTv436VcBWfuC3Ct06Xujc2
hLPEokLaYBauB1b7oFDCGQn1NsoNwz4BFcoFqjl5FalQvrtU5xj/ICDTMq9L4xDbFIXUtYEr3dChP
ierxmTgg==;
Received: from localhost ([::1] helo=bombadil.infradead.org)
by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat
Linux))
id 1rB5Vw-00Bnyx-3B
for cl@gentwo.org;
Thu, 07 Dec 2023 03:57:28 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Subject: Your message to linux-arm-kernel awaits moderator approval
From: linux-arm-kernel-owner@lists.infradead.org
To: cl@gentwo.org
Message-ID:
<mailman.61130.1701921444.1880391.linux-arm-kernel@lists.infradead.org>
Date: Wed, 06 Dec 2023 19:57:24 -0800
Precedence: bulk
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.34
List-Id: <linux-arm-kernel.lists.infradead.org>
X-List-Administrivia: yes
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: linux-arm-kernel-bounces+cl=gentwo.org@lists.infradead.org
Status: O
X-Status:
X-Keywords:
X-UID: 10563
Your mail to 'linux-arm-kernel' with the subject
[RFC 8/8] Remove temporary _count_vm_tlb_event
Is being held until the list moderator can review it for approval.
The reason it is being held:
Message has a suspicious header
Either the message will get posted to the list, or you will receive
notification of the moderator's decision. If you would like to cancel
this posting, please visit the following URL:
http://lists.infradead.org/mailman/confirm/linux-arm-kernel/b301494b26df71c14e8d923da3f452472fb6186f