* Re: [PATCH v5 0/6] Add hardware prefetch control driver for A64FX and x86
       [not found] <20220607120530.2447112-1-tarumizu.kohei@fujitsu.com>
From: Linus Walleij @ 2022-06-10 13:48 UTC
  To: Kohei Tarumizu, Mel Gorman, Linux Memory Management List,
	Michal Hocko, Andrew Morton
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	rafael, lenb, gregkh, eugenis, tony.luck, pcc, peterz, marcos,
	marcan, nicolas.ferre, conor.dooley, arnd, ast, peter.chen, kuba,
	linux-kernel, linux-arm-kernel, linux-acpi, Paolo Valente

On Tue, Jun 7, 2022 at 2:07 PM Kohei Tarumizu
<tarumizu.kohei@fujitsu.com> wrote:

> This patch series adds a sysfs interface to control the CPU's hardware
> prefetch behavior for performance tuning from userspace on A64FX and
> x86 processors (on supported CPUs).

OK

> A64FX and some Intel processors have implementation-dependent register
> for controlling CPU's hardware prefetch behavior. A64FX has
> IMP_PF_STREAM_DETECT_CTRL_EL0[1], and Intel processors have MSR 0x1a4
> (MSR_MISC_FEATURE_CONTROL)[2].

Hardware prefetch (I guess of memory contents) is a memory hierarchy feature.

Linux has a memory hierarchy manager, conveniently named "mm",
developed by some of the smartest people I know. The main problem
addressed by that is paging, but prefetching into the CPU from the
next lowest level in the memory hierarchy is just another memory
hierarchy hardware feature, such as hard disks, primary RAM etc.

> These registers cannot be accessed from userspace.

Good. The kernel manages hardware. If the memory hierarchy people have
userspace now doing stuff behind their back, through some special
interface, that makes their world more complicated.

This looks like it needs information from the generic memory manager,
from the scheduler, and possibly all the way down from the block
layer to do the right thing, so it has no business in userspace.
Have you seen mm/damon for example? Access to statistics for
memory access patterns seems really useful for tuning the behaviour
of this hardware. Just my €0.01.

If it does interact with userspace I suppose it should be using control
groups, like everything else of this type, see e.g. mm/memcontrol.c,
not custom sysfs files.

Just an example from one of the patches:

+                       - "* Adjacent Cache Line Prefetcher Disable (R/W)"
+                           corresponds to the "adjacent_cache_line_prefetcher_enable"
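
To make the polarity inversion in that excerpt concrete, here is a
minimal sketch of how a hardware "disable" bit could be presented as an
"enable" attribute. It is not code from the patch series: the helper
names are hypothetical, and the bit position (bit 1 of MSR 0x1a4) is an
assumption based on Intel's public description of the register.

```python
# Illustrative only: not taken from the patch series.
MSR_MISC_FEATURE_CONTROL = 0x1A4   # MSR number named in the cover letter
ADJACENT_LINE_DISABLE_BIT = 1      # assumed bit position of the "Disable" flag


def adjacent_line_prefetcher_enabled(msr_value: int) -> bool:
    """Return the 'enable' view of the hardware 'disable' bit."""
    disabled = (msr_value >> ADJACENT_LINE_DISABLE_BIT) & 1
    return not disabled


def set_adjacent_line_prefetcher(msr_value: int, enable: bool) -> int:
    """Return a new register value with the disable bit cleared or set."""
    mask = 1 << ADJACENT_LINE_DISABLE_BIT
    if enable:
        return msr_value & ~mask   # enable == clear the disable bit
    return msr_value | mask        # disable == set the disable bit
```

The other bits of the register are left untouched, which matters
because the same MSR is described as controlling several prefetchers.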

I might only be on "a little knowledge is dangerous" on the memory
manager topics, but I know for sure that they at times adjust the members
of structs to fit nicely on cache lines. And now this? It looks really useful
for kernel machinery that knows very well what needs to go into the cache
line next and when.

Talk to the people on linux-mm and memory maintainer Andrew Morton on
how to do this right, it's a really interesting feature! Also given
that people say that the memory hierarchy is an important part in the
performance of the Apple M1 (M2) silicon, I expect that machine to
have this too?

Yours,
Linus Walleij



* RE: [PATCH v5 0/6] Add hardware prefetch control driver for A64FX and x86
From: tarumizu.kohei @ 2022-06-17  9:06 UTC
  To: 'Linus Walleij', Mel Gorman, Linux Memory Management List,
	Michal Hocko, Andrew Morton
  Cc: catalin.marinas@arm.com, will@kernel.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com, rafael@kernel.org, lenb@kernel.org,
	gregkh@linuxfoundation.org, eugenis@google.com,
	tony.luck@intel.com, pcc@google.com, peterz@infradead.org,
	marcos@orca.pet, marcan@marcan.st, nicolas.ferre@microchip.com,
	conor.dooley@microchip.com, arnd@arndb.de, ast@kernel.org,
	peter.chen@kernel.org, kuba@kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-acpi@vger.kernel.org,
	Paolo Valente

Hi Linus,

Thanks for the comment.

> OK
> 
> > A64FX and some Intel processors have implementation-dependent register
> > for controlling CPU's hardware prefetch behavior. A64FX has
> > IMP_PF_STREAM_DETECT_CTRL_EL0[1], and Intel processors have MSR 0x1a4
> > (MSR_MISC_FEATURE_CONTROL)[2].
> 
> Hardware prefetch (I guess of memory contents) is a memory hierarchy feature.
> 
> Linux has a memory hierarchy manager, conveniently named "mm", developed
> by some of the smartest people I know. The main problem addressed by that is
> paging, but prefetching into the CPU from the next lowest level in the memory
> hierarchy is just another memory hierarchy hardware feature, such as hard
> disks, primary RAM etc.
> 
> > These registers cannot be accessed from userspace.
> 
> Good. The kernel manages hardware. If the memory hierarchy people have
> userspace now doing stuff behind their back, through some special interface,
> that makes their world more complicated.
> 
> This looks like it needs information from the generic memory manager, from the
> scheduler, and possibly all the way down from the block layer to do the right
> thing, so it has no business in userspace.
> Have you seen mm/damon for example? Access to statistics for memory
> access patterns seems really useful for tuning the behaviour of this hardware.
> Just my €0.01.

Thank you for the information. I will see if mm/damon statistics can
be used for tuning.

> If it does interact with userspace I suppose it should be using control groups,
> like everything else of this type, see e.g. mm/memcontrol.c, not custom sysfs
> files.

Hardware prefetch registers exist for each core, and the settings are
independent for each cache. Therefore, currently, I created it under
/sys/devices/system/cpu/cpu*/cache/index*.
However, when users actually configure it for an application, they may
want to set it on a per-process basis. Considering that, I think
control groups are suitable for this usage.
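
As a rough illustration of the current per-core, per-cache layout, the
sketch below lists the cache/index* directories found under each cpu*
directory. It only walks the directory tree described above; the
function name and the sysfs_root parameter are hypothetical, and it does
not touch any prefetch attribute added by this series.

```python
import glob
import os
from typing import Dict, List


def cache_index_dirs(sysfs_root: str = "/sys/devices/system/cpu") -> Dict[str, List[str]]:
    """Map each cpuN directory name to its cache/indexM names (lexically sorted)."""
    result: Dict[str, List[str]] = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cache", "index[0-9]*")
    for path in glob.glob(pattern):
        cache_dir, index = os.path.split(path)            # .../cpuN/cache, indexM
        cpu = os.path.basename(os.path.dirname(cache_dir))  # cpuN
        result.setdefault(cpu, []).append(index)
    for indexes in result.values():
        indexes.sort()
    return result
```

Each (cpu, index) pair in the result corresponds to one place where a
setting would have to be written, which is why a per-process interface
needs some other grouping mechanism on top.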

For example, is the interface you have in mind like the following?

```
  /sys/fs/cgroup/memory/memory.hardware_prefetcher.enable
```

The cpuset controller has information about which CPUs a process
belonging to a group is bound to, so maybe the cpuset controller is
more appropriate.

Control groups have a hierarchical structure, so it is necessary to
consider whether they can map hardware prefetch behavior well.
Currently I have two concerns.
First, an upper level of the hierarchy contains the same CPUs as the
levels below it. In this case, it may not be possible to configure
independent settings at each level.
Second, context switches need consideration. This function rewrites
the value of a register that exists for each core. Therefore, the
register value must be changed at context switch time whenever the
incoming process belongs to a different group.
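
The context switch concern can be sketched as a small userspace model.
This is only a simulation under stated assumptions, not kernel code:
the Core and PrefetchGroups classes, the group names, and the idea of
one integer register per core are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Core:
    """One CPU core with a single prefetch-control register (modelled as an int)."""
    prefetch_reg: int = 0
    current_group: Optional[str] = None
    writes: int = 0  # number of simulated register writes


@dataclass
class PrefetchGroups:
    """Per-group register values, standing in for a cgroup-style setting."""
    settings: Dict[str, int] = field(default_factory=dict)

    def context_switch(self, core: Core, next_group: str) -> None:
        """Rewrite the register only when the incoming task's group differs."""
        if core.current_group == next_group:
            return  # same group: the register already holds the right value
        wanted = self.settings[next_group]
        if core.prefetch_reg != wanted:
            core.prefetch_reg = wanted
            core.writes += 1
        core.current_group = next_group
```

In this model the cost of the feature is the number of register writes,
which grows with how often tasks from different groups alternate on the
same core.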

> Just an example from one of the patches:
> 
> +                       - "* Adjacent Cache Line Prefetcher Disable (R/W)"
> +                           corresponds to the "adjacent_cache_line_prefetcher_enable"
> 
> I might only be on "a little knowledge is dangerous" on the memory manager
> topics, but I know for sure that they at times adjust the members of structs to fit
> nicely on cache lines. And now this? It looks really useful for kernel machinery
> that knows very well what needs to go into the cache line next and when.
> 
> Talk to the people on linux-mm and memory maintainer Andrew Morton on how
> to do this right, it's a really interesting feature! Also given that people say that
> the memory hierarchy is an important part in the performance of the Apple
> M1 (M2) silicon, I expect that machine to have this too?

I think this proposal will be useful for users, so I will proceed
with concrete studies and talk to the MM people.
