Xen-Devel Archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/12] x86/mtrr: fix handling with PAT but without MTRR
@ 2023-03-06 16:34 Juergen Gross
  2023-03-06 16:34 ` [PATCH v4 05/12] x86/xen: set MTRR state when running as Xen PV initial domain Juergen Gross
  2023-03-07 21:09 ` [PATCH v4 00/12] x86/mtrr: fix handling with PAT but without MTRR Michael Kelley (LINUX)
  0 siblings, 2 replies; 5+ messages in thread
From: Juergen Gross @ 2023-03-06 16:34 UTC (permalink / raw
  To: linux-kernel, x86, linux-hyperv
  Cc: Juergen Gross, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Boris Ostrovsky, xen-devel, Andy Lutomirski,
	Peter Zijlstra

This series tries to fix the rather special case of PAT being available
without having MTRRs (either due to CONFIG_MTRR being not set, or
because the feature has been disabled e.g. by a hypervisor).

The main use cases are Xen PV guests and SEV-SNP guests running under
Hyper-V.

Instead of trying to work around all the issues by adding if statements
here and there, just try to use the complete available infrastructure
by setting up a read-only MTRR state when needed.

In the Xen PV case the current MTRR MSR values can be read from the
hypervisor, while for the SEV-SNP case all needed is to set the
default caching mode to "WB".

I have added more cleanup which has been discussed when looking into
the most recent failures.

Note that I couldn't test the Hyper-V related change (patch 3).

Running on bare metal and with Xen didn't show any problems with the
series applied.

It should be noted that patches 9+10 are replacing today's way to
lookup the MTRR cache type for a memory region from looking at the
MTRR register values to building a memory map with the cache types.
This should make the lookup much faster and much easier to understand.

Changes in V2:
- replaced former patches 1+2 with new patches 1-4, avoiding especially
  the rather hacky approach of V1, while making all the MTRR type
  conflict tests available for the Xen PV case
- updated patch 6 (was patch 4 in V1)

Changes in V3:
- dropped patch 5 of V2, as already applied
- split patch 1 of V2 into 2 patches
- new patches 6-10
- addressed comments

Changes in V4:
- addressed comments

Juergen Gross (12):
  x86/mtrr: split off physical address size calculation
  x86/mtrr: optimize mtrr_calc_physbits()
  x86/mtrr: support setting MTRR state for software defined MTRRs
  x86/hyperv: set MTRR state when running as SEV-SNP Hyper-V guest
  x86/xen: set MTRR state when running as Xen PV initial domain
  x86/mtrr: replace vendor tests in MTRR code
  x86/mtrr: allocate mtrr_value array dynamically
  x86/mtrr: add get_effective_type() service function
  x86/mtrr: construct a memory map with cache modes
  x86/mtrr: use new cache_map in mtrr_type_lookup()
  x86/mtrr: don't let mtrr_type_lookup() return MTRR_TYPE_INVALID
  x86/mm: only check uniform after calling mtrr_type_lookup()

 arch/x86/include/asm/mtrr.h        |  15 +-
 arch/x86/include/uapi/asm/mtrr.h   |   6 +-
 arch/x86/kernel/cpu/mshyperv.c     |   4 +
 arch/x86/kernel/cpu/mtrr/amd.c     |   2 +-
 arch/x86/kernel/cpu/mtrr/centaur.c |   2 +-
 arch/x86/kernel/cpu/mtrr/cleanup.c |   4 +-
 arch/x86/kernel/cpu/mtrr/cyrix.c   |   2 +-
 arch/x86/kernel/cpu/mtrr/generic.c | 492 ++++++++++++++++++-----------
 arch/x86/kernel/cpu/mtrr/mtrr.c    |  94 +++---
 arch/x86/kernel/cpu/mtrr/mtrr.h    |   7 +-
 arch/x86/kernel/setup.c            |   2 +
 arch/x86/mm/pgtable.c              |  24 +-
 arch/x86/xen/enlighten_pv.c        |  52 +++
 13 files changed, 454 insertions(+), 252 deletions(-)

-- 
2.35.3



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v4 05/12] x86/xen: set MTRR state when running as Xen PV initial domain
  2023-03-06 16:34 [PATCH v4 00/12] x86/mtrr: fix handling with PAT but without MTRR Juergen Gross
@ 2023-03-06 16:34 ` Juergen Gross
  2023-03-07 21:47   ` Boris Ostrovsky
  2023-03-23 12:43   ` Borislav Petkov
  2023-03-07 21:09 ` [PATCH v4 00/12] x86/mtrr: fix handling with PAT but without MTRR Michael Kelley (LINUX)
  1 sibling, 2 replies; 5+ messages in thread
From: Juergen Gross @ 2023-03-06 16:34 UTC (permalink / raw
  To: linux-kernel, x86
  Cc: Juergen Gross, Boris Ostrovsky, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, xen-devel

When running as Xen PV initial domain (aka dom0), MTRRs are disabled
by the hypervisor, but the system should nevertheless use correct
cache memory types. This has always kind of worked, as disabled MTRRs
resulted in disabled PAT, too, so that the kernel avoided code paths
resulting in inconsistencies. This bypassed all of the sanity checks
the kernel is doing with enabled MTRRs in order to avoid memory
mappings with conflicting memory types.

This has been changed recently, leading to PAT being accepted to be
enabled, while MTRRs stayed disabled. The result is that
mtrr_type_lookup() no longer is accepting all memory type requests,
but started to return WB even if UC- was requested. This led to
driver failures during initialization of some devices.

In reality MTRRs are still in effect, but they are under complete
control of the Xen hypervisor. It is possible, however, to retrieve
the MTRR settings from the hypervisor.

In order to fix those problems, overwrite the MTRR state via
mtrr_overwrite_state() with the MTRR data from the hypervisor, if the
system is running as a Xen dom0.

Fixes: 72cbc8f04fe2 ("x86/PAT: Have pat_enabled() properly reflect state when running on Xen")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- new patch
V3:
- move the call of mtrr_overwrite_state() to xen_pv_init_platform()
V4:
- only call mtrr_overwrite_state() if any MTRR got from Xen
  (Boris Ostrovsky)
---
 arch/x86/xen/enlighten_pv.c | 52 +++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index bb59cc6ddb2d..12e6b6845870 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -68,6 +68,7 @@
 #include <asm/reboot.h>
 #include <asm/hypervisor.h>
 #include <asm/mach_traps.h>
+#include <asm/mtrr.h>
 #include <asm/mwait.h>
 #include <asm/pci_x86.h>
 #include <asm/cpu.h>
@@ -119,6 +120,54 @@ static int __init parse_xen_msr_safe(char *str)
 }
 early_param("xen_msr_safe", parse_xen_msr_safe);
 
+/* Get MTRR settings from Xen and put them into mtrr_state. */
+static void __init xen_set_mtrr_data(void)
+{
+#ifdef CONFIG_MTRR
+	struct xen_platform_op op = {
+		.cmd = XENPF_read_memtype,
+		.interface_version = XENPF_INTERFACE_VERSION,
+	};
+	unsigned int reg;
+	unsigned long mask;
+	uint32_t eax, width;
+	static struct mtrr_var_range var[MTRR_MAX_VAR_RANGES] __initdata;
+
+	/* Get physical address width (only 64-bit cpus supported). */
+	width = 36;
+	eax = cpuid_eax(0x80000000);
+	if ((eax >> 16) == 0x8000 && eax >= 0x80000008) {
+		eax = cpuid_eax(0x80000008);
+		width = eax & 0xff;
+	}
+
+	for (reg = 0; reg < MTRR_MAX_VAR_RANGES; reg++) {
+		op.u.read_memtype.reg = reg;
+		if (HYPERVISOR_platform_op(&op))
+			break;
+
+		/*
+		 * Only called in dom0, which has all RAM PFNs mapped at
+		 * RAM MFNs, and all PCI space etc. is identity mapped.
+		 * This means we can treat MFN == PFN regarding MTTR settings.
+		 */
+		var[reg].base_lo = op.u.read_memtype.type;
+		var[reg].base_lo |= op.u.read_memtype.mfn << PAGE_SHIFT;
+		var[reg].base_hi = op.u.read_memtype.mfn >> (32 - PAGE_SHIFT);
+		mask = ~((op.u.read_memtype.nr_mfns << PAGE_SHIFT) - 1);
+		mask &= (1UL << width) - 1;
+		if (mask)
+			mask |= 1 << 11;
+		var[reg].mask_lo = mask;
+		var[reg].mask_hi = mask >> 32;
+	}
+
+	/* Only overwrite MTRR state if any MTRR could be got from Xen. */
+	if (reg)
+		mtrr_overwrite_state(var, reg, MTRR_TYPE_UNCACHABLE);
+#endif
+}
+
 static void __init xen_pv_init_platform(void)
 {
 	/* PV guests can't operate virtio devices without grants. */
@@ -135,6 +184,9 @@ static void __init xen_pv_init_platform(void)
 
 	/* pvclock is in shared info area */
 	xen_init_time_ops();
+
+	if (xen_initial_domain())
+		xen_set_mtrr_data();
 }
 
 static void __init xen_pv_guest_late_init(void)
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* RE: [PATCH v4 00/12] x86/mtrr: fix handling with PAT but without MTRR
  2023-03-06 16:34 [PATCH v4 00/12] x86/mtrr: fix handling with PAT but without MTRR Juergen Gross
  2023-03-06 16:34 ` [PATCH v4 05/12] x86/xen: set MTRR state when running as Xen PV initial domain Juergen Gross
@ 2023-03-07 21:09 ` Michael Kelley (LINUX)
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Kelley (LINUX) @ 2023-03-07 21:09 UTC (permalink / raw
  To: Juergen Gross, linux-kernel@vger.kernel.org, x86@kernel.org,
	linux-hyperv@vger.kernel.org
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, KY Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui,
	Boris Ostrovsky, xen-devel@lists.xenproject.org, Andy Lutomirski,
	Peter Zijlstra

From: Juergen Gross <jgross@suse.com> Sent: Monday, March 6, 2023 8:34 AM
> 
> This series tries to fix the rather special case of PAT being available
> without having MTRRs (either due to CONFIG_MTRR being not set, or
> because the feature has been disabled e.g. by a hypervisor).
> 
> The main use cases are Xen PV guests and SEV-SNP guests running under
> Hyper-V.
> 
> Instead of trying to work around all the issues by adding if statements
> here and there, just try to use the complete available infrastructure
> by setting up a read-only MTRR state when needed.
> 
> In the Xen PV case the current MTRR MSR values can be read from the
> hypervisor, while for the SEV-SNP case all needed is to set the
> default caching mode to "WB".
> 
> I have added more cleanup which has been discussed when looking into
> the most recent failures.
> 
> Note that I couldn't test the Hyper-V related change (patch 3).
> 
> Running on bare metal and with Xen didn't show any problems with the
> series applied.
> 
> It should be noted that patches 9+10 are replacing today's way to
> lookup the MTRR cache type for a memory region from looking at the
> MTRR register values to building a memory map with the cache types.
> This should make the lookup much faster and much easier to understand.
> 
> Changes in V2:
> - replaced former patches 1+2 with new patches 1-4, avoiding especially
>   the rather hacky approach of V1, while making all the MTRR type
>   conflict tests available for the Xen PV case
> - updated patch 6 (was patch 4 in V1)
> 
> Changes in V3:
> - dropped patch 5 of V2, as already applied
> - split patch 1 of V2 into 2 patches
> - new patches 6-10
> - addressed comments
> 
> Changes in V4:
> - addressed comments
> 
> Juergen Gross (12):
>   x86/mtrr: split off physical address size calculation
>   x86/mtrr: optimize mtrr_calc_physbits()
>   x86/mtrr: support setting MTRR state for software defined MTRRs
>   x86/hyperv: set MTRR state when running as SEV-SNP Hyper-V guest
>   x86/xen: set MTRR state when running as Xen PV initial domain
>   x86/mtrr: replace vendor tests in MTRR code
>   x86/mtrr: allocate mtrr_value array dynamically
>   x86/mtrr: add get_effective_type() service function
>   x86/mtrr: construct a memory map with cache modes
>   x86/mtrr: use new cache_map in mtrr_type_lookup()
>   x86/mtrr: don't let mtrr_type_lookup() return MTRR_TYPE_INVALID
>   x86/mm: only check uniform after calling mtrr_type_lookup()
> 
>  arch/x86/include/asm/mtrr.h        |  15 +-
>  arch/x86/include/uapi/asm/mtrr.h   |   6 +-
>  arch/x86/kernel/cpu/mshyperv.c     |   4 +
>  arch/x86/kernel/cpu/mtrr/amd.c     |   2 +-
>  arch/x86/kernel/cpu/mtrr/centaur.c |   2 +-
>  arch/x86/kernel/cpu/mtrr/cleanup.c |   4 +-
>  arch/x86/kernel/cpu/mtrr/cyrix.c   |   2 +-
>  arch/x86/kernel/cpu/mtrr/generic.c | 492 ++++++++++++++++++-----------
>  arch/x86/kernel/cpu/mtrr/mtrr.c    |  94 +++---
>  arch/x86/kernel/cpu/mtrr/mtrr.h    |   7 +-
>  arch/x86/kernel/setup.c            |   2 +
>  arch/x86/mm/pgtable.c              |  24 +-
>  arch/x86/xen/enlighten_pv.c        |  52 +++
>  13 files changed, 454 insertions(+), 252 deletions(-)
> 
> --
> 2.35.3

I've tested a Linux 6.2 kernel plus this series in a normal Hyper-V
guest and in a Hyper-V guest using SEV-SNP with vTOM.  MMIO
memory is correctly mapped as WB or UC- depending on the
request, which fixes the original problem introduced for Hyper-V
by the Xen-specific change.

Tested-by: Michael Kelley <mikelley@microsoft.com>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 05/12] x86/xen: set MTRR state when running as Xen PV initial domain
  2023-03-06 16:34 ` [PATCH v4 05/12] x86/xen: set MTRR state when running as Xen PV initial domain Juergen Gross
@ 2023-03-07 21:47   ` Boris Ostrovsky
  2023-03-23 12:43   ` Borislav Petkov
  1 sibling, 0 replies; 5+ messages in thread
From: Boris Ostrovsky @ 2023-03-07 21:47 UTC (permalink / raw
  To: Juergen Gross, linux-kernel, x86
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, xen-devel



On 3/6/23 11:34 AM, Juergen Gross wrote:
> When running as Xen PV initial domain (aka dom0), MTRRs are disabled
> by the hypervisor, but the system should nevertheless use correct
> cache memory types. This has always kind of worked, as disabled MTRRs
> resulted in disabled PAT, too, so that the kernel avoided code paths
> resulting in inconsistencies. This bypassed all of the sanity checks
> the kernel is doing with enabled MTRRs in order to avoid memory
> mappings with conflicting memory types.
> 
> This has been changed recently, leading to PAT being accepted to be
> enabled, while MTRRs stayed disabled. The result is that
> mtrr_type_lookup() no longer is accepting all memory type requests,
> but started to return WB even if UC- was requested. This led to
> driver failures during initialization of some devices.
> 
> In reality MTRRs are still in effect, but they are under complete
> control of the Xen hypervisor. It is possible, however, to retrieve
> the MTRR settings from the hypervisor.
> 
> In order to fix those problems, overwrite the MTRR state via
> mtrr_overwrite_state() with the MTRR data from the hypervisor, if the
> system is running as a Xen dom0.
> 
> Fixes: 72cbc8f04fe2 ("x86/PAT: Have pat_enabled() properly reflect state when running on Xen")
> Signed-off-by: Juergen Gross <jgross@suse.com>


Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 05/12] x86/xen: set MTRR state when running as Xen PV initial domain
  2023-03-06 16:34 ` [PATCH v4 05/12] x86/xen: set MTRR state when running as Xen PV initial domain Juergen Gross
  2023-03-07 21:47   ` Boris Ostrovsky
@ 2023-03-23 12:43   ` Borislav Petkov
  1 sibling, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2023-03-23 12:43 UTC (permalink / raw
  To: Juergen Gross
  Cc: linux-kernel, x86, Boris Ostrovsky, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H. Peter Anvin, xen-devel

On Mon, Mar 06, 2023 at 05:34:18PM +0100, Juergen Gross wrote:
> +	for (reg = 0; reg < MTRR_MAX_VAR_RANGES; reg++) {
> +		op.u.read_memtype.reg = reg;
> +		if (HYPERVISOR_platform_op(&op))
> +			break;
> +
> +		/*
> +		 * Only called in dom0, which has all RAM PFNs mapped at
> +		 * RAM MFNs, and all PCI space etc. is identity mapped.
> +		 * This means we can treat MFN == PFN regarding MTTR settings.
								^^^^

"MTRR"

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-03-23 12:43 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-06 16:34 [PATCH v4 00/12] x86/mtrr: fix handling with PAT but without MTRR Juergen Gross
2023-03-06 16:34 ` [PATCH v4 05/12] x86/xen: set MTRR state when running as Xen PV initial domain Juergen Gross
2023-03-07 21:47   ` Boris Ostrovsky
2023-03-23 12:43   ` Borislav Petkov
2023-03-07 21:09 ` [PATCH v4 00/12] x86/mtrr: fix handling with PAT but without MTRR Michael Kelley (LINUX)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).