Xen-Devel Archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/12] x86/mtrr: fix handling with PAT but without MTRR
@ 2023-02-23  9:32 Juergen Gross
  2023-02-23  9:32 ` [PATCH v3 05/12] x86/xen: set MTRR state when running as Xen PV initial domain Juergen Gross
  0 siblings, 1 reply; 6+ messages in thread
From: Juergen Gross @ 2023-02-23  9:32 UTC (permalink / raw
  To: linux-kernel, x86, linux-hyperv
  Cc: Juergen Gross, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Boris Ostrovsky, xen-devel, Andy Lutomirski,
	Peter Zijlstra

This series tries to fix the rather special case of PAT being available
without having MTRRs (either due to CONFIG_MTRR being not set, or
because the feature has been disabled e.g. by a hypervisor).

The main use cases are Xen PV guests and SEV-SNP guests running under
Hyper-V.

Instead of trying to work around all the issues by adding if statements
here and there, just try to use the complete available infrastructure
by setting up a read-only MTRR state when needed.

In the Xen PV case the current MTRR MSR values can be read from the
hypervisor, while for the SEV-SNP case all needed is to set the
default caching mode to "WB".

I have added more cleanup which has been discussed when looking into
the most recent failures.

Note that I couldn't test the Hyper-V related change (patch 3).

Running on bare metal and with Xen didn't show any problems with the
series applied.

It should be noted that patches 9+10 are replacing today's way to
lookup the MTRR cache type for a memory region from looking at the
MTRR register values to building a memory map with the cache types.
This should make the lookup much faster and much easier to understand.

Changes in V2:
- replaced former patches 1+2 with new patches 1-4, avoiding especially
  the rather hacky approach of V1, while making all the MTRR type
  conflict tests available for the Xen PV case
- updated patch 6 (was patch 4 in V1)

Changes in V3:
- dropped patch 5 of V2, as already applied
- split patch 1 of V2 into 2 patches
- new patches 6-10
- addressed comments

Juergen Gross (12):
  x86/mtrr: split off physical address size calculation
  x86/mtrr: optimize mtrr_calc_physbits()
  x86/mtrr: support setting MTRR state for software defined MTRRs
  x86/hyperv: set MTRR state when running as SEV-SNP Hyper-V guest
  x86/xen: set MTRR state when running as Xen PV initial domain
  x86/mtrr: replace vendor tests in MTRR code
  x86/mtrr: allocate mtrr_value array dynamically
  x86/mtrr: add get_effective_type() service function
  x86/mtrr: construct a memory map with cache modes
  x86/mtrr: use new cache_map in mtrr_type_lookup()
  x86/mtrr: don't let mtrr_type_lookup() return MTRR_TYPE_INVALID
  x86/mm: only check uniform after calling mtrr_type_lookup()

 arch/x86/include/asm/mtrr.h        |  15 +-
 arch/x86/include/uapi/asm/mtrr.h   |   6 +-
 arch/x86/kernel/cpu/mshyperv.c     |   4 +
 arch/x86/kernel/cpu/mtrr/amd.c     |   2 +-
 arch/x86/kernel/cpu/mtrr/centaur.c |   2 +-
 arch/x86/kernel/cpu/mtrr/cleanup.c |   4 +-
 arch/x86/kernel/cpu/mtrr/cyrix.c   |   2 +-
 arch/x86/kernel/cpu/mtrr/generic.c | 487 ++++++++++++++++++-----------
 arch/x86/kernel/cpu/mtrr/mtrr.c    |  94 +++---
 arch/x86/kernel/cpu/mtrr/mtrr.h    |   7 +-
 arch/x86/kernel/setup.c            |   2 +
 arch/x86/mm/pgtable.c              |  24 +-
 arch/x86/xen/enlighten_pv.c        |  50 +++
 13 files changed, 447 insertions(+), 252 deletions(-)

-- 
2.35.3



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v3 05/12] x86/xen: set MTRR state when running as Xen PV initial domain
  2023-02-23  9:32 [PATCH v3 00/12] x86/mtrr: fix handling with PAT but without MTRR Juergen Gross
@ 2023-02-23  9:32 ` Juergen Gross
  2023-02-24 21:00   ` Boris Ostrovsky
  0 siblings, 1 reply; 6+ messages in thread
From: Juergen Gross @ 2023-02-23  9:32 UTC (permalink / raw
  To: linux-kernel, x86
  Cc: Juergen Gross, Boris Ostrovsky, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, xen-devel

When running as Xen PV initial domain (aka dom0), MTRRs are disabled
by the hypervisor, but the system should nevertheless use correct
cache memory types. This has always kind of worked, as disabled MTRRs
resulted in disabled PAT, too, so that the kernel avoided code paths
resulting in inconsistencies. This bypassed all of the sanity checks
the kernel is doing with enabled MTRRs in order to avoid memory
mappings with conflicting memory types.

This has been changed recently, leading to PAT being accepted to be
enabled, while MTRRs stayed disabled. The result is that
mtrr_type_lookup() no longer is accepting all memory type requests,
but started to return WB even if UC- was requested. This led to
driver failures during initialization of some devices.

In reality MTRRs are still in effect, but they are under complete
control of the Xen hypervisor. It is possible, however, to retrieve
the MTRR settings from the hypervisor.

In order to fix those problems, overwrite the MTRR state via
mtrr_overwrite_state() with the MTRR data from the hypervisor, if the
system is running as a Xen dom0.

Fixes: 72cbc8f04fe2 ("x86/PAT: Have pat_enabled() properly reflect state when running on Xen")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- new patch
V3:
- move the call of mtrr_overwrite_state() to xen_pv_init_platform()
---
 arch/x86/xen/enlighten_pv.c | 50 +++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index bb59cc6ddb2d..729fb447a5b6 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -68,6 +68,7 @@
 #include <asm/reboot.h>
 #include <asm/hypervisor.h>
 #include <asm/mach_traps.h>
+#include <asm/mtrr.h>
 #include <asm/mwait.h>
 #include <asm/pci_x86.h>
 #include <asm/cpu.h>
@@ -119,6 +120,52 @@ static int __init parse_xen_msr_safe(char *str)
 }
 early_param("xen_msr_safe", parse_xen_msr_safe);
 
+/* Get MTRR settings from Xen and put them into mtrr_state. */
+static void __init xen_set_mtrr_data(void)
+{
+#ifdef CONFIG_MTRR
+	struct xen_platform_op op = {
+		.cmd = XENPF_read_memtype,
+		.interface_version = XENPF_INTERFACE_VERSION,
+	};
+	unsigned int reg;
+	unsigned long mask;
+	uint32_t eax, width;
+	static struct mtrr_var_range var[MTRR_MAX_VAR_RANGES] __initdata;
+
+	/* Get physical address width (only 64-bit cpus supported). */
+	width = 36;
+	eax = cpuid_eax(0x80000000);
+	if ((eax >> 16) == 0x8000 && eax >= 0x80000008) {
+		eax = cpuid_eax(0x80000008);
+		width = eax & 0xff;
+	}
+
+	for (reg = 0; reg < MTRR_MAX_VAR_RANGES; reg++) {
+		op.u.read_memtype.reg = reg;
+		if (HYPERVISOR_platform_op(&op))
+			break;
+
+		/*
+		 * Only called in dom0, which has all RAM PFNs mapped at
+		 * RAM MFNs, and all PCI space etc. is identity mapped.
+		 * This means we can treat MFN == PFN regarding MTTR settings.
+		 */
+		var[reg].base_lo = op.u.read_memtype.type;
+		var[reg].base_lo |= op.u.read_memtype.mfn << PAGE_SHIFT;
+		var[reg].base_hi = op.u.read_memtype.mfn >> (32 - PAGE_SHIFT);
+		mask = ~((op.u.read_memtype.nr_mfns << PAGE_SHIFT) - 1);
+		mask &= (1UL << width) - 1;
+		if (mask)
+			mask |= 1 << 11;
+		var[reg].mask_lo = mask;
+		var[reg].mask_hi = mask >> 32;
+	}
+
+	mtrr_overwrite_state(var, reg, MTRR_TYPE_UNCACHABLE);
+#endif
+}
+
 static void __init xen_pv_init_platform(void)
 {
 	/* PV guests can't operate virtio devices without grants. */
@@ -135,6 +182,9 @@ static void __init xen_pv_init_platform(void)
 
 	/* pvclock is in shared info area */
 	xen_init_time_ops();
+
+	if (xen_initial_domain())
+		xen_set_mtrr_data();
 }
 
 static void __init xen_pv_guest_late_init(void)
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 05/12] x86/xen: set MTRR state when running as Xen PV initial domain
  2023-02-23  9:32 ` [PATCH v3 05/12] x86/xen: set MTRR state when running as Xen PV initial domain Juergen Gross
@ 2023-02-24 21:00   ` Boris Ostrovsky
  2023-02-27  7:12     ` Juergen Gross
  0 siblings, 1 reply; 6+ messages in thread
From: Boris Ostrovsky @ 2023-02-24 21:00 UTC (permalink / raw
  To: Juergen Gross, linux-kernel, x86
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, xen-devel

[-- Attachment #1: Type: text/plain, Size: 978 bytes --]


On 2/23/23 4:32 AM, Juergen Gross wrote:
> +
> +	for (reg = 0; reg < MTRR_MAX_VAR_RANGES; reg++) {
> +		op.u.read_memtype.reg = reg;
> +		if (HYPERVISOR_platform_op(&op))
> +			break;


If we fail on the first iteration, do we still want to mark MTRRs are enabled/set in mtrr_overwrite_state()?


-boris


> +
> +		/*
> +		 * Only called in dom0, which has all RAM PFNs mapped at
> +		 * RAM MFNs, and all PCI space etc. is identity mapped.
> +		 * This means we can treat MFN == PFN regarding MTTR settings.
> +		 */
> +		var[reg].base_lo = op.u.read_memtype.type;
> +		var[reg].base_lo |= op.u.read_memtype.mfn << PAGE_SHIFT;
> +		var[reg].base_hi = op.u.read_memtype.mfn >> (32 - PAGE_SHIFT);
> +		mask = ~((op.u.read_memtype.nr_mfns << PAGE_SHIFT) - 1);
> +		mask &= (1UL << width) - 1;
> +		if (mask)
> +			mask |= 1 << 11;
> +		var[reg].mask_lo = mask;
> +		var[reg].mask_hi = mask >> 32;
> +	}
> +
> +	mtrr_overwrite_state(var, reg, MTRR_TYPE_UNCACHABLE);
> +#endif
> +}

[-- Attachment #2: Type: text/html, Size: 1567 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 05/12] x86/xen: set MTRR state when running as Xen PV initial domain
  2023-02-24 21:00   ` Boris Ostrovsky
@ 2023-02-27  7:12     ` Juergen Gross
  2023-02-27 13:52       ` Boris Ostrovsky
  0 siblings, 1 reply; 6+ messages in thread
From: Juergen Gross @ 2023-02-27  7:12 UTC (permalink / raw
  To: Boris Ostrovsky, linux-kernel, x86
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, xen-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 483 bytes --]

On 24.02.23 22:00, Boris Ostrovsky wrote:
> 
> On 2/23/23 4:32 AM, Juergen Gross wrote:
>> +
>> +	for (reg = 0; reg < MTRR_MAX_VAR_RANGES; reg++) {
>> +		op.u.read_memtype.reg = reg;
>> +		if (HYPERVISOR_platform_op(&op))
>> +			break;
> 
> 
> If we fail on the first iteration, do we still want to mark MTRRs are 
> enabled/set in mtrr_overwrite_state()?

Hmm, good idea.

I think we should just drop the call of mtrr_overwrite_state() in this
case.


Juergen


[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 05/12] x86/xen: set MTRR state when running as Xen PV initial domain
  2023-02-27  7:12     ` Juergen Gross
@ 2023-02-27 13:52       ` Boris Ostrovsky
  2023-02-27 13:56         ` Juergen Gross
  0 siblings, 1 reply; 6+ messages in thread
From: Boris Ostrovsky @ 2023-02-27 13:52 UTC (permalink / raw
  To: Juergen Gross, linux-kernel, x86
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, xen-devel



On 2/27/23 2:12 AM, Juergen Gross wrote:
> On 24.02.23 22:00, Boris Ostrovsky wrote:
>>
>> On 2/23/23 4:32 AM, Juergen Gross wrote:
>>> +
>>> +    for (reg = 0; reg < MTRR_MAX_VAR_RANGES; reg++) {
>>> +        op.u.read_memtype.reg = reg;
>>> +        if (HYPERVISOR_platform_op(&op))
>>> +            break;
>>
>>
>> If we fail on the first iteration, do we still want to mark MTRRs are enabled/set in mtrr_overwrite_state()?
> 
> Hmm, good idea.
> 
> I think we should just drop the call of mtrr_overwrite_state() in this
> case.


TBH I am not sure what the right way is to handle errors here. What if the hypercall fails on second iteration?


-boris


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 05/12] x86/xen: set MTRR state when running as Xen PV initial domain
  2023-02-27 13:52       ` Boris Ostrovsky
@ 2023-02-27 13:56         ` Juergen Gross
  0 siblings, 0 replies; 6+ messages in thread
From: Juergen Gross @ 2023-02-27 13:56 UTC (permalink / raw
  To: Boris Ostrovsky, linux-kernel, x86
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, xen-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 1077 bytes --]

On 27.02.23 14:52, Boris Ostrovsky wrote:
> 
> 
> On 2/27/23 2:12 AM, Juergen Gross wrote:
>> On 24.02.23 22:00, Boris Ostrovsky wrote:
>>>
>>> On 2/23/23 4:32 AM, Juergen Gross wrote:
>>>> +
>>>> +    for (reg = 0; reg < MTRR_MAX_VAR_RANGES; reg++) {
>>>> +        op.u.read_memtype.reg = reg;
>>>> +        if (HYPERVISOR_platform_op(&op))
>>>> +            break;
>>>
>>>
>>> If we fail on the first iteration, do we still want to mark MTRRs are 
>>> enabled/set in mtrr_overwrite_state()?
>>
>> Hmm, good idea.
>>
>> I think we should just drop the call of mtrr_overwrite_state() in this
>> case.
> 
> 
> TBH I am not sure what the right way is to handle errors here. What if the 
> hypercall fails on second iteration?

The main reason would be that only one variable MTRR is available.

Its not as if there are very complicated scenarios leading to failures here.

Either the interface is usable and then it will work, or it isn't usable
and we can fall back to today's handling by ignoring MTRRs.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-02-27 13:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-02-23  9:32 [PATCH v3 00/12] x86/mtrr: fix handling with PAT but without MTRR Juergen Gross
2023-02-23  9:32 ` [PATCH v3 05/12] x86/xen: set MTRR state when running as Xen PV initial domain Juergen Gross
2023-02-24 21:00   ` Boris Ostrovsky
2023-02-27  7:12     ` Juergen Gross
2023-02-27 13:52       ` Boris Ostrovsky
2023-02-27 13:56         ` Juergen Gross

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).