* [RFC PATCH 0/11] Support Write-Through mapping on x86
From: Toshi Kani @ 2014-07-15 19:34 UTC
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

This RFC patchset seeks comments and suggestions on the design and
changes to support Write-Through (WT) mapping.  The study below
shows that using WT mapping may be useful for non-volatile memory.

  http://www.hpl.hp.com/techreports/2012/HPL-2012-236.pdf

There were ideas and patches to support WT in the past, which stimulated
very valuable discussions on this topic.

  https://lkml.org/lkml/2013/4/24/424
  https://lkml.org/lkml/2013/10/27/70
  https://lkml.org/lkml/2013/11/3/72

This RFC patchset tries to address the issues raised by taking the
following design approach:

 - Keep the MTRR interface
 - Keep the WB, WC, and UC- slots in the PAT MSR
 - Keep the PAT bit unused
 - Reassign the UC slot to WT in the PAT MSR

There are 4 usable slots in the PAT MSR, which are currently assigned to:

  PA0/4: WB, PA1/5: WC, PA2/6: UC-, PA3/7: UC

The PAT bit is unused since it shares the same bit as the PSE bit and
there was a bug in older processors.  Among the 4 slots, the uncached
memory type consumes 2 slots, UC- and UC.  They are functionally
equivalent, but UC- allows MTRRs to override it with WC.  All interfaces
that set the uncached memory type use UC- in order to work with MTRRs.
The PA3/7 slot is effectively unused today.  Therefore, this patchset
reassigns the PA3/7 slot to WT.  If MTRRs get deprecated in the future,
UC- can be reassigned to UC, and there is still no need to consume
2 slots for the uncached memory type.
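
To make the slot assignment concrete, here is a minimal user-space
sketch (not kernel code) of how the PWT and PCD bits of a page table
entry select one of the four usable PAT slots, and which memory type
each slot holds before and after this patchset.  The enum, the two
tables and the function names are hypothetical and exist only for
illustration; the bit positions (PWT = bit 3, PCD = bit 4) are the
architectural x86 PTE positions.

```c
#include <assert.h>

/* x86 PTE cache-control bits (architectural positions). */
#define PTE_PWT (1UL << 3)
#define PTE_PCD (1UL << 4)

/* Memory types named in this cover letter (illustrative enum). */
enum memtype { MT_WB, MT_WC, MT_UC_MINUS, MT_UC, MT_WT };

/* Contents of PA0..PA3; PA4..PA7 mirror them since the PAT bit is unused. */
static const enum memtype slots_before[4] = { MT_WB, MT_WC, MT_UC_MINUS, MT_UC };
static const enum memtype slots_after[4]  = { MT_WB, MT_WC, MT_UC_MINUS, MT_WT };

/* With the PAT bit clear, the slot index is (PCD << 1) | PWT. */
static unsigned int pat_slot(unsigned long pte_flags)
{
	unsigned int pwt = (pte_flags & PTE_PWT) ? 1 : 0;
	unsigned int pcd = (pte_flags & PTE_PCD) ? 1 : 0;

	return (pcd << 1) | pwt;
}

/* Memory type a PTE ends up with, without (0) or with (1) this patchset. */
static enum memtype effective_type(unsigned long pte_flags, int patched)
{
	unsigned int slot = pat_slot(pte_flags);

	assert(slot < 4);
	return patched ? slots_after[slot] : slots_before[slot];
}
```

In this sketch only the PCD|PWT combination changes meaning, from UC
to WT; the other three combinations resolve to the same type as before.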

This patchset consists of two parts.  The 1st part, patch [1/11] to
[6/11], enables WT mapping and adds new interfaces for setting WT mapping.
The 2nd part, patch [7/11] to [11/11], cleans up the code that has
internal knowledge of the PAT slot assignment.  This keeps the kernel
code independent from the PAT slot assignment.

This patchset applies on top of Linus's tree, 3.16.0-rc5.

---
Toshi Kani (11):
  1/11: x86, mm, pat: Redefine _PAGE_CACHE_UC as UC_MINUS
  2/11: x86, mm, pat: Define _PAGE_CACHE_WT for PA3/7 of PAT
  3/11: x86, mm, pat: Change reserve_memtype() to handle WT type
  4/11: x86, mm, asm-gen: Add ioremap_wt() for WT mapping
  5/11: x86, mm: Add set_memory[_array]_wt() for setting WT
  6/11: x86, mm, pat: Add pgprot_writethrough() for WT
  7/11: x86, mm: Keep _set_memory_<type>() slot-independent
  8/11: x86, mm, pat: Keep pgprot_<type>() slot-independent
  9/11: x86, efi: Cleanup PCD bit manipulation in EFI
 10/11: x86, xen: Cleanup PWT/PCD bit manipulation in Xen
 11/11: x86, fbdev: Cleanup PWT/PCD bit manipulation in fbdev

---
 arch/x86/include/asm/cacheflush.h         |  8 +++-
 arch/x86/include/asm/fb.h                 |  3 +-
 arch/x86/include/asm/io.h                 |  2 +
 arch/x86/include/asm/pgtable.h            |  2 +-
 arch/x86/include/asm/pgtable_types.h      | 22 ++++++---
 arch/x86/mm/ioremap.c                     | 37 +++++++++++----
 arch/x86/mm/pageattr.c                    | 75 ++++++++++++++++++++++++++-----
 arch/x86/mm/pat.c                         | 38 +++++++++-------
 arch/x86/mm/pat_internal.h                |  2 +-
 arch/x86/platform/efi/efi_64.c            |  4 +-
 arch/x86/xen/enlighten.c                  |  2 +-
 arch/x86/xen/mmu.c                        |  8 ++--
 drivers/video/fbdev/gbefb.c               |  3 +-
 drivers/video/fbdev/vermilion/vermilion.c |  4 +-
 include/asm-generic/io.h                  |  4 ++
 include/asm-generic/iomap.h               |  4 ++
 include/asm-generic/pgtable.h             |  4 ++
 17 files changed, 169 insertions(+), 53 deletions(-)

=====
This test patch applies on top of the RFC patchset and provides
an easy way to test WT mapping through /dev/mem.  This change is
a hack for testing only.

  fd = open("/dev/mem", O_RDWR|O_DSYNC);
  p = mmap(NULL, <map-size>, PROT_READ|PROT_WRITE,
		MAP_SHARED, fd, <addr>);
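
A slightly fuller sketch of the same two calls with error handling
follows.  The helper name map_dsync() is hypothetical, and the path,
offset and length are left to the caller, so the <addr> and <map-size>
placeholders above stay unfilled.  On /dev/mem this needs root, and
O_DSYNC only selects WT when the test patch below is applied.

```c
#define _POSIX_C_SOURCE 200809L	/* expose O_DSYNC under strict -std= modes */

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map len bytes starting at byte offset off of the file at path.
 * For path == "/dev/mem", off is a page-aligned physical address.
 * Returns MAP_FAILED if either open() or mmap() fails.
 */
static void *map_dsync(const char *path, off_t off, size_t len)
{
	void *p;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_DSYNC);
	if (fd < 0)
		return MAP_FAILED;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, off);
	close(fd);	/* the mapping remains valid after close() */
	return p;
}
```

The caller is expected to compare the result against MAP_FAILED and
to munmap() the region when done.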

---
 arch/x86/mm/pat.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 0be7ebd..79850b3 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -525,7 +525,7 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
 		return 0;
 
 	if (file->f_flags & O_DSYNC)
-		flags = _PAGE_CACHE_UC_MINUS;
+		flags = _PAGE_CACHE_WT;
 
 #ifdef CONFIG_X86_32
 	/*


* [RFC PATCH 1/11] x86, mm, pat: Redefine _PAGE_CACHE_UC as UC_MINUS
From: Toshi Kani @ 2014-07-15 19:34 UTC
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

ioremap_nocache() and other interfaces that set the uncached memory
type use _PAGE_CACHE_UC_MINUS in order to support legacy graphics
drivers using MTRRs to override it with WC.  _PAGE_CACHE_UC is defined,
but is unused on systems with the PAT feature.

This patch redefines _PAGE_CACHE_UC as _PAGE_CACHE_UC_MINUS, and
frees up the PA3/7 slot in the PAT MSR that was used for
_PAGE_CACHE_UC.  This keeps _PAGE_CACHE_UC defined in case out-of-tree
drivers refer to it.

Note: The legacy code in phys_mem_access_prot_allowed() that sets
_PAGE_CACHE_UC for Pentiums and earlier processors is changed to
set the PCD & PWT bits directly in order to avoid any change in
behavior.  These processors support neither PAT nor MTRRs.
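
One consequence of the redefinition can be sketched in user-space C.
Once _PAGE_CACHE_UC expands to the same value as _PAGE_CACHE_UC_MINUS,
a switch statement can no longer carry a separate case for each,
because C rejects duplicate case labels.  This is consistent with the
hunks in this patch that drop "case _PAGE_CACHE_UC:".  The macro names
and values mirror the diff; cattr_name_sketch() is a hypothetical
stand-in for cattr_name(), and the bit positions (PWT = bit 3,
PCD = bit 4) are the architectural x86 PTE positions.

```c
#include <assert.h>

#define _PAGE_PWT		(1UL << 3)
#define _PAGE_PCD		(1UL << 4)

#define _PAGE_CACHE_MASK	(_PAGE_PCD | _PAGE_PWT)
#define _PAGE_CACHE_WB		(0)
#define _PAGE_CACHE_WC		(_PAGE_PWT)
#define _PAGE_CACHE_UC_MINUS	(_PAGE_PCD)
#define _PAGE_CACHE_UC		(_PAGE_CACHE_UC_MINUS)	/* alias after this patch */

/* Both names now denote the same PTE bits. */
static_assert(_PAGE_CACHE_UC == _PAGE_CACHE_UC_MINUS, "UC aliases UC-");

static const char *cattr_name_sketch(unsigned long flags)
{
	switch (flags & _PAGE_CACHE_MASK) {
	/* A "case _PAGE_CACHE_UC:" here would be a duplicate case label. */
	case _PAGE_CACHE_UC_MINUS:	return "uncached-minus";
	case _PAGE_CACHE_WB:		return "write-back";
	case _PAGE_CACHE_WC:		return "write-combining";
	default:			return "broken";
	}
}
```

In this sketch the PCD|PWT combination falls through to the default
case, since no type is assigned to it until WT is defined in the next
patch.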

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/pgtable_types.h |   10 +++++-----
 arch/x86/mm/ioremap.c                |   14 +++++---------
 arch/x86/mm/pat.c                    |    9 +--------
 arch/x86/mm/pat_internal.h           |    1 -
 4 files changed, 11 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index f216963..03d40da 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -133,7 +133,7 @@
 #define _PAGE_CACHE_WB		(0)
 #define _PAGE_CACHE_WC		(_PAGE_PWT)
 #define _PAGE_CACHE_UC_MINUS	(_PAGE_PCD)
-#define _PAGE_CACHE_UC		(_PAGE_PCD | _PAGE_PWT)
+#define _PAGE_CACHE_UC		(_PAGE_CACHE_UC_MINUS)
 
 #define PAGE_NONE	__pgprot(_PAGE_PROTNONE | _PAGE_ACCESSED)
 #define PAGE_SHARED	__pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | \
@@ -157,13 +157,13 @@
 
 #define __PAGE_KERNEL_RO		(__PAGE_KERNEL & ~_PAGE_RW)
 #define __PAGE_KERNEL_RX		(__PAGE_KERNEL_EXEC & ~_PAGE_RW)
-#define __PAGE_KERNEL_EXEC_NOCACHE	(__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
+#define __PAGE_KERNEL_EXEC_NOCACHE	(__PAGE_KERNEL_EXEC | _PAGE_CACHE_UC)
 #define __PAGE_KERNEL_WC		(__PAGE_KERNEL | _PAGE_CACHE_WC)
-#define __PAGE_KERNEL_NOCACHE		(__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
-#define __PAGE_KERNEL_UC_MINUS		(__PAGE_KERNEL | _PAGE_PCD)
+#define __PAGE_KERNEL_NOCACHE		(__PAGE_KERNEL | _PAGE_CACHE_UC)
+#define __PAGE_KERNEL_UC_MINUS		(__PAGE_KERNEL | _PAGE_CACHE_UC_MINUS)
 #define __PAGE_KERNEL_VSYSCALL		(__PAGE_KERNEL_RX | _PAGE_USER)
 #define __PAGE_KERNEL_VVAR		(__PAGE_KERNEL_RO | _PAGE_USER)
-#define __PAGE_KERNEL_VVAR_NOCACHE	(__PAGE_KERNEL_VVAR | _PAGE_PCD | _PAGE_PWT)
+#define __PAGE_KERNEL_VVAR_NOCACHE	(__PAGE_KERNEL_VVAR | _PAGE_CACHE_UC)
 #define __PAGE_KERNEL_LARGE		(__PAGE_KERNEL | _PAGE_PSE)
 #define __PAGE_KERNEL_LARGE_NOCACHE	(__PAGE_KERNEL | _PAGE_CACHE_UC | _PAGE_PSE)
 #define __PAGE_KERNEL_LARGE_EXEC	(__PAGE_KERNEL_EXEC | _PAGE_PSE)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index baff1da..282829f 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -35,7 +35,7 @@ int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 	int err;
 
 	switch (prot_val) {
-	case _PAGE_CACHE_UC:
+	case _PAGE_CACHE_UC_MINUS:
 	default:
 		err = _set_memory_uc(vaddr, nrpages);
 		break;
@@ -142,11 +142,8 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 	}
 
 	switch (prot_val) {
-	case _PAGE_CACHE_UC:
-	default:
-		prot = PAGE_KERNEL_IO_NOCACHE;
-		break;
 	case _PAGE_CACHE_UC_MINUS:
+	default:
 		prot = PAGE_KERNEL_IO_UC_MINUS;
 		break;
 	case _PAGE_CACHE_WC:
@@ -218,11 +215,10 @@ void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
 	 *	pat_enabled ? _PAGE_CACHE_UC : _PAGE_CACHE_UC_MINUS;
 	 *
 	 * Till we fix all X drivers to use ioremap_wc(), we will use
-	 * UC MINUS.
+	 * UC MINUS. _PAGE_CACHE_UC is also defined as _PAGE_CACHE_UC_MINUS
+	 * in pgtable_types.h.
 	 */
-	unsigned long val = _PAGE_CACHE_UC_MINUS;
-
-	return __ioremap_caller(phys_addr, size, val,
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_UC_MINUS,
 				__builtin_return_address(0));
 }
 EXPORT_SYMBOL(ioremap_nocache);
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 6574388..c3567a5 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -213,12 +213,6 @@ static int reserve_ram_pages_type(u64 start, u64 end, unsigned long req_type,
 	struct page *page;
 	u64 pfn;
 
-	if (req_type == _PAGE_CACHE_UC) {
-		/* We do not support strong UC */
-		WARN_ON_ONCE(1);
-		req_type = _PAGE_CACHE_UC_MINUS;
-	}
-
 	for (pfn = (start >> PAGE_SHIFT); pfn < (end >> PAGE_SHIFT); ++pfn) {
 		unsigned long type;
 
@@ -261,7 +255,6 @@ static int free_ram_pages_type(u64 start, u64 end)
  * - _PAGE_CACHE_WB
  * - _PAGE_CACHE_WC
  * - _PAGE_CACHE_UC_MINUS
- * - _PAGE_CACHE_UC
  *
  * If new_type is NULL, function will return an error if it cannot reserve the
  * region with req_type. If new_type is non-NULL, function will return
@@ -543,7 +536,7 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
 	      boot_cpu_has(X86_FEATURE_CYRIX_ARR) ||
 	      boot_cpu_has(X86_FEATURE_CENTAUR_MCR)) &&
 	    (pfn << PAGE_SHIFT) >= __pa(high_memory)) {
-		flags = _PAGE_CACHE_UC;
+		flags = _PAGE_PCD | _PAGE_PWT;	/* UC w/o PAT */
 	}
 #endif
 
diff --git a/arch/x86/mm/pat_internal.h b/arch/x86/mm/pat_internal.h
index 77e5ba1..2593d40 100644
--- a/arch/x86/mm/pat_internal.h
+++ b/arch/x86/mm/pat_internal.h
@@ -17,7 +17,6 @@ struct memtype {
 static inline char *cattr_name(unsigned long flags)
 {
 	switch (flags & _PAGE_CACHE_MASK) {
-	case _PAGE_CACHE_UC:		return "uncached";
 	case _PAGE_CACHE_UC_MINUS:	return "uncached-minus";
 	case _PAGE_CACHE_WB:		return "write-back";
 	case _PAGE_CACHE_WC:		return "write-combining";


* [RFC PATCH 2/11] x86, mm, pat: Define _PAGE_CACHE_WT for PA3/7 of PAT
From: Toshi Kani @ 2014-07-15 19:34 UTC
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

This patch defines _PAGE_CACHE_WT and its related macros, which
now use the PA3/7 slot in the PAT MSR.  pat_init() is also changed
to assign the WT memory type to the PA3/7 slot in the PAT MSR.
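
The effect on the MSR value can be sketched as follows.  This is a
user-space illustration, not the kernel's code: the PAT() macro is
written to mirror the expression used in the pat_init() hunk below,
the type encodings are the architectural PAT encodings, and the three
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* PAT memory-type encodings as defined by the x86 architecture. */
enum {
	PAT_UC = 0,		/* uncached */
	PAT_WC = 1,		/* write-combining */
	PAT_WT = 4,		/* write-through */
	PAT_WP = 5,		/* write-protected */
	PAT_WB = 6,		/* write-back */
	PAT_UC_MINUS = 7,	/* uncached, but MTRRs may override with WC */
};

/* Each PAT slot occupies one byte of the 64-bit MSR. */
#define PAT(slot, type)	((uint64_t)PAT_##type << ((slot) * 8))

/* Slot assignment before this patch: PA3/7 hold UC. */
static uint64_t pat_msr_before(void)
{
	return PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
	       PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, UC);
}

/* Slot assignment after this patch: PA3/7 hold WT. */
static uint64_t pat_msr_after(void)
{
	return PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, WT) |
	       PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, WT);
}

/* Extract the 3-bit type field of one slot from an MSR value. */
static unsigned int pat_slot_type(uint64_t msr, unsigned int slot)
{
	assert(slot < 8);
	return (msr >> (slot * 8)) & 0x7;
}
```

Under these encodings the value changes from 0x0007010600070106 to
0x0407010604070106; only bytes 3 and 7 differ.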

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/pgtable_types.h |    5 +++++
 arch/x86/mm/pat.c                    |    8 ++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 03d40da..7b905cb 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -132,6 +132,7 @@
 #define _PAGE_CACHE_MASK	(_PAGE_PCD | _PAGE_PWT)
 #define _PAGE_CACHE_WB		(0)
 #define _PAGE_CACHE_WC		(_PAGE_PWT)
+#define _PAGE_CACHE_WT		(_PAGE_PCD | _PAGE_PWT)
 #define _PAGE_CACHE_UC_MINUS	(_PAGE_PCD)
 #define _PAGE_CACHE_UC		(_PAGE_CACHE_UC_MINUS)
 
@@ -159,6 +160,7 @@
 #define __PAGE_KERNEL_RX		(__PAGE_KERNEL_EXEC & ~_PAGE_RW)
 #define __PAGE_KERNEL_EXEC_NOCACHE	(__PAGE_KERNEL_EXEC | _PAGE_CACHE_UC)
 #define __PAGE_KERNEL_WC		(__PAGE_KERNEL | _PAGE_CACHE_WC)
+#define __PAGE_KERNEL_WT		(__PAGE_KERNEL | _PAGE_CACHE_WT)
 #define __PAGE_KERNEL_NOCACHE		(__PAGE_KERNEL | _PAGE_CACHE_UC)
 #define __PAGE_KERNEL_UC_MINUS		(__PAGE_KERNEL | _PAGE_CACHE_UC_MINUS)
 #define __PAGE_KERNEL_VSYSCALL		(__PAGE_KERNEL_RX | _PAGE_USER)
@@ -172,12 +174,14 @@
 #define __PAGE_KERNEL_IO_NOCACHE	(__PAGE_KERNEL_NOCACHE | _PAGE_IOMAP)
 #define __PAGE_KERNEL_IO_UC_MINUS	(__PAGE_KERNEL_UC_MINUS | _PAGE_IOMAP)
 #define __PAGE_KERNEL_IO_WC		(__PAGE_KERNEL_WC | _PAGE_IOMAP)
+#define __PAGE_KERNEL_IO_WT		(__PAGE_KERNEL_WT | _PAGE_IOMAP)
 
 #define PAGE_KERNEL			__pgprot(__PAGE_KERNEL)
 #define PAGE_KERNEL_RO			__pgprot(__PAGE_KERNEL_RO)
 #define PAGE_KERNEL_EXEC		__pgprot(__PAGE_KERNEL_EXEC)
 #define PAGE_KERNEL_RX			__pgprot(__PAGE_KERNEL_RX)
 #define PAGE_KERNEL_WC			__pgprot(__PAGE_KERNEL_WC)
+#define PAGE_KERNEL_WT			__pgprot(__PAGE_KERNEL_WT)
 #define PAGE_KERNEL_NOCACHE		__pgprot(__PAGE_KERNEL_NOCACHE)
 #define PAGE_KERNEL_UC_MINUS		__pgprot(__PAGE_KERNEL_UC_MINUS)
 #define PAGE_KERNEL_EXEC_NOCACHE	__pgprot(__PAGE_KERNEL_EXEC_NOCACHE)
@@ -192,6 +196,7 @@
 #define PAGE_KERNEL_IO_NOCACHE		__pgprot(__PAGE_KERNEL_IO_NOCACHE)
 #define PAGE_KERNEL_IO_UC_MINUS		__pgprot(__PAGE_KERNEL_IO_UC_MINUS)
 #define PAGE_KERNEL_IO_WC		__pgprot(__PAGE_KERNEL_IO_WC)
+#define PAGE_KERNEL_IO_WT		__pgprot(__PAGE_KERNEL_IO_WT)
 
 /*         xwr */
 #define __P000	PAGE_NONE
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index c3567a5..176d4d6 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -101,7 +101,7 @@ void pat_init(void)
 		}
 	}
 
-	/* Set PWT to Write-Combining. All other bits stay the same */
+	/* Set PWT to Write-Combining, and PCD|PWT to Write-Through. */
 	/*
 	 * PTE encoding used in Linux:
 	 *      PAT
@@ -111,11 +111,11 @@ void pat_init(void)
 	 *      000 WB		_PAGE_CACHE_WB
 	 *      001 WC		_PAGE_CACHE_WC
 	 *      010 UC-		_PAGE_CACHE_UC_MINUS
-	 *      011 UC		_PAGE_CACHE_UC
+	 *      011 WT		_PAGE_CACHE_WT
 	 * PAT bit unused
 	 */
-	pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
-	      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, UC);
+	pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, WT) |
+	      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, WT);
 
 	/* Boot CPU check */
 	if (!boot_pat_state)


* [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
From: Toshi Kani @ 2014-07-15 19:34 UTC
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

This patch changes reserve_memtype() to handle the new WT type.
When (!pat_enabled && new_type), it continues to set *new_type to
either WB or UC-.  When pat_enabled, it can reserve a given
non-RAM range for WT.  At this point, it cannot reserve a RAM
range for WT since reserve_ram_pages_type() uses page flags that are
limited to three memory types, WB, WC and UC.
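
The two rules described above can be sketched in isolation as follows.
This is a user-space illustration under the macro values shown in the
earlier patches; nopat_new_type() and ram_type_supported() are
hypothetical helpers that only model the branches touched by the diff
below, not the full functions.

```c
#include <assert.h>
#include <errno.h>

#define _PAGE_PWT		(1UL << 3)
#define _PAGE_PCD		(1UL << 4)

#define _PAGE_CACHE_MASK	(_PAGE_PCD | _PAGE_PWT)
#define _PAGE_CACHE_WB		(0)
#define _PAGE_CACHE_WC		(_PAGE_PWT)
#define _PAGE_CACHE_WT		(_PAGE_PCD | _PAGE_PWT)
#define _PAGE_CACHE_UC_MINUS	(_PAGE_PCD)

/*
 * Value stored in *new_type when PAT is disabled: WB stays WB,
 * every other request (WC, WT, UC-) falls back to UC-.
 */
static unsigned long nopat_new_type(unsigned long req_type)
{
	assert((req_type & ~_PAGE_CACHE_MASK) == 0);

	if (req_type == _PAGE_CACHE_WB)
		return _PAGE_CACHE_WB;
	return _PAGE_CACHE_UC_MINUS;
}

/*
 * RAM ranges track their type in page flags, which have no encoding
 * for WT, so a WT request on RAM is rejected.
 */
static int ram_type_supported(unsigned long req_type)
{
	return req_type == _PAGE_CACHE_WT ? -EINVAL : 0;
}
```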

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/cacheflush.h |    2 ++
 arch/x86/mm/pat.c                 |   12 +++++++++---
 arch/x86/mm/pat_internal.h        |    1 +
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index 9863ee3..c80a3a1 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -42,6 +42,8 @@ static inline void set_page_memtype(struct page *pg, unsigned long memtype)
 	unsigned long old_flags;
 	unsigned long new_flags;
 
+	BUG_ON(memtype == _PAGE_CACHE_WT);
+
 	switch (memtype) {
 	case _PAGE_CACHE_WC:
 		memtype_flags = _PGMT_WC;
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 176d4d6..8a8be17 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -203,6 +203,8 @@ static int pat_pagerange_is_ram(resource_size_t start, resource_size_t end)
 
 /*
  * For RAM pages, we use page flags to mark the pages with appropriate type.
+ * The page flags are currently limited to three types, WB, WC and UC. Hence,
+ * any request to WT will fail with -EINVAL.
  * Here we do two pass:
  * - Find the memtype of all the pages in the range, look for any conflicts
  * - In case of no conflicts, set the new memtype for pages in the range
@@ -213,6 +215,9 @@ static int reserve_ram_pages_type(u64 start, u64 end, unsigned long req_type,
 	struct page *page;
 	u64 pfn;
 
+	if (req_type == _PAGE_CACHE_WT)
+		return -EINVAL;
+
 	for (pfn = (start >> PAGE_SHIFT); pfn < (end >> PAGE_SHIFT); ++pfn) {
 		unsigned long type;
 
@@ -254,6 +259,7 @@ static int free_ram_pages_type(u64 start, u64 end)
  * req_type typically has one of the:
  * - _PAGE_CACHE_WB
  * - _PAGE_CACHE_WC
+ * - _PAGE_CACHE_WT
  * - _PAGE_CACHE_UC_MINUS
  *
  * If new_type is NULL, function will return an error if it cannot reserve the
@@ -274,10 +280,10 @@ int reserve_memtype(u64 start, u64 end, unsigned long req_type,
 	if (!pat_enabled) {
 		/* This is identical to page table setting without PAT */
 		if (new_type) {
-			if (req_type == _PAGE_CACHE_WC)
-				*new_type = _PAGE_CACHE_UC_MINUS;
+			if (req_type == _PAGE_CACHE_WB)
+				*new_type = _PAGE_CACHE_WB;
 			else
-				*new_type = req_type & _PAGE_CACHE_MASK;
+				*new_type = _PAGE_CACHE_UC_MINUS;
 		}
 		return 0;
 	}
diff --git a/arch/x86/mm/pat_internal.h b/arch/x86/mm/pat_internal.h
index 2593d40..7ae6b37 100644
--- a/arch/x86/mm/pat_internal.h
+++ b/arch/x86/mm/pat_internal.h
@@ -20,6 +20,7 @@ static inline char *cattr_name(unsigned long flags)
 	case _PAGE_CACHE_UC_MINUS:	return "uncached-minus";
 	case _PAGE_CACHE_WB:		return "write-back";
 	case _PAGE_CACHE_WC:		return "write-combining";
+	case _PAGE_CACHE_WT:		return "write-through";
 	default:			return "broken";
 	}
 }

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC PATCH 4/11] x86, mm, asm-gen: Add ioremap_wt() for WT mapping
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 19:34   ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 19:34 UTC (permalink / raw
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

This patch introduces ioremap_wt() for creating WT maps on x86.
It follows the same model as ioremap_wc() for multi-architecture
support.  ARCH_HAS_IOREMAP_WT is defined in x86's io.h to indicate
that ioremap_wt() is implemented on x86.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
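Note for readers: the multi-architecture model mentioned above is a
preprocessor fallback.  The userspace sketch below mimics it; struct
mapping, enum map_kind and the function bodies are invented stand-ins
that only record the request, and they do not reflect how the kernel
implements ioremap.

```c
#include <stddef.h>

enum map_kind { MAP_UNCACHED, MAP_WRITE_THROUGH };

struct mapping {
	unsigned long phys;
	size_t size;
	enum map_kind kind;
};

/* Userspace stand-in: records the request instead of touching page tables. */
struct mapping ioremap_nocache(unsigned long phys, size_t size)
{
	struct mapping m = { phys, size, MAP_UNCACHED };
	return m;
}

#ifdef ARCH_HAS_IOREMAP_WT
/* An architecture with a native WT mapping supplies its own version. */
struct mapping ioremap_wt(unsigned long phys, size_t size)
{
	struct mapping m = { phys, size, MAP_WRITE_THROUGH };
	return m;
}
#else
/* Generic fallback: WT requests become uncached mappings. */
#define ioremap_wt ioremap_nocache
#endif
```

Built without ARCH_HAS_IOREMAP_WT, a call to ioremap_wt() in this sketch
resolves to ioremap_nocache(), so callers can use the WT interface
unconditionally and still get a safe mapping on other architectures.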
 arch/x86/include/asm/io.h   |    2 ++
 arch/x86/mm/ioremap.c       |   23 +++++++++++++++++++++++
 include/asm-generic/io.h    |    4 ++++
 include/asm-generic/iomap.h |    4 ++++
 4 files changed, 33 insertions(+)

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index b8237d8..646e367 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -35,6 +35,7 @@
   */
 
 #define ARCH_HAS_IOREMAP_WC
+#define ARCH_HAS_IOREMAP_WT
 
 #include <linux/string.h>
 #include <linux/compiler.h>
@@ -316,6 +317,7 @@ extern void unxlate_dev_mem_ptr(unsigned long phys, void *addr);
 extern int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 				unsigned long prot_val);
 extern void __iomem *ioremap_wc(resource_size_t offset, unsigned long size);
+extern void __iomem *ioremap_wt(resource_size_t offset, unsigned long size);
 
 extern bool is_early_ioremap_ptep(pte_t *ptep);
 
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 282829f..d3dab0b 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -149,6 +149,9 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 	case _PAGE_CACHE_WC:
 		prot = PAGE_KERNEL_IO_WC;
 		break;
+	case _PAGE_CACHE_WT:
+		prot = PAGE_KERNEL_IO_WT;
+		break;
 	case _PAGE_CACHE_WB:
 		prot = PAGE_KERNEL_IO;
 		break;
@@ -243,6 +246,26 @@ void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 }
 EXPORT_SYMBOL(ioremap_wc);
 
+/**
+ * ioremap_wt	-	map memory into CPU space write through
+ * @phys_addr:	bus address of the memory
+ * @size:	size of the resource to map
+ *
+ * This version of ioremap ensures that the memory is marked write through.
+ * Write through provides cached reads and uncached writes.
+ *
+ * Must be freed with iounmap.
+ */
+void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
+{
+	if (pat_enabled)
+		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WT,
+					__builtin_return_address(0));
+	else
+		return ioremap_nocache(phys_addr, size);
+}
+EXPORT_SYMBOL(ioremap_wt);
+
 void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
 {
 	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_WB,
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index 975e1cc..03e31a7 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -322,6 +322,10 @@ static inline void __iomem *ioremap(phys_addr_t offset, unsigned long size)
 #define ioremap_wc ioremap_nocache
 #endif
 
+#ifndef ioremap_wt
+#define ioremap_wt ioremap_nocache
+#endif
+
 static inline void iounmap(void __iomem *addr)
 {
 }
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 1b41011..d8f8622 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -66,6 +66,10 @@ extern void ioport_unmap(void __iomem *);
 #define ioremap_wc ioremap_nocache
 #endif
 
+#ifndef ARCH_HAS_IOREMAP_WT
+#define ioremap_wt ioremap_nocache
+#endif
+
 #ifdef CONFIG_PCI
 /* Destroy a virtual mapping cookie for a PCI BAR (memory or IO) */
 struct pci_dev;

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC PATCH 5/11] x86, mm: Add set_memory[_array]_wt() for setting WT
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 19:34   ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 19:34 UTC (permalink / raw
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

This patch introduces set_memory_wt(), set_memory_array_wt() and
set_pages_array_wt() for setting the memory attribute of a given
range to WT.

Note that reserve_memtype() only supports tracking of WT for
non-RAM ranges at this point.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
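Note for readers: set_memory_wt() reserves the memtype first and frees
it again if the attribute change fails.  The userspace sketch below
models only that control flow; the three stub functions, the counter
and the fail_attr_change knob are hypothetical and exist just to make
the unwinding observable.

```c
#include <errno.h>
#include <stdbool.h>

int reserved_ranges;	/* number of outstanding reservations */
bool fail_attr_change;	/* knob to force the second step to fail */

int reserve_range_stub(void)
{
	reserved_ranges++;
	return 0;
}

void free_range_stub(void)
{
	reserved_ranges--;
}

int change_attr_stub(void)
{
	return fail_attr_change ? -EINVAL : 0;
}

/* Same shape as set_memory_wt(): reserve, change, unwind on failure. */
int set_range_wt_model(void)
{
	int ret;

	ret = reserve_range_stub();
	if (ret)
		goto out_err;

	ret = change_attr_stub();
	if (ret)
		goto out_free;

	return 0;

out_free:
	free_range_stub();
out_err:
	return ret;
}
```

With the knob set, the model returns the error and leaves no
reservation behind; with it cleared, the reservation stays in place
until the caller releases it.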
 arch/x86/include/asm/cacheflush.h |    6 ++++-
 arch/x86/mm/pageattr.c            |   43 +++++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index c80a3a1..050504b 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -69,7 +69,7 @@ static inline void set_page_memtype(struct page *pg, unsigned long memtype) { }
 /*
  * The set_memory_* API can be used to change various attributes of a virtual
  * address range. The attributes include:
- * Cachability   : UnCached, WriteCombining, WriteBack
+ * Cachability   : UnCached, WriteCombining, WriteThrough, WriteBack
  * Executability : eXeutable, NoteXecutable
  * Read/Write    : ReadOnly, ReadWrite
  * Presence      : NotPresent
@@ -96,9 +96,11 @@ static inline void set_page_memtype(struct page *pg, unsigned long memtype) { }
 
 int _set_memory_uc(unsigned long addr, int numpages);
 int _set_memory_wc(unsigned long addr, int numpages);
+int _set_memory_wt(unsigned long addr, int numpages);
 int _set_memory_wb(unsigned long addr, int numpages);
 int set_memory_uc(unsigned long addr, int numpages);
 int set_memory_wc(unsigned long addr, int numpages);
+int set_memory_wt(unsigned long addr, int numpages);
 int set_memory_wb(unsigned long addr, int numpages);
 int set_memory_x(unsigned long addr, int numpages);
 int set_memory_nx(unsigned long addr, int numpages);
@@ -109,10 +111,12 @@ int set_memory_4k(unsigned long addr, int numpages);
 
 int set_memory_array_uc(unsigned long *addr, int addrinarray);
 int set_memory_array_wc(unsigned long *addr, int addrinarray);
+int set_memory_array_wt(unsigned long *addr, int addrinarray);
 int set_memory_array_wb(unsigned long *addr, int addrinarray);
 
 int set_pages_array_uc(struct page **pages, int addrinarray);
 int set_pages_array_wc(struct page **pages, int addrinarray);
+int set_pages_array_wt(struct page **pages, int addrinarray);
 int set_pages_array_wb(struct page **pages, int addrinarray);
 
 /*
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index ae242a7..a2a1e70 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1562,6 +1562,43 @@ out_err:
 }
 EXPORT_SYMBOL(set_memory_wc);
 
+int set_memory_array_wt(unsigned long *addr, int addrinarray)
+{
+	return _set_memory_array(addr, addrinarray, _PAGE_CACHE_WT);
+}
+EXPORT_SYMBOL(set_memory_array_wt);
+
+int _set_memory_wt(unsigned long addr, int numpages)
+{
+	return change_page_attr_set(&addr, numpages,
+				    __pgprot(_PAGE_CACHE_WT), 0);
+}
+
+int set_memory_wt(unsigned long addr, int numpages)
+{
+	int ret;
+
+	if (!pat_enabled)
+		return set_memory_uc(addr, numpages);
+
+	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
+		_PAGE_CACHE_WT, NULL);
+	if (ret)
+		goto out_err;
+
+	ret = _set_memory_wt(addr, numpages);
+	if (ret)
+		goto out_free;
+
+	return 0;
+
+out_free:
+	free_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE);
+out_err:
+	return ret;
+}
+EXPORT_SYMBOL(set_memory_wt);
+
 int _set_memory_wb(unsigned long addr, int numpages)
 {
 	return change_page_attr_clear(&addr, numpages,
@@ -1699,6 +1736,12 @@ int set_pages_array_wc(struct page **pages, int addrinarray)
 }
 EXPORT_SYMBOL(set_pages_array_wc);
 
+int set_pages_array_wt(struct page **pages, int addrinarray)
+{
+	return _set_pages_array(pages, addrinarray, _PAGE_CACHE_WT);
+}
+EXPORT_SYMBOL(set_pages_array_wt);
+
 int set_pages_wb(struct page *page, int numpages)
 {
 	unsigned long addr = (unsigned long)page_address(page);

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC PATCH 6/11] x86, mm, pat: Add pgprot_writethrough() for WT
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 19:34   ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 19:34 UTC (permalink / raw
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

This patch introduces pgprot_writethrough() for setting the WT type
for a given pgprot_t.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
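Note for readers: a userspace model of the bit manipulation follows.
It assumes WT lives in PAT slot 3 (PCD and PWT both set), as the cover
letter describes for PA3/7, and that UC- is slot 2 (PCD only).  The
macro and function names are illustrative, not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PWT (UINT64_C(1) << 3)	/* page-level write-through bit */
#define PTE_PCD (UINT64_C(1) << 4)	/* page-level cache-disable bit */

/* Assumed encodings: UC- in PAT slot 2, WT in PAT slot 3. */
#define CACHE_UC_MINUS PTE_PCD
#define CACHE_WT       (PTE_PCD | PTE_PWT)

/* Models pgprot_writethrough(): WT with PAT, uncached (UC-) without. */
uint64_t prot_writethrough_model(uint64_t prot, bool pat_enabled)
{
	return prot | (pat_enabled ? CACHE_WT : CACHE_UC_MINUS);
}
```

Because the value is only ORed in, the result is as expected only when
the incoming prot has no cache bits set, which is the usual case for a
default (WB) protection value.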
 arch/x86/include/asm/pgtable_types.h |    3 +++
 arch/x86/mm/pat.c                    |    9 +++++++++
 include/asm-generic/pgtable.h        |    4 ++++
 3 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 7b905cb..1fe8af7 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -343,6 +343,9 @@ extern int nx_enabled;
 #define pgprot_writecombine	pgprot_writecombine
 extern pgprot_t pgprot_writecombine(pgprot_t prot);
 
+#define pgprot_writethrough	pgprot_writethrough
+extern pgprot_t pgprot_writethrough(pgprot_t prot);
+
 /* Indicate that x86 has its own track and untrack pfn vma functions */
 #define __HAVE_PFNMAP_TRACKING
 
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 8a8be17..a987071 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -796,6 +796,15 @@ pgprot_t pgprot_writecombine(pgprot_t prot)
 }
 EXPORT_SYMBOL_GPL(pgprot_writecombine);
 
+pgprot_t pgprot_writethrough(pgprot_t prot)
+{
+	if (pat_enabled)
+		return __pgprot(pgprot_val(prot) | _PAGE_CACHE_WT);
+	else
+		return pgprot_noncached(prot);
+}
+EXPORT_SYMBOL_GPL(pgprot_writethrough);
+
 #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_X86_PAT)
 
 static struct memtype *memtype_get_idx(loff_t pos)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 53b2acc..1af0ed9 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -249,6 +249,10 @@ static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
 #define pgprot_writecombine pgprot_noncached
 #endif
 
+#ifndef pgprot_writethrough
+#define pgprot_writethrough pgprot_noncached
+#endif
+
 /*
  * When walking page tables, get the address of the next boundary,
  * or the end address of the range if that comes earlier.  Although no

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC PATCH 7/11] x86, mm: Keep _set_memory_<type>() slot-independent
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 19:34   ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 19:34 UTC (permalink / raw
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

The _set_memory_<type>() interfaces assume how each memory type is
assigned to PAT slots in the PAT MSR.  For instance, _set_memory_wb()
assumes that WB is assigned to the PA0/4 slot by calling
change_page_attr_clear().

This patch changes the _set_memory_<type>() interfaces to call
change_page_attr_set_clr() directly for all memory types, and keep
them independent from the PAT slot assignment.

It also introduces pgprot_set_cache() for setting a specified page
cache value to a pgprot_t value.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
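Note for readers: the difference between setting bits only and
clearing the cache field before setting can be shown with plain
integers.  The encodings below (WB = 0, UC- = PCD, WT = PCD|PWT) are
assumptions taken from the slot layout in the cover letter, and both
helper names are made up for this sketch.

```c
#include <stdint.h>

#define PTE_PWT    (UINT64_C(1) << 3)
#define PTE_PCD    (UINT64_C(1) << 4)
#define CACHE_MASK (PTE_PCD | PTE_PWT)

/* Assumed slot encodings for this series. */
#define CACHE_WB       UINT64_C(0)
#define CACHE_UC_MINUS PTE_PCD
#define CACHE_WT       (PTE_PCD | PTE_PWT)

/* Old style: OR in the new type and trust that no stale bits are present. */
uint64_t attr_set_only(uint64_t prot, uint64_t set)
{
	return prot | set;
}

/* New style: clear the whole cache field first, then set the new type. */
uint64_t attr_set_clr(uint64_t prot, uint64_t set, uint64_t clr)
{
	return (prot & ~clr) | set;
}
```

Starting from a WT page, attr_set_only() with UC- leaves the page WT
because the PWT bit is never cleared, while attr_set_clr() with
CACHE_MASK yields UC-.  Setting WB through attr_set_clr() also works
without relying on WB being encoded as zero.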
 arch/x86/mm/pageattr.c |   36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index a2a1e70..da597d0 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1444,8 +1444,10 @@ int _set_memory_uc(unsigned long addr, int numpages)
 	/*
 	 * for now UC MINUS. see comments in ioremap_nocache()
 	 */
-	return change_page_attr_set(&addr, numpages,
-				    __pgprot(_PAGE_CACHE_UC_MINUS), 0);
+	return change_page_attr_set_clr(&addr, numpages,
+					__pgprot(_PAGE_CACHE_UC_MINUS),
+					__pgprot(_PAGE_CACHE_MASK),
+					0, 0, NULL);
 }
 
 int set_memory_uc(unsigned long addr, int numpages)
@@ -1489,8 +1491,10 @@ static int _set_memory_array(unsigned long *addr, int addrinarray,
 			goto out_free;
 	}
 
-	ret = change_page_attr_set(addr, addrinarray,
-				    __pgprot(_PAGE_CACHE_UC_MINUS), 1);
+	ret = change_page_attr_set_clr(addr, addrinarray,
+				       __pgprot(_PAGE_CACHE_UC_MINUS),
+				       __pgprot(_PAGE_CACHE_MASK),
+				       0, CPA_ARRAY, NULL);
 
 	if (!ret && new_type == _PAGE_CACHE_WC)
 		ret = change_page_attr_set_clr(addr, addrinarray,
@@ -1526,8 +1530,10 @@ int _set_memory_wc(unsigned long addr, int numpages)
 	int ret;
 	unsigned long addr_copy = addr;
 
-	ret = change_page_attr_set(&addr, numpages,
-				    __pgprot(_PAGE_CACHE_UC_MINUS), 0);
+	ret = change_page_attr_set_clr(&addr, numpages,
+				       __pgprot(_PAGE_CACHE_UC_MINUS),
+				       __pgprot(_PAGE_CACHE_MASK),
+				       0, 0, NULL);
 	if (!ret) {
 		ret = change_page_attr_set_clr(&addr_copy, numpages,
 					       __pgprot(_PAGE_CACHE_WC),
@@ -1570,8 +1576,10 @@ EXPORT_SYMBOL(set_memory_array_wt);
 
 int _set_memory_wt(unsigned long addr, int numpages)
 {
-	return change_page_attr_set(&addr, numpages,
-				    __pgprot(_PAGE_CACHE_WT), 0);
+	return change_page_attr_set_clr(&addr, numpages,
+					__pgprot(_PAGE_CACHE_WT),
+					__pgprot(_PAGE_CACHE_MASK),
+					0, 0, NULL);
 }
 
 int set_memory_wt(unsigned long addr, int numpages)
@@ -1601,8 +1609,10 @@ EXPORT_SYMBOL(set_memory_wt);
 
 int _set_memory_wb(unsigned long addr, int numpages)
 {
-	return change_page_attr_clear(&addr, numpages,
-				      __pgprot(_PAGE_CACHE_MASK), 0);
+	return change_page_attr_set_clr(&addr, numpages,
+					__pgprot(_PAGE_CACHE_WB),
+					__pgprot(_PAGE_CACHE_MASK),
+					0, 0, NULL);
 }
 
 int set_memory_wb(unsigned long addr, int numpages)
@@ -1623,8 +1633,10 @@ int set_memory_array_wb(unsigned long *addr, int addrinarray)
 	int i;
 	int ret;
 
-	ret = change_page_attr_clear(addr, addrinarray,
-				      __pgprot(_PAGE_CACHE_MASK), 1);
+	ret = change_page_attr_set_clr(addr, addrinarray,
+				       __pgprot(_PAGE_CACHE_WB),
+				       __pgprot(_PAGE_CACHE_MASK),
+				       0, CPA_ARRAY, NULL);
 	if (ret)
 		return ret;
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC PATCH 7/11] x86, mm: Keep _set_memory_<type>() slot-independent
@ 2014-07-15 19:34   ` Toshi Kani
  0 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 19:34 UTC (permalink / raw
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

The _set_memory_<type>() interfaces assume how each memory type is
assigned to PAT slots in the PAT MSR.  For instance, _set_memory_wb()
assumes that WB is assigned to the PA0/4 slot by calling
change_page_attr_clear().

This patch changes the _set_memory_<type>() interfaces to call
change_page_attr_set_clr() directly for all memory types, and keep
them independent from the PAT slot assignment.

It also introduces pgprot_set_cache() for setting a specified page
cache value to a pgprot_t value.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/mm/pageattr.c |   36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index a2a1e70..da597d0 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1444,8 +1444,10 @@ int _set_memory_uc(unsigned long addr, int numpages)
 	/*
 	 * for now UC MINUS. see comments in ioremap_nocache()
 	 */
-	return change_page_attr_set(&addr, numpages,
-				    __pgprot(_PAGE_CACHE_UC_MINUS), 0);
+	return change_page_attr_set_clr(&addr, numpages,
+					__pgprot(_PAGE_CACHE_UC_MINUS),
+					__pgprot(_PAGE_CACHE_MASK),
+					0, 0, NULL);
 }
 
 int set_memory_uc(unsigned long addr, int numpages)
@@ -1489,8 +1491,10 @@ static int _set_memory_array(unsigned long *addr, int addrinarray,
 			goto out_free;
 	}
 
-	ret = change_page_attr_set(addr, addrinarray,
-				    __pgprot(_PAGE_CACHE_UC_MINUS), 1);
+	ret = change_page_attr_set_clr(addr, addrinarray,
+				       __pgprot(_PAGE_CACHE_UC_MINUS),
+				       __pgprot(_PAGE_CACHE_MASK),
+				       0, CPA_ARRAY, NULL);
 
 	if (!ret && new_type == _PAGE_CACHE_WC)
 		ret = change_page_attr_set_clr(addr, addrinarray,
@@ -1526,8 +1530,10 @@ int _set_memory_wc(unsigned long addr, int numpages)
 	int ret;
 	unsigned long addr_copy = addr;
 
-	ret = change_page_attr_set(&addr, numpages,
-				    __pgprot(_PAGE_CACHE_UC_MINUS), 0);
+	ret = change_page_attr_set_clr(&addr, numpages,
+				       __pgprot(_PAGE_CACHE_UC_MINUS),
+				       __pgprot(_PAGE_CACHE_MASK),
+				       0, 0, NULL);
 	if (!ret) {
 		ret = change_page_attr_set_clr(&addr_copy, numpages,
 					       __pgprot(_PAGE_CACHE_WC),
@@ -1570,8 +1576,10 @@ EXPORT_SYMBOL(set_memory_array_wt);
 
 int _set_memory_wt(unsigned long addr, int numpages)
 {
-	return change_page_attr_set(&addr, numpages,
-				    __pgprot(_PAGE_CACHE_WT), 0);
+	return change_page_attr_set_clr(&addr, numpages,
+					__pgprot(_PAGE_CACHE_WT),
+					__pgprot(_PAGE_CACHE_MASK),
+					0, 0, NULL);
 }
 
 int set_memory_wt(unsigned long addr, int numpages)
@@ -1601,8 +1609,10 @@ EXPORT_SYMBOL(set_memory_wt);
 
 int _set_memory_wb(unsigned long addr, int numpages)
 {
-	return change_page_attr_clear(&addr, numpages,
-				      __pgprot(_PAGE_CACHE_MASK), 0);
+	return change_page_attr_set_clr(&addr, numpages,
+					__pgprot(_PAGE_CACHE_WB),
+					__pgprot(_PAGE_CACHE_MASK),
+					0, 0, NULL);
 }
 
 int set_memory_wb(unsigned long addr, int numpages)
@@ -1623,8 +1633,10 @@ int set_memory_array_wb(unsigned long *addr, int addrinarray)
 	int i;
 	int ret;
 
-	ret = change_page_attr_clear(addr, addrinarray,
-				      __pgprot(_PAGE_CACHE_MASK), 1);
+	ret = change_page_attr_set_clr(addr, addrinarray,
+				       __pgprot(_PAGE_CACHE_WB),
+				       __pgprot(_PAGE_CACHE_MASK),
+				       0, CPA_ARRAY, NULL);
 	if (ret)
 		return ret;
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread
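
The set-plus-clear idea in patch 7/11 above can be modelled in user space.
This is only a sketch under stated assumptions: set_clr() and
set_wb_clear_only() are hypothetical stand-ins for the kernel helpers, the
PTE is a plain integer, and "shuffled_slots" is a made-up assignment that
exists purely to show what breaks when WB is not encoded as zero.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions of the x86 PTE cache-control flags. */
#define PAGE_PWT        (UINT64_C(1) << 3)
#define PAGE_PCD        (UINT64_C(1) << 4)
#define PAGE_CACHE_MASK (PAGE_PCD | PAGE_PWT)

/* Hypothetical container: how each memory type is encoded in PCD/PWT. */
struct cache_encoding {
	uint64_t wb, wc, uc_minus, wt;
};

/* Assignment from the cover letter: PA0 WB, PA1 WC, PA2 UC-, PA3 WT. */
static const struct cache_encoding patchset_slots = {
	.wb = 0, .wc = PAGE_PWT, .uc_minus = PAGE_PCD,
	.wt = PAGE_PCD | PAGE_PWT,
};

/* Made-up assignment with WB moved out of slot 0 (illustration only). */
static const struct cache_encoding shuffled_slots = {
	.wb = PAGE_PCD | PAGE_PWT, .wc = PAGE_PWT, .uc_minus = PAGE_PCD,
	.wt = 0,
};

/* Old _set_memory_wb() behaviour: clear the field and assume that is WB. */
static uint64_t set_wb_clear_only(uint64_t pte)
{
	return pte & ~PAGE_CACHE_MASK;
}

/* Model of the set+clear call: drop the old type, install the new one. */
static uint64_t set_clr(uint64_t pte, uint64_t set, uint64_t clr)
{
	assert((set & ~clr) == 0);	/* new type must fit in the cleared field */
	return (pte & ~clr) | set;
}
```

With patchset_slots both helpers agree, because WB happens to be zero.
With shuffled_slots only set_clr(pte, shuffled_slots.wb, PAGE_CACHE_MASK)
produces a WB mapping, which is the property the patch is after.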

* [RFC PATCH 8/11] x86, mm, pat: Keep pgprot_<type>() slot-independent
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 19:34   ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 19:34 UTC (permalink / raw
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

The pgprot_<type>() interfaces only set the _PAGE_PCD and/or
_PAGE_PWT bits by assuming that a given pgprot_t value is always
set to the PA0 slot.

This patch changes the pgprot_<type>() interfaces to ensure that
a requested memory type is set to the given pgprot_t regardless
of the original pgprot_t value.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/pgtable.h       |    2 +-
 arch/x86/include/asm/pgtable_types.h |    4 ++++
 arch/x86/mm/pat.c                    |    4 ++--
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 0ec0560..df18b14 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -11,7 +11,7 @@
  */
 #define pgprot_noncached(prot)					\
 	((boot_cpu_data.x86 > 3)				\
-	 ? (__pgprot(pgprot_val(prot) | _PAGE_CACHE_UC_MINUS))	\
+	 ? pgprot_set_cache(prot, _PAGE_CACHE_UC_MINUS)		\
 	 : (prot))
 
 #ifndef __ASSEMBLY__
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 1fe8af7..81a3859 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -136,6 +136,10 @@
 #define _PAGE_CACHE_UC_MINUS	(_PAGE_PCD)
 #define _PAGE_CACHE_UC		(_PAGE_CACHE_UC_MINUS)
 
+/* Macro to set a page cache value */
+#define pgprot_set_cache(_prot, _type)					\
+	__pgprot((pgprot_val(_prot) & ~_PAGE_CACHE_MASK) | (_type))
+
 #define PAGE_NONE	__pgprot(_PAGE_PROTNONE | _PAGE_ACCESSED)
 #define PAGE_SHARED	__pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | \
 				 _PAGE_ACCESSED | _PAGE_NX)
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index a987071..0be7ebd 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -790,7 +790,7 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 pgprot_t pgprot_writecombine(pgprot_t prot)
 {
 	if (pat_enabled)
-		return __pgprot(pgprot_val(prot) | _PAGE_CACHE_WC);
+		return pgprot_set_cache(prot, _PAGE_CACHE_WC);
 	else
 		return pgprot_noncached(prot);
 }
@@ -799,7 +799,7 @@ EXPORT_SYMBOL_GPL(pgprot_writecombine);
 pgprot_t pgprot_writethrough(pgprot_t prot)
 {
 	if (pat_enabled)
-		return __pgprot(pgprot_val(prot) | _PAGE_CACHE_WT);
+		return pgprot_set_cache(prot, _PAGE_CACHE_WT);
 	else
 		return pgprot_noncached(prot);
 }

^ permalink raw reply related	[flat|nested] 74+ messages in thread
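
To see why patch 8/11 above replaces the plain OR with a mask-and-set, here
is a small user-space sketch.  It assumes the encodings used by this
patchset (WC = PWT, UC- = PCD, WT = PCD|PWT); prot_t, or_cache() and
set_cache() are hypothetical names that mimic pgprot_t, the old OR
expression and pgprot_set_cache(), not the kernel definitions themselves.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PWT            (UINT64_C(1) << 3)
#define PAGE_PCD            (UINT64_C(1) << 4)
#define PAGE_CACHE_MASK     (PAGE_PCD | PAGE_PWT)
#define PAGE_CACHE_WC       PAGE_PWT
#define PAGE_CACHE_UC_MINUS PAGE_PCD
#define PAGE_CACHE_WT       (PAGE_PCD | PAGE_PWT)

/* Stand-in for the kernel's pgprot_t wrapper type. */
typedef struct { uint64_t val; } prot_t;

/* Old style: OR the type in; correct only if the field was empty (PA0). */
static prot_t or_cache(prot_t prot, uint64_t type)
{
	prot.val |= type;
	return prot;
}

/* New style: clear the cache field first, then install the type. */
static prot_t set_cache(prot_t prot, uint64_t type)
{
	assert((type & ~PAGE_CACHE_MASK) == 0);	/* only PCD/PWT may be passed */
	prot.val = (prot.val & ~PAGE_CACHE_MASK) | type;
	return prot;
}
```

Starting from a WC mapping and asking for UC-, or_cache() leaves both PCD
and PWT set, which under the new slot assignment selects WT; set_cache()
yields UC- and leaves every non-cache bit untouched.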

* [RFC PATCH 9/11] x86, efi: Cleanup PCD bit manipulation in EFI
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 19:34   ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 19:34 UTC (permalink / raw
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

This patch cleans up the PCD bit manipulation in EFI virtual mapping,
and uses _PAGE_CACHE_<type> macros, instead.  This keeps the efi code
independent from the PAT slot assignment.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/platform/efi/efi_64.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 290d397..55c6e77 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -202,10 +202,10 @@ void efi_cleanup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 static void __init __map_region(efi_memory_desc_t *md, u64 va)
 {
 	pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
-	unsigned long pf = 0;
+	unsigned long pf = _PAGE_CACHE_WB;
 
 	if (!(md->attribute & EFI_MEMORY_WB))
-		pf |= _PAGE_PCD;
+		pf = _PAGE_CACHE_UC_MINUS;
 
 	if (kernel_map_pages_in_pgd(pgd, md->phys_addr, va, md->num_pages, pf))
 		pr_warn("Error mapping PA 0x%llx -> VA 0x%llx!\n",

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [RFC PATCH 10/11] x86, xen: Cleanup PWT/PCD bit manipulation in Xen
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 19:34   ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 19:34 UTC (permalink / raw
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

This patch cleans up the PWT & PCD bit manipulation for the kernel
memory types in Xen, and uses _PAGE_CACHE_<type> macros, instead.
This keeps the Xen code independent from the PAT slot assignment.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/xen/enlighten.c |    2 +-
 arch/x86/xen/mmu.c       |    8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index ffb101e..1917bef 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1557,7 +1557,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
 #if 0
 	if (!xen_initial_domain())
 #endif
-		__supported_pte_mask &= ~(_PAGE_PWT | _PAGE_PCD);
+		__supported_pte_mask &= ~_PAGE_CACHE_MASK;
 
 	__supported_pte_mask |= _PAGE_IOMAP;
 
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index e8a1201..8ef154a 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -438,7 +438,7 @@ __visible pteval_t xen_pte_val(pte_t pte)
 	/* If this is a WC pte, convert back from Xen WC to Linux WC */
 	if ((pteval & (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)) == _PAGE_PAT) {
 		WARN_ON(!pat_enabled);
-		pteval = (pteval & ~_PAGE_PAT) | _PAGE_PWT;
+		pteval = (pteval & ~_PAGE_PAT) | _PAGE_CACHE_WC;
 	}
 #endif
 	if (xen_initial_domain() && (pteval & _PAGE_IOMAP))
@@ -465,11 +465,11 @@ PV_CALLEE_SAVE_REGS_THUNK(xen_pgd_val);
  * 0                     WB       WB     WB
  * 1            PWT      WC       WT     WT
  * 2        PCD          UC-      UC-    UC-
- * 3        PCD PWT      UC       UC     UC
+ * 3        PCD PWT      WT       UC     UC
  * 4    PAT              WB       WC     WB
  * 5    PAT     PWT      WC       WP     WT
  * 6    PAT PCD          UC-      rsv    UC-
- * 7    PAT PCD PWT      UC       rsv    UC
+ * 7    PAT PCD PWT      WT       rsv    UC
  */
 
 void xen_set_pat(u64 pat)
@@ -492,7 +492,7 @@ __visible pte_t xen_make_pte(pteval_t pte)
 	 * but we could see hugetlbfs mappings, I think.).
 	 */
 	if (pat_enabled && !WARN_ON(pte & _PAGE_PAT)) {
-		if ((pte & (_PAGE_PCD | _PAGE_PWT)) == _PAGE_PWT)
+		if ((pte & _PAGE_CACHE_MASK) == _PAGE_CACHE_WC)
 			pte = (pte & ~(_PAGE_PCD | _PAGE_PWT)) | _PAGE_PAT;
 	}
 #endif

^ permalink raw reply related	[flat|nested] 74+ messages in thread
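
The WC translation touched by patch 10/11 above can be checked as a round
trip.  The sketch below is a user-space model, not the Xen code:
linux_to_xen() and xen_to_linux() are hypothetical names for the logic in
xen_make_pte() and xen_pte_val(), the pat_enabled test is left out, and the
kernel's WARN_ON() is replaced by an assert().  It assumes Linux WC is the
PWT-only encoding and Xen WC is the PAT-only encoding, as in the table in
the diff.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PWT        (UINT64_C(1) << 3)
#define PAGE_PCD        (UINT64_C(1) << 4)
#define PAGE_PAT        (UINT64_C(1) << 7)	/* PAT bit of a 4K PTE */
#define PAGE_CACHE_MASK (PAGE_PCD | PAGE_PWT)
#define PAGE_CACHE_WC   PAGE_PWT

/* Linux -> Xen: move a WC request from slot 1 (PWT) to slot 4 (PAT). */
static uint64_t linux_to_xen(uint64_t pte)
{
	assert(!(pte & PAGE_PAT));	/* Linux leaves the PAT bit unused */
	if ((pte & PAGE_CACHE_MASK) == PAGE_CACHE_WC)
		pte = (pte & ~PAGE_CACHE_MASK) | PAGE_PAT;
	return pte;
}

/* Xen -> Linux: a PAT-only entry is Xen WC, so report it as Linux WC. */
static uint64_t xen_to_linux(uint64_t pte)
{
	if ((pte & (PAGE_PAT | PAGE_CACHE_MASK)) == PAGE_PAT)
		pte = (pte & ~PAGE_PAT) | PAGE_CACHE_WC;
	return pte;
}
```

Comparing against PAGE_CACHE_WC instead of a literal PWT bit means only
the macro has to change if WC ever moves to another slot.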

* [RFC PATCH 11/11] x86, fbdev: Cleanup PWT/PCD bit manipulation in fbdev
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 19:34   ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 19:34 UTC (permalink / raw
  To: hpa, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp,
	Toshi Kani

This patch cleans up the PWT & PCD bit manipulation in fbdev,
and uses _PAGE_CACHE_<type> macros, instead.  This keeps the
fbdev code independent from the PAT slot assignment.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/fb.h                 |    3 ++-
 drivers/video/fbdev/gbefb.c               |    3 ++-
 drivers/video/fbdev/vermilion/vermilion.c |    4 ++--
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fb.h b/arch/x86/include/asm/fb.h
index 2519d06..05fa937 100644
--- a/arch/x86/include/asm/fb.h
+++ b/arch/x86/include/asm/fb.h
@@ -9,7 +9,8 @@ static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
 				unsigned long off)
 {
 	if (boot_cpu_data.x86 > 3)
-		pgprot_val(vma->vm_page_prot) |= _PAGE_PCD;
+		vma->vm_page_prot = pgprot_set_cache(vma->vm_page_prot,
+						     _PAGE_CACHE_UC_MINUS);
 }
 
 extern int fb_is_primary_device(struct fb_info *info);
diff --git a/drivers/video/fbdev/gbefb.c b/drivers/video/fbdev/gbefb.c
index 4aa56ba..4af9ec7 100644
--- a/drivers/video/fbdev/gbefb.c
+++ b/drivers/video/fbdev/gbefb.c
@@ -54,7 +54,8 @@ struct gbefb_par {
 #endif
 #endif
 #ifdef CONFIG_X86
-#define pgprot_fb(_prot) ((_prot) | _PAGE_PCD)
+/* NOTE: use _PAGE_CACHE_WT if desired */
+#define pgprot_fb(_prot) (((_prot) & ~_PAGE_CACHE_MASK) | _PAGE_CACHE_UC_MINUS)
 #endif
 
 /*
diff --git a/drivers/video/fbdev/vermilion/vermilion.c b/drivers/video/fbdev/vermilion/vermilion.c
index 048a666..6a7c744 100644
--- a/drivers/video/fbdev/vermilion/vermilion.c
+++ b/drivers/video/fbdev/vermilion/vermilion.c
@@ -1009,8 +1009,8 @@ static int vmlfb_mmap(struct fb_info *info, struct vm_area_struct *vma)
 	if (ret)
 		return -EINVAL;
 
-	pgprot_val(vma->vm_page_prot) |= _PAGE_PCD;
-	pgprot_val(vma->vm_page_prot) &= ~_PAGE_PWT;
+	vma->vm_page_prot = pgprot_set_cache(vma->vm_page_prot,
+					     _PAGE_CACHE_UC_MINUS);
 
 	return vm_iomap_memory(vma, vinfo->vram_start,
 			vinfo->vram_contig_size);

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 19:53   ` Andy Lutomirski
  -1 siblings, 0 replies; 74+ messages in thread
From: Andy Lutomirski @ 2014-07-15 19:53 UTC (permalink / raw
  To: Toshi Kani
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On Tue, Jul 15, 2014 at 12:34 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> This RFC patchset is aimed to seek comments/suggestions for the design
> and changes to support of Write-Through (WT) mapping.  The study below
> shows that using WT mapping may be useful for non-volatile memory.
>
>   http://www.hpl.hp.com/techreports/2012/HPL-2012-236.pdf
>
> There were idea & patches to support WT in the past, which stimulated
> very valuable discussions on this topic.
>
>   https://lkml.org/lkml/2013/4/24/424
>   https://lkml.org/lkml/2013/10/27/70
>   https://lkml.org/lkml/2013/11/3/72
>
> This RFC patchset tries to address the issues raised by taking the
> following design approach:
>
>  - Keep the MTRR interface
>  - Keep the WB, WC, and UC- slots in the PAT MSR
>  - Keep the PAT bit unused
>  - Reassign the UC slot to WT in the PAT MSR
>
> There are 4 usable slots in the PAT MSR, which are currently assigned to:
>
>   PA0/4: WB, PA1/5: WC, PA2/6: UC-, PA3/7: UC
>
> The PAT bit is unused since it shares the same bit as the PSE bit and
> there was a bug in older processors.  Among the 4 slots, the uncached
> memory type consumes 2 slots, UC- and UC.  They are functionally
> equivalent, but UC- allows MTRRs to overwrite it with WC.  All interfaces
> that set the uncached memory type use UC- in order to work with MTRRs.
> The PA3/7 slot is effectively unused today.  Therefore, this patchset
> reassigns the PA3/7 slot to WT.  If MTRRs get deprecated in future,
> UC- can be reassigned to UC, and there is still no need to consume
> 2 slots for the uncached memory type.

Note that MTRRs are already partially deprecated: all drivers *should*
be using arch_phys_wc_add, not mtrr_add, and arch_phys_wc_add is a
no-op on systems with working PAT.

Unfortunately, I never finished excising mtrr_add.  Finishing the job
wouldn't be very hard.

--Andy

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-15 19:34   ` Toshi Kani
@ 2014-07-15 19:56     ` Andy Lutomirski
  -1 siblings, 0 replies; 74+ messages in thread
From: Andy Lutomirski @ 2014-07-15 19:56 UTC (permalink / raw
  To: Toshi Kani
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On Tue, Jul 15, 2014 at 12:34 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> This patch changes reserve_memtype() to handle the new WT type.
> When (!pat_enabled && new_type), it continues to set either WB
> or UC- to *new_type.  When pat_enabled, it can reserve a given
> non-RAM range for WT.  At this point, it may not reserve a RAM
> range for WT since reserve_ram_pages_type() uses the page flags
> limited to three memory types, WB, WC and UC.

FWIW, last time I looked at this, it seemed like all the fancy
reserve_ram_pages stuff was unnecessary: shouldn't the RAM type be
easy to track in the direct map page tables?

--Andy

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-15 19:34 ` Toshi Kani
@ 2014-07-15 20:09   ` H. Peter Anvin
  -1 siblings, 0 replies; 74+ messages in thread
From: H. Peter Anvin @ 2014-07-15 20:09 UTC (permalink / raw
  To: Toshi Kani, tglx, mingo, akpm, arnd, konrad.wilk, plagnioj,
	tomi.valkeinen
  Cc: linux-mm, linux-kernel, stefan.bader, luto, airlied, bp

On 07/15/2014 12:34 PM, Toshi Kani wrote:
> This RFC patchset is aimed to seek comments/suggestions for the design
> and changes to support of Write-Through (WT) mapping.  The study below
> shows that using WT mapping may be useful for non-volatile memory.
> 
>   http://www.hpl.hp.com/techreports/2012/HPL-2012-236.pdf
> 
> There were idea & patches to support WT in the past, which stimulated
> very valuable discussions on this topic.
> 
>   https://lkml.org/lkml/2013/4/24/424
>   https://lkml.org/lkml/2013/10/27/70
>   https://lkml.org/lkml/2013/11/3/72
> 
> This RFC patchset tries to address the issues raised by taking the
> following design approach:
> 
>  - Keep the MTRR interface
>  - Keep the WB, WC, and UC- slots in the PAT MSR
>  - Keep the PAT bit unused
>  - Reassign the UC slot to WT in the PAT MSR
> 
> There are 4 usable slots in the PAT MSR, which are currently assigned to:
> 
>   PA0/4: WB, PA1/5: WC, PA2/6: UC-, PA3/7: UC
> 
> The PAT bit is unused since it shares the same bit as the PSE bit and
> there was a bug in older processors.  Among the 4 slots, the uncached
> memory type consumes 2 slots, UC- and UC.  They are functionally
> equivalent, but UC- allows MTRRs to overwrite it with WC.  All interfaces
> that set the uncached memory type use UC- in order to work with MTRRs.
> The PA3/7 slot is effectively unused today.  Therefore, this patchset
> reassigns the PA3/7 slot to WT.  If MTRRs get deprecated in future,
> UC- can be reassigned to UC, and there is still no need to consume
> 2 slots for the uncached memory type.

Not going to happen any time in the foreseeable future.

Furthermore, I don't think it is a big deal if on some old, buggy
processors we take the performance hit of cache type demotion, as long
as we don't actively lose data.

> This patchset is consist of two parts.  The 1st part, patch [1/11] to
> [6/11], enables WT mapping and adds new interfaces for setting WT mapping.
> The 2nd part, patch [7/11] to [11/11], cleans up the code that has
> internal knowledge of the PAT slot assignment.  This keeps the kernel
> code independent from the PAT slot assignment.

I have given this piece of feedback at least three times now, possibly
to different people, and I'm getting a bit grumpy about it:

We already have an issue with Xen, because Xen assigned mappings
differently and it is incompatible with the use of PAT in Linux.  As a
result we get requests for hacks to work around this, which is something
I really don't want to see.  I would like to see a design involving a
"reverse PAT" table where the kernel can hold the mapping between memory
types and page table encodings (including the two different ones for
small and large pages.)
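
To make the "reverse PAT" idea concrete, here is a minimal sketch, not
code from this patchset or from the kernel.  It assumes only the
architectural facts: eight 3-bit PAT slots, the PWT/PCD/PAT selector
bits, and the PAT bit living at bit 7 in 4K entries but bit 12 in large
entries.  Every identifier (cache_mode, reverse_pat, cache_mode_to_pte,
...) is hypothetical.  The table is built from whatever PAT MSR value is
in effect, so callers ask for a memory type and never hard-code a slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural cache-control bits in x86 page-table entries. */
#define PTE_PWT       (1ULL << 3)
#define PTE_PCD       (1ULL << 4)
#define PTE_PAT_4K    (1ULL << 7)   /* PAT bit in a 4K PTE */
#define PTE_PAT_LARGE (1ULL << 12)  /* PAT bit in a 2M/1G entry (bit 7 is PSE there) */

/* Hypothetical kernel-internal cache modes, independent of PAT slots. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC, CM_WT, CM_WP, CM_NUM };

struct reverse_pat {
	int slot[CM_NUM];	/* lowest PAT slot holding each mode, or -1 */
};

/* Translate a 3-bit PAT memory-type encoding into a cache mode. */
int pat_type_to_mode(unsigned int type)
{
	switch (type) {
	case 0: return CM_UC;
	case 1: return CM_WC;
	case 4: return CM_WT;
	case 5: return CM_WP;
	case 6: return CM_WB;
	case 7: return CM_UC_MINUS;
	default: return -1;	/* 2 and 3 are reserved encodings */
	}
}

/* Build the reverse table from the PAT MSR value currently in effect. */
void reverse_pat_init(struct reverse_pat *r, uint64_t pat_msr)
{
	assert(r != NULL);
	for (int m = 0; m < CM_NUM; m++)
		r->slot[m] = -1;
	for (int i = 0; i < 8; i++) {
		int m = pat_type_to_mode((unsigned int)((pat_msr >> (8 * i)) & 0x7));
		if (m >= 0 && r->slot[m] < 0)
			r->slot[m] = i;	/* the lowest-numbered slot wins */
	}
}

/* Return 0 and the PTE bits for a mode, or -1 if no slot provides it. */
int cache_mode_to_pte(const struct reverse_pat *r, enum cache_mode m,
		      int large, uint64_t *bits)
{
	int s = r->slot[m];

	if (s < 0)
		return -1;
	*bits = ((s & 1) ? PTE_PWT : 0) |
		((s & 2) ? PTE_PCD : 0) |
		((s & 4) ? (large ? PTE_PAT_LARGE : PTE_PAT_4K) : 0);
	return 0;
}
```

With the layout quoted above (PA0/4: WB, PA1/5: WC, PA2/6: UC-, PA3/7: UC,
i.e. an MSR value of 0x0007010600070106), a request for WC yields PWT and
a request for WT fails.  With a different layout that puts WC in slot 4,
the same request yields the PAT bit instead, at bit 7 for small pages and
bit 12 for large ones, without any change in the caller.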

	-hpa


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-15 19:53   ` Andy Lutomirski
@ 2014-07-15 20:10     ` H. Peter Anvin
  -1 siblings, 0 replies; 74+ messages in thread
From: H. Peter Anvin @ 2014-07-15 20:10 UTC (permalink / raw
  To: Andy Lutomirski, Toshi Kani
  Cc: Thomas Gleixner, Ingo Molnar, Andrew Morton, Arnd Bergmann,
	Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On 07/15/2014 12:53 PM, Andy Lutomirski wrote:
> 
> Note that MTRRs are already partially deprecated: all drivers *should*
> be using arch_phys_wc_add, not mtrr_add, and arch_phys_wc_add is a
> no-op on systems with working PAT.
> 
> Unfortunately, I never finished excising mtrr_add.  Finishing the job
> wouldn't be very hard.
> 

The use of MTRRs in drivers is separate from the MTRR global setup done
by the firmware, though.

	-hpa



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-15 20:09   ` H. Peter Anvin
@ 2014-07-15 21:23     ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 21:23 UTC (permalink / raw
  To: H. Peter Anvin
  Cc: tglx, mingo, akpm, arnd, konrad.wilk, plagnioj, tomi.valkeinen,
	linux-mm, linux-kernel, stefan.bader, luto, airlied, bp

On Tue, 2014-07-15 at 13:09 -0700, H. Peter Anvin wrote:
> On 07/15/2014 12:34 PM, Toshi Kani wrote:
> > This RFC patchset is aimed to seek comments/suggestions for the design
> > and changes to support of Write-Through (WT) mapping.  The study below
> > shows that using WT mapping may be useful for non-volatile memory.
> > 
> >   http://www.hpl.hp.com/techreports/2012/HPL-2012-236.pdf
> > 
> > There were idea & patches to support WT in the past, which stimulated
> > very valuable discussions on this topic.
> > 
> >   https://lkml.org/lkml/2013/4/24/424
> >   https://lkml.org/lkml/2013/10/27/70
> >   https://lkml.org/lkml/2013/11/3/72
> > 
> > This RFC patchset tries to address the issues raised by taking the
> > following design approach:
> > 
> >  - Keep the MTRR interface
> >  - Keep the WB, WC, and UC- slots in the PAT MSR
> >  - Keep the PAT bit unused
> >  - Reassign the UC slot to WT in the PAT MSR
> > 
> > There are 4 usable slots in the PAT MSR, which are currently assigned to:
> > 
> >   PA0/4: WB, PA1/5: WC, PA2/6: UC-, PA3/7: UC
> > 
> > The PAT bit is unused since it shares the same bit as the PSE bit and
> > there was a bug in older processors.  Among the 4 slots, the uncached
> > memory type consumes 2 slots, UC- and UC.  They are functionally
> > equivalent, but UC- allows MTRRs to overwrite it with WC.  All interfaces
> > that set the uncached memory type use UC- in order to work with MTRRs.
> > The PA3/7 slot is effectively unused today.  Therefore, this patchset
> > reassigns the PA3/7 slot to WT.  If MTRRs get deprecated in future,
> > UC- can be reassigned to UC, and there is still no need to consume
> > 2 slots for the uncached memory type.
> 
> Not going to happen any time in the forseeable future.
> 
> Furthermore, I don't think it is a big deal if on some old, buggy
> processors we take the performance hit of cache type demotion, as long
> as we don't actively lose data.
> 
> > This patchset is consist of two parts.  The 1st part, patch [1/11] to
> > [6/11], enables WT mapping and adds new interfaces for setting WT mapping.
> > The 2nd part, patch [7/11] to [11/11], cleans up the code that has
> > internal knowledge of the PAT slot assignment.  This keeps the kernel
> > code independent from the PAT slot assignment.
> 
> I have given this piece of feedback at least three times now, possibly
> to different people, and I'm getting a bit grumpy about it:
> 
> We already have an issue with Xen, because Xen assigned mappings
> differently and it is incompatible with the use of PAT in Linux.  As a
> result we get requests for hacks to work around this, which is something
> I really don't want to see.  I would like to see a design involving a
> "reverse PAT" table where the kernel can hold the mapping between memory
> types and page table encodings (including the two different ones for
> small and large pages.)

Thanks for pointing this out! (And sorry for making you repeat it three
times...)  I was not aware of the issue with Xen.  I will look into the
email archive to see what the Xen issue is and how it can be addressed.

Thanks,
-Toshi



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-15 19:56     ` Andy Lutomirski
@ 2014-07-15 23:10       ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 23:10 UTC (permalink / raw
  To: Andy Lutomirski
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On Tue, 2014-07-15 at 12:56 -0700, Andy Lutomirski wrote:
> On Tue, Jul 15, 2014 at 12:34 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> > This patch changes reserve_memtype() to handle the new WT type.
> > When (!pat_enabled && new_type), it continues to set either WB
> > or UC- to *new_type.  When pat_enabled, it can reserve a given
> > non-RAM range for WT.  At this point, it may not reserve a RAM
> > range for WT since reserve_ram_pages_type() uses the page flags
> > limited to three memory types, WB, WC and UC.
> 
> FWIW, last time I looked at this, it seemed like all the fancy
> reserve_ram_pages stuff was unnecessary: shouldn't the RAM type be
> easy to track in the direct map page tables?

Are you referring to the direct map page tables as the kernel page
directory tables (pgd/pud/..)?

I think it needs to keep track of the memory type per physical memory
range, not per translation, in order to prevent aliasing of the memory
type.
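
A minimal sketch of per-range tracking, under stated assumptions: a small
flat array stands in for whatever lookup structure the kernel really
uses, and the names memtype_range and memtype_reserve are hypothetical.
The point it illustrates is that the check is keyed on the physical
range, so a second mapping of the same memory with a different type is
refused no matter which virtual address it would use.

```c
#include <assert.h>
#include <stdint.h>

enum memtype { MT_WB, MT_WC, MT_UC_MINUS, MT_WT };

struct memtype_range {
	uint64_t start, end;	/* physical range [start, end) */
	enum memtype type;
};

#define MAX_RANGES 16
struct memtype_range ranges[MAX_RANGES];
int nr_ranges;

/*
 * Reserve [start, end) with the given type.  Returns -1 if the range is
 * empty, the table is full, or any overlapping reservation has a
 * different type (which would create an alias); returns 0 otherwise.
 */
int memtype_reserve(uint64_t start, uint64_t end, enum memtype type)
{
	if (start >= end || nr_ranges == MAX_RANGES)
		return -1;
	for (int i = 0; i < nr_ranges; i++) {
		const struct memtype_range *r = &ranges[i];

		if (start < r->end && r->start < end && r->type != type)
			return -1;
	}
	ranges[nr_ranges++] = (struct memtype_range){ start, end, type };
	return 0;
}
```

Overlapping reservations of the same type are accepted here; only a type
mismatch on shared physical addresses is rejected.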

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-15 23:10       ` Toshi Kani
@ 2014-07-15 23:36         ` Andy Lutomirski
  -1 siblings, 0 replies; 74+ messages in thread
From: Andy Lutomirski @ 2014-07-15 23:36 UTC (permalink / raw
  To: Toshi Kani
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On Tue, Jul 15, 2014 at 4:10 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> On Tue, 2014-07-15 at 12:56 -0700, Andy Lutomirski wrote:
>> On Tue, Jul 15, 2014 at 12:34 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>> > This patch changes reserve_memtype() to handle the new WT type.
>> > When (!pat_enabled && new_type), it continues to set either WB
>> > or UC- to *new_type.  When pat_enabled, it can reserve a given
>> > non-RAM range for WT.  At this point, it may not reserve a RAM
>> > range for WT since reserve_ram_pages_type() uses the page flags
>> > limited to three memory types, WB, WC and UC.
>>
>> FWIW, last time I looked at this, it seemed like all the fancy
>> reserve_ram_pages stuff was unnecessary: shouldn't the RAM type be
>> easy to track in the direct map page tables?
>
> Are you referring the direct map page tables as the kernel page
> directory tables (pgd/pud/..)?
>
> I think it needs to be able to keep track of the memory type per a
> physical memory range, not per a translation, in order to prevent
> aliasing of the memory type.

Actual RAM (the lowmem kind, which is all of it on x86_64) is mapped
linearly somewhere in kernel address space.  The pagetables for that
mapping could be used as the canonical source of the memory type for
the ram range in question.

This only works for lowmem, so maybe it's not a good idea to rely on it.

--Andy

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-15 23:36         ` Andy Lutomirski
@ 2014-07-15 23:46           ` H. Peter Anvin
  -1 siblings, 0 replies; 74+ messages in thread
From: H. Peter Anvin @ 2014-07-15 23:46 UTC (permalink / raw
  To: Andy Lutomirski, Toshi Kani
  Cc: Thomas Gleixner, Ingo Molnar, Andrew Morton, Arnd Bergmann,
	Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On 07/15/2014 04:36 PM, Andy Lutomirski wrote:
> On Tue, Jul 15, 2014 at 4:10 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>> On Tue, 2014-07-15 at 12:56 -0700, Andy Lutomirski wrote:
>>> On Tue, Jul 15, 2014 at 12:34 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>>>> This patch changes reserve_memtype() to handle the new WT type.
>>>> When (!pat_enabled && new_type), it continues to set either WB
>>>> or UC- to *new_type.  When pat_enabled, it can reserve a given
>>>> non-RAM range for WT.  At this point, it may not reserve a RAM
>>>> range for WT since reserve_ram_pages_type() uses the page flags
>>>> limited to three memory types, WB, WC and UC.
>>>
>>> FWIW, last time I looked at this, it seemed like all the fancy
>>> reserve_ram_pages stuff was unnecessary: shouldn't the RAM type be
>>> easy to track in the direct map page tables?
>>
>> Are you referring the direct map page tables as the kernel page
>> directory tables (pgd/pud/..)?
>>
>> I think it needs to be able to keep track of the memory type per a
>> physical memory range, not per a translation, in order to prevent
>> aliasing of the memory type.
> 
> Actual RAM (the lowmem kind, which is all of it on x86_64) is mapped
> linearly somewhere in kernel address space.  The pagetables for that
> mapping could be used as the canonical source of the memory type for
> the ram range in question.
> 
> This only works for lowmem, so maybe it's not a good idea to rely on it.
> 

We could do that, but would it be better?

	-hpa



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-15 23:36         ` Andy Lutomirski
@ 2014-07-15 23:53           ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-15 23:53 UTC (permalink / raw
  To: Andy Lutomirski
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On Tue, 2014-07-15 at 16:36 -0700, Andy Lutomirski wrote:
> On Tue, Jul 15, 2014 at 4:10 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> > On Tue, 2014-07-15 at 12:56 -0700, Andy Lutomirski wrote:
> >> On Tue, Jul 15, 2014 at 12:34 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> >> > This patch changes reserve_memtype() to handle the new WT type.
> >> > When (!pat_enabled && new_type), it continues to set either WB
> >> > or UC- to *new_type.  When pat_enabled, it can reserve a given
> >> > non-RAM range for WT.  At this point, it may not reserve a RAM
> >> > range for WT since reserve_ram_pages_type() uses the page flags
> >> > limited to three memory types, WB, WC and UC.
> >>
> >> FWIW, last time I looked at this, it seemed like all the fancy
> >> reserve_ram_pages stuff was unnecessary: shouldn't the RAM type be
> >> easy to track in the direct map page tables?
> >
> > Are you referring the direct map page tables as the kernel page
> > directory tables (pgd/pud/..)?
> >
> > I think it needs to be able to keep track of the memory type per a
> > physical memory range, not per a translation, in order to prevent
> > aliasing of the memory type.
> 
> Actual RAM (the lowmem kind, which is all of it on x86_64) is mapped
> linearly somewhere in kernel address space.  The pagetables for that
> mapping could be used as the canonical source of the memory type for
> the ram range in question.
>
> This only works for lowmem, so maybe it's not a good idea to rely on it.

Right.

I think using the struct page table for the RAM ranges is a good way to
save memory, but I wonder how often RAM ranges are mapped as anything
other than WB...  If not often, reserve_memtype() could simply call
rbt_memtype_check_insert() for all ranges, including RAM.

In this patch, I kept using reserve_ram_pages_type() since I do not see
much reason to use WT for RAM, either.
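
To illustrate why a per-page flag field limits the RAM types, here is a
rough sketch.  It assumes a two-bit field in which one value means "not
reserved", which leaves room for exactly three types.  The encoding
values, struct page_sketch and reserve_ram_page_type() are all
hypothetical and are not the kernel's definitions.

```c
#include <assert.h>

enum memtype { MT_WB, MT_WC, MT_UC, MT_WT };

/* Hypothetical two-bit field stored in the low bits of the page flags. */
#define PGMT_MASK 0x3UL
#define PGMT_NONE 0x0UL		/* page has no reserved memory type */

struct page_sketch {
	unsigned long flags;
};

/* Map a memory type to its two-bit encoding, or -1 if none is left. */
int page_memtype_encode(enum memtype t)
{
	switch (t) {
	case MT_WC: return 1;
	case MT_UC: return 2;
	case MT_WB: return 3;
	default:    return -1;	/* WT: all four encodings are already taken */
	}
}

/* Reserve a type for one RAM page; fails if unencodable or already set. */
int reserve_ram_page_type(struct page_sketch *pg, enum memtype t)
{
	int enc = page_memtype_encode(t);

	if (enc < 0)
		return -1;
	if ((pg->flags & PGMT_MASK) != PGMT_NONE)
		return -1;
	pg->flags = (pg->flags & ~PGMT_MASK) | (unsigned long)enc;
	return 0;
}
```

Supporting WT for RAM in such a scheme would need either a third flag bit
or a different tracking structure, which is the trade-off discussed
above.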

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-15 23:46           ` H. Peter Anvin
@ 2014-07-15 23:54             ` Andy Lutomirski
  -1 siblings, 0 replies; 74+ messages in thread
From: Andy Lutomirski @ 2014-07-15 23:54 UTC (permalink / raw
  To: H. Peter Anvin
  Cc: Toshi Kani, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On Tue, Jul 15, 2014 at 4:46 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 07/15/2014 04:36 PM, Andy Lutomirski wrote:
>> On Tue, Jul 15, 2014 at 4:10 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>>> On Tue, 2014-07-15 at 12:56 -0700, Andy Lutomirski wrote:
>>>> On Tue, Jul 15, 2014 at 12:34 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>>>>> This patch changes reserve_memtype() to handle the new WT type.
>>>>> When (!pat_enabled && new_type), it continues to set either WB
>>>>> or UC- to *new_type.  When pat_enabled, it can reserve a given
>>>>> non-RAM range for WT.  At this point, it may not reserve a RAM
>>>>> range for WT since reserve_ram_pages_type() uses the page flags
>>>>> limited to three memory types, WB, WC and UC.
>>>>
>>>> FWIW, last time I looked at this, it seemed like all the fancy
>>>> reserve_ram_pages stuff was unnecessary: shouldn't the RAM type be
>>>> easy to track in the direct map page tables?
>>>
>>> Are you referring the direct map page tables as the kernel page
>>> directory tables (pgd/pud/..)?
>>>
>>> I think it needs to be able to keep track of the memory type per a
>>> physical memory range, not per a translation, in order to prevent
>>> aliasing of the memory type.
>>
>> Actual RAM (the lowmem kind, which is all of it on x86_64) is mapped
>> linearly somewhere in kernel address space.  The pagetables for that
>> mapping could be used as the canonical source of the memory type for
>> the ram range in question.
>>
>> This only works for lowmem, so maybe it's not a good idea to rely on it.
>>
>
> We could do that, but would it be better?

From vague memory, the current mechanism for tracking RAM memtypes (as
opposed to memtypes for everything that isn't RAM) is limited to a
very small number of types, leading to oddities like not being able to
create WT ram with this patchset.

Using the pagetables directly would be simpler (no extra data
structure) and would automatically exactly track the set of memtypes
that can fit in the pagetable structures.

--Andy
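
The limitation described above can be illustrated with a small sketch.
It assumes the RAM tracking uses two page-flag bits, which matches the
"three memory types" wording in the patch description: three types plus
a "not reserved" state use up all four encodings, so WT has nowhere to
go.  All names and bit values below are hypothetical and simplified;
they are not the kernel's actual definitions.

```c
#include <assert.h>

/* Hypothetical names; a simplified model, not the kernel's definitions. */
enum memtype { MT_WB, MT_WC, MT_UC_MINUS, MT_WT };

#define PGMT_MASK	0x3ul	/* assumed: two flag bits per page */
#define PGMT_FREE	0x0ul	/* page not reserved */
#define PGMT_WC		0x1ul
#define PGMT_UC_MINUS	0x2ul
#define PGMT_WB		0x3ul

/* Record a memory type in the flag word; -1 if the type has no encoding. */
static int page_set_memtype(unsigned long *flags, enum memtype mt)
{
	unsigned long enc;

	assert(flags);
	switch (mt) {
	case MT_WC:
		enc = PGMT_WC;
		break;
	case MT_UC_MINUS:
		enc = PGMT_UC_MINUS;
		break;
	case MT_WB:
		enc = PGMT_WB;
		break;
	default:
		return -1;	/* MT_WT: all four encodings are taken */
	}
	/* Replace only the two memtype bits, keep the other flags. */
	*flags = (*flags & ~PGMT_MASK) | enc;
	return 0;
}

/* Read the memory type back; -1 means the page is not reserved. */
static int page_get_memtype(unsigned long flags)
{
	switch (flags & PGMT_MASK) {
	case PGMT_WC:
		return MT_WC;
	case PGMT_UC_MINUS:
		return MT_UC_MINUS;
	case PGMT_WB:
		return MT_WB;
	default:
		return -1;
	}
}
```

Supporting WT in such a scheme would need either a third flag bit or a
different tracking structure, which is the trade-off being discussed.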

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-15 23:54             ` Andy Lutomirski
@ 2014-07-15 23:59               ` H. Peter Anvin
  -1 siblings, 0 replies; 74+ messages in thread
From: H. Peter Anvin @ 2014-07-15 23:59 UTC (permalink / raw
  To: Andy Lutomirski
  Cc: Toshi Kani, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On 07/15/2014 04:54 PM, Andy Lutomirski wrote:
> 
> From vague memory, the current mechanism for tracking RAM memtypes (as
> opposed to memtypes for everything that isn't RAM) is limited to a
> very small number of types, leading to oddities like not being able to
> create WT ram with this patchset.
> 
> Using the pagetables directly would be simpler (no extra data
> structure) and would automatically exactly track the set of memtypes
> that can fit in the pagetable structures.
> 

I don't think there is anything fundamental, though.  The number of
types had more to do with what there was demand for.  I will look into it.

	-hpa



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-15 23:53           ` Toshi Kani
@ 2014-07-16  0:05             ` H. Peter Anvin
  -1 siblings, 0 replies; 74+ messages in thread
From: H. Peter Anvin @ 2014-07-16  0:05 UTC (permalink / raw
  To: Toshi Kani, Andy Lutomirski
  Cc: Thomas Gleixner, Ingo Molnar, Andrew Morton, Arnd Bergmann,
	Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On 07/15/2014 04:53 PM, Toshi Kani wrote:
> 
> Right.
> 
> I think using struct page table for the RAM ranges is a good way for
> saving memory, but I wonder how often the RAM ranges are mapped other
> than WB...  If not often, reserve_memtype() could simply call
> rbt_memtype_check_insert() for all ranges, including RAM.
> 
> In this patch, I left using reserve_ram_pages_type() since I do not see
> much reason to use WT for RAM, either.
> 

They get flipped to WC or WT or even UC for some I/O devices, but
ultimately the number of ranges is pretty small.

	-hpa
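
As a rough sketch of the range-based alternative mentioned in the quoted
text (tracking every reservation, RAM included, by physical range), the
code below keeps reservations in a small fixed array and rejects an
overlapping request that asks for a different memory type.  The names
are hypothetical, and the linear array stands in for the rbtree that
rbt_memtype_check_insert() uses; this is only meant to show the conflict
check, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only. */
enum memtype { MT_WB, MT_WC, MT_UC_MINUS, MT_WT };

struct memtype_range {
	uint64_t start;		/* inclusive */
	uint64_t end;		/* exclusive */
	enum memtype type;
};

#define MAX_RANGES 16

struct memtype_table {
	struct memtype_range r[MAX_RANGES];
	size_t n;
};

/*
 * Reserve [start, end) with the given type.
 * Returns 0 on success, -1 if an overlapping range has a different
 * type (an aliasing conflict), -2 if the table is full.
 */
static int memtype_reserve(struct memtype_table *t, uint64_t start,
			   uint64_t end, enum memtype type)
{
	size_t i;

	assert(t && start < end);
	for (i = 0; i < t->n; i++) {
		const struct memtype_range *r = &t->r[i];

		/* Two half-open ranges overlap if each starts before the other ends. */
		if (start < r->end && r->start < end && r->type != type)
			return -1;
	}
	if (t->n == MAX_RANGES)
		return -2;

	t->r[t->n].start = start;
	t->r[t->n].end = end;
	t->r[t->n].type = type;
	t->n++;
	return 0;
}
```

With only a handful of non-WB ranges, as noted above, even a simple
structure like this stays small; the type set is then limited by the
enum rather than by the number of spare page-flag bits.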



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-15 23:53           ` Toshi Kani
@ 2014-07-16  0:28             ` Andy Lutomirski
  -1 siblings, 0 replies; 74+ messages in thread
From: Andy Lutomirski @ 2014-07-16  0:28 UTC (permalink / raw
  To: Toshi Kani
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On Tue, Jul 15, 2014 at 4:53 PM, Toshi Kani <toshi.kani@hp.com> wrote:
> On Tue, 2014-07-15 at 16:36 -0700, Andy Lutomirski wrote:
>> On Tue, Jul 15, 2014 at 4:10 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>> > On Tue, 2014-07-15 at 12:56 -0700, Andy Lutomirski wrote:
>> >> On Tue, Jul 15, 2014 at 12:34 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>> >> > This patch changes reserve_memtype() to handle the new WT type.
>> >> > When (!pat_enabled && new_type), it continues to set either WB
>> >> > or UC- to *new_type.  When pat_enabled, it can reserve a given
>> >> > non-RAM range for WT.  At this point, it may not reserve a RAM
>> >> > range for WT since reserve_ram_pages_type() uses the page flags
>> >> > limited to three memory types, WB, WC and UC.
>> >>
>> >> FWIW, last time I looked at this, it seemed like all the fancy
>> >> reserve_ram_pages stuff was unnecessary: shouldn't the RAM type be
>> >> easy to track in the direct map page tables?
>> >
>> > Are you referring the direct map page tables as the kernel page
>> > directory tables (pgd/pud/..)?
>> >
>> > I think it needs to be able to keep track of the memory type per a
>> > physical memory range, not per a translation, in order to prevent
>> > aliasing of the memory type.
>>
>> Actual RAM (the lowmem kind, which is all of it on x86_64) is mapped
>> linearly somewhere in kernel address space.  The pagetables for that
>> mapping could be used as the canonical source of the memory type for
>> the ram range in question.
>>
>> This only works for lowmem, so maybe it's not a good idea to rely on it.
>
> Right.
>
> I think using struct page table for the RAM ranges is a good way for
> saving memory, but I wonder how often the RAM ranges are mapped other
> than WB...  If not often, reserve_memtype() could simply call
> rbt_memtype_check_insert() for all ranges, including RAM.
>
> In this patch, I left using reserve_ram_pages_type() since I do not see
> much reason to use WT for RAM, either.

I hereby predict that someone, some day, will build a system with
nonvolatile "RAM", and someone will want this feature.  Just saying :)

More realistically, someone might want to write a silly driver that
lets programs mmap some WT memory for testing.

--Andy

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-16  0:28             ` Andy Lutomirski
@ 2014-07-16  0:31               ` H. Peter Anvin
  -1 siblings, 0 replies; 74+ messages in thread
From: H. Peter Anvin @ 2014-07-16  0:31 UTC (permalink / raw
  To: Andy Lutomirski, Toshi Kani
  Cc: Thomas Gleixner, Ingo Molnar, Andrew Morton, Arnd Bergmann,
	Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

It already happened...

On July 15, 2014 5:28:40 PM PDT, Andy Lutomirski <luto@amacapital.net> wrote:
>On Tue, Jul 15, 2014 at 4:53 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>> On Tue, 2014-07-15 at 16:36 -0700, Andy Lutomirski wrote:
>>> On Tue, Jul 15, 2014 at 4:10 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>>> > On Tue, 2014-07-15 at 12:56 -0700, Andy Lutomirski wrote:
>>> >> On Tue, Jul 15, 2014 at 12:34 PM, Toshi Kani <toshi.kani@hp.com> wrote:
>>> >> > This patch changes reserve_memtype() to handle the new WT type.
>>> >> > When (!pat_enabled && new_type), it continues to set either WB
>>> >> > or UC- to *new_type.  When pat_enabled, it can reserve a given
>>> >> > non-RAM range for WT.  At this point, it may not reserve a RAM
>>> >> > range for WT since reserve_ram_pages_type() uses the page flags
>>> >> > limited to three memory types, WB, WC and UC.
>>> >>
>>> >> FWIW, last time I looked at this, it seemed like all the fancy
>>> >> reserve_ram_pages stuff was unnecessary: shouldn't the RAM type be
>>> >> easy to track in the direct map page tables?
>>> >
>>> > Are you referring the direct map page tables as the kernel page
>>> > directory tables (pgd/pud/..)?
>>> >
>>> > I think it needs to be able to keep track of the memory type per a
>>> > physical memory range, not per a translation, in order to prevent
>>> > aliasing of the memory type.
>>>
>>> Actual RAM (the lowmem kind, which is all of it on x86_64) is mapped
>>> linearly somewhere in kernel address space.  The pagetables for that
>>> mapping could be used as the canonical source of the memory type for
>>> the ram range in question.
>>>
>>> This only works for lowmem, so maybe it's not a good idea to rely on it.
>>
>> Right.
>>
>> I think using struct page table for the RAM ranges is a good way for
>> saving memory, but I wonder how often the RAM ranges are mapped other
>> than WB...  If not often, reserve_memtype() could simply call
>> rbt_memtype_check_insert() for all ranges, including RAM.
>>
>> In this patch, I left using reserve_ram_pages_type() since I do not see
>> much reason to use WT for RAM, either.
>
>I hereby predict that someone, some day, will build a system with
>nonvolatile "RAM", and someone will want this feature.  Just saying :)
>
>More realistically, someone might want to write a silly driver that
>lets programs mmap some WT memory for testing.
>
>--Andy

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-15 21:23     ` Toshi Kani
@ 2014-07-16  0:40       ` Konrad Rzeszutek Wilk
  -1 siblings, 0 replies; 74+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-16  0:40 UTC (permalink / raw
  To: Toshi Kani, H. Peter Anvin
  Cc: tglx, mingo, akpm, arnd, plagnioj, tomi.valkeinen, linux-mm,
	linux-kernel, stefan.bader, luto, airlied, bp

On July 15, 2014 5:23:24 PM EDT, Toshi Kani <toshi.kani@hp.com> wrote:
>On Tue, 2014-07-15 at 13:09 -0700, H. Peter Anvin wrote:
>> On 07/15/2014 12:34 PM, Toshi Kani wrote:
>> > This RFC patchset is aimed to seek comments/suggestions for the design
>> > and changes to support of Write-Through (WT) mapping.  The study below
>> > shows that using WT mapping may be useful for non-volatile memory.
>> >
>> >   http://www.hpl.hp.com/techreports/2012/HPL-2012-236.pdf
>> >
>> > There were idea & patches to support WT in the past, which stimulated
>> > very valuable discussions on this topic.
>> >
>> >   https://lkml.org/lkml/2013/4/24/424
>> >   https://lkml.org/lkml/2013/10/27/70
>> >   https://lkml.org/lkml/2013/11/3/72
>> >
>> > This RFC patchset tries to address the issues raised by taking the
>> > following design approach:
>> >
>> >  - Keep the MTRR interface
>> >  - Keep the WB, WC, and UC- slots in the PAT MSR
>> >  - Keep the PAT bit unused
>> >  - Reassign the UC slot to WT in the PAT MSR
>> >
>> > There are 4 usable slots in the PAT MSR, which are currently assigned to:
>> >
>> >   PA0/4: WB, PA1/5: WC, PA2/6: UC-, PA3/7: UC
>> >
>> > The PAT bit is unused since it shares the same bit as the PSE bit and
>> > there was a bug in older processors.  Among the 4 slots, the uncached
>> > memory type consumes 2 slots, UC- and UC.  They are functionally
>> > equivalent, but UC- allows MTRRs to overwrite it with WC.  All interfaces
>> > that set the uncached memory type use UC- in order to work with MTRRs.
>> > The PA3/7 slot is effectively unused today.  Therefore, this patchset
>> > reassigns the PA3/7 slot to WT.  If MTRRs get deprecated in future,
>> > UC- can be reassigned to UC, and there is still no need to consume
>> > 2 slots for the uncached memory type.
>>
>> Not going to happen any time in the forseeable future.
>>
>> Furthermore, I don't think it is a big deal if on some old, buggy
>> processors we take the performance hit of cache type demotion, as long
>> as we don't actively lose data.
>>
>> > This patchset is consist of two parts.  The 1st part, patch [1/11] to
>> > [6/11], enables WT mapping and adds new interfaces for setting WT mapping.
>> > The 2nd part, patch [7/11] to [11/11], cleans up the code that has
>> > internal knowledge of the PAT slot assignment.  This keeps the kernel
>> > code independent from the PAT slot assignment.
>>
>> I have given this piece of feedback at least three times now, possibly
>> to different people, and I'm getting a bit grumpy about it:
>>
>> We already have an issue with Xen, because Xen assigned mappings
>> differently and it is incompatible with the use of PAT in Linux.  As a
>> result we get requests for hacks to work around this, which is something
>> I really don't want to see.  I would like to see a design involving a
>> "reverse PAT" table where the kernel can hold the mapping between memory
>> types and page table encodings (including the two different ones for
>> small and large pages.)
>
>Thanks for pointing this out! (And sorry for making you repeat it three
>time...)  I was not aware of the issue with Xen.  I will look into the
>email archive to see what the Xen issue is, and how it can be
>addressed.

https://lkml.org/lkml/2011/11/8/406
>
>Thanks,
>-Toshi
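
For reference, the slot assignment quoted above can be turned into a PAT
MSR value with a few lines of C.  The sketch below is illustrative only:
the function and enum names are made up, while the memory-type encodings
(UC=0, WC=1, WT=4, WP=5, WB=6, UC-=7) are the architectural ones.  Each
of the eight PAT entries occupies one byte of the 64-bit MSR, with PA0
in the lowest byte.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural memory-type encodings used in PAT entries. */
enum pat_type {
	PAT_UC       = 0x00,
	PAT_WC       = 0x01,
	PAT_WT       = 0x04,
	PAT_WP       = 0x05,
	PAT_WB       = 0x06,
	PAT_UC_MINUS = 0x07,
};

/* Pack eight PAT entries (slot[0] = PA0 ... slot[7] = PA7) into an MSR value. */
static uint64_t pat_msr_value(const enum pat_type slot[8])
{
	uint64_t v = 0;
	int i;

	assert(slot);
	for (i = 0; i < 8; i++)
		v |= (uint64_t)slot[i] << (8 * i);
	return v;
}
```

With the current assignment (WB, WC, UC-, UC repeated twice) this yields
0x0007010600070106; replacing UC with WT in PA3 and PA7, as the RFC
proposes, yields 0x0407010604070106.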



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type
  2014-07-16  0:28             ` Andy Lutomirski
@ 2014-07-16 14:35               ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-16 14:35 UTC (permalink / raw
  To: Andy Lutomirski
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, Konrad Rzeszutek Wilk, plagnioj, tomi.valkeinen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Stefan Bader,
	Dave Airlie, Borislav Petkov

On Tue, 2014-07-15 at 17:28 -0700, Andy Lutomirski wrote:
> On Tue, Jul 15, 2014 at 4:53 PM, Toshi Kani <toshi.kani@hp.com> wrote:
 :
> > In this patch, I left using reserve_ram_pages_type() since I do not see
> > much reason to use WT for RAM, either.
> 
> I hereby predict that someone, some day, will build a system with
> nonvolatile "RAM", and someone will want this feature.  Just saying :)
> 
> More realistically, someone might want to write a silly driver that
> lets programs mmap some WT memory for testing.

Agreed.  This limitation needs to be addressed.  I meant to say that
this could be a separate effort.

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-16  0:40       ` Konrad Rzeszutek Wilk
  (?)
@ 2014-07-16 21:28       ` Toshi Kani
  2014-07-21 16:31         ` Toshi Kani
  -1 siblings, 1 reply; 74+ messages in thread
From: Toshi Kani @ 2014-07-16 21:28 UTC (permalink / raw
  To: Konrad Rzeszutek Wilk
  Cc: H. Peter Anvin, tglx, mingo, akpm, arnd, plagnioj, tomi.valkeinen,
	linux-mm, linux-kernel, stefan.bader, luto, airlied, bp

[-- Attachment #1: Type: text/plain, Size: 1766 bytes --]

On Tue, 2014-07-15 at 20:40 -0400, Konrad Rzeszutek Wilk wrote:
> On July 15, 2014 5:23:24 PM EDT, Toshi Kani <toshi.kani@hp.com> wrote:
> >On Tue, 2014-07-15 at 13:09 -0700, H. Peter Anvin wrote:
> >> On 07/15/2014 12:34 PM, Toshi Kani wrote:
 :
> >> 
> >> I have given this piece of feedback at least three times now, possibly
> >> to different people, and I'm getting a bit grumpy about it:
> >> 
> >> We already have an issue with Xen, because Xen assigned mappings
> >> differently and it is incompatible with the use of PAT in Linux.  As a
> >> result we get requests for hacks to work around this, which is something
> >> I really don't want to see.  I would like to see a design involving a
> >> "reverse PAT" table where the kernel can hold the mapping between memory
> >> types and page table encodings (including the two different ones for
> >> small and large pages.)
> >
> >Thanks for pointing this out! (And sorry for making you repeat it three
> >times...)  I was not aware of the issue with Xen.  I will look into the
> >email archive to see what the Xen issue is, and how it can be
> >addressed.
> 
> https://lkml.org/lkml/2011/11/8/406

Thanks Konrad for the pointer!

Since [__]change_page_attr_set_clr() and __change_page_attr() have no
knowledge about PAT and simply work with the specified PTE flags, they do
not seem to fit well with an additional PAT abstraction table...

I think the root of this issue is that the kernel ignores the PAT bit.
Since __change_page_attr() only supports 4K pages, set_memory_<type>()
can add the PAT bit to the clear mask.

Attached is a patch with this approach (apply on top of this series -
not tested).  The kernel still does not support the PAT bit, but it
behaves slightly better.
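
To illustrate the effect of widening the clear mask, here is a minimal
user-space sketch.  It is not the kernel code: set_cache_type() and
pat_slot() are hypothetical helpers, and only the bit positions (PWT = 3,
PCD = 4, PAT = 7 in a 4K PTE) are architectural.  It assumes the WT
encoding used by this series (PCD | PWT).

```c
#include <assert.h>
#include <stdint.h>

/* Architectural cache-control bits of an x86 4K PTE. */
#define PTE_PWT (UINT64_C(1) << 3)
#define PTE_PCD (UINT64_C(1) << 4)
#define PTE_PAT (UINT64_C(1) << 7)

#define CACHE_MASK     (PTE_PCD | PTE_PWT)     /* clear mask before the patch */
#define CACHE_EXT_MASK (CACHE_MASK | PTE_PAT)  /* clear mask with the patch   */
#define CACHE_WT       (PTE_PCD | PTE_PWT)     /* WT encoding in this series  */

/* Hypothetical helper: clear the bits in mask_clr, then set the new type. */
static uint64_t set_cache_type(uint64_t pte, uint64_t type, uint64_t mask_clr)
{
	/* The type bits must lie inside the field being cleared. */
	assert((type & ~mask_clr) == 0);
	return (pte & ~mask_clr) | type;
}

/* PAT MSR slot (0-7) selected by a 4K PTE, built from PAT:PCD:PWT. */
static unsigned pat_slot(uint64_t pte)
{
	return (unsigned)(!!(pte & PTE_PAT) << 2 |
			  !!(pte & PTE_PCD) << 1 |
			  !!(pte & PTE_PWT));
}
```

With a 4K PTE that already has the PAT bit set, the old mask leaves that
bit in place and the entry selects slot 7; the extended mask clears it and
the entry selects slot 3.  The two only differ in effect when slots 4-7
are programmed differently from slots 0-3.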

Thanks,
-Toshi



[-- Attachment #2: page-ext-mask.patch --]
[-- Type: text/x-patch, Size: 3716 bytes --]

From: Toshi Kani <toshi.kani@hp.com>

---
 arch/x86/include/asm/pgtable_types.h |    1 +
 arch/x86/mm/pageattr.c               |   20 ++++++++++----------
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 81a3859..a392b09 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -130,6 +130,7 @@
 #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE | _PAGE_NUMA)
 
 #define _PAGE_CACHE_MASK	(_PAGE_PCD | _PAGE_PWT)
+#define _PAGE_CACHE_EXT_MASK	(_PAGE_CACHE_MASK | _PAGE_PAT)
 #define _PAGE_CACHE_WB		(0)
 #define _PAGE_CACHE_WC		(_PAGE_PWT)
 #define _PAGE_CACHE_WT		(_PAGE_PCD | _PAGE_PWT)
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index da597d0..348f206 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1446,7 +1446,7 @@ int _set_memory_uc(unsigned long addr, int numpages)
 	 */
 	return change_page_attr_set_clr(&addr, numpages,
 					__pgprot(_PAGE_CACHE_UC_MINUS),
-					__pgprot(_PAGE_CACHE_MASK),
+					__pgprot(_PAGE_CACHE_EXT_MASK),
 					0, 0, NULL);
 }
 
@@ -1493,13 +1493,13 @@ static int _set_memory_array(unsigned long *addr, int addrinarray,
 
 	ret = change_page_attr_set_clr(addr, addrinarray,
 				       __pgprot(_PAGE_CACHE_UC_MINUS),
-				       __pgprot(_PAGE_CACHE_MASK),
+				       __pgprot(_PAGE_CACHE_EXT_MASK),
 				       0, CPA_ARRAY, NULL);
 
 	if (!ret && new_type == _PAGE_CACHE_WC)
 		ret = change_page_attr_set_clr(addr, addrinarray,
 					       __pgprot(_PAGE_CACHE_WC),
-					       __pgprot(_PAGE_CACHE_MASK),
+					       __pgprot(_PAGE_CACHE_EXT_MASK),
 					       0, CPA_ARRAY, NULL);
 	if (ret)
 		goto out_free;
@@ -1532,12 +1532,12 @@ int _set_memory_wc(unsigned long addr, int numpages)
 
 	ret = change_page_attr_set_clr(&addr, numpages,
 				       __pgprot(_PAGE_CACHE_UC_MINUS),
-				       __pgprot(_PAGE_CACHE_MASK),
+				       __pgprot(_PAGE_CACHE_EXT_MASK),
 				       0, 0, NULL);
 	if (!ret) {
 		ret = change_page_attr_set_clr(&addr_copy, numpages,
 					       __pgprot(_PAGE_CACHE_WC),
-					       __pgprot(_PAGE_CACHE_MASK),
+					       __pgprot(_PAGE_CACHE_EXT_MASK),
 					       0, 0, NULL);
 	}
 	return ret;
@@ -1578,7 +1578,7 @@ int _set_memory_wt(unsigned long addr, int numpages)
 {
 	return change_page_attr_set_clr(&addr, numpages,
 					__pgprot(_PAGE_CACHE_WT),
-					__pgprot(_PAGE_CACHE_MASK),
+					__pgprot(_PAGE_CACHE_EXT_MASK),
 					0, 0, NULL);
 }
 
@@ -1611,7 +1611,7 @@ int _set_memory_wb(unsigned long addr, int numpages)
 {
 	return change_page_attr_set_clr(&addr, numpages,
 					__pgprot(_PAGE_CACHE_WB),
-					__pgprot(_PAGE_CACHE_MASK),
+					__pgprot(_PAGE_CACHE_EXT_MASK),
 					0, 0, NULL);
 }
 
@@ -1635,7 +1635,7 @@ int set_memory_array_wb(unsigned long *addr, int addrinarray)
 
 	ret = change_page_attr_set_clr(addr, addrinarray,
 				       __pgprot(_PAGE_CACHE_WB),
-				       __pgprot(_PAGE_CACHE_MASK),
+				       __pgprot(_PAGE_CACHE_EXT_MASK),
 				       0, CPA_ARRAY, NULL);
 	if (ret)
 		return ret;
@@ -1719,7 +1719,7 @@ static int _set_pages_array(struct page **pages, int addrinarray,
 	if (!ret && new_type == _PAGE_CACHE_WC)
 		ret = change_page_attr_set_clr(NULL, addrinarray,
 					       __pgprot(_PAGE_CACHE_WC),
-					       __pgprot(_PAGE_CACHE_MASK),
+					       __pgprot(_PAGE_CACHE_EXT_MASK),
 					       0, CPA_PAGES_ARRAY, pages);
 	if (ret)
 		goto err_out;
@@ -1770,7 +1770,7 @@ int set_pages_array_wb(struct page **pages, int addrinarray)
 	int i;
 
 	retval = cpa_clear_pages_array(pages, addrinarray,
-			__pgprot(_PAGE_CACHE_MASK));
+			__pgprot(_PAGE_CACHE_EXT_MASK));
 	if (retval)
 		return retval;
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-16 21:28       ` Toshi Kani
@ 2014-07-21 16:31         ` Toshi Kani
  2014-07-21 16:47             ` H. Peter Anvin
  0 siblings, 1 reply; 74+ messages in thread
From: Toshi Kani @ 2014-07-21 16:31 UTC (permalink / raw
  To: Konrad Rzeszutek Wilk, H. Peter Anvin
  Cc: tglx, mingo, akpm, arnd, plagnioj, tomi.valkeinen, linux-mm,
	linux-kernel, stefan.bader, luto, airlied, bp

[-- Attachment #1: Type: text/plain, Size: 1980 bytes --]

On Wed, 2014-07-16 at 15:28 -0600, Toshi Kani wrote:
> On Tue, 2014-07-15 at 20:40 -0400, Konrad Rzeszutek Wilk wrote:
> > On July 15, 2014 5:23:24 PM EDT, Toshi Kani <toshi.kani@hp.com> wrote:
> > >On Tue, 2014-07-15 at 13:09 -0700, H. Peter Anvin wrote:
> > >> On 07/15/2014 12:34 PM, Toshi Kani wrote:
>  :
> > >> 
> > >> I have given this piece of feedback at least three times now, possibly
> > >> to different people, and I'm getting a bit grumpy about it:
> > >> 
> > >> We already have an issue with Xen, because Xen assigned mappings
> > >> differently and it is incompatible with the use of PAT in Linux.  As a
> > >> result we get requests for hacks to work around this, which is something
> > >> I really don't want to see.  I would like to see a design involving a
> > >> "reverse PAT" table where the kernel can hold the mapping between memory
> > >> types and page table encodings (including the two different ones for
> > >> small and large pages.)
> > >
> > >Thanks for pointing this out! (And sorry for making you repeat it three
> > >times...)  I was not aware of the issue with Xen.  I will look into the
> > >email archive to see what the Xen issue is, and how it can be
> > >addressed.
> > 
> > https://lkml.org/lkml/2011/11/8/406
> 
> Thanks Konrad for the pointer!
> 
> Since [__]change_page_attr_set_clr() and __change_page_attr() have no
> knowledge about PAT and simply work with the specified PTE flags, they do
> not seem to fit well with an additional PAT abstraction table...
> 
> I think the root of this issue is that the kernel ignores the PAT bit.
> Since __change_page_attr() only supports 4K pages, set_memory_<type>()
> can add the PAT bit to the clear mask.
> 
> Attached is a patch with this approach (apply on top of this series -
> not tested).  The kernel still does not support the PAT bit, but it
> behaves slightly better.

Hi Peter, Konrad,

Do you have any comments / suggestions for this approach?

Thanks!
-Toshi




[-- Attachment #2: page-ext-mask.patch --]
[-- Type: text/x-patch, Size: 3716 bytes --]

From: Toshi Kani <toshi.kani@hp.com>

---
 arch/x86/include/asm/pgtable_types.h |    1 +
 arch/x86/mm/pageattr.c               |   20 ++++++++++----------
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 81a3859..a392b09 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -130,6 +130,7 @@
 #define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE | _PAGE_NUMA)
 
 #define _PAGE_CACHE_MASK	(_PAGE_PCD | _PAGE_PWT)
+#define _PAGE_CACHE_EXT_MASK	(_PAGE_CACHE_MASK | _PAGE_PAT)
 #define _PAGE_CACHE_WB		(0)
 #define _PAGE_CACHE_WC		(_PAGE_PWT)
 #define _PAGE_CACHE_WT		(_PAGE_PCD | _PAGE_PWT)
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index da597d0..348f206 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1446,7 +1446,7 @@ int _set_memory_uc(unsigned long addr, int numpages)
 	 */
 	return change_page_attr_set_clr(&addr, numpages,
 					__pgprot(_PAGE_CACHE_UC_MINUS),
-					__pgprot(_PAGE_CACHE_MASK),
+					__pgprot(_PAGE_CACHE_EXT_MASK),
 					0, 0, NULL);
 }
 
@@ -1493,13 +1493,13 @@ static int _set_memory_array(unsigned long *addr, int addrinarray,
 
 	ret = change_page_attr_set_clr(addr, addrinarray,
 				       __pgprot(_PAGE_CACHE_UC_MINUS),
-				       __pgprot(_PAGE_CACHE_MASK),
+				       __pgprot(_PAGE_CACHE_EXT_MASK),
 				       0, CPA_ARRAY, NULL);
 
 	if (!ret && new_type == _PAGE_CACHE_WC)
 		ret = change_page_attr_set_clr(addr, addrinarray,
 					       __pgprot(_PAGE_CACHE_WC),
-					       __pgprot(_PAGE_CACHE_MASK),
+					       __pgprot(_PAGE_CACHE_EXT_MASK),
 					       0, CPA_ARRAY, NULL);
 	if (ret)
 		goto out_free;
@@ -1532,12 +1532,12 @@ int _set_memory_wc(unsigned long addr, int numpages)
 
 	ret = change_page_attr_set_clr(&addr, numpages,
 				       __pgprot(_PAGE_CACHE_UC_MINUS),
-				       __pgprot(_PAGE_CACHE_MASK),
+				       __pgprot(_PAGE_CACHE_EXT_MASK),
 				       0, 0, NULL);
 	if (!ret) {
 		ret = change_page_attr_set_clr(&addr_copy, numpages,
 					       __pgprot(_PAGE_CACHE_WC),
-					       __pgprot(_PAGE_CACHE_MASK),
+					       __pgprot(_PAGE_CACHE_EXT_MASK),
 					       0, 0, NULL);
 	}
 	return ret;
@@ -1578,7 +1578,7 @@ int _set_memory_wt(unsigned long addr, int numpages)
 {
 	return change_page_attr_set_clr(&addr, numpages,
 					__pgprot(_PAGE_CACHE_WT),
-					__pgprot(_PAGE_CACHE_MASK),
+					__pgprot(_PAGE_CACHE_EXT_MASK),
 					0, 0, NULL);
 }
 
@@ -1611,7 +1611,7 @@ int _set_memory_wb(unsigned long addr, int numpages)
 {
 	return change_page_attr_set_clr(&addr, numpages,
 					__pgprot(_PAGE_CACHE_WB),
-					__pgprot(_PAGE_CACHE_MASK),
+					__pgprot(_PAGE_CACHE_EXT_MASK),
 					0, 0, NULL);
 }
 
@@ -1635,7 +1635,7 @@ int set_memory_array_wb(unsigned long *addr, int addrinarray)
 
 	ret = change_page_attr_set_clr(addr, addrinarray,
 				       __pgprot(_PAGE_CACHE_WB),
-				       __pgprot(_PAGE_CACHE_MASK),
+				       __pgprot(_PAGE_CACHE_EXT_MASK),
 				       0, CPA_ARRAY, NULL);
 	if (ret)
 		return ret;
@@ -1719,7 +1719,7 @@ static int _set_pages_array(struct page **pages, int addrinarray,
 	if (!ret && new_type == _PAGE_CACHE_WC)
 		ret = change_page_attr_set_clr(NULL, addrinarray,
 					       __pgprot(_PAGE_CACHE_WC),
-					       __pgprot(_PAGE_CACHE_MASK),
+					       __pgprot(_PAGE_CACHE_EXT_MASK),
 					       0, CPA_PAGES_ARRAY, pages);
 	if (ret)
 		goto err_out;
@@ -1770,7 +1770,7 @@ int set_pages_array_wb(struct page **pages, int addrinarray)
 	int i;
 
 	retval = cpa_clear_pages_array(pages, addrinarray,
-			__pgprot(_PAGE_CACHE_MASK));
+			__pgprot(_PAGE_CACHE_EXT_MASK));
 	if (retval)
 		return retval;
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-21 16:31         ` Toshi Kani
@ 2014-07-21 16:47             ` H. Peter Anvin
  0 siblings, 0 replies; 74+ messages in thread
From: H. Peter Anvin @ 2014-07-21 16:47 UTC (permalink / raw
  To: Toshi Kani, Konrad Rzeszutek Wilk
  Cc: tglx, mingo, akpm, arnd, plagnioj, tomi.valkeinen, linux-mm,
	linux-kernel, stefan.bader, luto, airlied, bp

On 07/21/2014 09:31 AM, Toshi Kani wrote:
> Do you have any comments / suggestions for this approach?

Approach to what, specifically?

Keep in mind the PAT bit is different for large pages.  This needs to be
dealt with.  I would also like a systematic way to deal with the fact
that Xen (sigh) is stuck with a separate mapping system.

I guess Linux could adopt the Xen mappings if that makes it easier, as
long as that doesn't have a negative impact on native hardware -- we can
possibly deal with some older chips not being optimal.  However, my
thinking has been to have a "reverse PAT" table in memory of memory
types to encodings, both for regular and large pages.
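
As a rough illustration of the large-page point: in a 4K PTE the PAT bit
is bit 7, while in a large-page entry bit 7 is PSE and the PAT bit moves
to bit 12.  The sketch below is only an assumption of how a translation
between the two layouts could look; the helper names are hypothetical and
this is not existing kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PWT       (UINT64_C(1) << 3)
#define PTE_PCD       (UINT64_C(1) << 4)
#define PTE_PAT       (UINT64_C(1) << 7)   /* PAT bit of a 4K PTE            */
#define PDE_PSE       (UINT64_C(1) << 7)   /* page-size bit of a large entry */
#define PDE_PAT_LARGE (UINT64_C(1) << 12)  /* PAT bit of a large entry       */

/* Hypothetical: convert 4K cache bits to the large-page layout. */
static uint64_t cache_4k_to_large(uint64_t bits)
{
	uint64_t out = bits & (PTE_PWT | PTE_PCD);

	/* Only cache-control bits are expected here. */
	assert((bits & ~(PTE_PWT | PTE_PCD | PTE_PAT)) == 0);
	if (bits & PTE_PAT)
		out |= PDE_PAT_LARGE;
	return out;
}

/* Hypothetical: convert large-page cache bits back to the 4K layout. */
static uint64_t cache_large_to_4k(uint64_t bits)
{
	uint64_t out = bits & (PTE_PWT | PTE_PCD);

	assert((bits & ~(PTE_PWT | PTE_PCD | PDE_PAT_LARGE)) == 0);
	if (bits & PDE_PAT_LARGE)
		out |= PTE_PAT;
	return out;
}
```

Because PTE_PAT and PDE_PSE share bit 7, applying a 4K clear mask that
contains the PAT bit to a large-page entry would clear PSE rather than
the PAT bit.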

	-hpa




^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-21 16:47             ` H. Peter Anvin
@ 2014-07-21 17:16               ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-21 17:16 UTC (permalink / raw
  To: H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, tglx, mingo, akpm, arnd, plagnioj,
	tomi.valkeinen, linux-mm, linux-kernel, stefan.bader, luto,
	airlied, bp

On Mon, 2014-07-21 at 09:47 -0700, H. Peter Anvin wrote:
> On 07/21/2014 09:31 AM, Toshi Kani wrote:
> > Do you have any comments / suggestions for this approach?
> 
> Approach to what, specifically?
>
> Keep in mind the PAT bit is different for large pages.  This needs to be
> dealt with.  

You are right.  I was under the wrong impression that
__change_page_attr() always splits a large page into 4KB pages, but I
overlooked the fact that it can handle a large page as well.  So, this
approach does not work...

> I would also like a systematic way to deal with the fact
> that Xen (sigh) is stuck with a separate mapping system.
>
> I guess Linux could adopt the Xen mappings if that makes it easier, as
> long as that doesn't have a negative impact on native hardware -- we can
> possibly deal with some older chips not being optimal.  

I see.  I agree that supporting the PAT bit is the right direction, but
I do not know how much effort it will take.  I will look into this.

> However, my thinking has been to have a "reverse PAT" table in memory of memory
> types to encodings, both for regular and large pages.

I am not clear about your idea of the "reverse PAT" table.  Would you
care to elaborate?  How is it different from relying on pte_val() being a
paravirt function on Xen?

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-21 17:16               ` Toshi Kani
@ 2014-07-21 17:32                 ` H. Peter Anvin
  -1 siblings, 0 replies; 74+ messages in thread
From: H. Peter Anvin @ 2014-07-21 17:32 UTC (permalink / raw
  To: Toshi Kani
  Cc: Konrad Rzeszutek Wilk, tglx, mingo, akpm, arnd, plagnioj,
	tomi.valkeinen, linux-mm, linux-kernel, stefan.bader, luto,
	airlied, bp

On 07/21/2014 10:16 AM, Toshi Kani wrote:
> 
> You are right.  I was under the wrong impression that
> __change_page_attr() always splits a large page into 4KB pages, but I
> overlooked the fact that it can handle a large page as well.  So, this
> approach does not work...
> 

If it did it would be a major fail.

>> I would also like a systematic way to deal with the fact
>> that Xen (sigh) is stuck with a separate mapping system.
>>
>> I guess Linux could adopt the Xen mappings if that makes it easier, as
>> long as that doesn't have a negative impact on native hardware -- we can
>> possibly deal with some older chips not being optimal.  
> 
> I see.  I agree that supporting the PAT bit is the right direction, but
> I do not know how much effort it will take.  I will look into this.
> 
>> However, my thinking has been to have a "reverse PAT" table in memory of memory
>> types to encodings, both for regular and large pages.
> 
> I am not clear about your idea of the "reverse PAT" table.  Would you
> care to elaborate?  How is it different from relying on pte_val() being a
> paravirt function on Xen?

First of all, paravirt functions are the root of all evil, and we want
to reduce and eliminate them to the utmost level possible.  But yes, we
could plumb it up that way if we really need to.

What I'm thinking of is a table which can deal with the moving PTE
bit, Xen, and the scattered encodings by having a small table from types
to encodings, and not use the encodings directly until fairly late in
the pipe.  I suspect, but I'm not sure, that we would also need the
inverse operation.
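
One possible shape for such a table, purely as a sketch under
assumptions: the enum, struct, and function names below are hypothetical,
and the encodings are the ones this series uses (WB = 0, WC = PWT,
UC- = PCD, WT = PCD | PWT).  A different layout, such as the one Xen
uses, would populate the table with different values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PWT       (UINT64_C(1) << 3)
#define PTE_PCD       (UINT64_C(1) << 4)
#define PTE_PAT       (UINT64_C(1) << 7)   /* PAT bit, 4K PTE      */
#define PDE_PAT_LARGE (UINT64_C(1) << 12)  /* PAT bit, large entry */

enum mem_type { MT_WB, MT_WC, MT_UC_MINUS, MT_WT, MT_COUNT };

struct pat_encoding {
	uint64_t pte_4k;     /* cache bits for a 4K PTE           */
	uint64_t pte_large;  /* cache bits for a large-page entry */
};

/* Memory type -> page table encoding (hypothetical "reverse PAT" table). */
static const struct pat_encoding type_to_enc[MT_COUNT] = {
	[MT_WB]       = { 0,                 0                 },
	[MT_WC]       = { PTE_PWT,           PTE_PWT           },
	[MT_UC_MINUS] = { PTE_PCD,           PTE_PCD           },
	[MT_WT]       = { PTE_PCD | PTE_PWT, PTE_PCD | PTE_PWT },
};

/* Forward lookup: encoding bits for a memory type. */
static uint64_t encode_type(enum mem_type t, int large)
{
	assert((unsigned)t < MT_COUNT);
	return large ? type_to_enc[t].pte_large : type_to_enc[t].pte_4k;
}

/* Inverse lookup: 0 and *out set on success, -1 if no table entry matches. */
static int decode_type(uint64_t entry, int large, enum mem_type *out)
{
	uint64_t mask = PTE_PWT | PTE_PCD | (large ? PDE_PAT_LARGE : PTE_PAT);
	uint64_t bits = entry & mask;

	for (size_t i = 0; i < MT_COUNT; i++) {
		uint64_t enc = large ? type_to_enc[i].pte_large
				     : type_to_enc[i].pte_4k;
		if (enc == bits) {
			*out = (enum mem_type)i;
			return 0;
		}
	}
	return -1;
}
```

Callers would pass memory types around and call encode_type() only when
building the final entry, so the position of the PAT bit stays hidden
inside the table.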

	-hpa



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-21 17:32                 ` H. Peter Anvin
@ 2014-07-21 17:33                   ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-21 17:33 UTC (permalink / raw
  To: H. Peter Anvin
  Cc: Konrad Rzeszutek Wilk, tglx, mingo, akpm, arnd, plagnioj,
	tomi.valkeinen, linux-mm, linux-kernel, stefan.bader, luto,
	airlied, bp

On Mon, 2014-07-21 at 10:32 -0700, H. Peter Anvin wrote:
> On 07/21/2014 10:16 AM, Toshi Kani wrote:
 :
> >> I would also like a systematic way to deal with the fact
> >> that Xen (sigh) is stuck with a separate mapping system.
> >>
> >> I guess Linux could adopt the Xen mappings if that makes it easier, as
> >> long as that doesn't have a negative impact on native hardware -- we can
> >> possibly deal with some older chips not being optimal.  
> > 
> > I see.  I agree that supporting the PAT bit is the right direction, but
> > I do not know how much effort it will take.  I will look into this.
> > 
> >> However, my thinking has been to have a "reverse PAT" table in memory of memory
> >> types to encodings, both for regular and large pages.
> > 
> > I am not clear about your idea of the "reverse PAT" table.  Would you
> > care to elaborate?  How is it different from relying on pte_val() being a
> > paravirt function on Xen?
> 
> First of all, paravirt functions are the root of all evil, and we want
> to reduce and eliminate them to the utmost level possible.  But yes, we
> could plumb it up that way if we really need to.
> 
> What I'm thinking of is a table which can deal with the moving PTE
> bit, Xen, and the scattered encodings by having a small table from types
> to encodings, and not use the encodings directly until fairly late in
> the pipe.  I suspect, but I'm not sure, that we would also need the
> inverse operation.

Thanks for the explanation!  I will think about it as well.
-Toshi


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-21 17:32                 ` H. Peter Anvin
@ 2014-07-21 18:33                   ` Konrad Rzeszutek Wilk
  -1 siblings, 0 replies; 74+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-21 18:33 UTC (permalink / raw
  To: H. Peter Anvin
  Cc: Toshi Kani, tglx, mingo, akpm, arnd, plagnioj, tomi.valkeinen,
	linux-mm, linux-kernel, stefan.bader, luto, airlied, bp

On Mon, Jul 21, 2014 at 10:32:34AM -0700, H. Peter Anvin wrote:
> On 07/21/2014 10:16 AM, Toshi Kani wrote:
> > 
> > You are right.  I was under the wrong impression that
> > __change_page_attr() always splits a large page into 4KB pages, but I
> > overlooked the fact that it can handle a large page as well.  So, this
> > approach does not work...
> > 
> 
> If it did it would be a major fail.
> 
> >> I would also like a systematic way to deal with the fact
> >> that Xen (sigh) is stuck with a separate mapping system.
> >>
> >> I guess Linux could adopt the Xen mappings if that makes it easier, as
> >> long as that doesn't have a negative impact on native hardware -- we can
> >> possibly deal with some older chips not being optimal.  
> > 
> > I see.  I agree that supporting the PAT bit is the right direction, but
> > I do not know how much effort we need.  I will look into this.
> > 
> >> However, my thinking has been to have a "reverse PAT" table in memory of memory
> >> types to encodings, both for regular and large pages.
> > 
> > I am not clear about your idea of the "reverse PAT" table.  Would you
> > care to elaborate?  How is it different from using pte_val() being a
> > paravirt function on Xen?
> 
> First of all, paravirt functions are the root of all evil, and we want

Here I was thinking to actually put an entry in the MAINTAINERS
file for me to become the owner of it - as the folks listed there
are busy with other things.

The Maintainer of 'All Evil' has an interesting ring to it :-)

> to reduce and eliminate them to the utmost level possible.  But yes, we
> could plumb that up that way if we really need to.
> 
> What I'm thinking of is a table which can deal with both the moving PTE
> bit, Xen, and the scattered encodings by having a small table from types
> to encodings, and not use the encodings directly until fairly late in
> the pipe.  I suspect, but I'm not sure, that we would also need the
> inverse operation.

Mr Toshi-san,

This link: http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/hvm/mtrr.c;h=ee18553cdac58dd16836011ee714517fbc16368d;hb=HEAD#l74 might help you in figuring how this can be done.

Though I have to say that the code is quite complex, so it might
be more confusing than helpful.
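
To make the "small table from types to encodings" idea quoted above a bit
more concrete, here is a minimal user-space sketch. It is not code from the
patchset or from the Xen file linked above. The enum and function names
(cache_mode, mode_to_bits, bits_to_mode) are made up for illustration, and
the slot layout is the one proposed in the cover letter (PA0/4 WB, PA1/5 WC,
PA2/6 UC-, PA3/7 WT). Only the bit positions of PWT, PCD and PAT are
architectural.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical memory-type enum; the names are illustrative only. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_WT, CM_NUM };

/* x86 page-table attribute bits (architectural positions). */
#define PTE_PWT        (1ULL << 3)
#define PTE_PCD        (1ULL << 4)
#define PTE_PAT_4K     (1ULL << 7)   /* PAT bit in a 4KB PTE */
#define PTE_PAT_LARGE  (1ULL << 12)  /* PAT bit in a 2MB/1GB entry */

/* Forward table: memory type -> PAT slot (assumed layout, see above). */
static const uint8_t mode_to_slot[CM_NUM] = {
	[CM_WB] = 0, [CM_WC] = 1, [CM_UC_MINUS] = 2, [CM_WT] = 3,
};

/* Inverse table: PAT slot -> memory type; slots 4-7 mirror slots 0-3. */
static const uint8_t slot_to_mode[8] = {
	CM_WB, CM_WC, CM_UC_MINUS, CM_WT,
	CM_WB, CM_WC, CM_UC_MINUS, CM_WT,
};

/* Turn a slot index into PTE bits; the PAT bit moves for large pages. */
static uint64_t slot_to_bits(unsigned int slot, int large)
{
	uint64_t bits = 0;

	if (slot & 1)
		bits |= PTE_PWT;
	if (slot & 2)
		bits |= PTE_PCD;
	if (slot & 4)
		bits |= large ? PTE_PAT_LARGE : PTE_PAT_4K;
	return bits;
}

/* Callers pass a type; the encoding is looked up only at this point. */
static uint64_t mode_to_bits(enum cache_mode mode, int large)
{
	assert(mode < CM_NUM);
	return slot_to_bits(mode_to_slot[mode], large);
}

/* The inverse operation: recover the type from an existing entry. */
static enum cache_mode bits_to_mode(uint64_t pte, int large)
{
	unsigned int slot = 0;

	if (pte & PTE_PWT)
		slot |= 1;
	if (pte & PTE_PCD)
		slot |= 2;
	if (pte & (large ? PTE_PAT_LARGE : PTE_PAT_4K))
		slot |= 4;
	return (enum cache_mode)slot_to_mode[slot];
}
```

With something like this, a different PAT layout (for example the one Xen
uses) would only mean filling the two tables differently; callers that work
in terms of types would not need to change.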

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-21 18:33                   ` Konrad Rzeszutek Wilk
@ 2014-07-21 19:24                     ` Toshi Kani
  -1 siblings, 0 replies; 74+ messages in thread
From: Toshi Kani @ 2014-07-21 19:24 UTC (permalink / raw
  To: Konrad Rzeszutek Wilk
  Cc: H. Peter Anvin, tglx, mingo, akpm, arnd, plagnioj, tomi.valkeinen,
	linux-mm, linux-kernel, stefan.bader, luto, airlied, bp

On Mon, 2014-07-21 at 14:33 -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Jul 21, 2014 at 10:32:34AM -0700, H. Peter Anvin wrote:
> > On 07/21/2014 10:16 AM, Toshi Kani wrote:
 :
> > 
> > >> I would also like a systematic way to deal with the fact
> > >> that Xen (sigh) is stuck with a separate mapping system.
> > >>
> > >> I guess Linux could adopt the Xen mappings if that makes it easier, as
> > >> long as that doesn't have a negative impact on native hardware -- we can
> > >> possibly deal with some older chips not being optimal.  
> > > 
> > > I see.  I agree that supporting the PAT bit is the right direction, but
> > > > I do not know how much effort we need.  I will look into this.
> > > 
> > >> However, my thinking has been to have a "reverse PAT" table in memory of memory
> > >> types to encodings, both for regular and large pages.
> > > 
> > > I am not clear about your idea of the "reverse PAT" table.  Would you
> > > care to elaborate?  How is it different from using pte_val() being a
> > > paravirt function on Xen?
> > 
> > First of all, paravirt functions are the root of all evil, and we want
> 
> Here I was thinking to actually put an entry in the MAINTAINERS
> file for me to become the owner of it - as the folks listed there
> are busy with other things.
> 
> The Maintainer of 'All Evil' has an interesting ring to it :-)

:-)

> > to reduce and eliminate them to the utmost level possible.  But yes, we
> > could plumb that up that way if we really need to.
> > 
> > What I'm thinking of is a table which can deal with both the moving PTE
> > bit, Xen, and the scattered encodings by having a small table from types
> > to encodings, and not use the encodings directly until fairly late in
> > the pipe.  I suspect, but I'm not sure, that we would also need the
> > inverse operation.
> 
> Mr Toshi-san,

Oh, you are so polite, Wilk-san. 

> This link: http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/hvm/mtrr.c;h=ee18553cdac58dd16836011ee714517fbc16368d;hb=HEAD#l74 might help you in figuring how this can be done.
> 
> Though I have to say that the code is quite complex, so it might
> be more confusing than helpful.

Thanks again for the pointer!  I will take a look.  I used to work on
paravirtualization for another OS, but I am pretty much new to Xen.  One
more thing to learn. :-)
-Toshi




^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [RFC PATCH 0/11] Support Write-Through mapping on x86
  2014-07-21 18:33                   ` Konrad Rzeszutek Wilk
@ 2014-07-21 20:22                     ` H. Peter Anvin
  -1 siblings, 0 replies; 74+ messages in thread
From: H. Peter Anvin @ 2014-07-21 20:22 UTC (permalink / raw
  To: Konrad Rzeszutek Wilk
  Cc: Toshi Kani, tglx, mingo, akpm, arnd, plagnioj, tomi.valkeinen,
	linux-mm, linux-kernel, stefan.bader, luto, airlied, bp

On 07/21/2014 11:33 AM, Konrad Rzeszutek Wilk wrote:
>>
>> First of all, paravirt functions are the root of all evil, and we want
> 
> Here I was thinking to actually put an entry in the MAINTAINERS
> file for me to become the owner of it - as the folks listed there
> are busy with other things.
> 
> The Maintainer of 'All Evil' has an interesting ring to it :-)
> 

Then you can legitimately title yourself Lord of All Evil.  :)

	-hpa


^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2014-07-21 20:23 UTC | newest]

Thread overview: 74+ messages
2014-07-15 19:34 [RFC PATCH 0/11] Support Write-Through mapping on x86 Toshi Kani
2014-07-15 19:34 ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 1/11] x86, mm, pat: Redefine _PAGE_CACHE_UC as UC_MINUS Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 2/11] x86, mm, pat: Define _PAGE_CACHE_WT for PA3/7 of PAT Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 3/11] x86, mm, pat: Change reserve_memtype() to handle WT type Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:56   ` Andy Lutomirski
2014-07-15 19:56     ` Andy Lutomirski
2014-07-15 23:10     ` Toshi Kani
2014-07-15 23:10       ` Toshi Kani
2014-07-15 23:36       ` Andy Lutomirski
2014-07-15 23:36         ` Andy Lutomirski
2014-07-15 23:46         ` H. Peter Anvin
2014-07-15 23:46           ` H. Peter Anvin
2014-07-15 23:54           ` Andy Lutomirski
2014-07-15 23:54             ` Andy Lutomirski
2014-07-15 23:59             ` H. Peter Anvin
2014-07-15 23:59               ` H. Peter Anvin
2014-07-15 23:53         ` Toshi Kani
2014-07-15 23:53           ` Toshi Kani
2014-07-16  0:05           ` H. Peter Anvin
2014-07-16  0:05             ` H. Peter Anvin
2014-07-16  0:28           ` Andy Lutomirski
2014-07-16  0:28             ` Andy Lutomirski
2014-07-16  0:31             ` H. Peter Anvin
2014-07-16  0:31               ` H. Peter Anvin
2014-07-16 14:35             ` Toshi Kani
2014-07-16 14:35               ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 4/11] x86, mm, asm-gen: Add ioremap_wt() for WT mapping Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 5/11] x86, mm: Add set_memory[_array]_wt() for setting WT Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 6/11] x86, mm, pat: Add pgprot_writethrough() for WT Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 7/11] x86, mm: Keep _set_memory_<type>() slot-independent Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 8/11] x86, mm, pat: Keep pgprot_<type>() slot-independent Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 9/11] x86, efi: Cleanup PCD bit manipulation in EFI Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 10/11] x86, xen: Cleanup PWT/PCD bit manipulation in Xen Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:34 ` [RFC PATCH 11/11] x86, fbdev: Cleanup PWT/PCD bit manipulation in fbdev Toshi Kani
2014-07-15 19:34   ` Toshi Kani
2014-07-15 19:53 ` [RFC PATCH 0/11] Support Write-Through mapping on x86 Andy Lutomirski
2014-07-15 19:53   ` Andy Lutomirski
2014-07-15 20:10   ` H. Peter Anvin
2014-07-15 20:10     ` H. Peter Anvin
2014-07-15 20:09 ` H. Peter Anvin
2014-07-15 20:09   ` H. Peter Anvin
2014-07-15 21:23   ` Toshi Kani
2014-07-15 21:23     ` Toshi Kani
2014-07-16  0:40     ` Konrad Rzeszutek Wilk
2014-07-16  0:40       ` Konrad Rzeszutek Wilk
2014-07-16 21:28       ` Toshi Kani
2014-07-21 16:31         ` Toshi Kani
2014-07-21 16:47           ` H. Peter Anvin
2014-07-21 16:47             ` H. Peter Anvin
2014-07-21 17:16             ` Toshi Kani
2014-07-21 17:16               ` Toshi Kani
2014-07-21 17:32               ` H. Peter Anvin
2014-07-21 17:32                 ` H. Peter Anvin
2014-07-21 17:33                 ` Toshi Kani
2014-07-21 17:33                   ` Toshi Kani
2014-07-21 18:33                 ` Konrad Rzeszutek Wilk
2014-07-21 18:33                   ` Konrad Rzeszutek Wilk
2014-07-21 19:24                   ` Toshi Kani
2014-07-21 19:24                     ` Toshi Kani
2014-07-21 20:22                   ` H. Peter Anvin
2014-07-21 20:22                     ` H. Peter Anvin
2014-07-21 17:20             ` Toshi Kani
2014-07-21 17:20               ` Toshi Kani
