LinuxPPC-Dev Archive mirror
* [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
@ 2024-04-03  8:37 Kefeng Wang
  2024-04-03  8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03  8:37 UTC
  To: akpm
  Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
	linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
	x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, linuxppc-dev

With VMA lock-based page fault handling enabled, a bad access detected
under the per-VMA lock falls back to mmap_lock-based handling, which
leads to an unnecessary mmap_lock acquisition and a second VMA lookup.
A test from lmbench shows a 34% improvement on arm64 after these
changes:

  lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198

Only build-tested on architectures other than arm64.
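
Roughly speaking, lat_sig's 'prot' case repeatedly writes to a read-only
mapping and measures how long it takes to catch the resulting SIGSEGV.
A rough, self-contained userspace sketch of such a loop (not lmbench
itself; it uses an anonymous mapping and an arbitrary iteration count
instead of lmbench's file mapping and timing harness) is:

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>

static sigjmp_buf env;

/* SIGSEGV handler: unwind back into the measurement loop */
static void segv(int sig, siginfo_t *si, void *uc)
{
	siglongjmp(env, 1);
}

int main(void)
{
	struct sigaction sa = { .sa_sigaction = segv, .sa_flags = SA_SIGINFO };
	/* read-only anonymous mapping: any write to it is a bad access */
	char *p = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	struct timespec t0, t1;
	long i, n = 100000;

	sigaction(SIGSEGV, &sa, NULL);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < n; i++)
		if (!sigsetjmp(env, 1))
			*p = 1;		/* faults; the handler longjmps back */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	printf("%.5f us per protection fault\n",
	       ((t1.tv_sec - t0.tv_sec) * 1e9 +
		(t1.tv_nsec - t0.tv_nsec)) / n / 1e3);
	return 0;
}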

v2:
- improve the changelog and describe the counting changes, as suggested
  by Suren Baghdasaryan
- add Reviewed-by tags

Kefeng Wang (7):
  arm64: mm: cleanup __do_page_fault()
  arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS
  arm: mm: accelerate pagefault when VM_FAULT_BADACCESS
  powerpc: mm: accelerate pagefault when badaccess
  riscv: mm: accelerate pagefault when badaccess
  s390: mm: accelerate pagefault when badaccess
  x86: mm: accelerate pagefault when badaccess

 arch/arm/mm/fault.c     |  4 +++-
 arch/arm64/mm/fault.c   | 31 ++++++++++---------------------
 arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++-------------
 arch/riscv/mm/fault.c   |  5 ++++-
 arch/s390/mm/fault.c    |  3 ++-
 arch/x86/mm/fault.c     | 23 ++++++++++++++---------
 6 files changed, 53 insertions(+), 46 deletions(-)

-- 
2.27.0



* [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault()
  2024-04-03  8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
@ 2024-04-03  8:37 ` Kefeng Wang
  2024-04-09 11:14   ` Catalin Marinas
  2024-04-03  8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03  8:37 UTC
  To: akpm
  Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
	linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
	x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, linuxppc-dev

__do_page_fault() only calls handle_mm_fault() after the vm_flags
check, and it is only called by do_page_fault(), so squash it into
do_page_fault() to clean up the code.

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/arm64/mm/fault.c | 27 +++++++--------------------
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 8251e2fea9c7..9bb9f395351a 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -486,25 +486,6 @@ static void do_bad_area(unsigned long far, unsigned long esr,
 	}
 }
 
-#define VM_FAULT_BADMAP		((__force vm_fault_t)0x010000)
-#define VM_FAULT_BADACCESS	((__force vm_fault_t)0x020000)
-
-static vm_fault_t __do_page_fault(struct mm_struct *mm,
-				  struct vm_area_struct *vma, unsigned long addr,
-				  unsigned int mm_flags, unsigned long vm_flags,
-				  struct pt_regs *regs)
-{
-	/*
-	 * Ok, we have a good vm_area for this memory access, so we can handle
-	 * it.
-	 * Check that the permissions on the VMA allow for the fault which
-	 * occurred.
-	 */
-	if (!(vma->vm_flags & vm_flags))
-		return VM_FAULT_BADACCESS;
-	return handle_mm_fault(vma, addr, mm_flags, regs);
-}
-
 static bool is_el0_instruction_abort(unsigned long esr)
 {
 	return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_LOW;
@@ -519,6 +500,9 @@ static bool is_write_abort(unsigned long esr)
 	return (esr & ESR_ELx_WNR) && !(esr & ESR_ELx_CM);
 }
 
+#define VM_FAULT_BADMAP		((__force vm_fault_t)0x010000)
+#define VM_FAULT_BADACCESS	((__force vm_fault_t)0x020000)
+
 static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 				   struct pt_regs *regs)
 {
@@ -617,7 +601,10 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 		goto done;
 	}
 
-	fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
+	if (!(vma->vm_flags & vm_flags))
+		fault = VM_FAULT_BADACCESS;
+	else
+		fault = handle_mm_fault(vma, addr, mm_flags, regs);
 
 	/* Quick path to respond to signals */
 	if (fault_signal_pending(fault, regs)) {
-- 
2.27.0



* [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS
  2024-04-03  8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
  2024-04-03  8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
@ 2024-04-03  8:38 ` Kefeng Wang
  2024-04-09 11:15   ` Catalin Marinas
  2024-04-03  8:38 ` [PATCH v2 3/7] arm: " Kefeng Wang
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03  8:38 UTC
  To: akpm
  Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
	linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
	x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, linuxppc-dev

The vm_flags of the vma are already checked under the per-VMA lock; if
it is a bad access, directly set fault to VM_FAULT_BADACCESS and handle
the error, with no need to retry with the mmap_lock again. The latency
of 'lat_sig -P 1 prot lat_sig' from the lmbench testcase is reduced by
34%.

Since the page fault is handled under the per-VMA lock, count it as a
vma lock event with VMA_LOCK_SUCCESS.

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/arm64/mm/fault.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 9bb9f395351a..405f9aa831bd 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 
 	if (!(vma->vm_flags & vm_flags)) {
 		vma_end_read(vma);
-		goto lock_mmap;
+		fault = VM_FAULT_BADACCESS;
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		goto done;
 	}
 	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
 	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
-- 
2.27.0



* [PATCH v2 3/7] arm: mm: accelerate pagefault when VM_FAULT_BADACCESS
  2024-04-03  8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
  2024-04-03  8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
  2024-04-03  8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
@ 2024-04-03  8:38 ` Kefeng Wang
  2024-04-03  8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03  8:38 UTC
  To: akpm
  Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
	linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
	x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, linuxppc-dev

The vm_flags of the vma are already checked under the per-VMA lock; if
it is a bad access, directly set fault to VM_FAULT_BADACCESS and handle
the error, with no need to retry with the mmap_lock again. Since the
page fault is handled under the per-VMA lock, count it as a vma lock
event with VMA_LOCK_SUCCESS.

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/arm/mm/fault.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 439dc6a26bb9..5c4b417e24f9 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -294,7 +294,9 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 
 	if (!(vma->vm_flags & vm_flags)) {
 		vma_end_read(vma);
-		goto lock_mmap;
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		fault = VM_FAULT_BADACCESS;
+		goto bad_area;
 	}
 	fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
 	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
-- 
2.27.0



* [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess
  2024-04-03  8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
                   ` (2 preceding siblings ...)
  2024-04-03  8:38 ` [PATCH v2 3/7] arm: " Kefeng Wang
@ 2024-04-03  8:38 ` Kefeng Wang
  2024-04-09  8:56   ` Michael Ellerman
  2024-04-03  8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03  8:38 UTC
  To: akpm
  Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
	linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
	x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, linuxppc-dev

The access_[pkey]_error() of the vma is already checked under the
per-VMA lock; if it is a bad access, directly handle the error, with no
need to retry with the mmap_lock again. In order to release the correct
lock, pass the mm_struct into bad_access_pkey()/bad_access(): if mm is
NULL, release the vma lock, otherwise release the mmap_lock. Since the
page fault is handled under the per-VMA lock, count it as a vma lock
event with VMA_LOCK_SUCCESS.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 53335ae21a40..215690452495 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -71,23 +71,26 @@ static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long add
 	return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
 }
 
-static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
+static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
+		      struct mm_struct *mm, struct vm_area_struct *vma)
 {
-	struct mm_struct *mm = current->mm;
 
 	/*
 	 * Something tried to access memory that isn't in our memory map..
 	 * Fix it, but check if it's kernel or user first..
 	 */
-	mmap_read_unlock(mm);
+	if (mm)
+		mmap_read_unlock(mm);
+	else
+		vma_end_read(vma);
 
 	return __bad_area_nosemaphore(regs, address, si_code);
 }
 
 static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
+				    struct mm_struct *mm,
 				    struct vm_area_struct *vma)
 {
-	struct mm_struct *mm = current->mm;
 	int pkey;
 
 	/*
@@ -109,7 +112,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
 	 */
 	pkey = vma_pkey(vma);
 
-	mmap_read_unlock(mm);
+	if (mm)
+		mmap_read_unlock(mm);
+	else
+		vma_end_read(vma);
 
 	/*
 	 * If we are in kernel mode, bail out with a SEGV, this will
@@ -124,9 +130,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
 	return 0;
 }
 
-static noinline int bad_access(struct pt_regs *regs, unsigned long address)
+static noinline int bad_access(struct pt_regs *regs, unsigned long address,
+			       struct mm_struct *mm, struct vm_area_struct *vma)
 {
-	return __bad_area(regs, address, SEGV_ACCERR);
+	return __bad_area(regs, address, SEGV_ACCERR, mm, vma);
 }
 
 static int do_sigbus(struct pt_regs *regs, unsigned long address,
@@ -479,13 +486,13 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	if (unlikely(access_pkey_error(is_write, is_exec,
 				       (error_code & DSISR_KEYFAULT), vma))) {
-		vma_end_read(vma);
-		goto lock_mmap;
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		return bad_access_pkey(regs, address, NULL, vma);
 	}
 
 	if (unlikely(access_error(is_write, is_exec, vma))) {
-		vma_end_read(vma);
-		goto lock_mmap;
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		return bad_access(regs, address, NULL, vma);
 	}
 
 	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
@@ -521,10 +528,10 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	if (unlikely(access_pkey_error(is_write, is_exec,
 				       (error_code & DSISR_KEYFAULT), vma)))
-		return bad_access_pkey(regs, address, vma);
+		return bad_access_pkey(regs, address, mm, vma);
 
 	if (unlikely(access_error(is_write, is_exec, vma)))
-		return bad_access(regs, address);
+		return bad_access(regs, address, mm, vma);
 
 	/*
 	 * If for any reason at all we couldn't handle the fault,
-- 
2.27.0



* [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
  2024-04-03  8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
                   ` (3 preceding siblings ...)
  2024-04-03  8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
@ 2024-04-03  8:38 ` Kefeng Wang
  2024-04-10  7:32   ` Alexandre Ghiti
  2024-04-03  8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03  8:38 UTC
  To: akpm
  Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
	linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
	x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, linuxppc-dev

The access_error() of the vma is already checked under the per-VMA
lock; if it is a bad access, directly handle the error, with no need to
retry with the mmap_lock again. Since the page fault is handled under
the per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/riscv/mm/fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 3ba1d4dde5dd..b3fcf7d67efb 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
 
 	if (unlikely(access_error(cause, vma))) {
 		vma_end_read(vma);
-		goto lock_mmap;
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		tsk->thread.bad_cause = SEGV_ACCERR;
+		bad_area_nosemaphore(regs, code, addr);
+		return;
 	}
 
 	fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
-- 
2.27.0



* [PATCH v2 6/7] s390: mm: accelerate pagefault when badaccess
  2024-04-03  8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
                   ` (4 preceding siblings ...)
  2024-04-03  8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
@ 2024-04-03  8:38 ` Kefeng Wang
  2024-04-07 17:19   ` Heiko Carstens
  2024-04-03  8:38 ` [PATCH v2 7/7] x86: " Kefeng Wang
  2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
  7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03  8:38 UTC
  To: akpm
  Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
	linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
	x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, linuxppc-dev

The vm_flags of the vma are already checked under the per-VMA lock; if
it is a bad access, directly handle the error, with no need to retry
with the mmap_lock again. Since the page fault is handled under the
per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/s390/mm/fault.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index c421dd44ffbe..162ca2576fd4 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -325,7 +325,8 @@ static void do_exception(struct pt_regs *regs, int access)
 		goto lock_mmap;
 	if (!(vma->vm_flags & access)) {
 		vma_end_read(vma);
-		goto lock_mmap;
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		return handle_fault_error_nolock(regs, SEGV_ACCERR);
 	}
 	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
 	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
-- 
2.27.0



* [PATCH v2 7/7] x86: mm: accelerate pagefault when badaccess
  2024-04-03  8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
                   ` (5 preceding siblings ...)
  2024-04-03  8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
@ 2024-04-03  8:38 ` Kefeng Wang
  2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
  7 siblings, 0 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03  8:38 UTC
  To: akpm
  Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
	linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
	x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, linuxppc-dev

The access_error() of the vma is already checked under the per-VMA
lock; if it is a bad access, directly handle the error, with no need to
retry with the mmap_lock again. In order to release the correct lock,
pass the mm_struct into bad_area_access_error(): if mm is NULL, release
the vma lock, otherwise release the mmap_lock. Since the page fault is
handled under the per-VMA lock, count it as a vma lock event with
VMA_LOCK_SUCCESS.

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/x86/mm/fault.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a4cc20d0036d..67b18adc75dd 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -866,14 +866,17 @@ bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 
 static void
 __bad_area(struct pt_regs *regs, unsigned long error_code,
-	   unsigned long address, u32 pkey, int si_code)
+	   unsigned long address, struct mm_struct *mm,
+	   struct vm_area_struct *vma, u32 pkey, int si_code)
 {
-	struct mm_struct *mm = current->mm;
 	/*
 	 * Something tried to access memory that isn't in our memory map..
 	 * Fix it, but check if it's kernel or user first..
 	 */
-	mmap_read_unlock(mm);
+	if (mm)
+		mmap_read_unlock(mm);
+	else
+		vma_end_read(vma);
 
 	__bad_area_nosemaphore(regs, error_code, address, pkey, si_code);
 }
@@ -897,7 +900,8 @@ static inline bool bad_area_access_from_pkeys(unsigned long error_code,
 
 static noinline void
 bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
-		      unsigned long address, struct vm_area_struct *vma)
+		      unsigned long address, struct mm_struct *mm,
+		      struct vm_area_struct *vma)
 {
 	/*
 	 * This OSPKE check is not strictly necessary at runtime.
@@ -927,9 +931,9 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
 		 */
 		u32 pkey = vma_pkey(vma);
 
-		__bad_area(regs, error_code, address, pkey, SEGV_PKUERR);
+		__bad_area(regs, error_code, address, mm, vma, pkey, SEGV_PKUERR);
 	} else {
-		__bad_area(regs, error_code, address, 0, SEGV_ACCERR);
+		__bad_area(regs, error_code, address, mm, vma, 0, SEGV_ACCERR);
 	}
 }
 
@@ -1357,8 +1361,9 @@ void do_user_addr_fault(struct pt_regs *regs,
 		goto lock_mmap;
 
 	if (unlikely(access_error(error_code, vma))) {
-		vma_end_read(vma);
-		goto lock_mmap;
+		bad_area_access_error(regs, error_code, address, NULL, vma);
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		return;
 	}
 	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
 	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
@@ -1394,7 +1399,7 @@ void do_user_addr_fault(struct pt_regs *regs,
 	 * we can handle it..
 	 */
 	if (unlikely(access_error(error_code, vma))) {
-		bad_area_access_error(regs, error_code, address, vma);
+		bad_area_access_error(regs, error_code, address, mm, vma);
 		return;
 	}
 
-- 
2.27.0



* Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
  2024-04-03  8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
                   ` (6 preceding siblings ...)
  2024-04-03  8:38 ` [PATCH v2 7/7] x86: " Kefeng Wang
@ 2024-04-03 20:45 ` Andrew Morton
  2024-04-07  7:49   ` Kefeng Wang
  7 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2024-04-03 20:45 UTC
  To: Kefeng Wang
  Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
	Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
	linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
	Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
	linux-mm

On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:

> With VMA lock-based page fault handling enabled, a bad access detected
> under the per-VMA lock falls back to mmap_lock-based handling, which
> leads to an unnecessary mmap_lock acquisition and a second VMA lookup.
> A test from lmbench shows a 34% improvement on arm64 after these
> changes:
> 
>   lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
> 
> Only build-tested on architectures other than arm64.

Thanks.  So we now want a bunch of architectures to runtime test this.  Do
we have a selftest in place which will adequately do this?


* Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
  2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
@ 2024-04-07  7:49   ` Kefeng Wang
  2024-04-07 17:19     ` Heiko Carstens
  0 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-07  7:49 UTC
  To: Andrew Morton
  Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
	Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
	linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
	Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
	linux-mm



On 2024/4/4 4:45, Andrew Morton wrote:
> On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> 
>> With VMA lock-based page fault handling enabled, a bad access detected
>> under the per-VMA lock falls back to mmap_lock-based handling, which
>> leads to an unnecessary mmap_lock acquisition and a second VMA lookup.
>> A test from lmbench shows a 34% improvement on arm64 after these
>> changes:
>>
>>    lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
>>
>> Only build-tested on architectures other than arm64.
> 
> Thanks.  So we now want a bunch of architectures to runtime test this.  Do
> we have a selftest in place which will adequately do this?

I didn't find such a selftest. A bad access leads to a coredump, so the
performance should not matter in most scenarios; going without a
selftest seems acceptable. lmbench is easy to use to measure the
performance.


* Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
  2024-04-07  7:49   ` Kefeng Wang
@ 2024-04-07 17:19     ` Heiko Carstens
  0 siblings, 0 replies; 19+ messages in thread
From: Heiko Carstens @ 2024-04-07 17:19 UTC
  To: Kefeng Wang
  Cc: Peter Zijlstra, Catalin Marinas, Dave Hansen, linux-mm,
	linux-riscv, Will Deacon, Alexander Gordeev, linux-s390, x86,
	Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, Andrew Morton, linuxppc-dev

On Sun, Apr 07, 2024 at 03:49:53PM +0800, Kefeng Wang wrote:
> On 2024/4/4 4:45, Andrew Morton wrote:
> > On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> > 
> > > With VMA lock-based page fault handling enabled, a bad access
> > > detected under the per-VMA lock falls back to mmap_lock-based
> > > handling, which leads to an unnecessary mmap_lock acquisition and a
> > > second VMA lookup. A test from lmbench shows a 34% improvement on
> > > arm64 after these changes:
> > > 
> > >    lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
> > > 
> > > Only build-tested on architectures other than arm64.
> > 
> > Thanks.  So we now want a bunch of architectures to runtime test this.  Do
> > we have a selftest in place which will adequately do this?
> 
> I didn't find such a selftest. A bad access leads to a coredump, so
> the performance should not matter in most scenarios; going without a
> selftest seems acceptable. lmbench is easy to use to measure the
> performance.

The rationale for this series (performance improvement) is a bit odd,
since I would expect that the changed code is usually never executed.


* Re: [PATCH v2 6/7] s390: mm: accelerate pagefault when badaccess
  2024-04-03  8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
@ 2024-04-07 17:19   ` Heiko Carstens
  0 siblings, 0 replies; 19+ messages in thread
From: Heiko Carstens @ 2024-04-07 17:19 UTC
  To: Kefeng Wang
  Cc: Peter Zijlstra, Catalin Marinas, Dave Hansen, linux-mm,
	linux-riscv, Will Deacon, Alexander Gordeev, linux-s390, x86,
	Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
	Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
	Palmer Dabbelt, akpm, linuxppc-dev

On Wed, Apr 03, 2024 at 04:38:04PM +0800, Kefeng Wang wrote:
> The vm_flags of the vma are already checked under the per-VMA lock;
> if it is a bad access, directly handle the error, with no need to
> retry with the mmap_lock again. Since the page fault is handled under
> the per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  arch/s390/mm/fault.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index c421dd44ffbe..162ca2576fd4 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -325,7 +325,8 @@ static void do_exception(struct pt_regs *regs, int access)
>  		goto lock_mmap;
>  	if (!(vma->vm_flags & access)) {
>  		vma_end_read(vma);
> -		goto lock_mmap;
> +		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> +		return handle_fault_error_nolock(regs, SEGV_ACCERR);

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>


* Re: [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess
  2024-04-03  8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
@ 2024-04-09  8:56   ` Michael Ellerman
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Ellerman @ 2024-04-09  8:56 UTC
  To: Kefeng Wang, akpm
  Cc: x86, Kefeng Wang, Albert Ou, linux-s390, Peter Zijlstra,
	Catalin Marinas, Dave Hansen, Russell King, surenb, linuxppc-dev,
	linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
	Paul Walmsley, Alexander Gordeev, Will Deacon, Gerald Schaefer,
	linux-arm-kernel, linux-mm

Kefeng Wang <wangkefeng.wang@huawei.com> writes:
> The access_[pkey]_error() of the vma is already checked under the
> per-VMA lock; if it is a bad access, directly handle the error, with
> no need to retry with the mmap_lock again. In order to release the
> correct lock, pass the mm_struct into bad_access_pkey()/bad_access():
> if mm is NULL, release the vma lock, otherwise release the mmap_lock.
> Since the page fault is handled under the per-VMA lock, count it as a
> vma lock event with VMA_LOCK_SUCCESS.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++-------------
>  1 file changed, 20 insertions(+), 13 deletions(-)

I thought there might be a nicer way to do this; plumbing the mm and vma
down through all those levels is a bit of a pain (vma->vm_mm exists
after all).

But I couldn't come up with anything obviously better, without doing
lots of refactoring first, which would be a pain to integrate into this
series.

So anyway, if the series goes ahead:

Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

cheers

> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 53335ae21a40..215690452495 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -71,23 +71,26 @@ static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long add
>  	return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
>  }
>  
> -static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
> +static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
> +		      struct mm_struct *mm, struct vm_area_struct *vma)
>  {
> -	struct mm_struct *mm = current->mm;
>  
>  	/*
>  	 * Something tried to access memory that isn't in our memory map..
>  	 * Fix it, but check if it's kernel or user first..
>  	 */
> -	mmap_read_unlock(mm);
> +	if (mm)
> +		mmap_read_unlock(mm);
> +	else
> +		vma_end_read(vma);
>  
>  	return __bad_area_nosemaphore(regs, address, si_code);
>  }
>  
>  static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
> +				    struct mm_struct *mm,
>  				    struct vm_area_struct *vma)
>  {
> -	struct mm_struct *mm = current->mm;
>  	int pkey;
>  
>  	/*
> @@ -109,7 +112,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
>  	 */
>  	pkey = vma_pkey(vma);
>  
> -	mmap_read_unlock(mm);
> +	if (mm)
> +		mmap_read_unlock(mm);
> +	else
> +		vma_end_read(vma);
>  
>  	/*
>  	 * If we are in kernel mode, bail out with a SEGV, this will
> @@ -124,9 +130,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
>  	return 0;
>  }
>  
> -static noinline int bad_access(struct pt_regs *regs, unsigned long address)
> +static noinline int bad_access(struct pt_regs *regs, unsigned long address,
> +			       struct mm_struct *mm, struct vm_area_struct *vma)
>  {
> -	return __bad_area(regs, address, SEGV_ACCERR);
> +	return __bad_area(regs, address, SEGV_ACCERR, mm, vma);
>  }
>  
>  static int do_sigbus(struct pt_regs *regs, unsigned long address,
> @@ -479,13 +486,13 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
>  
>  	if (unlikely(access_pkey_error(is_write, is_exec,
>  				       (error_code & DSISR_KEYFAULT), vma))) {
> -		vma_end_read(vma);
> -		goto lock_mmap;
> +		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> +		return bad_access_pkey(regs, address, NULL, vma);
>  	}
>  
>  	if (unlikely(access_error(is_write, is_exec, vma))) {
> -		vma_end_read(vma);
> -		goto lock_mmap;
> +		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> +		return bad_access(regs, address, NULL, vma);
>  	}
>  
>  	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> @@ -521,10 +528,10 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
>  
>  	if (unlikely(access_pkey_error(is_write, is_exec,
>  				       (error_code & DSISR_KEYFAULT), vma)))
> -		return bad_access_pkey(regs, address, vma);
> +		return bad_access_pkey(regs, address, mm, vma);
>  
>  	if (unlikely(access_error(is_write, is_exec, vma)))
> -		return bad_access(regs, address);
> +		return bad_access(regs, address, mm, vma);
>  
>  	/*
>  	 * If for any reason at all we couldn't handle the fault,
> -- 
> 2.27.0
>
>


* Re: [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault()
  2024-04-03  8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
@ 2024-04-09 11:14   ` Catalin Marinas
  0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2024-04-09 11:14 UTC
  To: Kefeng Wang
  Cc: x86, Albert Ou, linux-s390, Peter Zijlstra, Dave Hansen,
	Russell King, surenb, linuxppc-dev, linux-riscv, Palmer Dabbelt,
	Nicholas Piggin, Andy Lutomirski, Paul Walmsley, akpm,
	Gerald Schaefer, Will Deacon, Alexander Gordeev, linux-arm-kernel,
	linux-mm

On Wed, Apr 03, 2024 at 04:37:59PM +0800, Kefeng Wang wrote:
> __do_page_fault() only calls handle_mm_fault() after the vm_flags
> check, and it is only called by do_page_fault(), so squash it into
> do_page_fault() to clean up the code.
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

As I reviewed v1 and the changes are minimal:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS
  2024-04-03  8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
@ 2024-04-09 11:15   ` Catalin Marinas
  0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2024-04-09 11:15 UTC
  To: Kefeng Wang
  Cc: x86, Albert Ou, linux-s390, Peter Zijlstra, Dave Hansen,
	Russell King, surenb, linuxppc-dev, linux-riscv, Palmer Dabbelt,
	Nicholas Piggin, Andy Lutomirski, Paul Walmsley, akpm,
	Gerald Schaefer, Will Deacon, Alexander Gordeev, linux-arm-kernel,
	linux-mm

On Wed, Apr 03, 2024 at 04:38:00PM +0800, Kefeng Wang wrote:
> The vm_flags of the vma are already checked under the per-VMA lock;
> if it is a bad access, directly set fault to VM_FAULT_BADACCESS and
> handle the error, with no need to retry with the mmap_lock again. The
> latency of 'lat_sig -P 1 prot lat_sig' from the lmbench testcase is
> reduced by 34%.
> 
> Since the page fault is handled under the per-VMA lock, count it as a
> vma lock event with VMA_LOCK_SUCCESS.
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
  2024-04-03  8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
@ 2024-04-10  7:32   ` Alexandre Ghiti
  2024-04-10  8:07     ` Kefeng Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Alexandre Ghiti @ 2024-04-10  7:32 UTC
  To: Kefeng Wang, akpm
  Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
	Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
	linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
	Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
	linux-mm

Hi Kefeng,

On 03/04/2024 10:38, Kefeng Wang wrote:
> The access_error() of the vma is already checked under the per-VMA
> lock; if it is a bad access, directly handle the error, with no need
> to retry with the mmap_lock again. Since the page fault is handled
> under the per-VMA lock, count it as a vma lock event with
> VMA_LOCK_SUCCESS.
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>   arch/riscv/mm/fault.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index 3ba1d4dde5dd..b3fcf7d67efb 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>   
>   	if (unlikely(access_error(cause, vma))) {
>   		vma_end_read(vma);
> -		goto lock_mmap;
> +		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> +		tsk->thread.bad_cause = SEGV_ACCERR;


I think we should use the cause variable here instead of SEGV_ACCERR, as 
bad_cause is a riscv internal status which describes the real fault that 
happened.

Thanks,

Alex


> +		bad_area_nosemaphore(regs, code, addr);
> +		return;
>   	}
>   
>   	fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);


* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
  2024-04-10  7:32   ` Alexandre Ghiti
@ 2024-04-10  8:07     ` Kefeng Wang
  2024-04-10 17:28       ` Alexandre Ghiti
  0 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-10  8:07 UTC
  To: Alexandre Ghiti, akpm
  Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
	Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
	linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
	Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
	linux-mm



On 2024/4/10 15:32, Alexandre Ghiti wrote:
> Hi Kefeng,
> 
> On 03/04/2024 10:38, Kefeng Wang wrote:
>> The access_error() of the vma is already checked under the per-VMA
>> lock; if it is a bad access, directly handle the error, with no need
>> to retry with the mmap_lock again. Since the page fault is handled
>> under the per-VMA lock, count it as a vma lock event with
>> VMA_LOCK_SUCCESS.
>>
>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>   arch/riscv/mm/fault.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>> index 3ba1d4dde5dd..b3fcf7d67efb 100644
>> --- a/arch/riscv/mm/fault.c
>> +++ b/arch/riscv/mm/fault.c
>> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>>       if (unlikely(access_error(cause, vma))) {
>>           vma_end_read(vma);
>> -        goto lock_mmap;
>> +        count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>> +        tsk->thread.bad_cause = SEGV_ACCERR;
> 
> 
> I think we should use the cause variable here instead of SEGV_ACCERR, as 
> bad_cause is a riscv internal status which describes the real fault that 
> happened.

Oh, I see, it is the riscv exception cause, so it should be

diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index b3fcf7d67efb..5224f3733802 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)
         if (unlikely(access_error(cause, vma))) {
                 vma_end_read(vma);
                 count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
-               tsk->thread.bad_cause = SEGV_ACCERR;
-               bad_area_nosemaphore(regs, code, addr);
+               tsk->thread.bad_cause = cause;
+               bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
                 return;
         }

Hi Alex, could you help to check it?

Hi Andrew, please help to squash it in after Alex acks it.

Thanks both.


> 
> Thanks,
> 
> Alex
> 
> 
>> +        bad_area_nosemaphore(regs, code, addr);
>> +        return;
>>       }
>>       fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, 
>> regs);


* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
  2024-04-10  8:07     ` Kefeng Wang
@ 2024-04-10 17:28       ` Alexandre Ghiti
  2024-04-11  1:17         ` Kefeng Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Alexandre Ghiti @ 2024-04-10 17:28 UTC
  To: Kefeng Wang, akpm
  Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
	Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
	linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
	Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
	linux-mm

On 10/04/2024 10:07, Kefeng Wang wrote:
>
>
> On 2024/4/10 15:32, Alexandre Ghiti wrote:
>> Hi Kefeng,
>>
>> On 03/04/2024 10:38, Kefeng Wang wrote:
>>> The access_error() of the vma is already checked under the per-VMA
>>> lock; if it is a bad access, directly handle the error, with no need
>>> to retry with the mmap_lock again. Since the page fault is handled
>>> under the per-VMA lock, count it as a vma lock event with
>>> VMA_LOCK_SUCCESS.
>>>
>>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
>>>   arch/riscv/mm/fault.c | 5 ++++-
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>>> index 3ba1d4dde5dd..b3fcf7d67efb 100644
>>> --- a/arch/riscv/mm/fault.c
>>> +++ b/arch/riscv/mm/fault.c
>>> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>>>       if (unlikely(access_error(cause, vma))) {
>>>           vma_end_read(vma);
>>> -        goto lock_mmap;
>>> +        count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>>> +        tsk->thread.bad_cause = SEGV_ACCERR;
>>
>>
>> I think we should use the cause variable here instead of SEGV_ACCERR, 
>> as bad_cause is a riscv internal status which describes the real 
>> fault that happened.
>
> Oh, I see, it is the riscv exception cause, so it should be
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index b3fcf7d67efb..5224f3733802 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)
>         if (unlikely(access_error(cause, vma))) {
>                 vma_end_read(vma);
>                 count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> -               tsk->thread.bad_cause = SEGV_ACCERR;
> -               bad_area_nosemaphore(regs, code, addr);
> +               tsk->thread.bad_cause = cause;
> +               bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
>                 return;
>         }
>
> Hi Alex, could you help to check it?
>
> Hi Andrew, please help to squash it in after Alex acks it.
>
> Thanks both.


So I have just tested Kefeng's fixup on my usual CI and with a simple
program that triggers such a bad access; everything went fine, so with
the fixup applied:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex



>
>
>>
>> Thanks,
>>
>> Alex
>>
>>
>>> +        bad_area_nosemaphore(regs, code, addr);
>>> +        return;
>>>       }
>>>       fault = handle_mm_fault(vma, addr, flags | 
>>> FAULT_FLAG_VMA_LOCK, regs);


* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
  2024-04-10 17:28       ` Alexandre Ghiti
@ 2024-04-11  1:17         ` Kefeng Wang
  0 siblings, 0 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-11  1:17 UTC
  To: Alexandre Ghiti, akpm
  Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
	Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
	linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
	Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
	linux-mm



On 2024/4/11 1:28, Alexandre Ghiti wrote:
> On 10/04/2024 10:07, Kefeng Wang wrote:
>>
>>
>> On 2024/4/10 15:32, Alexandre Ghiti wrote:
>>> Hi Kefeng,
>>>
>>> On 03/04/2024 10:38, Kefeng Wang wrote:
>>>> The access_error() of the vma is already checked under the per-VMA
>>>> lock; if it is a bad access, directly handle the error, with no need
>>>> to retry with the mmap_lock again. Since the page fault is handled
>>>> under the per-VMA lock, count it as a vma lock event with
>>>> VMA_LOCK_SUCCESS.
>>>>
>>>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>> ---
>>>>   arch/riscv/mm/fault.c | 5 ++++-
>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>>>> index 3ba1d4dde5dd..b3fcf7d67efb 100644
>>>> --- a/arch/riscv/mm/fault.c
>>>> +++ b/arch/riscv/mm/fault.c
>>>> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>>>>       if (unlikely(access_error(cause, vma))) {
>>>>           vma_end_read(vma);
>>>> -        goto lock_mmap;
>>>> +        count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>>>> +        tsk->thread.bad_cause = SEGV_ACCERR;
>>>
>>>
>>> I think we should use the cause variable here instead of SEGV_ACCERR, 
>>> as bad_cause is a riscv internal status which describes the real 
>>> fault that happened.
>>
>> Oh, I see, it is the riscv exception cause, so it should be
>>
>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>> index b3fcf7d67efb..5224f3733802 100644
>> --- a/arch/riscv/mm/fault.c
>> +++ b/arch/riscv/mm/fault.c
>> @@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)
>>         if (unlikely(access_error(cause, vma))) {
>>                 vma_end_read(vma);
>>                 count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>> -               tsk->thread.bad_cause = SEGV_ACCERR;
>> -               bad_area_nosemaphore(regs, code, addr);
>> +               tsk->thread.bad_cause = cause;
>> +               bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
>>                 return;
>>         }
>>
>> Hi Alex, could you help to check it?
>>
>> Hi Andrew, please help to squash it in after Alex acks it.
>>
>> Thanks both.
> 
> 
> So I have just tested Kefeng's fixup on my usual CI and with a simple
> program that triggers such a bad access; everything went fine, so with
> the fixup applied:
> 
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Great, thanks.

> 
> Thanks,
> 
> Alex
> 
> 
> 
>>
>>
>>>
>>> Thanks,
>>>
>>> Alex
>>>
>>>
>>>> +        bad_area_nosemaphore(regs, code, addr);
>>>> +        return;
>>>>       }
>>>>       fault = handle_mm_fault(vma, addr, flags | 
>>>> FAULT_FLAG_VMA_LOCK, regs);
