* [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
@ 2024-04-03 8:37 Kefeng Wang
2024-04-03 8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
` (7 more replies)
0 siblings, 8 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:37 UTC
To: akpm
Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, linuxppc-dev
With VMA lock-based page fault handling enabled, a bad access detected
under the per-VMA lock falls back to mmap_lock-based handling, which
leads to an unnecessary mmap_lock acquisition and a second VMA lookup.
A test from lmbench shows a 34% improvement from these changes on arm64,
lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
Only build-tested on architectures other than arm64.
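For reference, the bad-access path can be exercised without lmbench by a
small program like the sketch below (illustration only, not part of the
series; error handling is omitted, and the iteration count is arbitrary):

/*
 * Sketch: time bad-access (SEGV_ACCERR) faults the way lat_sig's "prot"
 * test does, by repeatedly storing to a read-only anonymous mapping.
 */
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

static volatile char *page;
static long pgsz;

static void segv_handler(int sig, siginfo_t *si, void *uc)
{
	/* Make the page writable so the faulting store can complete. */
	mprotect((void *)page, pgsz, PROT_READ | PROT_WRITE);
}

int main(void)
{
	struct sigaction sa = { .sa_sigaction = segv_handler,
				.sa_flags = SA_SIGINFO };
	struct timespec t0, t1;
	long i, iters = 100000;

	pgsz = sysconf(_SC_PAGESIZE);
	sigaction(SIGSEGV, &sa, NULL);
	page = mmap(NULL, pgsz, PROT_READ,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++) {
		/* Re-arm: drop write permission, then fault on the store. */
		mprotect((void *)page, pgsz, PROT_READ);
		*page = 1;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%.5f us per fault\n",
	       ((t1.tv_sec - t0.tv_sec) * 1e9 + t1.tv_nsec - t0.tv_nsec) /
	       1e3 / iters);
	return 0;
}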
v2:
- improved the changelog and described the counting changes, as
suggested by Suren Baghdasaryan
- added Reviewed-by tags
Kefeng Wang (7):
arm64: mm: cleanup __do_page_fault()
arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS
arm: mm: accelerate pagefault when VM_FAULT_BADACCESS
powerpc: mm: accelerate pagefault when badaccess
riscv: mm: accelerate pagefault when badaccess
s390: mm: accelerate pagefault when badaccess
x86: mm: accelerate pagefault when badaccess
arch/arm/mm/fault.c | 4 +++-
arch/arm64/mm/fault.c | 31 ++++++++++---------------------
arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++-------------
arch/riscv/mm/fault.c | 5 ++++-
arch/s390/mm/fault.c | 3 ++-
arch/x86/mm/fault.c | 23 ++++++++++++++---------
6 files changed, 53 insertions(+), 46 deletions(-)
--
2.27.0
* [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault()
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
@ 2024-04-03 8:37 ` Kefeng Wang
2024-04-09 11:14 ` Catalin Marinas
2024-04-03 8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
` (6 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:37 UTC
To: akpm
Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, linuxppc-dev
__do_page_fault() only calls handle_mm_fault() after the vm_flags
check, and it is only called by do_page_fault(), so let's squash it
into do_page_fault() to clean up the code.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/arm64/mm/fault.c | 27 +++++++--------------------
1 file changed, 7 insertions(+), 20 deletions(-)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 8251e2fea9c7..9bb9f395351a 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -486,25 +486,6 @@ static void do_bad_area(unsigned long far, unsigned long esr,
}
}
-#define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000)
-#define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)
-
-static vm_fault_t __do_page_fault(struct mm_struct *mm,
- struct vm_area_struct *vma, unsigned long addr,
- unsigned int mm_flags, unsigned long vm_flags,
- struct pt_regs *regs)
-{
- /*
- * Ok, we have a good vm_area for this memory access, so we can handle
- * it.
- * Check that the permissions on the VMA allow for the fault which
- * occurred.
- */
- if (!(vma->vm_flags & vm_flags))
- return VM_FAULT_BADACCESS;
- return handle_mm_fault(vma, addr, mm_flags, regs);
-}
-
static bool is_el0_instruction_abort(unsigned long esr)
{
return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_LOW;
@@ -519,6 +500,9 @@ static bool is_write_abort(unsigned long esr)
return (esr & ESR_ELx_WNR) && !(esr & ESR_ELx_CM);
}
+#define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000)
+#define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)
+
static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
struct pt_regs *regs)
{
@@ -617,7 +601,10 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
goto done;
}
- fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
+ if (!(vma->vm_flags & vm_flags))
+ fault = VM_FAULT_BADACCESS;
+ else
+ fault = handle_mm_fault(vma, addr, mm_flags, regs);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
--
2.27.0
* [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
2024-04-03 8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-09 11:15 ` Catalin Marinas
2024-04-03 8:38 ` [PATCH v2 3/7] arm: " Kefeng Wang
` (5 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC
To: akpm
Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, linuxppc-dev
The vm_flags of the vma are already checked under the per-VMA lock; if it
is a bad access, directly set fault to VM_FAULT_BADACCESS and handle the
error, with no need to retry with the mmap_lock again. The latency of
'lat_sig -P 1 prot lat_sig' from the lmbench testcase is reduced by 34%.

Since the page fault is handled under the per-VMA lock, count it as a vma
lock event with VMA_LOCK_SUCCESS.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/arm64/mm/fault.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 9bb9f395351a..405f9aa831bd 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
if (!(vma->vm_flags & vm_flags)) {
vma_end_read(vma);
- goto lock_mmap;
+ fault = VM_FAULT_BADACCESS;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ goto done;
}
fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
--
2.27.0
* [PATCH v2 3/7] arm: mm: accelerate pagefault when VM_FAULT_BADACCESS
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
2024-04-03 8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
2024-04-03 8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-03 8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
` (4 subsequent siblings)
7 siblings, 0 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC
To: akpm
Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, linuxppc-dev
The vm_flags of the vma are already checked under the per-VMA lock; if
it is a bad access, directly set fault to VM_FAULT_BADACCESS and handle
the error, with no need to retry with the mmap_lock again. Since the
page fault is handled under the per-VMA lock, count it as a vma lock
event with VMA_LOCK_SUCCESS.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/arm/mm/fault.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 439dc6a26bb9..5c4b417e24f9 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -294,7 +294,9 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
if (!(vma->vm_flags & vm_flags)) {
vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ fault = VM_FAULT_BADACCESS;
+ goto bad_area;
}
fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
--
2.27.0
* [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (2 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 3/7] arm: " Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-09 8:56 ` Michael Ellerman
2024-04-03 8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
` (3 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC
To: akpm
Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, linuxppc-dev
The access_[pkey]_error() checks on the vma are already done under the
per-VMA lock; if it is a bad access, directly handle the error, with no
need to retry with the mmap_lock again. In order to release the correct
lock, pass the mm_struct into bad_access_pkey()/bad_access(): if mm is
NULL, release the vma lock, otherwise release the mmap_lock. Since the
page fault is handled under the per-VMA lock, count it as a vma lock
event with VMA_LOCK_SUCCESS.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++-------------
1 file changed, 20 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 53335ae21a40..215690452495 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -71,23 +71,26 @@ static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long add
return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
}
-static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
+static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
+ struct mm_struct *mm, struct vm_area_struct *vma)
{
- struct mm_struct *mm = current->mm;
/*
* Something tried to access memory that isn't in our memory map..
* Fix it, but check if it's kernel or user first..
*/
- mmap_read_unlock(mm);
+ if (mm)
+ mmap_read_unlock(mm);
+ else
+ vma_end_read(vma);
return __bad_area_nosemaphore(regs, address, si_code);
}
static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
+ struct mm_struct *mm,
struct vm_area_struct *vma)
{
- struct mm_struct *mm = current->mm;
int pkey;
/*
@@ -109,7 +112,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
*/
pkey = vma_pkey(vma);
- mmap_read_unlock(mm);
+ if (mm)
+ mmap_read_unlock(mm);
+ else
+ vma_end_read(vma);
/*
* If we are in kernel mode, bail out with a SEGV, this will
@@ -124,9 +130,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
return 0;
}
-static noinline int bad_access(struct pt_regs *regs, unsigned long address)
+static noinline int bad_access(struct pt_regs *regs, unsigned long address,
+ struct mm_struct *mm, struct vm_area_struct *vma)
{
- return __bad_area(regs, address, SEGV_ACCERR);
+ return __bad_area(regs, address, SEGV_ACCERR, mm, vma);
}
static int do_sigbus(struct pt_regs *regs, unsigned long address,
@@ -479,13 +486,13 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
if (unlikely(access_pkey_error(is_write, is_exec,
(error_code & DSISR_KEYFAULT), vma))) {
- vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ return bad_access_pkey(regs, address, NULL, vma);
}
if (unlikely(access_error(is_write, is_exec, vma))) {
- vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ return bad_access(regs, address, NULL, vma);
}
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
@@ -521,10 +528,10 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
if (unlikely(access_pkey_error(is_write, is_exec,
(error_code & DSISR_KEYFAULT), vma)))
- return bad_access_pkey(regs, address, vma);
+ return bad_access_pkey(regs, address, mm, vma);
if (unlikely(access_error(is_write, is_exec, vma)))
- return bad_access(regs, address);
+ return bad_access(regs, address, mm, vma);
/*
* If for any reason at all we couldn't handle the fault,
--
2.27.0
* [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (3 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-10 7:32 ` Alexandre Ghiti
2024-04-03 8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
` (2 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC
To: akpm
Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, linuxppc-dev
The access_error() check on the vma is already done under the per-VMA
lock; if it is a bad access, directly handle the error, with no need to
retry with the mmap_lock again. Since the page fault is handled under
the per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/riscv/mm/fault.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 3ba1d4dde5dd..b3fcf7d67efb 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
if (unlikely(access_error(cause, vma))) {
vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ tsk->thread.bad_cause = SEGV_ACCERR;
+ bad_area_nosemaphore(regs, code, addr);
+ return;
}
fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
--
2.27.0
* [PATCH v2 6/7] s390: mm: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (4 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-07 17:19 ` Heiko Carstens
2024-04-03 8:38 ` [PATCH v2 7/7] x86: " Kefeng Wang
2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC
To: akpm
Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, linuxppc-dev
The vm_flags of the vma are already checked under the per-VMA lock; if
it is a bad access, directly handle the error, with no need to retry
with the mmap_lock again. Since the page fault is handled under the
per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/s390/mm/fault.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index c421dd44ffbe..162ca2576fd4 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -325,7 +325,8 @@ static void do_exception(struct pt_regs *regs, int access)
goto lock_mmap;
if (!(vma->vm_flags & access)) {
vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ return handle_fault_error_nolock(regs, SEGV_ACCERR);
}
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
--
2.27.0
* [PATCH v2 7/7] x86: mm: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (5 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
7 siblings, 0 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC
To: akpm
Cc: Kefeng Wang, Peter Zijlstra, Catalin Marinas, Dave Hansen,
linux-mm, linux-riscv, Will Deacon, Alexander Gordeev, linux-s390,
x86, Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, linuxppc-dev
The access_error() check on the vma is already done under the per-VMA
lock; if it is a bad access, directly handle the error, with no need to
retry with the mmap_lock again. In order to release the correct lock,
pass the mm_struct into bad_area_access_error(): if mm is NULL, release
the vma lock, otherwise release the mmap_lock. Since the page fault is
handled under the per-VMA lock, count it as a vma lock event with
VMA_LOCK_SUCCESS.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/x86/mm/fault.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a4cc20d0036d..67b18adc75dd 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -866,14 +866,17 @@ bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
static void
__bad_area(struct pt_regs *regs, unsigned long error_code,
- unsigned long address, u32 pkey, int si_code)
+ unsigned long address, struct mm_struct *mm,
+ struct vm_area_struct *vma, u32 pkey, int si_code)
{
- struct mm_struct *mm = current->mm;
/*
* Something tried to access memory that isn't in our memory map..
* Fix it, but check if it's kernel or user first..
*/
- mmap_read_unlock(mm);
+ if (mm)
+ mmap_read_unlock(mm);
+ else
+ vma_end_read(vma);
__bad_area_nosemaphore(regs, error_code, address, pkey, si_code);
}
@@ -897,7 +900,8 @@ static inline bool bad_area_access_from_pkeys(unsigned long error_code,
static noinline void
bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
- unsigned long address, struct vm_area_struct *vma)
+ unsigned long address, struct mm_struct *mm,
+ struct vm_area_struct *vma)
{
/*
* This OSPKE check is not strictly necessary at runtime.
@@ -927,9 +931,9 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
*/
u32 pkey = vma_pkey(vma);
- __bad_area(regs, error_code, address, pkey, SEGV_PKUERR);
+ __bad_area(regs, error_code, address, mm, vma, pkey, SEGV_PKUERR);
} else {
- __bad_area(regs, error_code, address, 0, SEGV_ACCERR);
+ __bad_area(regs, error_code, address, mm, vma, 0, SEGV_ACCERR);
}
}
@@ -1357,8 +1361,9 @@ void do_user_addr_fault(struct pt_regs *regs,
goto lock_mmap;
if (unlikely(access_error(error_code, vma))) {
- vma_end_read(vma);
- goto lock_mmap;
+ bad_area_access_error(regs, error_code, address, NULL, vma);
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ return;
}
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
@@ -1394,7 +1399,7 @@ void do_user_addr_fault(struct pt_regs *regs,
* we can handle it..
*/
if (unlikely(access_error(error_code, vma))) {
- bad_area_access_error(regs, error_code, address, vma);
+ bad_area_access_error(regs, error_code, address, mm, vma);
return;
}
--
2.27.0
* Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (6 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 7/7] x86: " Kefeng Wang
@ 2024-04-03 20:45 ` Andrew Morton
2024-04-07 7:49 ` Kefeng Wang
7 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2024-04-03 20:45 UTC
To: Kefeng Wang
Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
linux-mm
On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> With VMA lock-based page fault handling enabled, a bad access detected
> under the per-VMA lock falls back to mmap_lock-based handling, which
> leads to an unnecessary mmap_lock acquisition and a second VMA lookup.
> A test from lmbench shows a 34% improvement from these changes on arm64,
>
> lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
>
> Only build-tested on architectures other than arm64.
Thanks. So we now want a bunch of architectures to runtime test this. Do
we have a selftest in place which will adequately do this?
* Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
@ 2024-04-07 7:49 ` Kefeng Wang
2024-04-07 17:19 ` Heiko Carstens
0 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-07 7:49 UTC
To: Andrew Morton
Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
linux-mm
On 2024/4/4 4:45, Andrew Morton wrote:
> On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>> With VMA lock-based page fault handling enabled, a bad access detected
>> under the per-VMA lock falls back to mmap_lock-based handling, which
>> leads to an unnecessary mmap_lock acquisition and a second VMA lookup.
>> A test from lmbench shows a 34% improvement from these changes on arm64,
>>
>> lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
>>
>> Only build-tested on architectures other than arm64.
>
> Thanks. So we now want a bunch of architectures to runtime test this. Do
> we have a selftest in place which will adequately do this?
I didn't find such a selftest, and a bad access leads to a coredump, so
the performance should not matter in most scenarios; going without a
selftest seems acceptable. lmbench is easy to use to measure the
performance.
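If a functional (rather than performance) check were wanted, something
as small as the sketch below might do; note this is only an illustration,
not an existing kselftest:

/*
 * Hypothetical check: fault on a read-only mapping and verify the kernel
 * delivers SIGSEGV with si_code == SEGV_ACCERR.
 */
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf env;
static volatile int seen_accerr;

static void handler(int sig, siginfo_t *si, void *uc)
{
	seen_accerr = (si->si_code == SEGV_ACCERR);
	siglongjmp(env, 1);	/* restores the mask saved by sigsetjmp() */
}

int main(void)
{
	struct sigaction sa = { .sa_sigaction = handler,
				.sa_flags = SA_SIGINFO };
	long pgsz = sysconf(_SC_PAGESIZE);
	volatile char *p;

	sigaction(SIGSEGV, &sa, NULL);
	p = mmap(NULL, pgsz, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (!sigsetjmp(env, 1))
		*p = 1;	/* store to a read-only page -> SEGV_ACCERR */

	printf("%s\n", seen_accerr ? "PASS" : "FAIL");
	return !seen_accerr;
}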
* Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
2024-04-07 7:49 ` Kefeng Wang
@ 2024-04-07 17:19 ` Heiko Carstens
0 siblings, 0 replies; 19+ messages in thread
From: Heiko Carstens @ 2024-04-07 17:19 UTC
To: Kefeng Wang
Cc: Peter Zijlstra, Catalin Marinas, Dave Hansen, linux-mm,
linux-riscv, Will Deacon, Alexander Gordeev, linux-s390, x86,
Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, Andrew Morton, linuxppc-dev
On Sun, Apr 07, 2024 at 03:49:53PM +0800, Kefeng Wang wrote:
> On 2024/4/4 4:45, Andrew Morton wrote:
> > On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> >
> > > With VMA lock-based page fault handling enabled, a bad access detected
> > > under the per-VMA lock falls back to mmap_lock-based handling, which
> > > leads to an unnecessary mmap_lock acquisition and a second VMA lookup.
> > > A test from lmbench shows a 34% improvement from these changes on arm64,
> > >
> > > lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
> > >
> > > Only build-tested on architectures other than arm64.
> >
> > Thanks. So we now want a bunch of architectures to runtime test this. Do
> > we have a selftest in place which will adequately do this?
>
> I didn't find such a selftest, and a bad access leads to a coredump, so
> the performance should not matter in most scenarios; going without a
> selftest seems acceptable. lmbench is easy to use to measure the
> performance.
The rationale for this series (performance improvement) is a bit odd,
since I would expect that the changed code is usually never executed.
* Re: [PATCH v2 6/7] s390: mm: accelerate pagefault when badaccess
2024-04-03 8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
@ 2024-04-07 17:19 ` Heiko Carstens
0 siblings, 0 replies; 19+ messages in thread
From: Heiko Carstens @ 2024-04-07 17:19 UTC
To: Kefeng Wang
Cc: Peter Zijlstra, Catalin Marinas, Dave Hansen, linux-mm,
linux-riscv, Will Deacon, Alexander Gordeev, linux-s390, x86,
Russell King, Gerald Schaefer, Albert Ou, Nicholas Piggin,
Andy Lutomirski, Paul Walmsley, surenb, linux-arm-kernel,
Palmer Dabbelt, akpm, linuxppc-dev
On Wed, Apr 03, 2024 at 04:38:04PM +0800, Kefeng Wang wrote:
> The vm_flags of the vma are already checked under the per-VMA lock; if
> it is a bad access, directly handle the error, with no need to retry
> with the mmap_lock again. Since the page fault is handled under the
> per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> arch/s390/mm/fault.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index c421dd44ffbe..162ca2576fd4 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -325,7 +325,8 @@ static void do_exception(struct pt_regs *regs, int access)
> goto lock_mmap;
> if (!(vma->vm_flags & access)) {
> vma_end_read(vma);
> - goto lock_mmap;
> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> + return handle_fault_error_nolock(regs, SEGV_ACCERR);
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
* Re: [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess
2024-04-03 8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
@ 2024-04-09 8:56 ` Michael Ellerman
0 siblings, 0 replies; 19+ messages in thread
From: Michael Ellerman @ 2024-04-09 8:56 UTC
To: Kefeng Wang, akpm
Cc: x86, Kefeng Wang, Albert Ou, linux-s390, Peter Zijlstra,
Catalin Marinas, Dave Hansen, Russell King, surenb, linuxppc-dev,
linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
Paul Walmsley, Alexander Gordeev, Will Deacon, Gerald Schaefer,
linux-arm-kernel, linux-mm
Kefeng Wang <wangkefeng.wang@huawei.com> writes:
> The access_[pkey]_error() checks on the vma are already done under the
> per-VMA lock; if it is a bad access, directly handle the error, with no
> need to retry with the mmap_lock again. In order to release the correct
> lock, pass the mm_struct into bad_access_pkey()/bad_access(): if mm is
> NULL, release the vma lock, otherwise release the mmap_lock. Since the
> page fault is handled under the per-VMA lock, count it as a vma lock
> event with VMA_LOCK_SUCCESS.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++-------------
> 1 file changed, 20 insertions(+), 13 deletions(-)
I thought there might be a nicer way to do this; plumbing the mm and vma
down through all those levels is a bit of a pain (vma->vm_mm exists,
after all).
But I couldn't come up with anything obviously better without doing
lots of refactoring first, which would be a pain to integrate into this
series.
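Just to illustrate what I mean (an untested sketch, not something being
asked for in this series), the helper could derive the mm from the VMA
and take a flag saying which lock is held:

/*
 * Untested sketch: use vma->vm_mm instead of passing the mm down,
 * with a flag to pick the unlock path.
 */
static int __bad_area(struct pt_regs *regs, unsigned long address,
		      int si_code, struct vm_area_struct *vma,
		      bool under_vma_lock)
{
	if (under_vma_lock)
		vma_end_read(vma);
	else
		mmap_read_unlock(vma->vm_mm);

	return __bad_area_nosemaphore(regs, address, si_code);
}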
So anyway, if the series goes ahead:
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
cheers
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 53335ae21a40..215690452495 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -71,23 +71,26 @@ static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long add
> return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
> }
>
> -static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
> +static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
> + struct mm_struct *mm, struct vm_area_struct *vma)
> {
> - struct mm_struct *mm = current->mm;
>
> /*
> * Something tried to access memory that isn't in our memory map..
> * Fix it, but check if it's kernel or user first..
> */
> - mmap_read_unlock(mm);
> + if (mm)
> + mmap_read_unlock(mm);
> + else
> + vma_end_read(vma);
>
> return __bad_area_nosemaphore(regs, address, si_code);
> }
>
> static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
> + struct mm_struct *mm,
> struct vm_area_struct *vma)
> {
> - struct mm_struct *mm = current->mm;
> int pkey;
>
> /*
> @@ -109,7 +112,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
> */
> pkey = vma_pkey(vma);
>
> - mmap_read_unlock(mm);
> + if (mm)
> + mmap_read_unlock(mm);
> + else
> + vma_end_read(vma);
>
> /*
> * If we are in kernel mode, bail out with a SEGV, this will
> @@ -124,9 +130,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
> return 0;
> }
>
> -static noinline int bad_access(struct pt_regs *regs, unsigned long address)
> +static noinline int bad_access(struct pt_regs *regs, unsigned long address,
> + struct mm_struct *mm, struct vm_area_struct *vma)
> {
> - return __bad_area(regs, address, SEGV_ACCERR);
> + return __bad_area(regs, address, SEGV_ACCERR, mm, vma);
> }
>
> static int do_sigbus(struct pt_regs *regs, unsigned long address,
> @@ -479,13 +486,13 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
>
> if (unlikely(access_pkey_error(is_write, is_exec,
> (error_code & DSISR_KEYFAULT), vma))) {
> - vma_end_read(vma);
> - goto lock_mmap;
> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> + return bad_access_pkey(regs, address, NULL, vma);
> }
>
> if (unlikely(access_error(is_write, is_exec, vma))) {
> - vma_end_read(vma);
> - goto lock_mmap;
> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> + return bad_access(regs, address, NULL, vma);
> }
>
> fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> @@ -521,10 +528,10 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
>
> if (unlikely(access_pkey_error(is_write, is_exec,
> (error_code & DSISR_KEYFAULT), vma)))
> - return bad_access_pkey(regs, address, vma);
> + return bad_access_pkey(regs, address, mm, vma);
>
> if (unlikely(access_error(is_write, is_exec, vma)))
> - return bad_access(regs, address);
> + return bad_access(regs, address, mm, vma);
>
> /*
> * If for any reason at all we couldn't handle the fault,
> --
> 2.27.0
>
>
* Re: [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault()
2024-04-03 8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
@ 2024-04-09 11:14 ` Catalin Marinas
0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2024-04-09 11:14 UTC
To: Kefeng Wang
Cc: x86, Albert Ou, linux-s390, Peter Zijlstra, Dave Hansen,
Russell King, surenb, linuxppc-dev, linux-riscv, Palmer Dabbelt,
Nicholas Piggin, Andy Lutomirski, Paul Walmsley, akpm,
Gerald Schaefer, Will Deacon, Alexander Gordeev, linux-arm-kernel,
linux-mm
On Wed, Apr 03, 2024 at 04:37:59PM +0800, Kefeng Wang wrote:
> __do_page_fault() only calls handle_mm_fault() after the vm_flags
> check, and it is only called by do_page_fault(), so let's squash it
> into do_page_fault() to clean up the code.
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
As I reviewed v1 and the changes are minimal:
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS
2024-04-03 8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
@ 2024-04-09 11:15 ` Catalin Marinas
0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2024-04-09 11:15 UTC
To: Kefeng Wang
Cc: x86, Albert Ou, linux-s390, Peter Zijlstra, Dave Hansen,
Russell King, surenb, linuxppc-dev, linux-riscv, Palmer Dabbelt,
Nicholas Piggin, Andy Lutomirski, Paul Walmsley, akpm,
Gerald Schaefer, Will Deacon, Alexander Gordeev, linux-arm-kernel,
linux-mm
On Wed, Apr 03, 2024 at 04:38:00PM +0800, Kefeng Wang wrote:
> The vm_flags of the vma are already checked under the per-VMA lock; if
> it is a bad access, directly set fault to VM_FAULT_BADACCESS and handle
> the error, with no need to retry with the mmap_lock again. The latency
> of 'lat_sig -P 1 prot lat_sig' from the lmbench testcase is reduced by
> 34%.
>
> Since the page fault is handled under the per-VMA lock, count it as a
> vma lock event with VMA_LOCK_SUCCESS.
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-03 8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
@ 2024-04-10 7:32 ` Alexandre Ghiti
2024-04-10 8:07 ` Kefeng Wang
0 siblings, 1 reply; 19+ messages in thread
From: Alexandre Ghiti @ 2024-04-10 7:32 UTC
To: Kefeng Wang, akpm
Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
linux-mm
Hi Kefeng,
On 03/04/2024 10:38, Kefeng Wang wrote:
> The access_error() check on the vma is already done under the per-VMA
> lock; if it is a bad access, directly handle the error, with no need to
> retry with the mmap_lock again. Since the page fault is handled under
> the per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> arch/riscv/mm/fault.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index 3ba1d4dde5dd..b3fcf7d67efb 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>
> if (unlikely(access_error(cause, vma))) {
> vma_end_read(vma);
> - goto lock_mmap;
> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> + tsk->thread.bad_cause = SEGV_ACCERR;
I think we should use the cause variable here instead of SEGV_ACCERR, as
bad_cause is a riscv internal status which describes the real fault that
happened.
Thanks,
Alex
> + bad_area_nosemaphore(regs, code, addr);
> + return;
> }
>
> fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-10 7:32 ` Alexandre Ghiti
@ 2024-04-10 8:07 ` Kefeng Wang
2024-04-10 17:28 ` Alexandre Ghiti
0 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-10 8:07 UTC
To: Alexandre Ghiti, akpm
Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
linux-mm
On 2024/4/10 15:32, Alexandre Ghiti wrote:
> Hi Kefeng,
>
> On 03/04/2024 10:38, Kefeng Wang wrote:
>> The access_error() check on the vma is already done under the per-VMA
>> lock; if it is a bad access, directly handle the error, with no need to
>> retry with the mmap_lock again. Since the page fault is handled under
>> the per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.
>>
>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> arch/riscv/mm/fault.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>> index 3ba1d4dde5dd..b3fcf7d67efb 100644
>> --- a/arch/riscv/mm/fault.c
>> +++ b/arch/riscv/mm/fault.c
>> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>> if (unlikely(access_error(cause, vma))) {
>> vma_end_read(vma);
>> - goto lock_mmap;
>> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>> + tsk->thread.bad_cause = SEGV_ACCERR;
>
>
> I think we should use the cause variable here instead of SEGV_ACCERR, as
> bad_cause is a riscv internal status which describes the real fault that
> happened.
Oh, I see, it is the exception cause on riscv, so it should be:
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index b3fcf7d67efb..5224f3733802 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)
if (unlikely(access_error(cause, vma))) {
vma_end_read(vma);
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
- tsk->thread.bad_cause = SEGV_ACCERR;
- bad_area_nosemaphore(regs, code, addr);
+ tsk->thread.bad_cause = cause;
+ bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
return;
}
Hi Alex, could you help to check it?
Hi Andrew, please help to squash it in after Alex acks it.
Thanks both.
>
> Thanks,
>
> Alex
>
>
>> + bad_area_nosemaphore(regs, code, addr);
>> + return;
>> }
>> fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK,
>> regs);
* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-10 8:07 ` Kefeng Wang
@ 2024-04-10 17:28 ` Alexandre Ghiti
2024-04-11 1:17 ` Kefeng Wang
0 siblings, 1 reply; 19+ messages in thread
From: Alexandre Ghiti @ 2024-04-10 17:28 UTC
To: Kefeng Wang, akpm
Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
linux-mm
On 10/04/2024 10:07, Kefeng Wang wrote:
>
>
> On 2024/4/10 15:32, Alexandre Ghiti wrote:
>> Hi Kefeng,
>>
>> On 03/04/2024 10:38, Kefeng Wang wrote:
>>> The access_error() check on the vma is already done under the per-VMA
>>> lock; if it is a bad access, directly handle the error, with no need to
>>> retry with the mmap_lock again. Since the page fault is handled under
>>> the per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.
>>>
>>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
>>> arch/riscv/mm/fault.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>>> index 3ba1d4dde5dd..b3fcf7d67efb 100644
>>> --- a/arch/riscv/mm/fault.c
>>> +++ b/arch/riscv/mm/fault.c
>>> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>>> if (unlikely(access_error(cause, vma))) {
>>> vma_end_read(vma);
>>> - goto lock_mmap;
>>> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>>> + tsk->thread.bad_cause = SEGV_ACCERR;
>>
>>
>> I think we should use the cause variable here instead of SEGV_ACCERR,
>> as bad_cause is a riscv internal status which describes the real
>> fault that happened.
>
> Oh, I see, it is the exception cause on riscv, so it should be:
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index b3fcf7d67efb..5224f3733802 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)
> if (unlikely(access_error(cause, vma))) {
> vma_end_read(vma);
> count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> - tsk->thread.bad_cause = SEGV_ACCERR;
> - bad_area_nosemaphore(regs, code, addr);
> + tsk->thread.bad_cause = cause;
> + bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
> return;
> }
>
> Hi Alex, could you help to check it?
>
> Hi Andrew, please help to squash it in after Alex acks it.
>
> Thanks both.
So I have just tested Kefeng's fixup on my usual CI and with a simple
program that triggers such a bad access; everything went fine, so with
the fixup applied:
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Thanks,
Alex
>
>
>>
>> Thanks,
>>
>> Alex
>>
>>
>>> + bad_area_nosemaphore(regs, code, addr);
>>> + return;
>>> }
>>> fault = handle_mm_fault(vma, addr, flags |
>>> FAULT_FLAG_VMA_LOCK, regs);
* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-10 17:28 ` Alexandre Ghiti
@ 2024-04-11 1:17 ` Kefeng Wang
0 siblings, 0 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-11 1:17 UTC
To: Alexandre Ghiti, akpm
Cc: x86, linux-s390, Albert Ou, linuxppc-dev, Peter Zijlstra,
Catalin Marinas, Paul Walmsley, Russell King, surenb, Dave Hansen,
linux-riscv, Palmer Dabbelt, Nicholas Piggin, Andy Lutomirski,
Alexander Gordeev, Will Deacon, Gerald Schaefer, linux-arm-kernel,
linux-mm
On 2024/4/11 1:28, Alexandre Ghiti wrote:
> On 10/04/2024 10:07, Kefeng Wang wrote:
>>
>>
>> On 2024/4/10 15:32, Alexandre Ghiti wrote:
>>> Hi Kefeng,
>>>
>>> On 03/04/2024 10:38, Kefeng Wang wrote:
>>>> The access_error() check on the vma is already done under the per-VMA
>>>> lock; if it is a bad access, directly handle the error, with no need to
>>>> retry with the mmap_lock again. Since the page fault is handled under
>>>> the per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.
>>>>
>>>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>> ---
>>>> arch/riscv/mm/fault.c | 5 ++++-
>>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>>>> index 3ba1d4dde5dd..b3fcf7d67efb 100644
>>>> --- a/arch/riscv/mm/fault.c
>>>> +++ b/arch/riscv/mm/fault.c
>>>> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>>>> if (unlikely(access_error(cause, vma))) {
>>>> vma_end_read(vma);
>>>> - goto lock_mmap;
>>>> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>>>> + tsk->thread.bad_cause = SEGV_ACCERR;
>>>
>>>
>>> I think we should use the cause variable here instead of SEGV_ACCERR,
>>> as bad_cause is a riscv internal status which describes the real
>>> fault that happened.
>>
>> Oh, I see, it is the exception cause on riscv, so it should be:
>>
>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>> index b3fcf7d67efb..5224f3733802 100644
>> --- a/arch/riscv/mm/fault.c
>> +++ b/arch/riscv/mm/fault.c
>> @@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)
>> if (unlikely(access_error(cause, vma))) {
>> vma_end_read(vma);
>> count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>> - tsk->thread.bad_cause = SEGV_ACCERR;
>> - bad_area_nosemaphore(regs, code, addr);
>> + tsk->thread.bad_cause = cause;
>> + bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
>> return;
>> }
>>
>> Hi Alex, could you help to check it?
>>
>> Hi Andrew, please help to squash it in after Alex acks it.
>>
>> Thanks both.
>
>
> So I have just tested Kefeng's fixup on my usual CI and with a simple
> program that triggers such a bad access; everything went fine, so with
> the fixup applied:
>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Great, thanks.
>
> Thanks,
>
> Alex
>
>
>
>>
>>
>>>
>>> Thanks,
>>>
>>> Alex
>>>
>>>
>>>> + bad_area_nosemaphore(regs, code, addr);
>>>> + return;
>>>> }
>>>> fault = handle_mm_fault(vma, addr, flags |
>>>> FAULT_FLAG_VMA_LOCK, regs);