LKML Archive mirror
* [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
@ 2024-03-23 14:15 Uladzislau Rezki (Sony)
  2024-03-23 23:22 ` Guenter Roeck
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Uladzislau Rezki (Sony) @ 2024-03-23 14:15 UTC (permalink / raw
  To: linux-mm, Andrew Morton
  Cc: LKML, Baoquan He, Lorenzo Stoakes, Christoph Hellwig,
	Matthew Wilcox, Dave Chinner, Guenter Roeck, Uladzislau Rezki,
	Oleksiy Avramchenko

During boot the s390 system triggers "spinlock bad magic" messages
if spinlock debugging is enabled:

[    0.465445] BUG: spinlock bad magic on CPU#0, swapper/0
[    0.465490]  lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
[    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[    0.466270] Call Trace:
[    0.466470]  [<00000000011f26c8>] dump_stack_lvl+0x98/0xd8
[    0.466516]  [<00000000001dcc6a>] do_raw_spin_lock+0x8a/0x108
[    0.466545]  [<000000000042146c>] find_vmap_area+0x6c/0x108
[    0.466572]  [<000000000042175a>] find_vm_area+0x22/0x40
[    0.466597]  [<000000000012f152>] __set_memory+0x132/0x150
[    0.466624]  [<0000000001cc0398>] vmem_map_init+0x40/0x118
[    0.466651]  [<0000000001cc0092>] paging_init+0x22/0x68
[    0.466677]  [<0000000001cbbed2>] setup_arch+0x52a/0x708
[    0.466702]  [<0000000001cb6140>] start_kernel+0x80/0x5c8
[    0.466727]  [<0000000000100036>] startup_continue+0x36/0x40

It happens because the system tries to access vmap areas
before the vmalloc initialization is done:

[    0.465490] lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
[    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[    0.466270] Call Trace:
[    0.466470] dump_stack_lvl (lib/dump_stack.c:117)
[    0.466516] do_raw_spin_lock (kernel/locking/spinlock_debug.c:87 kernel/locking/spinlock_debug.c:115)
[    0.466545] find_vmap_area (mm/vmalloc.c:1059 mm/vmalloc.c:2364)
[    0.466572] find_vm_area (mm/vmalloc.c:3150)
[    0.466597] __set_memory (arch/s390/mm/pageattr.c:360 arch/s390/mm/pageattr.c:393)
[    0.466624] vmem_map_init (./arch/s390/include/asm/set_memory.h:55 arch/s390/mm/vmem.c:660)
[    0.466651] paging_init (arch/s390/mm/init.c:97)
[    0.466677] setup_arch (arch/s390/kernel/setup.c:972)
[    0.466702] start_kernel (init/main.c:899)
[    0.466727] startup_continue (arch/s390/kernel/head64.S:35)
[    0.466811] INFO: lockdep is turned off.
...
[    0.718250] vmalloc init - busy lock init 0000000002871860
[    0.718328] vmalloc init - busy lock init 00000000028731b8

Some background: this worked before because the lock in question was
statically defined and initialized. Now the locks and data structures
are initialized in the vmalloc_init() function.
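
For context, the relevant difference looks roughly like the sketch below. This
is a simplified illustration rather than the exact mm/vmalloc.c code; the
per-node field and variable names are abbreviated:

    /* Before: the lock was statically initialized and valid from first use. */
    static DEFINE_SPINLOCK(vmap_area_lock);

    /* After (simplified): the per-node locks only become valid once
     * vmalloc_init() has run, which on s390 is after setup_arch() has
     * already reached find_vmap_area() via __set_memory(). */
    void __init vmalloc_init(void)
    {
            struct vmap_node *vn;
            int i;

            for (i = 0; i < nr_vmap_nodes; i++) {
                    vn = &vmap_nodes[i];
                    spin_lock_init(&vn->busy.lock);
            }
            /* ... rest of the init ... */
            vmap_initialized = true;
    }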

To address this issue, add a check of the "vmap_initialized" variable;
if it is not set, find_vmap_area() bails out on entry and returns NULL.

Fixes: 72210662c5a2 ("mm: vmalloc: offload free_vmap_area_lock lock")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
 mm/vmalloc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 22aa63f4ef63..0d77d171b5d9 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2343,6 +2343,9 @@ struct vmap_area *find_vmap_area(unsigned long addr)
 	struct vmap_area *va;
 	int i, j;
 
+	if (unlikely(!vmap_initialized))
+		return NULL;
+
 	/*
 	 * An addr_to_node_id(addr) converts an address to a node index
 	 * where a VA is located. If VA spans several zones and passed
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
  2024-03-23 14:15 [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init Uladzislau Rezki (Sony)
@ 2024-03-23 23:22 ` Guenter Roeck
  2024-03-24  2:56 ` Baoquan He
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Guenter Roeck @ 2024-03-23 23:22 UTC (permalink / raw
  To: Uladzislau Rezki (Sony), linux-mm, Andrew Morton
  Cc: LKML, Baoquan He, Lorenzo Stoakes, Christoph Hellwig,
	Matthew Wilcox, Dave Chinner, Oleksiy Avramchenko

On 3/23/24 07:15, Uladzislau Rezki (Sony) wrote:
> During the boot the s390 system triggers "spinlock bad magic" messages
> if the spinlock debugging is enabled:
> 
> [    0.465445] BUG: spinlock bad magic on CPU#0, swapper/0
> [    0.465490]  lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
> [    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    0.466270] Call Trace:
> [    0.466470]  [<00000000011f26c8>] dump_stack_lvl+0x98/0xd8
> [    0.466516]  [<00000000001dcc6a>] do_raw_spin_lock+0x8a/0x108
> [    0.466545]  [<000000000042146c>] find_vmap_area+0x6c/0x108
> [    0.466572]  [<000000000042175a>] find_vm_area+0x22/0x40
> [    0.466597]  [<000000000012f152>] __set_memory+0x132/0x150
> [    0.466624]  [<0000000001cc0398>] vmem_map_init+0x40/0x118
> [    0.466651]  [<0000000001cc0092>] paging_init+0x22/0x68
> [    0.466677]  [<0000000001cbbed2>] setup_arch+0x52a/0x708
> [    0.466702]  [<0000000001cb6140>] start_kernel+0x80/0x5c8
> [    0.466727]  [<0000000000100036>] startup_continue+0x36/0x40
> 
> it happens because such system tries to access some vmap areas
> whereas the vmalloc initialization is not even yet done:
> 
> [    0.465490] lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
> [    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    0.466270] Call Trace:
> [    0.466470] dump_stack_lvl (lib/dump_stack.c:117)
> [    0.466516] do_raw_spin_lock (kernel/locking/spinlock_debug.c:87 kernel/locking/spinlock_debug.c:115)
> [    0.466545] find_vmap_area (mm/vmalloc.c:1059 mm/vmalloc.c:2364)
> [    0.466572] find_vm_area (mm/vmalloc.c:3150)
> [    0.466597] __set_memory (arch/s390/mm/pageattr.c:360 arch/s390/mm/pageattr.c:393)
> [    0.466624] vmem_map_init (./arch/s390/include/asm/set_memory.h:55 arch/s390/mm/vmem.c:660)
> [    0.466651] paging_init (arch/s390/mm/init.c:97)
> [    0.466677] setup_arch (arch/s390/kernel/setup.c:972)
> [    0.466702] start_kernel (init/main.c:899)
> [    0.466727] startup_continue (arch/s390/kernel/head64.S:35)
> [    0.466811] INFO: lockdep is turned off.
> ...
> [    0.718250] vmalloc init - busy lock init 0000000002871860
> [    0.718328] vmalloc init - busy lock init 00000000028731b8
> 
> Some background. It worked before because the lock that is in question
> was statically defined and initialized. As of now, the locks and data
> structures are initialized in the vmalloc_init() function.
> 
> To address that issue add the check whether the "vmap_initialized"
> variable is set, if not find_vmap_area() bails out on entry returning NULL.
> 
> Fixes: 72210662c5a2 ("mm: vmalloc: offload free_vmap_area_lock lock")
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

Tested-by: Guenter Roeck <linux@roeck-us.net>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
  2024-03-23 14:15 [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init Uladzislau Rezki (Sony)
  2024-03-23 23:22 ` Guenter Roeck
@ 2024-03-24  2:56 ` Baoquan He
  2024-03-24 23:32 ` Christoph Hellwig
  2024-03-26  7:48 ` Heiko Carstens
  3 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2024-03-24  2:56 UTC (permalink / raw
  To: Uladzislau Rezki (Sony)
  Cc: linux-mm, Andrew Morton, LKML, Lorenzo Stoakes, Christoph Hellwig,
	Matthew Wilcox, Dave Chinner, Guenter Roeck, Oleksiy Avramchenko

On 03/23/24 at 03:15pm, Uladzislau Rezki (Sony) wrote:
> During the boot the s390 system triggers "spinlock bad magic" messages
> if the spinlock debugging is enabled:
> 
> [    0.465445] BUG: spinlock bad magic on CPU#0, swapper/0
> [    0.465490]  lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
> [    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    0.466270] Call Trace:
> [    0.466470]  [<00000000011f26c8>] dump_stack_lvl+0x98/0xd8
> [    0.466516]  [<00000000001dcc6a>] do_raw_spin_lock+0x8a/0x108
> [    0.466545]  [<000000000042146c>] find_vmap_area+0x6c/0x108
> [    0.466572]  [<000000000042175a>] find_vm_area+0x22/0x40
> [    0.466597]  [<000000000012f152>] __set_memory+0x132/0x150
> [    0.466624]  [<0000000001cc0398>] vmem_map_init+0x40/0x118
> [    0.466651]  [<0000000001cc0092>] paging_init+0x22/0x68
> [    0.466677]  [<0000000001cbbed2>] setup_arch+0x52a/0x708
> [    0.466702]  [<0000000001cb6140>] start_kernel+0x80/0x5c8
> [    0.466727]  [<0000000000100036>] startup_continue+0x36/0x40
> 
> it happens because such system tries to access some vmap areas
> whereas the vmalloc initialization is not even yet done:
> 
> [    0.465490] lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
> [    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    0.466270] Call Trace:
> [    0.466470] dump_stack_lvl (lib/dump_stack.c:117)
> [    0.466516] do_raw_spin_lock (kernel/locking/spinlock_debug.c:87 kernel/locking/spinlock_debug.c:115)
> [    0.466545] find_vmap_area (mm/vmalloc.c:1059 mm/vmalloc.c:2364)
> [    0.466572] find_vm_area (mm/vmalloc.c:3150)
> [    0.466597] __set_memory (arch/s390/mm/pageattr.c:360 arch/s390/mm/pageattr.c:393)
> [    0.466624] vmem_map_init (./arch/s390/include/asm/set_memory.h:55 arch/s390/mm/vmem.c:660)
> [    0.466651] paging_init (arch/s390/mm/init.c:97)
> [    0.466677] setup_arch (arch/s390/kernel/setup.c:972)
> [    0.466702] start_kernel (init/main.c:899)
> [    0.466727] startup_continue (arch/s390/kernel/head64.S:35)
> [    0.466811] INFO: lockdep is turned off.
> ...
> [    0.718250] vmalloc init - busy lock init 0000000002871860
> [    0.718328] vmalloc init - busy lock init 00000000028731b8
> 
> Some background. It worked before because the lock that is in question
> was statically defined and initialized. As of now, the locks and data
> structures are initialized in the vmalloc_init() function.
> 
> To address that issue add the check whether the "vmap_initialized"
> variable is set, if not find_vmap_area() bails out on entry returning NULL.
> 
> Fixes: 72210662c5a2 ("mm: vmalloc: offload free_vmap_area_lock lock")
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  mm/vmalloc.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 22aa63f4ef63..0d77d171b5d9 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2343,6 +2343,9 @@ struct vmap_area *find_vmap_area(unsigned long addr)
>  	struct vmap_area *va;
>  	int i, j;
>  
> +	if (unlikely(!vmap_initialized))
> +		return NULL;
> +
>  	/*
>  	 * An addr_to_node_id(addr) converts an address to a node index
>  	 * where a VA is located. If VA spans several zones and passed

LGTM,

Reviewed-by: Baoquan He <bhe@redhat.com>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
  2024-03-23 14:15 [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init Uladzislau Rezki (Sony)
  2024-03-23 23:22 ` Guenter Roeck
  2024-03-24  2:56 ` Baoquan He
@ 2024-03-24 23:32 ` Christoph Hellwig
  2024-03-25  9:39   ` Heiko Carstens
  2024-03-26  7:48 ` Heiko Carstens
  3 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2024-03-24 23:32 UTC (permalink / raw
  To: Uladzislau Rezki (Sony)
  Cc: linux-mm, Andrew Morton, LKML, Baoquan He, Lorenzo Stoakes,
	Christoph Hellwig, Matthew Wilcox, Dave Chinner, Guenter Roeck,
	Oleksiy Avramchenko, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	linux-s390

I guess this is ok as an urgent bandaid to get s390 booting again,
but calling find_vmap_area before the vmap area is initialized
seems to be an actual issue in the s390 mm init code.

Adding the s390 maintainers to see if they have an idea how this could
be fixed in a better way.


On Sat, Mar 23, 2024 at 03:15:44PM +0100, Uladzislau Rezki (Sony) wrote:
> During the boot the s390 system triggers "spinlock bad magic" messages
> if the spinlock debugging is enabled:
> 
> [    0.465445] BUG: spinlock bad magic on CPU#0, swapper/0
> [    0.465490]  lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
> [    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    0.466270] Call Trace:
> [    0.466470]  [<00000000011f26c8>] dump_stack_lvl+0x98/0xd8
> [    0.466516]  [<00000000001dcc6a>] do_raw_spin_lock+0x8a/0x108
> [    0.466545]  [<000000000042146c>] find_vmap_area+0x6c/0x108
> [    0.466572]  [<000000000042175a>] find_vm_area+0x22/0x40
> [    0.466597]  [<000000000012f152>] __set_memory+0x132/0x150
> [    0.466624]  [<0000000001cc0398>] vmem_map_init+0x40/0x118
> [    0.466651]  [<0000000001cc0092>] paging_init+0x22/0x68
> [    0.466677]  [<0000000001cbbed2>] setup_arch+0x52a/0x708
> [    0.466702]  [<0000000001cb6140>] start_kernel+0x80/0x5c8
> [    0.466727]  [<0000000000100036>] startup_continue+0x36/0x40
> 
> it happens because such system tries to access some vmap areas
> whereas the vmalloc initialization is not even yet done:
> 
> [    0.465490] lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
> [    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    0.466270] Call Trace:
> [    0.466470] dump_stack_lvl (lib/dump_stack.c:117)
> [    0.466516] do_raw_spin_lock (kernel/locking/spinlock_debug.c:87 kernel/locking/spinlock_debug.c:115)
> [    0.466545] find_vmap_area (mm/vmalloc.c:1059 mm/vmalloc.c:2364)
> [    0.466572] find_vm_area (mm/vmalloc.c:3150)
> [    0.466597] __set_memory (arch/s390/mm/pageattr.c:360 arch/s390/mm/pageattr.c:393)
> [    0.466624] vmem_map_init (./arch/s390/include/asm/set_memory.h:55 arch/s390/mm/vmem.c:660)
> [    0.466651] paging_init (arch/s390/mm/init.c:97)
> [    0.466677] setup_arch (arch/s390/kernel/setup.c:972)
> [    0.466702] start_kernel (init/main.c:899)
> [    0.466727] startup_continue (arch/s390/kernel/head64.S:35)
> [    0.466811] INFO: lockdep is turned off.
> ...
> [    0.718250] vmalloc init - busy lock init 0000000002871860
> [    0.718328] vmalloc init - busy lock init 00000000028731b8
> 
> Some background. It worked before because the lock that is in question
> was statically defined and initialized. As of now, the locks and data
> structures are initialized in the vmalloc_init() function.
> 
> To address that issue add the check whether the "vmap_initialized"
> variable is set, if not find_vmap_area() bails out on entry returning NULL.
> 
> Fixes: 72210662c5a2 ("mm: vmalloc: offload free_vmap_area_lock lock")
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  mm/vmalloc.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 22aa63f4ef63..0d77d171b5d9 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2343,6 +2343,9 @@ struct vmap_area *find_vmap_area(unsigned long addr)
>  	struct vmap_area *va;
>  	int i, j;
>  
> +	if (unlikely(!vmap_initialized))
> +		return NULL;
> +
>  	/*
>  	 * An addr_to_node_id(addr) converts an address to a node index
>  	 * where a VA is located. If VA spans several zones and passed
> -- 
> 2.39.2
> 
---end quoted text---

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
  2024-03-24 23:32 ` Christoph Hellwig
@ 2024-03-25  9:39   ` Heiko Carstens
  2024-03-25 10:09     ` Baoquan He
  0 siblings, 1 reply; 10+ messages in thread
From: Heiko Carstens @ 2024-03-25  9:39 UTC (permalink / raw
  To: Christoph Hellwig
  Cc: Uladzislau Rezki (Sony), linux-mm, Andrew Morton, LKML,
	Baoquan He, Lorenzo Stoakes, Matthew Wilcox, Dave Chinner,
	Guenter Roeck, Oleksiy Avramchenko, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	linux-s390

On Sun, Mar 24, 2024 at 04:32:00PM -0700, Christoph Hellwig wrote:
> On Sat, Mar 23, 2024 at 03:15:44PM +0100, Uladzislau Rezki (Sony) wrote:
> > During the boot the s390 system triggers "spinlock bad magic" messages
> > if the spinlock debugging is enabled:
> > 
> > [    0.465445] BUG: spinlock bad magic on CPU#0, swapper/0
> > [    0.465490]  lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> > [    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
> > [    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> > [    0.466270] Call Trace:
> > [    0.466470]  [<00000000011f26c8>] dump_stack_lvl+0x98/0xd8
> > [    0.466516]  [<00000000001dcc6a>] do_raw_spin_lock+0x8a/0x108
> > [    0.466545]  [<000000000042146c>] find_vmap_area+0x6c/0x108
> > [    0.466572]  [<000000000042175a>] find_vm_area+0x22/0x40
> > [    0.466597]  [<000000000012f152>] __set_memory+0x132/0x150
> > [    0.466624]  [<0000000001cc0398>] vmem_map_init+0x40/0x118
> > [    0.466651]  [<0000000001cc0092>] paging_init+0x22/0x68
> > [    0.466677]  [<0000000001cbbed2>] setup_arch+0x52a/0x708
> > [    0.466702]  [<0000000001cb6140>] start_kernel+0x80/0x5c8
> > [    0.466727]  [<0000000000100036>] startup_continue+0x36/0x40
...
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 22aa63f4ef63..0d77d171b5d9 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2343,6 +2343,9 @@ struct vmap_area *find_vmap_area(unsigned long addr)
> >  	struct vmap_area *va;
> >  	int i, j;
> >  
> > +	if (unlikely(!vmap_initialized))
> > +		return NULL;
> > +
>
> I guess this is ok as an urgend bandaid to get s390 booting again,
> but calling find_vmap_area before the vmap area is initialized
> seems an actual issue in the s390 mm init code.
> 
> Adding the s390 maintainers to see if they have and idea how this could
> get fixed in a better way.

I'm going to push the patch below to the s390 git tree later. This is not a
piece of art, but I wanted to avoid externalizing vmalloc's vmap_initialized,
or coming up with some s390-specific change_page_attr_alias_early() variant where
sooner or later nobody remembers what "early" means.

So this seems to be "good enough".

From 0308cd304fa3b01904c6060e2115234101811e48 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <hca@linux.ibm.com>
Date: Thu, 21 Mar 2024 09:41:20 +0100
Subject: [PATCH] s390/mm,pageattr: avoid early calls into vmalloc code

The vmalloc code got changed and doesn't have the global statically
initialized vmap_area_lock spinlock anymore. This leads to the following
lockdep splat when find_vm_area() is called before the vmalloc code is
initialized:

BUG: spinlock bad magic on CPU#0, swapper/0
 lock: single+0x1868/0x1978, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0

CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-11767-g23956900041d #1
Hardware name: IBM 3931 A01 701 (KVM/Linux)
Call Trace:
 [<00000000010d840a>] dump_stack_lvl+0xba/0x148
 [<00000000001fdf5c>] do_raw_spin_unlock+0x7c/0xd0
 [<000000000111d848>] _raw_spin_unlock+0x38/0x68
 [<0000000000485830>] find_vmap_area+0xb0/0x108
 [<0000000000485ada>] find_vm_area+0x22/0x40
 [<0000000000132bbc>] __set_memory+0xbc/0x140
 [<0000000001a7f048>] vmem_map_init+0x40/0x158
 [<0000000001a7edc8>] paging_init+0x28/0x80
 [<0000000001a7a6e2>] setup_arch+0x4b2/0x6d8
 [<0000000001a74438>] start_kernel+0x98/0x4b0
 [<0000000000100036>] startup_continue+0x36/0x40
INFO: lockdep is turned off.

Add a slab_is_available() check to change_page_attr_alias() in order to
avoid early calls into vmalloc code. slab_is_available() is not exactly
what is needed, but there is currently no other way to tell if the vmalloc
code is initialized or not, and there is no reason to expose
e.g. vmap_initialized from vmalloc to achieve the same.

The Fixes tag does not mean that the referenced commit is broken, but that
there is a dependency on this commit if the vmalloc commit should be
backported.

Fixes: d093602919ad ("mm: vmalloc: remove global vmap_area_root rb-tree")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
---
 arch/s390/mm/pageattr.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/s390/mm/pageattr.c b/arch/s390/mm/pageattr.c
index 01bc8fad64d6..b6c6453d66e2 100644
--- a/arch/s390/mm/pageattr.c
+++ b/arch/s390/mm/pageattr.c
@@ -344,6 +344,9 @@ static int change_page_attr_alias(unsigned long addr, unsigned long end,
 	struct vm_struct *area;
 	int rc = 0;
 
+	/* Avoid early calls into not initialized vmalloc code. */
+	if (!slab_is_available())
+		return 0;
 	/*
 	 * Changes to read-only permissions on kernel VA mappings are also
 	 * applied to the kernel direct mapping. Execute permissions are
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
  2024-03-25  9:39   ` Heiko Carstens
@ 2024-03-25 10:09     ` Baoquan He
  2024-03-25 11:16       ` Heiko Carstens
  0 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2024-03-25 10:09 UTC (permalink / raw
  To: Heiko Carstens
  Cc: Christoph Hellwig, Uladzislau Rezki (Sony), linux-mm,
	Andrew Morton, LKML, Lorenzo Stoakes, Matthew Wilcox,
	Dave Chinner, Guenter Roeck, Oleksiy Avramchenko, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	linux-s390

On 03/25/24 at 10:39am, Heiko Carstens wrote:
> On Sun, Mar 24, 2024 at 04:32:00PM -0700, Christoph Hellwig wrote:
> > On Sat, Mar 23, 2024 at 03:15:44PM +0100, Uladzislau Rezki (Sony) wrote:
......snip
> > I guess this is ok as an urgend bandaid to get s390 booting again,
> > but calling find_vmap_area before the vmap area is initialized
> > seems an actual issue in the s390 mm init code.
> > 
> > Adding the s390 maintainers to see if they have and idea how this could
> > get fixed in a better way.
> 
> I'm going to push the patch below to the s390 git tree later. This is not a
> piece of art, but I wanted to avoid to externalize vmalloc's vmap_initialized,
> or come up with some s390 specific change_page_attr_alias_early() variant where
> sooner or later nobody remembers what "early" means.
> 
> So this seems to be "good enough".
> 
> From 0308cd304fa3b01904c6060e2115234101811e48 Mon Sep 17 00:00:00 2001
> From: Heiko Carstens <hca@linux.ibm.com>
> Date: Thu, 21 Mar 2024 09:41:20 +0100
> Subject: [PATCH] s390/mm,pageattr: avoid early calls into vmalloc code
> 
> The vmalloc code got changed and doesn't have the global statically
> initialized vmap_area_lock spinlock anymore. This leads to the following
> lockdep splat when find_vm_area() is called before the vmalloc code is
> initialized:
> 
> BUG: spinlock bad magic on CPU#0, swapper/0
>  lock: single+0x1868/0x1978, .magic: 00000000, .owner: swapper/0, .owner_cpu: 0
> 
> CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-11767-g23956900041d #1
> Hardware name: IBM 3931 A01 701 (KVM/Linux)
> Call Trace:
>  [<00000000010d840a>] dump_stack_lvl+0xba/0x148
>  [<00000000001fdf5c>] do_raw_spin_unlock+0x7c/0xd0
>  [<000000000111d848>] _raw_spin_unlock+0x38/0x68
>  [<0000000000485830>] find_vmap_area+0xb0/0x108
>  [<0000000000485ada>] find_vm_area+0x22/0x40
>  [<0000000000132bbc>] __set_memory+0xbc/0x140
>  [<0000000001a7f048>] vmem_map_init+0x40/0x158
>  [<0000000001a7edc8>] paging_init+0x28/0x80
>  [<0000000001a7a6e2>] setup_arch+0x4b2/0x6d8
>  [<0000000001a74438>] start_kernel+0x98/0x4b0
>  [<0000000000100036>] startup_continue+0x36/0x40
> INFO: lockdep is turned off.
> 
> Add a slab_is_available() check to change_page_attr_alias() in order to
> avoid early calls into vmalloc code. slab_is_available() is not exactly
> what is needed, but there is currently no other way to tell if the vmalloc
> code is initialized or not, and there is no reason to expose
> e.g. vmap_initialized from vmalloc to achieve the same.

If so, I would rather add a vmalloc_is_available() to achieve the same.
The added code and the code comment will definitely confuse people and
make them dig into why.
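
A minimal sketch of what is meant (hypothetical: no such helper exists in
mm/vmalloc.c today, the name and placement are only illustrative):

    /* Hypothetical helper: report whether vmalloc_init() has run. */
    bool vmalloc_is_available(void)
    {
            return vmap_initialized;
    }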

> 
> The fixes tag does not mean that the referenced commit is broken, but that
> there is a dependency to this commit if the vmalloc commit should be
> backported.
> 
> Fixes: d093602919ad ("mm: vmalloc: remove global vmap_area_root rb-tree")
> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
> ---
>  arch/s390/mm/pageattr.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/s390/mm/pageattr.c b/arch/s390/mm/pageattr.c
> index 01bc8fad64d6..b6c6453d66e2 100644
> --- a/arch/s390/mm/pageattr.c
> +++ b/arch/s390/mm/pageattr.c
> @@ -344,6 +344,9 @@ static int change_page_attr_alias(unsigned long addr, unsigned long end,
>  	struct vm_struct *area;
>  	int rc = 0;
>  
> +	/* Avoid early calls into not initialized vmalloc code. */
> +	if (!slab_is_available())
> +		return 0;
>  	/*
>  	 * Changes to read-only permissions on kernel VA mappings are also
>  	 * applied to the kernel direct mapping. Execute permissions are
> -- 
> 2.40.1
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
  2024-03-25 10:09     ` Baoquan He
@ 2024-03-25 11:16       ` Heiko Carstens
  2024-03-25 13:25         ` Baoquan He
  0 siblings, 1 reply; 10+ messages in thread
From: Heiko Carstens @ 2024-03-25 11:16 UTC (permalink / raw
  To: Baoquan He
  Cc: Christoph Hellwig, Uladzislau Rezki (Sony), linux-mm,
	Andrew Morton, LKML, Lorenzo Stoakes, Matthew Wilcox,
	Dave Chinner, Guenter Roeck, Oleksiy Avramchenko, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	linux-s390

On Mon, Mar 25, 2024 at 06:09:26PM +0800, Baoquan He wrote:
> On 03/25/24 at 10:39am, Heiko Carstens wrote:
> > On Sun, Mar 24, 2024 at 04:32:00PM -0700, Christoph Hellwig wrote:
> > > On Sat, Mar 23, 2024 at 03:15:44PM +0100, Uladzislau Rezki (Sony) wrote:
> ......snip
> > > I guess this is ok as an urgend bandaid to get s390 booting again,
> > > but calling find_vmap_area before the vmap area is initialized
> > > seems an actual issue in the s390 mm init code.
> > > 
> > > Adding the s390 maintainers to see if they have and idea how this could
> > > get fixed in a better way.
> > 
> > I'm going to push the patch below to the s390 git tree later. This is not a
> > piece of art, but I wanted to avoid to externalize vmalloc's vmap_initialized,
> > or come up with some s390 specific change_page_attr_alias_early() variant where
> > sooner or later nobody remembers what "early" means.
> > 
> > So this seems to be "good enough".
...
> > Add a slab_is_available() check to change_page_attr_alias() in order to
> > avoid early calls into vmalloc code. slab_is_available() is not exactly
> > what is needed, but there is currently no other way to tell if the vmalloc
> > code is initialized or not, and there is no reason to expose
> > e.g. vmap_initialized from vmalloc to achieve the same.
> 
> If so, I would rather add a vmalloc_is_available() to achieve the same.
> The added code and the code comment definitely will confuse people and
> make people to dig why.

So after having given this a bit more thought I think Uladzislau's patch is
probably the best way to address this.

It seems better for the vmalloc code to just do the right thing,
regardless of how early it is called, instead of adding yet another
subsystem_xyz_is_available() call.

Alternatively this could be addressed in the s390 code with some sort of
"early" calls, but as already stated, sooner or later nobody would remember
what "early" means, and even if that were remembered: would that
restriction still be valid?
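
Spelled out, the two options compare roughly like this (only the added guards,
copied from the two patches in this thread; the surrounding code is omitted):

    /* Option 1: every early caller guards itself, as the s390 patch above
     * does at the top of change_page_attr_alias(): */
    if (!slab_is_available())       /* or a new vmalloc_is_available() */
            return 0;

    /* Option 2: the callee tolerates early calls, as Uladzislau's patch
     * does at the top of find_vmap_area(): */
    if (unlikely(!vmap_initialized))
            return NULL;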

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
  2024-03-25 11:16       ` Heiko Carstens
@ 2024-03-25 13:25         ` Baoquan He
  0 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2024-03-25 13:25 UTC (permalink / raw
  To: Heiko Carstens
  Cc: Christoph Hellwig, Uladzislau Rezki (Sony), linux-mm,
	Andrew Morton, LKML, Lorenzo Stoakes, Matthew Wilcox,
	Dave Chinner, Guenter Roeck, Oleksiy Avramchenko, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	linux-s390

On 03/25/24 at 12:16pm, Heiko Carstens wrote:
> On Mon, Mar 25, 2024 at 06:09:26PM +0800, Baoquan He wrote:
> > On 03/25/24 at 10:39am, Heiko Carstens wrote:
> > > On Sun, Mar 24, 2024 at 04:32:00PM -0700, Christoph Hellwig wrote:
> > > > On Sat, Mar 23, 2024 at 03:15:44PM +0100, Uladzislau Rezki (Sony) wrote:
> > ......snip
> > > > I guess this is ok as an urgend bandaid to get s390 booting again,
> > > > but calling find_vmap_area before the vmap area is initialized
> > > > seems an actual issue in the s390 mm init code.
> > > > 
> > > > Adding the s390 maintainers to see if they have and idea how this could
> > > > get fixed in a better way.
> > > 
> > > I'm going to push the patch below to the s390 git tree later. This is not a
> > > piece of art, but I wanted to avoid to externalize vmalloc's vmap_initialized,
> > > or come up with some s390 specific change_page_attr_alias_early() variant where
> > > sooner or later nobody remembers what "early" means.
> > > 
> > > So this seems to be "good enough".
> ...
> > > Add a slab_is_available() check to change_page_attr_alias() in order to
> > > avoid early calls into vmalloc code. slab_is_available() is not exactly
> > > what is needed, but there is currently no other way to tell if the vmalloc
> > > code is initialized or not, and there is no reason to expose
> > > e.g. vmap_initialized from vmalloc to achieve the same.
> > 
> > If so, I would rather add a vmalloc_is_available() to achieve the same.
> > The added code and the code comment definitely will confuse people and
> > make people to dig why.
> 
> So after having given this a bit more thought I think Uladzislau's patch is
> probably the best way to address this.
> 
> It seems to be better that the vmalloc code would just do the right thing,
> regardless how early it is called, instead of adding yet another
> subsystem_xyz_is_available() call.
> 
> Alternatively this could be addressed in s390 code with some sort of
> "early" calls, but as already stated, sooner or later nobody would remember
> what "early" means, and even if that would be remembered: would that
> restriction still be valid?

I agree, with Uladzislau's patch it's better to let the vmalloc code do the
right thing whether it's called early or not.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
  2024-03-23 14:15 [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init Uladzislau Rezki (Sony)
                   ` (2 preceding siblings ...)
  2024-03-24 23:32 ` Christoph Hellwig
@ 2024-03-26  7:48 ` Heiko Carstens
  2024-03-26  9:25   ` Uladzislau Rezki
  3 siblings, 1 reply; 10+ messages in thread
From: Heiko Carstens @ 2024-03-26  7:48 UTC (permalink / raw
  To: Uladzislau Rezki (Sony)
  Cc: linux-mm, Andrew Morton, LKML, Baoquan He, Lorenzo Stoakes,
	Christoph Hellwig, Matthew Wilcox, Dave Chinner, Guenter Roeck,
	Oleksiy Avramchenko

On Sat, Mar 23, 2024 at 03:15:44PM +0100, Uladzislau Rezki (Sony) wrote:
> During the boot the s390 system triggers "spinlock bad magic" messages
> if the spinlock debugging is enabled:
...
> Some background. It worked before because the lock that is in question
> was statically defined and initialized. As of now, the locks and data
> structures are initialized in the vmalloc_init() function.
> 
> To address that issue add the check whether the "vmap_initialized"
> variable is set, if not find_vmap_area() bails out on entry returning NULL.
> 
> Fixes: 72210662c5a2 ("mm: vmalloc: offload free_vmap_area_lock lock")
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> ---
>  mm/vmalloc.c | 3 +++
>  1 file changed, 3 insertions(+)

Thanks for fixing this, and FWIW:
Acked-by: Heiko Carstens <hca@linux.ibm.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init
  2024-03-26  7:48 ` Heiko Carstens
@ 2024-03-26  9:25   ` Uladzislau Rezki
  0 siblings, 0 replies; 10+ messages in thread
From: Uladzislau Rezki @ 2024-03-26  9:25 UTC (permalink / raw
  To: Heiko Carstens
  Cc: Uladzislau Rezki (Sony), linux-mm, Andrew Morton, LKML,
	Baoquan He, Lorenzo Stoakes, Christoph Hellwig, Matthew Wilcox,
	Dave Chinner, Guenter Roeck, Oleksiy Avramchenko

On Tue, Mar 26, 2024 at 08:48:46AM +0100, Heiko Carstens wrote:
> On Sat, Mar 23, 2024 at 03:15:44PM +0100, Uladzislau Rezki (Sony) wrote:
> > During the boot the s390 system triggers "spinlock bad magic" messages
> > if the spinlock debugging is enabled:
> ...
> > Some background. It worked before because the lock that is in question
> > was statically defined and initialized. As of now, the locks and data
> > structures are initialized in the vmalloc_init() function.
> > 
> > To address that issue add the check whether the "vmap_initialized"
> > variable is set, if not find_vmap_area() bails out on entry returning NULL.
> > 
> > Fixes: 72210662c5a2 ("mm: vmalloc: offload free_vmap_area_lock lock")
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > ---
> >  mm/vmalloc.c | 3 +++
> >  1 file changed, 3 insertions(+)
> 
> Thanks for fixing this, and FWIW:
> Acked-by: Heiko Carstens <hca@linux.ibm.com>
>
You are welcome and thank you!

--
Uladzislau Rezki

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-03-26  9:26 UTC | newest]

Thread overview: 10+ messages
2024-03-23 14:15 [PATCH 1/1] mm: vmalloc: Bail out early in find_vmap_area() if vmap is not init Uladzislau Rezki (Sony)
2024-03-23 23:22 ` Guenter Roeck
2024-03-24  2:56 ` Baoquan He
2024-03-24 23:32 ` Christoph Hellwig
2024-03-25  9:39   ` Heiko Carstens
2024-03-25 10:09     ` Baoquan He
2024-03-25 11:16       ` Heiko Carstens
2024-03-25 13:25         ` Baoquan He
2024-03-26  7:48 ` Heiko Carstens
2024-03-26  9:25   ` Uladzislau Rezki
