* cassini: possible recursive locking detected
From: Meelis Roos @ 2014-05-06 9:39 UTC
To: netdev
While installing Linux on a Sun Fire V480, any traffic on the built-in cassini
NIC caused a hang. I worked around this by using a Broadcom NIC and tried a
kernel with most debugging options enabled. This resulted in the following
warning. Maybe this is the deadlock I was seeing?
[ 88.316595] =============================================
[ 88.316597] [ INFO: possible recursive locking detected ]
[ 88.316603] 3.15.0-rc4-00202-g30321c7-dirty #11 Not tainted
[ 88.316605] ---------------------------------------------
[ 88.316608] swapper/3/1 is trying to acquire lock:
[ 88.316644] (&(&cp->tx_lock[i])->rlock){..-...}, at: [<0000000000745da0>] cas_link_timer+0xa0/0x460
[ 88.316646]
[ 88.316646] but task is already holding lock:
[ 88.316657] (&(&cp->tx_lock[i])->rlock){..-...}, at: [<0000000000745da0>] cas_link_timer+0xa0/0x460
[ 88.316659]
[ 88.316659] other info that might help us debug this:
[ 88.316661] Possible unsafe locking scenario:
[ 88.316661]
[ 88.316662] CPU0
[ 88.316664] ----
[ 88.316668] lock(&(&cp->tx_lock[i])->rlock);
[ 88.316671] lock(&(&cp->tx_lock[i])->rlock);
[ 88.316672]
[ 88.316672] *** DEADLOCK ***
[ 88.316672]
[ 88.316674] May be due to missing lock nesting notation
[ 88.316674]
[ 88.316677] 3 locks held by swapper/3/1:
[ 88.316694] #0: ((&cp->link_timer)){+.-...}, at: [<0000000000465f80>] call_timer_fn+0x0/0xe0
[ 88.316706] #1: (&(&cp->lock)->rlock){..-...}, at: [<0000000000745d80>] cas_link_timer+0x80/0x460
[ 88.316716] #2: (&(&cp->tx_lock[i])->rlock){..-...}, at: [<0000000000745da0>] cas_link_timer+0xa0/0x460
[ 88.316718]
[ 88.316718] stack backtrace:
[ 88.316724] CPU: 2 PID: 1 Comm: swapper/3 Not tainted 3.15.0-rc4-00202-g30321c7-dirty #11
[ 88.316727] Call Trace:
[ 88.316743] [00000000004a2c5c] __lock_acquire+0x10fc/0x1fa0
[ 88.316749] [00000000004a406c] lock_acquire+0x4c/0x80
[ 88.316760] [000000000083e07c] _raw_spin_lock+0x1c/0x40
[ 88.316765] [0000000000745da0] cas_link_timer+0xa0/0x460
[ 88.316769] [0000000000465fc8] call_timer_fn+0x48/0xe0
[ 88.316775] [00000000004665d4] run_timer_softirq+0x214/0x280
[ 88.316788] [000000000045f650] __do_softirq+0xf0/0x240
[ 88.316800] [000000000042bd0c] do_softirq_own_stack+0x2c/0x40
[ 88.316804] [000000000045fb44] irq_exit+0xc4/0xe0
[ 88.316814] [000000000042fcc8] timer_interrupt+0x88/0xc0
[ 88.316819] [0000000000426b84] valid_addr_bitmap_patch+0xbc/0x238
[ 88.316826] [00000000004ab2f8] vprintk_emit+0x1d8/0x540
[ 88.316842] [0000000000835fb8] printk+0x34/0x48
[ 88.316847] [00000000004ac3e0] register_console+0x340/0x3e0
[ 88.316862] [0000000000a74f2c] init_netconsole+0x180/0x20c
[ 88.316867] [0000000000426eb0] do_one_initcall+0x110/0x1a0
--
Meelis Roos (mroos@linux.ee)
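For readers less used to these reports: each trace entry such as cas_link_timer+0xa0/0x460 has the form symbol+offset/size, that is, the address lies 0xa0 bytes into a function that is 0x460 bytes long. The following is only a small userspace sketch for pulling such an entry apart; `struct trace_sym` and `parse_trace_sym()` are hypothetical names made up for this illustration and are not part of any kernel tool.

```c
#include <stdio.h>

/* One "symbol+0xoffset/0xsize" entry as printed in a kernel call trace. */
struct trace_sym {
	char name[64];
	unsigned long offset;	/* bytes from the start of the function */
	unsigned long size;	/* total size of the function in bytes */
};

/*
 * Split an entry like "cas_link_timer+0xa0/0x460" into its three parts.
 * Returns 1 on success, 0 if the text does not have the expected shape.
 */
static int parse_trace_sym(const char *text, struct trace_sym *out)
{
	/* %63[^+] reads the symbol name up to '+'; %lx accepts a 0x prefix. */
	return sscanf(text, "%63[^+]+%lx/%lx",
		      out->name, &out->offset, &out->size) == 3;
}
```

Applied to the report above, both the "trying to acquire" and the "already holding" lines resolve to the same place, offset 0xa0 in cas_link_timer. That is what one would expect if the same acquisition site runs more than once, for example in a loop inside a helper that the compiler inlined into cas_link_timer.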
* Re: cassini: possible recursive locking detected
From: Emil Goode @ 2014-05-08 12:53 UTC
To: Meelis Roos; +Cc: netdev
[-- Attachment #1: Type: text/plain, Size: 3463 bytes --]
Hello Meelis,
I think this warning happens because we acquire multiple locks
in a loop in cas_lock_tx() and I believe we should use nested
lock annotation here.
Perhaps you would like to try the attached patch?
It won't fix the deadlock that you mentioned though.
Best regards,
Emil Goode
[-- Attachment #2: 0001-net-cassini-use-nested-lock-annotation.patch --]
[-- Type: text/x-diff, Size: 936 bytes --]
From 1f3fcb0cb141e167c5389861eb0a6cb935f6a3d5 Mon Sep 17 00:00:00 2001
From: Emil Goode <emilgoode@gmail.com>
Date: Thu, 8 May 2014 12:49:24 +0200
Subject: [PATCH] net: cassini: use nested lock annotation
In the cas_lock_tx function we acquire multiple locks in a loop and
need to use nested lock annotation to prevent lockdep warnings.
Signed-off-by: Emil Goode <emilgoode@gmail.com>
---
drivers/net/ethernet/sun/cassini.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/sun/cassini.c b/drivers/net/ethernet/sun/cassini.c
index df8d383..b9ac20f 100644
--- a/drivers/net/ethernet/sun/cassini.c
+++ b/drivers/net/ethernet/sun/cassini.c
@@ -246,7 +246,7 @@ static inline void cas_lock_tx(struct cas *cp)
 	int i;
 
 	for (i = 0; i < N_TX_RINGS; i++)
-		spin_lock(&cp->tx_lock[i]);
+		spin_lock_nested(&cp->tx_lock[i], i);
 }
 
 static inline void cas_lock_all(struct cas *cp)
--
1.7.10.4
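Why the annotation in the patch above helps can be shown with a toy model. Lockdep tracks locks by class rather than by address, and all cp->tx_lock[i] entries share one class, so taking a second one while the first is held looks like recursion unless each acquisition carries a distinct subclass. The sketch below is a deliberately simplified userspace model and not lockdep's actual implementation: `lock_tracker`, `tracker_acquire()`, `count_warnings()`, `TX_LOCK_CLASS` and the ring count of 4 are all hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define N_TX_RINGS	4	/* assumed value, for illustration only */
#define MAX_HELD	8	/* capacity of this toy tracker */
#define TX_LOCK_CLASS	1	/* hypothetical id shared by all tx_lock[i] */

/* A held lock is identified by (class, subclass), not by its address. */
struct held_lock {
	int class_id;
	int subclass;
};

struct lock_tracker {
	struct held_lock held[MAX_HELD];
	int depth;
};

/*
 * Record an acquisition. Returns false, without recording it, when a lock
 * with the same (class, subclass) pair is already held. This is the
 * condition the toy model treats as "possible recursive locking".
 */
static bool tracker_acquire(struct lock_tracker *t, int class_id, int subclass)
{
	for (int i = 0; i < t->depth; i++) {
		if (t->held[i].class_id == class_id &&
		    t->held[i].subclass == subclass)
			return false;
	}
	assert(t->depth < MAX_HELD);
	t->held[t->depth].class_id = class_id;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return true;
}

/* Count the acquisitions a cas_lock_tx()-style loop would get flagged. */
static int count_warnings(bool use_nested)
{
	struct lock_tracker t = { .depth = 0 };
	int warnings = 0;

	for (int i = 0; i < N_TX_RINGS; i++) {
		/* Plain spin_lock() is modelled as subclass 0 every time. */
		int subclass = use_nested ? i : 0;

		if (!tracker_acquire(&t, TX_LOCK_CLASS, subclass))
			warnings++;
	}
	return warnings;
}
```

In this model `count_warnings(false)` flags every acquisition after the first, while `count_warnings(true)` flags none, because the loop index gives each lock its own subclass. The annotation only tells the checker that the nesting is intentional; it does not change how the locks behave, which fits the observation in this thread that the hang is unaffected. Note also that the kernel limits the subclass argument (MAX_LOCKDEP_SUBCLASSES, 8 as far as I know), so using the loop index only works while the number of rings stays below that limit.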
* Re: cassini: possible recursive locking detected
From: Meelis Roos @ 2014-05-08 20:00 UTC
To: Emil Goode; +Cc: netdev
> Hello Meelis,
>
> I think this warning happens because we acquire multiple locks
> in a loop in cas_lock_tx() and I believe we should use nested
> lock annotation here.
>
> Perhaps you would like to try the attached patch?
Yes, it silences the warning.
> It won't fix the deadlock that you mentioned though.
Yes, the hang still happens, following a
ERROR: System Hardware FATAL RESET from CPU0 CPU2
--
Meelis Roos (mroos@linux.ee)
* Re: cassini: possible recursive locking detected
From: Emil Goode @ 2014-05-08 22:38 UTC
To: Meelis Roos; +Cc: netdev
Hello,
On Thu, May 08, 2014 at 11:00:03PM +0300, Meelis Roos wrote:
> > Hello Meelis,
> >
> > I think this warning happens because we acquire multiple locks
> > in a loop in cas_lock_tx() and I believe we should use nested
> > lock annotation here.
> >
> > Perhaps you would like to try the attached patch?
>
> Yes, it silences the warning.
>
Ok thanks for testing, I'll send that patch.
> > It won't fix the deadlock that you mentioned though.
>
> Yes, the hang still happens, following a
> ERROR: System Hardware FATAL RESET from CPU0 CPU2
>
Are you able to get the full dmesg output?
I think this could be hard to solve since I don't have
the hardware, but could take a look.
* Re: cassini: possible recursive locking detected
From: Meelis Roos @ 2014-05-09 5:37 UTC
To: Emil Goode; +Cc: netdev
> > > It won't fix the deadlock that you mentioned though.
> >
> > Yes, the hang still happens, following a
> > ERROR: System Hardware FATAL RESET from CPU0 CPU2
> >
>
> Are you able to get the full dmesg output?
> I think this could be hard to solve since I don't have
> the hardware, but could take a look.
There is only a sparc64 firmware-specific FATAL RESET, including pages of
state dump for all 4 CPUs and their MMUs etc. Nothing from the Linux side.
So it's like recursive fault or something similar.
--
Meelis Roos (mroos@linux.ee)
* Re: cassini: possible recursive locking detected
From: Emil Goode @ 2014-05-09 9:06 UTC
To: Meelis Roos; +Cc: netdev
On Fri, May 09, 2014 at 08:37:30AM +0300, Meelis Roos wrote:
> > > > It won't fix the deadlock that you mentioned though.
> > >
> > > Yes, the hang still happens, following a
> > > ERROR: System Hardware FATAL RESET from CPU0 CPU2
> > >
> >
> > Are you able to get the full dmesg output?
> > I think this could be hard to solve since I don't have
> > the hardware, but could take a look.
>
> There is only a sparc64 firmware-specific FATAL RESET, including pages of
> state dump for all 4 CPUs and their MMUs etc. Nothing from the Linux side.
>
> So it's like recursive fault or something similar.
>
I searched the net a bit and found these old threads:
http://lists.freebsd.org/pipermail/freebsd-sparc64/2010-January/006935.html
"FreeBSD currently crashes on older models of V480 when attempting
to use an on-board NIC due to what appears to be a CPU bug which
needs to be worked around."
http://marc.info/?l=linux-sparc&m=122220796209509&w=2
"I noticed the cassini network driver for the builtin gigabit
network is unstable and brings the kernel down on a dualprocessor sparc
SunFire 480R with a Hardware FATAL RESET"
I think you should ask about this on the sparclinux mailing list.
http://vger.kernel.org/vger-lists.html#sparclinux
I would say it's very unlikely that the problem is related to that lockdep warning.
* Re: cassini: possible recursive locking detected
From: David Miller @ 2014-05-09 20:33 UTC
To: emilgoode; +Cc: mroos, netdev
From: Emil Goode <emilgoode@gmail.com>
Date: Fri, 9 May 2014 11:06:42 +0200
> I searched the net a bit and found these old threads:
>
> http://lists.freebsd.org/pipermail/freebsd-sparc64/2010-January/006935.html
>
> "FreeBSD currently crashes on older models of V480 when attempting
> to use an on-board NIC due to what appears to be a CPU bug which
> needs to be worked around."
I wish I still had a copy of that Schizo chip errata document referenced
at the end of that posting, it's not accessible any longer.
* Re: cassini: possible recursive locking detected
From: Bjørn Mork @ 2014-05-16 20:12 UTC
To: David Miller; +Cc: emilgoode, mroos, netdev
David Miller <davem@davemloft.net> writes:
> From: Emil Goode <emilgoode@gmail.com>
> Date: Fri, 9 May 2014 11:06:42 +0200
>
>> I searched the net a bit and found these old threads:
>>
>> http://lists.freebsd.org/pipermail/freebsd-sparc64/2010-January/006935.html
>>
>> "FreeBSD currently crashes on older models of V480 when attempting
>> to use an on-board NIC due to what appears to be a CPU bug which
>> needs to be worked around."
>
> I wish I still had a copy of that Schizo chip errata document referenced
> at the end of that posting, it's not accessible any longer.
The wayback machine saved it for you:
https://web.archive.org/web/20090701005954/http://www.sun.com/processors/manuals/External_Schizo_Errata.pdf
Bjørn
* Re: cassini: possible recursive locking detected
From: David Miller @ 2014-05-16 20:48 UTC
To: bjorn; +Cc: emilgoode, mroos, netdev
From: Bjørn Mork <bjorn@mork.no>
Date: Fri, 16 May 2014 22:12:15 +0200
> David Miller <davem@davemloft.net> writes:
>> From: Emil Goode <emilgoode@gmail.com>
>> Date: Fri, 9 May 2014 11:06:42 +0200
>>
>>> I searched the net a bit and found these old threads:
>>>
>>> http://lists.freebsd.org/pipermail/freebsd-sparc64/2010-January/006935.html
>>>
>>> "FreeBSD currently crashes on older models of V480 when attempting
>>> to use an on-board NIC due to what appears to be a CPU bug which
>>> needs to be worked around."
>>
>> I wish I still had a copy of that Schizo chip errata document referenced
>> at the end of that posting, it's not accessible any longer.
>
> The wayback machine saved it for you:
> https://web.archive.org/web/20090701005954/http://www.sun.com/processors/manuals/External_Schizo_Errata.pdf
Thanks, Meelis pointed out something similar to me in private correspondence :)