All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [powerpc] WARN at drivers/scsi/sg.c:2236 (sg_remove_sfp_usercontext)
@ 2024-03-29  9:08 ` Sachin Sant
  0 siblings, 0 replies; 27+ messages in thread
From: Sachin Sant @ 2024-03-29  9:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-scsi
  Cc: Alexander Wetzel, Martin K . Petersen, Bart Van Assche

Following WARN_ON_ONCE is triggered while running LTP tests
(specifically ioctl_sg01) on IBM Power booted with 6.9.0-rc1-next-20240328

[   64.230233] ------------[ cut here ]------------
[   64.230269] WARNING: CPU: 10 PID: 452 at drivers/scsi/sg.c:2236 sg_remove_sfp_usercontext+0x270/0x280 [sg]
[   64.230302] Modules linked in: rpadlpar_io rpaphp xsk_diag nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc binfmt_misc pseries_rng vmx_crypto xfs libcrc32c sd_mod sr_mod t10_pi crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi ibmveth scsi_transport_srp fuse
[   64.230420] CPU: 10 PID: 452 Comm: kworker/10:1 Kdump: loaded Not tainted 6.9.0-rc1-next-20240328 #2
[   64.230438] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_018) hv:phyp pSeries
[   64.230449] Workqueue: events sg_remove_sfp_usercontext [sg]
[   64.230468] NIP:  c008000015c34110 LR: c008000015c33ffc CTR: c0000000005393b0
[   64.230485] REGS: c00000000c1efae0 TRAP: 0700   Not tainted  (6.9.0-rc1-next-20240328)
[   64.230498] MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44000408  XER: 00000000
[   64.230535] CFAR: c008000015c3400c IRQMASK: 0
[   64.230535] GPR00: c008000015c33ffc c00000000c1efd80 c008000015c58900 c00000000ca8ae98
[   64.230535] GPR04: 00000000c0000000 0000000000000023 c000000007c2e000 0000000000000022
[   64.230535] GPR08: 000000038a130000 0000000000000002 0000000000000000 c008000015c38bc0
[   64.230535] GPR12: c0000000005393b0 c00000038fff3f00 c0000000001a2bac c000000007c7a9c0
[   64.230535] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   64.230535] GPR20: c00000038c3f3b00 c000000007c10030 c000000007c10000 c0000000901c0000
[   64.230535] GPR24: 0000000000000000 c00000000ca8ae00 c0000000045a5805 c000000007c11330
[   64.230535] GPR28: c00000038c3f3b00 c000000007c10080 c000000007c11328 c000000002fdee54
[   64.230671] NIP [c008000015c34110] sg_remove_sfp_usercontext+0x270/0x280 [sg]
[   64.230690] LR [c008000015c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 [sg]
[   64.230709] Call Trace:
[   64.230716] [c00000000c1efd80] [c008000015c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 [sg] (unreliable)
[   64.230740] [c00000000c1efe40] [c00000000019337c] process_one_work+0x20c/0x4f4
[   64.230767] [c00000000c1efef0] [c0000000001942fc] worker_thread+0x378/0x544
[   64.230787] [c00000000c1eff90] [c0000000001a2cdc] kthread+0x138/0x140
[   64.230801] [c00000000c1effe0] [c00000000000df98] start_kernel_thread+0x14/0x18
[   64.230819] Code: e8c98310 3d220000 e8698010 480044bd e8410018 7ec3b378 48004ac9 e8410018 38790098 81390098 2c090001 4182ff04 <0fe00000> 4bfffefc 000247e0 00000000
[   64.230857] ---[ end trace 0000000000000000 ]—

This WARN_ON was introduced with
commit 27f58c04a8f438078583041468ec60597841284d
    scsi: sg: Avoid sg device teardown race

Reverting the patch avoids the warning. The test case passes irrespective of the
patch is present of not. 

-- Sachin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [powerpc] WARN at drivers/scsi/sg.c:2236 (sg_remove_sfp_usercontext)
@ 2024-03-29  9:08 ` Sachin Sant
  0 siblings, 0 replies; 27+ messages in thread
From: Sachin Sant @ 2024-03-29  9:08 UTC (permalink / raw)
  To: linuxppc-dev, linux-scsi
  Cc: Alexander Wetzel, Bart Van Assche, Martin K . Petersen

Following WARN_ON_ONCE is triggered while running LTP tests
(specifically ioctl_sg01) on IBM Power booted with 6.9.0-rc1-next-20240328

[   64.230233] ------------[ cut here ]------------
[   64.230269] WARNING: CPU: 10 PID: 452 at drivers/scsi/sg.c:2236 sg_remove_sfp_usercontext+0x270/0x280 [sg]
[   64.230302] Modules linked in: rpadlpar_io rpaphp xsk_diag nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc binfmt_misc pseries_rng vmx_crypto xfs libcrc32c sd_mod sr_mod t10_pi crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi ibmveth scsi_transport_srp fuse
[   64.230420] CPU: 10 PID: 452 Comm: kworker/10:1 Kdump: loaded Not tainted 6.9.0-rc1-next-20240328 #2
[   64.230438] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_018) hv:phyp pSeries
[   64.230449] Workqueue: events sg_remove_sfp_usercontext [sg]
[   64.230468] NIP:  c008000015c34110 LR: c008000015c33ffc CTR: c0000000005393b0
[   64.230485] REGS: c00000000c1efae0 TRAP: 0700   Not tainted  (6.9.0-rc1-next-20240328)
[   64.230498] MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44000408  XER: 00000000
[   64.230535] CFAR: c008000015c3400c IRQMASK: 0
[   64.230535] GPR00: c008000015c33ffc c00000000c1efd80 c008000015c58900 c00000000ca8ae98
[   64.230535] GPR04: 00000000c0000000 0000000000000023 c000000007c2e000 0000000000000022
[   64.230535] GPR08: 000000038a130000 0000000000000002 0000000000000000 c008000015c38bc0
[   64.230535] GPR12: c0000000005393b0 c00000038fff3f00 c0000000001a2bac c000000007c7a9c0
[   64.230535] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   64.230535] GPR20: c00000038c3f3b00 c000000007c10030 c000000007c10000 c0000000901c0000
[   64.230535] GPR24: 0000000000000000 c00000000ca8ae00 c0000000045a5805 c000000007c11330
[   64.230535] GPR28: c00000038c3f3b00 c000000007c10080 c000000007c11328 c000000002fdee54
[   64.230671] NIP [c008000015c34110] sg_remove_sfp_usercontext+0x270/0x280 [sg]
[   64.230690] LR [c008000015c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 [sg]
[   64.230709] Call Trace:
[   64.230716] [c00000000c1efd80] [c008000015c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 [sg] (unreliable)
[   64.230740] [c00000000c1efe40] [c00000000019337c] process_one_work+0x20c/0x4f4
[   64.230767] [c00000000c1efef0] [c0000000001942fc] worker_thread+0x378/0x544
[   64.230787] [c00000000c1eff90] [c0000000001a2cdc] kthread+0x138/0x140
[   64.230801] [c00000000c1effe0] [c00000000000df98] start_kernel_thread+0x14/0x18
[   64.230819] Code: e8c98310 3d220000 e8698010 480044bd e8410018 7ec3b378 48004ac9 e8410018 38790098 81390098 2c090001 4182ff04 <0fe00000> 4bfffefc 000247e0 00000000
[   64.230857] ---[ end trace 0000000000000000 ]—

This WARN_ON was introduced with
commit 27f58c04a8f438078583041468ec60597841284d
    scsi: sg: Avoid sg device teardown race

Reverting the patch avoids the warning. The test case passes irrespective of the
patch is present of not. 

-- Sachin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [powerpc] WARN at drivers/scsi/sg.c:2236 (sg_remove_sfp_usercontext)
  2024-03-29  9:08 ` Sachin Sant
@ 2024-03-29 11:10   ` Alexander Wetzel
  -1 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-03-29 11:10 UTC (permalink / raw)
  To: sachinp; +Cc: Alexander, bvanassche, linux-scsi, linuxppc-dev, martin.petersen

> Following WARN_ON_ONCE is triggered while running LTP tests
> (specifically ioctl_sg01) on IBM Power booted with 6.9.0-rc1-next-20240328
>
> [   64.230233] ------------[ cut here ]------------
> [   64.230269] WARNING: CPU: 10 PID: 452 at drivers/scsi/sg.c:2236 sg_remove_sfp_usercontext+0x270/0x280 [sg]
> [   64.230302] Modules linked in: rpadlpar_io rpaphp xsk_diag nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc binfmt_misc pseries_rng vmx_crypto xfs libcrc32c sd_mod sr_mod t10_pi crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi ibmveth scsi_transport_srp fuse
> [   64.230420] CPU: 10 PID: 452 Comm: kworker/10:1 Kdump: loaded Not tainted 6.9.0-rc1-next-20240328 #2
> [   64.230438] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_018) hv:phyp pSeries
> [   64.230449] Workqueue: events sg_remove_sfp_usercontext [sg]
> [   64.230468] NIP:  c008000015c34110 LR: c008000015c33ffc CTR: c0000000005393b0
> [   64.230485] REGS: c00000000c1efae0 TRAP: 0700   Not tainted  (6.9.0-rc1-next-20240328)
> [   64.230498] MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44000408  XER: 00000000
> [   64.230535] CFAR: c008000015c3400c IRQMASK: 0
> [   64.230535] GPR00: c008000015c33ffc c00000000c1efd80 c008000015c58900 c00000000ca8ae98
> [   64.230535] GPR04: 00000000c0000000 0000000000000023 c000000007c2e000 0000000000000022
> [   64.230535] GPR08: 000000038a130000 0000000000000002 0000000000000000 c008000015c38bc0
> [   64.230535] GPR12: c0000000005393b0 c00000038fff3f00 c0000000001a2bac c000000007c7a9c0
> [   64.230535] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [   64.230535] GPR20: c00000038c3f3b00 c000000007c10030 c000000007c10000 c0000000901c0000
> [   64.230535] GPR24: 0000000000000000 c00000000ca8ae00 c0000000045a5805 c000000007c11330
> [   64.230535] GPR28: c00000038c3f3b00 c000000007c10080 c000000007c11328 c000000002fdee54
> [   64.230671] NIP [c008000015c34110] sg_remove_sfp_usercontext+0x270/0x280 [sg]
> [   64.230690] LR [c008000015c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 [sg]
> [   64.230709] Call Trace:
> [   64.230716] [c00000000c1efd80] [c008000015c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 [sg] (unreliable)
> [   64.230740] [c00000000c1efe40] [c00000000019337c] process_one_work+0x20c/0x4f4
> [   64.230767] [c00000000c1efef0] [c0000000001942fc] worker_thread+0x378/0x544
> [   64.230787] [c00000000c1eff90] [c0000000001a2cdc] kthread+0x138/0x140
> [   64.230801] [c00000000c1effe0] [c00000000000df98] start_kernel_thread+0x14/0x18
> [   64.230819] Code: e8c98310 3d220000 e8698010 480044bd e8410018 7ec3b378 48004ac9 e8410018 38790098 81390098 2c090001 4182ff04 <0fe00000> 4bfffefc 000247e0 00000000
> [   64.230857] ---[ end trace 0000000000000000 ]—
>
> This WARN_ON was introduced with
> commit 27f58c04a8f438078583041468ec60597841284d
>     scsi: sg: Avoid sg device teardown race
>
> Reverting the patch avoids the warning. The test case passes irrespective of the
> patch is present of not.
>

The new WARN_ON_ONCE is only an additional logic check. When it
triggers it also should trigger when you undo the rest of the change.

But when it triggers something with the driver logic must be off.
(Or my understanding of the intent of the code is worse than assumed:-)

Looking into the d_ref logic I see two additional problems not addressed
by the original patch when sg_add_sfp() fails:
 1) sg_open() is then also calling first scsi_device_put() and then
    sg_device_destroy() via kref_put(). That's the wrong order.

 2) When sg_add_sfp() fails we never call kref_get(&sdp->d_ref).
    Thus we shoud not call kref_get() here at all.

Thus your warning above could be triggered by an error within
sg_add_sfp(): In that case d_ref would already be zero when the code
gets to the warning.

Can you check the debug patch below and provide output?
When I'm right the warning should be gone and you should just get the
"Modification triggered" instead. When I'm wrong we should at least see,
how many references d_ref has left.

Alexander
---
 drivers/scsi/sg.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index ff6894ce5404..1c27d5f8f384 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -373,7 +373,8 @@ sg_open(struct inode *inode, struct file *filp)
 	scsi_autopm_put_device(sdp->device);
 sdp_put:
 	scsi_device_put(sdp->device);
-	goto sg_put;
+	pr_warn("%s: Modification triggered\n", __func__);
+	return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2234,8 @@ sg_remove_sfp_usercontext(struct work_struct *work)
 			"sg_remove_sfp: sfp=0x%p\n", sfp));
 	kfree(sfp);
 
-	WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1);
+	if(WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1))
+		printk(KERN_WARNING "d_ref=%u\n", kref_read(&sdp->d_ref));
 	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(device);
 	module_put(THIS_MODULE);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [powerpc] WARN at drivers/scsi/sg.c:2236 (sg_remove_sfp_usercontext)
@ 2024-03-29 11:10   ` Alexander Wetzel
  0 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-03-29 11:10 UTC (permalink / raw)
  To: sachinp; +Cc: linuxppc-dev, Alexander, bvanassche, linux-scsi, martin.petersen

> Following WARN_ON_ONCE is triggered while running LTP tests
> (specifically ioctl_sg01) on IBM Power booted with 6.9.0-rc1-next-20240328
>
> [   64.230233] ------------[ cut here ]------------
> [   64.230269] WARNING: CPU: 10 PID: 452 at drivers/scsi/sg.c:2236 sg_remove_sfp_usercontext+0x270/0x280 [sg]
> [   64.230302] Modules linked in: rpadlpar_io rpaphp xsk_diag nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc binfmt_misc pseries_rng vmx_crypto xfs libcrc32c sd_mod sr_mod t10_pi crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi ibmveth scsi_transport_srp fuse
> [   64.230420] CPU: 10 PID: 452 Comm: kworker/10:1 Kdump: loaded Not tainted 6.9.0-rc1-next-20240328 #2
> [   64.230438] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_018) hv:phyp pSeries
> [   64.230449] Workqueue: events sg_remove_sfp_usercontext [sg]
> [   64.230468] NIP:  c008000015c34110 LR: c008000015c33ffc CTR: c0000000005393b0
> [   64.230485] REGS: c00000000c1efae0 TRAP: 0700   Not tainted  (6.9.0-rc1-next-20240328)
> [   64.230498] MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44000408  XER: 00000000
> [   64.230535] CFAR: c008000015c3400c IRQMASK: 0
> [   64.230535] GPR00: c008000015c33ffc c00000000c1efd80 c008000015c58900 c00000000ca8ae98
> [   64.230535] GPR04: 00000000c0000000 0000000000000023 c000000007c2e000 0000000000000022
> [   64.230535] GPR08: 000000038a130000 0000000000000002 0000000000000000 c008000015c38bc0
> [   64.230535] GPR12: c0000000005393b0 c00000038fff3f00 c0000000001a2bac c000000007c7a9c0
> [   64.230535] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [   64.230535] GPR20: c00000038c3f3b00 c000000007c10030 c000000007c10000 c0000000901c0000
> [   64.230535] GPR24: 0000000000000000 c00000000ca8ae00 c0000000045a5805 c000000007c11330
> [   64.230535] GPR28: c00000038c3f3b00 c000000007c10080 c000000007c11328 c000000002fdee54
> [   64.230671] NIP [c008000015c34110] sg_remove_sfp_usercontext+0x270/0x280 [sg]
> [   64.230690] LR [c008000015c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 [sg]
> [   64.230709] Call Trace:
> [   64.230716] [c00000000c1efd80] [c008000015c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 [sg] (unreliable)
> [   64.230740] [c00000000c1efe40] [c00000000019337c] process_one_work+0x20c/0x4f4
> [   64.230767] [c00000000c1efef0] [c0000000001942fc] worker_thread+0x378/0x544
> [   64.230787] [c00000000c1eff90] [c0000000001a2cdc] kthread+0x138/0x140
> [   64.230801] [c00000000c1effe0] [c00000000000df98] start_kernel_thread+0x14/0x18
> [   64.230819] Code: e8c98310 3d220000 e8698010 480044bd e8410018 7ec3b378 48004ac9 e8410018 38790098 81390098 2c090001 4182ff04 <0fe00000> 4bfffefc 000247e0 00000000
> [   64.230857] ---[ end trace 0000000000000000 ]—
>
> This WARN_ON was introduced with
> commit 27f58c04a8f438078583041468ec60597841284d
>     scsi: sg: Avoid sg device teardown race
>
> Reverting the patch avoids the warning. The test case passes irrespective of the
> patch is present of not.
>

The new WARN_ON_ONCE is only an additional logic check. When it
triggers it also should trigger when you undo the rest of the change.

But when it triggers something with the driver logic must be off.
(Or my understanding of the intent of the code is worse than assumed:-)

Looking into the d_ref logic I see two additional problems not addressed
by the original patch when sg_add_sfp() fails:
 1) sg_open() is then also calling first scsi_device_put() and then
    sg_device_destroy() via kref_put(). That's the wrong order.

 2) When sg_add_sfp() fails we never call kref_get(&sdp->d_ref).
    Thus we shoud not call kref_get() here at all.

Thus your warning above could be triggered by an error within
sg_add_sfp(): In that case d_ref would already be zero when the code
gets to the warning.

Can you check the debug patch below and provide output?
When I'm right the warning should be gone and you should just get the
"Modification triggered" instead. When I'm wrong we should at least see,
how many references d_ref has left.

Alexander
---
 drivers/scsi/sg.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index ff6894ce5404..1c27d5f8f384 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -373,7 +373,8 @@ sg_open(struct inode *inode, struct file *filp)
 	scsi_autopm_put_device(sdp->device);
 sdp_put:
 	scsi_device_put(sdp->device);
-	goto sg_put;
+	pr_warn("%s: Modification triggered\n", __func__);
+	return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2234,8 @@ sg_remove_sfp_usercontext(struct work_struct *work)
 			"sg_remove_sfp: sfp=0x%p\n", sfp));
 	kfree(sfp);
 
-	WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1);
+	if(WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1))
+		printk(KERN_WARNING "d_ref=%u\n", kref_read(&sdp->d_ref));
 	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(device);
 	module_put(THIS_MODULE);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [powerpc] WARN at drivers/scsi/sg.c:2236 (sg_remove_sfp_usercontext)
  2024-03-29 11:10   ` Alexander Wetzel
  (?)
@ 2024-03-29 14:37   ` Sachin Sant
  2024-04-01  9:56       ` Alexander Wetzel
                       ` (2 more replies)
  -1 siblings, 3 replies; 27+ messages in thread
From: Sachin Sant @ 2024-03-29 14:37 UTC (permalink / raw)
  To: Alexander Wetzel
  Cc: linuxppc-dev, Bart Van Assche, linux-scsi, Martin K . Petersen



> Can you check the debug patch below and provide output?
> When I'm right the warning should be gone and you should just get the
> "Modification triggered" instead. When I'm wrong we should at least see,
> how many references d_ref has left.
> 

With the debug patch applied, code says d_ref value is 2

# ./ioctl_sg01 
tst_test.c:1741: TINFO: LTP version: 20210524-2511-g00b497c47
tst_test.c:1625: TINFO: Timeout per run is 1h 00m 30s
ioctl_sg01.c:83: TINFO: Found SCSI device /dev/sg0
[   36.016630] ------------[ cut here ]------------
[   36.016674] WARNING: CPU: 19 PID: 460 at drivers/scsi/sg.c:2238 sg_remove_sfp_usercontext+0x270/0x298 [sg]
[   36.016707] Modules linked in: rpadlpar_io rpaphp xsk_diag nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc binfmt_misc pseries_rng vmx_crypto xfs libcrc32c sd_mod sr_mod t10_pi crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi scsi_transport_srp ibmveth fuse
[   36.016834] CPU: 19 PID: 460 Comm: kworker/19:1 Kdump: loaded Not tainted 6.9.0-rc1-next-20240328-dirty #3
[   36.016849] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_018) hv:phyp pSeries
[   36.016868] Workqueue: events sg_remove_sfp_usercontext [sg]
[   36.016889] NIP:  c008000015cf4110 LR: c008000015cf4000 CTR: c0000000005393b0
[   36.016903] REGS: c00000009414fae0 TRAP: 0700   Not tainted  (6.9.0-rc1-next-20240328-dirty)
[   36.016921] MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44000448  XER: 00000000
[   36.016962] CFAR: c008000015cf400c IRQMASK: 0  [   36.016962] GPR00: c008000015cf4000 c00000009414fd80 c008000015d18900 0000000000000000  [   36.016962] GPR04: 00000000c0000000 0000000000000023 c000000008dee000 0000000000000022  [   36.016962] GPR08: 000000038a6d0000 0000000000000002 0000000000000000 c008000015cf8c10  [   36.016962] GPR12: c0000000005393b0 c00000038ffe8b00 c0000000001a2bac c000000008e4e980  [   36.016962] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000  [   36.016962] GPR20: c00000038c993b00 c000000008dea030 c000000008dea000 c0000000a2712000  [   36.016962] GPR24: 0000000000000000 c00000000c3bd380 c0000000045ab205 c000000008deb330  [   36.016962] GPR28: c00000038c993b00 c000000008dea080 c000000008deb328 c00000000c3bd418  [   36.017107] NIP [c008000015cf4110] sg_remove_sfp_usercontext+0x270/0x298 [sg]
[   36.017129] LR [c008000015cf4000] sg_remove_sfp_usercontext+0x160/0x298 [sg]
[   36.017144] Call Trace:
[   36.017148] [c00000009414fd80] [c008000015cf4000] sg_remove_sfp_usercontext+0x160/0x298 [sg] (unreliable)
[   36.017169] [c00000009414fe40] [c00000000019337c] process_one_work+0x20c/0x4f4
[   36.017189] [c00000009414fef0] [c0000000001942fc] worker_thread+0x378/0x544
[   36.017208] [c00000009414ff90] [c0000000001a2cdc] kthread+0x138/0x140
[   36.017225] [c00000009414ffe0] [c00000000000df98] start_kernel_thread+0x14/0x18
[   36.017241] Code: 3bf90098 e8c98310 3d220000 e8698010 48004509 e8410018 7ec3b378 48004b15 e8410018 81390098 2c090001 4182ff04 <0fe00000> 80990098 3d220000 78840020  [   36.017289] ---[ end trace 0000000000000000 ]---
[   36.017302] d_ref=2
ioctl_sg01.c:124: TPASS: Output buffer is empty, no data leaked
[   44.707319] d_ref=2

Summary:
passed   1
failed   0
broken   0
skipped  0
warnings 0


— Sachin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-03-29 14:37   ` Sachin Sant
@ 2024-04-01  9:56       ` Alexander Wetzel
  2024-04-01 10:03       ` Alexander Wetzel
  2024-04-01 19:10       ` Alexander Wetzel
  2 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-01  9:56 UTC (permalink / raw)
  To: dgilbert
  Cc: gregkh, sachinp, Alexander, bvanassche, linux-scsi, linuxppc-dev,
	martin.petersen, stable

commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
introduced an incorrect WARN_ON_ONCE() and missed a sequence where
sg_device_destroy() after scsi_device_put() when handling errors.

sg_device_destroy() is accessing the parent scsi_device request_queue which
will already be set to NULL when the preceding call to scsi_device_put()
removed the last reference to the parent scsi_device.

Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
access to the sg device -  and make sure sg_device_destroy() is not used
after scsi_device_put() in the error handling.

Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
---

The WARN_ON_ONCE() was kind of stupid to add:
We get add reference for each sg_open(). So opening a second session and
then closing either one will trigger the warning... Nothing to warn
about here.

Alexander
---
 drivers/scsi/sg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 386981c6976a..833c9277419b 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -372,8 +372,9 @@ sg_open(struct inode *inode, struct file *filp)
 error_out:
 	scsi_autopm_put_device(sdp->device);
 sdp_put:
+	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(sdp->device);
-	goto sg_put;
+	return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2234,6 @@ sg_remove_sfp_usercontext(struct work_struct *work)
 			"sg_remove_sfp: sfp=0x%p\n", sfp));
 	kfree(sfp);
 
-	WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1);
 	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(device);
 	module_put(THIS_MODULE);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-01  9:56       ` Alexander Wetzel
  0 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-01  9:56 UTC (permalink / raw)
  To: dgilbert
  Cc: bvanassche, linux-scsi, gregkh, stable, Alexander, sachinp,
	martin.petersen, linuxppc-dev

commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
introduced an incorrect WARN_ON_ONCE() and missed a sequence where
sg_device_destroy() after scsi_device_put() when handling errors.

sg_device_destroy() is accessing the parent scsi_device request_queue which
will already be set to NULL when the preceding call to scsi_device_put()
removed the last reference to the parent scsi_device.

Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
access to the sg device -  and make sure sg_device_destroy() is not used
after scsi_device_put() in the error handling.

Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
---

The WARN_ON_ONCE() was kind of stupid to add:
We get add reference for each sg_open(). So opening a second session and
then closing either one will trigger the warning... Nothing to warn
about here.

Alexander
---
 drivers/scsi/sg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 386981c6976a..833c9277419b 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -372,8 +372,9 @@ sg_open(struct inode *inode, struct file *filp)
 error_out:
 	scsi_autopm_put_device(sdp->device);
 sdp_put:
+	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(sdp->device);
-	goto sg_put;
+	return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2234,6 @@ sg_remove_sfp_usercontext(struct work_struct *work)
 			"sg_remove_sfp: sfp=0x%p\n", sfp));
 	kfree(sfp);
 
-	WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1);
 	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(device);
 	module_put(THIS_MODULE);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-03-29 14:37   ` Sachin Sant
@ 2024-04-01 10:03       ` Alexander Wetzel
  2024-04-01 10:03       ` Alexander Wetzel
  2024-04-01 19:10       ` Alexander Wetzel
  2 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-01 10:03 UTC (permalink / raw)
  To: dgilbert
  Cc: gregkh, sachinp, Alexander, bvanassche, linux-scsi, linuxppc-dev,
	martin.petersen, stable

commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
introduced an incorrect WARN_ON_ONCE() and missed a sequence where
sg_device_destroy() was used after scsi_device_put().

sg_device_destroy() is accessing the parent scsi_device request_queue which
will already be set to NULL when the preceding call to scsi_device_put()
removed the last reference to the parent scsi_device.

Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
access to the sg device - and make sure sg_device_destroy() is not used
after scsi_device_put() in the error handling.

Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
---

Changes compared to V1: fixed commit message

The WARN_ON_ONCE() was kind of stupid to add:
We get add reference for each sg_open(). So opening a second session and
then closing either one will trigger the warning... Nothing to warn
about here.

Alexander
---
 drivers/scsi/sg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 386981c6976a..833c9277419b 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -372,8 +372,9 @@ sg_open(struct inode *inode, struct file *filp)
 error_out:
 	scsi_autopm_put_device(sdp->device);
 sdp_put:
+	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(sdp->device);
-	goto sg_put;
+	return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2234,6 @@ sg_remove_sfp_usercontext(struct work_struct *work)
 			"sg_remove_sfp: sfp=0x%p\n", sfp));
 	kfree(sfp);
 
-	WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1);
 	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(device);
 	module_put(THIS_MODULE);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-01 10:03       ` Alexander Wetzel
  0 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-01 10:03 UTC (permalink / raw)
  To: dgilbert
  Cc: bvanassche, linux-scsi, gregkh, stable, Alexander, sachinp,
	martin.petersen, linuxppc-dev

commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
introduced an incorrect WARN_ON_ONCE() and missed a sequence where
sg_device_destroy() was used after scsi_device_put().

sg_device_destroy() is accessing the parent scsi_device request_queue which
will already be set to NULL when the preceding call to scsi_device_put()
removed the last reference to the parent scsi_device.

Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
access to the sg device - and make sure sg_device_destroy() is not used
after scsi_device_put() in the error handling.

Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
---

Changes compared to V1: fixed commit message

The WARN_ON_ONCE() was kind of stupid to add:
We get add reference for each sg_open(). So opening a second session and
then closing either one will trigger the warning... Nothing to warn
about here.

Alexander
---
 drivers/scsi/sg.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 386981c6976a..833c9277419b 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -372,8 +372,9 @@ sg_open(struct inode *inode, struct file *filp)
 error_out:
 	scsi_autopm_put_device(sdp->device);
 sdp_put:
+	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(sdp->device);
-	goto sg_put;
+	return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2234,6 @@ sg_remove_sfp_usercontext(struct work_struct *work)
 			"sg_remove_sfp: sfp=0x%p\n", sfp));
 	kfree(sfp);
 
-	WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1);
 	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(device);
 	module_put(THIS_MODULE);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-04-01 10:03       ` Alexander Wetzel
@ 2024-04-01 17:09         ` Bart Van Assche
  -1 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2024-04-01 17:09 UTC (permalink / raw)
  To: Alexander Wetzel, dgilbert
  Cc: gregkh, sachinp, linux-scsi, linuxppc-dev, martin.petersen,
	stable

On 4/1/24 03:03, Alexander Wetzel wrote:
> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().

Isn't that too negative? I think that the WARN_ON_ONCE() mentioned above
has proven to be useful: it helped to catch a bug.

> sg_device_destroy() is accessing the parent scsi_device request_queue which
> will already be set to NULL when the preceding call to scsi_device_put()
> removed the last reference to the parent scsi_device.
> 
> Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
> access to the sg device - and make sure sg_device_destroy() is not used
> after scsi_device_put() in the error handling.
> 
> Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
> Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")

The "goto sg_put" removed by this patch was introduced by commit
cc833acbee9d ("sg: O_EXCL and other lock handling"). Since the latter
commit is older than the one mentioned above, shouldn't the Fixes tag
refer to the latter commit?

> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 386981c6976a..833c9277419b 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -372,8 +372,9 @@ sg_open(struct inode *inode, struct file *filp)
>   error_out:
>   	scsi_autopm_put_device(sdp->device);
>   sdp_put:
> +	kref_put(&sdp->d_ref, sg_device_destroy);
>   	scsi_device_put(sdp->device);
> -	goto sg_put;
> +	return retval;
>   }

Please add a comment above "return retval" that explains which code will
drop the sg reference.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-01 17:09         ` Bart Van Assche
  0 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2024-04-01 17:09 UTC (permalink / raw)
  To: Alexander Wetzel, dgilbert
  Cc: martin.petersen, linux-scsi, gregkh, stable, sachinp,
	linuxppc-dev

On 4/1/24 03:03, Alexander Wetzel wrote:
> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().

Isn't that too negative? I think that the WARN_ON_ONCE() mentioned above
has proven to be useful: it helped to catch a bug.

> sg_device_destroy() is accessing the parent scsi_device request_queue which
> will already be set to NULL when the preceding call to scsi_device_put()
> removed the last reference to the parent scsi_device.
> 
> Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
> access to the sg device - and make sure sg_device_destroy() is not used
> after scsi_device_put() in the error handling.
> 
> Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
> Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")

The "goto sg_put" removed by this patch was introduced by commit
cc833acbee9d ("sg: O_EXCL and other lock handling"). Since the latter
commit is older than the one mentioned above, shouldn't the Fixes tag
refer to the latter commit?

> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 386981c6976a..833c9277419b 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -372,8 +372,9 @@ sg_open(struct inode *inode, struct file *filp)
>   error_out:
>   	scsi_autopm_put_device(sdp->device);
>   sdp_put:
> +	kref_put(&sdp->d_ref, sg_device_destroy);
>   	scsi_device_put(sdp->device);
> -	goto sg_put;
> +	return retval;
>   }

Please add a comment above "return retval" that explains which code will
drop the sg reference.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-04-01 17:09         ` Bart Van Assche
@ 2024-04-01 19:01           ` Alexander Wetzel
  -1 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-01 19:01 UTC (permalink / raw)
  To: Bart Van Assche, dgilbert
  Cc: gregkh, sachinp, linux-scsi, linuxppc-dev, martin.petersen,
	stable

On 01.04.24 19:09, Bart Van Assche wrote:
> On 4/1/24 03:03, Alexander Wetzel wrote:
>> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
>> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
>> sg_device_destroy() was used after scsi_device_put().
> 
> Isn't that too negative? I think that the WARN_ON_ONCE() mentioned above
> has proven to be useful: it helped to catch a bug.
> 

It helped to find the other issue. But the WARN_ON_ONCE() here is still 
plain wrong. Any only explained by my lack of understanding of the code 
and stupid assumptions I should have checked a bit more.

The warning always triggers when we have more than one user of the sg 
device (and then free one of them).
While trying to understand the issue I tripped over the other wrong 
sequence. Which is probably very seldom really executed...

That said I can of course update the wording when you have a better 
suggestion. But I only have some variations of "Ups... sorry. I thought 
that was a good idea. Turns out it's not"


>> sg_device_destroy() is accessing the parent scsi_device request_queue 
>> which
>> will already be set to NULL when the preceding call to scsi_device_put()
>> removed the last reference to the parent scsi_device.
>>
>> Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
>> access to the sg device - and make sure sg_device_destroy() is not used
>> after scsi_device_put() in the error handling.
>>
>> Link: 
>> https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
>> Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> 
> The "goto sg_put" removed by this patch was introduced by commit
> cc833acbee9d ("sg: O_EXCL and other lock handling"). Since the latter
> commit is older than the one mentioned above, shouldn't the Fixes tag
> refer to the latter commit?
> 

The order was not wrong till commit db59133e9279 ("scsi: sg: fix 
blktrace debugfs entries leakage"), the one my original patch tried to 
fix. Prior to that one sg_device_destroy() was not using the scsi device 
request_queue.

I guess I (or one maintainer) could add that commit here again, too...

My reasoning here is, that this patch here fixes what my first patch got 
wrong.
Which is already heading into the stable trees. And I would prefer to 
not have any kernel release with commit 27f58c04a8f4 ("scsi: sg: Avoid 
sg device teardown race")  without this fix, too.


>> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
>> index 386981c6976a..833c9277419b 100644
>> --- a/drivers/scsi/sg.c
>> +++ b/drivers/scsi/sg.c
>> @@ -372,8 +372,9 @@ sg_open(struct inode *inode, struct file *filp)
>>   error_out:
>>       scsi_autopm_put_device(sdp->device);
>>   sdp_put:
>> +    kref_put(&sdp->d_ref, sg_device_destroy);
>>       scsi_device_put(sdp->device);
>> -    goto sg_put;
>> +    return retval;
>>   }
> 
> Please add a comment above "return retval" that explains which code will
> drop the sg reference.
> 

Hm, don't get that. That kref_put() is the one dropping the reference.
The matching kref_get() is in sg_add_sfp(). Which is called a few lines 
prior to the code here (line 350).

The patch is literally only swapping the order of scsi_device_put() and 
kref_put().

Which *again* causes a use-after free. So I'll send out v3 immediately 
and if any of the thinks discussed here require a v4 we'll do that.

Alexander

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-01 19:01           ` Alexander Wetzel
  0 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-01 19:01 UTC (permalink / raw)
  To: Bart Van Assche, dgilbert
  Cc: martin.petersen, linux-scsi, gregkh, stable, sachinp,
	linuxppc-dev

On 01.04.24 19:09, Bart Van Assche wrote:
> On 4/1/24 03:03, Alexander Wetzel wrote:
>> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
>> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
>> sg_device_destroy() was used after scsi_device_put().
> 
> Isn't that too negative? I think that the WARN_ON_ONCE() mentioned above
> has proven to be useful: it helped to catch a bug.
> 

It helped to find the other issue. But the WARN_ON_ONCE() here is still 
plain wrong. Any only explained by my lack of understanding of the code 
and stupid assumptions I should have checked a bit more.

The warning always triggers when we have more than one user of the sg 
device (and then free one of them).
While trying to understand the issue I tripped over the other wrong 
sequence. Which is probably very seldom really executed...

That said I can of course update the wording when you have a better 
suggestion. But I only have some variations of "Ups... sorry. I thought 
that was a good idea. Turns out it's not"


>> sg_device_destroy() is accessing the parent scsi_device request_queue 
>> which
>> will already be set to NULL when the preceding call to scsi_device_put()
>> removed the last reference to the parent scsi_device.
>>
>> Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
>> access to the sg device - and make sure sg_device_destroy() is not used
>> after scsi_device_put() in the error handling.
>>
>> Link: 
>> https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
>> Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> 
> The "goto sg_put" removed by this patch was introduced by commit
> cc833acbee9d ("sg: O_EXCL and other lock handling"). Since the latter
> commit is older than the one mentioned above, shouldn't the Fixes tag
> refer to the latter commit?
> 

The order was not wrong till commit db59133e9279 ("scsi: sg: fix 
blktrace debugfs entries leakage"), the one my original patch tried to 
fix. Prior to that one sg_device_destroy() was not using the scsi device 
request_queue.

I guess I (or one maintainer) could add that commit here again, too...

My reasoning here is, that this patch here fixes what my first patch got 
wrong.
Which is already heading into the stable trees. And I would prefer to 
not have any kernel release with commit 27f58c04a8f4 ("scsi: sg: Avoid 
sg device teardown race")  without this fix, too.


>> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
>> index 386981c6976a..833c9277419b 100644
>> --- a/drivers/scsi/sg.c
>> +++ b/drivers/scsi/sg.c
>> @@ -372,8 +372,9 @@ sg_open(struct inode *inode, struct file *filp)
>>   error_out:
>>       scsi_autopm_put_device(sdp->device);
>>   sdp_put:
>> +    kref_put(&sdp->d_ref, sg_device_destroy);
>>       scsi_device_put(sdp->device);
>> -    goto sg_put;
>> +    return retval;
>>   }
> 
> Please add a comment above "return retval" that explains which code will
> drop the sg reference.
> 

Hm, don't get that. That kref_put() is the one dropping the reference.
The matching kref_get() is in sg_add_sfp(). Which is called a few lines 
prior to the code here (line 350).

The patch is literally only swapping the order of scsi_device_put() and 
kref_put().

Which *again* causes a use-after free. So I'll send out v3 immediately 
and if any of the thinks discussed here require a v4 we'll do that.

Alexander

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-03-29 14:37   ` Sachin Sant
@ 2024-04-01 19:10       ` Alexander Wetzel
  2024-04-01 10:03       ` Alexander Wetzel
  2024-04-01 19:10       ` Alexander Wetzel
  2 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-01 19:10 UTC (permalink / raw)
  To: dgilbert
  Cc: gregkh, sachinp, Alexander, bvanassche, linux-scsi, linuxppc-dev,
	martin.petersen, stable

commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
introduced an incorrect WARN_ON_ONCE() and missed a sequence where
sg_device_destroy() was used after scsi_device_put().

sg_device_destroy() is accessing the parent scsi_device request_queue which
will already be set to NULL when the preceding call to scsi_device_put()
removed the last reference to the parent scsi_device.

Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
access to the sg device - and make sure sg_device_destroy() is not used
after scsi_device_put() in the error handling.

Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
---

Changes compared to V1: fixed commit message
Changes compared to V2: Fix use-after free
---
 drivers/scsi/sg.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 386981c6976a..baf870a03ecf 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -285,6 +285,7 @@ sg_open(struct inode *inode, struct file *filp)
 	int dev = iminor(inode);
 	int flags = filp->f_flags;
 	struct request_queue *q;
+	struct scsi_device *device;
 	Sg_device *sdp;
 	Sg_fd *sfp;
 	int retval;
@@ -301,11 +302,12 @@ sg_open(struct inode *inode, struct file *filp)
 
 	/* This driver's module count bumped by fops_get in <linux/fs.h> */
 	/* Prevent the device driver from vanishing while we sleep */
-	retval = scsi_device_get(sdp->device);
+	device = sdp->device;
+	retval = scsi_device_get(device);
 	if (retval)
 		goto sg_put;
 
-	retval = scsi_autopm_get_device(sdp->device);
+	retval = scsi_autopm_get_device(device);
 	if (retval)
 		goto sdp_put;
 
@@ -313,7 +315,7 @@ sg_open(struct inode *inode, struct file *filp)
 	 * check if O_NONBLOCK. Permits SCSI commands to be issued
 	 * during error recovery. Tread carefully. */
 	if (!((flags & O_NONBLOCK) ||
-	      scsi_block_when_processing_errors(sdp->device))) {
+	      scsi_block_when_processing_errors(device))) {
 		retval = -ENXIO;
 		/* we are in error recovery for this device */
 		goto error_out;
@@ -344,7 +346,7 @@ sg_open(struct inode *inode, struct file *filp)
 
 	if (sdp->open_cnt < 1) {  /* no existing opens */
 		sdp->sgdebug = 0;
-		q = sdp->device->request_queue;
+		q = device->request_queue;
 		sdp->sg_tablesize = queue_max_segments(q);
 	}
 	sfp = sg_add_sfp(sdp);
@@ -370,10 +372,11 @@ sg_open(struct inode *inode, struct file *filp)
 error_mutex_locked:
 	mutex_unlock(&sdp->open_rel_lock);
 error_out:
-	scsi_autopm_put_device(sdp->device);
+	scsi_autopm_put_device(device);
 sdp_put:
-	scsi_device_put(sdp->device);
-	goto sg_put;
+	kref_put(&sdp->d_ref, sg_device_destroy);
+	scsi_device_put(device);
+	return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2236,6 @@ sg_remove_sfp_usercontext(struct work_struct *work)
 			"sg_remove_sfp: sfp=0x%p\n", sfp));
 	kfree(sfp);
 
-	WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1);
 	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(device);
 	module_put(THIS_MODULE);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-01 19:10       ` Alexander Wetzel
  0 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-01 19:10 UTC (permalink / raw)
  To: dgilbert
  Cc: bvanassche, linux-scsi, gregkh, stable, Alexander, sachinp,
	martin.petersen, linuxppc-dev

commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
introduced an incorrect WARN_ON_ONCE() and missed a sequence where
sg_device_destroy() was used after scsi_device_put().

sg_device_destroy() is accessing the parent scsi_device request_queue which
will already be set to NULL when the preceding call to scsi_device_put()
removed the last reference to the parent scsi_device.

Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
access to the sg device - and make sure sg_device_destroy() is not used
after scsi_device_put() in the error handling.

Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
---

Changes compared to V1: fixed commit message
Changes compared to V2: Fix use-after free
---
 drivers/scsi/sg.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 386981c6976a..baf870a03ecf 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -285,6 +285,7 @@ sg_open(struct inode *inode, struct file *filp)
 	int dev = iminor(inode);
 	int flags = filp->f_flags;
 	struct request_queue *q;
+	struct scsi_device *device;
 	Sg_device *sdp;
 	Sg_fd *sfp;
 	int retval;
@@ -301,11 +302,12 @@ sg_open(struct inode *inode, struct file *filp)
 
 	/* This driver's module count bumped by fops_get in <linux/fs.h> */
 	/* Prevent the device driver from vanishing while we sleep */
-	retval = scsi_device_get(sdp->device);
+	device = sdp->device;
+	retval = scsi_device_get(device);
 	if (retval)
 		goto sg_put;
 
-	retval = scsi_autopm_get_device(sdp->device);
+	retval = scsi_autopm_get_device(device);
 	if (retval)
 		goto sdp_put;
 
@@ -313,7 +315,7 @@ sg_open(struct inode *inode, struct file *filp)
 	 * check if O_NONBLOCK. Permits SCSI commands to be issued
 	 * during error recovery. Tread carefully. */
 	if (!((flags & O_NONBLOCK) ||
-	      scsi_block_when_processing_errors(sdp->device))) {
+	      scsi_block_when_processing_errors(device))) {
 		retval = -ENXIO;
 		/* we are in error recovery for this device */
 		goto error_out;
@@ -344,7 +346,7 @@ sg_open(struct inode *inode, struct file *filp)
 
 	if (sdp->open_cnt < 1) {  /* no existing opens */
 		sdp->sgdebug = 0;
-		q = sdp->device->request_queue;
+		q = device->request_queue;
 		sdp->sg_tablesize = queue_max_segments(q);
 	}
 	sfp = sg_add_sfp(sdp);
@@ -370,10 +372,11 @@ sg_open(struct inode *inode, struct file *filp)
 error_mutex_locked:
 	mutex_unlock(&sdp->open_rel_lock);
 error_out:
-	scsi_autopm_put_device(sdp->device);
+	scsi_autopm_put_device(device);
 sdp_put:
-	scsi_device_put(sdp->device);
-	goto sg_put;
+	kref_put(&sdp->d_ref, sg_device_destroy);
+	scsi_device_put(device);
+	return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2236,6 @@ sg_remove_sfp_usercontext(struct work_struct *work)
 			"sg_remove_sfp: sfp=0x%p\n", sfp));
 	kfree(sfp);
 
-	WARN_ON_ONCE(kref_read(&sdp->d_ref) != 1);
 	kref_put(&sdp->d_ref, sg_device_destroy);
 	scsi_device_put(device);
 	module_put(THIS_MODULE);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-04-01 19:10       ` Alexander Wetzel
@ 2024-04-02  6:01         ` Sachin Sant
  -1 siblings, 0 replies; 27+ messages in thread
From: Sachin Sant @ 2024-04-02  6:01 UTC (permalink / raw)
  To: Alexander Wetzel
  Cc: dgilbert, gregkh, Bart Van Assche, linux-scsi, linuxppc-dev,
	Martin K . Petersen, stable



> On 2 Apr 2024, at 12:40 AM, Alexander Wetzel <Alexander@wetzel-home.de> wrote:
> 
> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().
> 
> sg_device_destroy() is accessing the parent scsi_device request_queue which
> will already be set to NULL when the preceding call to scsi_device_put()
> removed the last reference to the parent scsi_device.
> 
> Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
> access to the sg device - and make sure sg_device_destroy() is not used
> after scsi_device_put() in the error handling.
> 
> Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
> Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
> ---

Thanks for the fix. I tested this patch and confirm it fixes the reported problem.

Tested-by: Sachin Sant <sachinp@linux.ibm.com>


— Sachin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-02  6:01         ` Sachin Sant
  0 siblings, 0 replies; 27+ messages in thread
From: Sachin Sant @ 2024-04-02  6:01 UTC (permalink / raw)
  To: Alexander Wetzel
  Cc: Bart Van Assche, Martin K . Petersen, gregkh, dgilbert, stable,
	linux-scsi, linuxppc-dev



> On 2 Apr 2024, at 12:40 AM, Alexander Wetzel <Alexander@wetzel-home.de> wrote:
> 
> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().
> 
> sg_device_destroy() is accessing the parent scsi_device request_queue which
> will already be set to NULL when the preceding call to scsi_device_put()
> removed the last reference to the parent scsi_device.
> 
> Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
> access to the sg device - and make sure sg_device_destroy() is not used
> after scsi_device_put() in the error handling.
> 
> Link: https://lore.kernel.org/all/5375B275-D137-4D5F-BE25-6AF8ACAE41EF@linux.ibm.com
> Fixes: 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
> ---

Thanks for the fix. I tested this patch and confirm it fixes the reported problem.

Tested-by: Sachin Sant <sachinp@linux.ibm.com>


— Sachin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-04-01 19:10       ` Alexander Wetzel
@ 2024-04-03 23:24         ` Bart Van Assche
  -1 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2024-04-03 23:24 UTC (permalink / raw)
  To: Alexander Wetzel, dgilbert
  Cc: gregkh, sachinp, linux-scsi, linuxppc-dev, martin.petersen,
	stable

On 4/1/24 12:10 PM, Alexander Wetzel wrote:
> @@ -301,11 +302,12 @@ sg_open(struct inode *inode, struct file *filp)
>   
>   	/* This driver's module count bumped by fops_get in <linux/fs.h> */
>   	/* Prevent the device driver from vanishing while we sleep */
> -	retval = scsi_device_get(sdp->device);
> +	device = sdp->device;
> +	retval = scsi_device_get(device);
>   	if (retval)
>   		goto sg_put;

Are all the sdp->device -> device changes essential? Isn't there a
preference to minimize patches that will end up in the stable trees?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-03 23:24         ` Bart Van Assche
  0 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2024-04-03 23:24 UTC (permalink / raw)
  To: Alexander Wetzel, dgilbert
  Cc: martin.petersen, linux-scsi, gregkh, stable, sachinp,
	linuxppc-dev

On 4/1/24 12:10 PM, Alexander Wetzel wrote:
> @@ -301,11 +302,12 @@ sg_open(struct inode *inode, struct file *filp)
>   
>   	/* This driver's module count bumped by fops_get in <linux/fs.h> */
>   	/* Prevent the device driver from vanishing while we sleep */
> -	retval = scsi_device_get(sdp->device);
> +	device = sdp->device;
> +	retval = scsi_device_get(device);
>   	if (retval)
>   		goto sg_put;

Are all the sdp->device -> device changes essential? Isn't there a
preference to minimize patches that will end up in the stable trees?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-04-03 23:24         ` Bart Van Assche
@ 2024-04-04  6:55           ` Alexander Wetzel
  -1 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-04  6:55 UTC (permalink / raw)
  To: Bart Van Assche, dgilbert
  Cc: gregkh, sachinp, linux-scsi, linuxppc-dev, martin.petersen,
	stable

On 04.04.24 01:24, Bart Van Assche wrote:
> On 4/1/24 12:10 PM, Alexander Wetzel wrote:
>> @@ -301,11 +302,12 @@ sg_open(struct inode *inode, struct file *filp)
>>       /* This driver's module count bumped by fops_get in <linux/fs.h> */
>>       /* Prevent the device driver from vanishing while we sleep */
>> -    retval = scsi_device_get(sdp->device);
>> +    device = sdp->device;
>> +    retval = scsi_device_get(device);
>>       if (retval)
>>           goto sg_put;
> 
> Are all the sdp->device -> device changes essential? Isn't there a
> preference to minimize patches that will end up in the stable trees?
> 

Only the very last change is essential:
-       scsi_device_put(sdp->device);
-       goto sg_put;
+       kref_put(&sdp->d_ref, sg_device_destroy);
+       scsi_device_put(device);
+       return retval;

Not using a (required) local variable and de-referencing it again and 
looks strange for anyone reading the code. While the additional lines in 
the patch are trivial to review...

Alexander

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-04  6:55           ` Alexander Wetzel
  0 siblings, 0 replies; 27+ messages in thread
From: Alexander Wetzel @ 2024-04-04  6:55 UTC (permalink / raw)
  To: Bart Van Assche, dgilbert
  Cc: martin.petersen, linux-scsi, gregkh, stable, sachinp,
	linuxppc-dev

On 04.04.24 01:24, Bart Van Assche wrote:
> On 4/1/24 12:10 PM, Alexander Wetzel wrote:
>> @@ -301,11 +302,12 @@ sg_open(struct inode *inode, struct file *filp)
>>       /* This driver's module count bumped by fops_get in <linux/fs.h> */
>>       /* Prevent the device driver from vanishing while we sleep */
>> -    retval = scsi_device_get(sdp->device);
>> +    device = sdp->device;
>> +    retval = scsi_device_get(device);
>>       if (retval)
>>           goto sg_put;
> 
> Are all the sdp->device -> device changes essential? Isn't there a
> preference to minimize patches that will end up in the stable trees?
> 

Only the very last change is essential:
-       scsi_device_put(sdp->device);
-       goto sg_put;
+       kref_put(&sdp->d_ref, sg_device_destroy);
+       scsi_device_put(device);
+       return retval;

Not using a (required) local variable and de-referencing it again and 
looks strange for anyone reading the code. While the additional lines in 
the patch are trivial to review...

Alexander

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-04-01 19:10       ` Alexander Wetzel
@ 2024-04-04 16:34         ` Bart Van Assche
  -1 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2024-04-04 16:34 UTC (permalink / raw)
  To: Alexander Wetzel, dgilbert
  Cc: gregkh, sachinp, linux-scsi, linuxppc-dev, martin.petersen,
	stable

On 4/1/24 12:10, Alexander Wetzel wrote:
> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().
> 
> sg_device_destroy() is accessing the parent scsi_device request_queue which
> will already be set to NULL when the preceding call to scsi_device_put()
> removed the last reference to the parent scsi_device.
> 
> Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
> access to the sg device - and make sure sg_device_destroy() is not used
> after scsi_device_put() in the error handling.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-04 16:34         ` Bart Van Assche
  0 siblings, 0 replies; 27+ messages in thread
From: Bart Van Assche @ 2024-04-04 16:34 UTC (permalink / raw)
  To: Alexander Wetzel, dgilbert
  Cc: martin.petersen, linux-scsi, gregkh, stable, sachinp,
	linuxppc-dev

On 4/1/24 12:10, Alexander Wetzel wrote:
> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().
> 
> sg_device_destroy() is accessing the parent scsi_device request_queue which
> will already be set to NULL when the preceding call to scsi_device_put()
> removed the last reference to the parent scsi_device.
> 
> Drop the incorrect WARN_ON_ONCE() - allowing more than one concurrent
> access to the sg device - and make sure sg_device_destroy() is not used
> after scsi_device_put() in the error handling.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-04-01 19:10       ` Alexander Wetzel
@ 2024-04-04 22:22         ` Martin K. Petersen
  -1 siblings, 0 replies; 27+ messages in thread
From: Martin K. Petersen @ 2024-04-04 22:22 UTC (permalink / raw)
  To: Alexander Wetzel
  Cc: dgilbert, gregkh, sachinp, bvanassche, linux-scsi, linuxppc-dev,
	martin.petersen, stable


Alexander,

> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().

Applied to 6.9/scsi-fixes, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-04 22:22         ` Martin K. Petersen
  0 siblings, 0 replies; 27+ messages in thread
From: Martin K. Petersen @ 2024-04-04 22:22 UTC (permalink / raw)
  To: Alexander Wetzel
  Cc: bvanassche, linux-scsi, gregkh, stable, martin.petersen, sachinp,
	dgilbert, linuxppc-dev


Alexander,

> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().

Applied to 6.9/scsi-fixes, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
  2024-04-01 19:10       ` Alexander Wetzel
@ 2024-04-06  1:58         ` Martin K. Petersen
  -1 siblings, 0 replies; 27+ messages in thread
From: Martin K. Petersen @ 2024-04-06  1:58 UTC (permalink / raw)
  To: dgilbert, Alexander Wetzel
  Cc: Martin K . Petersen, gregkh, sachinp, bvanassche, linux-scsi,
	linuxppc-dev, stable

On Mon, 01 Apr 2024 21:10:38 +0200, Alexander Wetzel wrote:

> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().
> 
> sg_device_destroy() is accessing the parent scsi_device request_queue which
> will already be set to NULL when the preceding call to scsi_device_put()
> removed the last reference to the parent scsi_device.
> 
> [...]

Applied to 6.9/scsi-fixes, thanks!

[1/1] scsi: sg: Avoid race in error handling & drop bogus warn
      https://git.kernel.org/mkp/scsi/c/d4e655c49f47

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] scsi: sg: Avoid race in error handling & drop bogus warn
@ 2024-04-06  1:58         ` Martin K. Petersen
  0 siblings, 0 replies; 27+ messages in thread
From: Martin K. Petersen @ 2024-04-06  1:58 UTC (permalink / raw)
  To: dgilbert, Alexander Wetzel
  Cc: Martin K . Petersen, linux-scsi, gregkh, stable, sachinp,
	linuxppc-dev, bvanassche

On Mon, 01 Apr 2024 21:10:38 +0200, Alexander Wetzel wrote:

> commit 27f58c04a8f4 ("scsi: sg: Avoid sg device teardown race")
> introduced an incorrect WARN_ON_ONCE() and missed a sequence where
> sg_device_destroy() was used after scsi_device_put().
> 
> sg_device_destroy() is accessing the parent scsi_device request_queue which
> will already be set to NULL when the preceding call to scsi_device_put()
> removed the last reference to the parent scsi_device.
> 
> [...]

Applied to 6.9/scsi-fixes, thanks!

[1/1] scsi: sg: Avoid race in error handling & drop bogus warn
      https://git.kernel.org/mkp/scsi/c/d4e655c49f47

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-04-06  2:00 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-29  9:08 [powerpc] WARN at drivers/scsi/sg.c:2236 (sg_remove_sfp_usercontext) Sachin Sant
2024-03-29  9:08 ` Sachin Sant
2024-03-29 11:10 ` Alexander Wetzel
2024-03-29 11:10   ` Alexander Wetzel
2024-03-29 14:37   ` Sachin Sant
2024-04-01  9:56     ` [PATCH] scsi: sg: Avoid race in error handling & drop bogus warn Alexander Wetzel
2024-04-01  9:56       ` Alexander Wetzel
2024-04-01 10:03     ` [PATCH v2] " Alexander Wetzel
2024-04-01 10:03       ` Alexander Wetzel
2024-04-01 17:09       ` Bart Van Assche
2024-04-01 17:09         ` Bart Van Assche
2024-04-01 19:01         ` Alexander Wetzel
2024-04-01 19:01           ` Alexander Wetzel
2024-04-01 19:10     ` [PATCH v3] " Alexander Wetzel
2024-04-01 19:10       ` Alexander Wetzel
2024-04-02  6:01       ` Sachin Sant
2024-04-02  6:01         ` Sachin Sant
2024-04-03 23:24       ` Bart Van Assche
2024-04-03 23:24         ` Bart Van Assche
2024-04-04  6:55         ` Alexander Wetzel
2024-04-04  6:55           ` Alexander Wetzel
2024-04-04 16:34       ` Bart Van Assche
2024-04-04 16:34         ` Bart Van Assche
2024-04-04 22:22       ` Martin K. Petersen
2024-04-04 22:22         ` Martin K. Petersen
2024-04-06  1:58       ` Martin K. Petersen
2024-04-06  1:58         ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.