* BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
@ 2021-05-30  7:33 Michal Kalderon
  2021-06-08 16:50 ` Christoph Hellwig
  2021-06-08 17:43 ` Sagi Grimberg
  0 siblings, 2 replies; 8+ messages in thread

From: Michal Kalderon @ 2021-05-30  7:33 UTC (permalink / raw)
To: Christoph Hellwig, sagi@grimberg.me
Cc: linux-nvme@lists.infradead.org, Shai Malin, Ariel Elior

Hi Christoph, Sagi,

We're testing some device error-recovery scenarios and hit the BUG below.
In the error scenario, nvmet_rdma_queue_response receives an error from the
device when trying to post a WR. This leads to nvmet_rdma_release_rsp being
called from softirq context, eventually reaching blk_mq_delay_run_hw_queue,
which tries to schedule while in softirq (full stack below).

Could you please advise what the correct solution should be in this case?

thanks,
Michal

[ 8790.082863] nvmet_rdma: post_recv cmd failed
[ 8790.083484] nvmet_rdma: sending cmd response failed
[ 8790.084131] ------------[ cut here ]------------
[ 8790.084140] WARNING: CPU: 7 PID: 46 at block/blk-mq.c:1422 __blk_mq_run_hw_queue+0xb7/0x100
[ 8790.084619] Modules linked in: null_blk nvmet_rdma nvmet nvme_rdma nvme_fabrics nvme_core netconsole qedr(OE) qede(OE) qed(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xt_CHECKSUM nft_chain_nat xt_MASQUERADE nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nft_counter nft_compat tun bridge stp llc nf_tables nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_umad rpcrdma rdma_ucm ib_iser rdma_cm iw_cm intel_rapl_msr intel_rapl_common ib_cm sb_edac libiscsi scsi_transport_iscsi kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sunrpc rapl ib_uverbs ib_core cirrus drm_kms_helper drm virtio_balloon i2c_piix4 pcspkr crc32c_intel virtio_net serio_raw net_failover failover floppy crc8 ata_generic pata_acpi qemu_fw_cfg [last unloaded: qedr]
[ 8790.084748] CPU: 7 PID: 46 Comm: ksoftirqd/7 Tainted: G OE 5.8.10 #1
[ 8790.084749] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
[ 8790.084752] RIP: 0010:__blk_mq_run_hw_queue+0xb7/0x100
[ 8790.084753] Code: 00 48 89 ef e8 ea 34 c8 ff 48 89 df 41 89 c4 e8 1f 7f 00 00 f6 83 a8 00 00 00 20 74 b1 41 f7 c4 fe ff ff ff 74 b7 0f 0b eb b3 <0f> 0b eb 86 48 83 bf 98 00 00 00 00 48 c7 c0 df 81 3f 82 48 c7 c2
[ 8790.084754] RSP: 0018:ffffc9000020ba60 EFLAGS: 00010206
[ 8790.084755] RAX: 0000000000000100 RBX: ffff88809fe8c400 RCX: 00000000ffffffff
[ 8790.084756] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88809fe8c400
[ 8790.084756] RBP: ffff888137b81a50 R08: ffffffffffffffff R09: 0000000000000020
[ 8790.084757] R10: 0000000000000001 R11: ffff8881365d4968 R12: 0000000000000000
[ 8790.084758] R13: ffff888137b81a40 R14: ffff88811e2b9e80 R15: ffff8880b3d964f0
[ 8790.084759] FS:  0000000000000000(0000) GS:ffff88813bbc0000(0000) knlGS:0000000000000000
[ 8790.084759] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8790.084760] CR2: 000055ca53900da8 CR3: 000000012b83e006 CR4: 0000000000360ee0
[ 8790.084763] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8790.084763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8790.084764] Call Trace:
[ 8790.084767]  __blk_mq_delay_run_hw_queue+0x140/0x160
[ 8790.084768]  blk_mq_get_tag+0x1d1/0x270
[ 8790.084771]  ? finish_wait+0x80/0x80
[ 8790.084773]  __blk_mq_alloc_request+0xb1/0x100
[ 8790.084774]  blk_mq_make_request+0x144/0x5d0
[ 8790.084778]  generic_make_request+0x2db/0x340
[ 8790.084779]  ? bvec_alloc+0x82/0xe0
[ 8790.084781]  submit_bio+0x43/0x160
[ 8790.084781]  ? bio_add_page+0x39/0x90
[ 8790.084794]  nvmet_bdev_execute_rw+0x28c/0x360 [nvmet]
[ 8790.084800]  nvmet_rdma_execute_command+0x72/0x110 [nvmet_rdma]
[ 8790.084802]  nvmet_rdma_release_rsp+0xc1/0x1e0 [nvmet_rdma]
[ 8790.084804]  nvmet_rdma_queue_response.cold.63+0x14/0x19 [nvmet_rdma]
[ 8790.084806]  nvmet_req_complete+0x11/0x40 [nvmet]
[ 8790.084809]  nvmet_bio_done+0x27/0x100 [nvmet]
[ 8790.084811]  blk_update_request+0x23e/0x3b0
[ 8790.084812]  blk_mq_end_request+0x1a/0x120
[ 8790.084814]  blk_done_softirq+0xa1/0xd0
[ 8790.084818]  __do_softirq+0xe4/0x2f8
[ 8790.084821]  ? sort_range+0x20/0x20
[ 8790.084824]  run_ksoftirqd+0x26/0x40
[ 8790.084825]  smpboot_thread_fn+0xc5/0x160
[ 8790.084827]  kthread+0x116/0x130
[ 8790.084828]  ? kthread_park+0x80/0x80
[ 8790.084832]  ret_from_fork+0x22/0x30
[ 8790.084833] ---[ end trace 16ec813ee3f82b56 ]---
[ 8790.085314] BUG: scheduling while atomic: ksoftirqd/7/46/0x00000100

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
  2021-05-30  7:33 BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request Michal Kalderon
@ 2021-06-08 16:50 ` Christoph Hellwig
  2021-06-08 17:43 ` Sagi Grimberg
  1 sibling, 0 replies; 8+ messages in thread

From: Christoph Hellwig @ 2021-06-08 16:50 UTC (permalink / raw)
To: Michal Kalderon
Cc: Christoph Hellwig, sagi@grimberg.me, linux-nvme@lists.infradead.org, Shai Malin, Ariel Elior

What kernel version is this?

On Sun, May 30, 2021 at 07:33:18AM +0000, Michal Kalderon wrote:
>
> this leads to nvmet_rdma_release_rsp being called from softirq eventually
> reaching the blk_mq_delay_run_hw_queue which tries to schedule in softirq. (full stack below)
>
> could you please advise what the correct solution should be in this case ?
>
> thanks,
> Michal
>
> [full dmesg trace from the original report trimmed]
---end quoted text---
* Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
  2021-05-30  7:33 BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request Michal Kalderon
  2021-06-08 16:50 ` Christoph Hellwig
@ 2021-06-08 17:43 ` Sagi Grimberg
  2021-06-08 18:41 ` Keith Busch
  1 sibling, 1 reply; 8+ messages in thread

From: Sagi Grimberg @ 2021-06-08 17:43 UTC (permalink / raw)
To: Michal Kalderon, Christoph Hellwig
Cc: linux-nvme@lists.infradead.org, Shai Malin, Ariel Elior

> Hi Christoph, Sagi,
>
> We're testing some device error recovery scenarios and hit the following BUG, stack trace below.
> In the error scenario, nvmet_rdma_queue_response receives an error from the device when trying to post a wr,
> this leads to nvmet_rdma_release_rsp being called from softirq eventually
> reaching the blk_mq_delay_run_hw_queue which tries to schedule in softirq. (full stack below)
>
> could you please advise what the correct solution should be in this case ?

Hey Michal,

I agree this can happen and requires correction. Does the below resolve
the issue?
--
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 7d607f435e36..6d2eea322779 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -16,6 +16,7 @@
 #include <linux/wait.h>
 #include <linux/inet.h>
 #include <asm/unaligned.h>
+#include <linux/async.h>

 #include <rdma/ib_verbs.h>
 #include <rdma/rdma_cm.h>
@@ -712,6 +713,12 @@ static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
 	}
 }

+static void nvmet_rdma_async_release_rsp(void *data, async_cookie_t cookie)
+{
+	struct nvmet_rdma_rsp *rsp = data;
+	nvmet_rdma_release_rsp(rsp);
+}
+
 static void nvmet_rdma_queue_response(struct nvmet_req *req)
 {
 	struct nvmet_rdma_rsp *rsp =
@@ -745,7 +752,12 @@ static void nvmet_rdma_queue_response(struct nvmet_req *req)

 	if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
 		pr_err("sending cmd response failed\n");
-		nvmet_rdma_release_rsp(rsp);
+		/*
+		 * We might be in atomic context, hence release
+		 * the rsp in async context in case we need to
+		 * process the wr_wait_list.
+		 */
+		async_schedule(nvmet_rdma_async_release_rsp, rsp);
 	}
 }
--

> thanks,
> Michal
>
> [full dmesg trace from the original report trimmed]
* Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
  2021-06-08 17:43 ` Sagi Grimberg
@ 2021-06-08 18:41 ` Keith Busch
  2021-06-09  0:03 ` Sagi Grimberg
  0 siblings, 1 reply; 8+ messages in thread

From: Keith Busch @ 2021-06-08 18:41 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Michal Kalderon, Christoph Hellwig, linux-nvme@lists.infradead.org, Shai Malin, Ariel Elior

On Tue, Jun 08, 2021 at 10:43:45AM -0700, Sagi Grimberg wrote:
> [earlier hunks of the async_schedule() patch trimmed]
>
>  	if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
>  		pr_err("sending cmd response failed\n");
> -		nvmet_rdma_release_rsp(rsp);
> +		/*
> +		 * We might be in atomic context, hence release
> +		 * the rsp in async context in case we need to
> +		 * process the wr_wait_list.
> +		 */
> +		async_schedule(nvmet_rdma_async_release_rsp, rsp);
>  	}
>  }

Just FYI, async_schedule() has conditions where it may execute your
callback synchronously. Your suggestion is probably fine for testing,
but it sounds like you require something that can guarantee a non-atomic
context for nvmet_rdma_release_rsp().
* Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
  2021-06-08 18:41 ` Keith Busch
@ 2021-06-09  0:03 ` Sagi Grimberg
  2021-06-14 14:44 ` [EXT] " Michal Kalderon
  0 siblings, 1 reply; 8+ messages in thread

From: Sagi Grimberg @ 2021-06-09  0:03 UTC (permalink / raw)
To: Keith Busch
Cc: Michal Kalderon, Christoph Hellwig, linux-nvme@lists.infradead.org, Shai Malin, Ariel Elior

>> [async_schedule() patch trimmed]
>
> Just FYI, async_schedule() has conditions where it may execute your
> callback synchronously. Your suggestion is probably fine for testing,
> but it sounds like you require something that can guarantee a non-atomic
> context for nvmet_rdma_release_rsp().

OK, it seems that the issue is that we are submitting I/O in atomic
context. This should be more appropriate...
--
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 7d607f435e36..16f2f5a84ae7 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -102,6 +102,7 @@ struct nvmet_rdma_queue {

 	struct work_struct	release_work;
 	struct list_head	rsp_wait_list;
+	struct work_struct	wr_wait_work;
 	struct list_head	rsp_wr_wait_list;
 	spinlock_t		rsp_wr_wait_lock;

@@ -517,8 +518,10 @@ static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev,
 	return ret;
 }

-static void nvmet_rdma_process_wr_wait_list(struct nvmet_rdma_queue *queue)
+static void nvmet_rdma_process_wr_wait_list(struct work_struct *w)
 {
+	struct nvmet_rdma_queue *queue =
+		container_of(w, struct nvmet_rdma_queue, wr_wait_work);
 	spin_lock(&queue->rsp_wr_wait_lock);
 	while (!list_empty(&queue->rsp_wr_wait_list)) {
 		struct nvmet_rdma_rsp *rsp;
@@ -677,7 +680,7 @@ static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
 	nvmet_req_free_sgls(&rsp->req);

 	if (unlikely(!list_empty_careful(&queue->rsp_wr_wait_list)))
-		nvmet_rdma_process_wr_wait_list(queue);
+		schedule_work(&queue->wr_wait_work);

 	nvmet_rdma_put_rsp(rsp);
 }
@@ -1446,6 +1449,7 @@ nvmet_rdma_alloc_queue(struct nvmet_rdma_device *ndev,
 	 * inside a CM callback would trigger a deadlock. (great API design..)
 	 */
 	INIT_WORK(&queue->release_work, nvmet_rdma_release_queue_work);
+	INIT_WORK(&queue->wr_wait_work, nvmet_rdma_process_wr_wait_list);
 	queue->dev = ndev;
 	queue->cm_id = cm_id;
 	queue->port = port->nport;
--
* RE: [EXT] Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
  2021-06-09  0:03 ` Sagi Grimberg
@ 2021-06-14 14:44 ` Michal Kalderon
  2021-06-14 16:44 ` Sagi Grimberg
  0 siblings, 1 reply; 8+ messages in thread

From: Michal Kalderon @ 2021-06-14 14:44 UTC (permalink / raw)
To: Sagi Grimberg, Keith Busch
Cc: Christoph Hellwig, linux-nvme@lists.infradead.org, Shai Malin, Ariel Elior

> From: Sagi Grimberg <sagi@grimberg.me>
> Sent: Wednesday, June 9, 2021 3:04 AM
>
> > Just FYI, async_schedule() has conditions where it may execute your
> > callback synchronously. Your suggestion is probably fine for testing,
> > but it sounds like you require something that can guarantee a non-atomic
> > context for nvmet_rdma_release_rsp().
>
> OK, it seems that the issue is that we are submitting I/O in atomic
> context. This should be more appropriate...

Thanks Sagi, this seems to work. I'm still hitting some other issues where
in some cases reconnect fails, but I'm collecting more info.

> [workqueue patch quoted in full in the previous message trimmed]

Thanks,
Tested-by: Michal Kalderon <michal.kalderon@marvell.com>
* Re: [EXT] Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
  2021-06-14 14:44 ` [EXT] " Michal Kalderon
@ 2021-06-14 16:44 ` Sagi Grimberg
  2021-06-14 18:14 ` Michal Kalderon
  0 siblings, 1 reply; 8+ messages in thread

From: Sagi Grimberg @ 2021-06-14 16:44 UTC (permalink / raw)
To: Michal Kalderon, Keith Busch
Cc: Christoph Hellwig, linux-nvme@lists.infradead.org, Shai Malin, Ariel Elior

>> OK, it seems that the issue is that we are submitting I/O in atomic
>> context. This should be more appropriate...
>
> Thanks Sagi, this seems to work. I'm still hitting some other issues
> where in some cases reconnect fails, but I'm collecting more info.

Same type of failures?
* RE: [EXT] Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
  2021-06-14 16:44 ` Sagi Grimberg
@ 2021-06-14 18:14 ` Michal Kalderon
  0 siblings, 0 replies; 8+ messages in thread

From: Michal Kalderon @ 2021-06-14 18:14 UTC (permalink / raw)
To: Sagi Grimberg, Keith Busch
Cc: Christoph Hellwig, linux-nvme@lists.infradead.org, Shai Malin, Ariel Elior

> From: Sagi Grimberg <sagi@grimberg.me>
> Sent: Monday, June 14, 2021 7:45 PM
>
> > Thanks Sagi, this seems to work. I'm still hitting some other issues
> > where in some cases reconnect fails, but I'm collecting more info.
>
> Same type of failures?

No, something else. After recovery completes, I'm getting the following
errors on the initiator side, without any messages on the target:

[14678.618025] nvme nvme2: Connect rejected: status -104 (reset by remote host).
[14678.619350] nvme nvme2: rdma connection establishment failed (-104)
[14678.622274] nvme nvme2: Failed reconnect attempt 6
[14678.623623] nvme nvme2: Reconnecting in 10 seconds...
[14751.304247] nvme nvme2: I/O 0 QID 0 timeout
[14751.305749] nvme nvme2: Connect command failed, error wo/DNR bit: 881
[14751.307240] nvme nvme2: failed to connect queue: 0 ret=881
[14751.310497] nvme nvme2: Failed reconnect attempt 7
[14751.312174] nvme nvme2: Reconnecting in 10 seconds...
[14825.032645] nvme nvme2: I/O 1 QID 0 timeout
end of thread, other threads: [~2021-06-14 18:15 UTC | newest]

Thread overview: 8+ messages
2021-05-30  7:33 BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request Michal Kalderon
2021-06-08 16:50 ` Christoph Hellwig
2021-06-08 17:43 ` Sagi Grimberg
2021-06-08 18:41   ` Keith Busch
2021-06-09  0:03     ` Sagi Grimberg
2021-06-14 14:44       ` [EXT] " Michal Kalderon
2021-06-14 16:44         ` Sagi Grimberg
2021-06-14 18:14           ` Michal Kalderon