From: Sagi Grimberg <sagi@grimberg.me>
To: Michal Kalderon <mkalderon@marvell.com>, Christoph Hellwig <hch@lst.de>
Cc: "linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Shai Malin <smalin@marvell.com>, Ariel Elior <aelior@marvell.com>
Subject: Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
Date: Tue, 8 Jun 2021 10:43:45 -0700	[thread overview]
Message-ID: <e059fc90-81e7-dbe8-86f4-87b4bdbd5bb0@grimberg.me> (raw)
In-Reply-To: <CH0PR18MB4129247C2B9AE092E3C59740A1209@CH0PR18MB4129.namprd18.prod.outlook.com>


> Hi Christoph, Sagi,
> 
> We're testing some device error-recovery scenarios and hit the following BUG (stack trace below).
> In the error scenario, nvmet_rdma_queue_response receives an error from the device when trying to post a wr.
> 
> This leads to nvmet_rdma_release_rsp being called from softirq context, eventually
> reaching blk_mq_delay_run_hw_queue, which tries to schedule while in softirq (full stack below).
> 
> Could you please advise what the correct solution should be in this case?

Hey Michal,

I agree this can happen and requires correction. Does the below resolve
the issue?

--
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 7d607f435e36..6d2eea322779 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -16,6 +16,7 @@
 #include <linux/wait.h>
 #include <linux/inet.h>
 #include <asm/unaligned.h>
+#include <linux/async.h>
 
 #include <rdma/ib_verbs.h>
 #include <rdma/rdma_cm.h>
@@ -712,6 +713,12 @@ static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
 	}
 }
 
+static void nvmet_rdma_async_release_rsp(void *data, async_cookie_t cookie)
+{
+	struct nvmet_rdma_rsp *rsp = data;
+	nvmet_rdma_release_rsp(rsp);
+}
+
 static void nvmet_rdma_queue_response(struct nvmet_req *req)
 {
 	struct nvmet_rdma_rsp *rsp =
@@ -745,7 +752,12 @@ static void nvmet_rdma_queue_response(struct nvmet_req *req)
 
 	if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
 		pr_err("sending cmd response failed\n");
-		nvmet_rdma_release_rsp(rsp);
+		/*
+		 * We might be in atomic context, hence release
+		 * the rsp in async context in case we need to
+		 * process the wr_wait_list.
+		 */
+		async_schedule(nvmet_rdma_async_release_rsp, rsp);
 	}
 }
--
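
For reference, here is a minimal, self-contained sketch of the async_schedule() pattern the patch relies on: work that might sleep is handed to the async machinery and runs in process context, so the caller can stay in softirq/atomic context. The module, struct, and function names below are hypothetical and for illustration only; this is not part of the nvmet code.

/* Hypothetical demo module -- not nvmet code. */
#include <linux/async.h>
#include <linux/module.h>
#include <linux/slab.h>

struct demo_rsp {
	int id;				/* stand-in for real response state */
};

/* Runs in process context, so sleeping (e.g. submitting block I/O) is fine. */
static void demo_async_release(void *data, async_cookie_t cookie)
{
	struct demo_rsp *rsp = data;

	pr_info("releasing rsp %d in process context\n", rsp->id);
	kfree(rsp);
}

static int __init demo_init(void)
{
	struct demo_rsp *rsp = kzalloc(sizeof(*rsp), GFP_KERNEL);

	if (!rsp)
		return -ENOMEM;
	rsp->id = 1;

	/* Defer the potentially sleeping release; this call does not block. */
	async_schedule(demo_async_release, rsp);
	return 0;
}

static void __exit demo_exit(void)
{
	/* Wait for any outstanding async work before the module goes away. */
	async_synchronize_full();
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

The property that matters for the bug above is that the async callback runs in a worker thread that is allowed to sleep, unlike the softirq path in which ib_post_send() failed.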

> 
> thanks,
> Michal
> 
> [ 8790.082863] nvmet_rdma: post_recv cmd failed
> [ 8790.083484] nvmet_rdma: sending cmd response failed
> [ 8790.084131] ------------[ cut here ]------------
> [ 8790.084140] WARNING: CPU: 7 PID: 46 at block/blk-mq.c:1422 __blk_mq_run_hw_queue+0xb7/0x100
> [ 8790.084619] Modules linked in: null_blk nvmet_rdma nvmet nvme_rdma nvme_fabrics nvme_core netconsole qedr(OE) qede(OE) qed(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xt_CHECKSUM nft_chain_nat xt_MASQUERADE nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nft_counter nft_compat tun bridge stp llc nf_tables nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_umad rpcrdma rdma_ucm ib_iser rdma_cm iw_cm intel_rapl_msr intel_rapl_common ib_cm sb_edac libiscsi scsi_transport_iscsi kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sunrpc rapl ib_uverbs ib_core cirrus drm_kms_helper drm virtio_balloon i2c_piix4 pcspkr crc32c_intel virtio_net serio_raw net_failover failover floppy crc8 ata_generic pata_acpi qemu_fw_cfg [last unloaded: qedr]
> [ 8790.084748] CPU: 7 PID: 46 Comm: ksoftirqd/7 Tainted: G           OE     5.8.10 #1
> [ 8790.084749] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
> [ 8790.084752] RIP: 0010:__blk_mq_run_hw_queue+0xb7/0x100
> [ 8790.084753] Code: 00 48 89 ef e8 ea 34 c8 ff 48 89 df 41 89 c4 e8 1f 7f 00 00 f6 83 a8 00 00 00 20 74 b1 41 f7 c4 fe ff ff ff 74 b7 0f 0b eb b3 <0f> 0b eb 86 48 83 bf 98 00 00 00 00 48 c7 c0 df 81 3f 82 48 c7 c2
> [ 8790.084754] RSP: 0018:ffffc9000020ba60 EFLAGS: 00010206
> [ 8790.084755] RAX: 0000000000000100 RBX: ffff88809fe8c400 RCX: 00000000ffffffff
> [ 8790.084756] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88809fe8c400
> [ 8790.084756] RBP: ffff888137b81a50 R08: ffffffffffffffff R09: 0000000000000020
> [ 8790.084757] R10: 0000000000000001 R11: ffff8881365d4968 R12: 0000000000000000
> [ 8790.084758] R13: ffff888137b81a40 R14: ffff88811e2b9e80 R15: ffff8880b3d964f0
> [ 8790.084759] FS:  0000000000000000(0000) GS:ffff88813bbc0000(0000) knlGS:0000000000000000
> [ 8790.084759] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8790.084760] CR2: 000055ca53900da8 CR3: 000000012b83e006 CR4: 0000000000360ee0
> [ 8790.084763] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 8790.084763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 8790.084764] Call Trace:
> [ 8790.084767]  __blk_mq_delay_run_hw_queue+0x140/0x160
> [ 8790.084768]  blk_mq_get_tag+0x1d1/0x270
> [ 8790.084771]  ? finish_wait+0x80/0x80
> [ 8790.084773]  __blk_mq_alloc_request+0xb1/0x100
> [ 8790.084774]  blk_mq_make_request+0x144/0x5d0
> [ 8790.084778]  generic_make_request+0x2db/0x340
> [ 8790.084779]  ? bvec_alloc+0x82/0xe0
> [ 8790.084781]  submit_bio+0x43/0x160
> [ 8790.084781]  ? bio_add_page+0x39/0x90
> [ 8790.084794]  nvmet_bdev_execute_rw+0x28c/0x360 [nvmet]
> [ 8790.084800]  nvmet_rdma_execute_command+0x72/0x110 [nvmet_rdma]
> [ 8790.084802]  nvmet_rdma_release_rsp+0xc1/0x1e0 [nvmet_rdma]
> [ 8790.084804]  nvmet_rdma_queue_response.cold.63+0x14/0x19 [nvmet_rdma]
> [ 8790.084806]  nvmet_req_complete+0x11/0x40 [nvmet]
> [ 8790.084809]  nvmet_bio_done+0x27/0x100 [nvmet]
> [ 8790.084811]  blk_update_request+0x23e/0x3b0
> [ 8790.084812]  blk_mq_end_request+0x1a/0x120
> [ 8790.084814]  blk_done_softirq+0xa1/0xd0
> [ 8790.084818]  __do_softirq+0xe4/0x2f8
> [ 8790.084821]  ? sort_range+0x20/0x20
> [ 8790.084824]  run_ksoftirqd+0x26/0x40
> [ 8790.084825]  smpboot_thread_fn+0xc5/0x160
> [ 8790.084827]  kthread+0x116/0x130
> [ 8790.084828]  ? kthread_park+0x80/0x80
> [ 8790.084832]  ret_from_fork+0x22/0x30
> [ 8790.084833] ---[ end trace 16ec813ee3f82b56 ]---
> [ 8790.085314] BUG: scheduling while atomic: ksoftirqd/7/46/0x00000100
> 

Thread overview: 8+ messages
2021-05-30  7:33 BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request Michal Kalderon
2021-06-08 16:50 ` Christoph Hellwig
2021-06-08 17:43 ` Sagi Grimberg [this message]
2021-06-08 18:41   ` Keith Busch
2021-06-09  0:03     ` Sagi Grimberg
2021-06-14 14:44       ` [EXT] " Michal Kalderon
2021-06-14 16:44         ` Sagi Grimberg
2021-06-14 18:14           ` Michal Kalderon
