Subject: Re: [Bug Report] RDMA/core: test_qpex.py attempts invalid MW bind operation
From: "Pearson, Robert B"
To: Edward Srouji, Leon Romanovsky
Cc: Jason Gunthorpe, RDMA mailing list
References: <8d329494-6653-359b-91aa-31ac9dc8122c@gmail.com> <474ad554-574c-120e-97ba-b617e346f14d@gmail.com> <591f489c-882b-de37-eb1f-d39a71fcbd05@nvidia.com> <90cbdee5-c1f6-0373-8d09-c28e3ad7a6c8@gmail.com>
Message-ID: <0d5a9421-e2a4-c6a6-2934-500559bfc651@gmail.com>
Date: Tue, 8 Jun 2021 11:22:20 -0500
In-Reply-To: <90cbdee5-c1f6-0373-8d09-c28e3ad7a6c8@gmail.com>
X-Mailing-List: linux-rdma@vger.kernel.org

On 6/8/2021 11:12 AM, Pearson, Robert B wrote:
>
> On 6/8/2021 10:54 AM, Pearson, Robert B wrote:
>>
>> On 6/8/2021 6:53 AM, Edward Srouji wrote:
>>>
>>> On 6/8/2021 9:47 AM, Leon Romanovsky wrote:
>>>> On Mon, Jun 07, 2021 at 11:54:29PM -0500, Pearson, Robert B wrote:
>>>>> On 6/7/2021 11:41 PM, Leon Romanovsky wrote:
>>>>>> On Mon, Jun 07, 2021 at 04:50:20PM -0500, Pearson, Robert B wrote:
>>>>>>> sorry/this time without the HTML.
>>>>>>>
>>>>>>> ======================================================================
>>>>>>>
>>>>>>> ERROR: test_qp_ex_rc_bind_mw (tests.test_qpex.QpExTestCase)
>>>>>>> Verify bind memory window operation using the new post_send API.
>>>>>>> ----------------------------------------------------------------------
>>>>>>>
>>>>>>> Traceback (most recent call last):
>>>>>>>     File "/home/rpearson/src/rdma-core/tests/test_qpex.py", line 292, in test_qp_ex_rc_bind_mw
>>>>>>>       u.poll_cq(server.cq)
>>>>>>>     File "/home/rpearson/src/rdma-core/tests/utils.py", line 538, in poll_cq
>>>>>>>       raise PyverbsRDMAError('Completion status is {s}'.
>>>>>>> pyverbs.pyverbs_error.PyverbsRDMAError: Completion status is Memory window bind error. Errno: 6, No such device or address
>>>>>>>
>>>>>>> This test attempts to bind a type 2 MW to an MR that does not have bind mw
>>>>>>> access set and expects the test to succeed.
>>>
>>> You're right, looks like a test bug. I'll send a fix upstream.
>>>
>>> Can you please confirm that this solves your issue:
>> Well I get further. I am hitting a seg fault in python at
>>
>>         client.qp.wr_rdma_write(new_key, server.mr.buf)
>>
>> in test_qp_ex_rc_bind_mw.
>>
>> I'm trying to track it down. I'm not very familiar with python and
>> don't know how to run the test under gdb.
>>
>> Thanks for the fix.
>>
>> Bob
>
> OK got it. In the setup for the test you write
>
>     class QpExRCBindMw(RCResources):
>         def create_qps(self):
>             create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW)
>
>         def create_mr(self):
>             self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE |
>                             e.IBV_ACCESS_MW_BIND)
>
> which asks for qp_ex->wr_bind_mw() to be set but later in the test you
> write
>
>     client.qp.wr_rdma_write(new_key, server.mr.buf)
>
> which calls qp_ex->wr_rdma_write() which is not set causing the seg
> fault. I think you should have written
>
>             create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW |
>                          e.IBV_QP_EX_WITH_RDMA_WRITE)
>
> since you need both extended QP operations.
>
> Bob

With this patch the test now runs correctly:

diff --git a/tests/test_qpex.py b/tests/test_qpex.py
index 20288d45..0316bfcb 100644
--- a/tests/test_qpex.py
+++ b/tests/test_qpex.py
@@ -146,10 +146,12 @@ class QpExRCAtomicFetchAdd(RCResources):
 
 class QpExRCBindMw(RCResources):
     def create_qps(self):
-        create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW)
+        create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW |
+                       e.IBV_QP_EX_WITH_RDMA_WRITE)
 
     def create_mr(self):
-        self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE)
+        self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE |
+                       e.IBV_ACCESS_MW_BIND)
 
 
 class QpExTestCase(RDMATestCase):

>
>>
>>>
>>> diff --git a/tests/test_qpex.py b/tests/test_qpex.py
>>> index 4b58260f..c2d67ee8 100644
>>> --- a/tests/test_qpex.py
>>> +++ b/tests/test_qpex.py
>>> @@ -149,7 +149,7 @@ class QpExRCBindMw(RCResources):
>>>          create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_BIND_MW)
>>>
>>>      def create_mr(self):
>>> -        self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE)
>>> +        self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE | e.IBV_ACCESS_MW_BIND)
>>>
>>>>>> Does the test break after your MW series? Or will it break not-merged
>>>>>> code yet?
>>>>>>
>>>>>> Generally speaking, we expect that developers run rdma-core tests and
>>>>>> fix/extend them prior to the submission.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>> Bob Pearson
>>>>> Nope. I don't have real RNICs at home to test. But (see my note to Zhu) the
>>>>> non extended APIs do set the access flags correctly and the extended test
>>>>> case does not. The wr_bind_mw() function can't fix this for the test case.
>>>>> It has to set the access flags when it creates the MR and it didn't. It is
>>>>> possible that mlx5 doesn't check the bind access flag but that seems
>>>>> unlikely.
>>>> mlx5 devices support MW 1 & 2 and kernel checks that only these types
>>>> can be accepted from the user space. This is why mlx5 doesn't need to
>>>> check access flags again.
>>>>
>>>>     static int ib_uverbs_alloc_mw(struct uverbs_attr_bundle *attrs)
>>>>     {
>>>>
>>>> ....
>>>>
>>>>             if (cmd.mw_type != IB_MW_TYPE_1 && cmd.mw_type != IB_MW_TYPE_2) {
>>>>                     ret = -EINVAL;
>>>>                     goto err_put;
>>>>             }
>>>>
>>>>
>>>> Thanks
>>>
>>> I see that mlx5 checks the access flags in userspace only if
>>> MW_DEBUG is turned on (in set_bind_wr()).
>>>
>>> I guess that's for the sake of performance, as it's part of the data
>>> path.
>>>
>>>>> Bob
>>>>>
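
The failure mode discussed in this thread (an extended QP operation that was never requested at creation time being called later) can be illustrated with a small, self-contained sketch. This is a toy model only, not pyverbs or rdma-core code: SendOps, MockQpEx and demo() are hypothetical names, and the bit values are illustrative placeholders rather than values taken from the verbs headers. Where the thread reports that calling the unset qp_ex->wr_rdma_write() causes a seg fault, the mock raises a Python exception instead.

```python
from enum import IntFlag


class SendOps(IntFlag):
    """Hypothetical stand-ins for the IBV_QP_EX_WITH_* bits (illustrative values)."""
    RDMA_WRITE = 1 << 0
    BIND_MW = 1 << 8


class MockQpEx:
    """Toy extended QP: only operations requested at creation are usable."""

    def __init__(self, send_ops):
        self.send_ops = SendOps(send_ops)

    def _require(self, op):
        # Per the thread, the unrequested operation is simply "not set"
        # and calling it crashes; this mock raises instead of crashing.
        if not self.send_ops & op:
            raise RuntimeError(f"{op.name} was not requested at QP creation")

    def wr_bind_mw(self):
        self._require(SendOps.BIND_MW)
        return "bind_mw posted"

    def wr_rdma_write(self):
        self._require(SendOps.RDMA_WRITE)
        return "rdma_write posted"


def demo():
    # Original test setup: only BIND_MW requested, then an RDMA write is posted.
    broken = MockQpEx(SendOps.BIND_MW)
    try:
        broken.wr_rdma_write()
        broken_result = "no error"
    except RuntimeError as exc:
        broken_result = str(exc)

    # Patched setup: both operations OR-ed together at creation time.
    fixed = MockQpEx(SendOps.BIND_MW | SendOps.RDMA_WRITE)
    return broken_result, fixed.wr_bind_mw(), fixed.wr_rdma_write()


if __name__ == "__main__":
    print(demo())
```

The same reasoning applies to the MR access flags in the patch: the bind can only be expected to succeed if IBV_ACCESS_MW_BIND is OR-ed into the access mask when the MR is created.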