From: Joao Martins <joao.m.martins@oracle.com>
To: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>,
	Jason Gunthorpe <jgg@nvidia.com>
Cc: iommu@lists.linux.dev, Kevin Tian <kevin.tian@intel.com>,
	Shuah Khan <shuah@kernel.org>,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [BUG] seltests/iommu: runaway ./iommufd consuming 99% CPU after a failed assert()
Date: Mon, 25 Mar 2024 12:17:28 +0000
Message-ID: <cdc9c46b-1bad-41cd-8f98-38cc2171186a@oracle.com>
In-Reply-To: <a692d5d7-11d5-4c1b-9abc-208d2194ccde@alu.unizg.hr>

On 23/03/2024 20:13, Mirsad Todorovac wrote:
> 
> 
> On 3/19/24 14:58, Jason Gunthorpe wrote:
>> On Tue, Mar 12, 2024 at 07:35:40AM +0100, Mirsad Todorovac wrote:
>>> Hi,
>>>
>>> (This is verified on the second test box.)
>>>
>>> In the most recent 6.8.0 release of the torvalds tree kernel with selftest
>>> configs on, the ./iommufd process appears to consume 99% of a CPU core for
>>> quite a while in an endless loop:
>>
>> There is a "bug" in the kselftest framework where if you call a
>> kselftest assertion from the setup/teardown it infinite loops
>>
>> The fix I know is to replace kselftest assertions with normal assert()
>>
>> But I don't see an obvious thing here saying you are hitting that..
>>
>> Jason
> 
> Hi,
> 
> I'm not that deep into kselftest for that intervention.
> 
> Yet, with the v6.8-11743-ga4145ce1e7bc build, ./iommufd did not get stuck.
> Instead I got these 10 failed tests:
> 
> # #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # enforce_dirty: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty
> # not ok 156 iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty
> # #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # set_dirty_tracking: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking
> # not ok 157 iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking
> # #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # device_dirty_capability: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability
> # not ok 158 iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability
> # #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # get_dirty_bitmap: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap
> # not ok 159 iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap
> # #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # get_dirty_bitmap_no_clear: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear
> # not ok 160 iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear
> .
> .
> .
> # #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # enforce_dirty: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty
> # not ok 166 iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty
> # #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # set_dirty_tracking: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking
> # not ok 167 iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking
> # #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # device_dirty_capability: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability
> # not ok 168 iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability
> # #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # get_dirty_bitmap: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap
> # not ok 169 iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap
> # #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear ...
> # iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
> # # get_dirty_bitmap_no_clear: Test terminated by assertion
> # #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear
> # not ok 170 iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear
> .
> .
> .
> # # FAILED: 170 / 180 tests passed.
> # # Totals: pass:170 fail:10 xfail:0 xpass:0 skip:0 error:0
> not ok 1 selftests: iommu: iommufd # exit=1
> 
> It seems like the same assertion failed in all 10 failed tests?
> 

... It means that the hugetlb mmap() failed, which is required for these specific
tests, because we need to allocate a bigger IOVA range, backed by hugepages, to
exercise them.
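
To check whether a box can satisfy that kind of mapping at all, a small
standalone probe along these lines may help. It is only a sketch and not the
selftest's code: try_hugetlb_mmap() is a hypothetical helper, the flags are an
assumption rather than the ones the test passes in mmap_flags, and the size is
assumed to be a multiple of the default hugepage size. When no hugetlb pages
are reserved, mmap() with MAP_HUGETLB typically fails with ENOMEM.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical probe: try to create an anonymous hugetlb mapping of
 * `size` bytes and release it again.
 * Returns 0 on success, or the errno value reported by mmap().
 */
static int try_hugetlb_mmap(size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

	if (p == MAP_FAILED)
		return errno;

	munmap(p, size);
	return 0;
}
```

Calling try_hugetlb_mmap(128UL << 20) before and after reserving hugetlb pages
should show whether the setup assertion quoted above has a chance of passing.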


> However, I am not smart enough to figure out why ...
> 
> Apparently, from the source, mmap() fails to allocate pages on the desired address:
> 
>   1746         assert((uintptr_t)self->buffer % HUGEPAGE_SIZE == 0);
>   1747         vrc = mmap(self->buffer, variant->buffer_size, PROT_READ | PROT_WRITE,
>   1748                    mmap_flags, -1, 0);
> → 1749         assert(vrc == self->buffer);
>   1750
> 
> But I am not that deep into the source to figure out what was intended and what
> went wrong :-/

I can SKIP() the test rather than assert() in here if it helps. Though there are
other tests that fail if no hugetlb pages are reserved.

But I am not sure this is the problem here, as the initial bug email had an
entirely different set of failures. Maybe all you need is an assert() and it
gets into this state?
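
Roughly what I have in mind for skipping, as a standalone sketch and not a
patch: classify the mmap() result instead of asserting on it. The SKETCH_*
names and classify_setup_mmap() are made up for illustration; the values are
meant to mirror the kselftest exit codes (pass 0, fail 1, skip 4).

```c
#include <sys/mman.h>

/* Illustrative codes, assumed to mirror the kselftest exit codes. */
enum {
	SKETCH_PASS = 0,
	SKETCH_FAIL = 1,
	SKETCH_SKIP = 4,
};

/*
 * Hypothetical helper: decide what the setup should report for a
 * fixed-address hugetlb mmap(). `want` is the requested address and
 * `got` is what mmap() returned.
 */
static int classify_setup_mmap(void *want, void *got)
{
	if (got == MAP_FAILED)
		return SKETCH_SKIP;	/* e.g. no hugetlb pages reserved */
	if (got != want)
		return SKETCH_FAIL;	/* mapped, but not where requested */
	return SKETCH_PASS;
}
```

The point of splitting it this way is that a missing hugetlb reservation shows
up as a skip, while a mapping at the wrong address is still a real failure.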

	Joao

Thread overview: 14+ messages
2024-03-12  6:35 [BUG] seltests/iommu: runaway ./iommufd consuming 99% CPU after a failed assert() Mirsad Todorovac
2024-03-19 13:58 ` Jason Gunthorpe
2024-03-23 20:13   ` Mirsad Todorovac
2024-03-25 12:17     ` Joao Martins [this message]
2024-03-25 13:52       ` Jason Gunthorpe
2024-03-27 10:41         ` Joao Martins
2024-03-27 11:40           ` Jason Gunthorpe
2024-03-27 15:04             ` Joao Martins
2024-03-27 16:38               ` Jason Gunthorpe
2024-03-28  0:05                 ` Shuah Khan
2024-04-02 11:33                   ` Jason Gunthorpe
2024-03-27 20:04           ` Mirsad Todorovac
2024-03-28 13:54             ` Joao Martins
2024-04-04 16:54             ` Shuah Khan
