From: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: iommu@lists.linux.dev, Kevin Tian <kevin.tian@intel.com>,
	Shuah Khan <shuah@kernel.org>,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [BUG] seltests/iommu: runaway ./iommufd consuming 99% CPU after a failed assert()
Date: Sat, 23 Mar 2024 21:13:01 +0100
Message-ID: <a692d5d7-11d5-4c1b-9abc-208d2194ccde@alu.unizg.hr>
In-Reply-To: <20240319135852.GA393211@nvidia.com>



On 3/19/24 14:58, Jason Gunthorpe wrote:
> On Tue, Mar 12, 2024 at 07:35:40AM +0100, Mirsad Todorovac wrote:
>> Hi,
>>
>> (This is verified on the second test box.)
>>
>> In the most recent 6.8.0 release of torvalds tree kernel with selftest configs on,
>> process ./iommufd appears to consume 99% of a CPU core for quite a while in an
>> endless loop:
> 
> There is a "bug" in the kselftest framework where if you call a
> kselftest assertion from the setup/teardown it infinite loops
> 
> The fix I know is to replace kselftest assertions with normal assert()
> 
> But I don't see an obvious thing here saying you are hitting that..
> 
> Jason

Hi,

I'm not familiar enough with kselftest internals to attempt that change myself.

However, with the v6.8-11743-ga4145ce1e7bc build, ./iommufd no longer got stuck.
Instead, I got these 10 failed tests:

# #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # enforce_dirty: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty
# not ok 156 iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty
# #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # set_dirty_tracking: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking
# not ok 157 iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking
# #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # device_dirty_capability: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability
# not ok 158 iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability
# #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap
# not ok 159 iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap
# #  RUN           iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap_no_clear: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear
# not ok 160 iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear
.
.
.
# #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # enforce_dirty: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty
# not ok 166 iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty
# #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # set_dirty_tracking: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking
# not ok 167 iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking
# #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # device_dirty_capability: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability
# not ok 168 iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability
# #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap
# not ok 169 iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap
# #  RUN           iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap_no_clear: Test terminated by assertion
# #          FAIL  iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear
# not ok 170 iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear
.
.
.
# # FAILED: 170 / 180 tests passed.
# # Totals: pass:170 fail:10 xfail:0 xpass:0 skip:0 error:0
not ok 1 selftests: iommu: iommufd # exit=1

It seems like the same assertion failed in all 10 failed tests?

However, I am not smart enough to figure out why ...

Apparently, judging from the source, mmap() fails to map the pages at the desired address:

   1746         assert((uintptr_t)self->buffer % HUGEPAGE_SIZE == 0);
   1747         vrc = mmap(self->buffer, variant->buffer_size, PROT_READ | PROT_WRITE,
   1748                    mmap_flags, -1, 0);
→ 1749         assert(vrc == self->buffer);
   1750

But I am not familiar enough with the source to figure out what was intended and what went
wrong :-/
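
One way to narrow this down might be to separate "mmap() failed outright" from
"mmap() succeeded but at a different address", since the assert at line 1749
cannot tell the two apart. The following is only a minimal sketch, not the
test's actual code: map_at_hint() is a hypothetical helper name, and the flags
passed to it are whatever the caller chooses.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical diagnostic helper (not part of iommufd.c).
 * Maps `len` anonymous bytes at `hint` with the given flags and stores the
 * raw mmap() result in *out.
 * Returns 0 if the mapping succeeded and, when a hint was given, landed
 * exactly at the hint. Returns -1 otherwise, after printing the reason.
 */
static int map_at_hint(void *hint, size_t len, int flags, void **out)
{
	void *vrc;

	assert(out != NULL);

	vrc = mmap(hint, len, PROT_READ | PROT_WRITE, flags, -1, 0);
	*out = vrc;

	if (vrc == MAP_FAILED) {
		/* mmap() itself failed; errno says why (e.g. ENOMEM, EINVAL). */
		fprintf(stderr, "mmap failed: %s\n", strerror(errno));
		return -1;
	}
	if (hint != NULL && vrc != hint) {
		/* mmap() succeeded, but the kernel chose another address. */
		fprintf(stderr, "mmap returned %p, wanted %p\n", vrc, hint);
		return -1;
	}
	return 0;
}
```

If a call such as map_at_hint(self->buffer, variant->buffer_size, mmap_flags, &vrc)
were to report ENOMEM while MAP_HUGETLB is among the flags, that might point to
no huge pages being reserved on the test box (see /proc/sys/vm/nr_hugepages),
but that is a guess on my side, not something the log above confirms.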

Best regards,
Mirsad Todorovac
