From: "Nícolas F. R. A. Prado" <nfraprado@collabora.com>
To: regressions@lists.linux.dev
Cc: linux-kernel@vger.kernel.org, kernel@collabora.com
Subject: Re: [REGRESSION] boot regression on linux-next on sc7180 platforms - null pointer dereference on iommu_dma_sync_sg_for_device
Date: Wed, 15 May 2024 17:05:53 -0400 [thread overview]
Message-ID: <d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano> (raw)
In-Reply-To: <6d0da90d-f405-4b5d-b7f9-238cc9405ebe@notapiano>
On Tue, May 14, 2024 at 12:41:29PM -0400, Nícolas F. R. A. Prado wrote:
> Hi,
>
> KernelCI has identified a new boot regression on linux-next. It affects the
> following platforms:
> * sc7180-trogdor-kingoftown
> * sc7180-trogdor-lazor-limozeen
>
> The regression was introduced in next-20240509, and still affects today's
> (next-20240514) release.
>
> The config used was the upstream arm64 defconfig with a config fragment on top
> [1].
>
> The following stack traces are produced during boot and a usable shell is never
> reached:
>
> [ 0.381981] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
> [ 0.381989] Mem abort info:
> [ 0.381991] ESR = 0x0000000096000004
> [ 0.381994] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 0.381997] SET = 0, FnV = 0
> [ 0.382000] EA = 0, S1PTW = 0
> [ 0.382003] FSC = 0x04: level 0 translation fault
> [ 0.382006] Data abort info:
> [ 0.382008] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> [ 0.382011] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 0.382014] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 0.382017] [000000000000001c] user address but active_mm is swapper
> [ 0.382021] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> [ 0.382025] Modules linked in:
> [ 0.382032] CPU: 4 PID: 68 Comm: kworker/u32:2 Not tainted 6.9.0-next-20240514-dirty #380
> [ 0.382038] Hardware name: Google Kingoftown (DT)
> [ 0.382042] Workqueue: async async_run_entry_fn
> [ 0.382055] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 0.382061] pc : iommu_dma_sync_sg_for_device+0x28/0x100
> [ 0.382070] lr : __dma_sync_sg_for_device+0x28/0x4c
> [ 0.382080] sp : ffff800080943740
> [ 0.382082] x29: ffff800080943740 x28: ffff36ee44280000 x27: ffff36ee40bd7810
> [ 0.382092] x26: ffff800080943998 x25: ffff36ee44280480 x24: ffffb54600bcf0e8
> [ 0.382101] x23: ffff36ee40bd7810 x22: 0000000000000001 x21: 0000000000000000
> [ 0.382110] x20: ffffb54600f3d098 x19: 0000000000000000 x18: ffffb54601c1a210
> [ 0.382118] x17: 000000040044ffff x16: 0000000000000000 x15: ffff36efb6d95580
> [ 0.382126] x14: ffff36ee409156c0 x13: 0000000000001797 x12: 0000000000000002
> [ 0.382134] x11: 0000000000000004 x10: ffff36ee4308b3d8 x9 : ffff36ee44280469
> [ 0.382143] x8 : ffff36ee4308b304 x7 : 00000000ffffffff x6 : 0000000000000001
> [ 0.382151] x5 : ffffb5460033a740 x4 : ffffb545ff50375c x3 : 0000000000000001
> [ 0.382159] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff36ee40bd7810
> [ 0.382167] Call trace:
> [ 0.382170] iommu_dma_sync_sg_for_device+0x28/0x100
> [ 0.382176] __dma_sync_sg_for_device+0x28/0x4c
> [ 0.382183] spi_transfer_one_message+0x378/0x6e4
> [ 0.382193] __spi_pump_transfer_message+0x190/0x4a4
> [ 0.382199] __spi_sync+0x2a0/0x3c4
> [ 0.382205] spi_sync_locked+0x10/0x1c
> [ 0.382211] tpm_tis_spi_transfer_full+0x160/0x2fc
> [ 0.382217] tpm_tis_spi_transfer+0x34/0x40
> [ 0.382221] tpm_tis_spi_cr50_read_bytes+0x5c/0x90
> [ 0.382226] tpm_tis_core_init+0xfc/0x7e0
> [ 0.382231] tpm_tis_spi_init+0x54/0x70
> [ 0.382236] cr50_spi_probe+0xf4/0x27c
> [ 0.382241] tpm_tis_spi_driver_probe+0x34/0x64
> [ 0.382245] spi_probe+0x84/0xe4
> [ 0.382251] really_probe+0xbc/0x2a0
> [ 0.382258] __driver_probe_device+0x78/0x12c
> [ 0.382264] driver_probe_device+0x40/0x160
> [ 0.382269] __device_attach_driver+0xb8/0x134
> [ 0.382275] bus_for_each_drv+0x84/0xe0
> [ 0.382280] __device_attach_async_helper+0xac/0xd0
> [ 0.382286] async_run_entry_fn+0x34/0xe0
> [ 0.382291] process_one_work+0x154/0x298
> [ 0.382300] worker_thread+0x304/0x408
> [ 0.382307] kthread+0x118/0x11c
> [ 0.382313] ret_from_fork+0x10/0x20
> [ 0.382324] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c20)
> [ 0.382328] ---[ end trace 0000000000000000 ]---
Tracked down the issue to commit 8cc3bad9d9d6 ("spi: Remove unneded check for
orig_nents").
#regzbot introduced: 8cc3bad9d9d6
The issue happens because in spi_dma_sync_for_device(), the line
dma_sync_sgtable_for_device(tx_dev, &xfer->tx_sg, DMA_TO_DEVICE);
is passing a scatterlist table (xfer->tx_sg) that hasn't been initialized, so
the sgl pointer inside it is null. Before the patch, the check would prevent it
from being called since orig_nents is 0.
This initialization of the scatterlist table should happen in
spi_map_buf_attrs(), which should be called in __spi_map_msg(), however some
debugging revealed that the previous check
if (!ctlr->can_dma(ctlr, msg->spi, xfer))
is failing, which is why the scatterlist table isn't initialized. Despite that,
ctlr->cur_msg_mapped is set to true, so spi_dma_sync_for_device() doesn't return
early, which would completely avoid the issue. At this point I'm confused why
that flag tracks DMA mapping per message, if the mapping is done per transfer
(and a message can contain multiple transfers). Maybe that's what needs to
change, though I'd like the input from someone who is familiar with this code.
Thanks,
Nícolas
next prev parent reply other threads:[~2024-05-15 21:05 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-14 16:41 [REGRESSION] boot regression on linux-next on sc7180 platforms - null pointer dereference on iommu_dma_sync_sg_for_device Nícolas F. R. A. Prado
2024-05-15 21:05 ` Nícolas F. R. A. Prado [this message]
2024-05-16 21:32 ` Nícolas F. R. A. Prado
2024-05-22 10:00 ` Neil Armstrong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano \
--to=nfraprado@collabora.com \
--cc=kernel@collabora.com \
--cc=linux-kernel@vger.kernel.org \
--cc=regressions@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).