From: "Christian König" <christian.koenig@amd.com>
To: Ondrej Zary <linux@zary.sk>
Cc: nouveau@lists.freedesktop.org, Ben Skeggs <bskeggs@redhat.com>,
dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device
Date: Fri, 11 Jun 2021 14:38:18 +0200
Message-ID: <4b4248d8-b708-3832-7fe3-2a9fd2c2311e@amd.com>
In-Reply-To: <d4e5042c-3981-02b0-4b9e-fa2c8e373be4@amd.com>
[-- Attachment #1: Type: text/plain, Size: 4084 bytes --]
On 10.06.21 at 19:59, Christian König wrote:
> On 10.06.21 at 19:50, Ondrej Zary wrote:
>> [SNIP]
>>> I can't see how this is called from the nouveau code, the only
>>> possibility I see is that it is maybe called through the AGP code somehow.
>> Yes, you're right:
>> [ 13.192663] Call Trace:
>> [ 13.192678] dump_stack+0x54/0x68
>> [ 13.192690] ttm_tt_init+0x11/0x8a [ttm]
>> [ 13.192699] ttm_agp_tt_create+0x39/0x51 [ttm]
>> [ 13.192840] nouveau_ttm_tt_create+0x17/0x22 [nouveau]
>> [ 13.192856] ttm_tt_create+0x78/0x8c [ttm]
>> [ 13.192864] ttm_bo_handle_move_mem+0x7d/0xca [ttm]
>> [ 13.192873] ttm_bo_validate+0x92/0xc8 [ttm]
>> [ 13.192883] ttm_bo_init_reserved+0x216/0x243 [ttm]
>> [ 13.192892] ttm_bo_init+0x45/0x65 [ttm]
>> [ 13.193018] ? nouveau_bo_del_io_reserve_lru+0x48/0x48 [nouveau]
>> [ 13.193150] nouveau_bo_init+0x8c/0x94 [nouveau]
>> [ 13.193273] ? nouveau_bo_del_io_reserve_lru+0x48/0x48 [nouveau]
>> [ 13.193407] nouveau_bo_new+0x44/0x57 [nouveau]
>> [ 13.193537] nouveau_channel_prep+0xa3/0x269 [nouveau]
>> [ 13.193665] nouveau_channel_new+0x3c/0x5f7 [nouveau]
>> [ 13.193679] ? slab_free_freelist_hook+0x3b/0xa7
>> [ 13.193686] ? kfree+0x9e/0x11a
>> [ 13.193781] ? nvif_object_sclass_put+0xd/0x16 [nouveau]
>> [ 13.193908] nouveau_drm_device_init+0x2e2/0x646 [nouveau]
>> [ 13.193924] ? pci_enable_device_flags+0x1e/0xac
>> [ 13.194052] nouveau_drm_probe+0xeb/0x188 [nouveau]
>> [ 13.194182] ? nouveau_drm_device_init+0x646/0x646 [nouveau]
>> [ 13.194195] pci_device_probe+0x89/0xe9
>> [ 13.194205] really_probe+0x127/0x2a7
>> [ 13.194212] driver_probe_device+0x5b/0x87
>> [ 13.194219] device_driver_attach+0x2e/0x41
>> [ 13.194226] __driver_attach+0x7c/0x83
>> [ 13.194232] bus_for_each_dev+0x4c/0x66
>> [ 13.194238] driver_attach+0x14/0x16
>> [ 13.194244] ? device_driver_attach+0x41/0x41
>> [ 13.194251] bus_add_driver+0xc5/0x16c
>> [ 13.194258] driver_register+0x87/0xb9
>> [ 13.194265] __pci_register_driver+0x38/0x3b
>> [ 13.194271] ? 0xf0c0d000
>> [ 13.194362] nouveau_drm_init+0x14c/0x1000 [nouveau]
>>
>> How is ttm_dma_tt->dma_address allocated?
>
> Mhm, I need to double check how AGP is supposed to work.
>
> Since barely anybody is using it these days it is something which
> breaks from time to time.
I have no idea how that ever worked in the first place, since AGP isn't
supposed to sync between CPU and GPU. Everything is coherent in that case.

Anyway, here is a patch which makes those functions check whether the
dma_address array is allocated in the first place. Please test it.

Thanks,
Christian.
>
> Thanks for the backtrace,
> Christian.
>
>> I cannot find any assignment executed (in the working code):
>>
>> $ git grep dma_address\ = drivers/gpu/
>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c: sg->sgl->dma_address = addr;
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c: dma_address = &dma->dma_address[offset >> PAGE_SHIFT];
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c: dma_address = (mm_node->start << PAGE_SHIFT) + offset;
>> drivers/gpu/drm/i915/gvt/scheduler.c: sg->dma_address = addr;
>> drivers/gpu/drm/i915/i915_gpu_error.c: sg->dma_address = it;
>> drivers/gpu/drm/ttm/ttm_tt.c: ttm->dma_address = (void *) (ttm->ttm.pages + ttm->ttm.num_pages);
>> drivers/gpu/drm/ttm/ttm_tt.c: ttm->dma_address = kvmalloc_array(ttm->ttm.num_pages,
>> drivers/gpu/drm/ttm/ttm_tt.c: ttm_dma->dma_address = NULL;
>> drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c: viter->dma_address = &__vmw_piter_phys_addr;
>> drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c: viter->dma_address = &__vmw_piter_dma_addr;
>> drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c: viter->dma_address = &__vmw_piter_sg_addr;
>>
>> The 2 cases in ttm_tt.c are in ttm_dma_tt_alloc_page_directory() and
>> ttm_sg_tt_alloc_page_directory().
>> Confirmed by adding printk()s that they're NOT called.
>>
>>
>
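For readers following the grep output above, here is a minimal user-space
sketch of the difference between the two init paths. It is not the TTM code:
struct sketch_tt, sketch_tt_init(), sketch_dma_tt_init() and sketch_tt_fini()
are hypothetical names, and uintptr_t stands in for dma_addr_t. It only
assumes what the quoted ttm_tt.c line suggests, namely that the DMA variant
places dma_address directly behind the pages array in one allocation, while
the plain init seen in the AGP backtrace never sets dma_address at all.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the TTM tt object; only the relevant fields. */
struct sketch_tt {
	void **pages;           /* page pointer array */
	uintptr_t *dma_address; /* per-page DMA addresses, may stay NULL */
	size_t num_pages;
};

/* Plain init (the path taken via AGP in the trace): pages array only. */
static int sketch_tt_init(struct sketch_tt *tt, size_t num_pages)
{
	tt->num_pages = num_pages;
	tt->dma_address = NULL; /* never allocated on this path */
	tt->pages = calloc(num_pages, sizeof(*tt->pages));
	return tt->pages ? 0 : -1;
}

/* DMA init: one allocation holds both arrays, dma_address behind pages. */
static int sketch_dma_tt_init(struct sketch_tt *tt, size_t num_pages)
{
	tt->num_pages = num_pages;
	tt->dma_address = NULL;
	tt->pages = calloc(num_pages,
			   sizeof(*tt->pages) + sizeof(*tt->dma_address));
	if (!tt->pages)
		return -1;
	tt->dma_address = (void *)(tt->pages + num_pages);
	return 0;
}

/* Both arrays share one allocation (or dma_address is NULL), so one free. */
static void sketch_tt_fini(struct sketch_tt *tt)
{
	free(tt->pages);
	tt->pages = NULL;
	tt->dma_address = NULL;
}
```

Any caller that walks dma_address[] after the plain init dereferences NULL,
which matches the oops in the subject line.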
[-- Attachment #2: 0001-drm-nouveau-check-dma_address-array-for-CPU-GPU-sync.patch --]
[-- Type: text/x-patch, Size: 1362 bytes --]
From 5370102729c6ecb280712c40b92ff7b9f58c6e1e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>
Date: Fri, 11 Jun 2021 14:34:50 +0200
Subject: [PATCH] drm/nouveau: check dma_address array for CPU/GPU sync
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

AGP for example doesn't have a dma_address array.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 085023624fb0..1a52590f5303 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -551,7 +551,7 @@ nouveau_bo_sync_for_device(struct nouveau_bo *nvbo)
 	struct ttm_tt *ttm_dma = (struct ttm_tt *)nvbo->bo.ttm;
 	int i, j;
 
-	if (!ttm_dma)
+	if (!ttm_dma || !ttm_dma->dma_address)
 		return;
 	if (!ttm_dma->pages) {
 		NV_DEBUG(drm, "ttm_dma 0x%p: pages NULL\n", ttm_dma);
@@ -587,7 +587,7 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo)
 	struct ttm_tt *ttm_dma = (struct ttm_tt *)nvbo->bo.ttm;
 	int i, j;
 
-	if (!ttm_dma)
+	if (!ttm_dma || !ttm_dma->dma_address)
 		return;
 	if (!ttm_dma->pages) {
 		NV_DEBUG(drm, "ttm_dma 0x%p: pages NULL\n", ttm_dma);
--
2.25.1
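
The effect of the added check can be tried outside the kernel with a small
mock. This is only a sketch under stated assumptions: struct mock_tt and
mock_sync_for_device() are hypothetical names, uintptr_t stands in for
dma_addr_t, and the loop merely counts pages where the real driver would hand
each address to the DMA sync API. It does not model how nouveau batches pages.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ttm_tt; only the fields the guard uses. */
struct mock_tt {
	void **pages;           /* backing pages */
	uintptr_t *dma_address; /* per-page DMA addresses; NULL for AGP */
	size_t num_pages;
};

/* Returns the number of pages that would have been synced. */
static size_t mock_sync_for_device(const struct mock_tt *tt)
{
	size_t i, synced = 0;

	/* Same shape as the patched check: no tt or no array, nothing to do. */
	if (!tt || !tt->dma_address)
		return 0;
	if (!tt->pages)
		return 0;

	for (i = 0; i < tt->num_pages; i++) {
		/* Placeholder for the per-page sync; just touch the entry. */
		volatile uintptr_t addr = tt->dma_address[i];
		(void)addr;
		synced++;
	}
	return synced;
}
```

With dma_address left NULL, as in the AGP case, the function returns 0
instead of faulting; without the second half of the check the loop would read
through a NULL pointer.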