* BUG: kernel NULL pointer dereference, address: 0000000000000008
@ 2024-01-18 17:23 Mirsad Todorovac
From: Mirsad Todorovac @ 2024-01-18 17:23 UTC
To: linux-kernel, amd-gfx
Cc: Sathishkumar S, Lijo Lazar, Srinivasan Shanmugam, Guchun Chen,
Lang Yu, Felix Kuehling, Pan, Xinhui, dri-devel,
Marek Olšák, Boyuan Zhang, Daniel Vetter, David Francis,
Alex Deucher, David Airlie, Christian König
[-- Attachment #1: Type: text/plain, Size: 8275 bytes --]
Hi,
Unfortunately, I was not able to boot into this kernel again to decode the stack trace, but I thought
that any information about the NULL pointer dereference would be better than none.
The system runs Ubuntu 23.10 (Mantic) with an AMD Navi 23 [Radeon RX 6600/6600 XT/6600M]
graphics card.
Please find the config and the hardware listing attached.
Best regards,
Mirsad
kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
kernel: [ 5.576707] #PF: supervisor read access in kernel mode
kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
kernel: [ 5.576712] PGD 0 P4D 0
kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
kernel: [ 5.576903] PKRU: 55555554
kernel: [ 5.576905] Call Trace:
kernel: [ 5.576907] <TASK>
kernel: [ 5.576909] ? show_regs+0x72/0x90
kernel: [ 5.576914] ? __die+0x25/0x80
kernel: [ 5.576917] ? page_fault_oops+0x154/0x4c0
kernel: [ 5.576921] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0+0x35/0x70
kernel: [ 5.576930] ? do_user_addr_fault+0x30e/0x6e0
kernel: [ 5.576934] ? exc_page_fault+0x84/0x1b0
kernel: [ 5.576937] ? asm_exc_page_fault+0x27/0x30
kernel: [ 5.576942] ? gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
kernel: [ 5.577056] amdgpu_device_init+0xefa/0x2de0 [amdgpu]
kernel: [ 5.577158] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577161] ? pci_bus_read_config_word+0x47/0x90
kernel: [ 5.577166] ? pci_read_config_word+0x27/0x60
kernel: [ 5.577168] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577171] ? do_pci_enable_device+0xe1/0x110
kernel: [ 5.577176] amdgpu_driver_load_kms+0x1a/0x1c0 [amdgpu]
kernel: [ 5.577275] amdgpu_pci_probe+0x1a8/0x5e0 [amdgpu]
kernel: [ 5.577373] local_pci_probe+0x48/0xb0
kernel: [ 5.577377] pci_device_probe+0xc8/0x290
kernel: [ 5.577381] really_probe+0x1d2/0x440
kernel: [ 5.577386] __driver_probe_device+0x8a/0x190
kernel: [ 5.577389] driver_probe_device+0x23/0xd0
kernel: [ 5.577392] __driver_attach+0x10f/0x220
kernel: [ 5.577396] ? __pfx___driver_attach+0x10/0x10
kernel: [ 5.577399] bus_for_each_dev+0x7a/0xe0
kernel: [ 5.577402] driver_attach+0x1e/0x30
kernel: [ 5.577405] bus_add_driver+0x127/0x240
kernel: [ 5.577409] driver_register+0x64/0x140
kernel: [ 5.577411] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
kernel: [ 5.577521] __pci_register_driver+0x68/0x80
kernel: [ 5.577524] amdgpu_init+0x69/0xff0 [amdgpu]
kernel: [ 5.577628] do_one_initcall+0x46/0x330
kernel: [ 5.577632] ? kmalloc_trace+0x136/0x370
kernel: [ 5.577637] do_init_module+0x6a/0x280
kernel: [ 5.577640] load_module+0x2419/0x2500
kernel: [ 5.577647] init_module_from_file+0x9c/0xf0
kernel: [ 5.577649] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577652] ? init_module_from_file+0x9c/0xf0
kernel: [ 5.577657] idempotent_init_module+0x184/0x240
kernel: [ 5.577661] __x64_sys_finit_module+0x64/0xd0
kernel: [ 5.577664] do_syscall_64+0x76/0x140
kernel: [ 5.577668] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577671] ? ksys_mmap_pgoff+0x123/0x270
kernel: [ 5.577675] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577678] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577681] ? syscall_exit_to_user_mode+0x97/0x1e0
kernel: [ 5.577684] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577687] ? do_syscall_64+0x85/0x140
kernel: [ 5.577689] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577692] ? do_syscall_64+0x85/0x140
kernel: [ 5.577695] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577698] ? do_syscall_64+0x85/0x140
kernel: [ 5.577700] ? srso_alias_return_thunk+0x5/0xfbef5
kernel: [ 5.577703] ? sysvec_call_function+0x4e/0xb0
kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe+0x6e/0x76
kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
kernel: [ 5.577748] </TASK>
kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
kernel: [ 5.577817] CR2: 0000000000000008
kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
rsyslogd: rsyslogd's groupid changed to 111
kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
kernel: [ 5.914419] PKRU: 55555554
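Even without a decoded stack, the RIP and Code lines above identify the faulting function and instruction. As a minimal sketch, assuming Python and using the hypothetical helper names parse_rip and faulting_bytes (they are not part of any kernel tooling), those fields could be pulled out of the log like this:

```python
import re

# Matches the kernel-mode "RIP:" line of an x86-64 oops, e.g.
#   RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
# A user-space RIP line (raw address, no symbol+offset) does not match.
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)"
    r"/0x(?P<size>[0-9a-f]+)(?: \[(?P<module>\w+)\])?"
)


def parse_rip(line):
    """Return symbol, offset, function size and module from a RIP line, or None."""
    m = RIP_RE.search(line)
    if m is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),  # None for built-in code
    }


def faulting_bytes(code_line):
    """Return the opcode bytes starting at the <xx>-marked faulting instruction."""
    _, _, tail = code_line.partition("Code:")
    tokens = tail.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            return bytes(int(t.strip("<>"), 16) for t in tokens[i:])
    return b""


if __name__ == "__main__":
    rip = parse_rip("RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]")
    print(rip)
    print(faulting_bytes("Code: 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a").hex(" "))
```

Applied to the trace above, this gives gfx_v10_0_early_init at offset 0x5ab in the amdgpu module. The marked bytes 48 8b 40 08 encode mov 0x8(%rax),%rax, which with RAX = 0 is consistent with the faulting address 0000000000000008 reported in CR2.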
[-- Attachment #2: lshw.txt --]
[-- Type: text/plain, Size: 58568 bytes --]
defiant
description: Desktop Computer
product: X670E PG Lightning (Default string)
vendor: ASRock
version: Default string
serial: Default string
width: 64 bits
capabilities: smbios-3.4.0 dmi-3.4.0 smp vsyscall32
configuration: boot=normal chassis=desktop family=Default string sku=Default string uuid=01006b9c-80fb-0000-0000-000000000000
*-core
description: Motherboard
product: X670E PG Lightning
vendor: ASRock
physical id: 0
version: Default string
serial: M80-FA012800404
slot: Default string
*-firmware
description: BIOS
vendor: American Megatrends International, LLC.
physical id: 0
version: 1.21
date: 04/26/2023
size: 64KiB
capacity: 32MiB
capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd int13floppynec int13floppytoshiba int13floppy360 int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int14serial int17printer int10video usb biosbootspecification uefi
*-cache:0
description: L1 cache
physical id: 29
slot: L1 - Cache
size: 1MiB
capacity: 1MiB
clock: 1GHz (1.0ns)
capabilities: pipeline-burst internal write-back unified
configuration: level=1
*-cache:1
description: L2 cache
physical id: 2a
slot: L2 - Cache
size: 16MiB
capacity: 16MiB
clock: 1GHz (1.0ns)
capabilities: pipeline-burst internal write-back unified
configuration: level=2
*-cache:2
description: L3 cache
physical id: 2b
slot: L3 - Cache
size: 64MiB
capacity: 64MiB
clock: 1GHz (1.0ns)
capabilities: pipeline-burst internal write-back unified
configuration: level=3
*-cpu
description: CPU
product: AMD Ryzen 9 7950X 16-Core Processor
vendor: Advanced Micro Devices [AMD]
physical id: 2c
bus info: cpu@0
version: 25.97.2
serial: Unknown
slot: AM5
size: 400MHz
capacity: 5881MHz
width: 64 bits
clock: 100MHz
capabilities: lm fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp x86-64 constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca fsrm flush_l1d cpufreq
configuration: cores=16 enabledcores=16 microcode=174068227 threads=32
*-memory
description: System Memory
physical id: 2e
slot: System board or motherboard
size: 64GiB
*-bank:0
description: [empty]
product: Unknown
vendor: Unknown
physical id: 0
serial: Unknown
slot: DIMM 0
*-bank:1
description: DIMM Synchronous Unbuffered (Unregistered) 4800 MHz (0.2 ns)
product: KF560C40-32
vendor: Kingston
physical id: 1
serial: 6F1AAEED
slot: DIMM 1
size: 32GiB
width: 64 bits
clock: 505MHz (2.0ns)
*-bank:2
description: [empty]
product: Unknown
vendor: Unknown
physical id: 2
serial: Unknown
slot: DIMM 0
*-bank:3
description: DIMM Synchronous Unbuffered (Unregistered) 4800 MHz (0.2 ns)
product: KF560C40-32
vendor: Kingston
physical id: 3
serial: 3C1AA6CB
slot: DIMM 1
size: 32GiB
width: 64 bits
clock: 505MHz (2.0ns)
*-pci:0
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 100
bus info: pci@0000:00:00.0
version: 00
width: 32 bits
clock: 33MHz
*-pci:0
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 1.1
bus info: pci@0000:00:01.1
version: 00
width: 32 bits
clock: 33MHz
capabilities: pci pm pciexpress msi ht normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:27 ioport:f000(size=4096) memory:fcb00000-fcdfffff ioport:fa00000000(size=8858370048)
*-pci
description: PCI bridge
product: Navi 10 XL Upstream Port of PCI Express Switch
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:01:00.0
version: c7
width: 32 bits
clock: 33MHz
capabilities: pci pm pciexpress msi normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:32 memory:fcd00000-fcd03fff ioport:f000(size=4096) memory:fcb00000-fccfffff ioport:fa00000000(size=8858370048)
*-pci
description: PCI bridge
product: Navi 10 XL Downstream Port of PCI Express Switch
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:02:00.0
version: 00
width: 32 bits
clock: 33MHz
capabilities: pci pm pciexpress msi normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:33 ioport:f000(size=4096) memory:fcb00000-fccfffff ioport:fa00000000(size=8858370048)
*-display
description: VGA compatible controller
product: Navi 23 [Radeon RX 6600/6600 XT/6600M]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:03:00.0
logical name: /dev/fb0
version: c7
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi vga_controller bus_master cap_list rom fb
configuration: depth=32 driver=amdgpu latency=0 mode=3840x2160 resolution=3840,2160 visual=truecolor xres=3840 yres=2160
resources: iomemory:fa0-f9f iomemory:fc0-fbf irq:117 memory:fa00000000-fbffffffff memory:fc00000000-fc0fffffff ioport:f000(size=256) memory:fcb00000-fcbfffff memory:fcc00000-fcc1ffff
*-multimedia
description: Audio device
product: Navi 21 HDMI Audio [Radeon RX 6800/6800 XT / 6900 XT]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0.1
bus info: pci@0000:03:00.1
logical name: card0
logical name: /dev/snd/controlC0
logical name: /dev/snd/hwC0D0
logical name: /dev/snd/pcmC0D10p
logical name: /dev/snd/pcmC0D3p
logical name: /dev/snd/pcmC0D7p
logical name: /dev/snd/pcmC0D8p
logical name: /dev/snd/pcmC0D9p
version: 00
width: 32 bits
clock: 33MHz
capabilities: pm pciexpress msi bus_master cap_list
configuration: driver=snd_hda_intel latency=0
resources: irq:113 memory:fcc20000-fcc23fff
*-input:0
product: HDA ATI HDMI HDMI/DP,pcm=7
physical id: 0
logical name: input10
logical name: /dev/input/event12
*-input:1
product: HDA ATI HDMI HDMI/DP,pcm=8
physical id: 1
logical name: input11
logical name: /dev/input/event14
*-input:2
product: HDA ATI HDMI HDMI/DP,pcm=9
physical id: 2
logical name: input12
logical name: /dev/input/event16
*-input:3
product: HDA ATI HDMI HDMI/DP,pcm=10
physical id: 3
logical name: input13
logical name: /dev/input/event17
*-input:4
product: HDA ATI HDMI HDMI/DP,pcm=3
physical id: 4
logical name: input9
logical name: /dev/input/event11
*-pci:1
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 2.1
bus info: pci@0000:00:02.1
version: 00
width: 32 bits
clock: 33MHz
capabilities: pci pm pciexpress msi ht normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:28 ioport:e000(size=4096) memory:80000000-804fffff
*-pci
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0
bus info: pci@0000:04:00.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:e1e10-e1e0f irq:24 ioport:e000(size=4096) memory:80000000-804fffff
*-pci:0
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0
bus info: pci@0000:05:00.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:34
*-pci:1
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 4
bus info: pci@0000:05:04.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:35
*-pci:2
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 5
bus info: pci@0000:05:05.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:37
*-pci:3
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 6
bus info: pci@0000:05:06.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:39
*-pci:4
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 7
bus info: pci@0000:05:07.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:41
*-pci:5
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 8
bus info: pci@0000:05:08.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:42 ioport:e000(size=4096) memory:80000000-802fffff
*-pci
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0
bus info: pci@0000:0b:00.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:24 ioport:e000(size=4096) memory:80000000-802fffff
*-pci:0
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0
bus info: pci@0000:0c:00.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:1f10-1f0f irq:43
*-pci:1
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 1
bus info: pci@0000:0c:01.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:1f10-1f0f irq:44
*-pci:2
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 2
bus info: pci@0000:0c:02.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:1f10-1f0f irq:45
*-pci:3
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 3
bus info: pci@0000:0c:03.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:e1e10-e1e0f irq:46 ioport:e000(size=4096) memory:80000000-800fffff
*-network
description: Ethernet interface
product: RTL8125 2.5GbE Controller
vendor: Realtek Semiconductor Co., Ltd.
physical id: 0
bus info: pci@0000:10:00.0
logical name: enp16s0
version: 05
serial: 9c:6b:00:01:fb:80
capacity: 1Gbit/s
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress msix vpd bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=r8169 driverversion=6.7.0-060700-generic duplex=full firmware=rtl8125b-2_0.0.2 07/13/20 ip=192.168.178.20 latency=0 link=yes multicast=yes port=twisted pair
resources: irq:40 ioport:e000(size=256) memory:80000000-8000ffff memory:80010000-80013fff
*-pci:4
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 4
bus info: pci@0000:0c:04.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:1f10-1f0f irq:47
*-pci:5
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 8
bus info: pci@0000:0c:08.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:1f10-1f0f irq:48
*-pci:6
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: a
bus info: pci@0000:0c:0a.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:1f10-1f0f irq:49
*-pci:7
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: b
bus info: pci@0000:0c:0b.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:1f10-1f0f irq:50
*-pci:8
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: c
bus info: pci@0000:0c:0c.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:f00-eff irq:24 memory:80100000-801fffff
*-usb
description: USB controller
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0
bus info: pci@0000:15:00.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: msi msix pm pciexpress xhci bus_master cap_list
configuration: driver=xhci_hcd latency=0
resources: irq:24 memory:80100000-80107fff
*-usbhost:0
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 0
bus info: usb@1
logical name: usb1
version: 6.07
capabilities: usb-2.00
configuration: driver=hub slots=12 speed=480Mbit/s
*-usb:0
description: Mouse
product: USB Optical Mouse
vendor: Logitech
physical id: a
bus info: usb@1:a
version: 72.00
capabilities: usb-2.00
configuration: driver=usbhid maxpower=100mA speed=2Mbit/s
*-usb:1
description: Keyboard
product: USB Keyboard
vendor: CHICONY
physical id: c
bus info: usb@1:c
version: 2.30
capabilities: usb-2.00
configuration: driver=usbhid maxpower=100mA speed=2Mbit/s
*-usbhost:1
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 1
bus info: usb@2
logical name: usb2
version: 6.07
capabilities: usb-3.10
configuration: driver=hub slots=5 speed=10000Mbit/s
*-pci:9
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: d
bus info: pci@0000:0c:0d.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:f00-eff irq:36 memory:80200000-802fffff
*-sata
description: SATA controller
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0
bus info: pci@0000:16:00.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: sata msi pm pciexpress ahci_1.0 bus_master cap_list rom
configuration: driver=ahci latency=0
resources: irq:51 memory:80280000-802803ff memory:80200000-8027ffff
*-pci:6
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: c
bus info: pci@0000:05:0c.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:24 memory:80300000-803fffff
*-usb
description: USB controller
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0
bus info: pci@0000:17:00.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: msi msix pm pciexpress xhci bus_master cap_list
configuration: driver=xhci_hcd latency=0
resources: irq:24 memory:80300000-80307fff
*-usbhost:0
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 0
bus info: usb@3
logical name: usb3
version: 6.07
capabilities: usb-2.00
configuration: driver=hub slots=12 speed=480Mbit/s
*-usbhost:1
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 1
bus info: usb@4
logical name: usb4
version: 6.07
capabilities: usb-3.10
configuration: driver=hub slots=5 speed=10000Mbit/s
*-pci:7
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: d
bus info: pci@0000:05:0d.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pci msi pm pciexpress normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:36 memory:80400000-804fffff
*-sata
description: SATA controller
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0
bus info: pci@0000:18:00.0
logical name: scsi6
version: 01
width: 32 bits
clock: 33MHz
capabilities: sata msi pm pciexpress ahci_1.0 bus_master cap_list rom emulated
configuration: driver=ahci latency=0
resources: irq:54 memory:80480000-804803ff memory:80400000-8047ffff
*-disk
description: ATA Disk
product: ST2000DM008-2UB1
physical id: 0.0.0
bus info: scsi@6:0.0.0
logical name: /dev/sda
version: 0001
serial: ZK30FG74
size: 1863GiB (2TB)
capabilities: gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=5 guid=29f39dc2-d8c3-d545-8b64-faffe5410d9e logicalsectorsize=512 sectorsize=4096
*-volume:0
description: swap partition
vendor: Linux
physical id: 1
bus info: scsi@6:0.0.0,1
logical name: /dev/sda1
serial: 872163a6-51e8-a549-94e2-916ea5259fd5
capacity: 95GiB
capabilities: nofs
*-volume:1
description: EFI partition
physical id: 2
bus info: scsi@6:0.0.0,2
logical name: /dev/sda2
logical name: /cache
serial: 590bafdc-12b8-9443-aa0c-dc876bb12230
capacity: 511GiB
configuration: mount.fstype=btrfs mount.options=rw,relatime,discard=async,space_cache=v2,subvolid=5,subvol=/ state=mounted
*-volume:2
description: EFI partition
physical id: 3
bus info: scsi@6:0.0.0,3
logical name: /dev/sda3
logical name: /archive
serial: c6013753-d6e9-ba41-aa54-afb3ea3069cf
capacity: 1255GiB
configuration: mount.fstype=btrfs mount.options=rw,relatime,discard=async,space_cache=v2,subvolid=5,subvol=/ state=mounted
*-pci:2
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 2.2
bus info: pci@0000:00:02.2
version: 00
width: 32 bits
clock: 33MHz
capabilities: pci pm pciexpress msi ht normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:29 memory:fcf00000-fcffffff
*-nvme
description: NVMe device
product: Samsung SSD 980 1TB
vendor: Samsung Electronics Co Ltd
physical id: 0
bus info: pci@0000:19:00.0
logical name: /dev/nvme0
version: 3B4QFXO7
serial: S649NL0TC79124F
width: 64 bits
clock: 33MHz
capabilities: nvme pm msi pciexpress msix nvm_express bus_master cap_list
configuration: driver=nvme latency=0 nqn=nqn.1994-11.com.samsung:nvme:980M.2:S649NL0TC79124F state=live
resources: irq:24 memory:fcf00000-fcf03fff
*-namespace:0
description: NVMe disk
physical id: 0
logical name: hwmon0
*-namespace:1
description: NVMe disk
physical id: 2
logical name: /dev/ng0n1
*-namespace:2
description: NVMe disk
physical id: 1
bus info: nvme@0:1
logical name: /dev/nvme0n1
size: 931GiB (1TB)
capabilities: gpt-1.00 partitioned partitioned:gpt
configuration: guid=8ffb1787-5e1a-4fdd-abd3-80ef8eafec9a logicalsectorsize=512 sectorsize=512 wwid=eui.002538dc21a74668
*-volume:0
description: EXT4 volume
vendor: Linux
physical id: 1
bus info: nvme@0:1,1
logical name: /dev/nvme0n1p1
logical name: /boot
version: 1.0
serial: a4814207-8827-4b89-adcd-21899f72071b
size: 3905MiB
capabilities: journaled extended_attributes large_files huge_files dir_nlink recover 64bit extents ext4 ext2 initialized
configuration: created=2023-05-03 21:23:01 filesystem=ext4 lastmountpoint=/boot modified=2024-01-17 19:10:17 mount.fstype=ext4 mount.options=rw,relatime,stripe=32 mounted=2024-01-17 19:10:17 state=mounted
*-volume:1
description: Linux swap volume
vendor: Linux
physical id: 2
bus info: nvme@0:1,2
logical name: /dev/nvme0n1p2
version: 1
serial: 47e2238b-aa16-47c7-b96e-275b16dfc265
size: 30GiB
capacity: 30GiB
capabilities: nofs swap initialized
configuration: filesystem=swap pagesize=4095
*-volume:2
description: EFI partition
physical id: 3
bus info: nvme@0:1,3
logical name: /dev/nvme0n1p3
logical name: /usr
serial: d18d3275-7ca3-4db8-863a-f6a36fb63304
capacity: 30GiB
configuration: mount.fstype=btrfs mount.options=rw,relatime,ssd,discard=async,space_cache=v2,subvolid=5,subvol=/ state=mounted
*-volume:3
description: EFI partition
physical id: 4
bus info: nvme@0:1,4
logical name: /dev/nvme0n1p4
logical name: /usr/local
serial: a7018558-b67d-4807-bb5e-f65af509c34f
capacity: 30GiB
configuration: mount.fstype=btrfs mount.options=rw,relatime,ssd,discard=async,space_cache=v2,subvolid=5,subvol=/ state=mounted
*-volume:4
description: EXT4 volume
vendor: Linux
physical id: 5
bus info: nvme@0:1,5
logical name: /dev/nvme0n1p5
logical name: /var
version: 1.0
serial: 8f6cf2e5-aa47-49bc-b50f-fa8023306013
size: 15GiB
capabilities: journaled extended_attributes large_files huge_files dir_nlink recover 64bit extents ext4 ext2 initialized
configuration: created=2023-05-03 21:23:01 filesystem=ext4 lastmountpoint=/var modified=2024-01-17 19:13:04 mount.fstype=ext4 mount.options=rw,relatime,stripe=32 mounted=2024-01-17 19:13:04 state=mounted
*-volume:5
description: EXT4 volume
vendor: Linux
physical id: 6
bus info: nvme@0:1,6
logical name: /dev/nvme0n1p6
version: 1.0
serial: 1beece94-f1c3-4a45-afed-d33c35c04617
size: 15GiB
capabilities: journaled extended_attributes large_files huge_files dir_nlink 64bit extents ext4 ext2 initialized
configuration: created=2023-05-03 21:23:01 filesystem=ext4 lastmountpoint=/tmp modified=2023-09-30 14:49:13 mounted=2023-05-24 20:38:46 state=clean
*-volume:6
description: EFI partition
physical id: 7
bus info: nvme@0:1,7
logical name: /dev/nvme0n1p7
logical name: /home
serial: 39002ad8-36a0-4a01-97d4-7d8cbd94b8e4
capacity: 122GiB
configuration: mount.fstype=btrfs mount.options=rw,relatime,ssd,discard=async,space_cache=v2,subvolid=256,subvol=/@home state=mounted
*-volume:7
description: EFI partition
physical id: 8
bus info: nvme@0:1,8
logical name: /dev/nvme0n1p8
logical name: /
serial: d7bccd5b-1070-40a4-824c-7f33e4ff440e
capacity: 7812MiB
configuration: mount.fstype=btrfs mount.options=rw,relatime,ssd,discard=async,space_cache=v2,subvolid=256,subvol=/@ state=mounted
*-volume:8
description: EFI partition
physical id: 9
bus info: nvme@0:1,9
logical name: /dev/nvme0n1p9
logical name: /srv
serial: a87e71fd-11e9-4ef3-a1b7-a4966f65ecc9
capacity: 30GiB
configuration: mount.fstype=btrfs mount.options=rw,relatime,ssd,discard=async,space_cache=v2,subvolid=5,subvol=/ state=mounted
*-volume:9
description: EFI partition
physical id: a
bus info: nvme@0:1,10
logical name: /dev/nvme0n1p10
logical name: /opt
serial: ffcf44ce-12e0-42e7-9c50-01f2aa257914
capacity: 30GiB
configuration: mount.fstype=btrfs mount.options=rw,relatime,ssd,discard=async,space_cache=v2,subvolid=5,subvol=/ state=mounted
*-volume:10
description: EXT4 volume
vendor: Linux
physical id: b
bus info: nvme@0:1,11
logical name: /dev/nvme0n1p11
logical name: /var/log
version: 1.0
serial: bb53eb49-b161-4879-82c2-ab28079074f0
size: 30GiB
capabilities: journaled extended_attributes large_files huge_files dir_nlink recover 64bit extents ext4 ext2 initialized
configuration: created=2023-05-03 21:23:01 filesystem=ext4 lastmountpoint=/var/log modified=2024-01-17 19:13:04 mount.fstype=ext4 mount.options=rw,relatime,stripe=32 mounted=2024-01-17 19:13:04 state=mounted
*-volume:11
description: Windows FAT volume
vendor: mkfs.fat
physical id: c
bus info: nvme@0:1,12
logical name: /dev/nvme0n1p12
logical name: /boot/efi
version: FAT32
serial: 4951-9e27
size: 963MiB
capacity: 976MiB
capabilities: boot fat initialized
configuration: FATs=2 filesystem=fat mount.fstype=vfat mount.options=rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro state=mounted
*-volume:12
description: EFI partition
physical id: d
bus info: nvme@0:1,13
logical name: /dev/nvme0n1p13
serial: ed27c682-c896-4248-b923-c42cb7dc3fd7
capacity: 99GiB
*-pci:3
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 8.1
bus info: pci@0000:00:08.1
version: 00
width: 32 bits
clock: 33MHz
capabilities: pci pm pciexpress msi normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:30 ioport:d000(size=4096) memory:fc700000-fcafffff ioport:fc20000000(size=270532608)
*-display
description: VGA compatible controller
product: Advanced Micro Devices, Inc. [AMD/ATI]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:1a:00.0
version: c1
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi msix vga_controller bus_master cap_list
configuration: driver=amdgpu latency=0
resources: iomemory:fc0-fbf iomemory:fc0-fbf irq:92 memory:fc20000000-fc2fffffff memory:fc30000000-fc301fffff ioport:d000(size=256) memory:fca00000-fca7ffff
*-multimedia:0
description: Audio device
product: Advanced Micro Devices, Inc. [AMD/ATI]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0.1
bus info: pci@0000:1a:00.1
logical name: card1
logical name: /dev/snd/controlC1
logical name: /dev/snd/hwC1D0
logical name: /dev/snd/pcmC1D3p
logical name: /dev/snd/pcmC1D7p
logical name: /dev/snd/pcmC1D8p
logical name: /dev/snd/pcmC1D9p
version: 00
width: 32 bits
clock: 33MHz
capabilities: pm pciexpress msi bus_master cap_list
configuration: driver=snd_hda_intel latency=0
resources: irq:115 memory:fca88000-fca8bfff
*-input:0
product: HD-Audio Generic HDMI/DP,pcm=3
physical id: 0
logical name: input14
logical name: /dev/input/event9
*-input:1
product: HD-Audio Generic HDMI/DP,pcm=7
physical id: 1
logical name: input15
logical name: /dev/input/event10
*-input:2
product: HD-Audio Generic HDMI/DP,pcm=8
physical id: 2
logical name: input16
logical name: /dev/input/event13
*-input:3
product: HD-Audio Generic HDMI/DP,pcm=9
physical id: 3
logical name: input17
logical name: /dev/input/event15
*-generic
description: Encryption controller
product: VanGogh PSP/CCP
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0.2
bus info: pci@0000:1a:00.2
version: 00
width: 32 bits
clock: 33MHz
capabilities: pm pciexpress msi msix bus_master cap_list
configuration: driver=ccp latency=0
resources: irq:109 memory:fc900000-fc9fffff memory:fca8c000-fca8dfff
*-usb:0
description: USB controller
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0.3
bus info: pci@0000:1a:00.3
version: 00
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi msix xhci cap_list
configuration: driver=xhci_hcd latency=0
resources: irq:83 memory:fc800000-fc8fffff
*-usbhost:0
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 0
bus info: usb@5
logical name: usb5
version: 6.07
capabilities: usb-2.00
configuration: driver=hub slots=2 speed=480Mbit/s
*-usbhost:1
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 1
bus info: usb@6
logical name: usb6
version: 6.07
capabilities: usb-3.10
configuration: driver=hub slots=2 speed=10000Mbit/s
*-usb:1
description: USB controller
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0.4
bus info: pci@0000:1a:00.4
version: 00
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi msix xhci cap_list
configuration: driver=xhci_hcd latency=0
resources: irq:92 memory:fc700000-fc7fffff
*-usbhost:0
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 0
bus info: usb@7
logical name: usb7
version: 6.07
capabilities: usb-2.00
configuration: driver=hub slots=2 speed=480Mbit/s
*-usbhost:1
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 1
bus info: usb@8
logical name: usb8
version: 6.07
capabilities: usb-3.10
configuration: driver=hub slots=2 speed=10000Mbit/s
*-multimedia:1
description: Audio device
product: Family 17h (Models 10h-1fh) HD Audio Controller
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0.6
bus info: pci@0000:1a:00.6
logical name: card2
logical name: /dev/snd/controlC2
logical name: /dev/snd/hwC2D0
logical name: /dev/snd/pcmC2D0c
logical name: /dev/snd/pcmC2D0p
logical name: /dev/snd/pcmC2D2c
version: 00
width: 32 bits
clock: 33MHz
capabilities: pm pciexpress msi bus_master cap_list
configuration: driver=snd_hda_intel latency=0
resources: irq:116 memory:fca80000-fca87fff
*-input:0
product: HD-Audio Generic Front Mic
physical id: 0
logical name: input18
logical name: /dev/input/event18
*-input:1
product: HD-Audio Generic Rear Mic
physical id: 1
logical name: input19
logical name: /dev/input/event19
*-input:2
product: HD-Audio Generic Line
physical id: 2
logical name: input20
logical name: /dev/input/event20
*-input:3
product: HD-Audio Generic Line Out
physical id: 3
logical name: input21
logical name: /dev/input/event21
*-input:4
product: HD-Audio Generic Front Headphone
physical id: 4
logical name: input22
logical name: /dev/input/event22
*-pci:4
description: PCI bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 8.3
bus info: pci@0000:00:08.3
version: 00
width: 32 bits
clock: 33MHz
capabilities: pci pm pciexpress msi normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:31 memory:fce00000-fcefffff
*-usb
description: USB controller
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 0
bus info: pci@0000:1b:00.0
version: 00
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi msix xhci bus_master cap_list
configuration: driver=xhci_hcd latency=0
resources: irq:24 memory:fce00000-fcefffff
*-usbhost:0
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 0
bus info: usb@9
logical name: usb9
version: 6.07
capabilities: usb-2.00
configuration: driver=hub slots=1 speed=480Mbit/s
*-usb
description: Human interface device
product: ASRock LED Controller
vendor: ASRock
physical id: 1
bus info: usb@9:1
logical name: input3
logical name: /dev/input/event3
logical name: /dev/input/js0
version: 0.00
serial: A02019100900
capabilities: usb-1.10 usb
configuration: driver=usbhid maxpower=100mA speed=12Mbit/s
*-usbhost:1
product: xHCI Host Controller
vendor: Linux 6.7.0-060700-generic xhci-hcd
physical id: 1
bus info: usb@10
logical name: usb10
version: 6.07
capabilities: usb-3.00
configuration: speed=5000Mbit/s
*-serial
description: SMBus
product: FCH SMBus Controller
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 14
bus info: pci@0000:00:14.0
version: 71
width: 32 bits
clock: 66MHz
configuration: driver=piix4_smbus latency=0
resources: irq:0
*-isa
description: ISA bridge
product: FCH LPC Bridge
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 14.3
bus info: pci@0000:00:14.3
version: 51
width: 32 bits
clock: 66MHz
capabilities: isa bus_master
configuration: latency=0
*-pnp00:00
product: PnP device PNP0c01
physical id: 0
capabilities: pnp
configuration: driver=system
*-pnp00:01
product: PnP device PNP0c02
physical id: 1
capabilities: pnp
configuration: driver=system
*-pnp00:02
product: PnP device PNP0c02
physical id: 2
capabilities: pnp
configuration: driver=system
*-pnp00:03
product: PnP device PNP0b00
physical id: 3
capabilities: pnp
configuration: driver=rtc_cmos
*-pnp00:04
product: PnP device PNP0c02
physical id: 4
capabilities: pnp
configuration: driver=system
*-pnp00:05
product: PnP device PNP0c02
physical id: 5
capabilities: pnp
configuration: driver=system
*-pci:1
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 101
bus info: pci@0000:00:01.0
version: 00
width: 32 bits
clock: 33MHz
*-pci:2
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 102
bus info: pci@0000:00:02.0
version: 00
width: 32 bits
clock: 33MHz
*-pci:3
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 103
bus info: pci@0000:00:03.0
version: 00
width: 32 bits
clock: 33MHz
*-pci:4
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 104
bus info: pci@0000:00:04.0
version: 00
width: 32 bits
clock: 33MHz
*-pci:5
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 105
bus info: pci@0000:00:08.0
version: 00
width: 32 bits
clock: 33MHz
*-pci:6
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 106
bus info: pci@0000:00:18.0
version: 00
width: 32 bits
clock: 33MHz
*-pci:7
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 107
bus info: pci@0000:00:18.1
version: 00
width: 32 bits
clock: 33MHz
*-pci:8
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 108
bus info: pci@0000:00:18.2
version: 00
width: 32 bits
clock: 33MHz
*-pci:9
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 109
bus info: pci@0000:00:18.3
version: 00
width: 32 bits
clock: 33MHz
configuration: driver=k10temp
resources: irq:0
*-pci:10
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 10a
bus info: pci@0000:00:18.4
version: 00
width: 32 bits
clock: 33MHz
*-pci:11
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 10b
bus info: pci@0000:00:18.5
version: 00
width: 32 bits
clock: 33MHz
*-pci:12
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 10c
bus info: pci@0000:00:18.6
version: 00
width: 32 bits
clock: 33MHz
*-pci:13
description: Host bridge
product: Advanced Micro Devices, Inc. [AMD]
vendor: Advanced Micro Devices, Inc. [AMD]
physical id: 10d
bus info: pci@0000:00:18.7
version: 00
width: 32 bits
clock: 33MHz
*-input:0
product: Power Button
physical id: 1
logical name: input0
logical name: /dev/input/event0
capabilities: platform
*-input:1
product: Power Button
physical id: 2
logical name: input1
logical name: /dev/input/event1
capabilities: platform
*-input:2
product: Video Bus
physical id: 3
logical name: input2
logical name: /dev/input/event2
capabilities: platform
*-input:3
product: Logitech USB Optical Mouse
physical id: 4
logical name: input4
logical name: /dev/input/event4
logical name: /dev/input/mouse0
capabilities: usb
*-input:4
product: CHICONY USB Keyboard
physical id: 5
logical name: input5
logical name: /dev/input/event5
logical name: input5::capslock
logical name: input5::numlock
logical name: input5::scrolllock
capabilities: usb
*-input:5
product: CHICONY USB Keyboard System Control
physical id: 6
logical name: input6
logical name: /dev/input/event6
capabilities: usb
*-input:6
product: CHICONY USB Keyboard Consumer Control
physical id: 7
logical name: input7
logical name: /dev/input/event7
capabilities: usb
*-input:7
product: CHICONY USB Keyboard
physical id: 8
logical name: input8
logical name: /dev/input/event8
capabilities: usb
[-- Attachment #3: config-6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7.xz --]
[-- Type: application/x-xz, Size: 58728 bytes --]
* BUG [RESEND]: kernel NULL pointer dereference, address: 0000000000000008
2024-01-18 17:23 BUG: kernel NULL pointer dereference, address: 0000000000000008 Mirsad Todorovac
@ 2024-01-20 19:54 ` Mirsad Todorovac
2024-01-22 8:34 ` Ma, Jun
0 siblings, 1 reply; 8+ messages in thread
From: Mirsad Todorovac @ 2024-01-20 19:54 UTC
To: linux-kernel, amd-gfx
Cc: Sathishkumar S, Lijo Lazar, Srinivasan Shanmugam, Guchun Chen,
Lang Yu, Felix Kuehling, Pan, Xinhui, dri-devel,
Marek Olšák, Boyuan Zhang, Daniel Vetter, David Francis,
Alex Deucher, David Airlie, Christian König
Hi,
The previous email did not reach most of the recipients because of the banned .xz attachment.
As the .config is too big to send either inline or uncompressed, I will omit it in this
attempt. In the meantime, I had some success in decoding the stack trace, though sadly the
decode is not complete.
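As a rough illustration of one part of the decode step, the sketch below locates the trapping
instruction in the "Code:" line of the Oops that follows and recomputes the faulting address.
It is not the kernel's scripts/decodecode; the helper names trapping_offset and fault_address
are hypothetical, and the address calculation assumes the specific encoding seen in this report
(REX.W MOV with an 8-bit displacement and no SIB byte, i.e. 48 8b 40 08).

```python
import re

# "Code:" line copied from the Oops in this report.
CODE_LINE = (
    "Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff "
    "85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 "
    "0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0"
)


def trapping_offset(code_line):
    """Return (offset, raw_bytes); offset indexes the byte marked <xx>."""
    tokens = code_line.split("Code:", 1)[-1].split()
    raw = bytearray()
    offset = None
    for tok in tokens:
        marked = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if marked:
            # The kernel brackets the first byte of the faulting instruction.
            offset = len(raw)
            raw.append(int(marked.group(1), 16))
        else:
            raw.append(int(tok, 16))
    if offset is None:
        raise ValueError("no <xx> marker found in Code: line")
    return offset, bytes(raw)


def fault_address(base_reg, raw, offset):
    """Recompute base + disp8 for a 'mov disp8(%reg),%reg' load.

    Assumption: the instruction is REX.W (0x48), opcode 0x8b, ModRM with
    mod=01 (8-bit displacement) and no SIB byte. Anything else is rejected.
    """
    modrm = raw[offset + 2]
    if raw[offset:offset + 2] != b"\x48\x8b" or modrm >> 6 != 0b01 or modrm & 7 == 4:
        raise ValueError("unsupported instruction encoding for this sketch")
    disp = raw[offset + 3]
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return base_reg + disp


if __name__ == "__main__":
    off, raw = trapping_offset(CODE_LINE)
    print(f"trapping instruction at offset {off:#x}")
    # RAX is 0 in the register dump of this Oops.
    print(f"faulting address: {fault_address(0x0, raw, off):#018x}")
```

The computed offset corresponds to the line marked "2a:*" in the disassembly, and with RAX = 0
the recomputed address agrees with the CR2 value in the register dump.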
I don't think this Oops is deterministic, but I am working on a reproducer.
The platform is Ubuntu 22.04 LTS.
The complete list of hardware and the .config are available here:
https://domac.alu.unizg.hr/~mtodorov/linux/bugreports/amdgpu/6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7/
Best regards,
Mirsad
-------------------------------------------------------------------------------------------
kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
kernel: [ 5.576707] #PF: supervisor read access in kernel mode
kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
kernel: [ 5.576712] PGD 0 P4D 0
kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
All code
========
0: 8d 55 a8 lea -0x58(%rbp),%edx
3: 4c 89 ff mov %r15,%rdi
6: e8 e4 83 ec ff call 0xffffffffffec83ef
b: 41 89 c2 mov %eax,%r10d
e: 83 f8 ed cmp $0xffffffed,%eax
11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
17: 85 c0 test %eax,%eax
19: 74 05 je 0x20
1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
27: 4c 89 ff mov %r15,%rdi
2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
32: 0f b7 70 08 movzwl 0x8(%rax),%esi
36: e8 e4 42 fb ff call 0xfffffffffffb431f
3b: 41 89 c2 mov %eax,%r10d
3e: 85 c0 test %eax,%eax
Code starting with the faulting instruction
===========================================
0: 48 8b 40 08 mov 0x8(%rax),%rax
4: 0f b7 50 0a movzwl 0xa(%rax),%edx
8: 0f b7 70 08 movzwl 0x8(%rax),%esi
c: e8 e4 42 fb ff call 0xfffffffffffb42f5
11: 41 89 c2 mov %eax,%r10d
14: 85 c0 test %eax,%eax
kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
kernel: [ 5.576903] PKRU: 55555554
kernel: [ 5.576905] Call Trace:
kernel: [ 5.576907] <TASK>
kernel: [ 5.576909] ? show_regs (arch/x86/kernel/dumpstack.c:479)
kernel: [ 5.576914] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
kernel: [ 5.576917] ? page_fault_oops (arch/x86/mm/fault.c:707)
kernel: [ 5.576921] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0 (crypto/api.c:497)
kernel: [ 5.576930] ? do_user_addr_fault (arch/x86/mm/fault.c:1264)
kernel: [ 5.576934] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1515 arch/x86/mm/fault.c:1563)
kernel: [ 5.576937] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
kernel: [ 5.576942] ? gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
kernel: [ 5.577056] amdgpu_device_init (drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2465 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4042) amdgpu
kernel: [ 5.577158] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577161] ? pci_bus_read_config_word (drivers/pci/access.c:67 (discriminator 2))
kernel: [ 5.577166] ? pci_read_config_word (drivers/pci/access.c:563)
kernel: [ 5.577168] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577171] ? do_pci_enable_device (drivers/pci/pci.c:1975 drivers/pci/pci.c:1949)
kernel: [ 5.577176] amdgpu_driver_load_kms (drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:146) amdgpu
kernel: [ 5.577275] amdgpu_pci_probe (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2237) amdgpu
kernel: [ 5.577373] local_pci_probe (drivers/pci/pci-driver.c:324)
kernel: [ 5.577377] pci_device_probe (drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
kernel: [ 5.577381] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
kernel: [ 5.577386] __driver_probe_device (drivers/base/dd.c:800)
kernel: [ 5.577389] driver_probe_device (drivers/base/dd.c:830)
kernel: [ 5.577392] __driver_attach (drivers/base/dd.c:1217)
kernel: [ 5.577396] ? __pfx___driver_attach (drivers/base/dd.c:1157)
kernel: [ 5.577399] bus_for_each_dev (drivers/base/bus.c:368)
kernel: [ 5.577402] driver_attach (drivers/base/dd.c:1234)
kernel: [ 5.577405] bus_add_driver (drivers/base/bus.c:674)
kernel: [ 5.577409] driver_register (drivers/base/driver.c:246)
kernel: [ 5.577411] ? __pfx_amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2497) amdgpu
kernel: [ 5.577521] __pci_register_driver (drivers/pci/pci-driver.c:1456)
kernel: [ 5.577524] amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2805) amdgpu
kernel: [ 5.577628] do_one_initcall (init/main.c:1236)
kernel: [ 5.577632] ? kmalloc_trace (mm/slub.c:3816 mm/slub.c:3860 mm/slub.c:4007)
kernel: [ 5.577637] do_init_module (kernel/module/main.c:2533)
kernel: [ 5.577640] load_module (kernel/module/main.c:2984)
kernel: [ 5.577647] init_module_from_file (kernel/module/main.c:3151)
kernel: [ 5.577649] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577652] ? init_module_from_file (kernel/module/main.c:3151)
kernel: [ 5.577657] idempotent_init_module (kernel/module/main.c:3168)
kernel: [ 5.577661] __x64_sys_finit_module (./include/linux/file.h:45 kernel/module/main.c:3190 kernel/module/main.c:3172 kernel/module/main.c:3172)
kernel: [ 5.577664] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
kernel: [ 5.577668] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577671] ? ksys_mmap_pgoff (mm/mmap.c:1428)
kernel: [ 5.577675] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577678] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577681] ? syscall_exit_to_user_mode (kernel/entry/common.c:215)
kernel: [ 5.577684] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577687] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
kernel: [ 5.577689] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577692] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
kernel: [ 5.577695] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577698] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
kernel: [ 5.577700] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
kernel: [ 5.577703] ? sysvec_call_function (arch/x86/kernel/smp.c:253 (discriminator 69))
kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
All code
========
0: 5b pop %rbx
1: 41 5c pop %r12
3: c3 ret
4: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
b: 00 00
d: f3 0f 1e fa endbr64
11: 48 89 f8 mov %rdi,%rax
14: 48 89 f7 mov %rsi,%rdi
17: 48 89 d6 mov %rdx,%rsi
1a: 48 89 ca mov %rcx,%rdx
1d: 4d 89 c2 mov %r8,%r10
20: 4d 89 c8 mov %r9,%r8
23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
28: 0f 05 syscall
2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
30: 73 01 jae 0x33
32: c3 ret
33: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb5ad
3a: f7 d8 neg %eax
3c: 64 89 01 mov %eax,%fs:(%rcx)
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
6: 73 01 jae 0x9
8: c3 ret
9: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb583
10: f7 d8 neg %eax
12: 64 89 01 mov %eax,%fs:(%rcx)
15: 48 rex.W
kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
kernel: [ 5.577748] </TASK>
kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
kernel: [ 5.577817] CR2: 0000000000000008
kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
All code
========
0: 8d 55 a8 lea -0x58(%rbp),%edx
3: 4c 89 ff mov %r15,%rdi
6: e8 e4 83 ec ff call 0xffffffffffec83ef
b: 41 89 c2 mov %eax,%r10d
e: 83 f8 ed cmp $0xffffffed,%eax
11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
17: 85 c0 test %eax,%eax
19: 74 05 je 0x20
1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
27: 4c 89 ff mov %r15,%rdi
2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
32: 0f b7 70 08 movzwl 0x8(%rax),%esi
36: e8 e4 42 fb ff call 0xfffffffffffb431f
3b: 41 89 c2 mov %eax,%r10d
3e: 85 c0 test %eax,%eax
Code starting with the faulting instruction
===========================================
0: 48 8b 40 08 mov 0x8(%rax),%rax
4: 0f b7 50 0a movzwl 0xa(%rax),%edx
8: 0f b7 70 08 movzwl 0x8(%rax),%esi
c: e8 e4 42 fb ff call 0xfffffffffffb42f5
11: 41 89 c2 mov %eax,%r10d
14: 85 c0 test %eax,%eax
rsyslogd: rsyslogd's groupid changed to 111
kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
kernel: [ 5.914419] PKRU: 55555554
Best regards,
Mirsad
On 1/18/24 18:23, Mirsad Todorovac wrote:
> Hi,
>
> Unfortunately, I was not able to reboot in this kernel again to do the stack decode, but I thought
> that any information about the NULL pointer dereference is better than no info.
>
> The system is Ubuntu 23.10 Mantic with AMD product: Navi 23 [Radeon RX 6600/6600 XT/6600M]
> graphic card.
>
> Please find the config and the hw listing attached.
>
> Best regards,
> Mirsad
> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
> kernel: [ 5.576712] PGD 0 P4D 0
> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
> kernel: [ 5.576903] PKRU: 55555554
> kernel: [ 5.576905] Call Trace:
> kernel: [ 5.576907] <TASK>
> kernel: [ 5.576909] ? show_regs+0x72/0x90
> kernel: [ 5.576914] ? __die+0x25/0x80
> kernel: [ 5.576917] ? page_fault_oops+0x154/0x4c0
> kernel: [ 5.576921] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0+0x35/0x70
> kernel: [ 5.576930] ? do_user_addr_fault+0x30e/0x6e0
> kernel: [ 5.576934] ? exc_page_fault+0x84/0x1b0
> kernel: [ 5.576937] ? asm_exc_page_fault+0x27/0x30
> kernel: [ 5.576942] ? gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
> kernel: [ 5.577056] amdgpu_device_init+0xefa/0x2de0 [amdgpu]
> kernel: [ 5.577158] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577161] ? pci_bus_read_config_word+0x47/0x90
> kernel: [ 5.577166] ? pci_read_config_word+0x27/0x60
> kernel: [ 5.577168] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577171] ? do_pci_enable_device+0xe1/0x110
> kernel: [ 5.577176] amdgpu_driver_load_kms+0x1a/0x1c0 [amdgpu]
> kernel: [ 5.577275] amdgpu_pci_probe+0x1a8/0x5e0 [amdgpu]
> kernel: [ 5.577373] local_pci_probe+0x48/0xb0
> kernel: [ 5.577377] pci_device_probe+0xc8/0x290
> kernel: [ 5.577381] really_probe+0x1d2/0x440
> kernel: [ 5.577386] __driver_probe_device+0x8a/0x190
> kernel: [ 5.577389] driver_probe_device+0x23/0xd0
> kernel: [ 5.577392] __driver_attach+0x10f/0x220
> kernel: [ 5.577396] ? __pfx___driver_attach+0x10/0x10
> kernel: [ 5.577399] bus_for_each_dev+0x7a/0xe0
> kernel: [ 5.577402] driver_attach+0x1e/0x30
> kernel: [ 5.577405] bus_add_driver+0x127/0x240
> kernel: [ 5.577409] driver_register+0x64/0x140
> kernel: [ 5.577411] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
> kernel: [ 5.577521] __pci_register_driver+0x68/0x80
> kernel: [ 5.577524] amdgpu_init+0x69/0xff0 [amdgpu]
> kernel: [ 5.577628] do_one_initcall+0x46/0x330
> kernel: [ 5.577632] ? kmalloc_trace+0x136/0x370
> kernel: [ 5.577637] do_init_module+0x6a/0x280
> kernel: [ 5.577640] load_module+0x2419/0x2500
> kernel: [ 5.577647] init_module_from_file+0x9c/0xf0
> kernel: [ 5.577649] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577652] ? init_module_from_file+0x9c/0xf0
> kernel: [ 5.577657] idempotent_init_module+0x184/0x240
> kernel: [ 5.577661] __x64_sys_finit_module+0x64/0xd0
> kernel: [ 5.577664] do_syscall_64+0x76/0x140
> kernel: [ 5.577668] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577671] ? ksys_mmap_pgoff+0x123/0x270
> kernel: [ 5.577675] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577678] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577681] ? syscall_exit_to_user_mode+0x97/0x1e0
> kernel: [ 5.577684] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577687] ? do_syscall_64+0x85/0x140
> kernel: [ 5.577689] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577692] ? do_syscall_64+0x85/0x140
> kernel: [ 5.577695] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577698] ? do_syscall_64+0x85/0x140
> kernel: [ 5.577700] ? srso_alias_return_thunk+0x5/0xfbef5
> kernel: [ 5.577703] ? sysvec_call_function+0x4e/0xb0
> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe+0x6e/0x76
> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
> kernel: [ 5.577748] </TASK>
> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
> kernel: [ 5.577817] CR2: 0000000000000008
> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
> rsyslogd: rsyslogd's groupid changed to 111
> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
> kernel: [ 5.914419] PKRU: 55555554
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG [RESEND]: kernel NULL pointer dereference, address: 0000000000000008
2024-01-20 19:54 ` BUG [RESEND]: " Mirsad Todorovac
@ 2024-01-22 8:34 ` Ma, Jun
2024-01-22 22:39 ` Mirsad Todorovac
2024-01-24 17:48 ` BUG [RESEND][NEW BUG]: " Mirsad Todorovac
0 siblings, 2 replies; 8+ messages in thread
From: Ma, Jun @ 2024-01-22 8:34 UTC (permalink / raw
To: Mirsad Todorovac, linux-kernel, amd-gfx
Cc: Sathishkumar S, Pan, Xinhui, Srinivasan Shanmugam, Guchun Chen,
David Airlie, Felix Kuehling, Lijo Lazar, dri-devel,
Christian König, Boyuan Zhang, Daniel Vetter, David Francis,
Alex Deucher, Lang Yu, Marek Olšák
Perhaps similar to the problem I encountered earlier, you can
try the following patch
https://lists.freedesktop.org/archives/amd-gfx/2024-January/103259.html
Regards,
Ma Jun
On 1/21/2024 3:54 AM, Mirsad Todorovac wrote:
> Hi,
>
> The last email did not pass to the most of the recipients due to banned .xz attachment.
>
> As the .config is too big to send inline or uncompressed either, I will omit it in this
> attempt. In the meantime, I had some success in decoding the stack trace, but sadly not
> complete.
>
> I don't think this Oops is deterministic, but I am working on a reproducer.
>
> The platform is Ubuntu 22.04 LTS.
>
> Complete list of hardware and .config is available here:
>
> https://domac.alu.unizg.hr/~mtodorov/linux/bugreports/amdgpu/6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7/
>
> Best regards,
> Mirsad
>
> -------------------------------------------------------------------------------------------
> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
> kernel: [ 5.576712] PGD 0 P4D 0
> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
> All code
> ========
> 0: 8d 55 a8 lea -0x58(%rbp),%edx
> 3: 4c 89 ff mov %r15,%rdi
> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
> b: 41 89 c2 mov %eax,%r10d
> e: 83 f8 ed cmp $0xffffffed,%eax
> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
> 17: 85 c0 test %eax,%eax
> 19: 74 05 je 0x20
> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
> 27: 4c 89 ff mov %r15,%rdi
> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
> 3b: 41 89 c2 mov %eax,%r10d
> 3e: 85 c0 test %eax,%eax
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 8b 40 08 mov 0x8(%rax),%rax
> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
> 11: 41 89 c2 mov %eax,%r10d
> 14: 85 c0 test %eax,%eax
> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
> kernel: [ 5.576903] PKRU: 55555554
> kernel: [ 5.576905] Call Trace:
> kernel: [ 5.576907] <TASK>
> kernel: [ 5.576909] ? show_regs (arch/x86/kernel/dumpstack.c:479)
> kernel: [ 5.576914] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
> kernel: [ 5.576917] ? page_fault_oops (arch/x86/mm/fault.c:707)
> kernel: [ 5.576921] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0 (crypto/api.c:497)
> kernel: [ 5.576930] ? do_user_addr_fault (arch/x86/mm/fault.c:1264)
> kernel: [ 5.576934] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1515 arch/x86/mm/fault.c:1563)
> kernel: [ 5.576937] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
> kernel: [ 5.576942] ? gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
> kernel: [ 5.577056] amdgpu_device_init (drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2465 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4042) amdgpu
> kernel: [ 5.577158] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577161] ? pci_bus_read_config_word (drivers/pci/access.c:67 (discriminator 2))
> kernel: [ 5.577166] ? pci_read_config_word (drivers/pci/access.c:563)
> kernel: [ 5.577168] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577171] ? do_pci_enable_device (drivers/pci/pci.c:1975 drivers/pci/pci.c:1949)
> kernel: [ 5.577176] amdgpu_driver_load_kms (drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:146) amdgpu
> kernel: [ 5.577275] amdgpu_pci_probe (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2237) amdgpu
> kernel: [ 5.577373] local_pci_probe (drivers/pci/pci-driver.c:324)
> kernel: [ 5.577377] pci_device_probe (drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
> kernel: [ 5.577381] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
> kernel: [ 5.577386] __driver_probe_device (drivers/base/dd.c:800)
> kernel: [ 5.577389] driver_probe_device (drivers/base/dd.c:830)
> kernel: [ 5.577392] __driver_attach (drivers/base/dd.c:1217)
> kernel: [ 5.577396] ? __pfx___driver_attach (drivers/base/dd.c:1157)
> kernel: [ 5.577399] bus_for_each_dev (drivers/base/bus.c:368)
> kernel: [ 5.577402] driver_attach (drivers/base/dd.c:1234)
> kernel: [ 5.577405] bus_add_driver (drivers/base/bus.c:674)
> kernel: [ 5.577409] driver_register (drivers/base/driver.c:246)
> kernel: [ 5.577411] ? __pfx_amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2497) amdgpu
> kernel: [ 5.577521] __pci_register_driver (drivers/pci/pci-driver.c:1456)
> kernel: [ 5.577524] amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2805) amdgpu
> kernel: [ 5.577628] do_one_initcall (init/main.c:1236)
> kernel: [ 5.577632] ? kmalloc_trace (mm/slub.c:3816 mm/slub.c:3860 mm/slub.c:4007)
> kernel: [ 5.577637] do_init_module (kernel/module/main.c:2533)
> kernel: [ 5.577640] load_module (kernel/module/main.c:2984)
> kernel: [ 5.577647] init_module_from_file (kernel/module/main.c:3151)
> kernel: [ 5.577649] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577652] ? init_module_from_file (kernel/module/main.c:3151)
> kernel: [ 5.577657] idempotent_init_module (kernel/module/main.c:3168)
> kernel: [ 5.577661] __x64_sys_finit_module (./include/linux/file.h:45 kernel/module/main.c:3190 kernel/module/main.c:3172 kernel/module/main.c:3172)
> kernel: [ 5.577664] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
> kernel: [ 5.577668] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577671] ? ksys_mmap_pgoff (mm/mmap.c:1428)
> kernel: [ 5.577675] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577678] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577681] ? syscall_exit_to_user_mode (kernel/entry/common.c:215)
> kernel: [ 5.577684] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577687] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
> kernel: [ 5.577689] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577692] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
> kernel: [ 5.577695] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577698] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
> kernel: [ 5.577700] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> kernel: [ 5.577703] ? sysvec_call_function (arch/x86/kernel/smp.c:253 (discriminator 69))
> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
> All code
> ========
> 0: 5b pop %rbx
> 1: 41 5c pop %r12
> 3: c3 ret
> 4: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
> b: 00 00
> d: f3 0f 1e fa endbr64
> 11: 48 89 f8 mov %rdi,%rax
> 14: 48 89 f7 mov %rsi,%rdi
> 17: 48 89 d6 mov %rdx,%rsi
> 1a: 48 89 ca mov %rcx,%rdx
> 1d: 4d 89 c2 mov %r8,%r10
> 20: 4d 89 c8 mov %r9,%r8
> 23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
> 28: 0f 05 syscall
> 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
> 30: 73 01 jae 0x33
> 32: c3 ret
> 33: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb5ad
> 3a: f7 d8 neg %eax
> 3c: 64 89 01 mov %eax,%fs:(%rcx)
> 3f: 48 rex.W
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
> 6: 73 01 jae 0x9
> 8: c3 ret
> 9: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb583
> 10: f7 d8 neg %eax
> 12: 64 89 01 mov %eax,%fs:(%rcx)
> 15: 48 rex.W
> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
> kernel: [ 5.577748] </TASK>
> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
> kernel: [ 5.577817] CR2: 0000000000000008
> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
> All code
> ========
> 0: 8d 55 a8 lea -0x58(%rbp),%edx
> 3: 4c 89 ff mov %r15,%rdi
> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
> b: 41 89 c2 mov %eax,%r10d
> e: 83 f8 ed cmp $0xffffffed,%eax
> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
> 17: 85 c0 test %eax,%eax
> 19: 74 05 je 0x20
> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
> 27: 4c 89 ff mov %r15,%rdi
> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
> 3b: 41 89 c2 mov %eax,%r10d
> 3e: 85 c0 test %eax,%eax
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 8b 40 08 mov 0x8(%rax),%rax
> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
> 11: 41 89 c2 mov %eax,%r10d
> 14: 85 c0 test %eax,%eax
> rsyslogd: rsyslogd's groupid changed to 111
> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
> kernel: [ 5.914419] PKRU: 55555554
>
> Best regards,
> Mirsad
>
> On 1/18/24 18:23, Mirsad Todorovac wrote:
>> Hi,
>>
>> Unfortunately, I was not able to reboot in this kernel again to do the stack decode, but I thought
>> that any information about the NULL pointer dereference is better than no info.
>>
>> The system is Ubuntu 23.10 Mantic with AMD product: Navi 23 [Radeon RX 6600/6600 XT/6600M]
>> graphic card.
>>
>> Please find the config and the hw listing attached.
>>
>> Best regards,
>> Mirsad
>
>
>
>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>> kernel: [ 5.576712] PGD 0 P4D 0
>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>> kernel: [ 5.576903] PKRU: 55555554
>> kernel: [ 5.576905] Call Trace:
>> kernel: [ 5.576907] <TASK>
>> kernel: [ 5.576909] ? show_regs+0x72/0x90
>> kernel: [ 5.576914] ? __die+0x25/0x80
>> kernel: [ 5.576917] ? page_fault_oops+0x154/0x4c0
>> kernel: [ 5.576921] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0+0x35/0x70
>> kernel: [ 5.576930] ? do_user_addr_fault+0x30e/0x6e0
>> kernel: [ 5.576934] ? exc_page_fault+0x84/0x1b0
>> kernel: [ 5.576937] ? asm_exc_page_fault+0x27/0x30
>> kernel: [ 5.576942] ? gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>> kernel: [ 5.577056] amdgpu_device_init+0xefa/0x2de0 [amdgpu]
>> kernel: [ 5.577158] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577161] ? pci_bus_read_config_word+0x47/0x90
>> kernel: [ 5.577166] ? pci_read_config_word+0x27/0x60
>> kernel: [ 5.577168] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577171] ? do_pci_enable_device+0xe1/0x110
>> kernel: [ 5.577176] amdgpu_driver_load_kms+0x1a/0x1c0 [amdgpu]
>> kernel: [ 5.577275] amdgpu_pci_probe+0x1a8/0x5e0 [amdgpu]
>> kernel: [ 5.577373] local_pci_probe+0x48/0xb0
>> kernel: [ 5.577377] pci_device_probe+0xc8/0x290
>> kernel: [ 5.577381] really_probe+0x1d2/0x440
>> kernel: [ 5.577386] __driver_probe_device+0x8a/0x190
>> kernel: [ 5.577389] driver_probe_device+0x23/0xd0
>> kernel: [ 5.577392] __driver_attach+0x10f/0x220
>> kernel: [ 5.577396] ? __pfx___driver_attach+0x10/0x10
>> kernel: [ 5.577399] bus_for_each_dev+0x7a/0xe0
>> kernel: [ 5.577402] driver_attach+0x1e/0x30
>> kernel: [ 5.577405] bus_add_driver+0x127/0x240
>> kernel: [ 5.577409] driver_register+0x64/0x140
>> kernel: [ 5.577411] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
>> kernel: [ 5.577521] __pci_register_driver+0x68/0x80
>> kernel: [ 5.577524] amdgpu_init+0x69/0xff0 [amdgpu]
>> kernel: [ 5.577628] do_one_initcall+0x46/0x330
>> kernel: [ 5.577632] ? kmalloc_trace+0x136/0x370
>> kernel: [ 5.577637] do_init_module+0x6a/0x280
>> kernel: [ 5.577640] load_module+0x2419/0x2500
>> kernel: [ 5.577647] init_module_from_file+0x9c/0xf0
>> kernel: [ 5.577649] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577652] ? init_module_from_file+0x9c/0xf0
>> kernel: [ 5.577657] idempotent_init_module+0x184/0x240
>> kernel: [ 5.577661] __x64_sys_finit_module+0x64/0xd0
>> kernel: [ 5.577664] do_syscall_64+0x76/0x140
>> kernel: [ 5.577668] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577671] ? ksys_mmap_pgoff+0x123/0x270
>> kernel: [ 5.577675] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577678] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577681] ? syscall_exit_to_user_mode+0x97/0x1e0
>> kernel: [ 5.577684] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577687] ? do_syscall_64+0x85/0x140
>> kernel: [ 5.577689] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577692] ? do_syscall_64+0x85/0x140
>> kernel: [ 5.577695] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577698] ? do_syscall_64+0x85/0x140
>> kernel: [ 5.577700] ? srso_alias_return_thunk+0x5/0xfbef5
>> kernel: [ 5.577703] ? sysvec_call_function+0x4e/0xb0
>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe+0x6e/0x76
>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>> kernel: [ 5.577748] </TASK>
>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>> kernel: [ 5.577817] CR2: 0000000000000008
>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>> rsyslogd: rsyslogd's groupid changed to 111
>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>> kernel: [ 5.914419] PKRU: 55555554
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG [RESEND]: kernel NULL pointer dereference, address: 0000000000000008
2024-01-22 8:34 ` Ma, Jun
@ 2024-01-22 22:39 ` Mirsad Todorovac
2024-01-24 17:48 ` BUG [RESEND][NEW BUG]: " Mirsad Todorovac
1 sibling, 0 replies; 8+ messages in thread
From: Mirsad Todorovac @ 2024-01-22 22:39 UTC (permalink / raw
To: Ma, Jun, linux-kernel, amd-gfx
Cc: Sathishkumar S, Pan, Xinhui, Srinivasan Shanmugam, Guchun Chen,
David Airlie, Felix Kuehling, Lijo Lazar, dri-devel,
Christian König, Boyuan Zhang, Daniel Vetter, David Francis,
Alex Deucher, Lang Yu, Marek Olšák
On 22. 01. 2024. 09:34, Ma, Jun wrote:
> Perhaps similar to the problem I encountered earlier, you can
> try the following patch
>
> https://lists.freedesktop.org/archives/amd-gfx/2024-January/103259.html
Apparently, this patch prevented the NULL dereference; it is no longer in the log.
However, there is another hang in the XWayland password entry dialog, and I have
not yet figured out what is wrong.
Best regards,
Mirsad
> Regards,
> Ma Jun
>
> On 1/21/2024 3:54 AM, Mirsad Todorovac wrote:
>> Hi,
>>
>> The last email did not pass to the most of the recipients due to banned .xz attachment.
>>
>> As the .config is too big to send inline or uncompressed either, I will omit it in this
>> attempt. In the meantime, I had some success in decoding the stack trace, but sadly not
>> complete.
>>
>> I don't think this Oops is deterministic, but I am working on a reproducer.
>>
>> The platform is Ubuntu 22.04 LTS.
>>
>> Complete list of hardware and .config is available here:
>>
>> https://domac.alu.unizg.hr/~mtodorov/linux/bugreports/amdgpu/6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7/
>>
>> Best regards,
>> Mirsad
>>
>> -------------------------------------------------------------------------------------------
>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>> kernel: [ 5.576712] PGD 0 P4D 0
>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>> All code
>> ========
>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>> 3: 4c 89 ff mov %r15,%rdi
>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>> b: 41 89 c2 mov %eax,%r10d
>> e: 83 f8 ed cmp $0xffffffed,%eax
>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>> 17: 85 c0 test %eax,%eax
>> 19: 74 05 je 0x20
>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>> 27: 4c 89 ff mov %r15,%rdi
>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>> 3b: 41 89 c2 mov %eax,%r10d
>> 3e: 85 c0 test %eax,%eax
>>
>> Code starting with the faulting instruction
>> ===========================================
>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>> 11: 41 89 c2 mov %eax,%r10d
>> 14: 85 c0 test %eax,%eax
>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>> kernel: [ 5.576903] PKRU: 55555554
>> kernel: [ 5.576905] Call Trace:
>> kernel: [ 5.576907] <TASK>
>> kernel: [ 5.576909] ? show_regs (arch/x86/kernel/dumpstack.c:479)
>> kernel: [ 5.576914] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
>> kernel: [ 5.576917] ? page_fault_oops (arch/x86/mm/fault.c:707)
>> kernel: [ 5.576921] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0 (crypto/api.c:497)
>> kernel: [ 5.576930] ? do_user_addr_fault (arch/x86/mm/fault.c:1264)
>> kernel: [ 5.576934] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1515 arch/x86/mm/fault.c:1563)
>> kernel: [ 5.576937] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
>> kernel: [ 5.576942] ? gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>> kernel: [ 5.577056] amdgpu_device_init (drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2465 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4042) amdgpu
>> kernel: [ 5.577158] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577161] ? pci_bus_read_config_word (drivers/pci/access.c:67 (discriminator 2))
>> kernel: [ 5.577166] ? pci_read_config_word (drivers/pci/access.c:563)
>> kernel: [ 5.577168] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577171] ? do_pci_enable_device (drivers/pci/pci.c:1975 drivers/pci/pci.c:1949)
>> kernel: [ 5.577176] amdgpu_driver_load_kms (drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:146) amdgpu
>> kernel: [ 5.577275] amdgpu_pci_probe (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2237) amdgpu
>> kernel: [ 5.577373] local_pci_probe (drivers/pci/pci-driver.c:324)
>> kernel: [ 5.577377] pci_device_probe (drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
>> kernel: [ 5.577381] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
>> kernel: [ 5.577386] __driver_probe_device (drivers/base/dd.c:800)
>> kernel: [ 5.577389] driver_probe_device (drivers/base/dd.c:830)
>> kernel: [ 5.577392] __driver_attach (drivers/base/dd.c:1217)
>> kernel: [ 5.577396] ? __pfx___driver_attach (drivers/base/dd.c:1157)
>> kernel: [ 5.577399] bus_for_each_dev (drivers/base/bus.c:368)
>> kernel: [ 5.577402] driver_attach (drivers/base/dd.c:1234)
>> kernel: [ 5.577405] bus_add_driver (drivers/base/bus.c:674)
>> kernel: [ 5.577409] driver_register (drivers/base/driver.c:246)
>> kernel: [ 5.577411] ? __pfx_amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2497) amdgpu
>> kernel: [ 5.577521] __pci_register_driver (drivers/pci/pci-driver.c:1456)
>> kernel: [ 5.577524] amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2805) amdgpu
>> kernel: [ 5.577628] do_one_initcall (init/main.c:1236)
>> kernel: [ 5.577632] ? kmalloc_trace (mm/slub.c:3816 mm/slub.c:3860 mm/slub.c:4007)
>> kernel: [ 5.577637] do_init_module (kernel/module/main.c:2533)
>> kernel: [ 5.577640] load_module (kernel/module/main.c:2984)
>> kernel: [ 5.577647] init_module_from_file (kernel/module/main.c:3151)
>> kernel: [ 5.577649] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577652] ? init_module_from_file (kernel/module/main.c:3151)
>> kernel: [ 5.577657] idempotent_init_module (kernel/module/main.c:3168)
>> kernel: [ 5.577661] __x64_sys_finit_module (./include/linux/file.h:45 kernel/module/main.c:3190 kernel/module/main.c:3172 kernel/module/main.c:3172)
>> kernel: [ 5.577664] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
>> kernel: [ 5.577668] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577671] ? ksys_mmap_pgoff (mm/mmap.c:1428)
>> kernel: [ 5.577675] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577678] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577681] ? syscall_exit_to_user_mode (kernel/entry/common.c:215)
>> kernel: [ 5.577684] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577687] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>> kernel: [ 5.577689] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577692] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>> kernel: [ 5.577695] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577698] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>> kernel: [ 5.577700] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577703] ? sysvec_call_function (arch/x86/kernel/smp.c:253 (discriminator 69))
>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>> All code
>> ========
>> 0: 5b pop %rbx
>> 1: 41 5c pop %r12
>> 3: c3 ret
>> 4: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
>> b: 00 00
>> d: f3 0f 1e fa endbr64
>> 11: 48 89 f8 mov %rdi,%rax
>> 14: 48 89 f7 mov %rsi,%rdi
>> 17: 48 89 d6 mov %rdx,%rsi
>> 1a: 48 89 ca mov %rcx,%rdx
>> 1d: 4d 89 c2 mov %r8,%r10
>> 20: 4d 89 c8 mov %r9,%r8
>> 23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
>> 28: 0f 05 syscall
>> 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
>> 30: 73 01 jae 0x33
>> 32: c3 ret
>> 33: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb5ad
>> 3a: f7 d8 neg %eax
>> 3c: 64 89 01 mov %eax,%fs:(%rcx)
>> 3f: 48 rex.W
>>
>> Code starting with the faulting instruction
>> ===========================================
>> 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
>> 6: 73 01 jae 0x9
>> 8: c3 ret
>> 9: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb583
>> 10: f7 d8 neg %eax
>> 12: 64 89 01 mov %eax,%fs:(%rcx)
>> 15: 48 rex.W
>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>> kernel: [ 5.577748] </TASK>
>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>> kernel: [ 5.577817] CR2: 0000000000000008
>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>> All code
>> ========
>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>> 3: 4c 89 ff mov %r15,%rdi
>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>> b: 41 89 c2 mov %eax,%r10d
>> e: 83 f8 ed cmp $0xffffffed,%eax
>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>> 17: 85 c0 test %eax,%eax
>> 19: 74 05 je 0x20
>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>> 27: 4c 89 ff mov %r15,%rdi
>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>> 3b: 41 89 c2 mov %eax,%r10d
>> 3e: 85 c0 test %eax,%eax
>>
>> Code starting with the faulting instruction
>> ===========================================
>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>> 11: 41 89 c2 mov %eax,%r10d
>> 14: 85 c0 test %eax,%eax
>> rsyslogd: rsyslogd's groupid changed to 111
>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>> kernel: [ 5.914419] PKRU: 55555554
>>
>> Best regards,
>> Mirsad
>>
>> On 1/18/24 18:23, Mirsad Todorovac wrote:
>>> Hi,
>>>
>>> Unfortunately, I was not able to reboot in this kernel again to do the stack decode, but I thought
>>> that any information about the NULL pointer dereference is better than no info.
>>>
>>> The system is Ubuntu 23.10 Mantic with AMD product: Navi 23 [Radeon RX 6600/6600 XT/6600M]
>>> graphic card.
>>>
>>> Please find the config and the hw listing attached.
>>>
>>> Best regards,
>>> Mirsad
>>
>>
>>
>>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>>> kernel: [ 5.576712] PGD 0 P4D 0
>>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>> kernel: [ 5.576903] PKRU: 55555554
>>> kernel: [ 5.576905] Call Trace:
>>> kernel: [ 5.576907] <TASK>
>>> kernel: [ 5.576909] ? show_regs+0x72/0x90
>>> kernel: [ 5.576914] ? __die+0x25/0x80
>>> kernel: [ 5.576917] ? page_fault_oops+0x154/0x4c0
>>> kernel: [ 5.576921] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0+0x35/0x70
>>> kernel: [ 5.576930] ? do_user_addr_fault+0x30e/0x6e0
>>> kernel: [ 5.576934] ? exc_page_fault+0x84/0x1b0
>>> kernel: [ 5.576937] ? asm_exc_page_fault+0x27/0x30
>>> kernel: [ 5.576942] ? gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>> kernel: [ 5.577056] amdgpu_device_init+0xefa/0x2de0 [amdgpu]
>>> kernel: [ 5.577158] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577161] ? pci_bus_read_config_word+0x47/0x90
>>> kernel: [ 5.577166] ? pci_read_config_word+0x27/0x60
>>> kernel: [ 5.577168] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577171] ? do_pci_enable_device+0xe1/0x110
>>> kernel: [ 5.577176] amdgpu_driver_load_kms+0x1a/0x1c0 [amdgpu]
>>> kernel: [ 5.577275] amdgpu_pci_probe+0x1a8/0x5e0 [amdgpu]
>>> kernel: [ 5.577373] local_pci_probe+0x48/0xb0
>>> kernel: [ 5.577377] pci_device_probe+0xc8/0x290
>>> kernel: [ 5.577381] really_probe+0x1d2/0x440
>>> kernel: [ 5.577386] __driver_probe_device+0x8a/0x190
>>> kernel: [ 5.577389] driver_probe_device+0x23/0xd0
>>> kernel: [ 5.577392] __driver_attach+0x10f/0x220
>>> kernel: [ 5.577396] ? __pfx___driver_attach+0x10/0x10
>>> kernel: [ 5.577399] bus_for_each_dev+0x7a/0xe0
>>> kernel: [ 5.577402] driver_attach+0x1e/0x30
>>> kernel: [ 5.577405] bus_add_driver+0x127/0x240
>>> kernel: [ 5.577409] driver_register+0x64/0x140
>>> kernel: [ 5.577411] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
>>> kernel: [ 5.577521] __pci_register_driver+0x68/0x80
>>> kernel: [ 5.577524] amdgpu_init+0x69/0xff0 [amdgpu]
>>> kernel: [ 5.577628] do_one_initcall+0x46/0x330
>>> kernel: [ 5.577632] ? kmalloc_trace+0x136/0x370
>>> kernel: [ 5.577637] do_init_module+0x6a/0x280
>>> kernel: [ 5.577640] load_module+0x2419/0x2500
>>> kernel: [ 5.577647] init_module_from_file+0x9c/0xf0
>>> kernel: [ 5.577649] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577652] ? init_module_from_file+0x9c/0xf0
>>> kernel: [ 5.577657] idempotent_init_module+0x184/0x240
>>> kernel: [ 5.577661] __x64_sys_finit_module+0x64/0xd0
>>> kernel: [ 5.577664] do_syscall_64+0x76/0x140
>>> kernel: [ 5.577668] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577671] ? ksys_mmap_pgoff+0x123/0x270
>>> kernel: [ 5.577675] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577678] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577681] ? syscall_exit_to_user_mode+0x97/0x1e0
>>> kernel: [ 5.577684] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577687] ? do_syscall_64+0x85/0x140
>>> kernel: [ 5.577689] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577692] ? do_syscall_64+0x85/0x140
>>> kernel: [ 5.577695] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577698] ? do_syscall_64+0x85/0x140
>>> kernel: [ 5.577700] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577703] ? sysvec_call_function+0x4e/0xb0
>>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe+0x6e/0x76
>>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>>> kernel: [ 5.577748] </TASK>
>>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>>> kernel: [ 5.577817] CR2: 0000000000000008
>>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>> rsyslogd: rsyslogd's groupid changed to 111
>>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>> kernel: [ 5.914419] PKRU: 55555554
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union
"I see something approaching fast ... Will it be friends with me?"
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: BUG [RESEND][NEW BUG]: kernel NULL pointer dereference, address: 0000000000000008
2024-01-22 8:34 ` Ma, Jun
2024-01-22 22:39 ` Mirsad Todorovac
@ 2024-01-24 17:48 ` Mirsad Todorovac
2024-01-25 7:38 ` Ma, Jun
1 sibling, 1 reply; 8+ messages in thread
From: Mirsad Todorovac @ 2024-01-24 17:48 UTC (permalink / raw
To: Ma, Jun, linux-kernel, amd-gfx
Cc: Sathishkumar S, Pan, Xinhui, Srinivasan Shanmugam, Guchun Chen,
David Airlie, Felix Kuehling, Lijo Lazar, dri-devel,
Christian König, Boyuan Zhang, Daniel Vetter, David Francis,
Alex Deucher, Lang Yu, Marek Olšák
Hi, Ma Jun,
Normally, I would reply under the quoted text, but I will adjust to your convention.
I have just discovered that your patch causes the Ubuntu 22.04 LTS GNOME XWayland session
to hang after typing the password and pressing ENTER in the graphical login screen (tested several times).
After that, I was not even able to log in from another box with ssh, or the session would
hang (this happened on the first and second attempts; the third attempt succeeded because I connected
before attempting to log in on the XWayland console).
You might find the syslog and dmesg of the freeze useful; they are at this link (they were over 100K):
https://magrf.grf.hr/~mtodorov/linux/bugreports/6.7.0/amdgpu/6.7.0-xway-09721-g61da593f4458/
The exact applied patch was this:
marvin@defiant:~/linux/kernel/linux_torvalds$ git diff
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 73f6d7e72c73..6ef333df9adf 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3996,16 +3996,13 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
if (!amdgpu_sriov_vf(adev)) {
snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", ucode_prefix);
- err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, fw_name);
- /* don't check this. There are apparently firmwares in the wild with
- * incorrect size in the header
- */
- if (err == -ENODEV)
- goto out;
+ err = request_firmware(&adev->gfx.rlc_fw, fw_name, adev->dev);
if (err)
- dev_dbg(adev->dev,
- "gfx10: amdgpu_ucode_request() failed \"%s\"\n",
- fw_name);
+ goto out;
+
+ /* don't validate this firmware. There are apparently firmwares
+ * in the wild with incorrect size in the header
+ */
rlc_hdr = (const struct rlc_firmware_header_v2_0 *)adev->gfx.rlc_fw->data;
version_major = le16_to_cpu(rlc_hdr->header.header_version_major);
version_minor = le16_to_cpu(rlc_hdr->header.header_version_minor);
marvin@defiant:~/linux/kernel/linux_torvalds$ uname -rms
Linux 6.7.0-xway-09721-g61da593f4458 x86_64
marvin@defiant:~/linux/kernel/linux_torvalds$
So, there seems to be a problem with the way the patch affects XWayland.
I checked the exact commit multiple times, with and without the diff.
I hope this helps, as I am not familiar with the amdgpu driver.
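As a cross-check of the quoted oops against the lines removed in the diff above, here is
a minimal userspace sketch, not kernel code. It assumes that the faulting load
"mov 0x8(%rax),%rax" with RAX = 0 is the read of adev->gfx.rlc_fw->data, and that R10 still
holds the return value of amdgpu_ucode_request(). The names fw_model, fault_addr_for_null_fw(),
reg_to_errno() and is_einval() are hypothetical and used only for illustration; fw_model copies
the first two members of struct firmware from include/linux/firmware.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the first two members of the kernel's
 * struct firmware (size, then data); not the real kernel type.
 */
struct fw_model {
	size_t size;
	const unsigned char *data;
};

/* Address touched when reading fw->data through a NULL fw pointer:
 * it is simply the offset of the member within the struct.
 */
size_t fault_addr_for_null_fw(void)
{
	return offsetof(struct fw_model, data);
}

/* Interpret the low 32 bits of a 64-bit register value as a signed
 * errno-style return code, without relying on implementation-defined
 * unsigned-to-signed conversion.
 */
int reg_to_errno(uint64_t reg)
{
	uint32_t low = (uint32_t)reg;

	if (low & 0x80000000u)
		return -(int)(~low) - 1;
	return (int)low;
}

/* Convenience check for the value seen in R10 in the register dump. */
int is_einval(uint64_t reg)
{
	return reg_to_errno(reg) == -EINVAL;
}
```

On an x86_64 build, fault_addr_for_null_fw() gives 8, which matches CR2 = 0000000000000008,
and reg_to_errno(0xffffffea) gives -22 (-EINVAL). If the assumptions above hold, the old code
only logged the failure with dev_dbg() and then dereferenced a NULL rlc_fw, which is the case
the added "goto out" skips.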
Best regards,
Mirsad Todorovac
On 1/22/24 09:34, Ma, Jun wrote:
> Perhaps similar to the problem I encountered earlier, you can
> try the following patch
>
> https://lists.freedesktop.org/archives/amd-gfx/2024-January/103259.html
>
> Regards,
> Ma Jun
>
> On 1/21/2024 3:54 AM, Mirsad Todorovac wrote:
>> Hi,
>>
>> The last email did not pass to the most of the recipients due to banned .xz attachment.
>>
>> As the .config is too big to send inline or uncompressed either, I will omit it in this
>> attempt. In the meantime, I had some success in decoding the stack trace, but sadly not
>> complete.
>>
>> I don't think this Oops is deterministic, but I am working on a reproducer.
>>
>> The platform is Ubuntu 22.04 LTS.
>>
>> Complete list of hardware and .config is available here:
>>
>> https://domac.alu.unizg.hr/~mtodorov/linux/bugreports/amdgpu/6.7.0-rtl-v02-nokcsan-09928-g052d534373b7/
>>
>> Best regards,
>> Mirsad
>>
>> -------------------------------------------------------------------------------------------
>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>> kernel: [ 5.576712] PGD 0 P4D 0
>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>> All code
>> ========
>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>> 3: 4c 89 ff mov %r15,%rdi
>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>> b: 41 89 c2 mov %eax,%r10d
>> e: 83 f8 ed cmp $0xffffffed,%eax
>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>> 17: 85 c0 test %eax,%eax
>> 19: 74 05 je 0x20
>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>> 27: 4c 89 ff mov %r15,%rdi
>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>> 3b: 41 89 c2 mov %eax,%r10d
>> 3e: 85 c0 test %eax,%eax
>>
>> Code starting with the faulting instruction
>> ===========================================
>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>> 11: 41 89 c2 mov %eax,%r10d
>> 14: 85 c0 test %eax,%eax
>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>> kernel: [ 5.576903] PKRU: 55555554
>> kernel: [ 5.576905] Call Trace:
>> kernel: [ 5.576907] <TASK>
>> kernel: [ 5.576909] ? show_regs (arch/x86/kernel/dumpstack.c:479)
>> kernel: [ 5.576914] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
>> kernel: [ 5.576917] ? page_fault_oops (arch/x86/mm/fault.c:707)
>> kernel: [ 5.576921] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0 (crypto/api.c:497)
>> kernel: [ 5.576930] ? do_user_addr_fault (arch/x86/mm/fault.c:1264)
>> kernel: [ 5.576934] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1515 arch/x86/mm/fault.c:1563)
>> kernel: [ 5.576937] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
>> kernel: [ 5.576942] ? gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>> kernel: [ 5.577056] amdgpu_device_init (drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2465 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4042) amdgpu
>> kernel: [ 5.577158] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577161] ? pci_bus_read_config_word (drivers/pci/access.c:67 (discriminator 2))
>> kernel: [ 5.577166] ? pci_read_config_word (drivers/pci/access.c:563)
>> kernel: [ 5.577168] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577171] ? do_pci_enable_device (drivers/pci/pci.c:1975 drivers/pci/pci.c:1949)
>> kernel: [ 5.577176] amdgpu_driver_load_kms (drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:146) amdgpu
>> kernel: [ 5.577275] amdgpu_pci_probe (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2237) amdgpu
>> kernel: [ 5.577373] local_pci_probe (drivers/pci/pci-driver.c:324)
>> kernel: [ 5.577377] pci_device_probe (drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
>> kernel: [ 5.577381] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
>> kernel: [ 5.577386] __driver_probe_device (drivers/base/dd.c:800)
>> kernel: [ 5.577389] driver_probe_device (drivers/base/dd.c:830)
>> kernel: [ 5.577392] __driver_attach (drivers/base/dd.c:1217)
>> kernel: [ 5.577396] ? __pfx___driver_attach (drivers/base/dd.c:1157)
>> kernel: [ 5.577399] bus_for_each_dev (drivers/base/bus.c:368)
>> kernel: [ 5.577402] driver_attach (drivers/base/dd.c:1234)
>> kernel: [ 5.577405] bus_add_driver (drivers/base/bus.c:674)
>> kernel: [ 5.577409] driver_register (drivers/base/driver.c:246)
>> kernel: [ 5.577411] ? __pfx_amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2497) amdgpu
>> kernel: [ 5.577521] __pci_register_driver (drivers/pci/pci-driver.c:1456)
> >> kernel: [ 5.577524] amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2805) amdgpu
>> kernel: [ 5.577628] do_one_initcall (init/main.c:1236)
>> kernel: [ 5.577632] ? kmalloc_trace (mm/slub.c:3816 mm/slub.c:3860 mm/slub.c:4007)
>> kernel: [ 5.577637] do_init_module (kernel/module/main.c:2533)
>> kernel: [ 5.577640] load_module (kernel/module/main.c:2984)
>> kernel: [ 5.577647] init_module_from_file (kernel/module/main.c:3151)
>> kernel: [ 5.577649] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577652] ? init_module_from_file (kernel/module/main.c:3151)
>> kernel: [ 5.577657] idempotent_init_module (kernel/module/main.c:3168)
>> kernel: [ 5.577661] __x64_sys_finit_module (./include/linux/file.h:45 kernel/module/main.c:3190 kernel/module/main.c:3172 kernel/module/main.c:3172)
>> kernel: [ 5.577664] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
>> kernel: [ 5.577668] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577671] ? ksys_mmap_pgoff (mm/mmap.c:1428)
>> kernel: [ 5.577675] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577678] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577681] ? syscall_exit_to_user_mode (kernel/entry/common.c:215)
>> kernel: [ 5.577684] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> >> kernel: [ 5.577687] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
> >> kernel: [ 5.577689] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> >> kernel: [ 5.577692] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
> >> kernel: [ 5.577695] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
> >> kernel: [ 5.577698] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>> kernel: [ 5.577700] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577703] ? sysvec_call_function (arch/x86/kernel/smp.c:253 (discriminator 69))
>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>> All code
>> ========
>> 0: 5b pop %rbx
>> 1: 41 5c pop %r12
>> 3: c3 ret
>> 4: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
>> b: 00 00
>> d: f3 0f 1e fa endbr64
>> 11: 48 89 f8 mov %rdi,%rax
>> 14: 48 89 f7 mov %rsi,%rdi
>> 17: 48 89 d6 mov %rdx,%rsi
>> 1a: 48 89 ca mov %rcx,%rdx
>> 1d: 4d 89 c2 mov %r8,%r10
>> 20: 4d 89 c8 mov %r9,%r8
>> 23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
>> 28: 0f 05 syscall
>> 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
>> 30: 73 01 jae 0x33
>> 32: c3 ret
>> 33: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb5ad
>> 3a: f7 d8 neg %eax
>> 3c: 64 89 01 mov %eax,%fs:(%rcx)
>> 3f: 48 rex.W
>>
>> Code starting with the faulting instruction
>> ===========================================
>> 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
>> 6: 73 01 jae 0x9
>> 8: c3 ret
>> 9: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb583
>> 10: f7 d8 neg %eax
>> 12: 64 89 01 mov %eax,%fs:(%rcx)
>> 15: 48 rex.W
>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>> kernel: [ 5.577748] </TASK>
>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>> kernel: [ 5.577817] CR2: 0000000000000008
>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>> All code
>> ========
>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>> 3: 4c 89 ff mov %r15,%rdi
>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>> b: 41 89 c2 mov %eax,%r10d
>> e: 83 f8 ed cmp $0xffffffed,%eax
>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>> 17: 85 c0 test %eax,%eax
>> 19: 74 05 je 0x20
>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>> 27: 4c 89 ff mov %r15,%rdi
>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>> 3b: 41 89 c2 mov %eax,%r10d
>> 3e: 85 c0 test %eax,%eax
>>
>> Code starting with the faulting instruction
>> ===========================================
>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>> 11: 41 89 c2 mov %eax,%r10d
>> 14: 85 c0 test %eax,%eax
>> rsyslogd: rsyslogd's groupid changed to 111
>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>> kernel: [ 5.914419] PKRU: 55555554
>>
>> Best regards,
>> Mirsad
>>
>> On 1/18/24 18:23, Mirsad Todorovac wrote:
>>> Hi,
>>>
>>> Unfortunately, I was not able to reboot in this kernel again to do the stack decode, but I thought
>>> that any information about the NULL pointer dereference is better than no info.
>>>
>>> The system is Ubuntu 23.10 Mantic with AMD product: Navi 23 [Radeon RX 6600/6600 XT/6600M]
>>> graphic card.
>>>
>>> Please find the config and the hw listing attached.
>>>
>>> Best regards,
>>> Mirsad
>>
>>
>>
>>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>>> kernel: [ 5.576712] PGD 0 P4D 0
>>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>> kernel: [ 5.576903] PKRU: 55555554
>>> kernel: [ 5.576905] Call Trace:
>>> kernel: [ 5.576907] <TASK>
>>> kernel: [ 5.576909] ? show_regs+0x72/0x90
>>> kernel: [ 5.576914] ? __die+0x25/0x80
>>> kernel: [ 5.576917] ? page_fault_oops+0x154/0x4c0
>>> kernel: [ 5.576921] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0+0x35/0x70
>>> kernel: [ 5.576930] ? do_user_addr_fault+0x30e/0x6e0
>>> kernel: [ 5.576934] ? exc_page_fault+0x84/0x1b0
>>> kernel: [ 5.576937] ? asm_exc_page_fault+0x27/0x30
>>> kernel: [ 5.576942] ? gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>> kernel: [ 5.577056] amdgpu_device_init+0xefa/0x2de0 [amdgpu]
>>> kernel: [ 5.577158] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577161] ? pci_bus_read_config_word+0x47/0x90
>>> kernel: [ 5.577166] ? pci_read_config_word+0x27/0x60
>>> kernel: [ 5.577168] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577171] ? do_pci_enable_device+0xe1/0x110
>>> kernel: [ 5.577176] amdgpu_driver_load_kms+0x1a/0x1c0 [amdgpu]
>>> kernel: [ 5.577275] amdgpu_pci_probe+0x1a8/0x5e0 [amdgpu]
>>> kernel: [ 5.577373] local_pci_probe+0x48/0xb0
>>> kernel: [ 5.577377] pci_device_probe+0xc8/0x290
>>> kernel: [ 5.577381] really_probe+0x1d2/0x440
>>> kernel: [ 5.577386] __driver_probe_device+0x8a/0x190
>>> kernel: [ 5.577389] driver_probe_device+0x23/0xd0
>>> kernel: [ 5.577392] __driver_attach+0x10f/0x220
>>> kernel: [ 5.577396] ? __pfx___driver_attach+0x10/0x10
>>> kernel: [ 5.577399] bus_for_each_dev+0x7a/0xe0
>>> kernel: [ 5.577402] driver_attach+0x1e/0x30
>>> kernel: [ 5.577405] bus_add_driver+0x127/0x240
>>> kernel: [ 5.577409] driver_register+0x64/0x140
>>> kernel: [ 5.577411] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
>>> kernel: [ 5.577521] __pci_register_driver+0x68/0x80
>>> kernel: [ 5.577524] amdgpu_init+0x69/0xff0 [amdgpu]
>>> kernel: [ 5.577628] do_one_initcall+0x46/0x330
>>> kernel: [ 5.577632] ? kmalloc_trace+0x136/0x370
>>> kernel: [ 5.577637] do_init_module+0x6a/0x280
>>> kernel: [ 5.577640] load_module+0x2419/0x2500
>>> kernel: [ 5.577647] init_module_from_file+0x9c/0xf0
>>> kernel: [ 5.577649] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577652] ? init_module_from_file+0x9c/0xf0
>>> kernel: [ 5.577657] idempotent_init_module+0x184/0x240
>>> kernel: [ 5.577661] __x64_sys_finit_module+0x64/0xd0
>>> kernel: [ 5.577664] do_syscall_64+0x76/0x140
>>> kernel: [ 5.577668] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577671] ? ksys_mmap_pgoff+0x123/0x270
>>> kernel: [ 5.577675] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577678] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577681] ? syscall_exit_to_user_mode+0x97/0x1e0
>>> kernel: [ 5.577684] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577687] ? do_syscall_64+0x85/0x140
>>> kernel: [ 5.577689] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577692] ? do_syscall_64+0x85/0x140
>>> kernel: [ 5.577695] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577698] ? do_syscall_64+0x85/0x140
>>> kernel: [ 5.577700] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577703] ? sysvec_call_function+0x4e/0xb0
>>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe+0x6e/0x76
>>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>>> kernel: [ 5.577748] </TASK>
>>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>>> kernel: [ 5.577817] CR2: 0000000000000008
>>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>> rsyslogd: rsyslogd's groupid changed to 111
>>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>> kernel: [ 5.914419] PKRU: 55555554
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: BUG [RESEND][NEW BUG]: kernel NULL pointer dereference, address: 0000000000000008
2024-01-24 17:48 ` BUG [RESEND][NEW BUG]: " Mirsad Todorovac
@ 2024-01-25 7:38 ` Ma, Jun
2024-01-25 9:29 ` Mirsad Todorovac
0 siblings, 1 reply; 8+ messages in thread
From: Ma, Jun @ 2024-01-25 7:38 UTC (permalink / raw
To: Mirsad Todorovac, linux-kernel, amd-gfx
Cc: Sathishkumar S, Pan, Xinhui, Srinivasan Shanmugam, Guchun Chen,
David Airlie, Felix Kuehling, Lijo Lazar, dri-devel,
Christian König, Boyuan Zhang, Daniel Vetter, David Francis,
Alex Deucher, Lang Yu, Marek Olšák
Hi Mirsad,
On 1/25/2024 1:48 AM, Mirsad Todorovac wrote:
> Hi, Ma Jun,
>
> Normally, I would reply under the quoted text, but I will adjust to your convention.
>
> I have just discovered that your patch causes the Ubuntu 22.04 LTS GNOME XWayland session
> to hang after typing the password and pressing ENTER at the graphical login screen (tested several times).
>
This problem is not caused by my patch.
Based on your syslog, it looks more like a scheduling issue.
I just saw a similar problem; please refer to the link below:
https://gitlab.freedesktop.org/drm/amd/-/issues/3124
Regards,
Ma Jun
> After that, I was not even able to log in from another box with ssh, or the session would
> block (tested once, then a second time; the third time it succeeded after I connected before
> attempting to log in on the XWayland console).
>
> You might find the syslog and dmesg of the freeze useful; they are at this link (they were over 100K):
>
> https://magrf.grf.hr/~mtodorov/linux/bugreports/6.7.0/amdgpu/6.7.0-xway-09721-g61da593f4458/
>
> The exact applied patch was this:
>
> marvin@defiant:~/linux/kernel/linux_torvalds$ git diff
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 73f6d7e72c73..6ef333df9adf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -3996,16 +3996,13 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
>
> if (!amdgpu_sriov_vf(adev)) {
> snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", ucode_prefix);
> - err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, fw_name);
> - /* don't check this. There are apparently firmwares in the wild with
> - * incorrect size in the header
> - */
> - if (err == -ENODEV)
> - goto out;
> + err = request_firmware(&adev->gfx.rlc_fw, fw_name, adev->dev);
> if (err)
> - dev_dbg(adev->dev,
> - "gfx10: amdgpu_ucode_request() failed \"%s\"\n",
> - fw_name);
> + goto out;
> +
> + /* don't validate this firmware. There are apparently firmwares
> + * in the wild with incorrect size in the header
> + */
> rlc_hdr = (const struct rlc_firmware_header_v2_0 *)adev->gfx.rlc_fw->data;
> version_major = le16_to_cpu(rlc_hdr->header.header_version_major);
> version_minor = le16_to_cpu(rlc_hdr->header.header_version_minor);
> marvin@defiant:~/linux/kernel/linux_torvalds$ uname -rms
> Linux 6.7.0-xway-09721-g61da593f4458 x86_64
> marvin@defiant:~/linux/kernel/linux_torvalds$
>
> So, there seems to be a problem with the way the patch affects XWayland.
>
> I checked the exact commit multiple times, with and without the diff.
>
> Hope this helps, because I am not familiar with the amdgpu driver.
>
> Best regards,
> Mirsad Todorovac
>
> On 1/22/24 09:34, Ma, Jun wrote:
>> This is perhaps similar to the problem I encountered earlier. You can
>> try the following patch:
>>
>> https://lists.freedesktop.org/archives/amd-gfx/2024-January/103259.html
>>
>> Regards,
>> Ma Jun
>>
>> On 1/21/2024 3:54 AM, Mirsad Todorovac wrote:
>>> Hi,
>>>
>>> The last email did not reach most of the recipients because of the banned .xz attachment.
>>>
>>> As the .config is too big to send either inline or uncompressed, I will omit it in this
>>> attempt. In the meantime, I had some success in decoding the stack trace, but sadly it is
>>> not complete.
>>>
>>> I don't think this Oops is deterministic, but I am working on a reproducer.
>>>
>>> The platform is Ubuntu 22.04 LTS.
>>>
>>> Complete list of hardware and .config is available here:
>>>
>>> https://domac.alu.unizg.hr/~mtodorov/linux/bugreports/amdgpu/6.7.0-rtl-v02-nokcsan-09928-g052d534373b7/
>>>
>>> Best regards,
>>> Mirsad
>>>
>>> -------------------------------------------------------------------------------------------
>>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>>> kernel: [ 5.576712] PGD 0 P4D 0
>>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>> All code
>>> ========
>>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>>> 3: 4c 89 ff mov %r15,%rdi
>>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>>> b: 41 89 c2 mov %eax,%r10d
>>> e: 83 f8 ed cmp $0xffffffed,%eax
>>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>>> 17: 85 c0 test %eax,%eax
>>> 19: 74 05 je 0x20
>>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>>> 27: 4c 89 ff mov %r15,%rdi
>>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>>> 3b: 41 89 c2 mov %eax,%r10d
>>> 3e: 85 c0 test %eax,%eax
>>>
>>> Code starting with the faulting instruction
>>> ===========================================
>>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>>> 11: 41 89 c2 mov %eax,%r10d
>>> 14: 85 c0 test %eax,%eax
>>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>> kernel: [ 5.576903] PKRU: 55555554
>>> kernel: [ 5.576905] Call Trace:
>>> kernel: [ 5.576907] <TASK>
>>> kernel: [ 5.576909] ? show_regs (arch/x86/kernel/dumpstack.c:479)
>>> kernel: [ 5.576914] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
>>> kernel: [ 5.576917] ? page_fault_oops (arch/x86/mm/fault.c:707)
>>> kernel: [ 5.576921] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0 (crypto/api.c:497)
>>> kernel: [ 5.576930] ? do_user_addr_fault (arch/x86/mm/fault.c:1264)
>>> kernel: [ 5.576934] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1515 arch/x86/mm/fault.c:1563)
>>> kernel: [ 5.576937] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
>>> kernel: [ 5.576942] ? gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>>> kernel: [ 5.577056] amdgpu_device_init (drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2465 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4042) amdgpu
>>> kernel: [ 5.577158] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577161] ? pci_bus_read_config_word (drivers/pci/access.c:67 (discriminator 2))
>>> kernel: [ 5.577166] ? pci_read_config_word (drivers/pci/access.c:563)
>>> kernel: [ 5.577168] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577171] ? do_pci_enable_device (drivers/pci/pci.c:1975 drivers/pci/pci.c:1949)
>>> kernel: [ 5.577176] amdgpu_driver_load_kms (drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:146) amdgpu
>>> kernel: [ 5.577275] amdgpu_pci_probe (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2237) amdgpu
>>> kernel: [ 5.577373] local_pci_probe (drivers/pci/pci-driver.c:324)
>>> kernel: [ 5.577377] pci_device_probe (drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
>>> kernel: [ 5.577381] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
>>> kernel: [ 5.577386] __driver_probe_device (drivers/base/dd.c:800)
>>> kernel: [ 5.577389] driver_probe_device (drivers/base/dd.c:830)
>>> kernel: [ 5.577392] __driver_attach (drivers/base/dd.c:1217)
>>> kernel: [ 5.577396] ? __pfx___driver_attach (drivers/base/dd.c:1157)
>>> kernel: [ 5.577399] bus_for_each_dev (drivers/base/bus.c:368)
>>> kernel: [ 5.577402] driver_attach (drivers/base/dd.c:1234)
>>> kernel: [ 5.577405] bus_add_driver (drivers/base/bus.c:674)
>>> kernel: [ 5.577409] driver_register (drivers/base/driver.c:246)
>>> kernel: [ 5.577411] ? __pfx_amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2497) amdgpu
>>> kernel: [ 5.577521] __pci_register_driver (drivers/pci/pci-driver.c:1456)
>>> kernel: [ 5.577524] amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2805) amdgpu
>>> kernel: [ 5.577628] do_one_initcall (init/main.c:1236)
>>> kernel: [ 5.577632] ? kmalloc_trace (mm/slub.c:3816 mm/slub.c:3860 mm/slub.c:4007)
>>> kernel: [ 5.577637] do_init_module (kernel/module/main.c:2533)
>>> kernel: [ 5.577640] load_module (kernel/module/main.c:2984)
>>> kernel: [ 5.577647] init_module_from_file (kernel/module/main.c:3151)
>>> kernel: [ 5.577649] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577652] ? init_module_from_file (kernel/module/main.c:3151)
>>> kernel: [ 5.577657] idempotent_init_module (kernel/module/main.c:3168)
>>> kernel: [ 5.577661] __x64_sys_finit_module (./include/linux/file.h:45 kernel/module/main.c:3190 kernel/module/main.c:3172 kernel/module/main.c:3172)
>>> kernel: [ 5.577664] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
>>> kernel: [ 5.577668] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577671] ? ksys_mmap_pgoff (mm/mmap.c:1428)
>>> kernel: [ 5.577675] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577678] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577681] ? syscall_exit_to_user_mode (kernel/entry/common.c:215)
>>> kernel: [ 5.577684] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577687] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>>> kernel: [ 5.577689] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577692] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>>> kernel: [ 5.577695] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577698] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>>> kernel: [ 5.577700] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>> kernel: [ 5.577703] ? sysvec_call_function (arch/x86/kernel/smp.c:253 (discriminator 69))
>>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
>>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>>> All code
>>> ========
>>> 0: 5b pop %rbx
>>> 1: 41 5c pop %r12
>>> 3: c3 ret
>>> 4: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
>>> b: 00 00
>>> d: f3 0f 1e fa endbr64
>>> 11: 48 89 f8 mov %rdi,%rax
>>> 14: 48 89 f7 mov %rsi,%rdi
>>> 17: 48 89 d6 mov %rdx,%rsi
>>> 1a: 48 89 ca mov %rcx,%rdx
>>> 1d: 4d 89 c2 mov %r8,%r10
>>> 20: 4d 89 c8 mov %r9,%r8
>>> 23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
>>> 28: 0f 05 syscall
>>> 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
>>> 30: 73 01 jae 0x33
>>> 32: c3 ret
>>> 33: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb5ad
>>> 3a: f7 d8 neg %eax
>>> 3c: 64 89 01 mov %eax,%fs:(%rcx)
>>> 3f: 48 rex.W
>>>
>>> Code starting with the faulting instruction
>>> ===========================================
>>> 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
>>> 6: 73 01 jae 0x9
>>> 8: c3 ret
>>> 9: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb583
>>> 10: f7 d8 neg %eax
>>> 12: 64 89 01 mov %eax,%fs:(%rcx)
>>> 15: 48 rex.W
>>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>>> kernel: [ 5.577748] </TASK>
>>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>>> kernel: [ 5.577817] CR2: 0000000000000008
>>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>> All code
>>> ========
>>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>>> 3: 4c 89 ff mov %r15,%rdi
>>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>>> b: 41 89 c2 mov %eax,%r10d
>>> e: 83 f8 ed cmp $0xffffffed,%eax
>>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>>> 17: 85 c0 test %eax,%eax
>>> 19: 74 05 je 0x20
>>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>>> 27: 4c 89 ff mov %r15,%rdi
>>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>>> 3b: 41 89 c2 mov %eax,%r10d
>>> 3e: 85 c0 test %eax,%eax
>>>
>>> Code starting with the faulting instruction
>>> ===========================================
>>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>>> 11: 41 89 c2 mov %eax,%r10d
>>> 14: 85 c0 test %eax,%eax
>>> rsyslogd: rsyslogd's groupid changed to 111
>>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>> kernel: [ 5.914419] PKRU: 55555554
>>>
>>> Best regards,
>>> Mirsad
>>>
>>> On 1/18/24 18:23, Mirsad Todorovac wrote:
>>>> Hi,
>>>>
>>>> Unfortunately, I was not able to reboot in this kernel again to do the stack decode, but I thought
>>>> that any information about the NULL pointer dereference is better than no info.
>>>>
>>>> The system is Ubuntu 23.10 Mantic with AMD product: Navi 23 [Radeon RX 6600/6600 XT/6600M]
>>>> graphic card.
>>>>
>>>> Please find the config and the hw listing attached.
>>>>
>>>> Best regards,
>>>> Mirsad
>>>
>>>
>>>
>>>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>>>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>>>> kernel: [ 5.576712] PGD 0 P4D 0
>>>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>>>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>> kernel: [ 5.576903] PKRU: 55555554
>>>> kernel: [ 5.576905] Call Trace:
>>>> kernel: [ 5.576907] <TASK>
>>>> kernel: [ 5.576909] ? show_regs+0x72/0x90
>>>> kernel: [ 5.576914] ? __die+0x25/0x80
>>>> kernel: [ 5.576917] ? page_fault_oops+0x154/0x4c0
>>>> kernel: [ 5.576921] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0+0x35/0x70
>>>> kernel: [ 5.576930] ? do_user_addr_fault+0x30e/0x6e0
>>>> kernel: [ 5.576934] ? exc_page_fault+0x84/0x1b0
>>>> kernel: [ 5.576937] ? asm_exc_page_fault+0x27/0x30
>>>> kernel: [ 5.576942] ? gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>>> kernel: [ 5.577056] amdgpu_device_init+0xefa/0x2de0 [amdgpu]
>>>> kernel: [ 5.577158] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577161] ? pci_bus_read_config_word+0x47/0x90
>>>> kernel: [ 5.577166] ? pci_read_config_word+0x27/0x60
>>>> kernel: [ 5.577168] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577171] ? do_pci_enable_device+0xe1/0x110
>>>> kernel: [ 5.577176] amdgpu_driver_load_kms+0x1a/0x1c0 [amdgpu]
>>>> kernel: [ 5.577275] amdgpu_pci_probe+0x1a8/0x5e0 [amdgpu]
>>>> kernel: [ 5.577373] local_pci_probe+0x48/0xb0
>>>> kernel: [ 5.577377] pci_device_probe+0xc8/0x290
>>>> kernel: [ 5.577381] really_probe+0x1d2/0x440
>>>> kernel: [ 5.577386] __driver_probe_device+0x8a/0x190
>>>> kernel: [ 5.577389] driver_probe_device+0x23/0xd0
>>>> kernel: [ 5.577392] __driver_attach+0x10f/0x220
>>>> kernel: [ 5.577396] ? __pfx___driver_attach+0x10/0x10
>>>> kernel: [ 5.577399] bus_for_each_dev+0x7a/0xe0
>>>> kernel: [ 5.577402] driver_attach+0x1e/0x30
>>>> kernel: [ 5.577405] bus_add_driver+0x127/0x240
>>>> kernel: [ 5.577409] driver_register+0x64/0x140
>>>> kernel: [ 5.577411] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
>>>> kernel: [ 5.577521] __pci_register_driver+0x68/0x80
>>>> kernel: [ 5.577524] amdgpu_init+0x69/0xff0 [amdgpu]
>>>> kernel: [ 5.577628] do_one_initcall+0x46/0x330
>>>> kernel: [ 5.577632] ? kmalloc_trace+0x136/0x370
>>>> kernel: [ 5.577637] do_init_module+0x6a/0x280
>>>> kernel: [ 5.577640] load_module+0x2419/0x2500
>>>> kernel: [ 5.577647] init_module_from_file+0x9c/0xf0
>>>> kernel: [ 5.577649] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577652] ? init_module_from_file+0x9c/0xf0
>>>> kernel: [ 5.577657] idempotent_init_module+0x184/0x240
>>>> kernel: [ 5.577661] __x64_sys_finit_module+0x64/0xd0
>>>> kernel: [ 5.577664] do_syscall_64+0x76/0x140
>>>> kernel: [ 5.577668] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577671] ? ksys_mmap_pgoff+0x123/0x270
>>>> kernel: [ 5.577675] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577678] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577681] ? syscall_exit_to_user_mode+0x97/0x1e0
>>>> kernel: [ 5.577684] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577687] ? do_syscall_64+0x85/0x140
>>>> kernel: [ 5.577689] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577692] ? do_syscall_64+0x85/0x140
>>>> kernel: [ 5.577695] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577698] ? do_syscall_64+0x85/0x140
>>>> kernel: [ 5.577700] ? srso_alias_return_thunk+0x5/0xfbef5
>>>> kernel: [ 5.577703] ? sysvec_call_function+0x4e/0xb0
>>>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe+0x6e/0x76
>>>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>>>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>>>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>>>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>>>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>>>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>>>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>>>> kernel: [ 5.577748] </TASK>
>>>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>>>> kernel: [ 5.577817] CR2: 0000000000000008
>>>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>>>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>> rsyslogd: rsyslogd's groupid changed to 111
>>>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>> kernel: [ 5.914419] PKRU: 55555554
* Re: BUG [RESEND][NEW BUG]: kernel NULL pointer dereference, address: 0000000000000008
2024-01-25 7:38 ` Ma, Jun
@ 2024-01-25 9:29 ` Mirsad Todorovac
2024-01-25 18:02 ` Mirsad Todorovac
0 siblings, 1 reply; 8+ messages in thread
From: Mirsad Todorovac @ 2024-01-25 9:29 UTC (permalink / raw
To: Ma, Jun, linux-kernel, amd-gfx
Cc: Sathishkumar S, Pan, Xinhui, Srinivasan Shanmugam, Guchun Chen,
David Airlie, Felix Kuehling, Lijo Lazar, dri-devel,
Christian König, Boyuan Zhang, Daniel Vetter, David Francis,
Alex Deucher, Lang Yu, Marek Olšák
Hi Ma Jun,
Copy that. This appears to be the exact problem, and thank you for
reviewing the bug report at such short notice.
I apologise for the wrong assertion.
So the patch you sent merely exposed another bug, which does not
manifest without the patch (the NULL pointer dereference occurs instead).
Of course, it makes no sense to remove your patch and get
the NULL pointer dereference back; a proper fix is required.
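For what it is worth, the register dump in the original Oops seems consistent with this. Below is a minimal user-space sketch in C, under stated assumptions: `struct fw_sketch` is a hypothetical stand-in for the firmware descriptor (a size followed by a data pointer), `request_fw_sketch()` is a made-up loader that clears the pointer and returns -EINVAL when validation fails, and the value in R10 (00000000ffffffea) is taken to be the saved return code of the failed request. None of these names exist in the driver; they only illustrate why the fault address is 0x8 and why a check for -ENODEV alone lets the NULL pointer through.

```c
#include <assert.h>   /* only needed for the usage lines shown below */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the firmware descriptor: a size field
 * followed by a data pointer (layout assumed for illustration). */
struct fw_sketch {
	size_t size;
	const uint8_t *data;
};

/* Address that a load of fw->data touches when fw is NULL. */
static size_t null_fw_fault_address(void)
{
	return offsetof(struct fw_sketch, data);
}

/* Reinterpret a 32-bit register value from the Oops as a signed errno. */
static int reg_to_errno(uint32_t reg)
{
	return (int)(int32_t)reg;
}

/* Made-up loader: validation fails, the pointer is cleared and
 * -EINVAL is returned (an assumption read off R10 in the Oops). */
static int request_fw_sketch(const struct fw_sketch **fw)
{
	*fw = NULL;
	return -EINVAL;
}

/* Old pattern: only -ENODEV bails out, so -EINVAL falls through.
 * Returns 1 if the caller would go on to read a header via NULL. */
static int old_check_would_deref_null(void)
{
	const struct fw_sketch *fw;
	int err = request_fw_sketch(&fw);

	if (err == -ENODEV)
		return 0;
	return fw == NULL;
}

/* Patched pattern: any error bails out before the header is read. */
static int new_check_would_deref_null(void)
{
	const struct fw_sketch *fw;
	int err = request_fw_sketch(&fw);

	if (err)
		return 0;
	return fw == NULL;
}
```

On x86_64 `null_fw_fault_address()` evaluates to 8, which matches CR2, and `reg_to_errno(0xffffffeaU)` is -EINVAL while the compared constant 0xffffffed is -ENODEV. A caller could verify the two patterns with `assert(old_check_would_deref_null() == 1)` and `assert(new_check_would_deref_null() == 0)`.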
Thanks again.
Best regards,
Mirsad Todorovac
On 1/25/2024 8:38 AM, Ma, Jun wrote:
> Hi Mirsad,
>
>
> On 1/25/2024 1:48 AM, Mirsad Todorovac wrote:
>> Hi, Ma Jun,
>>
>> Normally, I would reply under the quoted text, but I will adjust to your convention.
>>
>> I have just discovered that your patch causes the Ubuntu 22.04 LTS GNOME XWayland session
>> to hang after typing the password and pressing ENTER on the graphical login screen (tested several times).
>>
> This problem is not caused by my patch.
> Based on your syslog, it looks more like a schedule issue.
> I just saw a similar problem, please refer to the link below
> https://gitlab.freedesktop.org/drm/amd/-/issues/3124
>
> Regards,
> Ma Jun
>> After that, I was not even able to log in from another box with ssh; the session would
>> block (tested once, then a second time with the same result; the third time it passed, after I
>> connected before attempting to log in on the XWayland console).
>>
>> You might find the syslog and dmesg of the freeze useful; they are at this link (they were over 100K):
>>
>> https://magrf.grf.hr/~mtodorov/linux/bugreports/6.7.0/amdgpu/6.7.0-xway-09721-g61da593f4458/
>>
>> The exact applied patch was this:
>>
>> marvin@defiant:~/linux/kernel/linux_torvalds$ git diff
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> index 73f6d7e72c73..6ef333df9adf 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>> @@ -3996,16 +3996,13 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
>>
>> if (!amdgpu_sriov_vf(adev)) {
>> snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", ucode_prefix);
>> - err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, fw_name);
>> - /* don't check this. There are apparently firmwares in the wild with
>> - * incorrect size in the header
>> - */
>> - if (err == -ENODEV)
>> - goto out;
>> + err = request_firmware(&adev->gfx.rlc_fw, fw_name, adev->dev);
>> if (err)
>> - dev_dbg(adev->dev,
>> - "gfx10: amdgpu_ucode_request() failed \"%s\"\n",
>> - fw_name);
>> + goto out;
>> +
>> + /* don't validate this firmware. There are apparently firmwares
>> + * in the wild with incorrect size in the header
>> + */
>> rlc_hdr = (const struct rlc_firmware_header_v2_0 *)adev->gfx.rlc_fw->data;
>> version_major = le16_to_cpu(rlc_hdr->header.header_version_major);
>> version_minor = le16_to_cpu(rlc_hdr->header.header_version_minor);
>> marvin@defiant:~/linux/kernel/linux_torvalds$ uname -rms
>> Linux 6.7.0-xway-09721-g61da593f4458 x86_64
>> marvin@defiant:~/linux/kernel/linux_torvalds$
>>
>> So, there seems to be a problem with the way the patch affects XWayland.
>>
>> I checked the exact commit multiple times, with and without the diff.
>>
>> Hope this helps, because I am not familiar with the amdgpu driver.
>>
>> Best regards,
>> Mirsad Todorovac
>>
>> On 1/22/24 09:34, Ma, Jun wrote:
>>> Perhaps similar to the problem I encountered earlier, you can
>>> try the following patch
>>>
>>> https://lists.freedesktop.org/archives/amd-gfx/2024-January/103259.html
>>>
>>> Regards,
>>> Ma Jun
>>>
>>> On 1/21/2024 3:54 AM, Mirsad Todorovac wrote:
>>>> Hi,
>>>>
>>>> The last email did not reach most of the recipients due to the banned .xz attachment.
>>>>
>>>> As the .config is also too big to send inline or uncompressed, I will omit it in this
>>>> attempt. In the meantime, I had some success in decoding the stack trace, though sadly it is
>>>> not complete.
>>>>
>>>> I don't think this Oops is deterministic, but I am working on a reproducer.
>>>>
>>>> The platform is Ubuntu 22.04 LTS.
>>>>
>>>> Complete list of hardware and .config is available here:
>>>>
>>>> https://domac.alu.unizg.hr/~mtodorov/linux/bugreports/amdgpu/6.7.0-rtl-v02-nokcsan-09928-g052d534373b7/
>>>>
>>>> Best regards,
>>>> Mirsad
>>>>
>>>> -------------------------------------------------------------------------------------------
>>>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>>>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>>>> kernel: [ 5.576712] PGD 0 P4D 0
>>>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>>>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>>>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>> All code
>>>> ========
>>>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>>>> 3: 4c 89 ff mov %r15,%rdi
>>>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>>>> b: 41 89 c2 mov %eax,%r10d
>>>> e: 83 f8 ed cmp $0xffffffed,%eax
>>>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>>>> 17: 85 c0 test %eax,%eax
>>>> 19: 74 05 je 0x20
>>>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>>>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>>>> 27: 4c 89 ff mov %r15,%rdi
>>>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>>>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>>>> 3b: 41 89 c2 mov %eax,%r10d
>>>> 3e: 85 c0 test %eax,%eax
>>>>
>>>> Code starting with the faulting instruction
>>>> ===========================================
>>>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>>>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>>>> 11: 41 89 c2 mov %eax,%r10d
>>>> 14: 85 c0 test %eax,%eax
>>>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>> kernel: [ 5.576903] PKRU: 55555554
>>>> kernel: [ 5.576905] Call Trace:
>>>> kernel: [ 5.576907] <TASK>
>>>> kernel: [ 5.576909] ? show_regs (arch/x86/kernel/dumpstack.c:479)
>>>> kernel: [ 5.576914] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
>>>> kernel: [ 5.576917] ? page_fault_oops (arch/x86/mm/fault.c:707)
>>>> kernel: [ 5.576921] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0 (crypto/api.c:497)
>>>> kernel: [ 5.576930] ? do_user_addr_fault (arch/x86/mm/fault.c:1264)
>>>> kernel: [ 5.576934] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1515 arch/x86/mm/fault.c:1563)
>>>> kernel: [ 5.576937] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
>>>> kernel: [ 5.576942] ? gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>>>> kernel: [ 5.577056] amdgpu_device_init (drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2465 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4042) amdgpu
>>>> kernel: [ 5.577158] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577161] ? pci_bus_read_config_word (drivers/pci/access.c:67 (discriminator 2))
>>>> kernel: [ 5.577166] ? pci_read_config_word (drivers/pci/access.c:563)
>>>> kernel: [ 5.577168] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577171] ? do_pci_enable_device (drivers/pci/pci.c:1975 drivers/pci/pci.c:1949)
>>>> kernel: [ 5.577176] amdgpu_driver_load_kms (drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:146) amdgpu
>>>> kernel: [ 5.577275] amdgpu_pci_probe (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2237) amdgpu
>>>> kernel: [ 5.577373] local_pci_probe (drivers/pci/pci-driver.c:324)
>>>> kernel: [ 5.577377] pci_device_probe (drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
>>>> kernel: [ 5.577381] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
>>>> kernel: [ 5.577386] __driver_probe_device (drivers/base/dd.c:800)
>>>> kernel: [ 5.577389] driver_probe_device (drivers/base/dd.c:830)
>>>> kernel: [ 5.577392] __driver_attach (drivers/base/dd.c:1217)
>>>> kernel: [ 5.577396] ? __pfx___driver_attach (drivers/base/dd.c:1157)
>>>> kernel: [ 5.577399] bus_for_each_dev (drivers/base/bus.c:368)
>>>> kernel: [ 5.577402] driver_attach (drivers/base/dd.c:1234)
>>>> kernel: [ 5.577405] bus_add_driver (drivers/base/bus.c:674)
>>>> kernel: [ 5.577409] driver_register (drivers/base/driver.c:246)
>>>> kernel: [ 5.577411] ? __pfx_amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2497) amdgpu
>>>> kernel: [ 5.577521] __pci_register_driver (drivers/pci/pci-driver.c:1456)
>>>> kernel: [ 5.577524] amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2805) amdgpu
>>>> kernel: [ 5.577628] do_one_initcall (init/main.c:1236)
>>>> kernel: [ 5.577632] ? kmalloc_trace (mm/slub.c:3816 mm/slub.c:3860 mm/slub.c:4007)
>>>> kernel: [ 5.577637] do_init_module (kernel/module/main.c:2533)
>>>> kernel: [ 5.577640] load_module (kernel/module/main.c:2984)
>>>> kernel: [ 5.577647] init_module_from_file (kernel/module/main.c:3151)
>>>> kernel: [ 5.577649] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577652] ? init_module_from_file (kernel/module/main.c:3151)
>>>> kernel: [ 5.577657] idempotent_init_module (kernel/module/main.c:3168)
>>>> kernel: [ 5.577661] __x64_sys_finit_module (./include/linux/file.h:45 kernel/module/main.c:3190 kernel/module/main.c:3172 kernel/module/main.c:3172)
>>>> kernel: [ 5.577664] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
>>>> kernel: [ 5.577668] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577671] ? ksys_mmap_pgoff (mm/mmap.c:1428)
>>>> kernel: [ 5.577675] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577678] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577681] ? syscall_exit_to_user_mode (kernel/entry/common.c:215)
>>>> kernel: [ 5.577684] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577687] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>>>> kernel: [ 5.577689] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577692] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>>>> kernel: [ 5.577695] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577698] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>>>> kernel: [ 5.577700] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>> kernel: [ 5.577703] ? sysvec_call_function (arch/x86/kernel/smp.c:253 (discriminator 69))
>>>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
>>>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>>>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>>>> All code
>>>> ========
>>>> 0: 5b pop %rbx
>>>> 1: 41 5c pop %r12
>>>> 3: c3 ret
>>>> 4: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
>>>> b: 00 00
>>>> d: f3 0f 1e fa endbr64
>>>> 11: 48 89 f8 mov %rdi,%rax
>>>> 14: 48 89 f7 mov %rsi,%rdi
>>>> 17: 48 89 d6 mov %rdx,%rsi
>>>> 1a: 48 89 ca mov %rcx,%rdx
>>>> 1d: 4d 89 c2 mov %r8,%r10
>>>> 20: 4d 89 c8 mov %r9,%r8
>>>> 23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
>>>> 28: 0f 05 syscall
>>>> 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
>>>> 30: 73 01 jae 0x33
>>>> 32: c3 ret
>>>> 33: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb5ad
>>>> 3a: f7 d8 neg %eax
>>>> 3c: 64 89 01 mov %eax,%fs:(%rcx)
>>>> 3f: 48 rex.W
>>>>
>>>> Code starting with the faulting instruction
>>>> ===========================================
>>>> 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
>>>> 6: 73 01 jae 0x9
>>>> 8: c3 ret
>>>> 9: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb583
>>>> 10: f7 d8 neg %eax
>>>> 12: 64 89 01 mov %eax,%fs:(%rcx)
>>>> 15: 48 rex.W
>>>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>>>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>>>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>>>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>>>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>>>> kernel: [ 5.577748] </TASK>
>>>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>>>> kernel: [ 5.577817] CR2: 0000000000000008
>>>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>>>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>>>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>> All code
>>>> ========
>>>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>>>> 3: 4c 89 ff mov %r15,%rdi
>>>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>>>> b: 41 89 c2 mov %eax,%r10d
>>>> e: 83 f8 ed cmp $0xffffffed,%eax
>>>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>>>> 17: 85 c0 test %eax,%eax
>>>> 19: 74 05 je 0x20
>>>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>>>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>>>> 27: 4c 89 ff mov %r15,%rdi
>>>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>>>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>>>> 3b: 41 89 c2 mov %eax,%r10d
>>>> 3e: 85 c0 test %eax,%eax
>>>>
>>>> Code starting with the faulting instruction
>>>> ===========================================
>>>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>>>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>>>> 11: 41 89 c2 mov %eax,%r10d
>>>> 14: 85 c0 test %eax,%eax
>>>> rsyslogd: rsyslogd's groupid changed to 111
>>>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>> kernel: [ 5.914419] PKRU: 55555554
>>>>
>>>> Best regards,
>>>> Mirsad
>>>>
>>>> On 1/18/24 18:23, Mirsad Todorovac wrote:
>>>>> Hi,
>>>>>
>>>>> Unfortunately, I was not able to reboot in this kernel again to do the stack decode, but I thought
>>>>> that any information about the NULL pointer dereference is better than no info.
>>>>>
>>>>> The system is Ubuntu 23.10 Mantic with AMD product: Navi 23 [Radeon RX 6600/6600 XT/6600M]
>>>>> graphic card.
>>>>>
>>>>> Please find the config and the hw listing attached.
>>>>>
>>>>> Best regards,
>>>>> Mirsad
>>>>
>>>>
>>>>
>>>>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>>>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>>>>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>>>>> kernel: [ 5.576712] PGD 0 P4D 0
>>>>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>>>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>>>>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>>>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>>>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>>> kernel: [ 5.576903] PKRU: 55555554
>>>>> kernel: [ 5.576905] Call Trace:
>>>>> kernel: [ 5.576907] <TASK>
>>>>> kernel: [ 5.576909] ? show_regs+0x72/0x90
>>>>> kernel: [ 5.576914] ? __die+0x25/0x80
>>>>> kernel: [ 5.576917] ? page_fault_oops+0x154/0x4c0
>>>>> kernel: [ 5.576921] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0+0x35/0x70
>>>>> kernel: [ 5.576930] ? do_user_addr_fault+0x30e/0x6e0
>>>>> kernel: [ 5.576934] ? exc_page_fault+0x84/0x1b0
>>>>> kernel: [ 5.576937] ? asm_exc_page_fault+0x27/0x30
>>>>> kernel: [ 5.576942] ? gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>>>> kernel: [ 5.577056] amdgpu_device_init+0xefa/0x2de0 [amdgpu]
>>>>> kernel: [ 5.577158] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577161] ? pci_bus_read_config_word+0x47/0x90
>>>>> kernel: [ 5.577166] ? pci_read_config_word+0x27/0x60
>>>>> kernel: [ 5.577168] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577171] ? do_pci_enable_device+0xe1/0x110
>>>>> kernel: [ 5.577176] amdgpu_driver_load_kms+0x1a/0x1c0 [amdgpu]
>>>>> kernel: [ 5.577275] amdgpu_pci_probe+0x1a8/0x5e0 [amdgpu]
>>>>> kernel: [ 5.577373] local_pci_probe+0x48/0xb0
>>>>> kernel: [ 5.577377] pci_device_probe+0xc8/0x290
>>>>> kernel: [ 5.577381] really_probe+0x1d2/0x440
>>>>> kernel: [ 5.577386] __driver_probe_device+0x8a/0x190
>>>>> kernel: [ 5.577389] driver_probe_device+0x23/0xd0
>>>>> kernel: [ 5.577392] __driver_attach+0x10f/0x220
>>>>> kernel: [ 5.577396] ? __pfx___driver_attach+0x10/0x10
>>>>> kernel: [ 5.577399] bus_for_each_dev+0x7a/0xe0
>>>>> kernel: [ 5.577402] driver_attach+0x1e/0x30
>>>>> kernel: [ 5.577405] bus_add_driver+0x127/0x240
>>>>> kernel: [ 5.577409] driver_register+0x64/0x140
>>>>> kernel: [ 5.577411] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
>>>>> kernel: [ 5.577521] __pci_register_driver+0x68/0x80
>>>>> kernel: [ 5.577524] amdgpu_init+0x69/0xff0 [amdgpu]
>>>>> kernel: [ 5.577628] do_one_initcall+0x46/0x330
>>>>> kernel: [ 5.577632] ? kmalloc_trace+0x136/0x370
>>>>> kernel: [ 5.577637] do_init_module+0x6a/0x280
>>>>> kernel: [ 5.577640] load_module+0x2419/0x2500
>>>>> kernel: [ 5.577647] init_module_from_file+0x9c/0xf0
>>>>> kernel: [ 5.577649] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577652] ? init_module_from_file+0x9c/0xf0
>>>>> kernel: [ 5.577657] idempotent_init_module+0x184/0x240
>>>>> kernel: [ 5.577661] __x64_sys_finit_module+0x64/0xd0
>>>>> kernel: [ 5.577664] do_syscall_64+0x76/0x140
>>>>> kernel: [ 5.577668] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577671] ? ksys_mmap_pgoff+0x123/0x270
>>>>> kernel: [ 5.577675] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577678] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577681] ? syscall_exit_to_user_mode+0x97/0x1e0
>>>>> kernel: [ 5.577684] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577687] ? do_syscall_64+0x85/0x140
>>>>> kernel: [ 5.577689] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577692] ? do_syscall_64+0x85/0x140
>>>>> kernel: [ 5.577695] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577698] ? do_syscall_64+0x85/0x140
>>>>> kernel: [ 5.577700] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>> kernel: [ 5.577703] ? sysvec_call_function+0x4e/0xb0
>>>>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe+0x6e/0x76
>>>>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>>>>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>>>>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>>>>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>>>>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>>>>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>>>>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>>>>> kernel: [ 5.577748] </TASK>
>>>>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>>>>> kernel: [ 5.577817] CR2: 0000000000000008
>>>>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>>>>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>>>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>>> rsyslogd: rsyslogd's groupid changed to 111
>>>>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>>> kernel: [ 5.914419] PKRU: 55555554
>
* Re: BUG [RESEND][NEW BUG]: kernel NULL pointer dereference, address: 0000000000000008
2024-01-25 9:29 ` Mirsad Todorovac
@ 2024-01-25 18:02 ` Mirsad Todorovac
0 siblings, 0 replies; 8+ messages in thread
From: Mirsad Todorovac @ 2024-01-25 18:02 UTC (permalink / raw
To: Ma, Jun, linux-kernel, amd-gfx
Cc: Sathishkumar S, Pan, Xinhui, Srinivasan Shanmugam, Guchun Chen,
David Airlie, Felix Kuehling, Lijo Lazar, dri-devel,
Christian König, Boyuan Zhang, Daniel Vetter, David Francis,
Alex Deucher, Lang Yu, Marek Olšák
Hi Ma Jun,
Greetings again.
So, I just tested the recommended patch and the issue with the graphical login
screen was successfully resolved.
Thank you very much for your prompt reviews and recommended patches.
God bless.
Best regards,
Mirsad Todorovac
On 1/25/24 10:29, Mirsad Todorovac wrote:
> Hi Ma Jun,
>
> Copy that. This appears to be the exact problem, and thank you for
> reviewing the bug report at such short notice.
>
> I apologise for the wrong assertion.
>
> So the patch you sent merely exposed another bug, which does not manifest without the patch (the NULL pointer dereference occurs instead).
>
> Of course, it makes no sense to remove your patch and get
> the NULL pointer dereference back; a proper fix is required.
>
> Thanks again.
>
> Best regards,
> Mirsad Todorovac
>
> On 1/25/2024 8:38 AM, Ma, Jun wrote:
>> Hi Mirsad,
>>
>>
>> On 1/25/2024 1:48 AM, Mirsad Todorovac wrote:
>>> Hi, Ma Jun,
>>>
>>> Normally, I would reply under the quoted text, but I will adjust to your convention.
>>>
>>> I have just discovered that your patch causes the Ubuntu 22.04 LTS GNOME XWayland session
>>> to hang after typing the password and pressing ENTER on the graphical login screen (tested several times).
>>>
>> This problem is not caused by my patch.
>> Based on your syslog, it looks more like a schedule issue.
>> I just saw a similar problem, please refer to the link below
>> https://gitlab.freedesktop.org/drm/amd/-/issues/3124
>>
>> Regards,
>> Ma Jun
>>> After that, I was not even able to log in from another box with ssh; the session would
>>> block (tested once, then a second time with the same result; the third time it passed, after I
>>> connected before attempting to log in on the XWayland console).
>>>
>>> You might find the syslog and dmesg of the freeze useful; they are at this link (they were over 100K):
>>>
>>> https://magrf.grf.hr/~mtodorov/linux/bugreports/6.7.0/amdgpu/6.7.0-xway-09721-g61da593f4458/
>>>
>>> The exact applied patch was this:
>>>
>>> marvin@defiant:~/linux/kernel/linux_torvalds$ git diff
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> index 73f6d7e72c73..6ef333df9adf 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
>>> @@ -3996,16 +3996,13 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
>>> if (!amdgpu_sriov_vf(adev)) {
>>> snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", ucode_prefix);
>>> - err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, fw_name);
>>> - /* don't check this. There are apparently firmwares in the wild with
>>> - * incorrect size in the header
>>> - */
>>> - if (err == -ENODEV)
>>> - goto out;
>>> + err = request_firmware(&adev->gfx.rlc_fw, fw_name, adev->dev);
>>> if (err)
>>> - dev_dbg(adev->dev,
>>> - "gfx10: amdgpu_ucode_request() failed \"%s\"\n",
>>> - fw_name);
>>> + goto out;
>>> +
>>> + /* don't validate this firmware. There are apparently firmwares
>>> + * in the wild with incorrect size in the header
>>> + */
>>> rlc_hdr = (const struct rlc_firmware_header_v2_0 *)adev->gfx.rlc_fw->data;
>>> version_major = le16_to_cpu(rlc_hdr->header.header_version_major);
>>> version_minor = le16_to_cpu(rlc_hdr->header.header_version_minor);
>>> marvin@defiant:~/linux/kernel/linux_torvalds$ uname -rms
>>> Linux 6.7.0-xway-09721-g61da593f4458 x86_64
>>> marvin@defiant:~/linux/kernel/linux_torvalds$
>>>
>>> So, there seems to be a problem with the way the patch affects XWayland.
>>>
>>> I checked the exact commit multiple times, with and without the diff.
>>>
>>> Hope this helps, because I am not familiar with the amdgpu driver.
>>>
>>> Best regards,
>>> Mirsad Todorovac
>>>
>>> On 1/22/24 09:34, Ma, Jun wrote:
>>>> Perhaps similar to the problem I encountered earlier, you can
>>>> try the following patch
>>>>
>>>> https://lists.freedesktop.org/archives/amd-gfx/2024-January/103259.html
>>>>
>>>> Regards,
>>>> Ma Jun
>>>>
>>>> On 1/21/2024 3:54 AM, Mirsad Todorovac wrote:
>>>>> Hi,
>>>>>
>>>>> The last email did not reach most of the recipients because of a banned .xz attachment.
>>>>>
>>>>> As the .config is too big to send either inline or uncompressed, I will omit it in this
>>>>> attempt. In the meantime, I had some success in decoding the stack trace, but sadly the
>>>>> decode is not complete.
>>>>>
>>>>> I don't think this Oops is deterministic, but I am working on a reproducer.
>>>>>
>>>>> The platform is Ubuntu 22.04 LTS.
>>>>>
>>>>> Complete list of hardware and .config is available here:
>>>>>
>>>>> https://domac.alu.unizg.hr/~mtodorov/linux/bugreports/amdgpu/6.7.0-rtl-v02-nokcsan-09928-g052d534373b7/
>>>>>
>>>>> Best regards,
>>>>> Mirsad
>>>>>
>>>>> -------------------------------------------------------------------------------------------
>>>>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>>>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>>>>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>>>>> kernel: [ 5.576712] PGD 0 P4D 0
>>>>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>>>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>>>>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>>>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>>>>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>>> All code
>>>>> ========
>>>>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>>>>> 3: 4c 89 ff mov %r15,%rdi
>>>>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>>>>> b: 41 89 c2 mov %eax,%r10d
>>>>> e: 83 f8 ed cmp $0xffffffed,%eax
>>>>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>>>>> 17: 85 c0 test %eax,%eax
>>>>> 19: 74 05 je 0x20
>>>>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>>>>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>>>>> 27: 4c 89 ff mov %r15,%rdi
>>>>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>>>>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>>>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>>>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>>>>> 3b: 41 89 c2 mov %eax,%r10d
>>>>> 3e: 85 c0 test %eax,%eax
>>>>>
>>>>> Code starting with the faulting instruction
>>>>> ===========================================
>>>>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>>>>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>>>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>>>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>>>>> 11: 41 89 c2 mov %eax,%r10d
>>>>> 14: 85 c0 test %eax,%eax
>>>>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>>> kernel: [ 5.576903] PKRU: 55555554
>>>>> kernel: [ 5.576905] Call Trace:
>>>>> kernel: [ 5.576907] <TASK>
>>>>> kernel: [ 5.576909] ? show_regs (arch/x86/kernel/dumpstack.c:479)
>>>>> kernel: [ 5.576914] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
>>>>> kernel: [ 5.576917] ? page_fault_oops (arch/x86/mm/fault.c:707)
>>>>> kernel: [ 5.576921] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0 (crypto/api.c:497)
>>>>> kernel: [ 5.576930] ? do_user_addr_fault (arch/x86/mm/fault.c:1264)
>>>>> kernel: [ 5.576934] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1515 arch/x86/mm/fault.c:1563)
>>>>> kernel: [ 5.576937] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
>>>>> kernel: [ 5.576942] ? gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>>>>> kernel: [ 5.577056] amdgpu_device_init (drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2465 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4042) amdgpu
>>>>> kernel: [ 5.577158] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577161] ? pci_bus_read_config_word (drivers/pci/access.c:67 (discriminator 2))
>>>>> kernel: [ 5.577166] ? pci_read_config_word (drivers/pci/access.c:563)
>>>>> kernel: [ 5.577168] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577171] ? do_pci_enable_device (drivers/pci/pci.c:1975 drivers/pci/pci.c:1949)
>>>>> kernel: [ 5.577176] amdgpu_driver_load_kms (drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:146) amdgpu
>>>>> kernel: [ 5.577275] amdgpu_pci_probe (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2237) amdgpu
>>>>> kernel: [ 5.577373] local_pci_probe (drivers/pci/pci-driver.c:324)
>>>>> kernel: [ 5.577377] pci_device_probe (drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
>>>>> kernel: [ 5.577381] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
>>>>> kernel: [ 5.577386] __driver_probe_device (drivers/base/dd.c:800)
>>>>> kernel: [ 5.577389] driver_probe_device (drivers/base/dd.c:830)
>>>>> kernel: [ 5.577392] __driver_attach (drivers/base/dd.c:1217)
>>>>> kernel: [ 5.577396] ? __pfx___driver_attach (drivers/base/dd.c:1157)
>>>>> kernel: [ 5.577399] bus_for_each_dev (drivers/base/bus.c:368)
>>>>> kernel: [ 5.577402] driver_attach (drivers/base/dd.c:1234)
>>>>> kernel: [ 5.577405] bus_add_driver (drivers/base/bus.c:674)
>>>>> kernel: [ 5.577409] driver_register (drivers/base/driver.c:246)
>>>>> kernel: [ 5.577411] ? __pfx_amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2497) amdgpu
>>>>> kernel: [ 5.577521] __pci_register_driver (drivers/pci/pci-driver.c:1456)
>>>>> kernel: [ 5.577524] amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2805) amdgpu
>>>>> kernel: [ 5.577628] do_one_initcall (init/main.c:1236)
>>>>> kernel: [ 5.577632] ? kmalloc_trace (mm/slub.c:3816 mm/slub.c:3860 mm/slub.c:4007)
>>>>> kernel: [ 5.577637] do_init_module (kernel/module/main.c:2533)
>>>>> kernel: [ 5.577640] load_module (kernel/module/main.c:2984)
>>>>> kernel: [ 5.577647] init_module_from_file (kernel/module/main.c:3151)
>>>>> kernel: [ 5.577649] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577652] ? init_module_from_file (kernel/module/main.c:3151)
>>>>> kernel: [ 5.577657] idempotent_init_module (kernel/module/main.c:3168)
>>>>> kernel: [ 5.577661] __x64_sys_finit_module (./include/linux/file.h:45 kernel/module/main.c:3190 kernel/module/main.c:3172 kernel/module/main.c:3172)
>>>>> kernel: [ 5.577664] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
>>>>> kernel: [ 5.577668] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577671] ? ksys_mmap_pgoff (mm/mmap.c:1428)
>>>>> kernel: [ 5.577675] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577678] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577681] ? syscall_exit_to_user_mode (kernel/entry/common.c:215)
>>>>> kernel: [ 5.577684] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577687] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>>>>> kernel: [ 5.577689] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577692] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>>>>> kernel: [ 5.577695] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577698] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>>>>> kernel: [ 5.577700] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>>>>> kernel: [ 5.577703] ? sysvec_call_function (arch/x86/kernel/smp.c:253 (discriminator 69))
>>>>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
>>>>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>>>>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>>>>> All code
>>>>> ========
>>>>> 0: 5b pop %rbx
>>>>> 1: 41 5c pop %r12
>>>>> 3: c3 ret
>>>>> 4: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
>>>>> b: 00 00
>>>>> d: f3 0f 1e fa endbr64
>>>>> 11: 48 89 f8 mov %rdi,%rax
>>>>> 14: 48 89 f7 mov %rsi,%rdi
>>>>> 17: 48 89 d6 mov %rdx,%rsi
>>>>> 1a: 48 89 ca mov %rcx,%rdx
>>>>> 1d: 4d 89 c2 mov %r8,%r10
>>>>> 20: 4d 89 c8 mov %r9,%r8
>>>>> 23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
>>>>> 28: 0f 05 syscall
>>>>> 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
>>>>> 30: 73 01 jae 0x33
>>>>> 32: c3 ret
>>>>> 33: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb5ad
>>>>> 3a: f7 d8 neg %eax
>>>>> 3c: 64 89 01 mov %eax,%fs:(%rcx)
>>>>> 3f: 48 rex.W
>>>>>
>>>>> Code starting with the faulting instruction
>>>>> ===========================================
>>>>> 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
>>>>> 6: 73 01 jae 0x9
>>>>> 8: c3 ret
>>>>> 9: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb583
>>>>> 10: f7 d8 neg %eax
>>>>> 12: 64 89 01 mov %eax,%fs:(%rcx)
>>>>> 15: 48 rex.W
>>>>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>>>>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>>>>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>>>>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>>>>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>>>>> kernel: [ 5.577748] </TASK>
>>>>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>>>>> kernel: [ 5.577817] CR2: 0000000000000008
>>>>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>>>>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>>>>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>>> All code
>>>>> ========
>>>>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>>>>> 3: 4c 89 ff mov %r15,%rdi
>>>>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>>>>> b: 41 89 c2 mov %eax,%r10d
>>>>> e: 83 f8 ed cmp $0xffffffed,%eax
>>>>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>>>>> 17: 85 c0 test %eax,%eax
>>>>> 19: 74 05 je 0x20
>>>>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>>>>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>>>>> 27: 4c 89 ff mov %r15,%rdi
>>>>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>>>>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>>>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>>>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>>>>> 3b: 41 89 c2 mov %eax,%r10d
>>>>> 3e: 85 c0 test %eax,%eax
>>>>>
>>>>> Code starting with the faulting instruction
>>>>> ===========================================
>>>>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>>>>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>>>>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>>>>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>>>>> 11: 41 89 c2 mov %eax,%r10d
>>>>> 14: 85 c0 test %eax,%eax
>>>>> rsyslogd: rsyslogd's groupid changed to 111
>>>>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>>> kernel: [ 5.914419] PKRU: 55555554
>>>>>
>>>>> Best regards,
>>>>> Mirsad
>>>>>
>>>>> On 1/18/24 18:23, Mirsad Todorovac wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Unfortunately, I was not able to reboot in this kernel again to do the stack decode, but I thought
>>>>>> that any information about the NULL pointer dereference is better than no info.
>>>>>>
>>>>>> The system is Ubuntu 23.10 Mantic with AMD product: Navi 23 [Radeon RX 6600/6600 XT/6600M]
>>>>>> graphics card.
>>>>>>
>>>>>> Please find the config and the hw listing attached.
>>>>>>
>>>>>> Best regards,
>>>>>> Mirsad
>>>>>
>>>>>
>>>>>
>>>>>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>>>>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>>>>>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>>>>>> kernel: [ 5.576712] PGD 0 P4D 0
>>>>>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>>>>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>>>>>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>>>>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>>>>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>>>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>>>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>>>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>>>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>>>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>>>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>>>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>>>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>>>> kernel: [ 5.576903] PKRU: 55555554
>>>>>> kernel: [ 5.576905] Call Trace:
>>>>>> kernel: [ 5.576907] <TASK>
>>>>>> kernel: [ 5.576909] ? show_regs+0x72/0x90
>>>>>> kernel: [ 5.576914] ? __die+0x25/0x80
>>>>>> kernel: [ 5.576917] ? page_fault_oops+0x154/0x4c0
>>>>>> kernel: [ 5.576921] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0+0x35/0x70
>>>>>> kernel: [ 5.576930] ? do_user_addr_fault+0x30e/0x6e0
>>>>>> kernel: [ 5.576934] ? exc_page_fault+0x84/0x1b0
>>>>>> kernel: [ 5.576937] ? asm_exc_page_fault+0x27/0x30
>>>>>> kernel: [ 5.576942] ? gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>>>>> kernel: [ 5.577056] amdgpu_device_init+0xefa/0x2de0 [amdgpu]
>>>>>> kernel: [ 5.577158] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577161] ? pci_bus_read_config_word+0x47/0x90
>>>>>> kernel: [ 5.577166] ? pci_read_config_word+0x27/0x60
>>>>>> kernel: [ 5.577168] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577171] ? do_pci_enable_device+0xe1/0x110
>>>>>> kernel: [ 5.577176] amdgpu_driver_load_kms+0x1a/0x1c0 [amdgpu]
>>>>>> kernel: [ 5.577275] amdgpu_pci_probe+0x1a8/0x5e0 [amdgpu]
>>>>>> kernel: [ 5.577373] local_pci_probe+0x48/0xb0
>>>>>> kernel: [ 5.577377] pci_device_probe+0xc8/0x290
>>>>>> kernel: [ 5.577381] really_probe+0x1d2/0x440
>>>>>> kernel: [ 5.577386] __driver_probe_device+0x8a/0x190
>>>>>> kernel: [ 5.577389] driver_probe_device+0x23/0xd0
>>>>>> kernel: [ 5.577392] __driver_attach+0x10f/0x220
>>>>>> kernel: [ 5.577396] ? __pfx___driver_attach+0x10/0x10
>>>>>> kernel: [ 5.577399] bus_for_each_dev+0x7a/0xe0
>>>>>> kernel: [ 5.577402] driver_attach+0x1e/0x30
>>>>>> kernel: [ 5.577405] bus_add_driver+0x127/0x240
>>>>>> kernel: [ 5.577409] driver_register+0x64/0x140
>>>>>> kernel: [ 5.577411] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
>>>>>> kernel: [ 5.577521] __pci_register_driver+0x68/0x80
>>>>>> kernel: [ 5.577524] amdgpu_init+0x69/0xff0 [amdgpu]
>>>>>> kernel: [ 5.577628] do_one_initcall+0x46/0x330
>>>>>> kernel: [ 5.577632] ? kmalloc_trace+0x136/0x370
>>>>>> kernel: [ 5.577637] do_init_module+0x6a/0x280
>>>>>> kernel: [ 5.577640] load_module+0x2419/0x2500
>>>>>> kernel: [ 5.577647] init_module_from_file+0x9c/0xf0
>>>>>> kernel: [ 5.577649] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577652] ? init_module_from_file+0x9c/0xf0
>>>>>> kernel: [ 5.577657] idempotent_init_module+0x184/0x240
>>>>>> kernel: [ 5.577661] __x64_sys_finit_module+0x64/0xd0
>>>>>> kernel: [ 5.577664] do_syscall_64+0x76/0x140
>>>>>> kernel: [ 5.577668] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577671] ? ksys_mmap_pgoff+0x123/0x270
>>>>>> kernel: [ 5.577675] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577678] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577681] ? syscall_exit_to_user_mode+0x97/0x1e0
>>>>>> kernel: [ 5.577684] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577687] ? do_syscall_64+0x85/0x140
>>>>>> kernel: [ 5.577689] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577692] ? do_syscall_64+0x85/0x140
>>>>>> kernel: [ 5.577695] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577698] ? do_syscall_64+0x85/0x140
>>>>>> kernel: [ 5.577700] ? srso_alias_return_thunk+0x5/0xfbef5
>>>>>> kernel: [ 5.577703] ? sysvec_call_function+0x4e/0xb0
>>>>>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe+0x6e/0x76
>>>>>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>>>>>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>>>>>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>>>>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>>>>>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>>>>>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>>>>>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>>>>>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>>>>>> kernel: [ 5.577748] </TASK>
>>>>>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>>>>>> kernel: [ 5.577817] CR2: 0000000000000008
>>>>>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>>>>>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>>>>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>>>>> rsyslogd: rsyslogd's groupid changed to 111
>>>>>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>>>>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>>>>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>>>>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>>>>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>>>>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>>>>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>>>>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>>>>> kernel: [ 5.914419] PKRU: 55555554
>>
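
For anyone reading the register dump quoted above, here is a small userspace sketch of how the numbers appear to fit together. It is only an illustration under stated assumptions: the two struct layouts are hand-written x86_64 mirrors of `struct firmware` and of the leading fields of amdgpu's common firmware header (not the kernel headers themselves), the helper names are hypothetical, and it is an assumption that the pointer loaded from 0x18708(%r15) is `adev->gfx.rlc_fw`. Under those assumptions, R10 = 0xffffffea decodes to -EINVAL rather than the -ENODEV (0xffffffed) that the code compares against, and a NULL firmware pointer plus the 8-byte offset of `data` gives the address reported in CR2.

```c
/* Userspace sketch only: mirrors of kernel layouts, written by hand. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed x86_64 mirror of struct firmware (size_t and pointers are 8 bytes). */
struct fw_mirror {
	uint64_t size;	/* size_t size */
	uint64_t data;	/* const u8 *data */
	uint64_t priv;	/* void *priv */
};

/* Assumed mirror of the first fields of amdgpu's common firmware header. */
struct hdr_mirror {
	uint32_t size_bytes;
	uint32_t header_size_bytes;
	uint16_t header_version_major;
	uint16_t header_version_minor;
};

/* The displacements seen in the disassembly: 0x8(%rax), then 0x8 and 0xa. */
static_assert(offsetof(struct fw_mirror, data) == 0x8, "data at 0x8");
static_assert(offsetof(struct hdr_mirror, header_version_major) == 0x8, "major at 0x8");
static_assert(offsetof(struct hdr_mirror, header_version_minor) == 0xa, "minor at 0xa");

/* Interpret the low 32 bits of a register as a signed kernel return code. */
static int reg_to_errno(uint64_t reg)
{
	uint32_t lo = (uint32_t)reg;

	if (lo & 0x80000000u)
		return -(int)(0xffffffffu - lo) - 1;	/* two's complement, portably */
	return (int)lo;
}

/* Address touched by "mov disp(%base),..." for a given base register value. */
static uint64_t fault_address(uint64_t base, uint64_t disp)
{
	return base + disp;
}
```

With RAX = 0 from the dump, `fault_address(0, offsetof(struct fw_mirror, data))` yields 0x8, and `reg_to_errno(0xffffffea)` yields -EINVAL, which would not be caught by a check for -ENODEV alone.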
end of thread, other threads:[~2024-01-26 9:12 UTC | newest]
Thread overview: 8+ messages
2024-01-18 17:23 BUG: kernel NULL pointer dereference, address: 0000000000000008 Mirsad Todorovac
2024-01-20 19:54 ` BUG [RESEND]: " Mirsad Todorovac
2024-01-22 8:34 ` Ma, Jun
2024-01-22 22:39 ` Mirsad Todorovac
2024-01-24 17:48 ` BUG [RESEND][NEW BUG]: " Mirsad Todorovac
2024-01-25 7:38 ` Ma, Jun
2024-01-25 9:29 ` Mirsad Todorovac
2024-01-25 18:02 ` Mirsad Todorovac