linux-embedded.vger.kernel.org archive mirror
From: Dirk Behme <dirk.behme@gmail.com>
To: Lior Weintraub <liorw@pliops.com>,
	"linux-embedded@vger.kernel.org" <linux-embedded@vger.kernel.org>
Subject: Re: Debugging early SError exception
Date: Tue, 19 Dec 2023 08:09:10 +0100
Message-ID: <b17b9901-5007-4d12-99d9-be531360227e@gmail.com>
In-Reply-To: <PR3P195MB05556D6B225E93A2B5BFFE88C391A@PR3P195MB0555.EURP195.PROD.OUTLOOK.COM>

On 17.12.23 at 22:32, Lior Weintraub wrote:
> Hi,
> 
> We have a new SoC with eLinux porting (kernel v6.5).
> This SoC is ARM64 (A53) single core based device.
> It runs correctly on QEMU but fails with SError on emulation platform (Synopsys Zebu running our SoC model).
> There is no debugger connected to this emulation but there are several debug capabilities we can use:
> 1. Generating wave dump of CPU signals
> 2. Generate a Tarmac log
> 3. UART
> 
> Since the SError happens at early stages of Linux boot the UART is not enabled yet.
> From the Tarmac log we can see:
>   3824884521 ps  ES  (ffff800080760888:d65f03c0) O el1h_ns:	ret 	(parse_early_param)
>   3824884522 ps  ES  (ffff800080763a60:d2801800) O el1h_ns:	mov	x0,	#0xc0	//	#192 	(setup_arch)
>                      R X0 (AARCH64) 00000000 000000c0
>   3824884523 ps  ES  (ffff800080763a64:d51b4220) O el1h_ns:	msr	daif,	x0 	(setup_arch)
>                      R CPSR 600000c5
>   3824884529 ps  ES  System Error (Abort)
>                      EXC [0x380] SError/vSError Current EL with SP_ELx
>                      R ESR_EL1 (AARCH64) bf000002
>                      R CPSR 600003c5
>                      R SPSR_EL1 (AARCH64) 600000c5
>                      R ELR_EL1 (AARCH64) ffff8000 80763a68
>   3824884925 ps  ES  (ffff800080010b80:d10543ff) O el1h_ns:	sub	sp,	sp,	#0x150 	(vectors)
>                      R SP_EL1 (AARCH64) ffff8000 808f3c50
>   3824884925 ps  ES  (ffff800080010b84:8b2063ff) O el1h_ns:	add	sp,	sp,	x0 	(vectors)
>                      R SP_EL1 (AARCH64) ffff8000 808f3d10
>   3824884926 ps  ES  (ffff800080010b88:cb2063e0) O el1h_ns:	sub	x0,	sp,	x0 	(vectors)
>                      R X0 (AARCH64) ffff8000 808f3c50
>   3824884927 ps  ES  (ffff800080010b8c:37700080) O el1h_ns:	tbnz	w0,	#14,	ffff800080010b9c	<vectors+0x39c> 	(vectors)
>   3824884935 ps  ES  (ffff800080010b90:cb2063e0) O el1h_ns:	sub	x0,	sp,	x0 	(vectors)
>                      R X0 (AARCH64) 00000000 000000c0
>   3824884937 ps  ES  (ffff800080010b94:cb2063ff) O el1h_ns:	sub	sp,	sp,	x0 	(vectors)
>                      R SP_EL1 (AARCH64) ffff8000 808f3c50
>   3824884938 ps  ES  (ffff800080010b98:140001ef) O el1h_ns:	b	ffff800080011354	<el1h_64_error> 	(vectors)
> 
> If I understand correctly, the exception happened sometime earlier and only now Linux boot code (setup_arch) opened the exception handling and as a result we immediately jump to the SError exception handler.


Yes, that sounds reasonable. If I understood correctly, you are
running something quite new on both a software simulator (QEMU) and a
hardware emulator (Synopsys Zebu).
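
As a cross-check of that reading, the two register values from the
Tarmac log can be decoded by hand. The sketch below does this in
Python. The helper names decode_daif and decode_esr are made up for
illustration, and the bit positions are the Armv8-A definitions of
DAIF and ESR_ELx as I understand them. It only decodes the
architectural fields: with IDS set, the remaining ISS bits are
implementation defined, so their meaning would have to be looked up in
the Cortex-A53 documentation.

```python
# Hypothetical helpers for decoding values seen in the Tarmac log.
# Bit positions follow the Armv8-A layout of DAIF and ESR_ELx.

ESR_EC_SERROR = 0x2F  # exception class reported for an SError interrupt


def decode_daif(value):
    """Return, per exception type, whether it is masked (True) or not."""
    bit_positions = {"D": 9, "A": 8, "I": 7, "F": 6}
    return {name: bool((value >> pos) & 1)
            for name, pos in bit_positions.items()}


def decode_esr(esr):
    """Split an ESR_ELx value into EC, IL and ISS."""
    ec = (esr >> 26) & 0x3F      # exception class, bits [31:26]
    il = (esr >> 25) & 1         # instruction length, bit 25
    iss = esr & 0x1FFFFFF        # instruction specific syndrome, bits [24:0]
    result = {"ec": ec, "il": il, "iss": iss}
    if ec == ESR_EC_SERROR:
        # For SError, ISS bit 24 (IDS) set means that the rest of the
        # ISS is implementation defined.
        result["ids"] = (iss >> 24) & 1
        result["iss_low"] = iss & 0xFFFFFF
    return result


if __name__ == "__main__":
    print(decode_daif(0xC0))         # value written by "msr daif, x0"
    print(decode_esr(0xBF000002))    # ESR_EL1 from the trace
```

For the values in the log, 0xc0 leaves D and A unmasked and keeps I
and F masked, which matches the "daIF" in the pstate line, and
0xbf000002 decodes to exception class 0x2f (SError) with IDS set and
0x2 in the low ISS bits.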

That would mean you have new hardware with, e.g., a new memory map
that has not been used before. What you describe sounds like something
in the code that runs before Linux (the boot loader) triggers the
SError. This might be an access to non-existing or not-yet-enabled
hardware, i.e. a read or write to an address that is not available
yet (or simply invalid). That is hard to debug.

If you are able to modify the code that runs before Linux (the boot
loader?), you could try to unmask SError exceptions there, too. The
exception would then be taken earlier, which narrows the search
window.

I'm not that familiar with QEMU, but could you trace which (ideally
all) hardware accesses your code performs? You could then analyse
them and check whether each one is also valid on the hardware
(Synopsys) emulation system, both in terms of the address being valid
and the hardware subsystem being enabled.
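
One way to do that check offline, assuming you can extract a list of
(address, size) accesses from the QEMU trace or the Tarmac log, is
sketched below in Python. Everything in it is hypothetical: the Region
type, the function names, and especially the example memory map, whose
addresses are placeholders and not the real memory map of your SoC.
How to extract the accesses is left out because it depends on the
trace format.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """One entry of a (hypothetical) SoC memory map."""
    name: str
    base: int
    size: int
    enabled: bool = True   # False: block exists but is not clocked/enabled yet


def classify_access(addr, length, regions):
    """Return 'ok', 'disabled:<name>' or 'unmapped' for a single access.

    An access that is not fully contained in one region (for example
    one that straddles a region boundary) is reported as 'unmapped'.
    """
    end = addr + length
    for region in regions:
        if region.base <= addr and end <= region.base + region.size:
            return "ok" if region.enabled else "disabled:" + region.name
    return "unmapped"


def suspicious_accesses(accesses, regions):
    """Keep only the accesses that are not plainly valid."""
    flagged = []
    for addr, length in accesses:
        verdict = classify_access(addr, length, regions)
        if verdict != "ok":
            flagged.append((addr, length, verdict))
    return flagged


if __name__ == "__main__":
    # Placeholder memory map and accesses, for illustration only.
    memory_map = [
        Region("dram", 0x4000_0000, 0x1000_0000),
        Region("uart0", 0x1000_0000, 0x1000),
        Region("timer", 0x1000_1000, 0x1000, enabled=False),
    ]
    trace = [(0x4000_0100, 8), (0x1000_1004, 4), (0x2000_0000, 4)]
    for addr, length, verdict in suspicious_accesses(trace, memory_map):
        print(f"{addr:#x} (+{length}): {verdict}")
```

Keeping the "enabled" flag separate from the address range is
deliberate: it lets the same pass report both kinds of problem
mentioned above, an address that does not exist at all and a block
that exists but is not enabled at that point of the boot.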

Hth,

Dirk


> From the Linux source:
> 	parse_early_param();
> 
> 	dynamic_scs_init();
> 
> 	/*
> 	 * Unmask asynchronous aborts and fiq after bringing up possible
> 	 * earlycon. (Report possible System Errors once we can report this
> 	 * occurred).
> 	 */
> 	local_daif_restore(DAIF_PROCCTX_NOIRQ); <---- This is when we get the exception.
> 
> After some kernel hacking (replacing printk) we could extract the logs:
> 6Booting Linux on physical CPU 0x0000000000 [0x410fd034]
> 5Linux version 6.5.0 (pliops@dev-liorw) (aarch64-buildroot-linux-gnu-gcc.br_real (Buildroot 2023.02.1-95-g8391404e23) 11.3.0, GNU ld (GNU Binutils) 2.38) #101 SMP Sun Dec 17 20:09:06 IST 2023
> 6Machine model: Pliops Spider MK-I EVK
> 2SError Interrupt on CPU0, code 0x00000000bf000002 -- SError
> CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0 #101
> Hardware name: Pliops Spider MK-I EVK (DT)
> pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : setup_arch+0x13c/0x5ac
> lr : setup_arch+0x134/0x5ac
> sp : ffff8000808f3da0
> x29: ffff8000808f3da0c x28: 0000000008758074c x27: 0000000005e31b58c
> x26: 0000000000000001c x25: 0000000007e5f728c x24: ffff8000808f8000c
> x23: ffff8000808f8600c x22: ffff8000807b6000c x21: ffff800080010000c
> x20: ffff800080a1e000c x19: fffffbfffddfe190c x18: 000000002266684ac
> x17: 00000000fcad60bbc x16: 0000000000001800c x15: 0000000000000008c
> x14: ffffffffffffffffc x13: 0000000000000000c x12: 0000000000000003c
> x11: 0101010101010101c x10: ffffffffffee87dfc x9 : 0000000000000038c
> x8 : 0101010101010101c x7 : 7f7f7f7f7f7f7f7fc x6 : 0000000000000001c
> x5 : 0000000000000000c x4 : 8000000000000000c x3 : 0000000000000065c
> x2 : 0000000000000000c x1 : 0000000000000000c x0 : 00000000000000c0c
> 0Kernel panic - not syncing: Asynchronous SError Interrupt
> CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0 #101
> Hardware name: Pliops Spider MK-I EVK (DT)
> Call trace:
>   dump_backtrace+0x9c/0xd0
>   show_stack+0x14/0x1c
>   dump_stack_lvl+0x44/0x58
>   dump_stack+0x14/0x1c
>   panic+0x2e0/0x33c
>   nmi_panic+0x68/0x6c
>   arm64_serror_panic+0x68/0x78
>   do_serror+0x24/0x54
>   el1h_64_error_handler+0x2c/0x40
>   el1h_64_error+0x64/0x68
>   setup_arch+0x13c/0x5ac
>   start_kernel+0x5c/0x5b8
>   __primary_switched+0xb4/0xbc
> 0---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---
> 
> Can you please advise how to proceed with debugging?
> 
> Thanks in advance,
> Cheers,
> Lior.
> 
> 

