
On 9/1/20 7:23 AM, Heinrich Schuchardt wrote:
On 01.09.20 13:14, Sean Anderson wrote:
On 9/1/20 6:51 AM, Heinrich Schuchardt wrote:
When resetting the sipeed_maix_bitm_defconfig without the patch I see a crash:
=> reset resetting ... Unhandled exception: Illegal instruction EPC: 0000000000000000 RA: 000000008000021a TVAL: 00000002d947c509 ### ERROR ### Please RESET the board ###
Hm, interesting. I don't see that error. What commit are you testing with?
origin/master 23e333a5c083a000d0cabc
sipeed_maix_bitm_defconfig plus
CONFIG_DEBUG_UART=y CONFIG_DEBUG_UART_SIFIVE=y CONFIG_DEBUG_UART_BASE=0x38000000 CONFIG_DEBUG_UART_CLOCK=390000000
on a Maixduino.
Ok, I can reproduce that crash, but only with the debug uart enabled. I suspect it happens every time, but we only get an error with the debug uart. I enabled CONFIG_SHOW_REGS, but there was some interference between the output and the (next) instance of U-Boot.
Best regards
Heinrich
Objdump shows that it is related to the secondary_hart_loop:
j secondary_hart_loop 8000021a: b7ed j 80000204
<secondary_hart_loop>
It is actually related to the previous instruction
80000216: 56e000ef jal ra,80000784 <handle_ipi>
secondary_hart_loop is never reached because handle_ipi jumps to 0x0.
I decided to do a bit more experimentation with the following patch
diff --git i/arch/riscv/lib/smp.c w/arch/riscv/lib/smp.c index ac22136314..8be320f6ae 100644 --- i/arch/riscv/lib/smp.c +++ w/arch/riscv/lib/smp.c @@ -54,6 +54,10 @@ static int send_ipi_many(struct ipi_data *ipi, int wait) gd->arch.ipi[reg].arg0 = ipi->arg0; gd->arch.ipi[reg].arg1 = ipi->arg1;
+ printf("sending %lx(%lx, %lx) to hart %u\n", ipi->addr, + ipi->arg0, ipi->arg1, reg); + + __smp_mb(); ret = riscv_send_ipi(reg); if (ret) { pr_err("Cannot send IPI to hart %d\n", reg); @@ -77,15 +81,23 @@ void handle_ipi(ulong hart) { int ret; void (*smp_function)(ulong hart, ulong arg0, ulong arg1); + int ipi;
if (hart >= CONFIG_NR_CPUS) return; + riscv_get_ipi(hart, &ipi);
__smp_mb();
smp_function = (void (*)(ulong, ulong, ulong))gd->arch.ipi[hart].addr; invalidate_icache_all();
+ printf("calling %p(%lx, %lx) on hart %lu, ipi = %d\n", smp_function, + gd->arch.ipi[hart].arg0, gd->arch.ipi[hart].arg1, hart, ipi); + + if (!ipi) + return; + /* * Clear the IPI to acknowledge the request before jumping to the * requested function.
With this patch I get the following output
=> reset resetting ... calling 0000000000000000(0, 0) on hart 1, ipi = 0 calling 0000000000000000(0, 0) on hart 1, ipi = 0 calling 0000000000000000(0, 0) on hart 1, ipi = 0 calling 0000000000000000(0, 0) on hart 1, ipi = 0 calling 0000000000000000(0, 0) on hart 1, ipi = 1
U-Boot 2020.10-rc3-00033-g23e333a5c0-dirty (Sep 01 2020 - 09:55:43 -0400)
DRAM: 8 MiB sending 805cc1fa(80589040, 8058cdd0) to hart 1 In: serial@38000000 Out: serial@38000000 Err: serial@38000000 =>
There is no "calling" line after the "sending" line because hart 1 is hung from jumping to 0x0.
From this log, it is clear that something other than smp_call_function
is sending an ipi, and, further, that there is a delay between when the IRQ_M_SOFT bit is asserted and when the clint has a pending interrupt.
To further investigate this, I enabled debug output. This causes some garbling in the console, as both harts try and write at the same time. To alleviate this, I applied the following patch which adds locking to puts and putc (NB this only works on K210)
diff --git i/common/console.c w/common/console.c index 8e50af1f9d..31622df9b3 100644 --- i/common/console.c +++ w/common/console.c @@ -515,7 +515,35 @@ static inline void pre_console_puts(const char *s) {} static inline void print_pre_console_buffer(int flushpoint) {} #endif
-void putc(const char c) +/* Static location which will survive across resets */ +u32 *console_spinlock = (void *)0x80400000; + +void console_lock(void) +{ + u32 tmp = 1; + + /* + * We only ever write 1 to the lock, so if anything else is there then + * this is the first access to the spinlock, and we have thus acquired + * it. This will not work if the spinlock is ever (un)initialized to + * exactly 1, but I'll take my chances :) + */ + while (tmp == 1) + asm volatile ("amoswap.w.aq %0, %0, %1" + : "+r" (tmp), "+A" (console_spinlock) + : + : "memory"); +} + +void console_unlock(void) +{ + asm volatile ("amoswap.w.rl x0, x0, %0" + : "+A" (console_spinlock) + : + : "memory"); +} + +static void _putc(const char c) { #ifdef CONFIG_SANDBOX /* sandbox can send characters to stdout before it has a console */ @@ -563,7 +591,14 @@ void putc(const char c) } }
-void puts(const char *s) +void putc(const char c) +{ + console_lock(); + _putc(c); + console_unlock(); +} + +static void _puts(const char *s) { #ifdef CONFIG_SANDBOX /* sandbox can send characters to stdout before it has a console */ @@ -614,6 +649,13 @@ void puts(const char *s) } }
+void puts(const char *s) +{ + console_lock(); + _puts(s); + console_unlock(); +} + #ifdef CONFIG_CONSOLE_RECORD int console_record_init(void) {
With this patch applied the output is [1] (and also far too long for this email). Note that hart 1 first recieves a software interrupt on line 46, which is also the first output after reset. The next line (initcall: 0000000080005cc0) is from initcall_run_list calling initf_bootstage, the first initcall after log_init. This indicates to me that the pending interrupt eithe was triggered by start.S, or was never cleared. Perhaps line 69 (csrc MODE_PREFIX(ip), t0) is a no-op.
The clint only indicates an interrupt is present on line 723. Line 722 (initcall: 000000008001ce3a) indicates a call to timer_init. This is the initcall immediately following arch_cpu_init_dm. Although this function initializes the IPI, afact it does not make any writes to it. This does suggest that some kind of effect does happen.
--Sean
[1] https://gist.github.com/Forty-Bot/e010480c34f31f0ecc150bbacd552ede