
Hi Rick,
On Tue, 2019-03-12 at 09:15 +0800, Rick Chen wrote:
Hi Lukas
Auer, Lukas lukas.auer@aisec.fraunhofer.de 於 2019年3月11日 週一 上午2:12寫道:
On Sun, 2019-03-10 at 20:24 +0530, Anup Patel wrote:
On Sun, Mar 10, 2019 at 7:28 PM Auer, Lukas lukas.auer@aisec.fraunhofer.de wrote:
Hi Rick,
On Thu, 2019-03-07 at 17:30 +0800, Rick Chen wrote:
Hi Lukas
> From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > Sent: Tuesday, February 12, 2019 6:14 AM > To: u-boot@lists.denx.de > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; > Palmer > Dabbelt; > Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi > Chen(陳建志); > Baruch Siach; > Stefan Roese > Subject: [PATCH 5/7] riscv: add support for multi-hart > systems > > On RISC-V, all harts boot independently. To be able to > run on > a > multi-hart system, > U-Boot must be extended with the functionality to manage > all > harts in the > system. A new config option, CONFIG_MAIN_HART, is used to > select > the hart > U-Boot runs on. All other harts are halted. > U-Boot can delegate functions to them using > smp_call_function(). > > Every hart has a valid pointer to the global data > structure > and a > 8KiB stack by > default. The stack size is set with > CONFIG_STACK_SIZE_SHIFT. > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de > > > --- > > arch/riscv/Kconfig | 12 +++++ > arch/riscv/cpu/start.S | 102 > ++++++++++++++++++++++++++++++++++- > arch/riscv/include/asm/csr.h | 1 + > 3 files changed, 114 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index > 3a51339c4d..af8d0f8d67 > 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -140,4 +140,16 @@ config SBI_IPI > default y if RISCV_SMODE > depends on SMP > > +config MAIN_HART > + int "Main hart in system" > + default 0 > + help > + Some SoCs include harts of various sizes, some of > which > might not > + be suitable for running U-Boot. CONFIG_MAIN_HART > is > used > to select > + the hart U-Boot runs on. > + > +config STACK_SIZE_SHIFT > + int > + default 13 > + > endmenu > diff --git a/arch/riscv/cpu/start.S > b/arch/riscv/cpu/start.S > index > a30f6f7194..ce7230df37 100644 > --- a/arch/riscv/cpu/start.S > +++ b/arch/riscv/cpu/start.S > @@ -13,6 +13,7 @@ > #include <config.h> > #include <common.h> > #include <elf.h> > +#include <asm/csr.h> > #include <asm/encoding.h> > #include <generated/asm-offsets.h> > > @@ -45,6 +46,23 @@ _start: > /* mask all interrupts */ > csrw MODE_PREFIX(ie), zero > > +#ifdef CONFIG_SMP > + /* check if hart is within range */ > + /* s0: hart id */ > + li t0, CONFIG_NR_CPUS > + bge s0, t0, hart_out_of_bounds_loop > +#endif > + > +#ifdef CONFIG_SMP > + /* set xSIE bit to receive IPIs */ > +#ifdef CONFIG_RISCV_MMODE > + li t0, MIE_MSIE > +#else > + li t0, SIE_SSIE > +#endif > + csrs MODE_PREFIX(ie), t0 > +#endif > + > /* > * Set stackpointer in internal/ex RAM to call > board_init_f > */ > @@ -56,7 +74,25 @@ call_board_init_f: > call_board_init_f_0: > mv a0, sp > jal board_init_f_alloc_reserve > + > + /* > + * Set global data pointer here for all harts, > uninitialized at this > + * point. > + */ > + mv gp, a0 > + > + /* setup stack */ > +#ifdef CONFIG_SMP > + /* s0: hart id */ > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > + sub sp, a0, t0 > +#else > mv sp, a0 > +#endif > + > + /* Continue on main hart, others branch to > secondary_hart_loop */ > + li t0, CONFIG_MAIN_HART > + bne s0, t0, secondary_hart_loop > > la t0, prior_stage_fdt_address > SREG s1, 0(t0) > @@ -95,7 +131,14 @@ relocate_code: > *Set up the stack > */ > stack_setup: > +#ifdef CONFIG_SMP > + /* s0: hart id */ > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > + sub sp, s2, t0 > +#else > mv sp, s2 > +#endif > + > la t0, _start > sub t6, s4, t0 /* t6 <- relocation > offset > */ > beq t0, s4, clear_bss /* skip relocation > */ > @@ -175,13 +218,30 @@ clear_bss: > add t0, t0, t6 /* t0 <- rel > __bss_start in > RAM */ > la t1, __bss_end /* t1 <- rel > __bss_end > in > FLASH */ > add t1, t1, t6 /* t1 <- rel > __bss_end > in > RAM */ > - beq t0, t1, call_board_init_r > + beq t0, t1, relocate_secondary_harts > > clbss_l: > SREG zero, 0(t0) /* clear loop... */ > addi t0, t0, REGBYTES > bne t0, t1, clbss_l > > +relocate_secondary_harts: > +#ifdef CONFIG_SMP > + /* send relocation IPI */ > + la t0, secondary_hart_relocate > + add a0, t0, t6 > + > + /* store relocation offset */ > + mv s5, t6 > + > + mv a1, s2 > + mv a2, s3 > + jal smp_call_function > + > + /* restore relocation offset */ > + mv t6, s5 > +#endif > + > /* > * We are done. Do not return, instead branch to second > part > of > board > * initialization, now running from RAM. > @@ -202,3 +262,43 @@ call_board_init_r: > * jump to it ... > */ > jr t4 /* jump to > board_init_r() > */ > + > +#ifdef CONFIG_SMP > +hart_out_of_bounds_loop: > + /* Harts in this loop are out of bounds, increase > CONFIG_NR_CPUS. */ > + wfi > + j hart_out_of_bounds_loop > +#endif > + > +#ifdef CONFIG_SMP > +/* SMP relocation entry */ > +secondary_hart_relocate: > + /* a1: new sp */ > + /* a2: new gd */ > + /* s0: hart id */ > + > + /* setup stack */ > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > + sub sp, a1, t0 > + > + /* update global data pointer */ > + mv gp, a2 > +#endif > + > +secondary_hart_loop: > + wfi > + > +#ifdef CONFIG_SMP > + csrr t0, MODE_PREFIX(ip) > +#ifdef CONFIG_RISCV_MMODE > + andi t0, t0, MIE_MSIE > +#else > + andi t0, t0, SIE_SSIE > +#endif > + beqz t0, secondary_hart_loop > + > + mv a0, s0 > + jal handle_ipi
I found that s0 maybe corrupted after execute handle_ipi. Because smp_function will be treated as a return function by compiler, so compiler will generate codes to execute restore after smp_function().
But actually it is a no-return function. So there maybe no chance to execute restore. And s0 will be corrupted somehow.
The usage of s0 in v2 flow seem the same as v1. So I reply mail in v1 patch.
Thanks for reporting this issue! What compiler version are you using? I never saw this issue with GCC 8.2.0, because optimization removes the ret following the call to smp_function(), it is called using jr. The registers are therefore restored before jumping to smp_function().
I found this issue with Andestech's toolchain (GCC version is 7.3.0) And it's code gen will be as below:
00000e94 <handle_ipi>: e94: 4785 c.li a5,1 e96: 04a7e663 bltu a5,a0,ee2 <handle_ipi+0x4e> e9a: 456362ef jal t0,372f0 <__riscv_save_0> e9e: 84aa c.mv s1,a0 ea0: 3539 c.jal cae <riscv_clear_ipi> ea2: cd19 c.beqz a0,ec0 <handle_ipi+0x2c> ea4: 0003d517 auipc a0,0x3d ea8: 7c450513 addi a0,a0,1988 # 3e668 <freqs.8638+0x430> eac: 17c320ef jal ra,33028 <printf> eb0: 0003d517 auipc a0,0x3d eb4: 7b850513 addi a0,a0,1976 # 3e668 <freqs.8638+0x430> eb8: 170320ef jal ra,33028 <printf> ebc: 4583606f j 37314 <__riscv_restore_0> ec0: 47b1 c.li a5,12 ec2: 02f48433 mul s0,s1,a5 s0 : 1 => 0xc ec6: 003407b3 add a5,s0,gp eca: 0bc7a903 lw s2,188(a5) # 800000bc <__bss_end+0x7ff8f718> ece: 3b21 c.jal be6
<invalidate_icache_all> ed0: 008187b3 add a5,gp,s0 ed4: 0c47a603 lw a2,196(a5) ed8: 0c07a583 lw a1,192(a5) edc: 8526 c.mv a0,s1 ede: 9902 c.jalr s2 s2=0x3ff8f194 , s0=0xcsou ee0: bff1 c.j ebc <handle_ipi+0x28> ee2: 8082 c.jr ra
As I said last mail, compiler will treat smp_function() as a return call, so restore() will be executed after it. This behavior shall comply with the calling convention.
Thanks for your understanding and tolerant.
No problem, I was just asking out of interest. Thanks for providing the toolchain output above!
This system is probably too fragile. A simple solution would be to store the hart ID somewhere else, for example register tp. What do you think?
I think you can also use scratch/mscratch (if it is not used anywhere else).
You are right, sscratch and mscratch are also not currently used in U-Boot. Is there an advantage to use the scratch CSRs instead of tp?
I will prefer to use tp. xscratch shall consider mode distinguish additionally.
Ok, I will update my series and use tp instead of s0.
Thanks, Lukas