Re: RISCV: the mechanism of available_harts may cause other harts boot failure

6 Sep 2022

Hi all,
...
On Mon, 5 Sep 2022 11:30:38 -0400
Sean Anderson seanga2@gmail.com wrote:
...
On 9/5/22 3:47 AM, Nikita Shubin wrote:
...
Hi Rick!
On Mon, 5 Sep 2022 14:22:41 +0800
Rick Chen rickchen36@gmail.com wrote:
...
Hi,
When I free-run an SMP system, I once hit a failure case where some
harts didn't boot to the kernel shell successfully.
However, I could not reproduce it again, no matter how many times I tried.
But when I set a breakpoint while debugging with GDB, it triggers
the failure case every time.
If a hart fails to register itself in available_harts before
send_ipi_many is hit by the main hart:
https://elixir.bootlin.com/u-boot/v2022.10-rc3/source/arch/riscv/lib/smp.c#L...
it won't exit the secondary_hart_loop:
https://elixir.bootlin.com/u-boot/v2022.10-rc3/source/arch/riscv/cpu/start.S...
as no IPI will be sent to it.
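The race described above can be modelled in a few lines of plain C. This is only a sketch under stated assumptions, not the U-Boot code: `available_harts` here is an ordinary variable standing in for `gd->arch.available_harts`, `NR_HARTS`, `register_hart()`, `model_send_ipi_many()` and `run_late_hart_scenario()` are hypothetical names, and the real code uses a lock and runs on several harts at once.

```c
#include <assert.h>
#include <stdint.h>

#define NR_HARTS 8 /* assumption: size of the modelled system */

/* Simplified stand-in for gd->arch.available_harts. */
static uint32_t available_harts;

/* Models what each hart does early in boot: set its own bit. */
static void register_hart(unsigned int hart)
{
	assert(hart < 32); /* the mask in this model is 32 bits wide */
	available_harts |= UINT32_C(1) << hart;
}

/*
 * Models the boot hart's loop: harts whose bit is not yet set are
 * skipped. Returns the mask of harts that would have been sent an IPI.
 */
static uint32_t model_send_ipi_many(unsigned int boot_hart)
{
	uint32_t sent = 0;

	for (unsigned int hart = 0; hart < NR_HARTS; hart++) {
		/* skip the hart we are running on */
		if (hart == boot_hart)
			continue;
		/* skip if hart is not available */
		if (!(available_harts & (UINT32_C(1) << hart)))
			continue;
		sent |= UINT32_C(1) << hart;
	}
	return sent;
}

/* Harts 0..3 register in time; hart 4 is stalled, e.g. by a breakpoint. */
static uint32_t run_late_hart_scenario(void)
{
	uint32_t sent;

	available_harts = 0;
	for (unsigned int hart = 0; hart <= 3; hart++)
		register_hart(hart);

	sent = model_send_ipi_many(0); /* boot hart is hart 0 */

	register_hart(4); /* too late: the IPIs have already gone out */
	return sent;
}
```

In this model hart 4 ends up with its bit set in the mask but was never sent an IPI, which corresponds to a hart that stays in secondary_hart_loop.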
This might be exactly your case.
When working on the IPI mechanism, I considered this possibility.
However, there's really no way to know how long to wait. On normal
systems, the boot hart is going to do a lot of work before calling
send_ipi_many, and the other harts just have to make it through ~100
instructions. So I figured we would never run into this issue.
We might not even need the mask... the only direct reason we might is
for OpenSBI, as spl_invoke_opensbi is the only function which uses
the wait parameter.
Actually I think available_harts is duplicated by the device tree,
so we can:

1) drop registering harts in start.S (and the related lock completely)
2) fill gd->arch.available_harts in send_ipi_many relying on the device
tree, and also make riscv_send_ipi non-fatal
3) move this procedure to the very end just before spl_invoke_opensbi
4) maybe even wrap all of the above in some CONFIG option which enforces
checking that harts are alive, otherwise just pass the device tree harts
count
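A rough sketch of points 2) and the non-fatal send, in standalone C rather than against the real U-Boot APIs. Everything named here is an assumption for illustration: `example_regs` stands in for the "reg" values of the cpu nodes as read from the device tree, and `harts_mask_from_dt()`, `send_ipi_all()` and `stub_send_ipi()` are hypothetical helpers, not existing functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: "reg" values of the cpu nodes found in the device tree. */
static const uint32_t example_regs[] = { 0, 1, 2, 3 };

/* Hypothetical helper: build the hart mask from device tree hart IDs. */
static uint32_t harts_mask_from_dt(const uint32_t *regs, size_t count)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < count; i++) {
		assert(regs[i] < 32); /* the mask in this model is 32 bits wide */
		mask |= UINT32_C(1) << regs[i];
	}
	return mask;
}

/* Callback type standing in for the platform's IPI send routine. */
typedef int (*send_ipi_fn)(unsigned int hart);

/* Stub for illustration only: pretend hart 3 cannot be reached. */
static int stub_send_ipi(unsigned int hart)
{
	return hart == 3 ? -1 : 0;
}

/*
 * Non-fatal variant: a failed send is counted and the loop moves on
 * to the next hart instead of aborting. Returns the failure count.
 */
static unsigned int send_ipi_all(uint32_t mask, unsigned int boot_hart,
				 send_ipi_fn send)
{
	unsigned int failed = 0;

	for (unsigned int hart = 0; hart < 32; hart++) {
		if (hart == boot_hart || !(mask & (UINT32_C(1) << hart)))
			continue;
		if (send(hart) != 0)
			failed++;
	}
	return failed;
}
```

With the mask derived from the device tree, a hart that is slow to start no longer depends on having registered itself before the boot hart reaches the send loop.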
Thanks for all of your discussion and advice.
I would like to make the available_harts check optional, with something
like CONFIG_SEND_IPI_BY_DTS_CPUS.
It can help avoid the SMP boot failure and will not affect other
platforms.
#ifndef CONFIG_SEND_IPI_BY_DTS_CPUS
		/* skip if hart is not available */
		if (!(gd->arch.available_harts & (1 << reg)))
			continue;
#endif
Any opinions?
Thanks,
Rick
...
I think the mechanism of available_harts does not provide a method
that guarantees the success of the SMP system.
Maybe we should think of a better way to do the SMP boot, or just
remove it?
I haven't experienced any unexplained problem with hart_lottery or
available_harts_lock unless:

1) harts are started non-simultaneously
2) SPL/U-Boot is in some kind of TCM, OCRAM, etc... which is not
cleared on reset which leaves available_harts dirty
XIP, of course, has this problem every time and just doesn't use the
mask. I remember thinking a lot about how to deal with this, but I
never ended up sending a patch because I didn't have an XIP system.
It can be in some part emulated by setting up SPL region as
read-only via PMP before start.
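The "dirty available_harts" case can be illustrated with a small standalone C model. This is a sketch under assumptions, not U-Boot code: `available_harts` is a plain variable standing in for `gd->arch.available_harts`, and `model_reset()`, `register_hart()` and `stale_bits_after_reboot()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for gd->arch.available_harts. */
static uint32_t available_harts;

/* Models a reset: the mask is only zeroed if the memory is cleared. */
static void model_reset(bool ram_cleared_on_reset)
{
	if (ram_cleared_on_reset)
		available_harts = 0;
	/* otherwise the previous boot's bits survive the reset */
}

/* Models a hart setting its own bit early in boot. */
static void register_hart(unsigned int hart)
{
	assert(hart < 32); /* the mask in this model is 32 bits wide */
	available_harts |= UINT32_C(1) << hart;
}

/*
 * First boot: harts 0..3 register. After the reset only harts 0 and 1
 * have registered so far. Returns the bits that look "available" even
 * though those harts have not registered during the current boot.
 */
static uint32_t stale_bits_after_reboot(bool ram_cleared_on_reset)
{
	uint32_t this_boot = 0;

	available_harts = 0;
	for (unsigned int hart = 0; hart <= 3; hart++)
		register_hart(hart);

	model_reset(ram_cleared_on_reset);

	for (unsigned int hart = 0; hart <= 1; hart++) {
		register_hart(hart);
		this_boot |= UINT32_C(1) << hart;
	}
	return available_harts & ~this_boot;
}
```

When the memory is not cleared, harts 2 and 3 still appear in the mask after the reset, so the boot hart would treat them as ready before they have actually registered.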
...
--Sean
...

3) something is wrong with atomics

Also there might be something wrong with IPI send/receive.
...
Thread 8 hit Breakpoint 1, harts_early_init ()
(gdb) c
Continuing.
[Switching to Thread 7]
Thread 7 hit Breakpoint 1, harts_early_init ()
(gdb)
Continuing.
[Switching to Thread 6]
Thread 6 hit Breakpoint 1, harts_early_init ()
(gdb)
Continuing.
[Switching to Thread 5]
Thread 5 hit Breakpoint 1, harts_early_init ()
(gdb)
Continuing.
[Switching to Thread 4]
Thread 4 hit Breakpoint 1, harts_early_init ()
(gdb)
Continuing.
[Switching to Thread 3]
Thread 3 hit Breakpoint 1, harts_early_init ()
(gdb)
Continuing.
[Switching to Thread 2]
Thread 2 hit Breakpoint 1, harts_early_init ()
(gdb)
Continuing.
[Switching to Thread 1]
Thread 1 hit Breakpoint 1, harts_early_init ()
(gdb)
Continuing.
[Switching to Thread 5]
Thread 5 hit Breakpoint 3, 0x0000000001200000 in ?? ()
(gdb) info threads
  Id   Target Id         Frame
  1    Thread 1 (hart 1) secondary_hart_loop () at arch/riscv/cpu/start.S:436
  2    Thread 2 (hart 2) secondary_hart_loop () at arch/riscv/cpu/start.S:436
  3    Thread 3 (hart 3) secondary_hart_loop () at arch/riscv/cpu/start.S:436
  4    Thread 4 (hart 4) secondary_hart_loop () at arch/riscv/cpu/start.S:436
  5    Thread 5 (hart 5) 0x0000000001200000 in ?? ()
  6    Thread 6 (hart 6) 0x000000000000b650 in ?? ()
  7    Thread 7 (hart 7) 0x000000000000b650 in ?? ()
  8    Thread 8 (hart 8) 0x0000000000005fa0 in ?? ()

(gdb) c
Continuing.
Do all the "offline" harts remain in the SPL/U-Boot
secondary_hart_loop?
...
[    0.175619] smp: Bringing up secondary CPUs ...
[    1.230474] CPU1: failed to come online
[    2.282349] CPU2: failed to come online
[    3.334394] CPU3: failed to come online
[    4.386783] CPU4: failed to come online
[    4.427829] smp: Brought up 1 node, 4 CPUs
/root # cat /proc/cpuinfo
processor       : 0
hart            : 4
isa     : rv64i2p0m2p0a2p0c2p0xv5-1p1
mmu             : sv39
processor       : 5
hart            : 5
isa     : rv64i2p0m2p0a2p0c2p0xv5-1p1
mmu             : sv39
processor       : 6
hart            : 6
isa     : rv64i2p0m2p0a2p0c2p0xv5-1p1
mmu             : sv39
processor       : 7
hart            : 7
isa     : rv64i2p0m2p0a2p0c2p0xv5-1p1
mmu             : sv39
/root #
Thanks,
Rick