
On 11.08.20 12:32, Sean Anderson wrote:
On 8/11/20 3:50 AM, Heinrich Schuchardt wrote:
On 11.08.20 08:20, Rick Chen wrote:
Hi Heinrich
From: Heinrich Schuchardt [mailto:xypron.glpk@gmx.de] Sent: Tuesday, August 11, 2020 11:57 AM To: Rick Jian-Zhi Chen(陳建志) Cc: Sean Anderson; Lukas Auer; Simon Glass; Anup Patel; Daniel Schwierzeck; u-boot@lists.denx.de; Heinrich Schuchardt Subject: [PATCH 1/1] riscv: don't jump to 0x0 in handle_ipi()
At least on the Kendryte K210:
gd->arch.available_harts= 0x0000000000000003 arch.ipi[0].addr= gd->0x0000000000000000 arch.ipi[0].arg0= 0x0000000000000000 gd->arch.ipi[0].arg1= 0x0000000000000000 arch.ipi[1].addr= gd->0x0000000000000000 arch.ipi[1].arg0= 0x0000000000000000 gd->arch.ipi[1].arg1= 0x0000000000000000
We should not jump to 0x0 to handle an interrupt.
Can you explain why K210 be affected by it ?
I have been running U-Boot on the MaixDuino.
Without this patch secondary_hart_loop() is reached only once. With the patch it is reached several thousand times.
Hm, interesting. To me, this is a symptom of something else going terribly wrong. I originally had this check in place so that it would be easier to detect these sorts of errors. I don't think this is the correct fix, however. We should really try and find the root cause of the bug.
I would not expect NULL to contain any code that should be executed by the secondary hart. See doc/board/sipeed/maix.rst:
Address Size Description ========== ========= =========== 0x00000000 0x1000 debug 0x00001000 0x1000 rom 0x02000000 0xC000 clint
Signed-off-by: Heinrich Schuchardt xypron.glpk@gmx.de
arch/riscv/lib/smp.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/riscv/lib/smp.c b/arch/riscv/lib/smp.c index ac22136314..d725fa32e8 100644 --- a/arch/riscv/lib/smp.c +++ b/arch/riscv/lib/smp.c @@ -96,6 +96,8 @@ void handle_ipi(ulong hart) return; }
if (!smp_function)
return; smp_function(hart, gd->arch.ipi[hart].arg0, gd->arch.ipi[hart].arg1); }
I remember Sean add this check in [v10,14/21] riscv: Clean up IPI initialization code . And I ask him to remove. https://patchwork.ozlabs.org/project/uboot/patch/20200503024637.327733-15-se...
Your comment was: "Please remove the sanity check. If zero address is the intended jump point, then system will work abnormal."
The only place where gd->arch.ipi[hart].addr is set is:
arch/riscv/lib/smp.c:53 send_ipi_many(): gd->arch.ipi[reg].addr = ipi->addr;
send_ipi_many() is only called in smp_call_function().
So the line
smp_function(hart, gd->arch.ipi[hart].arg0, gd->arch.ipi[hart].arg1);
can only work if smp_function() has been called before this point at any time. The start code only calls it in spl_secondary_hart_stack_gd_setup().
Can you retry with [1]? I think Bin Meng removed a call to sbi_init_ipi in [2] as part of a larger revert. The actual revert-of-revert is in [3], though it really should be split out into its own patch. The original commit making these changes is [4].
Patch series [1] makes not difference. You still have
gd->arch.ipi[0].addr= 0x0000000000000000 gd->arch.ipi[1].addr= 0x0000000000000000
and the secondary hart jumps to NULL and never returns.
Note that the situation before [4] was that the IPI got initialized by the first hart to call any IPI function. If that hart was not the boot hart, Bad Things started to happen (e.g. race conditions, memory corruption, etc). In that patch, I moved the initialization to its own function so we would not have any race conditions and instead have (easier-to-debug imo) calls to handle_ipi with bogus arguments. Of course, when everything is working properly, we should get neither of these.
[1] https://patchwork.ozlabs.org/project/uboot/list/?series=193047 [2] https://patchwork.ozlabs.org/project/uboot/patch/1595225827-23674-1-git-send... [3] https://patchwork.ozlabs.org/project/uboot/patch/20200729095636.1077054-5-se... [4] https://patchwork.ozlabs.org/project/uboot/patch/20200521161503.384823-15-se...
Why do we call handle_ipi() on the secondary hart at all?
Presumably to handle the IPI it got sent? Sorry, I'm confused as to what you're getting at.
I cannot see anything to be done by a secondary hart in case of a software interrupt.
Best regards
Heinrich
Where do you expect it to jump if U-Boot is the first binary loaded?
Even if spl_secondary_hart_stack_gd_setup() has initialized the jump address to secondary_hart_relocate(), do we want the secondary hart to execute it again and again?
--Sean