
On Thu, Dec 30, 2021 at 01:55:15AM +0800, Xiang W wrote:
On Wed, 2021-12-29 at 17:23 +0800, Leo Liang wrote:
Hi Xiang,
On Wed, Dec 22, 2021 at 07:32:53AM +0800, Xiang W wrote:
Various RISC-V specifications allow the number of harts to be greater than 32. The current limit of 32 is determined by gd->arch.available_harts. We can eliminate this limitation by using a bitmap. With this patch, the number of harts is limited to 4095, which is the limit set by the RISC-V Advanced Core Local Interruptor Specification.
Tested on SiFive Unmatched.
Signed-off-by: Xiang W <wxjstz@126.com>
Changes since v1:
- When NR_CPUS is very large, the value of GD_AVAILABLE_HARTS will
  overflow the immediate range of ld/lw. This patch fixes this problem.
 arch/riscv/Kconfig                   |  4 ++--
 arch/riscv/cpu/start.S               | 21 ++++++++++++++++-----
 arch/riscv/include/asm/global_data.h |  4 +++-
 arch/riscv/lib/smp.c                 |  2 +-
 4 files changed, 22 insertions(+), 9 deletions(-)
diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S
index 76850ec9be..92f3b78f29 100644
--- a/arch/riscv/cpu/start.S
+++ b/arch/riscv/cpu/start.S
@@ -166,11 +166,22 @@ wait_for_gd_init:
 	mv	gp, s0
 	/* register available harts in the available_harts mask */
-	li	t1, 1
-	sll	t1, t1, tp
-	LREG	t2, GD_AVAILABLE_HARTS(gp)
-	or	t2, t2, t1
-	SREG	t2, GD_AVAILABLE_HARTS(gp)
+	li	t1, GD_AVAILABLE_HARTS
+	add	t1, t1, gp
+	LREG	t1, 0(t1)
+#if defined(CONFIG_ARCH_RV64I)
+	srli	t2, tp, 6
+	slli	t2, t2, 3
+#elif defined(CONFIG_ARCH_RV32I)
+	srli	t2, tp, 5
+	slli	t2, t2, 2
+#endif
+	add	t1, t1, t2
+	LREG	t2, 0(t1)
+	li	t3, 1
+	sll	t3, t3, tp
This seems incorrect. Shouldn't we have "$tp % sizeof(ulong)" instead of "$tp / sizeof(ulong)" ?
Do you mean "$tp % sizeof(ulong)" instead of "$tp"?
The RISC-V specification describes this as follows (the first passage is from the RV32I chapter, the second from the RV64I chapter):
SLL, SRL, and SRA perform logical left, logical right, and arithmetic right shifts on the value in register rs1 by the shift amount held in the lower 5 bits of register rs2.
SLL, SRL, and SRA perform logical left, logical right, and arithmetic right shifts on the value in register rs1 by the shift amount held in register rs2. In RV64I, only the low 6 bits of rs2 are considered for the shift amount.
So we don’t need to perform the remainder operation.
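For illustration, the indexing done by the new assembly can be modelled in C roughly as below. This is only a sketch under assumptions: `mark_hart_available` and `BITS_PER_ULONG` are hypothetical names that do not appear in the patch, and the bitmap is assumed to be a plain array of unsigned long. Unlike `sll`, C does not truncate the shift count implicitly (shifting by the type width or more is undefined), so the mask has to be written out here.

```c
#include <limits.h>

/* Hypothetical name, not from the patch: bits in one bitmap word (XLEN). */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper, not from the patch: set the bit for hartid in map. */
static void mark_hart_available(unsigned long *map, unsigned long hartid)
{
	/* The srli/slli pair in start.S: pick the word holding this hart's bit. */
	unsigned long word = hartid / BITS_PER_ULONG;

	/*
	 * sll only looks at the low 5 (RV32I) or 6 (RV64I) bits of tp.
	 * In C the same truncation must be explicit.
	 */
	unsigned long bit = hartid & (BITS_PER_ULONG - 1);

	map[word] |= 1UL << bit;
}
```

With a zeroed four-word map, marking hart 0 and hart BITS_PER_ULONG + 3 sets bit 0 of map[0] and bit 3 of map[1].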
Got it! Thanks for the explanation.
LGTM,
Reviewed-by: Leo Yu-Chi Liang <ycliang@andestech.com>
Best regards, Leo
regards, Xiang W
+	or	t2, t2, t3
+	SREG	t2, 0(t1)
 	amoswap.w.rl	zero, zero, 0(t0)
Best regards, Leo