[U-Boot] [PATCH 00/10] sunxi: PSCI implementation rewrite in C

Hi everyone,
This series rewrites the Allwinner/sunxi PSCI implementation in C, to make it easier to maintain and extend for the currently unsupported multi-cluster SoCs. The SMP code in the BSP kernels is in C. Having the PSCI code in C as well will make it easier to work on.
To be able to convert the platform bits to C, some common PSCI functions have to be fixed up according to the ARM calling conventions. Function declarations are also needed.
This series is based on sunxi/next. Parts of it will likely conflict with the effort to support PSCI 1.0 on the Freescale LS102xA.
Patch 1 fixes up psci_get_cpu_stack_top.
Patch 2 fixes up the PSCI version of v7_flush_dcache_all.
Patch 3 adds function declarations for some of the common PSCI functions.
Patch 4 fixes issues with reserving memory for the secure section.
Patch 5 unifies the CPUCFG_BASE macro names for various sunxi platforms.
Patch 6 groups cpu core related controls together into one struct per core. This makes it straightforward to access the controls by the cpu index.
Patch 7 adds a missing header to cpucfg.h.
Patch 8 adds some missing fields to cpucfg, which were used in the assembly code.
Patch 9 adds the base address for the GIC.
Patch 10 is the new PSCI implementation in C. Almost all of the code is converted, with the exception of initial setup of the stack.
Regards
ChenYu

Chen-Yu Tsai (10):
  ARM: PSCI: use only r0 and r3 in psci_get_cpu_stack_top()
  ARM: PSCI: save and restore clobbered registers in v7_flush_dcache_all
  ARM: PSCI: export common PSCI function declarations for C code
  ARM: allocate extra space for PSCI stack in secure section during link phase
  sunxi: Make CPUCFG_BASE macro names the same across families
  sunxi: Group cpu core related controls together
  sunxi: Add missing linux/types.h header for cpucfg.h
  sunxi: Add CPUCFG debug lock and sun7i cpu power controls
  sunxi: Add base address for GIC
  sunxi: Add PSCI implementation in C

 arch/arm/cpu/armv7/psci.S                          |  20 +-
 arch/arm/cpu/armv7/sunxi/Makefile                  |   7 +-
 arch/arm/cpu/armv7/sunxi/psci.c                    | 229 ++++++++++++++++++
 arch/arm/cpu/armv7/sunxi/psci_head.S               |  61 +++++
 arch/arm/cpu/armv7/sunxi/psci_sun6i.S              | 262 ---------------------
 arch/arm/cpu/armv7/sunxi/psci_sun7i.S              | 237 -------------------
 arch/arm/cpu/u-boot.lds                            |   3 +
 arch/arm/include/asm/arch-sunxi/cpu_sun4i.h        |  17 +-
 .../asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h}    |  38 +--
 arch/arm/include/asm/arch-sunxi/prcm.h             |   6 +-
 arch/arm/include/asm/psci.h                        |   8 +
 11 files changed, 350 insertions(+), 538 deletions(-)
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci.c
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci_head.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun6i.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun7i.S
 rename arch/arm/include/asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h} (69%)

For psci_get_cpu_stack_top() to be usable in C code, it must adhere to the ARM calling conventions. Since it could be called when the stack is still unavailable, and the entry code to linux also expects r1 and r2 to remain unchanged, stick to r0 and r3.
Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/cpu/armv7/psci.S | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/arm/cpu/armv7/psci.S b/arch/arm/cpu/armv7/psci.S
index 87c0c0b6f5eb..cdd001fe3fb0 100644
--- a/arch/arm/cpu/armv7/psci.S
+++ b/arch/arm/cpu/armv7/psci.S
@@ -196,15 +196,15 @@ ENDPROC(psci_cpu_off_common)
 
 @ expects CPU ID in r0 and returns stack top in r0
 ENTRY(psci_get_cpu_stack_top)
-	mov	r5, #0x400			@ 1kB of stack per CPU
-	mul	r0, r0, r5
-
-	ldr	r5, =psci_text_end		@ end of monitor text
-	add	r5, r5, #0x2000			@ Skip two pages
-	lsr	r5, r5, #12			@ Align to start of page
-	lsl	r5, r5, #12
-	sub	r5, r5, #4			@ reserve 1 word for target PC
-	sub	r0, r5, r0			@ here's our stack!
+	mov	r3, #0x400			@ 1kB of stack per CPU
+	mul	r0, r0, r3
+
+	ldr	r3, =psci_text_end		@ end of monitor text
+	add	r3, r3, #0x2000			@ Skip two pages
+	lsr	r3, r3, #12			@ Align to start of page
+	lsl	r3, r3, #12
+	sub	r3, r3, #4			@ reserve 1 word for target PC
+	sub	r0, r3, r0			@ here's our stack!
 
 	bx	lr
 ENDPROC(psci_get_cpu_stack_top)
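
For readers following along in C, the address arithmetic performed by psci_get_cpu_stack_top() can be modelled roughly as below. This is only a sketch and not part of the series: stack_top_sketch() and the two macro names are made up for illustration, and the text_end parameter stands in for the psci_text_end symbol.

```c
#include <stdint.h>

#define SKETCH_STACK_PER_CPU	0x400u	/* 1 kB of stack per CPU */
#define SKETCH_STACK_SKIP	0x2000u	/* skip two pages past the monitor text */

/*
 * Hypothetical C model of the assembly routine: takes the end of the
 * monitor text and a CPU index, returns that CPU's stack top.
 */
static uint32_t stack_top_sketch(uint32_t text_end, uint32_t cpu)
{
	uint32_t top = text_end + SKETCH_STACK_SKIP;

	top = (top >> 12) << 12;	/* lsr/lsl #12: align down to a page */
	top -= 4;			/* reserve one word for the target PC */

	return top - cpu * SKETCH_STACK_PER_CPU;
}
```

With an illustrative text end of 0x4a001234, CPU 0 gets 0x4a002ffc and each further CPU sits 0x400 below the previous one.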

Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/cpu/armv7/psci.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/cpu/armv7/psci.S b/arch/arm/cpu/armv7/psci.S
index cdd001fe3fb0..ab408378fcae 100644
--- a/arch/arm/cpu/armv7/psci.S
+++ b/arch/arm/cpu/armv7/psci.S
@@ -110,6 +110,7 @@ ENDPROC(psci_get_cpu_id)
 
 /* Imported from Linux kernel */
 LENTRY(v7_flush_dcache_all)
+	stmfd	sp!, {r4-r5, r7, r9-r11, lr}
 	dmb					@ ensure ordering with previous memory accesses
 	mrc	p15, 1, r0, c0, c0, 1		@ read clidr
 	ands	r3, r0, #0x7000000		@ extract loc from clidr
@@ -153,6 +154,7 @@ finished:
 	mcr	p15, 2, r10, c0, c0, 0		@ select current cache level in cssr
 	dsb	st
 	isb
+	ldmfd	sp!, {r4-r5, r7, r9-r11, lr}
 	bx	lr
 ENDPROC(v7_flush_dcache_all)

Some common PSCI functions are written in assembly, but it should be possible to use them from C code.
Add function declarations for C code to consume.
Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/include/asm/psci.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/include/asm/psci.h b/arch/arm/include/asm/psci.h
index 128a606444fe..5db33562a299 100644
--- a/arch/arm/include/asm/psci.h
+++ b/arch/arm/include/asm/psci.h
@@ -33,6 +33,14 @@
 #define ARM_PSCI_RET_DENIED		(-3)
 
 #ifndef __ASSEMBLY__
+#include <asm/types.h>
+#include <linux/compiler.h>
+
+void __section("._secure.text") psci_cpu_entry(void);
+u32 __section("._secure.text") psci_get_cpu_id(void);
+u32 __section("._secure.text") psci_get_cpu_stack_top(int cpu);
+void __section("._secure.text") psci_cpu_off_common(void);
+
 int psci_update_dt(void *fdt);
 void psci_board_init(void);
 #endif /* ! __ASSEMBLY__ */

On 23/05/16 13:41, Chen-Yu Tsai wrote:
> Some common PSCI functions are written in assembly, but it should be possible to use them from C code.
>
> Add function declarations for C code to consume.
>
> Signed-off-by: Chen-Yu Tsai wens@csie.org
> ---
>  arch/arm/include/asm/psci.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/arm/include/asm/psci.h b/arch/arm/include/asm/psci.h
> index 128a606444fe..5db33562a299 100644
> --- a/arch/arm/include/asm/psci.h
> +++ b/arch/arm/include/asm/psci.h
> @@ -33,6 +33,14 @@
>  #define ARM_PSCI_RET_DENIED		(-3)
>
>  #ifndef __ASSEMBLY__
> +#include <asm/types.h>
> +#include <linux/compiler.h>
> +
> +void __section("._secure.text") psci_cpu_entry(void);
> +u32 __section("._secure.text") psci_get_cpu_id(void);
> +u32 __section("._secure.text") psci_get_cpu_stack_top(int cpu);
> +void __section("._secure.text") psci_cpu_off_common(void);

I may be wrong, but I don't think the section matters for prototypes. It is only at the location where the code is actually generated that it is actually useful.

Thanks,

	M.

On Tue, May 24, 2016 at 5:58 PM, Marc Zyngier marc.zyngier@arm.com wrote:
> On 23/05/16 13:41, Chen-Yu Tsai wrote:
>> Some common PSCI functions are written in assembly, but it should be possible to use them from C code.
>>
>> Add function declarations for C code to consume.
>>
>> Signed-off-by: Chen-Yu Tsai wens@csie.org
>> ---
>>  arch/arm/include/asm/psci.h | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/psci.h b/arch/arm/include/asm/psci.h
>> index 128a606444fe..5db33562a299 100644
>> --- a/arch/arm/include/asm/psci.h
>> +++ b/arch/arm/include/asm/psci.h
>> @@ -33,6 +33,14 @@
>>  #define ARM_PSCI_RET_DENIED		(-3)
>>
>>  #ifndef __ASSEMBLY__
>> +#include <asm/types.h>
>> +#include <linux/compiler.h>
>> +
>> +void __section("._secure.text") psci_cpu_entry(void);
>> +u32 __section("._secure.text") psci_get_cpu_id(void);
>> +u32 __section("._secure.text") psci_get_cpu_stack_top(int cpu);
>> +void __section("._secure.text") psci_cpu_off_common(void);
>
> I may be wrong, but I don't think the section matters for prototypes. It is only at the location where the code is actually generated that it is actually useful.

You're right. Dropping the section attributes here.

ChenYu
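
To illustrate the point being agreed on here, a minimal sketch follows. It assumes a GCC-compatible toolchain targeting ELF; demo_get_cpu_id() and the section name ".demo_secure.text" are invented for the example. For the routines in this series, which are implemented in assembly, placement would be decided by the section directives in the .S file, so the header only needs plain prototypes.

```c
/* What a header has to provide: a plain prototype, no section attribute. */
unsigned int demo_get_cpu_id(void);

/*
 * Where the code is generated: the definition names the section.
 * Both the function and the section name are hypothetical.
 */
__attribute__((section(".demo_secure.text")))
unsigned int demo_get_cpu_id(void)
{
	return 0;	/* placeholder body for the sketch */
}
```

Callers that only see the prototype link against the function as usual; they do not need to know which section it lives in.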

The PSCI implementation expects at most 2 pages worth of space reserved at the end of the secure section for its stacks. This was not properly marked and taken into consideration when reserving memory from the kernel.
If one accesses PSCI after Linux has fully booted, the memory that should have been reserved for the PSCI stacks may have been used by the kernel or userspace, and would be corrupted. Observed after effects include the system hanging or telinit core dumping when trying to reboot. It seems the init process gets hit the most on my test bed.
This fix is only a stop gap. It would be better to rework the stack allocation mechanism, maybe with proper usage of CONFIG_ macros and an explicit symbol.
Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/cpu/u-boot.lds | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm/cpu/u-boot.lds b/arch/arm/cpu/u-boot.lds
index cfab8b041234..c7f37b606ad5 100644
--- a/arch/arm/cpu/u-boot.lds
+++ b/arch/arm/cpu/u-boot.lds
@@ -67,6 +67,9 @@ SECTIONS
 		SIZEOF(.__secure_start) +
 		SIZEOF(.secure_text);
 
+	/* Align to page boundary and skip 2 pages */
+	. = (. & ~ 0xfff) + 0x2000;
+
 	__secure_end_lma = .;
 	.__secure_end : AT(__secure_end_lma) {
 		*(.__secure_end)
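
The location-counter expression added to the linker script can be checked with a small C model. This is a sketch under assumptions, not code from the series: secure_end_sketch() and lowest_stack_sketch() are hypothetical names, the input address is illustrative, and the second helper assumes 1 kB stacks growing down from the end of the reserved area, ignoring the extra word kept for the target PC.

```c
#include <stdint.h>

/* Model of ". = (. & ~ 0xfff) + 0x2000;": round down to a page, add 8 kB. */
static uint32_t secure_end_sketch(uint32_t dot)
{
	return (dot & ~0xfffu) + 0x2000u;
}

/* Lowest address touched by n_cpus stacks of 1 kB each (assumed layout). */
static uint32_t lowest_stack_sketch(uint32_t dot, uint32_t n_cpus)
{
	return secure_end_sketch(dot) - n_cpus * 0x400u;
}
```

For a secure text section ending at 0x4a001234, the reserved area ends at 0x4a003000, and four 1 kB stacks reach down to 0x4a002000, which is still above the end of the text.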

On 23/05/16 13:41, Chen-Yu Tsai wrote:
> The PSCI implementation expects at most 2 pages worth of space reserved at the end of the secure section for its stacks. This was not properly marked and taken into consideration when reserving memory from the kernel.
>
> If one accesses PSCI after Linux has fully booted, the memory that should have been reserved for the PSCI stacks may have been used by the kernel or userspace, and would be corrupted. Observed after effects include the system hanging or telinit core dumping when trying to reboot. It seems the init process gets hit the most on my test bed.
>
> This fix is only a stop gap. It would be better to rework the stack allocation mechanism, maybe with proper usage of CONFIG_ macros and an explicit symbol.
>
> Signed-off-by: Chen-Yu Tsai wens@csie.org
> ---
>  arch/arm/cpu/u-boot.lds | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/arm/cpu/u-boot.lds b/arch/arm/cpu/u-boot.lds
> index cfab8b041234..c7f37b606ad5 100644
> --- a/arch/arm/cpu/u-boot.lds
> +++ b/arch/arm/cpu/u-boot.lds
> @@ -67,6 +67,9 @@ SECTIONS
>  		SIZEOF(.__secure_start) +
>  		SIZEOF(.secure_text);
>
> +	/* Align to page boundary and skip 2 pages */
> +	. = (. & ~ 0xfff) + 0x2000;
> +
>  	__secure_end_lma = .;
>  	.__secure_end : AT(__secure_end_lma) {
>  		*(.__secure_end)

Something worries me here. The PSCI stacks are on the secure side (in your case in SRAM), and shouldn't be part of the u-boot binary. If Linux sees some corruption, that's because you're not putting the stacks where they should be, and that's where the issue is.

One possible bug would be if the stack address computation is done using absolute addresses from one of the labels, and not using PC-relative addresses.

And crucially, this:

+	ldr	r3, =psci_text_end		@ end of monitor text

which was introduced by 4c681a3d22f0 ("ARM: Factor out reusable psci_get_cpu_stack_top").

Unless you actually relocate this value, this will base your stack in RAM, corrupting the hell out of whatever is there, and moving the goalpost by 8kB is just papering over the issue.

The original code was:

+	adr	r5, text_end		@ end of text
+	add	r5, r5, #0x2000		@ Skip two pages
+	lsr	r5, r5, #12		@ Align to start of page
+	lsl	r5, r5, #12
+	sub	sp, r5, r4		@ here's our stack!

which had its own share of bugs, but was actually safe, thanks to the use of 'adr' and not 'ldr'.

Can you please check whether this value gets relocated?

Thanks,

	M.
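
The 'adr' versus 'ldr' concern can be pictured with a toy model in C. Everything here is an assumption made for illustration: the struct, the helper names and the addresses are hypothetical, and the model only captures the idea that an absolute literal keeps the link-time address unless something relocates it, while a PC-relative computation follows the code to wherever it runs.

```c
#include <stdint.h>

/* Hypothetical description of where a code blob was linked and where it runs. */
struct image_sketch {
	uint32_t link_base;	/* address the section was linked at */
	uint32_t run_base;	/* address the section executes from */
};

/* "ldr rX, =label": the literal holds the link-time address (unrelocated). */
static uint32_t label_absolute(const struct image_sketch *img, uint32_t offset)
{
	return img->link_base + offset;
}

/* "adr rX, label": derived from the current PC, so it tracks the run address. */
static uint32_t label_pc_relative(const struct image_sketch *img, uint32_t offset)
{
	return img->run_base + offset;
}
```

If the two bases differ and nothing fixes up the literal, label_absolute() points into the link-time region, which is the situation the review asks to rule out.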

On 24/05/16 11:21, Marc Zyngier wrote:
> On 23/05/16 13:41, Chen-Yu Tsai wrote:
>> The PSCI implementation expects at most 2 pages worth of space reserved at the end of the secure section for its stacks. This was not properly marked and taken into consideration when reserving memory from the kernel.
>>
>> If one accesses PSCI after Linux has fully booted, the memory that should have been reserved for the PSCI stacks may have been used by the kernel or userspace, and would be corrupted. Observed after effects include the system hanging or telinit core dumping when trying to reboot. It seems the init process gets hit the most on my test bed.
>>
>> This fix is only a stop gap. It would be better to rework the stack allocation mechanism, maybe with proper usage of CONFIG_ macros and an explicit symbol.
>>
>> Signed-off-by: Chen-Yu Tsai wens@csie.org
>> ---
>>  arch/arm/cpu/u-boot.lds | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/arm/cpu/u-boot.lds b/arch/arm/cpu/u-boot.lds
>> index cfab8b041234..c7f37b606ad5 100644
>> --- a/arch/arm/cpu/u-boot.lds
>> +++ b/arch/arm/cpu/u-boot.lds
>> @@ -67,6 +67,9 @@ SECTIONS
>>  		SIZEOF(.__secure_start) +
>>  		SIZEOF(.secure_text);
>>
>> +	/* Align to page boundary and skip 2 pages */
>> +	. = (. & ~ 0xfff) + 0x2000;
>> +
>>  	__secure_end_lma = .;
>>  	.__secure_end : AT(__secure_end_lma) {
>>  		*(.__secure_end)
>
> Something worries me here. The PSCI stacks are on the secure side (in your case in SRAM), and shouldn't be part of the u-boot binary. If Linux sees some corruption, that's because you're not putting the stacks where they should be, and that's where the issue is.
>
> One possible bug would be if the stack address computation is done using absolute addresses from one of the labels, and not using PC-relative addresses.
>
> And crucially, this:
>
> +	ldr	r3, =psci_text_end		@ end of monitor text
>
> which was introduced by 4c681a3d22f0 ("ARM: Factor out reusable psci_get_cpu_stack_top").
>
> Unless you actually relocate this value, this will base your stack in RAM, corrupting the hell out of whatever is there, and moving the goalpost by 8kB is just papering over the issue.
>
> The original code was:
>
> +	adr	r5, text_end		@ end of text
> +	add	r5, r5, #0x2000		@ Skip two pages
> +	lsr	r5, r5, #12		@ Align to start of page
> +	lsl	r5, r5, #12
> +	sub	sp, r5, r4		@ here's our stack!
>
> which had its own share of bugs, but was actually safe, thanks to the use of 'adr' and not 'ldr'.
>
> Can you please check whether this value gets relocated?

I had a check by building a semi-recent u-boot (that is, one that actually builds), and the relocation seems to be correct (I've forced a call to relocate_secure_section() in an unsuspecting command). I feel relieved.

So this bug only affects systems that have their PSCI in main memory. Maybe a CONFIG_ALLOCATE_PSCI_STACK_IN_RAM would be in order so that systems with SRAM do not have to see their u-boot grow by another 8kB?

Thanks,

	M.

Hi,

On Tue, May 24, 2016 at 9:58 PM, Marc Zyngier marc.zyngier@arm.com wrote:
> On 24/05/16 11:21, Marc Zyngier wrote:
>> On 23/05/16 13:41, Chen-Yu Tsai wrote:
>>> The PSCI implementation expects at most 2 pages worth of space reserved at the end of the secure section for its stacks. This was not properly marked and taken into consideration when reserving memory from the kernel.
>>>
>>> If one accesses PSCI after Linux has fully booted, the memory that should have been reserved for the PSCI stacks may have been used by the kernel or userspace, and would be corrupted. Observed after effects include the system hanging or telinit core dumping when trying to reboot. It seems the init process gets hit the most on my test bed.
>>>
>>> This fix is only a stop gap. It would be better to rework the stack allocation mechanism, maybe with proper usage of CONFIG_ macros and an explicit symbol.
>>>
>>> Signed-off-by: Chen-Yu Tsai wens@csie.org
>>> ---
>>>  arch/arm/cpu/u-boot.lds | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/arch/arm/cpu/u-boot.lds b/arch/arm/cpu/u-boot.lds
>>> index cfab8b041234..c7f37b606ad5 100644
>>> --- a/arch/arm/cpu/u-boot.lds
>>> +++ b/arch/arm/cpu/u-boot.lds
>>> @@ -67,6 +67,9 @@ SECTIONS
>>>  		SIZEOF(.__secure_start) +
>>>  		SIZEOF(.secure_text);
>>>
>>> +	/* Align to page boundary and skip 2 pages */
>>> +	. = (. & ~ 0xfff) + 0x2000;
>>> +
>>>  	__secure_end_lma = .;
>>>  	.__secure_end : AT(__secure_end_lma) {
>>>  		*(.__secure_end)
>>
>> Something worries me here. The PSCI stacks are on the secure side (in your case in SRAM), and shouldn't be part of the u-boot binary. If Linux sees some corruption, that's because you're not putting the stacks where they should be, and that's where the issue is.
>>
>> One possible bug would be if the stack address computation is done using absolute addresses from one of the labels, and not using PC-relative addresses.
>>
>> And crucially, this:
>>
>> +	ldr	r3, =psci_text_end		@ end of monitor text
>>
>> which was introduced by 4c681a3d22f0 ("ARM: Factor out reusable psci_get_cpu_stack_top").
>>
>> Unless you actually relocate this value, this will base your stack in RAM, corrupting the hell out of whatever is there, and moving the goalpost by 8kB is just papering over the issue.
>>
>> The original code was:
>>
>> +	adr	r5, text_end		@ end of text
>> +	add	r5, r5, #0x2000		@ Skip two pages
>> +	lsr	r5, r5, #12		@ Align to start of page
>> +	lsl	r5, r5, #12
>> +	sub	sp, r5, r4		@ here's our stack!
>>
>> which had its own share of bugs, but was actually safe, thanks to the use of 'adr' and not 'ldr'.
>>
>> Can you please check whether this value gets relocated?
>
> I had a check by building a semi-recent u-boot (that is, one that actually builds), and the relocation seems to be correct (I've forced a call to relocate_secure_section() in an unsuspecting command). I feel relieved.
>
> So this bug only affects systems that have their PSCI in main memory. Maybe a CONFIG_ALLOCATE_PSCI_STACK_IN_RAM would be in order so that systems with SRAM do not have to see their u-boot grow by another 8kB?

Maybe we could just put the new macro in the "#ifndef CONFIG_ARMV7_SECURE_BASE" above? The code gets relocated if CONFIG_ARMV7_SECURE_BASE is set, and the region is not reserved. I think the current status is that if one uses CONFIG_ARMV7_SECURE_BASE then it should be secure SRAM/DRAM.

I'll also make it clear in the commit message that this only affects systems that put PSCI in main memory.

Sorry for the confusion.

Regards
ChenYu

P.S. I wonder if we should do a size check for the secure section?

On 24/05/16 16:49, Chen-Yu Tsai wrote:
> Hi,
>
> On Tue, May 24, 2016 at 9:58 PM, Marc Zyngier marc.zyngier@arm.com wrote:
>> On 24/05/16 11:21, Marc Zyngier wrote:
>>> On 23/05/16 13:41, Chen-Yu Tsai wrote:
>>>> The PSCI implementation expects at most 2 pages worth of space reserved at the end of the secure section for its stacks. This was not properly marked and taken into consideration when reserving memory from the kernel.
>>>>
>>>> If one accesses PSCI after Linux has fully booted, the memory that should have been reserved for the PSCI stacks may have been used by the kernel or userspace, and would be corrupted. Observed after effects include the system hanging or telinit core dumping when trying to reboot. It seems the init process gets hit the most on my test bed.
>>>>
>>>> This fix is only a stop gap. It would be better to rework the stack allocation mechanism, maybe with proper usage of CONFIG_ macros and an explicit symbol.
>>>>
>>>> Signed-off-by: Chen-Yu Tsai wens@csie.org
>>>> ---
>>>>  arch/arm/cpu/u-boot.lds | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/arch/arm/cpu/u-boot.lds b/arch/arm/cpu/u-boot.lds
>>>> index cfab8b041234..c7f37b606ad5 100644
>>>> --- a/arch/arm/cpu/u-boot.lds
>>>> +++ b/arch/arm/cpu/u-boot.lds
>>>> @@ -67,6 +67,9 @@ SECTIONS
>>>>  		SIZEOF(.__secure_start) +
>>>>  		SIZEOF(.secure_text);
>>>>
>>>> +	/* Align to page boundary and skip 2 pages */
>>>> +	. = (. & ~ 0xfff) + 0x2000;
>>>> +
>>>>  	__secure_end_lma = .;
>>>>  	.__secure_end : AT(__secure_end_lma) {
>>>>  		*(.__secure_end)
>>>
>>> Something worries me here. The PSCI stacks are on the secure side (in your case in SRAM), and shouldn't be part of the u-boot binary. If Linux sees some corruption, that's because you're not putting the stacks where they should be, and that's where the issue is.
>>>
>>> One possible bug would be if the stack address computation is done using absolute addresses from one of the labels, and not using PC-relative addresses.
>>>
>>> And crucially, this:
>>>
>>> +	ldr	r3, =psci_text_end		@ end of monitor text
>>>
>>> which was introduced by 4c681a3d22f0 ("ARM: Factor out reusable psci_get_cpu_stack_top").
>>>
>>> Unless you actually relocate this value, this will base your stack in RAM, corrupting the hell out of whatever is there, and moving the goalpost by 8kB is just papering over the issue.
>>>
>>> The original code was:
>>>
>>> +	adr	r5, text_end		@ end of text
>>> +	add	r5, r5, #0x2000		@ Skip two pages
>>> +	lsr	r5, r5, #12		@ Align to start of page
>>> +	lsl	r5, r5, #12
>>> +	sub	sp, r5, r4		@ here's our stack!
>>>
>>> which had its own share of bugs, but was actually safe, thanks to the use of 'adr' and not 'ldr'.
>>>
>>> Can you please check whether this value gets relocated?
>>
>> I had a check by building a semi-recent u-boot (that is, one that actually builds), and the relocation seems to be correct (I've forced a call to relocate_secure_section() in an unsuspecting command). I feel relieved.
>>
>> So this bug only affects systems that have their PSCI in main memory. Maybe a CONFIG_ALLOCATE_PSCI_STACK_IN_RAM would be in order so that systems with SRAM do not have to see their u-boot grow by another 8kB?
>
> Maybe we could just put the new macro in the "#ifndef CONFIG_ARMV7_SECURE_BASE" above? The code gets relocated if CONFIG_ARMV7_SECURE_BASE is set, and the region is not reserved. I think the current status is that if one uses CONFIG_ARMV7_SECURE_BASE then it should be secure SRAM/DRAM.

Yup, that'd work.

> I'll also make it clear in the commit message that this only affects systems that put PSCI in main memory.
>
> Sorry for the confusion.
>
> Regards
> ChenYu
>
> P.S. I wonder if we should do a size check for the secure section?

That'd make sense. Given how hard it has become to keep the A20 SPL under 24kB in the recent months, having a basic check on the size of the relocated part would be a good thing. Probably for a separate patch series though.

Thanks,

	M.

Use SUNXI_CPUCFG_BASE across all families. This makes writing common PSCI code easier.
Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/cpu/armv7/sunxi/psci_sun6i.S       | 16 ++++++++--------
 arch/arm/cpu/armv7/sunxi/psci_sun7i.S       |  8 ++++----
 arch/arm/include/asm/arch-sunxi/cpu_sun4i.h | 15 +++++++++++++--
 3 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/arch/arm/cpu/armv7/sunxi/psci_sun6i.S b/arch/arm/cpu/armv7/sunxi/psci_sun6i.S
index 90b5bfd35947..9752550dea35 100644
--- a/arch/arm/cpu/armv7/sunxi/psci_sun6i.S
+++ b/arch/arm/cpu/armv7/sunxi/psci_sun6i.S
@@ -73,8 +73,8 @@ psci_fiq_enter:
 	lsr	r9, r9, #10
 	and	r9, r9, #0xf
 
-	movw	r8, #(SUN6I_CPUCFG_BASE & 0xffff)
-	movt	r8, #(SUN6I_CPUCFG_BASE >> 16)
+	movw	r8, #(SUNXI_CPUCFG_BASE & 0xffff)
+	movt	r8, #(SUNXI_CPUCFG_BASE >> 16)
 
 	@ Wait for the core to enter WFI
 	lsl	r11, r9, #6		@ x64
@@ -114,8 +114,8 @@ psci_fiq_enter:
 	str	r10, [r12, #0x140]
 #endif
 
-	movw	r8, #(SUN6I_CPUCFG_BASE & 0xffff)
-	movt	r8, #(SUN6I_CPUCFG_BASE >> 16)
+	movw	r8, #(SUNXI_CPUCFG_BASE & 0xffff)
+	movt	r8, #(SUNXI_CPUCFG_BASE >> 16)
 
 	@ Unlock CPU
 	ldr	r10, [r8, #0x1e4]
@@ -139,8 +139,8 @@ psci_cpu_on:
 	str	r2, [r0]		@ store target PC at stack top
 	dsb
 
-	movw	r0, #(SUN6I_CPUCFG_BASE & 0xffff)
-	movt	r0, #(SUN6I_CPUCFG_BASE >> 16)
+	movw	r0, #(SUNXI_CPUCFG_BASE & 0xffff)
+	movt	r0, #(SUNXI_CPUCFG_BASE >> 16)
 
 	@ CPU mask
 	and	r1, r1, #3	@ only care about first cluster
@@ -189,8 +189,8 @@ psci_cpu_on:
 	str	r6, [r0, #0x100]
 
 	@ re-calculate CPU control register address
-	movw	r0, #(SUN6I_CPUCFG_BASE & 0xffff)
-	movt	r0, #(SUN6I_CPUCFG_BASE >> 16)
+	movw	r0, #(SUNXI_CPUCFG_BASE & 0xffff)
+	movt	r0, #(SUNXI_CPUCFG_BASE >> 16)
 
 	@ Deassert reset on target CPU
 	mov	r6, #3
diff --git a/arch/arm/cpu/armv7/sunxi/psci_sun7i.S b/arch/arm/cpu/armv7/sunxi/psci_sun7i.S
index e15d587f2901..ac8ebf888a4a 100644
--- a/arch/arm/cpu/armv7/sunxi/psci_sun7i.S
+++ b/arch/arm/cpu/armv7/sunxi/psci_sun7i.S
@@ -73,8 +73,8 @@ psci_fiq_enter:
 	lsr	r9, r9, #10
 	and	r9, r9, #0xf
 
-	movw	r8, #(SUN7I_CPUCFG_BASE & 0xffff)
-	movt	r8, #(SUN7I_CPUCFG_BASE >> 16)
+	movw	r8, #(SUNXI_CPUCFG_BASE & 0xffff)
+	movt	r8, #(SUNXI_CPUCFG_BASE >> 16)
 
 	@ Wait for the core to enter WFI
 	lsl	r11, r9, #6		@ x64
@@ -128,8 +128,8 @@ psci_cpu_on:
 	str	r2, [r0]		@ store target PC at stack top
 	dsb
 
-	movw	r0, #(SUN7I_CPUCFG_BASE & 0xffff)
-	movt	r0, #(SUN7I_CPUCFG_BASE >> 16)
+	movw	r0, #(SUNXI_CPUCFG_BASE & 0xffff)
+	movt	r0, #(SUNXI_CPUCFG_BASE >> 16)
 
 	@ CPU mask
 	and	r1, r1, #3	@ only care about first cluster
diff --git a/arch/arm/include/asm/arch-sunxi/cpu_sun4i.h b/arch/arm/include/asm/arch-sunxi/cpu_sun4i.h
index 65c0441fe8a2..47e327e71f84 100644
--- a/arch/arm/include/asm/arch-sunxi/cpu_sun4i.h
+++ b/arch/arm/include/asm/arch-sunxi/cpu_sun4i.h
@@ -18,6 +18,10 @@
 #define SUNXI_SRAM_D_BASE		0x00010000	/* 4 kiB */
 #define SUNXI_SRAM_B_BASE		0x00020000	/* 64 kiB (secure) */
 
+#ifdef CONFIG_MACH_SUN8I_A83T
+#define SUNXI_CPUCFG_BASE		0x01700000
+#endif
+
 #define SUNXI_SRAMC_BASE		0x01c00000
 #define SUNXI_DRAMC_BASE		0x01c01000
 #define SUNXI_DMA_BASE			0x01c02000
@@ -94,7 +98,10 @@
 
 #define SUNXI_TP_BASE			0x01c25000
 #define SUNXI_PMU_BASE			0x01c25400
-#define SUN7I_CPUCFG_BASE		0x01c25c00
+
+#ifdef CONFIG_MACH_SUN7I
+#define SUNXI_CPUCFG_BASE		0x01c25c00
+#endif
 
 #define SUNXI_UART0_BASE		0x01c28000
 #define SUNXI_UART1_BASE		0x01c28400
@@ -148,7 +155,11 @@
 
 #define SUNXI_RTC_BASE			0x01f00000
 #define SUNXI_PRCM_BASE			0x01f01400
-#define SUN6I_CPUCFG_BASE		0x01f01c00
+
+#if defined CONFIG_SUNXI_GEN_SUN6I && !defined CONFIG_MACH_SUN8I_A83T
+#define SUNXI_CPUCFG_BASE		0x01f01c00
+#endif
+
 #define SUNXI_R_TWI_BASE		0x01f02400
 #define SUNXI_R_UART_BASE		0x01f02800
 #define SUNXI_R_PIO_BASE		0x01f02c00
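
As a side note on the movw/movt pairs touched by this patch: the assembly loads the 32-bit base address in two 16-bit halves, which is why the macro appears once masked with 0xffff and once shifted right by 16. A small C sketch of that split is shown below; the helper names are made up for illustration, and the values used in the usage note are the base addresses from the header diff above.

```c
#include <stdint.h>

/* Immediate for movw: the low 16 bits of the constant. */
static uint16_t movw_imm(uint32_t value)
{
	return (uint16_t)(value & 0xffffu);
}

/* Immediate for movt: the high 16 bits of the constant. */
static uint16_t movt_imm(uint32_t value)
{
	return (uint16_t)(value >> 16);
}

/* Register contents after movw followed by movt. */
static uint32_t movw_movt_result(uint32_t value)
{
	return ((uint32_t)movt_imm(value) << 16) | movw_imm(value);
}
```

For the sun6i-generation base 0x01f01c00 the two immediates are 0x1c00 and 0x01f0; for the sun7i base 0x01c25c00 they are 0x5c00 and 0x01c2.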

Instead of listing individual registers for controls to each processor core, list them as an array of registers. This makes accessing controls by core index easier.
Also rename "cpucfg_sun6i.h" (which was unused anyway) to the more generic "cpucfg.h".
Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 .../asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h}    | 31 +++++++++-------------
 arch/arm/include/asm/arch-sunxi/prcm.h             |  6 ++---
 2 files changed, 14 insertions(+), 23 deletions(-)
 rename arch/arm/include/asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h} (69%)

diff --git a/arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h b/arch/arm/include/asm/arch-sunxi/cpucfg.h
similarity index 69%
rename from arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h
rename to arch/arm/include/asm/arch-sunxi/cpucfg.h
index e2a29cb1818e..b9084b3968cd 100644
--- a/arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h
+++ b/arch/arm/include/asm/arch-sunxi/cpucfg.h
@@ -11,33 +11,26 @@
 
 #ifndef __ASSEMBLY__
 
+struct sunxi_cpucfg_cpu {
+	u32 rst;		/* base + 0x0 */
+	u32 ctrl;		/* base + 0x4 */
+	u32 status;		/* base + 0x8 */
+	u8 res[0x34];		/* base + 0xc */
+};
+
 struct sunxi_cpucfg_reg {
 	u8 res0[0x40];		/* 0x000 */
-	u32 cpu0_rst;		/* 0x040 */
-	u32 cpu0_ctrl;		/* 0x044 */
-	u32 cpu0_status;	/* 0x048 */
-	u8 res1[0x34];		/* 0x04c */
-	u32 cpu1_rst;		/* 0x080 */
-	u32 cpu1_ctrl;		/* 0x084 */
-	u32 cpu1_status;	/* 0x088 */
-	u8 res2[0x34];		/* 0x08c */
-	u32 cpu2_rst;		/* 0x0c0 */
-	u32 cpu2_ctrl;		/* 0x0c4 */
-	u32 cpu2_status;	/* 0x0c8 */
-	u8 res3[0x34];		/* 0x0cc */
-	u32 cpu3_rst;		/* 0x100 */
-	u32 cpu3_ctrl;		/* 0x104 */
-	u32 cpu3_status;	/* 0x108 */
-	u8 res4[0x78];		/* 0x10c */
+	struct sunxi_cpucfg_cpu cpu[4];	/* 0x040 */
+	u8 res1[0x44];		/* 0x140 */
 	u32 gen_ctrl;		/* 0x184 */
 	u32 l2_status;		/* 0x188 */
-	u8 res5[0x4];		/* 0x18c */
+	u8 res2[0x4];		/* 0x18c */
 	u32 event_in;		/* 0x190 */
-	u8 res6[0xc];		/* 0x194 */
+	u8 res3[0xc];		/* 0x194 */
 	u32 super_standy_flag;	/* 0x1a0 */
 	u32 priv0;		/* 0x1a4 */
 	u32 priv1;		/* 0x1a8 */
-	u8 res7[0x54];		/* 0x1ac */
+	u8 res4[0x54];		/* 0x1ac */
 	u32 idle_cnt0_low;	/* 0x200 */
 	u32 idle_cnt0_high;	/* 0x204 */
 	u32 idle_cnt0_ctrl;	/* 0x208 */
diff --git a/arch/arm/include/asm/arch-sunxi/prcm.h b/arch/arm/include/asm/arch-sunxi/prcm.h
index 556c1af60058..2d69feb33c65 100644
--- a/arch/arm/include/asm/arch-sunxi/prcm.h
+++ b/arch/arm/include/asm/arch-sunxi/prcm.h
@@ -225,10 +225,8 @@ struct sunxi_prcm_reg {
 	u32 gpu_pwroff;		/* 0x118 */
 	u8 res9[0x4];		/* 0x11c */
 	u32 vdd_pwr_reset;	/* 0x120 */
-	u8 res10[0x20];		/* 0x124 */
-	u32 cpu1_pwr_clamp;	/* 0x144 */
-	u32 cpu2_pwr_clamp;	/* 0x148 */
-	u32 cpu3_pwr_clamp;	/* 0x14c */
+	u8 res10[0x1c];		/* 0x124 */
+	u32 cpu_pwr_clamp[4];	/* 0x140 but first one is actually unused */
 	u8 res11[0x30];		/* 0x150 */
 	u32 dram_pwr;		/* 0x180 */
 	u8 res12[0xc];		/* 0x184 */
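
One way to convince oneself that the array layout keeps every register at its old offset is a compile-time check along the following lines. This is a sketch, not part of the patch: the u8/u32 typedefs stand in for the kernel-style types, the register struct is truncated after gen_ctrl, and core_rst_offset() is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;
typedef uint32_t u32;

/* Per-core controls, as in the patch above. */
struct sunxi_cpucfg_cpu {
	u32 rst;		/* base + 0x0 */
	u32 ctrl;		/* base + 0x4 */
	u32 status;		/* base + 0x8 */
	u8 res[0x34];		/* base + 0xc */
};

/* Truncated for the sketch: the real struct continues past gen_ctrl. */
struct sunxi_cpucfg_reg {
	u8 res0[0x40];			/* 0x000 */
	struct sunxi_cpucfg_cpu cpu[4];	/* 0x040 */
	u8 res1[0x44];			/* 0x140 */
	u32 gen_ctrl;			/* 0x184 */
};

_Static_assert(sizeof(struct sunxi_cpucfg_cpu) == 0x40,
	       "each core block must span 0x40 bytes");
_Static_assert(offsetof(struct sunxi_cpucfg_reg, gen_ctrl) == 0x184,
	       "gen_ctrl must stay at 0x184");

/* Hypothetical helper: byte offset of the reset register of a given core. */
static size_t core_rst_offset(unsigned int core)
{
	return offsetof(struct sunxi_cpucfg_reg, cpu) +
	       core * sizeof(struct sunxi_cpucfg_cpu) +
	       offsetof(struct sunxi_cpucfg_cpu, rst);
}
```

The computed offsets for cores 0 to 3 are 0x40, 0x80, 0xc0 and 0x100, matching the removed cpuN_rst fields.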

On 23/05/16 13:41, Chen-Yu Tsai wrote:
> Instead of listing individual registers for controls to each processor core, list them as an array of registers. This makes accessing controls by core index easier.
>
> Also rename "cpucfg_sun6i.h" (which was unused anyway) to the more generic "cpucfg.h".
>
> Signed-off-by: Chen-Yu Tsai wens@csie.org
> ---
>  .../asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h}    | 31 +++++++++-------------
>  arch/arm/include/asm/arch-sunxi/prcm.h             |  6 ++---
>  2 files changed, 14 insertions(+), 23 deletions(-)
>  rename arch/arm/include/asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h} (69%)
>
> diff --git a/arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h b/arch/arm/include/asm/arch-sunxi/cpucfg.h
> similarity index 69%
> rename from arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h
> rename to arch/arm/include/asm/arch-sunxi/cpucfg.h
> index e2a29cb1818e..b9084b3968cd 100644
> --- a/arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h
> +++ b/arch/arm/include/asm/arch-sunxi/cpucfg.h
> @@ -11,33 +11,26 @@
>
>  #ifndef __ASSEMBLY__
>
> +struct sunxi_cpucfg_cpu {
> +	u32 rst;		/* base + 0x0 */
> +	u32 ctrl;		/* base + 0x4 */
> +	u32 status;		/* base + 0x8 */
> +	u8 res[0x34];		/* base + 0xc */
> +};

Please use the "packed" attribute. Even if you declared your structure in a way that makes sure no padding will be introduced, this also serves as a reminder that this is not your usual memory.

Same goes for the other structures in the file.

Thanks,

	M.
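
A small sketch of what the request amounts to, assuming a GCC-compatible compiler: the two struct names below are invented for the comparison, and the fields mirror the per-core block from the patch. Because every field is already naturally aligned, adding the attribute leaves the size unchanged.

```c
#include <stdint.h>

/* Layout as posted: no attribute. */
struct cpu_block_plain {
	uint32_t rst;		/* base + 0x0 */
	uint32_t ctrl;		/* base + 0x4 */
	uint32_t status;	/* base + 0x8 */
	uint8_t res[0x34];	/* base + 0xc */
};

/* Same fields with the attribute asked for in the review. */
struct cpu_block_packed {
	uint32_t rst;		/* base + 0x0 */
	uint32_t ctrl;		/* base + 0x4 */
	uint32_t status;	/* base + 0x8 */
	uint8_t res[0x34];	/* base + 0xc */
} __attribute__((packed));

_Static_assert(sizeof(struct cpu_block_plain) == 0x40,
	       "unpacked layout has no hidden padding");
_Static_assert(sizeof(struct cpu_block_packed) == 0x40,
	       "packed layout has the same size");
```

In this particular case the attribute documents intent rather than changing the layout.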

On Tue, May 24, 2016 at 4:15 PM, Marc Zyngier marc.zyngier@arm.com wrote:
> On 23/05/16 13:41, Chen-Yu Tsai wrote:
>> Instead of listing individual registers for controls to each processor core, list them as an array of registers. This makes accessing controls by core index easier.
>>
>> Also rename "cpucfg_sun6i.h" (which was unused anyway) to the more generic "cpucfg.h".
>>
>> Signed-off-by: Chen-Yu Tsai wens@csie.org
>> ---
>>  .../asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h}    | 31 +++++++++-------------
>>  arch/arm/include/asm/arch-sunxi/prcm.h             |  6 ++---
>>  2 files changed, 14 insertions(+), 23 deletions(-)
>>  rename arch/arm/include/asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h} (69%)
>>
>> diff --git a/arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h b/arch/arm/include/asm/arch-sunxi/cpucfg.h
>> similarity index 69%
>> rename from arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h
>> rename to arch/arm/include/asm/arch-sunxi/cpucfg.h
>> index e2a29cb1818e..b9084b3968cd 100644
>> --- a/arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h
>> +++ b/arch/arm/include/asm/arch-sunxi/cpucfg.h
>> @@ -11,33 +11,26 @@
>>
>>  #ifndef __ASSEMBLY__
>>
>> +struct sunxi_cpucfg_cpu {
>> +	u32 rst;		/* base + 0x0 */
>> +	u32 ctrl;		/* base + 0x4 */
>> +	u32 status;		/* base + 0x8 */
>> +	u8 res[0x34];		/* base + 0xc */
>> +};
>
> Please use the "packed" attribute. Even if you declared your structure in a way that makes sure no padding will be introduced, this also serves as a reminder that this is not your usual memory.
>
> Same goes for the other structures in the file.

OK.

Somewhat related, it seems we use (struct foo*) for accessing registers in U-boot, while in the kernel we use (void * + some offset). Could someone explain the trade-offs or preferences on this? struct foo doesn't work in assembly afaik.

Thanks.

ChenYu

On 24/05/16 17:06, Chen-Yu Tsai wrote:
On Tue, May 24, 2016 at 4:15 PM, Marc Zyngier marc.zyngier@arm.com wrote:
On 23/05/16 13:41, Chen-Yu Tsai wrote:
Instead of listing individual registers for controls to each processor core, list them as an array of registers. This makes accessing controls by core index easier.
Also rename "cpucfg_sun6i.h" (which was unused anyway) to the more generic "cpucfg.h".
Signed-off-by: Chen-Yu Tsai wens@csie.org
.../asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h} | 31 +++++++++------------- arch/arm/include/asm/arch-sunxi/prcm.h | 6 ++--- 2 files changed, 14 insertions(+), 23 deletions(-) rename arch/arm/include/asm/arch-sunxi/{cpucfg_sun6i.h => cpucfg.h} (69%)
diff --git a/arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h b/arch/arm/include/asm/arch-sunxi/cpucfg.h similarity index 69% rename from arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h rename to arch/arm/include/asm/arch-sunxi/cpucfg.h index e2a29cb1818e..b9084b3968cd 100644 --- a/arch/arm/include/asm/arch-sunxi/cpucfg_sun6i.h +++ b/arch/arm/include/asm/arch-sunxi/cpucfg.h @@ -11,33 +11,26 @@
#ifndef __ASSEMBLY__
+struct sunxi_cpucfg_cpu {
u32 rst; /* base + 0x0 */
u32 ctrl; /* base + 0x4 */
u32 status; /* base + 0x8 */
u8 res[0x34]; /* base + 0xc */
+};
Please use the "packed" attribute. Even if you declared your structure in a way that makes sure no padding will be introduced, this also serves as a reminder that this is not your usual memory.
Same goes for the other structures in the file.
OK.
Somewhat related, it seems we use (struct foo *) for accessing registers in U-Boot, while in the kernel we use (void * + some offset). Could someone explain the trade-offs or preferences here? As far as I know, struct foo doesn't work in assembly.
I personally hate the use of structures to access MMIO, because it gives people the idea that they can manipulate this just like memory. Which means that they are probably going to miss crucial barriers (on ARM), or do something completely wrong on some other architectures (have a look at SPARC and its ASIs).
But that's just my personal taste, and I don't mind people doing one or the other in code that I don't have to maintain... ;-)
Thanks,
M.
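
The two styles discussed above can be put side by side. The sketch below is host-side only and makes several assumptions: struct cpucfg_head models just the start of the register block, mmio_read32() is a hypothetical accessor standing in for readl() (which is where any barrier would belong on target), and the 0x40/0x48 offsets are taken from the old assembly (per-core block at 0x40 + 64 * cpu, status at +0x8).

```c
#include <stdint.h>

typedef uint32_t u32;	/* stand-ins for the U-Boot types */
typedef uint8_t u8;

struct sunxi_cpucfg_cpu {
	u32 rst;	/* base + 0x0 */
	u32 ctrl;	/* base + 0x4 */
	u32 status;	/* base + 0x8 */
	u8 res[0x34];	/* base + 0xc */
};

/* Only the head of the block is modelled; 0x40 bytes precede cpu[0]. */
struct cpucfg_head {
	u8 res0[0x40];
	struct sunxi_cpucfg_cpu cpu[4];
};

/* Hypothetical accessor: one volatile 32-bit load, no barrier. */
static inline u32 mmio_read32(const volatile void *addr)
{
	return *(const volatile u32 *)addr;
}

/* Style 1: struct overlay, the compiler computes the offset. */
static u32 status_via_struct(struct cpucfg_head *regs, unsigned int cpu)
{
	return mmio_read32(&regs->cpu[cpu].status);
}

/* Style 2: base + offset, the offset is spelled out by hand. */
#define CPUCFG_CPU_STATUS(n)	(0x48 + (n) * 0x40)

static u32 status_via_offset(void *base, unsigned int cpu)
{
	return mmio_read32((u8 *)base + CPUCFG_CPU_STATUS(cpu));
}
```

Both variants go through the same accessor, so the struct form by itself does not remove the need for readl()/writel(); it only changes how the address is computed.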

cpucfg.h includes a register definition for the CPUCFG register block. The types used are u32 and u8, which are defined in linux/types.h.
Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/include/asm/arch-sunxi/cpucfg.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/include/asm/arch-sunxi/cpucfg.h b/arch/arm/include/asm/arch-sunxi/cpucfg.h
index b9084b3968cd..fc42035d70be 100644
--- a/arch/arm/include/asm/arch-sunxi/cpucfg.h
+++ b/arch/arm/include/asm/arch-sunxi/cpucfg.h
@@ -9,6 +9,8 @@
 #ifndef _SUNXI_CPUCFG_H
 #define _SUNXI_CPUCFG_H
 
+#include <linux/types.h>
+
 #ifndef __ASSEMBLY__
 
 struct sunxi_cpucfg_cpu {

CPUCFG has an unlisted debug control register, which is used to disable external debug access.
Also, sun7i secondary core power controls are in CPUCFG, as there's no separate PRCM block.
Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/include/asm/arch-sunxi/cpucfg.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/arch-sunxi/cpucfg.h b/arch/arm/include/asm/arch-sunxi/cpucfg.h
index fc42035d70be..02afd8b4a09a 100644
--- a/arch/arm/include/asm/arch-sunxi/cpucfg.h
+++ b/arch/arm/include/asm/arch-sunxi/cpucfg.h
@@ -32,7 +32,12 @@ struct sunxi_cpucfg_reg {
 	u32 super_standy_flag; /* 0x1a0 */
 	u32 priv0; /* 0x1a4 */
 	u32 priv1; /* 0x1a8 */
-	u8 res4[0x54]; /* 0x1ac */
+	u8 res4[0x4]; /* 0x1ac */
+	u32 cpu1_pwr_clamp; /* 0x1b0 sun7i only */
+	u32 cpu1_pwroff; /* 0x1b4 sun7i only */
+	u8 res5[0x2c]; /* 0x1b8 */
+	u32 dbg_ctrl1; /* 0x1e4 */
+	u8 res6[0x18]; /* 0x1e8 */
 	u32 idle_cnt0_low; /* 0x200 */
 	u32 idle_cnt0_high; /* 0x204 */
 	u32 idle_cnt0_ctrl; /* 0x208 */
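
The offset comments in this hunk can be cross-checked mechanically. The sketch below is a host-side check, not part of the patch: everything below 0x1a0 is collapsed into a single placeholder array (an assumption made only for this sketch), the struct name cpucfg_tail is hypothetical, and u32/u8 are stand-in typedefs.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

typedef uint32_t u32;
typedef uint8_t u8;

/* Tail of the CPUCFG block as laid out by the hunk above. */
struct cpucfg_tail {
	u8 head[0x1a0];		/* 0x000 placeholder, not the real layout */
	u32 super_standy_flag;	/* 0x1a0 */
	u32 priv0;		/* 0x1a4 */
	u32 priv1;		/* 0x1a8 */
	u8 res4[0x4];		/* 0x1ac */
	u32 cpu1_pwr_clamp;	/* 0x1b0 sun7i only */
	u32 cpu1_pwroff;	/* 0x1b4 sun7i only */
	u8 res5[0x2c];		/* 0x1b8 */
	u32 dbg_ctrl1;		/* 0x1e4 */
	u8 res6[0x18];		/* 0x1e8 */
	u32 idle_cnt0_low;	/* 0x200 */
};

/* The split of the old res4[0x54] must leave idle_cnt0_low at 0x200. */
static_assert(offsetof(struct cpucfg_tail, cpu1_pwr_clamp) == 0x1b0, "clamp");
static_assert(offsetof(struct cpucfg_tail, cpu1_pwroff) == 0x1b4, "pwroff");
static_assert(offsetof(struct cpucfg_tail, dbg_ctrl1) == 0x1e4, "dbg_ctrl1");
static_assert(offsetof(struct cpucfg_tail, idle_cnt0_low) == 0x200, "idle_cnt0");
```

The arithmetic behind it: 0x4 + 0x4 + 0x4 + 0x2c + 0x4 + 0x18 = 0x54, the size of the reserved area being replaced.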

Instead of hardcoding the GIC addresses in the PSCI implementation, provide a base address in the cpu header.
Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/cpu/armv7/sunxi/psci_sun6i.S | 4 ++--
 arch/arm/cpu/armv7/sunxi/psci_sun7i.S | 4 ++--
 arch/arm/include/asm/arch-sunxi/cpu_sun4i.h | 2 ++
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm/cpu/armv7/sunxi/psci_sun6i.S b/arch/arm/cpu/armv7/sunxi/psci_sun6i.S
index 9752550dea35..95fdb0e58874 100644
--- a/arch/arm/cpu/armv7/sunxi/psci_sun6i.S
+++ b/arch/arm/cpu/armv7/sunxi/psci_sun6i.S
@@ -42,8 +42,8 @@
 
 #define ONE_MS (CONFIG_TIMER_CLK_FREQ / 1000)
 #define TEN_MS (10 * ONE_MS)
-#define GICD_BASE 0x1c81000
-#define GICC_BASE 0x1c82000
+#define GICD_BASE (SUNXI_GIC400_BASE + 0x1000)
+#define GICC_BASE (SUNXI_GIC400_BASE + 0x2000)
 
 .globl psci_fiq_enter
 psci_fiq_enter:
diff --git a/arch/arm/cpu/armv7/sunxi/psci_sun7i.S b/arch/arm/cpu/armv7/sunxi/psci_sun7i.S
index ac8ebf888a4a..87bbd725f0b3 100644
--- a/arch/arm/cpu/armv7/sunxi/psci_sun7i.S
+++ b/arch/arm/cpu/armv7/sunxi/psci_sun7i.S
@@ -42,8 +42,8 @@
 
 #define ONE_MS (CONFIG_TIMER_CLK_FREQ / 1000)
 #define TEN_MS (10 * ONE_MS)
-#define GICD_BASE 0x1c81000
-#define GICC_BASE 0x1c82000
+#define GICD_BASE (SUNXI_GIC400_BASE + 0x1000)
+#define GICC_BASE (SUNXI_GIC400_BASE + 0x2000)
 
 .globl psci_fiq_enter
 psci_fiq_enter:
diff --git a/arch/arm/include/asm/arch-sunxi/cpu_sun4i.h b/arch/arm/include/asm/arch-sunxi/cpu_sun4i.h
index 47e327e71f84..c5e9d88bab5c 100644
--- a/arch/arm/include/asm/arch-sunxi/cpu_sun4i.h
+++ b/arch/arm/include/asm/arch-sunxi/cpu_sun4i.h
@@ -143,6 +143,8 @@
 #define SUNXI_DRAM_PHY0_BASE 0x01c65000
 #define SUNXI_DRAM_PHY1_BASE 0x01c66000
 
+#define SUNXI_GIC400_BASE 0x01c80000
+
 /* module sram */
 #define SUNXI_SRAM_C_BASE 0x01d00000

To make the PSCI backend more maintainable and easier to port to newer SoCs, rewrite the current PSCI implementation in C.
Some inline assembly bits are required to access coprocessor registers. PSCI stack setup is the only part left completely in assembly. In theory this part could be split out of psci_arch_init into a separate common function, and psci_arch_init could be completely in C.
Signed-off-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/cpu/armv7/sunxi/Makefile | 7 +-
 arch/arm/cpu/armv7/sunxi/psci.c | 229 +++++++++++++++++++++++++++++
 arch/arm/cpu/armv7/sunxi/psci_head.S | 61 ++++++++
 arch/arm/cpu/armv7/sunxi/psci_sun6i.S | 262 ----------------------------------
 arch/arm/cpu/armv7/sunxi/psci_sun7i.S | 237 ------------------------------
 5 files changed, 292 insertions(+), 504 deletions(-)
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci.c
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci_head.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun6i.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun7i.S

diff --git a/arch/arm/cpu/armv7/sunxi/Makefile b/arch/arm/cpu/armv7/sunxi/Makefile
index 4d2274a38ed1..c2085101685b 100644
--- a/arch/arm/cpu/armv7/sunxi/Makefile
+++ b/arch/arm/cpu/armv7/sunxi/Makefile
@@ -13,11 +13,8 @@ obj-$(CONFIG_MACH_SUN6I) += tzpc.o
 obj-$(CONFIG_MACH_SUN8I_H3) += tzpc.o
 
 ifndef CONFIG_SPL_BUILD
-ifdef CONFIG_ARMV7_PSCI
-obj-$(CONFIG_MACH_SUN6I) += psci_sun6i.o
-obj-$(CONFIG_MACH_SUN7I) += psci_sun7i.o
-obj-$(CONFIG_MACH_SUN8I) += psci_sun6i.o
-endif
+obj-$(CONFIG_ARMV7_PSCI) += psci.o
+obj-$(CONFIG_ARMV7_PSCI) += psci_head.o
 endif
ifdef CONFIG_SPL_BUILD diff --git a/arch/arm/cpu/armv7/sunxi/psci.c b/arch/arm/cpu/armv7/sunxi/psci.c new file mode 100644 index 000000000000..943061937f7c --- /dev/null +++ b/arch/arm/cpu/armv7/sunxi/psci.c @@ -0,0 +1,229 @@ +/* + * Copyright (C) 2016 + * Author: Chen-Yu Tsai wens@csie.org + * + * Based on assembly code by Marc Zyngier marc.zyngier@arm.com, + * which was based on code by Carl van Schaik carl@ok-labs.com. + * + * SPDX-License-Identifier: GPL-2.0 + */ +#include <config.h> +#include <common.h> + +#include <asm/arch/cpu.h> +#include <asm/arch/cpucfg.h> +#include <asm/arch/prcm.h> +#include <asm/armv7.h> +#include <asm/gic.h> +#include <asm/io.h> +#include <asm/psci.h> +#include <asm/system.h> + +#include <linux/bitops.h> + +#define __secure __attribute__ ((section ("._secure.text"))) +#define __irq __attribute__ ((interrupt ("IRQ"))) + +#define GICD_BASE (SUNXI_GIC400_BASE + GIC_DIST_OFFSET) +#define GICC_BASE (SUNXI_GIC400_BASE + GIC_CPU_OFFSET_A15) + +static void __secure __mdelay(u32 ms) +{ + u32 reg = DIV_ROUND_UP(CONFIG_TIMER_CLK_FREQ, ms); + + /* CNTP_TVAL */ + asm volatile ("mcr p15, 0, %0, c14, c2, 0" : : "r" (reg)); + ISB; + /* CNTP_CTL */ + asm volatile ("mcr p15, 0, %0, c14, c2, 1" : : "r" (3)); + + do { + ISB; + /* CNTP_CTL */ + asm volatile ("mrc p15, 0, %0, c14, c2, 1" : "=r" (reg) : : + "cc" ); + } while (!(reg & BIT(2))); + + /* CNTP_CTL */ + asm volatile ("mcr p15, 0, %0, c14, c2, 1" : : "r" (0)); +} + +void __secure sunxi_cpu_power_off(u32 cpuid) +{ +#ifdef CONFIG_SUNXI_GEN_SUN6I + struct sunxi_prcm_reg *prcm = + (struct sunxi_prcm_reg *)SUNXI_PRCM_BASE; +#endif + struct sunxi_cpucfg_reg *cpucfg = + (struct sunxi_cpucfg_reg *)SUNXI_CPUCFG_BASE; + u32 cpu = cpuid & 0x3; + u32 tmp __maybe_unused; + + /* Wait for the core to enter WFI */ + while (1) { + if (readl(&cpucfg->cpu[cpu].status) & BIT(2)) + break; + __mdelay(1); + } + + /* Assert reset on target CPU */ + writel(0, &cpucfg->cpu[cpu].rst); + + /* Lock CPU (Disable external debug 
access) */ + clrbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu)); + +#ifdef CONFIG_MACH_SUN7I + /* Set power gating */ + setbits_le32(&cpucfg->cpu1_pwroff, BIT(0)); +#else + /* Set power gating */ + setbits_le32(&prcm->cpu_pwroff, BIT(cpu)); +#endif + +#ifdef CONFIG_MACH_SUN7I + /* Activate power clamp */ + writel(0xff, &cpucfg->cpu1_pwr_clamp); +#elif defined(CONFIG_MACH_SUN6I) || defined(CONFIG_MACH_SUN8I_H3) + /* Activate power clamp */ + writel(0xff, &prcm->cpu_pwr_clamp[cpu]); +#endif + + /* Unlock CPU (Disable external debug access) */ + setbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu)); +} + +/* + * Although this is an FIQ handler, the FIQ is processed in monitor mode, + * which means there's no FIQ banked registers. This is the same as IRQ + * mode, so use the IRQ attribute to ask the compiler to handler entry + * and return. + */ +void __secure __irq psci_fiq_enter(void) +{ + u32 scr, reg, cpu; + + /* Switch to secure mode */ + asm volatile ("mrc p15, 0, %0, c1, c1, 0" : "=r" (scr) : : "cc"); + reg = scr & ~(1 << 0); + asm volatile ("mcr p15, 0, %0, c1, c1, 0" : : "r" (reg) : "cc"); + ISB; + + /* Validate reason based on IAR and acknowledge */ + reg = readl(GICC_BASE + GICC_IAR); + + /* Skip spurious interrupts 1022 and 1023 */ + if (reg == 1023 || reg == 1022) + goto out; + + /* Acknowledge interrupt */ + writel(reg, GICC_BASE + GICC_EOIR); + DSB; + + /* Get CPU number */ + cpu = (reg >> 10) & 0xf; /* but GIC specs say only 3 bits? 
*/ + + /* Power off the CPU */ + sunxi_cpu_power_off(cpu); + +out: + /* Restore security level */ + asm volatile ("mcr p15, 0, %0, c1, c1, 0" : : "r" (scr) : "cc"); +} + +int __secure psci_cpu_on(u32 unused __always_unused, u32 mpidr, u32 pc) +{ +#ifdef CONFIG_SUNXI_GEN_SUN6I + struct sunxi_prcm_reg *prcm = + (struct sunxi_prcm_reg *)SUNXI_PRCM_BASE; +#endif + struct sunxi_cpucfg_reg *cpucfg = + (struct sunxi_cpucfg_reg *)SUNXI_CPUCFG_BASE; + u32 cpu = (mpidr & 0x3); + u32 tmp __maybe_unused; + + /* store target PC at target CPU stack top */ + writel(pc, psci_get_cpu_stack_top(cpu)); + DSB; + + /* Set secondary core power on PC */ + writel((u32)&psci_cpu_entry, &cpucfg->priv0); + + /* Assert reset on target CPU */ + writel(0, &cpucfg->cpu[cpu].rst); + + /* Invalidate L1 cache */ + clrbits_le32(&cpucfg->gen_ctrl, BIT(cpu)); + + /* Lock CPU (Disable external debug access) */ + clrbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu)); + +#ifdef CONFIG_MACH_SUN7I + /* Release power clamp */ + tmp = 0x1ff; + do { + tmp >>= 1; + writel(tmp, &cpucfg->cpu1_pwr_clamp); + } while (tmp); +#elif defined(CONFIG_MACH_SUN6I) || defined(CONFIG_MACH_SUN8I_H3) + /* Release power clamp */ + tmp = 0x1ff; + do { + tmp >>= 1; + writel(tmp, &prcm->cpu_pwr_clamp[cpu]); + } while (tmp); +#endif + + __mdelay(10); + +#ifdef CONFIG_MACH_SUN7I + /* Clear power gating */ + clrbits_le32(&cpucfg->cpu1_pwroff, BIT(0)); +#else + /* Clear power gating */ + clrbits_le32(&prcm->cpu_pwroff, BIT(cpu)); +#endif + + /* De-assert reset on target CPU */ + writel(BIT(1) | BIT(0), &cpucfg->cpu[cpu].rst); + + /* Unlock CPU (Disable external debug access) */ + setbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu)); + + return ARM_PSCI_RET_SUCCESS; +} + +void __secure psci_cpu_off(void) +{ + psci_cpu_off_common(); + + /* Ask CPU0 via SGI15 to pull the rug... 
*/ + writel(BIT(16) | 15, GICD_BASE + GICD_SGIR); + DSB; + + /* Wait to be turned off */ + while (1) + wfi(); +} + +void __secure sunxi_gic_init(void) +{ + u32 reg; + + /* SGI15 as Group-0 */ + clrbits_le32(GICD_BASE + GICD_IGROUPRn, BIT(15)); + + /* Set SGI15 priority to 0 */ + writeb(0, GICD_BASE + GICD_IPRIORITYRn + 15); + + /* Be cool with non-secure */ + writel(0xff, GICC_BASE + GICC_PMR); + + /* Switch FIQEn on */ + setbits_le32(GICC_BASE + GICC_CTLR, BIT(3)); + + asm volatile ("mrc p15, 0, %0, c1, c1, 0" : "=r" (reg) : : "cc"); + reg |= BIT(2); /* Enable FIQ in monitor mode */ + reg &= ~BIT(0); /* Secure mode */ + asm volatile ("mcr p15, 0, %0, c1, c1, 0" : : "r" (reg) : "cc"); + ISB; +} diff --git a/arch/arm/cpu/armv7/sunxi/psci_head.S b/arch/arm/cpu/armv7/sunxi/psci_head.S new file mode 100644 index 000000000000..40b350636e32 --- /dev/null +++ b/arch/arm/cpu/armv7/sunxi/psci_head.S @@ -0,0 +1,61 @@ +/* + * Copyright (C) 2013 - ARM Ltd + * Author: Marc Zyngier marc.zyngier@arm.com + * + * Based on code by Carl van Schaik carl@ok-labs.com. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. 
+ */ + +#include <config.h> + +#include <asm/arch-armv7/generictimer.h> +#include <asm/gic.h> +#include <asm/macro.h> +#include <asm/psci.h> +#include <asm/arch/cpu.h> + +/* + * Memory layout: + * + * SECURE_RAM to text_end : + * ._secure_text section + * text_end to ALIGN_PAGE(text_end): + * nothing + * ALIGN_PAGE(text_end) to ALIGN_PAGE(text_end) + 0x1000) + * 1kB of stack per CPU (4 CPUs max). + */ + + .pushsection ._secure.text, "ax" + + .arch_extension sec + +#define GICD_BASE (SUNXI_GIC400_BASE + 0x1000) +#define GICC_BASE (SUNXI_GIC400_BASE + 0x2000) + +.globl psci_arch_init +psci_arch_init: + mov r6, lr + bl psci_get_cpu_id @ CPU ID => r0 + bl psci_get_cpu_stack_top @ stack top => r0 + sub r0, r0, #4 @ Save space for target PC + mov sp, r0 + mov lr, r6 + + push {r0, r1, r2, ip, lr} + bl sunxi_gic_init + pop {r0, r1, r2, ip, pc} + + .globl psci_text_end +psci_text_end: + .popsection diff --git a/arch/arm/cpu/armv7/sunxi/psci_sun6i.S b/arch/arm/cpu/armv7/sunxi/psci_sun6i.S deleted file mode 100644 index 95fdb0e58874..000000000000 --- a/arch/arm/cpu/armv7/sunxi/psci_sun6i.S +++ /dev/null @@ -1,262 +0,0 @@ -/* - * Copyright (C) 2015 - Chen-Yu Tsai - * Author: Chen-Yu Tsai wens@csie.org - * - * Based on psci_sun7i.S by Marc Zyngier marc.zyngier@arm.com - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program. If not, see http://www.gnu.org/licenses/. 
- */ - -#include <config.h> - -#include <asm/arch-armv7/generictimer.h> -#include <asm/gic.h> -#include <asm/macro.h> -#include <asm/psci.h> -#include <asm/arch/cpu.h> - -/* - * Memory layout: - * - * SECURE_RAM to text_end : - * ._secure_text section - * text_end to ALIGN_PAGE(text_end): - * nothing - * ALIGN_PAGE(text_end) to ALIGN_PAGE(text_end) + 0x1000) - * 1kB of stack per CPU (4 CPUs max). - */ - - .pushsection ._secure.text, "ax" - - .arch_extension sec - -#define ONE_MS (CONFIG_TIMER_CLK_FREQ / 1000) -#define TEN_MS (10 * ONE_MS) -#define GICD_BASE (SUNXI_GIC400_BASE + 0x1000) -#define GICC_BASE (SUNXI_GIC400_BASE + 0x2000) - -.globl psci_fiq_enter -psci_fiq_enter: - push {r0-r12} - - @ Switch to secure - mrc p15, 0, r7, c1, c1, 0 - bic r8, r7, #1 - mcr p15, 0, r8, c1, c1, 0 - isb - - @ Validate reason based on IAR and acknowledge - movw r8, #(GICC_BASE & 0xffff) - movt r8, #(GICC_BASE >> 16) - ldr r9, [r8, #GICC_IAR] - movw r10, #0x3ff - movt r10, #0 - cmp r9, r10 @ skip spurious interrupt 1023 - beq out - movw r10, #0x3fe @ ...and 1022 - cmp r9, r10 - beq out - str r9, [r8, #GICC_EOIR] @ acknowledge the interrupt - dsb - - @ Compute CPU number - lsr r9, r9, #10 - and r9, r9, #0xf - - movw r8, #(SUNXI_CPUCFG_BASE & 0xffff) - movt r8, #(SUNXI_CPUCFG_BASE >> 16) - - @ Wait for the core to enter WFI - lsl r11, r9, #6 @ x64 - add r11, r11, r8 - -1: ldr r10, [r11, #0x48] - tst r10, #(1 << 2) - bne 2f - timer_wait r10, ONE_MS - b 1b - - @ Reset CPU -2: mov r10, #0 - str r10, [r11, #0x40] - - @ Lock CPU - mov r10, #1 - lsl r11, r10, r9 @ r11 is now CPU mask - ldr r10, [r8, #0x1e4] - bic r10, r10, r11 - str r10, [r8, #0x1e4] - - movw r8, #(SUNXI_PRCM_BASE & 0xffff) - movt r8, #(SUNXI_PRCM_BASE >> 16) - - @ Set power gating - ldr r10, [r8, #0x100] - orr r10, r10, r11 - str r10, [r8, #0x100] - timer_wait r10, ONE_MS - -#if defined(CONFIG_MACH_SUN6I) || defined(CONFIG_MACH_SUN8I_H3) - @ Activate power clamp - lsl r12, r9, #2 @ x4 - add r12, r12, r8 - mov r10, #0xff 
- str r10, [r12, #0x140] -#endif - - movw r8, #(SUNXI_CPUCFG_BASE & 0xffff) - movt r8, #(SUNXI_CPUCFG_BASE >> 16) - - @ Unlock CPU - ldr r10, [r8, #0x1e4] - orr r10, r10, r11 - str r10, [r8, #0x1e4] - - @ Restore security level -out: mcr p15, 0, r7, c1, c1, 0 - - pop {r0-r12} - subs pc, lr, #4 - - @ r1 = target CPU - @ r2 = target PC -.globl psci_cpu_on -psci_cpu_on: - push {lr} - - mov r0, r1 - bl psci_get_cpu_stack_top @ get stack top of target CPU - str r2, [r0] @ store target PC at stack top - dsb - - movw r0, #(SUNXI_CPUCFG_BASE & 0xffff) - movt r0, #(SUNXI_CPUCFG_BASE >> 16) - - @ CPU mask - and r1, r1, #3 @ only care about first cluster - mov r4, #1 - lsl r4, r4, r1 - - ldr r6, =psci_cpu_entry - str r6, [r0, #0x1a4] @ PRIVATE_REG (boot vector) - - @ Assert reset on target CPU - mov r6, #0 - lsl r5, r1, #6 @ 64 bytes per CPU - add r5, r5, #0x40 @ Offset from base - add r5, r5, r0 @ CPU control block - str r6, [r5] @ Reset CPU - - @ l1 invalidate - ldr r6, [r0, #0x184] @ CPUCFG_GEN_CTRL_REG - bic r6, r6, r4 - str r6, [r0, #0x184] - - @ Lock CPU (Disable external debug access) - ldr r6, [r0, #0x1e4] @ CPUCFG_DBG_CTL1_REG - bic r6, r6, r4 - str r6, [r0, #0x1e4] - - movw r0, #(SUNXI_PRCM_BASE & 0xffff) - movt r0, #(SUNXI_PRCM_BASE >> 16) - -#if defined(CONFIG_MACH_SUN6I) || defined(CONFIG_MACH_SUN8I_H3) - @ Release power clamp - lsl r5, r1, #2 @ 1 register per CPU - add r5, r5, r0 @ PRCM - movw r6, #0x1ff - movt r6, #0 -1: lsrs r6, r6, #1 - str r6, [r5, #0x140] @ CPUx_PWR_CLAMP - bne 1b -#endif - - timer_wait r6, TEN_MS - - @ Clear power gating - ldr r6, [r0, #0x100] @ CPU_PWROFF_GATING - bic r6, r6, r4 - str r6, [r0, #0x100] - - @ re-calculate CPU control register address - movw r0, #(SUNXI_CPUCFG_BASE & 0xffff) - movt r0, #(SUNXI_CPUCFG_BASE >> 16) - - @ Deassert reset on target CPU - mov r6, #3 - lsl r5, r1, #6 @ 64 bytes per CPU - add r5, r5, #0x40 @ Offset from base - add r5, r5, r0 @ CPU control block - str r6, [r5] - - @ Unlock CPU (Enable external debug 
access) - ldr r6, [r0, #0x1e4] @ CPUCFG_DBG_CTL1_REG - orr r6, r6, r4 - str r6, [r0, #0x1e4] - - mov r0, #ARM_PSCI_RET_SUCCESS @ Return PSCI_RET_SUCCESS - pop {pc} - -.globl psci_cpu_off -psci_cpu_off: - bl psci_cpu_off_common - - @ Ask CPU0 to pull the rug... - movw r0, #(GICD_BASE & 0xffff) - movt r0, #(GICD_BASE >> 16) - movw r1, #15 @ SGI15 - movt r1, #1 @ Target is CPU0 - str r1, [r0, #GICD_SGIR] - dsb - -1: wfi - b 1b - -.globl psci_arch_init -psci_arch_init: - mov r6, lr - - movw r4, #(GICD_BASE & 0xffff) - movt r4, #(GICD_BASE >> 16) - - ldr r5, [r4, #GICD_IGROUPRn] - bic r5, r5, #(1 << 15) @ SGI15 as Group-0 - str r5, [r4, #GICD_IGROUPRn] - - mov r5, #0 @ Set SGI15 priority to 0 - strb r5, [r4, #(GICD_IPRIORITYRn + 15)] - - add r4, r4, #0x1000 @ GICC address - - mov r5, #0xff - str r5, [r4, #GICC_PMR] @ Be cool with non-secure - - ldr r5, [r4, #GICC_CTLR] - orr r5, r5, #(1 << 3) @ Switch FIQEn on - str r5, [r4, #GICC_CTLR] - - mrc p15, 0, r5, c1, c1, 0 @ Read SCR - orr r5, r5, #4 @ Enable FIQ in monitor mode - bic r5, r5, #1 @ Secure mode - mcr p15, 0, r5, c1, c1, 0 @ Write SCR - isb - - bl psci_get_cpu_id @ CPU ID => r0 - bl psci_get_cpu_stack_top @ stack top => r0 - mov sp, r0 - - bx r6 - - .globl psci_text_end -psci_text_end: - .popsection diff --git a/arch/arm/cpu/armv7/sunxi/psci_sun7i.S b/arch/arm/cpu/armv7/sunxi/psci_sun7i.S deleted file mode 100644 index 87bbd725f0b3..000000000000 --- a/arch/arm/cpu/armv7/sunxi/psci_sun7i.S +++ /dev/null @@ -1,237 +0,0 @@ -/* - * Copyright (C) 2013 - ARM Ltd - * Author: Marc Zyngier marc.zyngier@arm.com - * - * Based on code by Carl van Schaik carl@ok-labs.com. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. 
- * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program. If not, see http://www.gnu.org/licenses/. - */ - -#include <config.h> - -#include <asm/arch-armv7/generictimer.h> -#include <asm/gic.h> -#include <asm/macro.h> -#include <asm/psci.h> -#include <asm/arch/cpu.h> - -/* - * Memory layout: - * - * SECURE_RAM to text_end : - * ._secure_text section - * text_end to ALIGN_PAGE(text_end): - * nothing - * ALIGN_PAGE(text_end) to ALIGN_PAGE(text_end) + 0x1000) - * 1kB of stack per CPU (4 CPUs max). - */ - - .pushsection ._secure.text, "ax" - - .arch_extension sec - -#define ONE_MS (CONFIG_TIMER_CLK_FREQ / 1000) -#define TEN_MS (10 * ONE_MS) -#define GICD_BASE (SUNXI_GIC400_BASE + 0x1000) -#define GICC_BASE (SUNXI_GIC400_BASE + 0x2000) - -.globl psci_fiq_enter -psci_fiq_enter: - push {r0-r12} - - @ Switch to secure - mrc p15, 0, r7, c1, c1, 0 - bic r8, r7, #1 - mcr p15, 0, r8, c1, c1, 0 - isb - - @ Validate reason based on IAR and acknowledge - movw r8, #(GICC_BASE & 0xffff) - movt r8, #(GICC_BASE >> 16) - ldr r9, [r8, #GICC_IAR] - movw r10, #0x3ff - movt r10, #0 - cmp r9, r10 @ skip spurious interrupt 1023 - beq out - movw r10, #0x3fe @ ...and 1022 - cmp r9, r10 - beq out - str r9, [r8, #GICC_EOIR] @ acknowledge the interrupt - dsb - - @ Compute CPU number - lsr r9, r9, #10 - and r9, r9, #0xf - - movw r8, #(SUNXI_CPUCFG_BASE & 0xffff) - movt r8, #(SUNXI_CPUCFG_BASE >> 16) - - @ Wait for the core to enter WFI - lsl r11, r9, #6 @ x64 - add r11, r11, r8 - -1: ldr r10, [r11, #0x48] - tst r10, #(1 << 2) - bne 2f - timer_wait r10, ONE_MS - b 1b - - @ Reset CPU -2: mov r10, #0 - str r10, [r11, #0x40] - - @ Lock CPU - mov r10, #1 - lsl r9, r10, r9 @ r9 is now CPU mask - ldr r10, [r8, 
#0x1e4] - bic r10, r10, r9 - str r10, [r8, #0x1e4] - - @ Set power gating - ldr r10, [r8, #0x1b4] - orr r10, r10, #1 - str r10, [r8, #0x1b4] - timer_wait r10, ONE_MS - - @ Activate power clamp - mov r10, #1 -1: str r10, [r8, #0x1b0] - lsl r10, r10, #1 - orr r10, r10, #1 - tst r10, #0x100 - beq 1b - - @ Restore security level -out: mcr p15, 0, r7, c1, c1, 0 - - pop {r0-r12} - subs pc, lr, #4 - - @ r1 = target CPU - @ r2 = target PC -.globl psci_cpu_on -psci_cpu_on: - push {lr} - - mov r0, r1 - bl psci_get_cpu_stack_top @ get stack top of target CPU - str r2, [r0] @ store target PC at stack top - dsb - - movw r0, #(SUNXI_CPUCFG_BASE & 0xffff) - movt r0, #(SUNXI_CPUCFG_BASE >> 16) - - @ CPU mask - and r1, r1, #3 @ only care about first cluster - mov r4, #1 - lsl r4, r4, r1 - - ldr r6, =psci_cpu_entry - str r6, [r0, #0x1a4] @ PRIVATE_REG (boot vector) - - @ Assert reset on target CPU - mov r6, #0 - lsl r5, r1, #6 @ 64 bytes per CPU - add r5, r5, #0x40 @ Offset from base - add r5, r5, r0 @ CPU control block - str r6, [r5] @ Reset CPU - - @ l1 invalidate - ldr r6, [r0, #0x184] @ CPUCFG_GEN_CTRL_REG - bic r6, r6, r4 - str r6, [r0, #0x184] - - @ Lock CPU (Disable external debug access) - ldr r6, [r0, #0x1e4] @ CPUCFG_DBG_CTL1_REG - bic r6, r6, r4 - str r6, [r0, #0x1e4] - - @ Release power clamp - movw r6, #0x1ff - movt r6, #0 -1: lsrs r6, r6, #1 - str r6, [r0, #0x1b0] @ CPU1_PWR_CLAMP - bne 1b - - timer_wait r1, TEN_MS - - @ Clear power gating - ldr r6, [r0, #0x1b4] @ CPU1_PWROFF_REG - bic r6, r6, #1 - str r6, [r0, #0x1b4] - - @ Deassert reset on target CPU - mov r6, #3 - str r6, [r5] - - @ Unlock CPU (Enable external debug access) - ldr r6, [r0, #0x1e4] @ CPUCFG_DBG_CTL1_REG - orr r6, r6, r4 - str r6, [r0, #0x1e4] - - mov r0, #ARM_PSCI_RET_SUCCESS @ Return PSCI_RET_SUCCESS - pop {pc} - -.globl psci_cpu_off -psci_cpu_off: - bl psci_cpu_off_common - - @ Ask CPU0 to pull the rug... 
- movw r0, #(GICD_BASE & 0xffff) - movt r0, #(GICD_BASE >> 16) - movw r1, #15 @ SGI15 - movt r1, #1 @ Target is CPU0 - str r1, [r0, #GICD_SGIR] - dsb - -1: wfi - b 1b - -.globl psci_arch_init -psci_arch_init: - mov r6, lr - - movw r4, #(GICD_BASE & 0xffff) - movt r4, #(GICD_BASE >> 16) - - ldr r5, [r4, #GICD_IGROUPRn] - bic r5, r5, #(1 << 15) @ SGI15 as Group-0 - str r5, [r4, #GICD_IGROUPRn] - - mov r5, #0 @ Set SGI15 priority to 0 - strb r5, [r4, #(GICD_IPRIORITYRn + 15)] - - add r4, r4, #0x1000 @ GICC address - - mov r5, #0xff - str r5, [r4, #GICC_PMR] @ Be cool with non-secure - - ldr r5, [r4, #GICC_CTLR] - orr r5, r5, #(1 << 3) @ Switch FIQEn on - str r5, [r4, #GICC_CTLR] - - mrc p15, 0, r5, c1, c1, 0 @ Read SCR - orr r5, r5, #4 @ Enable FIQ in monitor mode - bic r5, r5, #1 @ Secure mode - mcr p15, 0, r5, c1, c1, 0 @ Write SCR - isb - - bl psci_get_cpu_id @ CPU ID => r0 - bl psci_get_cpu_stack_top @ stack top => r0 - mov sp, r0 - - bx r6 - - .globl psci_text_end -psci_text_end: - .popsection

On 23/05/16 13:41, Chen-Yu Tsai wrote:
To make the PSCI backend more maintainable and easier to port to newer SoCs, rewrite the current PSCI implementation in C.
Some inline assembly bits are required to access coprocessor registers. PSCI stack setup is the only part left completely in assembly. In theory this part could be split out of psci_arch_init into a separate common function, and psci_arch_init could be completely in C.
Signed-off-by: Chen-Yu Tsai wens@csie.org
 arch/arm/cpu/armv7/sunxi/Makefile | 7 +-
 arch/arm/cpu/armv7/sunxi/psci.c | 229 +++++++++++++++++++++++++++++
 arch/arm/cpu/armv7/sunxi/psci_head.S | 61 ++++++++
 arch/arm/cpu/armv7/sunxi/psci_sun6i.S | 262 ----------------------------------
 arch/arm/cpu/armv7/sunxi/psci_sun7i.S | 237 ------------------------------
 5 files changed, 292 insertions(+), 504 deletions(-)
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci.c
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci_head.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun6i.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun7i.S

diff --git a/arch/arm/cpu/armv7/sunxi/Makefile b/arch/arm/cpu/armv7/sunxi/Makefile
index 4d2274a38ed1..c2085101685b 100644
--- a/arch/arm/cpu/armv7/sunxi/Makefile
+++ b/arch/arm/cpu/armv7/sunxi/Makefile
@@ -13,11 +13,8 @@ obj-$(CONFIG_MACH_SUN6I) += tzpc.o
 obj-$(CONFIG_MACH_SUN8I_H3) += tzpc.o
 
 ifndef CONFIG_SPL_BUILD
-ifdef CONFIG_ARMV7_PSCI
-obj-$(CONFIG_MACH_SUN6I) += psci_sun6i.o
-obj-$(CONFIG_MACH_SUN7I) += psci_sun7i.o
-obj-$(CONFIG_MACH_SUN8I) += psci_sun6i.o
-endif
+obj-$(CONFIG_ARMV7_PSCI) += psci.o
+obj-$(CONFIG_ARMV7_PSCI) += psci_head.o
 endif
 
 ifdef CONFIG_SPL_BUILD
diff --git a/arch/arm/cpu/armv7/sunxi/psci.c b/arch/arm/cpu/armv7/sunxi/psci.c
new file mode 100644
index 000000000000..943061937f7c
--- /dev/null
+++ b/arch/arm/cpu/armv7/sunxi/psci.c
@@ -0,0 +1,229 @@
+/*
+ * Copyright (C) 2016
+ * Author: Chen-Yu Tsai wens@csie.org
+ *
+ * Based on assembly code by Marc Zyngier marc.zyngier@arm.com,
+ * which was based on code by Carl van Schaik carl@ok-labs.com.
+ *
+ * SPDX-License-Identifier: GPL-2.0
+ */
+#include <config.h>
+#include <common.h>
+
+#include <asm/arch/cpu.h>
+#include <asm/arch/cpucfg.h>
+#include <asm/arch/prcm.h>
+#include <asm/armv7.h>
+#include <asm/gic.h>
+#include <asm/io.h>
+#include <asm/psci.h>
+#include <asm/system.h>
+
+#include <linux/bitops.h>
+
+#define __secure __attribute__ ((section ("._secure.text")))
+#define __irq __attribute__ ((interrupt ("IRQ")))
+
+#define GICD_BASE (SUNXI_GIC400_BASE + GIC_DIST_OFFSET)
+#define GICC_BASE (SUNXI_GIC400_BASE + GIC_CPU_OFFSET_A15)
+
+static void __secure __mdelay(u32 ms)
+{
+	u32 reg = DIV_ROUND_UP(CONFIG_TIMER_CLK_FREQ, ms);
+
+	/* CNTP_TVAL */
+	asm volatile ("mcr p15, 0, %0, c14, c2, 0" : : "r" (reg));
Since you made the effort of switching to C code, please move all the asm statements into static helpers:
	static void write_cntp_tval(u32 tval)
	{
		asm volatile("mcr p15, 0, %0, c14, c2, 0" : : "r" (tval));
	}

This has the benefit of making it obvious which CP15 register you're dealing with (especially further down when you're dealing with SCR).
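
A fuller set of such helpers might look like the sketch below. The CP15 encodings are copied from the patch; everything else is an assumption of the sketch: the helper names other than write_cntp_tval(), the host fallback that swaps CP15 for plain variables so the code can be compiled off-target, and scr_enter_secure(), which is hypothetical. The "cc" clobber is left out, which is one possible reading of the review question rather than a confirmed resolution, and the ISB barriers of the real code are omitted.

```c
#include <stdint.h>

typedef uint32_t u32;	/* stand-in for the U-Boot type */

#if defined(__arm__)
/* 32-bit ARM target: real CP15 accesses, encodings as in the patch. */
static inline void write_cntp_tval(u32 val)
{
	__asm__ volatile("mcr p15, 0, %0, c14, c2, 0" : : "r" (val));
}

static inline u32 read_cntp_ctl(void)
{
	u32 val;

	__asm__ volatile("mrc p15, 0, %0, c14, c2, 1" : "=r" (val));
	return val;
}

static inline void write_cntp_ctl(u32 val)
{
	__asm__ volatile("mcr p15, 0, %0, c14, c2, 1" : : "r" (val));
}

static inline u32 read_scr(void)
{
	u32 val;

	__asm__ volatile("mrc p15, 0, %0, c1, c1, 0" : "=r" (val));
	return val;
}

static inline void write_scr(u32 val)
{
	__asm__ volatile("mcr p15, 0, %0, c1, c1, 0" : : "r" (val));
}
#else
/* Host fallback (assumption of this sketch): variables instead of CP15. */
static u32 fake_cntp_tval, fake_cntp_ctl, fake_scr;

static inline void write_cntp_tval(u32 val) { fake_cntp_tval = val; }
static inline u32 read_cntp_ctl(void) { return fake_cntp_ctl; }
static inline void write_cntp_ctl(u32 val) { fake_cntp_ctl = val; }
static inline u32 read_scr(void) { return fake_scr; }
static inline void write_scr(u32 val) { fake_scr = val; }
#endif

/* Clear SCR.NS (bit 0) and return the previous value for a later restore. */
static inline u32 scr_enter_secure(void)
{
	u32 scr = read_scr();

	write_scr(scr & ~(1u << 0));
	return scr;
}
```

A caller would then read as scr = scr_enter_secure(); ... write_scr(scr); with no raw coprocessor encodings in the function body.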
+	ISB;
+	/* CNTP_CTL */
+	asm volatile ("mcr p15, 0, %0, c14, c2, 1" : : "r" (3));
+
+	do {
+		ISB;
+		/* CNTP_CTL */
+		asm volatile ("mrc p15, 0, %0, c14, c2, 1" : "=r" (reg) : :
+			      "cc" );
Why are the flags part of the clobber list?
+	} while (!(reg & BIT(2)));
+
+	/* CNTP_CTL */
+	asm volatile ("mcr p15, 0, %0, c14, c2, 1" : : "r" (0));
+}
+
+void __secure sunxi_cpu_power_off(u32 cpuid)
+{
+#ifdef CONFIG_SUNXI_GEN_SUN6I
+	struct sunxi_prcm_reg *prcm =
+		(struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;
+#endif
+	struct sunxi_cpucfg_reg *cpucfg =
+		(struct sunxi_cpucfg_reg *)SUNXI_CPUCFG_BASE;
+	u32 cpu = cpuid & 0x3;
+	u32 tmp __maybe_unused;
This doesn't look used at all. Why is it there?
+	/* Wait for the core to enter WFI */
+	while (1) {
+		if (readl(&cpucfg->cpu[cpu].status) & BIT(2))
+			break;
+		__mdelay(1);
+	}
+
+	/* Assert reset on target CPU */
+	writel(0, &cpucfg->cpu[cpu].rst);
+
+	/* Lock CPU (Disable external debug access) */
+	clrbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu));
+
+#ifdef CONFIG_MACH_SUN7I
+	/* Set power gating */
+	setbits_le32(&cpucfg->cpu1_pwroff, BIT(0));
+#else
+	/* Set power gating */
+	setbits_le32(&prcm->cpu_pwroff, BIT(cpu));
+#endif
+
+#ifdef CONFIG_MACH_SUN7I
+	/* Activate power clamp */
+	writel(0xff, &cpucfg->cpu1_pwr_clamp);
+#elif defined(CONFIG_MACH_SUN6I) || defined(CONFIG_MACH_SUN8I_H3)
+	/* Activate power clamp */
+	writel(0xff, &prcm->cpu_pwr_clamp[cpu]);
+#endif
Why don't you put all the #ifdefery in two helper functions (one for sun7i, one for all the others)? This would look a lot better.
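
One way to read this suggestion is sketched below. It is a host-side illustration under stated assumptions: the helper name sunxi_power_switch_off() and the fake_* register stand-ins are hypothetical, plain assignments replace setbits_le32()/writel(), and CONFIG_MACH_SUN6I is defined by hand only so the sketch builds off-target.

```c
#include <stdint.h>

typedef uint32_t u32;	/* stand-in for the U-Boot type */
#define BIT(n)	(1u << (n))

/* Pretend board config for the host build (assumption of this sketch). */
#define CONFIG_MACH_SUN6I 1

#ifdef CONFIG_MACH_SUN7I
/* Host stand-in for the two sun7i-only CPUCFG registers. */
static struct {
	u32 cpu1_pwr_clamp;
	u32 cpu1_pwroff;
} fake_cpucfg;

static void sunxi_power_switch_off(u32 cpu)
{
	(void)cpu;	/* the sun7i registers only cover CPU1 */
	fake_cpucfg.cpu1_pwroff |= BIT(0);	/* set power gating */
	fake_cpucfg.cpu1_pwr_clamp = 0xff;	/* activate power clamp */
}
#else
/* Host stand-in for the PRCM registers used on the other SoCs. */
static struct {
	u32 cpu_pwroff;
	u32 cpu_pwr_clamp[4];
} fake_prcm;

static void sunxi_power_switch_off(u32 cpu)
{
	fake_prcm.cpu_pwroff |= BIT(cpu);	/* set power gating */
#if defined(CONFIG_MACH_SUN6I) || defined(CONFIG_MACH_SUN8I_H3)
	fake_prcm.cpu_pwr_clamp[cpu] = 0xff;	/* activate power clamp */
#endif
}
#endif

/* The caller no longer carries any #ifdef. */
static void cpu_power_off_sketch(u32 cpuid)
{
	u32 cpu = cpuid & 0x3;

	/* wait for WFI, assert reset, lock debug access: omitted here */
	sunxi_power_switch_off(cpu);
	/* unlock debug access: omitted here */
}
```

The gating and clamp steps keep the order used in the patch; only the location of the preprocessor conditionals changes.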
+
+	/* Unlock CPU (Disable external debug access) */
+	setbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu));
+}
+
+/*
+ * Although this is an FIQ handler, the FIQ is processed in monitor mode,
+ * which means there's no FIQ banked registers. This is the same as IRQ
+ * mode, so use the IRQ attribute to ask the compiler to handler entry
+ * and return.
+ */
+void __secure __irq psci_fiq_enter(void)
+{
+	u32 scr, reg, cpu;
+
+	/* Switch to secure mode */
+	asm volatile ("mrc p15, 0, %0, c1, c1, 0" : "=r" (scr) : : "cc");
+	reg = scr & ~(1 << 0);
+	asm volatile ("mcr p15, 0, %0, c1, c1, 0" : : "r" (reg) : "cc");
+	ISB;
Same question about the flags.
+	/* Validate reason based on IAR and acknowledge */
+	reg = readl(GICC_BASE + GICC_IAR);
+
+	/* Skip spurious interrupts 1022 and 1023 */
+	if (reg == 1023 || reg == 1022)
+		goto out;
+
+	/* Acknowledge interrupt */
No, this is an End Of Interrupt. The Acknowledge is done by reading GICC_IAR.
+	writel(reg, GICC_BASE + GICC_EOIR);
+	DSB;
+
+	/* Get CPU number */
+	cpu = (reg >> 10) & 0xf; /* but GIC specs say only 3 bits? */
Indeed, only GICC_IAR[12:10]. Seems like a bug in the original code, no need to reproduce it here.
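
The field layout referred to here can be sketched as small decode helpers. In the GICv2 layout the interrupt ID sits in GICC_IAR[9:0] and the source CPU ID of an SGI in GICC_IAR[12:10]. The macro and function names below are hypothetical and are not taken from asm/gic.h; masking the ID before the spurious check is a choice of this sketch, whereas the patch compares the whole register value.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t u32;	/* stand-in for the U-Boot type */

/* GICC_IAR fields (names are hypothetical). */
#define IAR_INT_ID_MASK		0x3ffu	/* bits [9:0] */
#define IAR_CPU_ID_SHIFT	10
#define IAR_CPU_ID_MASK		0x7u	/* bits [12:10], three bits only */

/* Interrupt ID as returned by the acknowledge (the GICC_IAR read). */
static inline u32 iar_int_id(u32 iar)
{
	return iar & IAR_INT_ID_MASK;
}

/* For an SGI: the CPU interface that requested it. */
static inline u32 iar_cpu_id(u32 iar)
{
	return (iar >> IAR_CPU_ID_SHIFT) & IAR_CPU_ID_MASK;
}

/* IDs 1022 and 1023 are the ones the patch skips as spurious. */
static inline bool iar_is_spurious(u32 iar)
{
	u32 id = iar_int_id(iar);

	return id == 1022 || id == 1023;
}
```

The value read from GICC_IAR is what later gets written back unchanged to GICC_EOIR to signal end of interrupt; the helpers only decode it.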
+	/* Power off the CPU */
+	sunxi_cpu_power_off(cpu);
+
+out:
+	/* Restore security level */
+	asm volatile ("mcr p15, 0, %0, c1, c1, 0" : : "r" (scr) : "cc");
Same question about flags.
+}
+
+int __secure psci_cpu_on(u32 unused __always_unused, u32 mpidr, u32 pc)
+{
+#ifdef CONFIG_SUNXI_GEN_SUN6I
+	struct sunxi_prcm_reg *prcm =
+		(struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;
+#endif
+	struct sunxi_cpucfg_reg *cpucfg =
+		(struct sunxi_cpucfg_reg *)SUNXI_CPUCFG_BASE;
+	u32 cpu = (mpidr & 0x3);
+	u32 tmp __maybe_unused;
+
+	/* store target PC at target CPU stack top */
+	writel(pc, psci_get_cpu_stack_top(cpu));
+	DSB;
+
+	/* Set secondary core power on PC */
+	writel((u32)&psci_cpu_entry, &cpucfg->priv0);
+
+	/* Assert reset on target CPU */
+	writel(0, &cpucfg->cpu[cpu].rst);
+
+	/* Invalidate L1 cache */
+	clrbits_le32(&cpucfg->gen_ctrl, BIT(cpu));
+
+	/* Lock CPU (Disable external debug access) */
+	clrbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu));
+
+#ifdef CONFIG_MACH_SUN7I
+	/* Release power clamp */
+	tmp = 0x1ff;
+	do {
+		tmp >>= 1;
+		writel(tmp, &cpucfg->cpu1_pwr_clamp);
+	} while (tmp);
+#elif defined(CONFIG_MACH_SUN6I) || defined(CONFIG_MACH_SUN8I_H3)
+	/* Release power clamp */
+	tmp = 0x1ff;
+	do {
+		tmp >>= 1;
+		writel(tmp, &prcm->cpu_pwr_clamp[cpu]);
+	} while (tmp);
+#endif
+
+	__mdelay(10);
+
+#ifdef CONFIG_MACH_SUN7I
+	/* Clear power gating */
+	clrbits_le32(&cpucfg->cpu1_pwroff, BIT(0));
+#else
+	/* Clear power gating */
+	clrbits_le32(&prcm->cpu_pwroff, BIT(cpu));
+#endif
Same remark about having per variant helpers. This will get rid of your __maybe_unused attribute.
- /* De-assert reset on target CPU */
- writel(BIT(1) | BIT(0), &cpucfg->cpu[cpu].rst);
- /* Unlock CPU (Re-enable external debug access) */
- setbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu));
- return ARM_PSCI_RET_SUCCESS;
+}
+void __secure psci_cpu_off(void)
+{
- psci_cpu_off_common();
- /* Ask CPU0 via SGI15 to pull the rug... */
- writel(BIT(16) | 15, GICD_BASE + GICD_SGIR);
- DSB;
- /* Wait to be turned off */
- while (1)
wfi();
+}
+void __secure sunxi_gic_init(void)
+{
- u32 reg;
- /* SGI15 as Group-0 */
- clrbits_le32(GICD_BASE + GICD_IGROUPRn, BIT(15));
- /* Set SGI15 priority to 0 */
- writeb(0, GICD_BASE + GICD_IPRIORITYRn + 15);
- /* Be cool with non-secure */
- writel(0xff, GICC_BASE + GICC_PMR);
- /* Switch FIQEn on */
- setbits_le32(GICC_BASE + GICC_CTLR, BIT(3));
- asm volatile ("mrc p15, 0, %0, c1, c1, 0" : "=r" (reg) : : "cc");
- reg |= BIT(2); /* Enable FIQ in monitor mode */
- reg &= ~BIT(0); /* Secure mode */
- asm volatile ("mcr p15, 0, %0, c1, c1, 0" : : "r" (reg) : "cc");
Flags, helpers...
- ISB;
+}
diff --git a/arch/arm/cpu/armv7/sunxi/psci_head.S b/arch/arm/cpu/armv7/sunxi/psci_head.S
new file mode 100644
index 000000000000..40b350636e32
--- /dev/null
+++ b/arch/arm/cpu/armv7/sunxi/psci_head.S
@@ -0,0 +1,61 @@
+/*
- Copyright (C) 2013 - ARM Ltd
- Author: Marc Zyngier marc.zyngier@arm.com
- Based on code by Carl van Schaik carl@ok-labs.com.
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License version 2 as
- published by the Free Software Foundation.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program. If not, see http://www.gnu.org/licenses/.
- */
+#include <config.h>
+#include <asm/arch-armv7/generictimer.h>
+#include <asm/gic.h>
+#include <asm/macro.h>
+#include <asm/psci.h>
+#include <asm/arch/cpu.h>
+/*
- Memory layout:
- SECURE_RAM to text_end :
- ._secure_text section
- text_end to ALIGN_PAGE(text_end):
- nothing
- ALIGN_PAGE(text_end) to ALIGN_PAGE(text_end) + 0x1000)
- 1kB of stack per CPU (4 CPUs max).
- */
- .pushsection ._secure.text, "ax"
- .arch_extension sec
+#define GICD_BASE (SUNXI_GIC400_BASE + 0x1000)
+#define GICC_BASE (SUNXI_GIC400_BASE + 0x2000)
+.globl psci_arch_init
+psci_arch_init:
- mov r6, lr
- bl psci_get_cpu_id @ CPU ID => r0
- bl psci_get_cpu_stack_top @ stack top => r0
- sub r0, r0, #4 @ Save space for target PC
- mov sp, r0
- mov lr, r6
- push {r0, r1, r2, ip, lr}
- bl sunxi_gic_init
- pop {r0, r1, r2, ip, pc}
I'm a bit sceptical with this sequence. You're saving registers that may be clobbered by the called C code, but you're missing r3. But more fundamentally, you're saving these registers after having already clobbered them (r0). To me, you should be able to replace these three instructions with a single:
b sunxi_gic_init
Or am I missing something?
Thanks,
M.

On Tue, May 24, 2016 at 4:41 PM, Marc Zyngier marc.zyngier@arm.com wrote:
On 23/05/16 13:41, Chen-Yu Tsai wrote:
To make the PSCI backend more maintainable and easier to port to newer SoCs, rewrite the current PSCI implementation in C.
Some inline assembly bits are required to access coprocessor registers. PSCI stack setup is the only part left completely in assembly. In theory this part could be split out of psci_arch_init into a separate common function, and psci_arch_init could be completely in C.
Signed-off-by: Chen-Yu Tsai wens@csie.org
 arch/arm/cpu/armv7/sunxi/Makefile     |   7 +-
 arch/arm/cpu/armv7/sunxi/psci.c       | 229 +++++++++++++++++++++++++++++
 arch/arm/cpu/armv7/sunxi/psci_head.S  |  61 ++++++++
 arch/arm/cpu/armv7/sunxi/psci_sun6i.S | 262 ----------------------------------
 arch/arm/cpu/armv7/sunxi/psci_sun7i.S | 237 ------------------------------
 5 files changed, 292 insertions(+), 504 deletions(-)
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci.c
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci_head.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun6i.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun7i.S
diff --git a/arch/arm/cpu/armv7/sunxi/Makefile b/arch/arm/cpu/armv7/sunxi/Makefile
index 4d2274a38ed1..c2085101685b 100644
--- a/arch/arm/cpu/armv7/sunxi/Makefile
+++ b/arch/arm/cpu/armv7/sunxi/Makefile
@@ -13,11 +13,8 @@ obj-$(CONFIG_MACH_SUN6I) += tzpc.o
 obj-$(CONFIG_MACH_SUN8I_H3) += tzpc.o
 ifndef CONFIG_SPL_BUILD
-ifdef CONFIG_ARMV7_PSCI
-obj-$(CONFIG_MACH_SUN6I) += psci_sun6i.o
-obj-$(CONFIG_MACH_SUN7I) += psci_sun7i.o
-obj-$(CONFIG_MACH_SUN8I) += psci_sun6i.o
-endif
+obj-$(CONFIG_ARMV7_PSCI) += psci.o
+obj-$(CONFIG_ARMV7_PSCI) += psci_head.o
 endif
 ifdef CONFIG_SPL_BUILD
diff --git a/arch/arm/cpu/armv7/sunxi/psci.c b/arch/arm/cpu/armv7/sunxi/psci.c
new file mode 100644
index 000000000000..943061937f7c
--- /dev/null
+++ b/arch/arm/cpu/armv7/sunxi/psci.c
@@ -0,0 +1,229 @@
+/*
- Copyright (C) 2016
- Author: Chen-Yu Tsai wens@csie.org
- Based on assembly code by Marc Zyngier marc.zyngier@arm.com,
- which was based on code by Carl van Schaik carl@ok-labs.com.
- SPDX-License-Identifier: GPL-2.0
- */
+#include <config.h>
+#include <common.h>
+#include <asm/arch/cpu.h>
+#include <asm/arch/cpucfg.h>
+#include <asm/arch/prcm.h>
+#include <asm/armv7.h>
+#include <asm/gic.h>
+#include <asm/io.h>
+#include <asm/psci.h>
+#include <asm/system.h>
+#include <linux/bitops.h>
+#define __secure __attribute__ ((section ("._secure.text")))
+#define __irq __attribute__ ((interrupt ("IRQ")))
+#define GICD_BASE (SUNXI_GIC400_BASE + GIC_DIST_OFFSET)
+#define GICC_BASE (SUNXI_GIC400_BASE + GIC_CPU_OFFSET_A15)
+static void __secure __mdelay(u32 ms)
+{
u32 reg = DIV_ROUND_UP(CONFIG_TIMER_CLK_FREQ, ms);
/* CNTP_TVAL */
asm volatile ("mcr p15, 0, %0, c14, c2, 0" : : "r" (reg));
Since you made the effort of switching to C code, please move all the asm statements into static helpers:
static void write_cntp_tval(u32 tval)
{
	asm volatile("mcr p15, 0, %0, c14, c2, 0" : : "r" (tval));
}
This has the benefit of making it obvious which CP15 register you're dealing with (especially further down when you're dealing with SCR).
Will do.
ISB;
/* CNTP_CTL */
asm volatile ("mcr p15, 0, %0, c14, c2, 1" : : "r" (3));
do {
ISB;
/* CNTP_CTL */
asm volatile ("mrc p15, 0, %0, c14, c2, 1" : "=r" (reg) : :
"cc" );
Why are the flags part of the clobber list?
I misunderstood the meaning and thought they covered the coprocessors. Will remove them.
} while (!(reg & BIT(2)));
/* CNTP_CTL */
asm volatile ("mcr p15, 0, %0, c14, c2, 1" : : "r" (0));
+}
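The two review points on __mdelay() (wrap each CP15 access in a static helper, and drop the "cc" clobber, since an MCR or MRC that targets a general-purpose register leaves the condition flags alone) can be sketched roughly as follows. This is only an illustration, not the code from the series: timer_wait_ticks(), isb() and the sim_* host stand-ins are hypothetical names, and the host branch exists purely so the sketch can be built and exercised off-target.

```c
#include <assert.h>
#include <stdint.h>

#define CNTP_CTL_ENABLE		(1u << 0)
#define CNTP_CTL_IMASK		(1u << 1)
#define CNTP_CTL_ISTATUS	(1u << 2)

#if defined(__arm__)
/* Target build: one small accessor per CP15 register, no "cc" clobber. */
static inline void write_cntp_tval(uint32_t tval)
{
	__asm__ volatile("mcr p15, 0, %0, c14, c2, 0" : : "r" (tval));
}

static inline void write_cntp_ctl(uint32_t ctl)
{
	__asm__ volatile("mcr p15, 0, %0, c14, c2, 1" : : "r" (ctl));
}

static inline uint32_t read_cntp_ctl(void)
{
	uint32_t ctl;

	__asm__ volatile("mrc p15, 0, %0, c14, c2, 1" : "=r" (ctl));
	return ctl;
}

static inline void isb(void)
{
	__asm__ volatile("isb" : : : "memory");
}
#else
/* Host build: hypothetical stand-ins modelling a down-counting timer. */
static uint32_t sim_tval, sim_ctl;

static inline void write_cntp_tval(uint32_t tval) { sim_tval = tval; }
static inline void write_cntp_ctl(uint32_t ctl) { sim_ctl = ctl; }
static inline void isb(void) { }

static inline uint32_t read_cntp_ctl(void)
{
	if (sim_ctl & CNTP_CTL_ENABLE) {
		if (sim_tval > 0)
			sim_tval--;	/* one tick per poll */
		if (sim_tval == 0)
			sim_ctl |= CNTP_CTL_ISTATUS;
	}
	return sim_ctl;
}
#endif

/* Arm the physical timer for `ticks` counter ticks and spin until it fires. */
static void timer_wait_ticks(uint32_t ticks)
{
	assert(ticks <= 0x7fffffffu);	/* CNTP_TVAL holds a signed 32-bit value */

	write_cntp_tval(ticks);
	isb();
	write_cntp_ctl(CNTP_CTL_ENABLE | CNTP_CTL_IMASK);
	do {
		isb();
	} while (!(read_cntp_ctl() & CNTP_CTL_ISTATUS));
	write_cntp_ctl(0);
}
```

With the accessors named after the registers, the body of the delay loop reads without needing the /* CNTP_CTL */ comments, which is the benefit the review asks for.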
+void __secure sunxi_cpu_power_off(u32 cpuid)
+{
+#ifdef CONFIG_SUNXI_GEN_SUN6I
struct sunxi_prcm_reg *prcm =
(struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;
+#endif
struct sunxi_cpucfg_reg *cpucfg =
(struct sunxi_cpucfg_reg *)SUNXI_CPUCFG_BASE;
u32 cpu = cpuid & 0x3;
u32 tmp __maybe_unused;
This doesn't look used at all. Why is it there?
/* Wait for the core to enter WFI */
while (1) {
if (readl(&cpucfg->cpu[cpu].status) & BIT(2))
break;
__mdelay(1);
}
/* Assert reset on target CPU */
writel(0, &cpucfg->cpu[cpu].rst);
/* Lock CPU (Disable external debug access) */
clrbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu));
+#ifdef CONFIG_MACH_SUN7I
/* Set power gating */
setbits_le32(&cpucfg->cpu1_pwroff, BIT(0));
+#else
/* Set power gating */
setbits_le32(&prcm->cpu_pwroff, BIT(cpu));
+#endif
+#ifdef CONFIG_MACH_SUN7I
/* Activate power clamp */
writel(0xff, &cpucfg->cpu1_pwr_clamp);
+#elif defined(CONFIG_MACH_SUN6I) || defined(CONFIG_MACH_SUN8I_H3)
/* Activate power clamp */
writel(0xff, &prcm->cpu_pwr_clamp[cpu]);
+#endif
Why don't you put all the #ifdefery in two helper functions (one for sun7i, one for all the others)? This would look a lot better.
Will do.
/* Unlock CPU (Re-enable external debug access) */
setbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu));
+}
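One possible shape for the per-variant helpers requested above, as a minimal sketch: the #ifdef selects a single helper per SoC family, and the register sequence itself lives in one shared function. The fake_cpucfg and fake_prcm structs, power_off_regs() and sunxi_cpu_power_gate() are hypothetical stand-ins, plain memory replaces the writel()/setbits_le32() accessors, and the finer SUN6I/SUN8I_H3 distinction for the clamp register is deliberately ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the memory-mapped register blocks. */
struct fake_cpucfg {
	uint32_t cpu1_pwr_clamp;
	uint32_t cpu1_pwroff;
};

struct fake_prcm {
	uint32_t cpu_pwroff;
	uint32_t cpu_pwr_clamp[4];
};

/* Shared sequence: gate power, then activate the clamp, for one core. */
static void power_off_regs(uint32_t *clamp, uint32_t *pwroff, unsigned int bit)
{
	*pwroff |= 1u << bit;	/* Set power gating */
	*clamp = 0xff;		/* Activate power clamp */
}

#ifdef CONFIG_MACH_SUN7I
static struct fake_cpucfg cpucfg;

/* sun7i: only CPU1 is controllable, through CPUCFG. */
static void sunxi_cpu_power_gate(unsigned int cpu)
{
	(void)cpu;
	power_off_regs(&cpucfg.cpu1_pwr_clamp, &cpucfg.cpu1_pwroff, 0);
}
#else
static struct fake_prcm prcm;

/* Other families: per-core clamp registers and a shared gate in the PRCM. */
static void sunxi_cpu_power_gate(unsigned int cpu)
{
	assert(cpu < 4);
	power_off_regs(&prcm.cpu_pwr_clamp[cpu], &prcm.cpu_pwroff, cpu);
}
#endif
```

The caller then contains no conditional compilation at all, and no local variable has to be marked __maybe_unused.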
+/*
- Although this is an FIQ handler, the FIQ is processed in monitor mode,
- which means there's no FIQ banked registers. This is the same as IRQ
- mode, so use the IRQ attribute to ask the compiler to handle entry
- and return.
- */
+void __secure __irq psci_fiq_enter(void)
+{
u32 scr, reg, cpu;
/* Switch to secure mode */
asm volatile ("mrc p15, 0, %0, c1, c1, 0" : "=r" (scr) : : "cc");
reg = scr & ~(1 << 0);
asm volatile ("mcr p15, 0, %0, c1, c1, 0" : : "r" (reg) : "cc");
ISB;
Same question about the flags.
/* Validate reason based on IAR and acknowledge */
reg = readl(GICC_BASE + GICC_IAR);
/* Skip spurious interrupts 1022 and 1023 */
if (reg == 1023 || reg == 1022)
goto out;
/* Acknowledge interrupt */
No, this is an End Of Interrupt. The Acknowledge is done by reading GICC_IAR.
The comment was copied from the original assembly code.
writel(reg, GICC_BASE + GICC_EOIR);
DSB;
/* Get CPU number */
cpu = (reg >> 10) & 0xf; /* but GIC specs say only 3 bits? */
Indeed, only GICC_IAR[12:10]. Seems like a bug in the original code, no need to reproduce it here.
OK.
/* Power off the CPU */
sunxi_cpu_power_off(cpu);
+out:
/* Restore security level */
asm volatile ("mcr p15, 0, %0, c1, c1, 0" : : "r" (scr) : "cc");
Same question about flags.
+}
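The GICC_IAR layout discussed in this handler (interrupt ID in bits [9:0], source CPU of an SGI in bits [12:10]) can be captured in a few small decoders. This is a sketch only; the macro and function names are hypothetical and do not come from asm/gic.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GICC_IAR_INTID_MASK	0x3ffu	/* GICC_IAR[9:0] */
#define GICC_IAR_CPUID_SHIFT	10
#define GICC_IAR_CPUID_MASK	0x7u	/* GICC_IAR[12:10], three bits */

/* Interrupt ID that was acknowledged by reading GICC_IAR. */
static inline uint32_t gicc_iar_intid(uint32_t iar)
{
	return iar & GICC_IAR_INTID_MASK;
}

/* For an SGI, the CPU interface that requested the interrupt. */
static inline uint32_t gicc_iar_cpuid(uint32_t iar)
{
	return (iar >> GICC_IAR_CPUID_SHIFT) & GICC_IAR_CPUID_MASK;
}

/* IDs 1022 and 1023 are spurious and must not be written to GICC_EOIR. */
static inline bool gicc_iar_is_spurious(uint32_t iar)
{
	uint32_t id = gicc_iar_intid(iar);

	return id == 1022 || id == 1023;
}
```

For example, an IAR value of (2 << 10) | 15 decodes to SGI15 raised by CPU 2. The value written back to GICC_EOIR should still be the full IAR word, as the quoted code already does.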
+int __secure psci_cpu_on(u32 unused __always_unused, u32 mpidr, u32 pc)
+{
+#ifdef CONFIG_SUNXI_GEN_SUN6I
struct sunxi_prcm_reg *prcm =
(struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;
+#endif
struct sunxi_cpucfg_reg *cpucfg =
(struct sunxi_cpucfg_reg *)SUNXI_CPUCFG_BASE;
u32 cpu = (mpidr & 0x3);
u32 tmp __maybe_unused;
/* store target PC at target CPU stack top */
writel(pc, psci_get_cpu_stack_top(cpu));
DSB;
/* Set secondary core power on PC */
writel((u32)&psci_cpu_entry, &cpucfg->priv0);
/* Assert reset on target CPU */
writel(0, &cpucfg->cpu[cpu].rst);
/* Invalidate L1 cache */
clrbits_le32(&cpucfg->gen_ctrl, BIT(cpu));
/* Lock CPU (Disable external debug access) */
clrbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu));
+#ifdef CONFIG_MACH_SUN7I
/* Release power clamp */
tmp = 0x1ff;
do {
tmp >>= 1;
writel(tmp, &cpucfg->cpu1_pwr_clamp);
} while (tmp);
+#elif defined(CONFIG_MACH_SUN6I) || defined(CONFIG_MACH_SUN8I_H3)
/* Release power clamp */
tmp = 0x1ff;
do {
tmp >>= 1;
writel(tmp, &prcm->cpu_pwr_clamp[cpu]);
} while (tmp);
+#endif
__mdelay(10);
+#ifdef CONFIG_MACH_SUN7I
/* Clear power gating */
clrbits_le32(&cpucfg->cpu1_pwroff, BIT(0));
+#else
/* Clear power gating */
clrbits_le32(&prcm->cpu_pwroff, BIT(cpu));
+#endif
Same remark about having per variant helpers. This will get rid of your __maybe_unused attribute.
/* De-assert reset on target CPU */
writel(BIT(1) | BIT(0), &cpucfg->cpu[cpu].rst);
/* Unlock CPU (Re-enable external debug access) */
setbits_le32(&cpucfg->dbg_ctrl1, BIT(cpu));
return ARM_PSCI_RET_SUCCESS;
+}
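The clamp release loop appears twice in psci_cpu_on() with only the target register changing, so it is a natural candidate for a shared helper. The sketch below assumes a hypothetical release_power_clamp() that takes the register pointer and, for illustration only, records each written value so the ramp can be inspected; on hardware the store would go through writel().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Ramp the clamp register down one bit at a time, exactly as the
 * do/while loops in the patch do. Each value written is also stored in
 * trace[] (up to max entries). Returns the number of writes performed.
 */
static size_t release_power_clamp(volatile uint32_t *clamp,
				  uint32_t *trace, size_t max)
{
	uint32_t tmp = 0x1ff;
	size_t n = 0;

	assert(clamp != NULL);

	do {
		tmp >>= 1;
		*clamp = tmp;		/* writel(tmp, clamp) on hardware */
		if (trace && n < max)
			trace[n] = tmp;
		n++;
	} while (tmp);

	return n;
}
```

Starting from 0x1ff, the loop writes 0xff, 0x7f, 0x3f, 0x1f, 0x0f, 0x07, 0x03, 0x01 and finally 0x00, so the clamp is released in nine steps rather than in a single write.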
+void __secure psci_cpu_off(void)
+{
psci_cpu_off_common();
/* Ask CPU0 via SGI15 to pull the rug... */
writel(BIT(16) | 15, GICD_BASE + GICD_SGIR);
DSB;
/* Wait to be turned off */
while (1)
wfi();
+}
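The constant BIT(16) | 15 written to GICD_SGIR above packs two fields: the CPU target list in bits [23:16] and the SGI number in bits [3:0], with the target list filter in bits [25:24] left at 0 so that the list is used as given. A small sketch of a composer for that value, with hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

#define GICD_SGIR_CPU_LIST_SHIFT	16	/* CPUTargetList, bits [23:16] */
#define GICD_SGIR_SGIINTID_MASK		0xfu	/* SGIINTID, bits [3:0] */

/*
 * Build a GICD_SGIR value that sends SGI `sgi_id` to every CPU interface
 * whose bit is set in `cpu_mask` (TargetListFilter = 0b00).
 */
static inline uint32_t gicd_sgir_to_cpus(uint8_t cpu_mask, uint32_t sgi_id)
{
	assert(sgi_id <= GICD_SGIR_SGIINTID_MASK);

	return ((uint32_t)cpu_mask << GICD_SGIR_CPU_LIST_SHIFT) |
	       (sgi_id & GICD_SGIR_SGIINTID_MASK);
}
```

gicd_sgir_to_cpus(1 << 0, 15) yields the same 0x0001000f that psci_cpu_off() writes to ask CPU0 to power the calling core down.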
+void __secure sunxi_gic_init(void)
+{
u32 reg;
/* SGI15 as Group-0 */
clrbits_le32(GICD_BASE + GICD_IGROUPRn, BIT(15));
/* Set SGI15 priority to 0 */
writeb(0, GICD_BASE + GICD_IPRIORITYRn + 15);
/* Be cool with non-secure */
writel(0xff, GICC_BASE + GICC_PMR);
/* Switch FIQEn on */
setbits_le32(GICC_BASE + GICC_CTLR, BIT(3));
asm volatile ("mrc p15, 0, %0, c1, c1, 0" : "=r" (reg) : : "cc");
reg |= BIT(2); /* Enable FIQ in monitor mode */
reg &= ~BIT(0); /* Secure mode */
asm volatile ("mcr p15, 0, %0, c1, c1, 0" : : "r" (reg) : "cc");
Flags, helpers...
ISB;
+}
diff --git a/arch/arm/cpu/armv7/sunxi/psci_head.S b/arch/arm/cpu/armv7/sunxi/psci_head.S
new file mode 100644
index 000000000000..40b350636e32
--- /dev/null
+++ b/arch/arm/cpu/armv7/sunxi/psci_head.S
@@ -0,0 +1,61 @@
+/*
- Copyright (C) 2013 - ARM Ltd
- Author: Marc Zyngier marc.zyngier@arm.com
- Based on code by Carl van Schaik carl@ok-labs.com.
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License version 2 as
- published by the Free Software Foundation.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program. If not, see http://www.gnu.org/licenses/.
- */
+#include <config.h>
+#include <asm/arch-armv7/generictimer.h>
+#include <asm/gic.h>
+#include <asm/macro.h>
+#include <asm/psci.h>
+#include <asm/arch/cpu.h>
+/*
- Memory layout:
- SECURE_RAM to text_end :
- ._secure_text section
- text_end to ALIGN_PAGE(text_end):
- nothing
- ALIGN_PAGE(text_end) to ALIGN_PAGE(text_end) + 0x1000)
- 1kB of stack per CPU (4 CPUs max).
- */
.pushsection ._secure.text, "ax"
.arch_extension sec
+#define GICD_BASE (SUNXI_GIC400_BASE + 0x1000)
+#define GICC_BASE (SUNXI_GIC400_BASE + 0x2000)
+.globl psci_arch_init
+psci_arch_init:
mov r6, lr
bl psci_get_cpu_id @ CPU ID => r0
bl psci_get_cpu_stack_top @ stack top => r0
sub r0, r0, #4 @ Save space for target PC
mov sp, r0
mov lr, r6
push {r0, r1, r2, ip, lr}
bl sunxi_gic_init
pop {r0, r1, r2, ip, pc}
I'm a bit sceptical with this sequence. You're saving registers that may be clobbered by the called C code, but you're missing r3. But more fundamentally, you're saving these registers after having already clobbered them (r0). To me, you should be able to replace these three instructions with a single:
b sunxi_gic_init
Or am I missing something?
This gets called at the top of the secure monitor procedure, which itself is entered via the smc call in _do_nonsec_entry(). _do_nonsec_entry() puts whatever arguments in r0 ~ r2, and the entry point in ip.
_do_nonsec_entry() is called in 2 places:
a) the PSCI entry point for secondary cores. For this part we only care about the entry point.
b) The Linux kernel entry point (see arch/arm/lib/bootm.c), which results in {r0, r1, r2, ip} = {0, mach_id, dt_addr, kernel_entry}. I'm not sure whether the kernel cares about r0, but we could reset it back to 0 at the end of the code above. What must be saved here are r1, r2, and lr.
I think we can split out the stack setup stuff into a separate function that gets called explicitly before psci_arch_init, and psci_arch_init should just stick to the ARM calling convention. However I intended this series to be mostly sunxi specific, and do the cross platform refactoring in a followup series.
Thanks for the thorough review. For the first version I wanted something that works and closely resembles the original to the point that the disassembled code can be matched to the original to aid in working out issues.
Regards ChenYu

On 25/05/16 03:14, Chen-Yu Tsai wrote:
On Tue, May 24, 2016 at 4:41 PM, Marc Zyngier marc.zyngier@arm.com wrote:
On 23/05/16 13:41, Chen-Yu Tsai wrote:
To make the PSCI backend more maintainable and easier to port to newer SoCs, rewrite the current PSCI implementation in C.
Some inline assembly bits are required to access coprocessor registers. PSCI stack setup is the only part left completely in assembly. In theory this part could be split out of psci_arch_init into a separate common function, and psci_arch_init could be completely in C.
Signed-off-by: Chen-Yu Tsai wens@csie.org
 arch/arm/cpu/armv7/sunxi/Makefile     |   7 +-
 arch/arm/cpu/armv7/sunxi/psci.c       | 229 +++++++++++++++++++++++++++++
 arch/arm/cpu/armv7/sunxi/psci_head.S  |  61 ++++++++
 arch/arm/cpu/armv7/sunxi/psci_sun6i.S | 262 ----------------------------------
 arch/arm/cpu/armv7/sunxi/psci_sun7i.S | 237 ------------------------------
 5 files changed, 292 insertions(+), 504 deletions(-)
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci.c
 create mode 100644 arch/arm/cpu/armv7/sunxi/psci_head.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun6i.S
 delete mode 100644 arch/arm/cpu/armv7/sunxi/psci_sun7i.S
[...]
diff --git a/arch/arm/cpu/armv7/sunxi/psci_head.S b/arch/arm/cpu/armv7/sunxi/psci_head.S
new file mode 100644
index 000000000000..40b350636e32
--- /dev/null
+++ b/arch/arm/cpu/armv7/sunxi/psci_head.S
@@ -0,0 +1,61 @@
+/*
- Copyright (C) 2013 - ARM Ltd
- Author: Marc Zyngier marc.zyngier@arm.com
- Based on code by Carl van Schaik carl@ok-labs.com.
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License version 2 as
- published by the Free Software Foundation.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program. If not, see http://www.gnu.org/licenses/.
- */
+#include <config.h>
+#include <asm/arch-armv7/generictimer.h>
+#include <asm/gic.h>
+#include <asm/macro.h>
+#include <asm/psci.h>
+#include <asm/arch/cpu.h>
+/*
- Memory layout:
- SECURE_RAM to text_end :
- ._secure_text section
- text_end to ALIGN_PAGE(text_end):
- nothing
- ALIGN_PAGE(text_end) to ALIGN_PAGE(text_end) + 0x1000)
- 1kB of stack per CPU (4 CPUs max).
- */
.pushsection ._secure.text, "ax"
.arch_extension sec
+#define GICD_BASE (SUNXI_GIC400_BASE + 0x1000)
+#define GICC_BASE (SUNXI_GIC400_BASE + 0x2000)
+.globl psci_arch_init
+psci_arch_init:
mov r6, lr
bl psci_get_cpu_id @ CPU ID => r0
bl psci_get_cpu_stack_top @ stack top => r0
sub r0, r0, #4 @ Save space for target PC
mov sp, r0
mov lr, r6
push {r0, r1, r2, ip, lr}
bl sunxi_gic_init
pop {r0, r1, r2, ip, pc}
I'm a bit sceptical with this sequence. You're saving registers that may be clobbered by the called C code, but you're missing r3. But more fundamentally, you're saving these registers after having already clobbered them (r0). To me, you should be able to replace these three instructions with a single:
b sunxi_gic_init
Or am I missing something?
This gets called at the top of the secure monitor procedure, which itself is entered via the smc call in _do_nonsec_entry(). _do_nonsec_entry() puts whatever arguments in r0 ~ r2, and the entry point in ip.
_do_nonsec_entry() is called in 2 places:
a) the PSCI entry point for secondary cores. For this part we only care about the entry point.
b) The Linux kernel entry point (see arch/arm/lib/bootm.c), which results in {r0, r1, r2, ip} = {0, mach_id, dt_addr, kernel_entry}. I'm not sure whether the kernel cares about r0, but we could reset it back to 0 at the end of the code above. What must be saved here are r1, r2, and lr.
Right. I completely forgot how this thing worked. It makes sense then. But we definitely should preserve r0, as it is part of the calling convention. And for the sake of being generic, don't just reset it to zero, but preserve it from the beginning.
I think we can split out the stack setup stuff into a separate function that gets called explicitly before psci_arch_init, and psci_arch_init should just stick to the ARM calling convention. However I intended this series to be mostly sunxi specific, and do the cross platform refactoring in a followup series.
Yeah, I'm not too worried about that just yet.
Thanks for the thorough review. For the first version I wanted something that works and closely resembles the original to the point that the disassembled code can be matched to the original to aid in working out issues.
No problem, you're doing a good job here.
Thanks,
M.

Yes, C code is better and even necessary in some cases. The drawback of assembly language is that it is difficult not only for developers but also for reviewers and maintainers. I am implementing the system-suspend function for our NXP/Freescale LS1 platform; it is written in C too, and it would be impossible without C in my case.
On Mon, May 23, 2016 at 8:41 PM, Chen-Yu Tsai wens@csie.org wrote:
Hi everyone,
This series rewrites the Allwinner/sunxi PSCI implementation in C, to make it easier to maintain and extend for the currently unsupported multi-cluster SoCs. The SMP code in the BSP kernels is in C. Having the PSCI code in C as well will make it easier to work on.
To be able to convert the platform bits to C, some common PSCI functions have to be fixed up according to the ARM calling conventions. Function declarations are also needed.
This series is based on sunxi/next. Parts of it will likely conflict with the effort to support PSCI 1.0 on the Freescale LS102xA.
Patch 1 fixes up psci_get_cpu_stack_top.
Patch 2 fixes up the PSCI version of v7_flush_dcache_all.
Patch 3 adds function declarations for some of the common PSCI functions.
Patch 4 fixes issues with reserving memory for the secure section.
Patch 5 unifies the CPUCFG_BASE macro names for various sunxi platforms.
Patch 6 groups cpu core related controls together into one struct per core. This makes it straightforward to access the controls by the cpu index.
Patch 7 adds a missing header to cpucfg.h
Patch 8 adds some missing fields to cpucfg, which were used in the assembly code.
Patch 9 adds the base address for the GIC.
Patch 10 is the new PSCI implementation in C. Almost all of the code is converted, with the exception of initial setup of the stack.
Regards ChenYu

Hi,
On 23-05-16 14:41, Chen-Yu Tsai wrote:
Hi everyone,
This series rewrites the Allwinner/sunxi PSCI implementation in C, to make it easier to maintain and extend for the currently unsupported multi-cluster SoCs. The SMP code in the BSP kernels is in C. Having the PSCI code in C as well will make it easier to work on.
To be able to convert the platform bits to C, some common PSCI functions have to be fixed up according to the ARM calling conventions. Function declarations are also needed.
This series is based on sunxi/next. Parts of it will likely conflict with the effort to support PSCI 1.0 on the Freescale LS102xA.
Patch 1 fixes up psci_get_cpu_stack_top.
Patch 2 fixes up the PSCI version of v7_flush_dcache_all.
Patch 3 adds function declarations for some of the common PSCI functions.
Patch 4 fixes issues with reserving memory for the secure section.
Patch 5 unifies the CPUCFG_BASE macro names for various sunxi platforms.
Patch 6 groups cpu core related controls together into one struct per core. This makes it straightforward to access the controls by the cpu index.
Patch 7 adds a missing header to cpucfg.h
Patch 8 adds some missing fields to cpucfg, which were used in the assembly code.
Patch 9 adds the base address for the GIC.
Patch 10 is the new PSCI implementation in C. Almost all of the code is converted, with the exception of initial setup of the stack.
Thanks for your work on this, from a sunxi pov it looks good (once Marc's remarks are fixed).
Also many thanks to Marc for the thorough review. I've been treating the PSCI stuff as a black-box, only doing mostly style / sanity reviews, so the thorough review is appreciated a lot.
Let's do a v2 and try to land this soon-ish?
Regards,
Hans