[PATCH v2] arm64: issue ISB after updating system registers

ARM Architecture reference manual clearly states that PE pipeline should be flushed after any change to system registers. Refer to paragraph "B2.3.5 Memory Barriers" at page B2-92 of "Arm Architecture Reference Manual ARMv8 for ARMv8-A Architecture Profile" (ARM DDI 0487B.a).
Failing to issue instruction memory synchronization barrier can lead to spurious errors, like synchronous exception when accessing FPU registers. This is very prominent on CPUs with long instruction pipeline, like ARM Cortex A72.
This change fixes the following U-Boot panic:
"Synchronous Abort" handler, esr 0x1fe00000 elr: 00000000800948cc lr : 0000000080091e04 x0 : 00000000801ffdc8 x1 : 00000000000000c8 x2 : 00000000800979d4 x3 : 00000000801ffc60 x4 : 00000000801ffd40 x5 : ffffff80ffffffd8 x6 : 00000000801ffd70 x7 : 00000000801ffd70 x8 : 000000000000000a x9 : 0000000000000000 x10: 0000000000000044 x11: 0000000000000000 x12: 0000000000000000 x13: 0000000000000000 x14: 0000000000000000 x15: 0000000000000000 x16: 000000008008b2e0 x17: 0000000000000000 x18: 00000000801ffec0 x19: 00000000800957b0 x20: 00000000000000c8 x21: 00000000801ffdc8 x22: 000000008009909e x23: 0000000000000000 x24: 0000000000000000 x25: 0000000000000000 x26: 0000000000000000 x27: 0000000000000000 x28: 0000000000000000 x29: 00000000801ffc50
Code: a94417e4 a90217e4 a9051fe6 a90617e4 (3d801fe0)
While executing instruction
str q0, [sp, #112]
in vsnprintf() prologue. This panic was observed only on Cortex A72 so far.
This patch places ISBs on other strategic places as well.
Also, this probably is the right fix for the issue workarounded in the commit 45f41c13 ("ARM: uniphier: add weird workaround code for LD20")
Reported-by: Oleksandr Andrushchenko oleksandr_andrushchenko@epam.com Suggested-by: Julien Grall julien.grall.oss@gmail.com Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com CC: Tom Rini trini@konsulko.com CC: Masahiro Yamada yamada.masahiro@socionext.com CC: Stefano Stabellini sstabellini@kernel.org
--
Changes from v1: - Added ISBs under CONFIG_ARMV8_SET_SMPEN and erratas. - Added Stefano, Julien and Oleksandr --- arch/arm/cpu/armv8/start.S | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S index 99d126660d..002698b501 100644 --- a/arch/arm/cpu/armv8/start.S +++ b/arch/arm/cpu/armv8/start.S @@ -120,6 +120,7 @@ pie_fixup_done: mov x0, #3 << 20 msr cpacr_el1, x0 /* Enable FP/SIMD */ 0: + isb
/* * Enable SMPEN bit for coherency. @@ -132,6 +133,7 @@ pie_fixup_done: mrs x0, S3_1_c15_c2_1 /* cpuectlr_el1 */ orr x0, x0, #0x40 msr S3_1_c15_c2_1, x0 + isb 1: #endif
@@ -233,6 +235,7 @@ apply_a53_core_errata: /* Enable data cache clean as data cache clean/invalidate */ orr x0, x0, #1 << 44 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */ + isb #endif b 0b
@@ -247,6 +250,7 @@ apply_a57_core_errata: /* Disable write streaming no-allocate threshold */ orr x0, x0, #3 << 27 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */ + isb #endif
#ifdef CONFIG_ARM_ERRATA_826974 @@ -254,6 +258,7 @@ apply_a57_core_errata: /* Disable speculative load execution ahead of a DMB */ orr x0, x0, #1 << 59 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */ + isb #endif
#ifdef CONFIG_ARM_ERRATA_833471 @@ -263,6 +268,7 @@ apply_a57_core_errata: could impact performance. */ orr x0, x0, #1 << 38 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */ + isb #endif
#ifdef CONFIG_ARM_ERRATA_829520 @@ -273,6 +279,7 @@ apply_a57_core_errata: could impact performance. */ orr x0, x0, #1 << 4 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */ + isb #endif
#ifdef CONFIG_ARM_ERRATA_833069 @@ -280,6 +287,7 @@ apply_a57_core_errata: /* Disable Enable Invalidates of BTB bit */ and x0, x0, #0xE msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */ + isb #endif b 0b ENDPROC(apply_core_errata)

(+ some Arm folks)
Hi Volodymyr,
On 24/06/2020 02:05, Volodymyr Babchuk wrote:
ARM Architecture reference manual clearly states that PE pipeline should be flushed after any change to system registers. Refer to paragraph "B2.3.5 Memory Barriers" at page B2-92 of "Arm Architecture Reference Manual ARMv8 for ARMv8-A Architecture Profile" (ARM DDI 0487B.a).
Failing to issue instruction memory synchronization barrier can lead to spurious errors, like synchronous exception when accessing FPU registers. This is very prominent on CPUs with long instruction pipeline, like ARM Cortex A72.
This change fixes the following U-Boot panic:
"Synchronous Abort" handler, esr 0x1fe00000 elr: 00000000800948cc lr : 0000000080091e04 x0 : 00000000801ffdc8 x1 : 00000000000000c8 x2 : 00000000800979d4 x3 : 00000000801ffc60 x4 : 00000000801ffd40 x5 : ffffff80ffffffd8 x6 : 00000000801ffd70 x7 : 00000000801ffd70 x8 : 000000000000000a x9 : 0000000000000000 x10: 0000000000000044 x11: 0000000000000000 x12: 0000000000000000 x13: 0000000000000000 x14: 0000000000000000 x15: 0000000000000000 x16: 000000008008b2e0 x17: 0000000000000000 x18: 00000000801ffec0 x19: 00000000800957b0 x20: 00000000000000c8 x21: 00000000801ffdc8 x22: 000000008009909e x23: 0000000000000000 x24: 0000000000000000 x25: 0000000000000000 x26: 0000000000000000 x27: 0000000000000000 x28: 0000000000000000 x29: 00000000801ffc50
Code: a94417e4 a90217e4 a9051fe6 a90617e4 (3d801fe0)
While executing instruction
str q0, [sp, #112]
in vsnprintf() prologue. This panic was observed only on Cortex A72 so far.
This patch places ISBs on other strategic places as well.
Also, this probably is the right fix for the issue workarounded in the commit 45f41c13 ("ARM: uniphier: add weird workaround code for LD20")
Reported-by: Oleksandr Andrushchenko oleksandr_andrushchenko@epam.com Suggested-by: Julien Grall julien.grall.oss@gmail.com
Please use julien@xen.org :).
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com CC: Tom Rini trini@konsulko.com CC: Masahiro Yamada yamada.masahiro@socionext.com CC: Stefano Stabellini sstabellini@kernel.org
Reviewed-by: Julien Grall julien@xen.org
--
Changes from v1:
- Added ISBs under CONFIG_ARMV8_SET_SMPEN and erratas.
- Added Stefano, Julien and Oleksandr
arch/arm/cpu/armv8/start.S | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S index 99d126660d..002698b501 100644 --- a/arch/arm/cpu/armv8/start.S +++ b/arch/arm/cpu/armv8/start.S @@ -120,6 +120,7 @@ pie_fixup_done: mov x0, #3 << 20 msr cpacr_el1, x0 /* Enable FP/SIMD */ 0:
isb
/*
- Enable SMPEN bit for coherency.
@@ -132,6 +133,7 @@ pie_fixup_done: mrs x0, S3_1_c15_c2_1 /* cpuectlr_el1 */ orr x0, x0, #0x40 msr S3_1_c15_c2_1, x0
- isb 1: #endif
@@ -233,6 +235,7 @@ apply_a53_core_errata: /* Enable data cache clean as data cache clean/invalidate */ orr x0, x0, #1 << 44 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
- isb #endif b 0b
@@ -247,6 +250,7 @@ apply_a57_core_errata: /* Disable write streaming no-allocate threshold */ orr x0, x0, #3 << 27 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb #endif
#ifdef CONFIG_ARM_ERRATA_826974
@@ -254,6 +258,7 @@ apply_a57_core_errata: /* Disable speculative load execution ahead of a DMB */ orr x0, x0, #1 << 59 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb #endif
#ifdef CONFIG_ARM_ERRATA_833471
@@ -263,6 +268,7 @@ apply_a57_core_errata: could impact performance. */ orr x0, x0, #1 << 38 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb #endif
#ifdef CONFIG_ARM_ERRATA_829520
@@ -273,6 +279,7 @@ apply_a57_core_errata: could impact performance. */ orr x0, x0, #1 << 4 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb #endif
#ifdef CONFIG_ARM_ERRATA_833069
@@ -280,6 +287,7 @@ apply_a57_core_errata: /* Disable Enable Invalidates of BTB bit */ and x0, x0, #0xE msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
- isb #endif b 0b ENDPROC(apply_core_errata)

On 24 Jun 2020, at 12:26, Julien Grall julien@xen.org wrote:
(+ some Arm folks)
Hi Volodymyr,
On 24/06/2020 02:05, Volodymyr Babchuk wrote:
ARM Architecture reference manual clearly states that PE pipeline should be flushed after any change to system registers. Refer to paragraph "B2.3.5 Memory Barriers" at page B2-92 of "Arm Architecture Reference Manual ARMv8 for ARMv8-A Architecture Profile" (ARM DDI 0487B.a). Failing to issue instruction memory synchronization barrier can lead to spurious errors, like synchronous exception when accessing FPU registers. This is very prominent on CPUs with long instruction pipeline, like ARM Cortex A72. This change fixes the following U-Boot panic: "Synchronous Abort" handler, esr 0x1fe00000 elr: 00000000800948cc lr : 0000000080091e04 x0 : 00000000801ffdc8 x1 : 00000000000000c8 x2 : 00000000800979d4 x3 : 00000000801ffc60 x4 : 00000000801ffd40 x5 : ffffff80ffffffd8 x6 : 00000000801ffd70 x7 : 00000000801ffd70 x8 : 000000000000000a x9 : 0000000000000000 x10: 0000000000000044 x11: 0000000000000000 x12: 0000000000000000 x13: 0000000000000000 x14: 0000000000000000 x15: 0000000000000000 x16: 000000008008b2e0 x17: 0000000000000000 x18: 00000000801ffec0 x19: 00000000800957b0 x20: 00000000000000c8 x21: 00000000801ffdc8 x22: 000000008009909e x23: 0000000000000000 x24: 0000000000000000 x25: 0000000000000000 x26: 0000000000000000 x27: 0000000000000000 x28: 0000000000000000 x29: 00000000801ffc50 Code: a94417e4 a90217e4 a9051fe6 a90617e4 (3d801fe0) While executing instruction str q0, [sp, #112] in vsnprintf() prologue. This panic was observed only on Cortex A72 so far. This patch places ISBs on other strategic places as well. Also, this probably is the right fix for the issue workarounded in the commit 45f41c13 ("ARM: uniphier: add weird workaround code for LD20") Reported-by: Oleksandr Andrushchenko oleksandr_andrushchenko@epam.com Suggested-by: Julien Grall julien.grall.oss@gmail.com
Please use julien@xen.org :).
Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com CC: Tom Rini trini@konsulko.com CC: Masahiro Yamada yamada.masahiro@socionext.com CC: Stefano Stabellini sstabellini@kernel.org
Reviewed-by: Julien Grall julien@xen.org
Reviewed-by Bertrand Marquis bertrand.marquis@arm.com
-- Changes from v1:
- Added ISBs under CONFIG_ARMV8_SET_SMPEN and erratas.
- Added Stefano, Julien and Oleksandr
arch/arm/cpu/armv8/start.S | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S index 99d126660d..002698b501 100644 --- a/arch/arm/cpu/armv8/start.S +++ b/arch/arm/cpu/armv8/start.S @@ -120,6 +120,7 @@ pie_fixup_done: movx0, #3 << 20 msrcpacr_el1, x0/* Enable FP/SIMD */ 0: +isb /*
- Enable SMPEN bit for coherency.
@@ -132,6 +133,7 @@ pie_fixup_done: mrs x0, S3_1_c15_c2_1 /* cpuectlr_el1 */ orr x0, x0, #0x40 msr S3_1_c15_c2_1, x0 +isb 1: #endif @@ -233,6 +235,7 @@ apply_a53_core_errata: /* Enable data cache clean as data cache clean/invalidate */ orrx0, x0, #1 << 44 msrS3_1_c15_c2_0, x0/* cpuactlr_el1 */ +isb #endif b 0b @@ -247,6 +250,7 @@ apply_a57_core_errata: /* Disable write streaming no-allocate threshold */ orrx0, x0, #3 << 27 msrS3_1_c15_c2_0, x0/* cpuactlr_el1 */ +isb #endif #ifdef CONFIG_ARM_ERRATA_826974 @@ -254,6 +258,7 @@ apply_a57_core_errata: /* Disable speculative load execution ahead of a DMB */ orrx0, x0, #1 << 59 msrS3_1_c15_c2_0, x0/* cpuactlr_el1 */ +isb #endif #ifdef CONFIG_ARM_ERRATA_833471 @@ -263,6 +268,7 @@ apply_a57_core_errata: could impact performance. */ orrx0, x0, #1 << 38 msrS3_1_c15_c2_0, x0/* cpuactlr_el1 */ +isb #endif #ifdef CONFIG_ARM_ERRATA_829520 @@ -273,6 +279,7 @@ apply_a57_core_errata: could impact performance. */ orrx0, x0, #1 << 4 msrS3_1_c15_c2_0, x0/* cpuactlr_el1 */ +isb #endif #ifdef CONFIG_ARM_ERRATA_833069 @@ -280,6 +287,7 @@ apply_a57_core_errata: /* Disable Enable Invalidates of BTB bit */ andx0, x0, #0xE msrS3_1_c15_c2_0, x0/* cpuactlr_el1 */ +isb #endif b 0b ENDPROC(apply_core_errata)
-- Julien Grall
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

On 24/06/2020 02:05, Volodymyr Babchuk wrote:
Hi Volodymyr,
thanks for the find and for taking care! And thanks Julien for forwarding!
ARM Architecture reference manual clearly states that PE pipeline should be flushed after any change to system registers.
Don't want to be pedantic here, but this does not affect all system registers. Some clearly don't need an ISB, some are self-synchronising. The manual speaks of "*Examples* of context-changing operations ..." Could you change this to "... after changes to some system registers."?
Refer to paragraph "B2.3.5 Memory Barriers" at page B2-92 of "Arm Architecture Reference Manual ARMv8 for ARMv8-A Architecture Profile" (ARM DDI 0487B.a).
Failing to issue instruction memory synchronization barrier can lead
It's just "instruction synchronization barrier". This is not really about memory, it's about making sure that the processor re-fetches the next instructions and "re-evaluates" them again, to take the changed system state into account. (Flushing the pipeline is just one possible way to implement this, btw.)
to spurious errors, like synchronous exception when accessing FPU registers. This is very prominent on CPUs with long instruction pipeline, like ARM Cortex A72.
This change fixes the following U-Boot panic:
"Synchronous Abort" handler, esr 0x1fe00000 elr: 00000000800948cc lr : 0000000080091e04 x0 : 00000000801ffdc8 x1 : 00000000000000c8 x2 : 00000000800979d4 x3 : 00000000801ffc60 x4 : 00000000801ffd40 x5 : ffffff80ffffffd8 x6 : 00000000801ffd70 x7 : 00000000801ffd70 x8 : 000000000000000a x9 : 0000000000000000 x10: 0000000000000044 x11: 0000000000000000 x12: 0000000000000000 x13: 0000000000000000 x14: 0000000000000000 x15: 0000000000000000 x16: 000000008008b2e0 x17: 0000000000000000 x18: 00000000801ffec0 x19: 00000000800957b0 x20: 00000000000000c8 x21: 00000000801ffdc8 x22: 000000008009909e x23: 0000000000000000 x24: 0000000000000000 x25: 0000000000000000 x26: 0000000000000000 x27: 0000000000000000 x28: 0000000000000000 x29: 00000000801ffc50
Code: a94417e4 a90217e4 a9051fe6 a90617e4 (3d801fe0)
While executing instruction
str q0, [sp, #112]
in vsnprintf() prologue. This panic was observed only on Cortex A72 so far.
This patch places ISBs on other strategic places as well.
Also, this probably is the right fix for the issue workarounded in the commit 45f41c13 ("ARM: uniphier: add weird workaround code for LD20")
Reported-by: Oleksandr Andrushchenko oleksandr_andrushchenko@epam.com Suggested-by: Julien Grall julien.grall.oss@gmail.com Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com CC: Tom Rini trini@konsulko.com CC: Masahiro Yamada yamada.masahiro@socionext.com CC: Stefano Stabellini sstabellini@kernel.org
I checked the A57 ACTLR_EL1 and ECTLR_EL1 bits affected below, and indeed an ISB seems to be in order here. Since this is the initialization code path, it wouldn't hurt anyway ;-)
Reviewed-by: Andre Przywara andre.przywara@arm.com
Cheers, Andre
--
Changes from v1:
- Added ISBs under CONFIG_ARMV8_SET_SMPEN and erratas.
- Added Stefano, Julien and Oleksandr
arch/arm/cpu/armv8/start.S | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S index 99d126660d..002698b501 100644 --- a/arch/arm/cpu/armv8/start.S +++ b/arch/arm/cpu/armv8/start.S @@ -120,6 +120,7 @@ pie_fixup_done: mov x0, #3 << 20 msr cpacr_el1, x0 /* Enable FP/SIMD */ 0:
isb
/*
- Enable SMPEN bit for coherency.
@@ -132,6 +133,7 @@ pie_fixup_done: mrs x0, S3_1_c15_c2_1 /* cpuectlr_el1 */ orr x0, x0, #0x40 msr S3_1_c15_c2_1, x0
- isb
1: #endif
@@ -233,6 +235,7 @@ apply_a53_core_errata: /* Enable data cache clean as data cache clean/invalidate */ orr x0, x0, #1 << 44 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
- isb
#endif b 0b
@@ -247,6 +250,7 @@ apply_a57_core_errata: /* Disable write streaming no-allocate threshold */ orr x0, x0, #3 << 27 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
- isb
#endif
#ifdef CONFIG_ARM_ERRATA_826974 @@ -254,6 +258,7 @@ apply_a57_core_errata: /* Disable speculative load execution ahead of a DMB */ orr x0, x0, #1 << 59 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
- isb
#endif
#ifdef CONFIG_ARM_ERRATA_833471 @@ -263,6 +268,7 @@ apply_a57_core_errata: could impact performance. */ orr x0, x0, #1 << 38 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
- isb
#endif
#ifdef CONFIG_ARM_ERRATA_829520 @@ -273,6 +279,7 @@ apply_a57_core_errata: could impact performance. */ orr x0, x0, #1 << 4 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
- isb
#endif
#ifdef CONFIG_ARM_ERRATA_833069 @@ -280,6 +287,7 @@ apply_a57_core_errata: /* Disable Enable Invalidates of BTB bit */ and x0, x0, #0xE msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
- isb
#endif b 0b ENDPROC(apply_core_errata)

On Wed, Jun 24, 2020 at 11:07 AM Volodymyr Babchuk Volodymyr_Babchuk@epam.com wrote:
ARM Architecture reference manual clearly states that PE pipeline should be flushed after any change to system registers. Refer to paragraph "B2.3.5 Memory Barriers" at page B2-92 of "Arm Architecture Reference Manual ARMv8 for ARMv8-A Architecture Profile" (ARM DDI 0487B.a).
Failing to issue instruction memory synchronization barrier can lead to spurious errors, like synchronous exception when accessing FPU registers. This is very prominent on CPUs with long instruction pipeline, like ARM Cortex A72.
This change fixes the following U-Boot panic:
"Synchronous Abort" handler, esr 0x1fe00000 elr: 00000000800948cc lr : 0000000080091e04 x0 : 00000000801ffdc8 x1 : 00000000000000c8 x2 : 00000000800979d4 x3 : 00000000801ffc60 x4 : 00000000801ffd40 x5 : ffffff80ffffffd8 x6 : 00000000801ffd70 x7 : 00000000801ffd70 x8 : 000000000000000a x9 : 0000000000000000 x10: 0000000000000044 x11: 0000000000000000 x12: 0000000000000000 x13: 0000000000000000 x14: 0000000000000000 x15: 0000000000000000 x16: 000000008008b2e0 x17: 0000000000000000 x18: 00000000801ffec0 x19: 00000000800957b0 x20: 00000000000000c8 x21: 00000000801ffdc8 x22: 000000008009909e x23: 0000000000000000 x24: 0000000000000000 x25: 0000000000000000 x26: 0000000000000000 x27: 0000000000000000 x28: 0000000000000000 x29: 00000000801ffc50
Code: a94417e4 a90217e4 a9051fe6 a90617e4 (3d801fe0)
While executing instruction
str q0, [sp, #112]
in vsnprintf() prologue. This panic was observed only on Cortex A72 so far.
This patch places ISBs on other strategic places as well.
Also, this probably is the right fix for the issue workarounded in the commit 45f41c13 ("ARM: uniphier: add weird workaround code for LD20")
Thanks for addressing this issue. Currently, I do not have a board in hand to test this. (I do not commute to the office due to COVID-19 these days...)
I have another SoC board, but it does not integrate CA72. I have ever seen this problem only on CA72.
Eventually, I will go to the office, and I can test this. But, you do not need to wait for my test if other people review it.
Thank you.
Reported-by: Oleksandr Andrushchenko oleksandr_andrushchenko@epam.com Suggested-by: Julien Grall julien.grall.oss@gmail.com Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com CC: Tom Rini trini@konsulko.com CC: Masahiro Yamada yamada.masahiro@socionext.com CC: Stefano Stabellini sstabellini@kernel.org
--
Changes from v1:
- Added ISBs under CONFIG_ARMV8_SET_SMPEN and erratas.
- Added Stefano, Julien and Oleksandr
arch/arm/cpu/armv8/start.S | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S index 99d126660d..002698b501 100644 --- a/arch/arm/cpu/armv8/start.S +++ b/arch/arm/cpu/armv8/start.S @@ -120,6 +120,7 @@ pie_fixup_done: mov x0, #3 << 20 msr cpacr_el1, x0 /* Enable FP/SIMD */ 0:
isb /* * Enable SMPEN bit for coherency.
@@ -132,6 +133,7 @@ pie_fixup_done: mrs x0, S3_1_c15_c2_1 /* cpuectlr_el1 */ orr x0, x0, #0x40 msr S3_1_c15_c2_1, x0
isb
1: #endif
@@ -233,6 +235,7 @@ apply_a53_core_errata: /* Enable data cache clean as data cache clean/invalidate */ orr x0, x0, #1 << 44 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb
#endif b 0b
@@ -247,6 +250,7 @@ apply_a57_core_errata: /* Disable write streaming no-allocate threshold */ orr x0, x0, #3 << 27 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb
#endif
#ifdef CONFIG_ARM_ERRATA_826974 @@ -254,6 +258,7 @@ apply_a57_core_errata: /* Disable speculative load execution ahead of a DMB */ orr x0, x0, #1 << 59 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb
#endif
#ifdef CONFIG_ARM_ERRATA_833471 @@ -263,6 +268,7 @@ apply_a57_core_errata: could impact performance. */ orr x0, x0, #1 << 38 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb
#endif
#ifdef CONFIG_ARM_ERRATA_829520 @@ -273,6 +279,7 @@ apply_a57_core_errata: could impact performance. */ orr x0, x0, #1 << 4 msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb
#endif
#ifdef CONFIG_ARM_ERRATA_833069 @@ -280,6 +287,7 @@ apply_a57_core_errata: /* Disable Enable Invalidates of BTB bit */ and x0, x0, #0xE msr S3_1_c15_c2_0, x0 /* cpuactlr_el1 */
isb
#endif b 0b ENDPROC(apply_core_errata) -- 2.27.0
-- Best Regards Masahiro Yamada

Hi.
On Mon, Jun 29, 2020 at 12:29 AM Masahiro Yamada masahiroy@kernel.org wrote:
On Wed, Jun 24, 2020 at 11:07 AM Volodymyr Babchuk Volodymyr_Babchuk@epam.com wrote:
ARM Architecture reference manual clearly states that PE pipeline should be flushed after any change to system registers. Refer to paragraph "B2.3.5 Memory Barriers" at page B2-92 of "Arm Architecture Reference Manual ARMv8 for ARMv8-A Architecture Profile" (ARM DDI 0487B.a).
Failing to issue instruction memory synchronization barrier can lead to spurious errors, like synchronous exception when accessing FPU registers. This is very prominent on CPUs with long instruction pipeline, like ARM Cortex A72.
This change fixes the following U-Boot panic:
"Synchronous Abort" handler, esr 0x1fe00000 elr: 00000000800948cc lr : 0000000080091e04 x0 : 00000000801ffdc8 x1 : 00000000000000c8 x2 : 00000000800979d4 x3 : 00000000801ffc60 x4 : 00000000801ffd40 x5 : ffffff80ffffffd8 x6 : 00000000801ffd70 x7 : 00000000801ffd70 x8 : 000000000000000a x9 : 0000000000000000 x10: 0000000000000044 x11: 0000000000000000 x12: 0000000000000000 x13: 0000000000000000 x14: 0000000000000000 x15: 0000000000000000 x16: 000000008008b2e0 x17: 0000000000000000 x18: 00000000801ffec0 x19: 00000000800957b0 x20: 00000000000000c8 x21: 00000000801ffdc8 x22: 000000008009909e x23: 0000000000000000 x24: 0000000000000000 x25: 0000000000000000 x26: 0000000000000000 x27: 0000000000000000 x28: 0000000000000000 x29: 00000000801ffc50
Code: a94417e4 a90217e4 a9051fe6 a90617e4 (3d801fe0)
While executing instruction
str q0, [sp, #112]
in vsnprintf() prologue. This panic was observed only on Cortex A72 so far.
This patch places ISBs on other strategic places as well.
Also, this probably is the right fix for the issue workarounded in the commit 45f41c13 ("ARM: uniphier: add weird workaround code for LD20")
Thanks for addressing this issue. Currently, I do not have a board in hand to test this. (I do not commute to the office due to COVID-19 these days...)
I have another SoC board, but it does not integrate CA72. I have ever seen this problem only on CA72.
Eventually, I will go to the office, and I can test this. But, you do not need to wait for my test if other people review it.
Thank you.
Today I tested this patch.
Yes, it fixes the CA72 problem on my board.
Tested-by: Masahiro Yamada yamada.masahiro@socionext.com
After this patch is picked up, I will revert the ugly workaround:
http://patchwork.ozlabs.org/project/uboot/patch/20200706042304.15853-1-yamad...
Interestingly, I observe this problem only on U-Boot running at EL1.
Anyway, this fix makes it work at any exception level.
-- Best Regards Masahiro Yamada

On Mon, Jul 6, 2020 at 1:33 PM Masahiro Yamada yamada.masahiro@socionext.com wrote:
Hi.
On Mon, Jun 29, 2020 at 12:29 AM Masahiro Yamada masahiroy@kernel.org wrote:
On Wed, Jun 24, 2020 at 11:07 AM Volodymyr Babchuk Volodymyr_Babchuk@epam.com wrote:
ARM Architecture reference manual clearly states that PE pipeline should be flushed after any change to system registers. Refer to paragraph "B2.3.5 Memory Barriers" at page B2-92 of "Arm Architecture Reference Manual ARMv8 for ARMv8-A Architecture Profile" (ARM DDI 0487B.a).
Failing to issue instruction memory synchronization barrier can lead to spurious errors, like synchronous exception when accessing FPU registers. This is very prominent on CPUs with long instruction pipeline, like ARM Cortex A72.
This change fixes the following U-Boot panic:
"Synchronous Abort" handler, esr 0x1fe00000 elr: 00000000800948cc lr : 0000000080091e04 x0 : 00000000801ffdc8 x1 : 00000000000000c8 x2 : 00000000800979d4 x3 : 00000000801ffc60 x4 : 00000000801ffd40 x5 : ffffff80ffffffd8 x6 : 00000000801ffd70 x7 : 00000000801ffd70 x8 : 000000000000000a x9 : 0000000000000000 x10: 0000000000000044 x11: 0000000000000000 x12: 0000000000000000 x13: 0000000000000000 x14: 0000000000000000 x15: 0000000000000000 x16: 000000008008b2e0 x17: 0000000000000000 x18: 00000000801ffec0 x19: 00000000800957b0 x20: 00000000000000c8 x21: 00000000801ffdc8 x22: 000000008009909e x23: 0000000000000000 x24: 0000000000000000 x25: 0000000000000000 x26: 0000000000000000 x27: 0000000000000000 x28: 0000000000000000 x29: 00000000801ffc50
Code: a94417e4 a90217e4 a9051fe6 a90617e4 (3d801fe0)
While executing instruction
str q0, [sp, #112]
in vsnprintf() prologue. This panic was observed only on Cortex A72 so far.
This patch places ISBs on other strategic places as well.
Also, this probably is the right fix for the issue workarounded in the commit 45f41c13 ("ARM: uniphier: add weird workaround code for LD20")
Thanks for addressing this issue. Currently, I do not have a board in hand to test this. (I do not commute to the office due to COVID-19 these days...)
I have another SoC board, but it does not integrate CA72. I have ever seen this problem only on CA72.
Eventually, I will go to the office, and I can test this. But, you do not need to wait for my test if other people review it.
Thank you.
Today I tested this patch.
Yes, it fixes the CA72 problem on my board.
Tested-by: Masahiro Yamada yamada.masahiro@socionext.com
Now that MW is open, I want to see this patch in mainline because it fixes the CA-72 problem in the right way.
There was no object except two minor comments from Andre.
Can Tom fix up the commit description when he picks it up?
"... after any change to system registers." ---> "... after changes to some system registers."
"instruction memory synchronization barrier" ---> "instruction synchronization barrier"
Thanks.

On Tue, Jul 07, 2020 at 08:37:53PM +0900, Masahiro Yamada wrote:
On Mon, Jul 6, 2020 at 1:33 PM Masahiro Yamada yamada.masahiro@socionext.com wrote:
Hi.
On Mon, Jun 29, 2020 at 12:29 AM Masahiro Yamada masahiroy@kernel.org wrote:
On Wed, Jun 24, 2020 at 11:07 AM Volodymyr Babchuk Volodymyr_Babchuk@epam.com wrote:
ARM Architecture reference manual clearly states that PE pipeline should be flushed after any change to system registers. Refer to paragraph "B2.3.5 Memory Barriers" at page B2-92 of "Arm Architecture Reference Manual ARMv8 for ARMv8-A Architecture Profile" (ARM DDI 0487B.a).
Failing to issue instruction memory synchronization barrier can lead to spurious errors, like synchronous exception when accessing FPU registers. This is very prominent on CPUs with long instruction pipeline, like ARM Cortex A72.
This change fixes the following U-Boot panic:
"Synchronous Abort" handler, esr 0x1fe00000 elr: 00000000800948cc lr : 0000000080091e04 x0 : 00000000801ffdc8 x1 : 00000000000000c8 x2 : 00000000800979d4 x3 : 00000000801ffc60 x4 : 00000000801ffd40 x5 : ffffff80ffffffd8 x6 : 00000000801ffd70 x7 : 00000000801ffd70 x8 : 000000000000000a x9 : 0000000000000000 x10: 0000000000000044 x11: 0000000000000000 x12: 0000000000000000 x13: 0000000000000000 x14: 0000000000000000 x15: 0000000000000000 x16: 000000008008b2e0 x17: 0000000000000000 x18: 00000000801ffec0 x19: 00000000800957b0 x20: 00000000000000c8 x21: 00000000801ffdc8 x22: 000000008009909e x23: 0000000000000000 x24: 0000000000000000 x25: 0000000000000000 x26: 0000000000000000 x27: 0000000000000000 x28: 0000000000000000 x29: 00000000801ffc50
Code: a94417e4 a90217e4 a9051fe6 a90617e4 (3d801fe0)
While executing instruction
str q0, [sp, #112]
in vsnprintf() prologue. This panic was observed only on Cortex A72 so far.
This patch places ISBs on other strategic places as well.
Also, this probably is the right fix for the issue workarounded in the commit 45f41c13 ("ARM: uniphier: add weird workaround code for LD20")
Thanks for addressing this issue. Currently, I do not have a board in hand to test this. (I do not commute to the office due to COVID-19 these days...)
I have another SoC board, but it does not integrate CA72. I have ever seen this problem only on CA72.
Eventually, I will go to the office, and I can test this. But, you do not need to wait for my test if other people review it.
Thank you.
Today I tested this patch.
Yes, it fixes the CA72 problem on my board.
Tested-by: Masahiro Yamada yamada.masahiro@socionext.com
Now that MW is open, I want to see this patch in mainline because it fixes the CA-72 problem in the right way.
There was no object except two minor comments from Andre.
Can Tom fix up the commit description when he picks it up?
"... after any change to system registers." ---> "... after changes to some system registers."
"instruction memory synchronization barrier" ---> "instruction synchronization barrier"
Yes, thanks. I've been wanting to put this directly to master (and not part of the next merge) for easier bisecting should there be a problem. I'm running this (and a few other risky patches) through CI now.

On Wed, Jun 24, 2020 at 01:05:19AM +0000, Volodymyr Babchuk wrote:
ARM Architecture reference manual clearly states that PE pipeline should be flushed after any change to system registers. Refer to paragraph "B2.3.5 Memory Barriers" at page B2-92 of "Arm Architecture Reference Manual ARMv8 for ARMv8-A Architecture Profile" (ARM DDI 0487B.a).
Failing to issue instruction memory synchronization barrier can lead to spurious errors, like synchronous exception when accessing FPU registers. This is very prominent on CPUs with long instruction pipeline, like ARM Cortex A72.
This change fixes the following U-Boot panic:
"Synchronous Abort" handler, esr 0x1fe00000 elr: 00000000800948cc lr : 0000000080091e04 x0 : 00000000801ffdc8 x1 : 00000000000000c8 x2 : 00000000800979d4 x3 : 00000000801ffc60 x4 : 00000000801ffd40 x5 : ffffff80ffffffd8 x6 : 00000000801ffd70 x7 : 00000000801ffd70 x8 : 000000000000000a x9 : 0000000000000000 x10: 0000000000000044 x11: 0000000000000000 x12: 0000000000000000 x13: 0000000000000000 x14: 0000000000000000 x15: 0000000000000000 x16: 000000008008b2e0 x17: 0000000000000000 x18: 00000000801ffec0 x19: 00000000800957b0 x20: 00000000000000c8 x21: 00000000801ffdc8 x22: 000000008009909e x23: 0000000000000000 x24: 0000000000000000 x25: 0000000000000000 x26: 0000000000000000 x27: 0000000000000000 x28: 0000000000000000 x29: 00000000801ffc50
Code: a94417e4 a90217e4 a9051fe6 a90617e4 (3d801fe0)
While executing instruction
str q0, [sp, #112]
in vsnprintf() prologue. This panic was observed only on Cortex A72 so far.
This patch places ISBs on other strategic places as well.
Also, this probably is the right fix for the issue workarounded in the commit 45f41c13 ("ARM: uniphier: add weird workaround code for LD20")
Reported-by: Oleksandr Andrushchenko oleksandr_andrushchenko@epam.com Suggested-by: Julien Grall julien.grall.oss@gmail.com Signed-off-by: Volodymyr Babchuk volodymyr_babchuk@epam.com CC: Tom Rini trini@konsulko.com CC: Masahiro Yamada yamada.masahiro@socionext.com CC: Stefano Stabellini sstabellini@kernel.org Reviewed-by: Julien Grall julien@xen.org Reviewed-by: Andre Przywara andre.przywara@arm.com Tested-by: Masahiro Yamada yamada.masahiro@socionext.com
Applied to u-boot/master, thanks!
participants (7)
-
André Przywara
-
Bertrand Marquis
-
Julien Grall
-
Masahiro Yamada
-
Masahiro Yamada
-
Tom Rini
-
Volodymyr Babchuk