[U-Boot] [PATCH] OMAP3: workaround for ARM Cortex-A8 erratum 725233

725233: PLD instructions executed with PLD data forwarding enabled can result in a processor deadlock
Signed-off-by: Siarhei Siamashka siarhei.siamashka@gmail.com --- cpu/arm_cortexa8/omap3/board.c | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/cpu/arm_cortexa8/omap3/board.c b/cpu/arm_cortexa8/omap3/board.c index 2aa69b3..7b78fa4 100644 --- a/cpu/arm_cortexa8/omap3/board.c +++ b/cpu/arm_cortexa8/omap3/board.c @@ -146,6 +146,12 @@ void setup_auxcr() __asm__ __volatile__("orr r0, r0, #1 << 5"); /* SMI instruction to call ROM Code API */ __asm__ __volatile__(".word 0xE1600070"); + /* Set PLD_FWD bit in L2AUXCR (Cortex-A8 erratum 725233 workaround) */ + __asm__ __volatile__("mov r12, #0x2"); + __asm__ __volatile__("mrc p15, 1, r0, c9, c0, 2"); + __asm__ __volatile__("orr r0, r0, #1 << 27"); + /* SMI instruction to call ROM Code API */ + __asm__ __volatile__(".word 0xE1600070"); __asm__ __volatile__("mov r0, %0":"=r"(i)); __asm__ __volatile__("mov r12, %0":"=r"(j)); }

Setting ASA bit hurts performance for the code which has lots of I-cache misses and there are no Cortex-A8 errata workarounds which would require to have it.
A test program which intentionally stresses I-cache misses on conditional branches is attached.
ASA bit is not set:
real 0m2.940s user 0m2.930s sys 0m0.008s
ASA bit is set:
real 0m3.470s user 0m3.461s sys 0m0.008s
The difference on some real applications is much more modest and is just something like ~0.5%, but every little bit helps.
/**** start of bench_ASA.c ****/ void __attribute__((naked)) f(int count, void *rand) { asm volatile ( " push {r4, r5, r6, lr}\n" " mov r4, r0\n" " mov r5, r1\n" "0:\n" ".rept 4096\n" " blx r5\n" " tst r0, #1\n" " bne 1f\n" " b 2f\n" ".balign 64\n" "1:\n" ".rept 15\n" " add r0, r0, #0\n" ".endr\n" " b 3f\n" ".balign 64\n" "2:\n" ".rept 16\n" " add r0, r0, #0\n" ".endr\n" "3:\n" ".endr\n" " subs r4, r4, #1\n" " bgt 0b\n" " pop {r4, r5, r6, pc}\n" ); } int main() { f(1000, rand); return 0; } /**** end of bench_ASA.c ****/
Signed-off-by: Siarhei Siamashka siarhei.siamashka@gmail.com --- cpu/arm_cortexa8/omap3/board.c | 2 -- 1 files changed, 0 insertions(+), 2 deletions(-)
diff --git a/cpu/arm_cortexa8/omap3/board.c b/cpu/arm_cortexa8/omap3/board.c index 7b78fa4..b9b71dc 100644 --- a/cpu/arm_cortexa8/omap3/board.c +++ b/cpu/arm_cortexa8/omap3/board.c @@ -140,8 +140,6 @@ void setup_auxcr() */ __asm__ __volatile__("mov r12, #0x3"); __asm__ __volatile__("mrc p15, 0, r0, c1, c0, 1"); - /* Enabling ASA */ - __asm__ __volatile__("orr r0, r0, #0x10"); /* Enable L1NEON */ __asm__ __volatile__("orr r0, r0, #1 << 5"); /* SMI instruction to call ROM Code API */

Siarhei Siamashka wrote:
725233: PLD instructions executed with PLD data forwarding enabled can result in a processor deadlock
Signed-off-by: Siarhei Siamashka siarhei.siamashka@gmail.com
Please add a detailed comment on the errata. Also looks like this jumping to ROM code. Can this be done without a ROM code call ? Tom
cpu/arm_cortexa8/omap3/board.c | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/cpu/arm_cortexa8/omap3/board.c b/cpu/arm_cortexa8/omap3/board.c index 2aa69b3..7b78fa4 100644 --- a/cpu/arm_cortexa8/omap3/board.c +++ b/cpu/arm_cortexa8/omap3/board.c @@ -146,6 +146,12 @@ void setup_auxcr() __asm__ __volatile__("orr r0, r0, #1 << 5"); /* SMI instruction to call ROM Code API */ __asm__ __volatile__(".word 0xE1600070");
- /* Set PLD_FWD bit in L2AUXCR (Cortex-A8 erratum 725233 workaround) */
- __asm__ __volatile__("mov r12, #0x2");
- __asm__ __volatile__("mrc p15, 1, r0, c9, c0, 2");
- __asm__ __volatile__("orr r0, r0, #1 << 27");
- /* SMI instruction to call ROM Code API */
- __asm__ __volatile__(".word 0xE1600070"); __asm__ __volatile__("mov r0, %0":"=r"(i)); __asm__ __volatile__("mov r12, %0":"=r"(j));
}

On Sunday 07 February 2010, Tom wrote:
Siarhei Siamashka wrote:
725233: PLD instructions executed with PLD data forwarding enabled can result in a processor deadlock
Signed-off-by: Siarhei Siamashka siarhei.siamashka@gmail.com
Please add a detailed comment on the errata.
The short summary is pretty much complete (PLD data forwarding is not working correctly and needs to be turned off to prevent problems). Though it really makes sense to add that: 1. only r1pX revisions of Cortex-A8 are affected 2. performance impact is practically non-existant
The detailed description of this erratum is available in the official Cortex-A8 errata list. I'm not sure about directly copying full erratum description text here. Anyway, this workaround would be better submitted by somebody TI. I'm just "guilty" of triggering this deadlock with some ARM NEON optimizations from pixman library, so feel a bit of responsibility for it too.
Also looks like this jumping to ROM code. Can this be done without a ROM code call ?
I just added it to the part of code which is already doing similar things (sets L1NEON and other bits from AUXCR with some icky looking assembly via ROM code call). Most likely it can be also done in some other way.

Siarhei Siamashka siarhei.siamashka@gmail.com writes:
On Sunday 07 February 2010, Tom wrote:
Siarhei Siamashka wrote:
725233: PLD instructions executed with PLD data forwarding enabled can result in a processor deadlock
Signed-off-by: Siarhei Siamashka siarhei.siamashka@gmail.com
Please add a detailed comment on the errata.
The short summary is pretty much complete (PLD data forwarding is not working correctly and needs to be turned off to prevent problems). Though it really makes sense to add that:
- only r1pX revisions of Cortex-A8 are affected
- performance impact is practically non-existant
The detailed description of this erratum is available in the official Cortex-A8 errata list. I'm not sure about directly copying full erratum description text here. Anyway, this workaround would be better submitted by somebody TI. I'm just "guilty" of triggering this deadlock with some ARM NEON optimizations from pixman library, so feel a bit of responsibility for it too.
Also looks like this jumping to ROM code. Can this be done without a ROM code call ?
I just added it to the part of code which is already doing similar things (sets L1NEON and other bits from AUXCR with some icky looking assembly via ROM code call). Most likely it can be also done in some other way.
L2AUX is only writable in secure mode, which on GP devices implies using a ROM call. Nothing wrong with that.

On Monday 08 February 2010, Måns Rullgård wrote:
Siarhei Siamashka siarhei.siamashka@gmail.com writes:
On Sunday 07 February 2010, Tom wrote:
Siarhei Siamashka wrote:
725233: PLD instructions executed with PLD data forwarding enabled can result in a processor deadlock
Signed-off-by: Siarhei Siamashka siarhei.siamashka@gmail.com
Please add a detailed comment on the errata.
The short summary is pretty much complete (PLD data forwarding is not working correctly and needs to be turned off to prevent problems). Though it really makes sense to add that:
- only r1pX revisions of Cortex-A8 are affected
- performance impact is practically non-existant
The detailed description of this erratum is available in the official Cortex-A8 errata list. I'm not sure about directly copying full erratum description text here. Anyway, this workaround would be better submitted by somebody TI. I'm just "guilty" of triggering this deadlock with some ARM NEON optimizations from pixman library, so feel a bit of responsibility for it too.
Also looks like this jumping to ROM code. Can this be done without a ROM code call ?
I just added it to the part of code which is already doing similar things (sets L1NEON and other bits from AUXCR with some icky looking assembly via ROM code call). Most likely it can be also done in some other way.
L2AUX is only writable in secure mode, which on GP devices implies using a ROM call. Nothing wrong with that.
OK, thanks for the clarification. Can I add you to Acked-by field in the soon to be resubmitted patch with updated comment?
I just remembered the following patch and had a doubt for a second: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=...
Have not tested this yet (IBE bit is set from u-boot on my beagleboard at the moment), but apparently the MCR instruction used in that kernel patch has no effect on OMAP3 and setting IBE bit in u-boot is also required for anybody who wants to have thumb support which is not totally broken. Additionally it would be nice to do something about related Cortex-A8 erratas #687067 and #454179. The status of L1 System Array Debug Register on the beagleboard is particularly interesting.
Maybe it is a good time to do a full review of all the needed workarounds and update the u-boot code responsible for applying them on omap3 devices?
participants (3)
-
Måns Rullgård
-
Siarhei Siamashka
-
Tom