[U-Boot] OMAP3 patch to improve performance?

I came across this patch that might be useful
http://lists.denx.de/pipermail/u-boot/2010-February/067534.html
In any case since the u-boot structure has changed this cannot be applied. If anybody is interested please send a new patch.
Regards, Sandeep

Dear Sandeep,
In message 0554BEF07D437848AF01B9C9B5F0BC5DBD1C11AD@dlee01.ent.ti.com you wrote:
I came across this patch that might be useful
http://lists.denx.de/pipermail/u-boot/2010-February/067534.html
In any case since the u-boot structure has changed this cannot be applied. If anybody is interested please send a new patch.
That's actually prtty straightforward. Here it comes (completely untested :-)
----------------------------------------------------------------------
From 0223e42a48417ad18276d2709206b404c8796807 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka siarhei.siamashka@gmail.com Date: Sat, 6 Feb 2010 16:19:46 +0000 Subject: [PATCH] OMAP3: remove useless ASA bit from AUXCR
Setting ASA bit hurts performance for the code which has lots of I-cache misses and there are no Cortex-A8 errata workarounds which would require to have it.
A test program which intentionally stresses I-cache misses on conditional branches is attached.
ASA bit is not set:
real 0m2.940s user 0m2.930s sys 0m0.008s
ASA bit is set:
real 0m3.470s user 0m3.461s sys 0m0.008s
The difference on some real applications is much more modest and is just something like ~0.5%, but every little bit helps.
/**** start of bench_ASA.c ****/ void __attribute__((naked)) f(int count, void *rand) { asm volatile ( " push {r4, r5, r6, lr}\n" " mov r4, r0\n" " mov r5, r1\n" "0:\n" ".rept 4096\n" " blx r5\n" " tst r0, #1\n" " bne 1f\n" " b 2f\n" ".balign 64\n" "1:\n" ".rept 15\n" " add r0, r0, #0\n" ".endr\n" " b 3f\n" ".balign 64\n" "2:\n" ".rept 16\n" " add r0, r0, #0\n" ".endr\n" "3:\n" ".endr\n" " subs r4, r4, #1\n" " bgt 0b\n" " pop {r4, r5, r6, pc}\n" ); } int main() { f(1000, rand); return 0; } /**** end of bench_ASA.c ****/
Signed-off-by: Siarhei Siamashka siarhei.siamashka@gmail.com --- arch/arm/cpu/armv7/omap3/cache.S | 1 - 1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/arch/arm/cpu/armv7/omap3/cache.S b/arch/arm/cpu/armv7/omap3/cache.S index cda87ba..2854771 100644 --- a/arch/arm/cpu/armv7/omap3/cache.S +++ b/arch/arm/cpu/armv7/omap3/cache.S @@ -169,7 +169,6 @@ setup_auxcr: orr r1, r3, r2, lsr #20-4 @ combine variant and revision mov r12, #0x3 mrc p15, 0, r0, c1, c0, 1 - orr r0, r0, #0x10 @ Enable ASA @ Enable L1NEON on pre-r2p1 (erratum 621766 workaround) cmp r1, #0x21 orrlt r0, r0, #1 << 5

Dear Sandeep,
In message 0554BEF07D437848AF01B9C9B5F0BC5DBD1C11AD@dlee01.ent.ti.com you wrote:
I came across this patch that might be useful
http://lists.denx.de/pipermail/u-boot/2010-February/067534.html
In any case since the u-boot structure has changed this cannot be applied. If anybody is interested please send a new patch.
That's actually prtty straightforward. Here it comes (completely untested :-)
Thanks but this now effects both OMAP3 and OMAP4, it needs quite a bit of testing in my opinion.
From 0223e42a48417ad18276d2709206b404c8796807 Mon Sep 17 00:00:00 2001 From: Siarhei Siamashka siarhei.siamashka@gmail.com Date: Sat, 6 Feb 2010 16:19:46 +0000 Subject: [PATCH] OMAP3: remove useless ASA bit from AUXCR
Setting ASA bit hurts performance for the code which has lots of I-cache misses and there are no Cortex-A8 errata workarounds which would require to have it.
A test program which intentionally stresses I-cache misses on conditional branches is attached.
ASA bit is not set:
real 0m2.940s user 0m2.930s sys 0m0.008s
ASA bit is set:
real 0m3.470s user 0m3.461s sys 0m0.008s
The difference on some real applications is much more modest and is just something like ~0.5%, but every little bit helps.
/**** start of bench_ASA.c ****/ void __attribute__((naked)) f(int count, void *rand) { asm volatile ( " push {r4, r5, r6, lr}\n" " mov r4, r0\n" " mov r5, r1\n" "0:\n" ".rept 4096\n" " blx r5\n" " tst r0, #1\n" " bne 1f\n" " b 2f\n" ".balign 64\n" "1:\n" ".rept 15\n" " add r0, r0, #0\n" ".endr\n" " b 3f\n" ".balign 64\n" "2:\n" ".rept 16\n" " add r0, r0, #0\n" ".endr\n" "3:\n" ".endr\n" " subs r4, r4, #1\n" " bgt 0b\n" " pop {r4, r5, r6, pc}\n" ); } int main() { f(1000, rand); return 0; } /**** end of bench_ASA.c ****/
Signed-off-by: Siarhei Siamashka siarhei.siamashka@gmail.com
arch/arm/cpu/armv7/omap3/cache.S | 1 - 1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/arch/arm/cpu/armv7/omap3/cache.S b/arch/arm/cpu/armv7/omap3/cache.S index cda87ba..2854771 100644 --- a/arch/arm/cpu/armv7/omap3/cache.S +++ b/arch/arm/cpu/armv7/omap3/cache.S @@ -169,7 +169,6 @@ setup_auxcr: orr r1, r3, r2, lsr #20-4 @ combine variant and revision mov r12, #0x3 mrc p15, 0, r0, c1, c0, 1
- orr r0, r0, #0x10 @ Enable ASA @ Enable L1NEON on pre-r2p1 (erratum 621766 workaround) cmp r1, #0x21 orrlt r0, r0, #1 << 5
-- 1.7.3.2
Best regards,
Wolfgang Denk
-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de Never put off until tomorrow what you can put off indefinitely.

On Saturday 20 November 2010 01:12:22 Paulraj, Sandeep wrote:
Dear Sandeep,
In message 0554BEF07D437848AF01B9C9B5F0BC5DBD1C11AD@dlee01.ent.ti.com
you wrote:
I came across this patch that might be useful
http://lists.denx.de/pipermail/u-boot/2010-February/067534.html
In any case since the u-boot structure has changed this cannot be applied. If anybody is interested please send a new patch.
That's actually prtty straightforward. Here it comes (completely untested :-)
Yes, this straightforward change looks ok to me.
Thanks but this now effects both OMAP3 and OMAP4, it needs quite a bit of testing in my opinion.
Why do you think it affects OMAP4?
+++ b/arch/arm/cpu/armv7/omap3/cache.S
^^^^^
OMAP4 uses Cortex-A9 with totally different bits in Auxiliary Control Register. This ASA bit (and the patch) is only relevant for Cortex-A8.
Anyway, additional testing and ACK would be surely welcome :-)
Is there anything else expected to be done from my side? As I mentioned before, I would really prefer if TI could deal with this stuff themselves. Just because they ought to have all the most relevant information.

On Fri, Nov 19, 2010 at 11:56 PM, Paulraj, Sandeep s-paulraj@ti.com wrote:
I came across this patch that might be useful
http://lists.denx.de/pipermail/u-boot/2010-February/067534.html
In any case since the u-boot structure has changed this cannot be applied. If anybody is interested please send a new patch.
There are actually more problems (already known to TI) related to the use of ASA bit on OMAP3 Cortex-A8: http://article.gmane.org/gmane.linux.kernel/1048445
I don't know any of the details related to these reliability issues mentioned by Richard. But I still think that u-boot should not be setting ASA bit, there are just too many reasons not to do that.
participants (3)
-
Paulraj, Sandeep
-
Siarhei Siamashka
-
Wolfgang Denk