[U-Boot] Slow standalone routine with U-Boot

Dear Sirs,
I am trying to implement a memory and CPU demanding standalone application on the beagleboard by hacking U-Boot.
To my surprise, the U-Boot program is ten times slower than the same program running under Linux on the beagleboard.
This test loop, inserted into main_loop() in u-boot/common/main.c, takes 8.9 s with caches enabled and 10.1 s with caches disabled:
    unsigned long *p;
    long i, error;

    p = (unsigned long *)0x81000000;    /* fixed test address in RAM */
    printf("filling:\n");
    for (i = 0; i < 25000000; i++)
        p[i] = i;
    printf("testing:\n");
    error = 0;
    for (i = 0; i < 25000000; i++) {
        if (p[i] != (unsigned long)i) {
            error = i + 1;    /* index + 1 of first mismatch; 0 means none */
            break;
        }
    }
    printf("error=%ld\n", error);
The same loop (with p = malloc(25000000 * 4)) takes less than a second under Angstrom Linux on the beagleboard.
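For comparison, the hosted variant of the loop above can be sketched roughly as follows. This is only a minimal sketch, not the exact program that was measured: the names fill_and_verify() and timed_memory_test() are hypothetical, timing uses the standard clock() call (CPU time, not wall-clock time), and the buffer comes from malloc() instead of a fixed address.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Fill n words with their own index, then read them back.
 * Returns 0 on success, or (index + 1) of the first mismatch,
 * the same error convention as the loop above.
 * The pointer is volatile so the compiler keeps the memory accesses. */
long fill_and_verify(volatile unsigned long *p, long n)
{
    long i;

    for (i = 0; i < n; i++)
        p[i] = (unsigned long)i;

    for (i = 0; i < n; i++) {
        if (p[i] != (unsigned long)i)
            return i + 1;
    }
    return 0;
}

/* Hypothetical helper: allocate n words, time the test with clock(),
 * and report the elapsed CPU seconds through *seconds.
 * Returns -1 if the allocation fails, otherwise the test result. */
long timed_memory_test(long n, double *seconds)
{
    unsigned long *p = malloc((size_t)n * sizeof *p);
    clock_t start;
    long error;

    if (p == NULL)
        return -1;

    start = clock();
    error = fill_and_verify(p, n);
    *seconds = (double)(clock() - start) / CLOCKS_PER_SEC;

    free(p);
    return error;
}
```

Calling timed_memory_test(25000000, &seconds) from main() and printing both the return value and seconds would reproduce the Linux side of the measurement, assuming enough free memory for the allocation.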
The routine cpy_clk_code in lowlevel_init.S that copies the go_to_speed routine to SRAM is executed at start-up but the go_to_speed code apparently never gets executed. I have verified this by inserting hang loops in cpy_clk_code() and go_to_speed().
Beagleboard version = C4
U-Boot version = u-boot-2010.09

Dear Lennart Sundberg,
In message AANLkTik6wti_DTFLsPfT+y_ZtHFsz-XrQx4tF8H_Lnj5@mail.gmail.com you wrote:
> I am trying to implement a memory and CPU demanding standalone application on the beagleboard by hacking U-Boot.
> To my surprise, the U-Boot program is ten times slower than the same program running under Linux on the beagleboard.
This is not a surprise...
> U-Boot version = u-boot-2010.09
...given that old version of U-Boot.
In that old code, U-Boot was running with caches disabled, which explains the poor performance you see.
Update and use recent code instead. It will then include this commit:
commit 95c6f6d34d4ff23f4d005488d84682eec5fa9ec8
Author: Heiko Schocher <hs@denx.de>
Date: Fri Sep 17 13:10:31 2010 +0200
ARM V7 (OMAP): add data cache support, test on Beagle board
Add data cache support for ARM V7 systems. Used cache flush functions from linux:arch/arm/mm/cache-v7.S developed from Catalin Marinas.
Enable "cache" command on Beagle board and test performance.
Test 1: Loading 127 MB of data from NAND flash into RAM:
Instr. Cache          off     on      on
Data Cache            off     off     on
--------------------------------------------------
Beagle (Cortex A8)    116s    106s    30.3s    = x 3.8
Test 2: uncompressing a gzipped image from RAM to RAM (size compressed: 6.5 MiB, uncompressed: 35 MiB):
Instr. Cache          off     on      on
Data Cache            off     off     on
--------------------------------------------------
Beagle (Cortex A8)    1.84s   1.64s   0.12s    = x 15.3
Portions of this work were supported by funding from the CE Linux Forum.
Signed-off-by: Heiko Schocher <hs@denx.de>
Reviewed-by: Ben Gardiner <bengardiner@nanometrics.ca>
As you can see, CPU and memory bound execution may improve by a factor of 15 or so ...
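The factors quoted in the commit message are simply the time with both caches off divided by the time with both caches on. A small sketch of that arithmetic, with the hypothetical helper names speedup() and print_speedup():

```c
#include <stdio.h>

/* Speedup factor: run time without caches divided by run time with caches. */
double speedup(double seconds_before, double seconds_after)
{
    return seconds_before / seconds_after;
}

/* Hypothetical helper that prints one result line for a test case. */
void print_speedup(const char *label, double before, double after)
{
    printf("%s: %.2f s -> %.2f s = x %.1f\n",
           label, before, after, speedup(before, after));
}
```

With the numbers from the tables, speedup(116.0, 30.3) gives about 3.8 for the NAND load and speedup(1.84, 0.12) gives about 15.3 for the gunzip test.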
Best regards,
Wolfgang Denk