[U-Boot] [RFC][PATCH 00/19] arm: add full relocation / cache support

This patch series adds full relocation and cache support for ARM-based boards. I did this for arm1136, arm_cortexa8 and arm926ejs based boards. As this change is not compatible with the old code, *all* platforms and boards have to be converted before this can go to mainline! As I don't have access to all platforms/boards, I need help here! Also, I couldn't test all boards, so please test, report, and send bugfixes!
Relocation support:
I changed arch/arm/lib/board.c to bring it in sync with arch/powerpc/lib/board.c. Maybe it is possible to merge them into one arch/generic/lib/board.c?
This approach is similar to powerpc, so there is a need for an initial stack pointer address, defined through CONFIG_SYS_INIT_SP_ADDR. As I don't know all architectures/boards, I set this value to RAM wherever I didn't find a place for such a stack (for example in the processor), as RAM is currently set up in low_level_init.S. For boards where this is in RAM, please check whether there is a "better" place.
Please also read doc/README.arm-relocation. It has more information on what is done and what maybe should be done.
Cache support:
I used the patches from Alessandro Rubini: http://lists.denx.de/pipermail/u-boot/2010-January/067099.html
and rebased them onto the current code. Also, with full relocation the position of the TLB cannot be set at compile time; instead it is calculated in board_init_f() and stored in gd. I also added support for arm_cortexa8 and arm1136.
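To illustrate the run-time TLB placement, here is a minimal, host-runnable sketch. It is not the code from this series. It assumes the table is carved out just below the current top of free RAM; the names `gd_sketch` and `reserve_tlb` and the rounding to 64 KiB are made up for illustration. The 16 KiB size follows from an ARM first-level translation table having 4096 entries of 4 bytes each.

```c
#include <assert.h>
#include <stdint.h>

#define TLB_TABLE_SIZE (4096u * 4u) /* 4096 first-level entries, 4 bytes each */
#define TLB_ALIGN      0x10000u     /* assumption: round down to 64 KiB */

/* Hypothetical stand-in for the field the series stores in global data. */
struct gd_sketch {
    uint32_t tlb_addr;
};

/*
 * Reserve room for the translation table below `top`, the current top of
 * free RAM. Stores the table address in gd and returns it as the new top.
 */
static uint32_t reserve_tlb(struct gd_sketch *gd, uint32_t top)
{
    assert(top >= TLB_TABLE_SIZE);

    uint32_t addr = top - TLB_TABLE_SIZE; /* make room for the table */
    addr &= ~(TLB_ALIGN - 1u);            /* round down to the alignment */
    gd->tlb_addr = addr;
    return addr;
}
```

For example, with 128 MiB of RAM starting at 0x40000000 (top at 0x48000000), this sketch would place the table at 0x47FF0000.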
As this is an RFC, this patch series is in a state where things *can* and *should* be discussed!
Also I think, it would be good to create an "arm-relocation.git" where patches (for example board maintainers bugfixes) for this change are collected.
I tested the patch series on the qong (arm1136), beagle (arm_cortexa8), tx25 (arm926ejs) and magnesium (arm926ejs) boards. Relocation worked fine.
Cache results:
Test 1: Loading 127 MB of data from NAND flash into RAM:
Instr. Cache           off      on       on
Data Cache             off      off      on
--------------------------------------------------
QONG (ARM11)           177s     95s      43s     = x 4.1
Beagle (Cortex A8)     116s     106s     30.3s   = x 3.8
Test 2: uncompressing a gzipped image from RAM to RAM (size compressed: 6.5 MiB, uncompressed: 35 MiB):
Instr. Cache           off      on       on
Data Cache             off      off      on
--------------------------------------------------
QONG (ARM11)           1.54s    0.95s    0.18s   = x 8.6
Beagle (Cortex A8)     1.84s    1.64s    0.12s   = x 15.3
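The factors in the last column are the time with both caches off divided by the time with both caches on. A small sketch of that arithmetic (the helper name `speedup` is only for illustration):

```c
#include <assert.h>

/* Speed-up factor: run time with caches off divided by run time with caches on. */
static double speedup(double t_caches_off, double t_caches_on)
{
    assert(t_caches_on > 0.0);
    return t_caches_off / t_caches_on;
}
```

For the QONG row of Test 1, `speedup(177.0, 43.0)` gives roughly 4.1.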
Heiko Schocher (19):
  arm: get rid of bi_env
  relocation: fixup cmdtable
  common: move TOTAL_MALLOC_LEN to include/common.h
  arm, arm1136, qong: add relocation support
  arm, arm1136: all arm1136 boards converted to new relocation
  arm, bdinfo: print some more infos.
  arm, relocation: add documentation
  i2c: fix command usage help
  i2c, omap24xx: only set bus_initialized, if uboot is relocated
  part: fix relocation fixup
  cortex8, beagle: add relocation support
  arm, cortexa8: all arm_cortexa8 based boards converted
  arm926, tx25: add relocation
  nand_fsl_nfc: get rid of local var
  arm, 926: convert all arm926ejs based board
  arm926: flush cache for arm926
  arm cp15: setup mmu and enable dcache
  beagle, cache: activate cache command
  arm1136, dcache: enable cache command for qong board
arch/arm/config.mk | 3 + arch/arm/cpu/arm1136/start.S | 181 +++++++++---- arch/arm/cpu/arm1136/u-boot.lds | 14 +- arch/arm/cpu/arm926ejs/orion5x/dram.c | 12 +- arch/arm/cpu/arm926ejs/start.S | 165 ++++++++---- arch/arm/cpu/arm926ejs/u-boot.lds | 14 +- arch/arm/cpu/armv7/mx51/u-boot.lds | 14 +- arch/arm/cpu/armv7/omap3/cache.S | 82 ++++++ arch/arm/cpu/armv7/omap3/emif4.c | 14 +- arch/arm/cpu/armv7/omap3/sdrc.c | 14 +- arch/arm/cpu/armv7/start.S | 174 +++++++++---- arch/arm/cpu/armv7/u-boot.lds | 14 +- arch/arm/include/asm/config.h | 3 - arch/arm/include/asm/global_data.h | 9 + arch/arm/include/asm/u-boot-arm.h | 8 +- arch/arm/include/asm/u-boot.h | 1 - arch/arm/lib/board.c | 326 +++++++++++++++++++---- arch/arm/lib/cache-cp15.c | 55 ++++ arch/arm/lib/cache.c | 13 +- arch/arm/lib/interrupts.c | 15 +- arch/powerpc/lib/board.c | 10 - board/Marvell/guruplug/guruplug.c | 15 +- board/Marvell/mv88f6281gtw_ge/mv88f6281gtw_ge.c | 12 +- board/Marvell/openrd_base/openrd_base.c | 11 +- board/Marvell/rd6281a/rd6281a.c | 15 +- board/Marvell/sheevaplug/sheevaplug.c | 15 +- board/apollon/apollon.c | 13 +- board/armltd/integrator/integrator.c | 14 +- board/armltd/integrator/lowlevel_init.S | 2 +- board/armltd/versatile/versatile.c | 10 + board/atmel/at91cap9adk/at91cap9adk.c | 11 +- board/atmel/at91sam9260ek/at91sam9260ek.c | 11 +- board/atmel/at91sam9261ek/at91sam9261ek.c | 11 +- board/atmel/at91sam9263ek/at91sam9263ek.c | 11 +- board/atmel/at91sam9m10g45ek/at91sam9m10g45ek.c | 11 +- board/atmel/at91sam9rlek/at91sam9rlek.c | 11 +- board/calao/sbc35_a9g20/sbc35_a9g20.c | 15 +- board/calao/tny_a9260/tny_a9260.c | 14 +- board/davedenx/qong/config.mk | 4 +- board/davedenx/qong/qong.c | 92 ++++--- board/davinci/common/misc.c | 12 +- board/esd/meesc/meesc.c | 13 +- board/esd/otc570/otc570.c | 13 +- board/eukrea/cpu9260/cpu9260.c | 15 +- board/freescale/mx31ads/config.mk | 2 +- board/freescale/mx31ads/mx31ads.c | 10 +- board/freescale/mx31ads/u-boot.lds | 14 +- 
board/freescale/mx31pdk/mx31pdk.c | 11 +- board/freescale/mx51evk/mx51evk.c | 14 +- board/imx31_phycore/config.mk | 2 +- board/imx31_phycore/imx31_phycore.c | 10 +- board/karo/tx25/config.mk | 4 +- board/karo/tx25/tx25.c | 11 +- board/keymile/km_arm/km_arm.c | 13 +- board/logicpd/imx27lite/config.mk | 2 +- board/logicpd/imx27lite/imx27lite.c | 15 +- board/logicpd/imx31_litekit/config.mk | 2 +- board/logicpd/imx31_litekit/imx31_litekit.c | 11 +- board/ronetix/pm9261/pm9261.c | 13 +- board/ronetix/pm9263/pm9263.c | 13 +- board/ronetix/pm9g45/pm9g45.c | 13 +- board/samsung/goni/goni.c | 11 +- board/samsung/smdkc100/smdkc100.c | 9 +- board/st/nhk8815/nhk8815.c | 11 +- board/ti/beagle/config.mk | 2 +- board/ti/omap1510inn/config.mk | 3 +- board/ti/omap1510inn/omap1510innovator.c | 13 +- board/ti/omap1610inn/config.mk | 3 +- board/ti/omap1610inn/omap1610innovator.c | 12 +- board/ti/omap2420h4/config.mk | 6 +- board/ti/omap2420h4/omap2420h4.c | 18 ++- board/ti/omap5912osk/omap5912osk.c | 12 +- board/ti/omap730p2/omap730p2.c | 12 +- common/cmd_bdinfo.c | 10 +- common/cmd_bmp.c | 6 + common/cmd_i2c.c | 11 + common/command.c | 33 +++ disk/part.c | 11 +- doc/README.arm-relocation | 315 ++++++++++++++++++++++ drivers/i2c/omap24xx_i2c.c | 4 +- include/command.h | 3 + include/common.h | 9 + include/configs/am3517_evm.h | 5 + include/configs/apollon.h | 7 + include/configs/at91cap9adk.h | 4 + include/configs/at91sam9260ek.h | 4 + include/configs/at91sam9261ek.h | 4 + include/configs/at91sam9263ek.h | 4 + include/configs/at91sam9m10g45ek.h | 4 + include/configs/at91sam9rlek.h | 4 + include/configs/cpu9260.h | 4 + include/configs/da830evm.h | 5 + include/configs/da850evm.h | 4 + include/configs/davinci_dm355evm.h | 4 + include/configs/davinci_dm355leopard.h | 4 + include/configs/davinci_dm365evm.h | 4 + include/configs/davinci_dm6467evm.h | 4 + include/configs/davinci_dvevm.h | 6 + include/configs/davinci_schmoogie.h | 5 + include/configs/davinci_sffsdr.h | 5 + 
include/configs/davinci_sonata.h | 6 + include/configs/devkit8000.h | 5 + include/configs/edminiv2.h | 4 + include/configs/guruplug.h | 4 + include/configs/imx27lite-common.h | 4 + include/configs/imx31_litekit.h | 7 + include/configs/imx31_phycore.h | 9 + include/configs/integratorcp.h | 6 + include/configs/km_arm.h | 4 + include/configs/meesc.h | 4 + include/configs/mv88f6281gtw_ge.h | 4 + include/configs/mx31ads.h | 7 + include/configs/mx31pdk.h | 7 + include/configs/mx51evk.h | 5 + include/configs/nhk8815.h | 4 + include/configs/ns9750dev.h | 4 + include/configs/omap1610h2.h | 4 + include/configs/omap1610inn.h | 4 + include/configs/omap2420h4.h | 7 + include/configs/omap3_beagle.h | 5 + include/configs/omap3_evm.h | 5 + include/configs/omap3_overo.h | 5 + include/configs/omap3_pandora.h | 5 + include/configs/omap3_sdp3430.h | 5 + include/configs/omap3_zoom1.h | 5 + include/configs/omap3_zoom2.h | 5 + include/configs/omap5912osk.h | 4 + include/configs/omap730p2.h | 4 + include/configs/openrd_base.h | 4 + include/configs/otc570.h | 4 + include/configs/pm9261.h | 4 + include/configs/pm9263.h | 4 + include/configs/pm9g45.h | 4 + include/configs/qong.h | 10 + include/configs/rd6281a.h | 4 + include/configs/s5p_goni.h | 4 + include/configs/sbc35_a9g20.h | 4 + include/configs/sheevaplug.h | 4 + include/configs/smdkc100.h | 4 + include/configs/tny_a9260.h | 4 + include/configs/tx25.h | 12 +- include/configs/versatile.h | 4 + nand_spl/board/freescale/mx31pdk/u-boot.lds | 14 +- nand_spl/board/karo/tx25/u-boot.lds | 14 +- nand_spl/nand_boot.c | 5 + nand_spl/nand_boot_fsl_nfc.c | 19 +- onenand_ipl/board/apollon/apollon.c | 23 ++ onenand_ipl/board/apollon/u-boot.onenand.lds | 15 +- 148 files changed, 2143 insertions(+), 406 deletions(-) create mode 100644 doc/README.arm-relocation

Hi Heiko,
[...]
> Also I think, it would be good to create an "arm-relocation.git" where patches (for example board maintainers bugfixes) for this change are collected.
Hopefully this will not need a whole repo, but only a branch. So I can easily pull in a branch into u-boot-testing as this was the original intention of the repository. Just let me know what to pull.
Cheers Detlev

Hello Detlev,
Detlev Zundel wrote:
>> Also I think, it would be good to create an "arm-relocation.git" where patches (for example board maintainers bugfixes) for this change are collected.
> Hopefully this will not need a whole repo, but only a branch. So I can easily pull in a branch into u-boot-testing as this was the original intention of the repository. Just let me know what to pull.
I am fine with this too, Thanks!
bye Heiko

Hello Heiko
On 29.07.2010 12:44, Heiko Schocher wrote:
> This patch series adds full relocation and cache support for ARM-based boards. I did this for arm1136, arm_cortexa8 and arm926ejs based boards. As this change is not compatible with the old code, *all* platforms and boards have to be converted before this can go to mainline! As I don't have access to all platforms/boards, I need help here! Also, I couldn't test all boards, so please test, report, and send bugfixes!
I just tested your patch set on my version of u-boot for MB86R01 from Fujitsu (arm926ejs based SoC). This is currently not available in mainline u-boot but current patches are available here
http://lists.denx.de/pipermail/u-boot/2010-August/074688.html
The point is that the board doesn't boot after applying your patches and doing the changes to my board which are given at the end of this mail.
The board runs through my low level init (so DDR RAM is up) and later on crashes in the first call to memset. I could not further debug this as I have to admit that I am not an expert with GDB + BDI2000 debugging. Maybe you can give me some hints what I am missing.
Thanks
Matthias
Changes made to the board code after applying your patches:
diff --git a/include/configs/jadecpu.h b/include/configs/jadecpu.h
index bfc60a6..24aa23d 100644
--- a/include/configs/jadecpu.h
+++ b/include/configs/jadecpu.h
@@ -149,6 +149,10 @@
 #define PHYS_SDRAM		0x40000000	/* Start address of DDRRAM */
 #define PHYS_SDRAM_SIZE	0x08000000	/* 128 megs */
 
+/* additions for new relocation code, must added to all boards */
+#define CONFIG_SYS_SDRAM_BASE		PHYS_SDRAM
+#define CONFIG_SYS_INIT_SP_ADDR	0x01000000
+
 /*
  * FLASH and environment organization
  */
diff --git a/board/syteco/jadecpu/jadecpu.c b/board/syteco/jadecpu/jadecpu.c
index 04d2f9d..bf96bcd 100644
--- a/board/syteco/jadecpu/jadecpu.c
+++ b/board/syteco/jadecpu/jadecpu.c
@@ -154,12 +154,18 @@ int misc_init_r(void)
  */
 int dram_init(void)
 {
-	gd->bd->bi_dram[0].start = PHYS_SDRAM;
-	gd->bd->bi_dram[0].size = PHYS_SDRAM_SIZE;
-
+	/* dram_init must store complete ramsize in gd->ram_size */
+	gd->ram_size = get_ram_size((volatile void *)PHYS_SDRAM,
+			PHYS_SDRAM_SIZE);
 	return 0;
 }
 
+void dram_init_banksize (void)
+{
+	gd->bd->bi_dram[0].start = PHYS_SDRAM;
+	gd->bd->bi_dram[0].size = gd->ram_size;
+}
+
 int board_eth_init(bd_t *bis)
 {
 	int rc = 0;

Hello Matthias,
Matthias Weißer wrote:
> On 29.07.2010 12:44, Heiko Schocher wrote:
>> This patch series adds full relocation and cache support for ARM-based boards. I did this for arm1136, arm_cortexa8 and arm926ejs based boards. As this change is not compatible with the old code, *all* platforms and boards have to be converted before this can go to mainline! As I don't have access to all platforms/boards, I need help here! Also, I couldn't test all boards, so please test, report, and send bugfixes!
> I just tested your patch set on my version of u-boot for MB86R01 from Fujitsu (arm926ejs based SoC). This is currently not available in mainline u-boot but current patches are available here
Thanks for testing!
> http://lists.denx.de/pipermail/u-boot/2010-August/074688.html
> The point is that the board doesn't boot after applying your patches and doing the changes to my board which are given at the end of this mail.
:-(
> The board runs through my low level init (so DDR RAM is up) and later on crashes in the first call to memset. I could not further debug this as I
Where is this memset()? The first after low level init is in: arch/arm/lib/board.c board_init_f(), do you mean this?
If so, then something must be wrong with your memory setup.
> have to admit that I am not an expert with GDB + BDI2000 debugging. Maybe you can give me some hints what I am missing.
Hmm... hard to say without debugging it. If by "crashes in the first memset" you don't mean the function I described above, maybe your RAM is not detected correctly? Can you try to find out what value dram_init() stores in gd->ram_size? (Or set it to a fixed value for testing?)
Hmmm... where does your board boot from? I tried it on the tx25 board, which boots from NAND. Do you boot from NOR flash? If so, you *must* change TEXT_BASE in config.mk in your board directory (see doc/README.arm-relocation, line 45) to the address where U-Boot starts in flash!
Ah, yep, this seems to be the reason why it doesn't work for you:
Found in the patch series you pointed to: http://lists.denx.de/pipermail/u-boot/2010-August/074688.html
board/syteco/jadecpu/config.mk
[...]
+TEXT_BASE = 0x46000000
change this to
(as include/configs/jadecpu.h defines the following:
+/*
+ * FLASH and environment organization
+ */
+#define CONFIG_SYS_FLASH_BASE		0x10000000
+#define CONFIG_SYS_MAX_FLASH_BANKS	1
+#define CONFIG_SYS_MAX_FLASH_SECT	256
+#define CONFIG_SYS_MONITOR_BASE		CONFIG_SYS_FLASH_BASE
)
+TEXT_BASE = 0x10000000
and try it again.
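One way to see why the link address matters: the offset to the final place in RAM is computed relative to TEXT_BASE, and link-time addresses are then adjusted by that offset. The sketch below is a host-runnable illustration under that assumption, not the code from this series; the names `reloc_offset` and `fixup_table` and the destination address in the usage note are hypothetical. Only the two TEXT_BASE values come from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset between the link address (TEXT_BASE) and the final RAM location. */
static uint32_t reloc_offset(uint32_t text_base, uint32_t dest)
{
    assert(dest != text_base); /* nothing to relocate otherwise */
    return dest - text_base;   /* unsigned wrap-around is fine here */
}

/* Adjust a table of link-time addresses in place by the relocation offset. */
static void fixup_table(uint32_t *table, size_t n, uint32_t off)
{
    for (size_t i = 0; i < n; i++)
        table[i] += off;
}
```

With TEXT_BASE = 0x10000000 and a hypothetical destination of 0x47F00000, an entry linked at 0x10001234 ends up at 0x47F01234. With TEXT_BASE = 0x46000000 the same table would hold addresses that do not match where the image actually sits in NOR flash.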
> Changes made to the board code after applying your patches:
>
> diff --git a/include/configs/jadecpu.h b/include/configs/jadecpu.h
> index bfc60a6..24aa23d 100644
> --- a/include/configs/jadecpu.h
> +++ b/include/configs/jadecpu.h
> @@ -149,6 +149,10 @@
>  #define PHYS_SDRAM		0x40000000	/* Start address of DDRRAM */
>  #define PHYS_SDRAM_SIZE	0x08000000	/* 128 megs */
>
> +/* additions for new relocation code, must added to all boards */
> +#define CONFIG_SYS_SDRAM_BASE		PHYS_SDRAM
> +#define CONFIG_SYS_INIT_SP_ADDR	0x01000000
> +
>  /*
>   * FLASH and environment organization
>   */
> diff --git a/board/syteco/jadecpu/jadecpu.c b/board/syteco/jadecpu/jadecpu.c
> index 04d2f9d..bf96bcd 100644
> --- a/board/syteco/jadecpu/jadecpu.c
> +++ b/board/syteco/jadecpu/jadecpu.c
> @@ -154,12 +154,18 @@ int misc_init_r(void)
>   */
>  int dram_init(void)
>  {
> -	gd->bd->bi_dram[0].start = PHYS_SDRAM;
> -	gd->bd->bi_dram[0].size = PHYS_SDRAM_SIZE;
> -
> +	/* dram_init must store complete ramsize in gd->ram_size */
> +	gd->ram_size = get_ram_size((volatile void *)PHYS_SDRAM,
> +			PHYS_SDRAM_SIZE);
>  	return 0;
>  }
>
> +void dram_init_banksize (void)
> +{
> +	gd->bd->bi_dram[0].start = PHYS_SDRAM;
> +	gd->bd->bi_dram[0].size = gd->ram_size;
> +}
looks OK to me.
bye Heiko

Hello Heiko
On 05.08.2010 08:19, Heiko Schocher wrote:
>> The board runs through my low level init (so DDR RAM is up) and later on crashes in the first call to memset. I could not further debug this as I
> Where is this memset()? The first after low level init is in: arch/arm/lib/board.c board_init_f(), do you mean this?
Yes. That was the point it crashed.
> If so, then something must be wrong with your memory setup.
It was, at more than one point :-)
> Hmmm... where does your board boot from? I tried it on the tx25 board, which boots from NAND. Do you boot from NOR flash?
Yes. My board boots from NOR.
> If so, you *must* change TEXT_BASE in config.mk in your board directory (see doc/README.arm-relocation, line 45) to the address where U-Boot starts in flash!
I think I missed that point.
> Ah, yep, this seems to be the reason why it doesn't work for you:
> Found in the patch series you pointed to: http://lists.denx.de/pipermail/u-boot/2010-August/074688.html
> board/syteco/jadecpu/config.mk
> [...]
> +TEXT_BASE = 0x46000000
> change this to
> (as include/configs/jadecpu.h defines the following:
> +/*
> + * FLASH and environment organization
> + */
> +#define CONFIG_SYS_FLASH_BASE		0x10000000
> +#define CONFIG_SYS_MAX_FLASH_BANKS	1
> +#define CONFIG_SYS_MAX_FLASH_SECT	256
> +#define CONFIG_SYS_MONITOR_BASE		CONFIG_SYS_FLASH_BASE
> )
> +TEXT_BASE = 0x10000000
> and try it again.
I did that, but the board still failed to boot and crashed at the first call to memset. So I had a "nice" debug session single-stepping through the code with an lss file in parallel.
The reason it crashed so early was my setup of CONFIG_SYS_INIT_SP_ADDR which I set to the *beginning* of some 32k internal SRAM. And as the stack grows downwards it crashed right at the first push instruction.
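A minimal sketch of the fix described above, assuming the initial stack lives in a small internal SRAM: the initial stack pointer has to point at the *end* of the region, because the ARM stack is full-descending (a push first decrements SP, then stores). The helper name `init_sp_addr` and the `reserve` parameter (space kept free above the stack, for example for early data) are hypothetical; the base address 0x01000000 and the 32k size are the values mentioned in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute an initial stack pointer at the top of an SRAM region.
 * `reserve` bytes at the very top are left untouched by the stack.
 */
static uint32_t init_sp_addr(uint32_t sram_base, uint32_t sram_size,
                             uint32_t reserve)
{
    assert(reserve < sram_size);

    uint32_t sp = sram_base + sram_size - reserve;
    return sp & ~7u; /* keep the 8-byte stack alignment the AAPCS expects */
}
```

For a 32 KiB SRAM at 0x01000000 and nothing reserved, this gives 0x01008000 instead of 0x01000000, so the first push writes to 0x01007FFC, inside the SRAM, rather than below it.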
So, I now have a running system with your patches and greatly improved uncompressing times of my boot images. We are now able to use compressed images which was not possible due to boot time restrictions in older versions. So it would be great if this stuff could go mainline.
Thanks, Matthias

Hello Matthias,
Matthias Weißer wrote: [...]
> I did that but the board still failed to boot and crashed at the first call to memset. So I had a "nice" debug session single stepping through the code with a lss file in parallel.
> The reason it crashed so early was my setup of CONFIG_SYS_INIT_SP_ADDR which I set to the *beginning* of some 32k internal SRAM. And as the stack grows downwards it crashed right at the first push instruction.
Argh, you are right, overlooked this too :-(
> So, I now have a running system with your patches and greatly improved uncompressing times of my boot images. We are now able to use compressed
Great!
> images which was not possible due to boot time restrictions in older versions. So it would be great if this stuff could go mainline.
I hope so ;-) So, could you send a patch for this board, based on my RFC patches? I will pick it up first, and it will then go into mainline when it is time for this step ...
(And it would be nice if you could send some test results showing how much faster your boot time got ;-)
bye Heiko

Hello Heiko
>> images which was not possible due to boot time restrictions in older versions. So it would be great if this stuff could go mainline.
> I hope so ;-) So, could you send a patch for this board, based on my RFC patches? I will pick it up first, and it will then go into mainline when it is time for this step ...
As my board is not mainlined (still waiting for any review on V5) this doesn't make sense from my point of view.
> (And it would be nice if you could send some test results showing how much faster your boot time got ;-)
Here are some numbers:
Test:                          old (icache)   new (i+d cache)
image size                     191k           204k              x 1.06
copy 32MB NOR -> RAM           7.0s           6.5s              x 1.07
iminfo of 1.2MiB image         0.5s           0.1s              x 5.0
bootm of 1.2MiB gz image       5.4s           0.5s              x 10.8
bootm of 0.8MiB lzma image     17.1s          1.5s              x 11.4
bootm of 1.6MiB lzo image      3.2s           0.2s              x 16.0
All three images have the same payload; I just tested the three different compression methods.
Boot time limit into application code is 5s so u-boot image compression was not an option. But now it is.
Matthias

Dear Matthias Weißer,
In message 4C5AAB4C.9010209@arcor.de you wrote:
> As my board is not mainlined (still waiting for any review on V5) this doesn't make sense from my point of view.
I just sent a few reviews to one of the patches. As soon as you resubmit this patch, I'll pull the stuff into u-boot-arm and then into mainline.
> Here are some numbers:
>
> Test:                          old (icache)   new (i+d cache)
> image size                     191k           204k              x 1.06
> copy 32MB NOR -> RAM           7.0s           6.5s              x 1.07
> iminfo of 1.2MiB image         0.5s           0.1s              x 5.0
> bootm of 1.2MiB gz image       5.4s           0.5s              x 10.8
> bootm of 0.8MiB lzma image     17.1s          1.5s              x 11.4
> bootm of 1.6MiB lzo image      3.2s           0.2s              x 16.0
>
> The three images have all the same payload. Just tested the three different compression methods.
> Boot time limit into application code is 5s so u-boot image compression was not an option. But now it is.
Indeed - that's an impressive set of results. Thanks for the testing.
Best regards,
Wolfgang Denk

Hello,
Heiko Schocher wrote:
> Also I think, it would be good to create an "arm-relocation.git" where patches (for example board maintainers bugfixes) for this change are collected.
As Detlev suggested, we created a branch for this patchset in u-boot-testing. It can be found here:
Git: git://git.denx.de/u-boot-testing.git arm-reloc-and-cache-support
URL: http://git.denx.de/?p=u-boot/u-boot-testing.git;a=shortlog;h=refs/heads/arm-...
bye, Heiko
participants (4): Detlev Zundel, Heiko Schocher, Matthias Weißer, Wolfgang Denk