[U-Boot] [PATCH v2 0/6] reboard: Introduce generic relocation feature

This is the second patch series aiming to unify the various board.c files in each architecture into a single one. This series implements a generic relocation feature, which is the bridge between board_init_f() and board_init_r(). It then moves ARM over to use this framework, as an example.
On ARM the relocation code is duplicated for each CPU, yet it is the same. We can bring this up to the arch level. But since (I believe) ELF relocation is basically the same process for all archs, there is no reason not to bring it up to the generic level.
Each architecture which uses this framework needs to provide a function called arch_elf_relocate_entry() which processes a single relocation entry. This is a static inline function to reduce code size overhead.
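As a rough illustration of that split, the sketch below pairs a generic loop with a per-architecture hook that fixes up one entry. It is not the code from this series: the hook's parameters, the `rel_entry` struct and the `reloc_apply_all()` helper are assumptions made for the example, and only the R_ARM_RELATIVE relocation type is handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One ELF32 REL entry: same two-word layout as Elf32_Rel in the ELF spec. */
struct rel_entry {
	uint32_t r_offset;	/* link-time address of the word to patch */
	uint32_t r_info;	/* relocation type lives in the low 8 bits */
};

#define R_ARM_RELATIVE	23	/* type number from the ARM ELF ABI */

/*
 * Hypothetical per-architecture hook: apply a single relocation entry to
 * the copied image. Only R_ARM_RELATIVE is handled in this sketch.
 * Returns 0 on success, -1 for an unsupported relocation type.
 */
static inline int arch_elf_relocate_entry(uint8_t *image, uint32_t link_base,
					  const struct rel_entry *rel,
					  uint32_t reloc_off)
{
	uint32_t word;
	uint8_t *where;

	if ((rel->r_info & 0xff) != R_ARM_RELATIVE)
		return -1;

	/* The word at r_offset holds a link-time address; shift it. */
	where = image + (rel->r_offset - link_base);
	memcpy(&word, where, sizeof(word));
	word += reloc_off;
	memcpy(where, &word, sizeof(word));
	return 0;
}

/* Generic part: walk the table and let the arch hook fix each entry. */
static int reloc_apply_all(uint8_t *image, uint32_t link_base,
			   const struct rel_entry *table, size_t count,
			   uint32_t reloc_off)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (arch_elf_relocate_entry(image, link_base, &table[i],
					    reloc_off))
			return -1;
	}
	return 0;
}
```

Keeping the hook `static inline` lets the compiler fold it into the loop, which is the code-size argument made above.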
For ARM, a new arch/arm/lib/proc.S file is created, which holds generic ARM assembler code (things that cannot be written in C and are common functions used by all ARM CPUs). This helps reduce duplication. Interrupt handling code and perhaps even some startup code can move there later.
It may be useful for other architectures with a lot of different CPUs to have a similar file.
Code size on my ARMv7 system increases by 54 bytes with generic relocation. This overhead is mostly just literal pool access and setting up to call the relocated U-Boot at the end.
On my system, execution time increases from 10.8ms to 15.6ms due to the less efficient C implementations of the copy and zero loops. If execution time is of concern, you can define CONFIG_USE_ARCH_MEMSET and CONFIG_USE_ARCH_MEMCPY to reduce it. For me this reduces relocation time to 5.4ms, i.e. twice as fast as the old system.
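For reference, enabling the two options would look roughly like this in a board configuration header. The file location is an assumption and depends on the board; only the two option names come from the text above.

```c
/*
 * Board configuration header (for example include/configs/<board>.h;
 * the exact file is an assumption and depends on the board).
 */
#define CONFIG_USE_ARCH_MEMCPY	/* use the architecture's optimised memcpy() */
#define CONFIG_USE_ARCH_MEMSET	/* use the architecture's optimised memset() */
```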
One problem remains which causes mx31pdk to fail to build. It doesn't have string.c in its SPL code and the architecture-specific versions of memset()/memcpy() are too large. I propose to add a local change to reloc.c that uses inline code for boards that use the old legacy SPL framework. We can remove it later. This is not included in v2 but I am interested in comments on this approach. An alternative would be just to add simple memset()/memcpy() functions just for this board (and one other affected MX31 board).
Changes in v2:
- Use CONFIG_SYS_SKIP_RELOC instead of CONFIG_SYS_LEGACY_BOARD
- Import asm-generic/sections.h from Linux and add U-Boot extras
- Squash generic link symbols patch into generic relocation patch
- Move reloc.c into common/
- Add function comments
- Use memset, memcpy instead of inline code
- Add README file for relocation
- Invalidate I-cache when we jump to relocated code
- Use an inline relocation function to reduce code size
- Make relocation symbols global so we can use them outside start.S

Simon Glass (6):
  reboard: Create reloc.h and include it where needed
  reboard: define CONFIG_SYS_SKIP_RELOC for all archs
  reboard: Add generic relocation feature
  reboard: arm: Add processor function library
  reboard: arm: Move over to generic relocation
  reboard: arm: Remove unused code in start.S
 README | 4 +
 arch/arm/cpu/arm1136/start.S | 133 ++------------
 arch/arm/cpu/arm1176/start.S | 214 ++-------------------
 arch/arm/cpu/arm720t/start.S | 127 ++-----------
 arch/arm/cpu/arm920t/start.S | 135 ++------------
 arch/arm/cpu/arm925t/start.S | 135 ++------------
 arch/arm/cpu/arm926ejs/davinci/spl.c | 1 +
 arch/arm/cpu/arm926ejs/start.S | 144 ++-------------
 arch/arm/cpu/arm946es/start.S | 130 ++-----------
 arch/arm/cpu/arm_intcm/start.S | 135 ++------------
 arch/arm/cpu/armv7/omap-common/spl.c | 1 +
 arch/arm/cpu/armv7/start.S | 141 ++------------
 arch/arm/cpu/ixp/start.S | 127 ++-----------
 arch/arm/cpu/lh7a40x/start.S | 124 ++-----------
 arch/arm/cpu/pxa/start.S | 138 ++------------
 arch/arm/cpu/s3c44b0/start.S | 127 ++-----------
 arch/arm/cpu/sa1100/start.S | 124 ++-----------
 arch/arm/include/asm/reloc.h | 56 ++++++
 arch/arm/lib/Makefile | 2 +
 arch/arm/lib/board.c | 1 +
 arch/arm/lib/proc.S | 40 ++++
 arch/avr32/config.mk | 3 +
 arch/avr32/lib/board.c | 1 +
 arch/blackfin/config.mk | 3 +
 arch/m68k/config.mk | 3 +
 arch/m68k/lib/board.c | 1 +
 arch/microblaze/config.mk | 3 +
 arch/mips/config.mk | 3 +
 arch/mips/lib/board.c | 1 +
 arch/nds32/config.mk | 3 +
 arch/nds32/lib/board.c | 1 +
 arch/nios2/config.mk | 3 +
 arch/powerpc/config.mk | 3 +
 arch/powerpc/lib/board.c | 1 +
 arch/sandbox/config.mk | 3 +
 arch/sh/config.mk | 3 +
 arch/sparc/config.mk | 3 +
 arch/x86/config.mk | 3 +
 arch/x86/lib/board.c | 1 +
 board/freescale/mpc8313erdb/mpc8313erdb.c | 1 +
 board/freescale/mpc8315erdb/mpc8315erdb.c | 1 +
 board/samsung/smdk6400/smdk6400_nand_spl.c | 1 +
 board/sheldon/simpc8313/simpc8313.c | 1 +
 common/Makefile | 4 +
 common/reloc.c | 121 ++++++++++++
 doc/README.relocation | 87 +++++++++
 include/asm-generic/sections.h | 92 +++++++++
 include/common.h | 2 +-
 include/reloc.h | 54 +++++
 nand_spl/board/freescale/mpc8536ds/nand_boot.c | 1 +
 nand_spl/board/freescale/mpc8569mds/nand_boot.c | 1 +
 nand_spl/board/freescale/mpc8572ds/nand_boot.c | 1 +
 nand_spl/board/freescale/mx31pdk/Makefile | 8 +-
 nand_spl/board/freescale/mx31pdk/u-boot.lds | 1 +
 nand_spl/board/freescale/p1010rdb/nand_boot.c | 1 +
 nand_spl/board/freescale/p1023rds/nand_boot.c | 1 +
 nand_spl/board/freescale/p1_p2_rdb/nand_boot.c | 1 +
 nand_spl/board/freescale/p1_p2_rdb_pc/nand_boot.c | 1 +
 nand_spl/board/karo/tx25/Makefile | 8 +-
 nand_spl/board/karo/tx25/u-boot.lds | 1 +
 nand_spl/nand_boot_fsl_nfc.c | 1 +
 61 files changed, 705 insertions(+), 1766 deletions(-)
 create mode 100644 arch/arm/include/asm/reloc.h
 create mode 100644 arch/arm/lib/proc.S
 create mode 100644 common/reloc.c
 create mode 100644 doc/README.relocation
 create mode 100644 include/asm-generic/sections.h
 create mode 100644 include/reloc.h

Before adding new relocation functions, move this prototype out of common.h where things are pretty crowded.
Signed-off-by: Simon Glass sjg@chromium.org
---
 arch/arm/cpu/arm926ejs/davinci/spl.c | 1 +
 arch/arm/cpu/armv7/omap-common/spl.c | 1 +
 arch/arm/lib/board.c | 1 +
 arch/avr32/lib/board.c | 1 +
 arch/m68k/lib/board.c | 1 +
 arch/mips/lib/board.c | 1 +
 arch/nds32/lib/board.c | 1 +
 arch/powerpc/lib/board.c | 1 +
 arch/x86/lib/board.c | 1 +
 board/freescale/mpc8313erdb/mpc8313erdb.c | 1 +
 board/freescale/mpc8315erdb/mpc8315erdb.c | 1 +
 board/samsung/smdk6400/smdk6400_nand_spl.c | 1 +
 board/sheldon/simpc8313/simpc8313.c | 1 +
 include/common.h | 2 +-
 include/reloc.h | 39 +++++++++++++++++++++
 nand_spl/board/freescale/mpc8536ds/nand_boot.c | 1 +
 nand_spl/board/freescale/mpc8569mds/nand_boot.c | 1 +
 nand_spl/board/freescale/mpc8572ds/nand_boot.c | 1 +
 nand_spl/board/freescale/p1010rdb/nand_boot.c | 1 +
 nand_spl/board/freescale/p1023rds/nand_boot.c | 1 +
 nand_spl/board/freescale/p1_p2_rdb/nand_boot.c | 1 +
 nand_spl/board/freescale/p1_p2_rdb_pc/nand_boot.c | 1 +
 nand_spl/nand_boot_fsl_nfc.c | 1 +
 23 files changed, 61 insertions(+), 1 deletions(-)
 create mode 100644 include/reloc.h
diff --git a/arch/arm/cpu/arm926ejs/davinci/spl.c b/arch/arm/cpu/arm926ejs/davinci/spl.c
index d9b9398..aba50d1 100644
--- a/arch/arm/cpu/arm926ejs/davinci/spl.c
+++ b/arch/arm/cpu/arm926ejs/davinci/spl.c
@@ -24,6 +24,7 @@
 #include <asm/u-boot.h>
 #include <asm/utils.h>
 #include <nand.h>
+#include <reloc.h>
 #include <asm/arch/dm365_lowlevel.h>
 #include <ns16550.h>
diff --git a/arch/arm/cpu/armv7/omap-common/spl.c b/arch/arm/cpu/armv7/omap-common/spl.c
index 9c35a09..06039fe 100644
--- a/arch/arm/cpu/armv7/omap-common/spl.c
+++ b/arch/arm/cpu/armv7/omap-common/spl.c
@@ -35,6 +35,7 @@
 #include <i2c.h>
 #include <image.h>
 #include <malloc.h>
+#include <reloc.h>

 DECLARE_GLOBAL_DATA_PTR;
diff --git a/arch/arm/lib/board.c b/arch/arm/lib/board.c
index 3d78274..bf1bf79 100644
--- a/arch/arm/lib/board.c
+++ b/arch/arm/lib/board.c
@@ -41,6 +41,7 @@
 #include <common.h>
 #include <command.h>
 #include <malloc.h>
+#include <reloc.h>
 #include <stdio_dev.h>
 #include <version.h>
 #include <net.h>
diff --git a/arch/avr32/lib/board.c b/arch/avr32/lib/board.c
index 63fe297..a7eaf76 100644
--- a/arch/avr32/lib/board.c
+++ b/arch/avr32/lib/board.c
@@ -22,6 +22,7 @@
 #include <common.h>
 #include <command.h>
 #include <malloc.h>
+#include <reloc.h>
 #include <stdio_dev.h>
 #include <version.h>
 #include <net.h>
diff --git a/arch/m68k/lib/board.c b/arch/m68k/lib/board.c
index 259b71c..85495cc 100644
--- a/arch/m68k/lib/board.c
+++ b/arch/m68k/lib/board.c
@@ -28,6 +28,7 @@
 #include <watchdog.h>
 #include <command.h>
 #include <malloc.h>
+#include <reloc.h>
 #include <stdio_dev.h>

 #include <asm/immap.h>
diff --git a/arch/mips/lib/board.c b/arch/mips/lib/board.c
index d998f0e..e5bdcfc 100644
--- a/arch/mips/lib/board.c
+++ b/arch/mips/lib/board.c
@@ -24,6 +24,7 @@
 #include <common.h>
 #include <command.h>
 #include <malloc.h>
+#include <reloc.h>
 #include <stdio_dev.h>
 #include <version.h>
 #include <net.h>
diff --git a/arch/nds32/lib/board.c b/arch/nds32/lib/board.c
index 66e4537..9295f46 100644
--- a/arch/nds32/lib/board.c
+++ b/arch/nds32/lib/board.c
@@ -28,6 +28,7 @@
 #include <common.h>
 #include <command.h>
 #include <malloc.h>
+#include <reloc.h>
 #include <stdio_dev.h>
 #include <timestamp.h>
 #include <version.h>
diff --git a/arch/powerpc/lib/board.c b/arch/powerpc/lib/board.c
index ff5888e..248d452 100644
--- a/arch/powerpc/lib/board.c
+++ b/arch/powerpc/lib/board.c
@@ -25,6 +25,7 @@
 #include <watchdog.h>
 #include <command.h>
 #include <malloc.h>
+#include <reloc.h>
 #include <stdio_dev.h>
 #ifdef CONFIG_8xx
 #include <mpc8xx.h>
diff --git a/arch/x86/lib/board.c b/arch/x86/lib/board.c
index d742fec..3d00f20 100644
--- a/arch/x86/lib/board.c
+++ b/arch/x86/lib/board.c
@@ -35,6 +35,7 @@
 #include <watchdog.h>
 #include <command.h>
 #include <stdio_dev.h>
+#include <reloc.h>
 #include <version.h>
 #include <malloc.h>
 #include <net.h>
diff --git a/board/freescale/mpc8313erdb/mpc8313erdb.c b/board/freescale/mpc8313erdb/mpc8313erdb.c
index 08f873d..89a4832 100644
--- a/board/freescale/mpc8313erdb/mpc8313erdb.c
+++ b/board/freescale/mpc8313erdb/mpc8313erdb.c
@@ -27,6 +27,7 @@
 #include <libfdt.h>
 #endif
 #include <pci.h>
+#include <reloc.h>
 #include <mpc83xx.h>
 #include <vsc7385.h>
 #include <ns16550.h>
diff --git a/board/freescale/mpc8315erdb/mpc8315erdb.c b/board/freescale/mpc8315erdb/mpc8315erdb.c
index 5dc558a..848847b 100644
--- a/board/freescale/mpc8315erdb/mpc8315erdb.c
+++ b/board/freescale/mpc8315erdb/mpc8315erdb.c
@@ -34,6 +34,7 @@
 #include <asm/io.h>
 #include <ns16550.h>
 #include <nand.h>
+#include <reloc.h>

 DECLARE_GLOBAL_DATA_PTR;
diff --git a/board/samsung/smdk6400/smdk6400_nand_spl.c b/board/samsung/smdk6400/smdk6400_nand_spl.c
index a023284..23bbad3 100644
--- a/board/samsung/smdk6400/smdk6400_nand_spl.c
+++ b/board/samsung/smdk6400/smdk6400_nand_spl.c
@@ -29,6 +29,7 @@
  */

 #include <common.h>
+#include <reloc.h>

 void board_init_f(unsigned long bootflag)
 {
diff --git a/board/sheldon/simpc8313/simpc8313.c b/board/sheldon/simpc8313/simpc8313.c
index 9126c42..09d754b 100644
--- a/board/sheldon/simpc8313/simpc8313.c
+++ b/board/sheldon/simpc8313/simpc8313.c
@@ -29,6 +29,7 @@
 #include <mpc83xx.h>
 #include <ns16550.h>
 #include <nand.h>
+#include <reloc.h>
 #include <asm/io.h>

 DECLARE_GLOBAL_DATA_PTR;
diff --git a/include/common.h b/include/common.h
index 05a658c..57bfa4c 100644
--- a/include/common.h
+++ b/include/common.h
@@ -470,7 +470,7 @@ int dcache_status (void);
 void dcache_enable (void);
 void dcache_disable(void);
 void mmu_disable(void);
-void relocate_code (ulong, gd_t *, ulong) __attribute__ ((noreturn));
+#include <reloc.h>
 ulong get_endaddr (void);
 void trap_init (ulong);
 #if defined (CONFIG_4xx) || \
diff --git a/include/reloc.h b/include/reloc.h
new file mode 100644
index 0000000..3dc7b85
--- /dev/null
+++ b/include/reloc.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (c) 2011 The Chromium OS Authors.
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
+ */
+
+#ifndef __RELOC_H
+#define __RELOC_H
+
+/**
+ * Relocate U-Boot and jump to the relocated coded
+ *
+ * This copies U-Boot to a new location, zeroes the BSS, sets up a new stack
+ * and jumps to board_init_r() in the relocated code using the
+ * proc_call_board_init_r() function. It does not return.
+ *
+ * @param dest_sp New stack pointer to use
+ * @param new_gd Pointer to the relocated global data
+ * @param dest_addr Base code address of relocated U-Boot
+ */
+void relocate_code(ulong dest_sp, gd_t *new_gd, ulong dest_addr)
+ __attribute__ ((noreturn));
+#endif
diff --git a/nand_spl/board/freescale/mpc8536ds/nand_boot.c b/nand_spl/board/freescale/mpc8536ds/nand_boot.c
index 5a0a0c7..0356067 100644
--- a/nand_spl/board/freescale/mpc8536ds/nand_boot.c
+++ b/nand_spl/board/freescale/mpc8536ds/nand_boot.c
@@ -23,6 +23,7 @@
 #include <ns16550.h>
 #include <asm/io.h>
 #include <nand.h>
+#include <reloc.h>

 u32 sysclk_tbl[] = {
 33333000, 39999600, 49999500, 66666000,
diff --git a/nand_spl/board/freescale/mpc8569mds/nand_boot.c b/nand_spl/board/freescale/mpc8569mds/nand_boot.c
index 047da34..df391dc 100644
--- a/nand_spl/board/freescale/mpc8569mds/nand_boot.c
+++ b/nand_spl/board/freescale/mpc8569mds/nand_boot.c
@@ -23,6 +23,7 @@
 #include <asm/io.h>
 #include <ns16550.h>
 #include <nand.h>
+#include <reloc.h>
 #include <asm/mmu.h>
 #include <asm/immap_85xx.h>
 #include <asm/fsl_ddr_sdram.h>
diff --git a/nand_spl/board/freescale/mpc8572ds/nand_boot.c b/nand_spl/board/freescale/mpc8572ds/nand_boot.c
index 7ca4d4d..d0a059c 100644
--- a/nand_spl/board/freescale/mpc8572ds/nand_boot.c
+++ b/nand_spl/board/freescale/mpc8572ds/nand_boot.c
@@ -23,6 +23,7 @@
 #include <ns16550.h>
 #include <asm/io.h>
 #include <nand.h>
+#include <reloc.h>

 u32 sysclk_tbl[] = {
 33333000, 39999600, 49999500, 66666000,
diff --git a/nand_spl/board/freescale/p1010rdb/nand_boot.c b/nand_spl/board/freescale/p1010rdb/nand_boot.c
index 16eeb61..aba3cef 100644
--- a/nand_spl/board/freescale/p1010rdb/nand_boot.c
+++ b/nand_spl/board/freescale/p1010rdb/nand_boot.c
@@ -23,6 +23,7 @@
 #include <asm/io.h>
 #include <ns16550.h>
 #include <nand.h>
+#include <reloc.h>
 #include <asm/mmu.h>
 #include <asm/immap_85xx.h>
 #include <asm/fsl_ddr_sdram.h>
diff --git a/nand_spl/board/freescale/p1023rds/nand_boot.c b/nand_spl/board/freescale/p1023rds/nand_boot.c
index 0065c87..0c1a6b0 100644
--- a/nand_spl/board/freescale/p1023rds/nand_boot.c
+++ b/nand_spl/board/freescale/p1023rds/nand_boot.c
@@ -24,6 +24,7 @@
 #include <ns16550.h>
 #include <asm/io.h>
 #include <nand.h>
+#include <reloc.h>
 #include <asm/fsl_law.h>

 /* Fixed sdram init -- doesn't use serial presence detect. */
diff --git a/nand_spl/board/freescale/p1_p2_rdb/nand_boot.c b/nand_spl/board/freescale/p1_p2_rdb/nand_boot.c
index 16a756c..a482469 100644
--- a/nand_spl/board/freescale/p1_p2_rdb/nand_boot.c
+++ b/nand_spl/board/freescale/p1_p2_rdb/nand_boot.c
@@ -23,6 +23,7 @@
 #include <asm/io.h>
 #include <ns16550.h>
 #include <nand.h>
+#include <reloc.h>
 #include <asm/mmu.h>
 #include <asm/immap_85xx.h>
 #include <asm/fsl_ddr_sdram.h>
diff --git a/nand_spl/board/freescale/p1_p2_rdb_pc/nand_boot.c b/nand_spl/board/freescale/p1_p2_rdb_pc/nand_boot.c
index b9796ea..b7a1511 100644
--- a/nand_spl/board/freescale/p1_p2_rdb_pc/nand_boot.c
+++ b/nand_spl/board/freescale/p1_p2_rdb_pc/nand_boot.c
@@ -23,6 +23,7 @@
 #include <ns16550.h>
 #include <asm/io.h>
 #include <nand.h>
+#include <reloc.h>
 #include <asm/fsl_law.h>
 #include <asm/fsl_ddr_sdram.h>

diff --git a/nand_spl/nand_boot_fsl_nfc.c b/nand_spl/nand_boot_fsl_nfc.c
index d6b0d9b..66dae24 100644
--- a/nand_spl/nand_boot_fsl_nfc.c
+++ b/nand_spl/nand_boot_fsl_nfc.c
@@ -29,6 +29,7 @@
 #include <asm/arch/imx-regs.h>
 #include <asm/io.h>
 #include <fsl_nfc.h>
+#include <reloc.h>

 static struct fsl_nfc_regs *const nfc = (void *)NFC_BASE_ADDR;

On 10/12/2011 20:16, Simon Glass wrote:
Before adding new relocation functions, move this prototype out of common.h where things are pretty crowded.
Signed-off-by: Simon Glass sjg@chromium.org
Since this patch set indicates that ARM, and only ARM, is moved over to relocation, I would prefer it that no other arch file be modified in this patch, and that the move of ARM to the new relocation mechanism be done in an atomic commit, so that other architectures can refer to a single commit in order to do their own move.
Best regards,

Hi Albert,
On Sun, Dec 11, 2011 at 6:52 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
On 10/12/2011 20:16, Simon Glass wrote:
Before adding new relocation functions, move this prototype out of common.h where things are pretty crowded.
Signed-off-by: Simon Glass sjg@chromium.org
Since this patch set indicates that ARM, and only ARM, is moved over to relocation, I would prefer it that no other arch file be modified in this patch, and that the move of ARM to the new relocation mechanism be done in an atomic commit, so that other architectures can refer to a single commit in order to do their own move.
What specifically are you asking for in this patch? I added this at the request of a reviewer of v1, who felt that we should be removing code from common.h instead of adding it, and that relocation is done in only a few sites so should not be in common.h. Do you think this patch should be pulled out of the series and done on its own?
Best regards,
Albert.
Regards, Simon

Hi Simon,
On 11/12/2011 22:33, Simon Glass wrote:
Hi Albert,
On Sun, Dec 11, 2011 at 6:52 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
On 10/12/2011 20:16, Simon Glass wrote:
Before adding new relocation functions, move this prototype out of common.h where things are pretty crowded.
Signed-off-by: Simon Glass sjg@chromium.org
Since this patch set indicates that ARM, and only ARM, is moved over to relocation, I would prefer it that no other arch file be modified in this patch, and that the move of ARM to the new relocation mechanism be done in an atomic commit, so that other architectures can refer to a single commit in order to do their own move.
What specifically are you asking for in this patch? I added this at the request of a reviewer of v1, who felt that we should be removing code from common.h instead of adding it, and that relocation is done in only a few sites so should not be in common.h. Do you think this patch should be pulled out of the series and done on its own?
What I am asking for, since your whole patch series only applies the 'new' relocation to ARM, is that this patch should not change anything in any arch other than ARM. Any other arch should only be touched by a later patch, whoever submits it, that specifically aims to switch that arch to the 'new' relocation.
Regards, Simon
Best regards,

Hi Simon,
On Mon, Dec 12, 2011 at 8:33 AM, Simon Glass sjg@chromium.org wrote:
Hi Albert,
On Sun, Dec 11, 2011 at 6:52 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
On 10/12/2011 20:16, Simon Glass wrote:
Before adding new relocation functions, move this prototype out of common.h where things are pretty crowded.
Signed-off-by: Simon Glass sjg@chromium.org
Since this patch set indicates that ARM, and only ARM, is moved over to relocation, I would prefer it that no other arch file be modified in this patch, and that the move of ARM to the new relocation mechanism be done in an atomic commit, so that other architectures can refer to a single commit in order to do their own move.
What specifically are you asking for in this patch? I added this at the request of a reviewer of v1, who felt that we should be removing code from common.h instead of adding it, and that relocation is done in only a few sites so should not be in common.h. Do you think this patch should be pulled out of the series and done on its own?
I think the point is that when we pull stuff out of common.h into, say, foo.h, we should include foo.h only where the functionality moved into foo.h is _currently_ used.
It looks like you've pulled the function definitions out into reloc.h and then included reloc.h everywhere (including common.h). It would be better to move the definitions to reloc.h, not include reloc.h in common.h, and include reloc.h only where it is strictly needed right now. Add #include <reloc.h> to each arch file as and when it is needed.
Regards,
Graeme

Hi Graeme,
On 11/12/2011 22:45, Graeme Russ wrote:
I think the point is that when we pull stuff out of common.h into, say, foo.h, we should include foo.h only where the functionality moved into foo.h is _currently_ used.
It looks like you've pulled the function definitions out into reloc.h and then included reloc.h everywhere (including common.h). It would be better to move the definitions to reloc.h, not include reloc.h in common.h, and include reloc.h only where it is strictly needed right now. Add #include <reloc.h> to each arch file as and when it is needed.
Thanks, Graeme. This is indeed what I meant.
Regards,
Graeme
Best regards,

Hi,
On Sun, Dec 11, 2011 at 2:29 PM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Hi Graeme,
On 11/12/2011 22:45, Graeme Russ wrote:
I think the point is that when we pull stuff out of common.h into, say, foo.h, we should include foo.h only where the functionality moved into foo.h is _currently_ used.
It looks like you've pulled the function definitions out into reloc.h and then included reloc.h everywhere (including common.h). It would be better to move the definitions to reloc.h, not include reloc.h in common.h, and include reloc.h only where it is strictly needed right now. Add #include <reloc.h> to each arch file as and when it is needed.
Thanks, Graeme. This is indeed what I meant.
OK I understand, and that was my intention with this patch. Other than the mistake of leaving the #include in common.h where it is not now needed, is this patch correct?
Regards, Simon
Regards,
Graeme
Best regards,
Albert.

We are introducing a new generic relocation feature and we want this to be the default. So we need to opt all architectures out first. Some may never have relocation, but those that do will eventually move over to this generic relocation framework.
This is part of the unified board effort, but since we are only dealing with relocation in this series, CONFIG_SYS_SKIP_RELOC is more appropriate than CONFIG_SYS_LEGACY_BOARD.
Signed-off-by: Simon Glass sjg@chromium.org
---
Changes in v2:
- Use CONFIG_SYS_SKIP_RELOC instead of CONFIG_SYS_LEGACY_BOARD

 README | 4 ++++
 arch/arm/config.mk | 3 +++
 arch/avr32/config.mk | 3 +++
 arch/blackfin/config.mk | 3 +++
 arch/m68k/config.mk | 3 +++
 arch/microblaze/config.mk | 3 +++
 arch/mips/config.mk | 3 +++
 arch/nds32/config.mk | 3 +++
 arch/nios2/config.mk | 3 +++
 arch/powerpc/config.mk | 3 +++
 arch/sandbox/config.mk | 3 +++
 arch/sh/config.mk | 3 +++
 arch/sparc/config.mk | 3 +++
 arch/x86/config.mk | 3 +++
 14 files changed, 43 insertions(+), 0 deletions(-)
diff --git a/README b/README
index e9d1891..be4bbf8 100644
--- a/README
+++ b/README
@@ -2707,6 +2707,10 @@ Configuration Settings:
 cases. This setting can be used to tune behaviour; see
 lib/hashtable.c for details.

+- CONFIG_SYS_SKIP_RELOC
+ This makes U-Boot skip relocation for those architectures which
+ don't support it. It is normally defined in arch/xxx/config.mk
+
 The following definitions that deal with the placement and management
 of environment data (variable area); in general, we support the
 following configurations:
diff --git a/arch/arm/config.mk b/arch/arm/config.mk
index 45f9dca..f47d4f7 100644
--- a/arch/arm/config.mk
+++ b/arch/arm/config.mk
@@ -81,3 +81,6 @@ endif
 ifndef CONFIG_NAND_SPL
 LDFLAGS_u-boot += -pie
 endif
+
+# We use legacy relocation for now
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/avr32/config.mk b/arch/avr32/config.mk
index d8e7ebb..1995983 100644
--- a/arch/avr32/config.mk
+++ b/arch/avr32/config.mk
@@ -31,3 +31,6 @@ PLATFORM_RELFLAGS += -ffunction-sections -fdata-sections
 LDFLAGS_u-boot = --gc-sections --relax

 LDSCRIPT = $(SRCTREE)/$(CPUDIR)/u-boot.lds
+
+# We use legacy relocation for now
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/blackfin/config.mk b/arch/blackfin/config.mk
index 3595aa2..56047c8 100644
--- a/arch/blackfin/config.mk
+++ b/arch/blackfin/config.mk
@@ -37,6 +37,9 @@ CONFIG_BFIN_BOOT_MODE := $(strip $(subst ",,$(CONFIG_BFIN_BOOT_MODE)))
 PLATFORM_RELFLAGS += -ffixed-P3 -fomit-frame-pointer -mno-fdpic
 PLATFORM_CPPFLAGS += -DCONFIG_BLACKFIN

+# Blackfin does not do relocation
+CONFIG_SYS_SKIP_RELOC := y
+
 LDFLAGS_FINAL += --gc-sections
 LDFLAGS += -m elf32bfin
 PLATFORM_RELFLAGS += -ffunction-sections -fdata-sections
diff --git a/arch/m68k/config.mk b/arch/m68k/config.mk
index 11ba334..52bfc81 100644
--- a/arch/m68k/config.mk
+++ b/arch/m68k/config.mk
@@ -29,3 +29,6 @@ PLATFORM_CPPFLAGS += -DCONFIG_M68K -D__M68K__
 PLATFORM_LDFLAGS += -n
 PLATFORM_RELFLAGS += -ffunction-sections -fdata-sections
 LDFLAGS_FINAL += --gc-sections
+
+# We use legacy relocation for now
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/microblaze/config.mk b/arch/microblaze/config.mk
index abea70b..7645f2e 100644
--- a/arch/microblaze/config.mk
+++ b/arch/microblaze/config.mk
@@ -29,3 +29,6 @@ CROSS_COMPILE ?= mb-
 CONFIG_STANDALONE_LOAD_ADDR ?= 0x80F00000

 PLATFORM_CPPFLAGS += -ffixed-r31 -D__microblaze__
+
+# Microblaze does not do relocation
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/mips/config.mk b/arch/mips/config.mk
index 6ab8acd..832b93f 100644
--- a/arch/mips/config.mk
+++ b/arch/mips/config.mk
@@ -52,3 +52,6 @@ PLATFORM_CPPFLAGS += -msoft-float
 PLATFORM_LDFLAGS += -G 0 -static -n -nostdlib
 PLATFORM_RELFLAGS += -ffunction-sections -fdata-sections
 LDFLAGS_FINAL += --gc-sections
+
+# We use legacy relocation for now
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/nds32/config.mk b/arch/nds32/config.mk
index c589829..4a4499b 100644
--- a/arch/nds32/config.mk
+++ b/arch/nds32/config.mk
@@ -33,3 +33,6 @@ PLATFORM_RELFLAGS += -gdwarf-2
 PLATFORM_CPPFLAGS += -DCONFIG_NDS32 -D__nds32__ -G0 -ffixed-10 -fpie

 LDFLAGS_u-boot = --gc-sections --relax
+
+# We use legacy relocation for now
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/nios2/config.mk b/arch/nios2/config.mk
index 7b03ed8..cde7f82 100644
--- a/arch/nios2/config.mk
+++ b/arch/nios2/config.mk
@@ -31,3 +31,6 @@ PLATFORM_CPPFLAGS += -G0

 LDFLAGS_FINAL += --gc-sections
 PLATFORM_RELFLAGS += -ffunction-sections -fdata-sections
+
+# NIOS2 does not do relocation
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/powerpc/config.mk b/arch/powerpc/config.mk
index a307154..eba562f 100644
--- a/arch/powerpc/config.mk
+++ b/arch/powerpc/config.mk
@@ -42,3 +42,6 @@ endif
 ifeq ($(CROSS_COMPILE),powerpc-openbsd-)
 PLATFORM_CPPFLAGS+= -D__PPC__
 endif
+
+# We use legacy relocation for now
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/sandbox/config.mk b/arch/sandbox/config.mk
index ab33026..8a3198e 100644
--- a/arch/sandbox/config.mk
+++ b/arch/sandbox/config.mk
@@ -18,3 +18,6 @@
 # MA 02111-1307 USA

 PLATFORM_CPPFLAGS += -DCONFIG_SANDBOX -D__SANDBOX__
+
+# Sandbox does not do relocation
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/sh/config.mk b/arch/sh/config.mk
index 07ff8b9..48a7b37 100644
--- a/arch/sh/config.mk
+++ b/arch/sh/config.mk
@@ -31,3 +31,6 @@ endif
 PLATFORM_CPPFLAGS += -DCONFIG_SH -D__SH__
 PLATFORM_LDFLAGS += -e $(CONFIG_SYS_TEXT_BASE) --defsym reloc_dst=$(CONFIG_SYS_TEXT_BASE)
 LDFLAGS_FINAL = --gc-sections
+
+# SH does not do relocation
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/sparc/config.mk b/arch/sparc/config.mk
index cae7478..032659c 100644
--- a/arch/sparc/config.mk
+++ b/arch/sparc/config.mk
@@ -26,3 +26,6 @@ CROSS_COMPILE ?= sparc-elf-
 CONFIG_STANDALONE_LOAD_ADDR ?= 0x00000000 -L $(gcclibdir) -T sparc.lds

 PLATFORM_CPPFLAGS += -DCONFIG_SPARC -D__sparc__
+
+# Sparc does not do relocation
+CONFIG_SYS_SKIP_RELOC := y
diff --git a/arch/x86/config.mk b/arch/x86/config.mk
index 23cacff..11f3d18 100644
--- a/arch/x86/config.mk
+++ b/arch/x86/config.mk
@@ -48,3 +48,6 @@ NORMAL_LIBGCC = $(shell $(CC) $(CFLAGS) -print-libgcc-file-name)
 PREFIXED_LIBGCC = $(OBJTREE)/arch/$(ARCH)/lib/$(shell basename $(NORMAL_LIBGCC))

 export USE_PRIVATE_LIBGCC=$(shell dirname $(PREFIXED_LIBGCC))
+
+# We use legacy relocation for now
+CONFIG_SYS_SKIP_RELOC := y

On 10/12/2011 20:16, Simon Glass wrote:
We are introducing a new generic relocation feature and we want this to be the default. So we need to opt all architectures out first. Some may never have relocation, but those that do will eventually move over to this generic relocation framework.
This is part of the unified board effort, but since we are only dealing with relocation in this series, CONFIG_SYS_SKIP_RELOC is more appropriate than CONFIG_SYS_LEGACY_BOARD.
I'm afraid I haven't made myself clear on CONFIG_SYS_SKIP_RELOC. I did not mean it to be an 'old vs new reloc' choice mechanism; I mean it to be the controlling option for whether relocation happens at all or not.
I want a relocation skip option, because it is useful for boards which, for any reason, know that they are already residing at the Right Address(tm).
As far as an option to switch from old to new relocation... If there is a consensus from all custodians that all arches should move to generic relocation, then I think we should avoid allowing the older mechanism to persist at all.
Best regards,

Hi Albert,
On Sun, Dec 11, 2011 at 6:57 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
On 10/12/2011 20:16, Simon Glass wrote:
We are introducing a new generic relocation feature and we want this to be the default. So we need to opt all architectures out first. Some may never have relocation, but those that do will eventually move over to this generic relocation framework.
This is part of the unified board effort, but since we are only dealing with relocation in this series, CONFIG_SYS_SKIP_RELOC is more appropriate than CONFIG_SYS_LEGACY_BOARD.
I'm afraid I haven't made myself clear on CONFIG_SYS_SKIP_RELOC. I did not mean it to be an 'old vs new reloc' choice mechanism; I mean it to be the controlling option for whether relocation happens at all or not.
I want a relocation skip option, because it is useful for boards which, for any reason, know that they are already residing at the Right Address(tm).
As far as an option to switch from old to new relocation... If there is a consensus from all custodians that all arches should move to generic relocation, then I think we should avoid allowing the older mechanism to persist at all.
OK I see. That is a different thing to this patch. So what should I call CONFIG_SYS_SKIP_RELOC? Should it be renamed to CONFIG_SYS_LEGACY_RELOC?
With what you are looking for, there was a patch on the list some time ago which disables relocation under CONFIG control. From memory it received a cool reception. IMO there is value in it, but this again is a separate patch. There are three parts to relocate_code():
- copy the code
- relocate the code
- zero the BSS
While the first two can be skipped if the code doesn't need to move, the last must always be done.
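As a rough illustration of that split, here is a minimal sketch in which only the first two parts are conditional. The struct image type and relocate_image() are hypothetical names made up for this example; they are not taken from U-Boot, and the fix-up step is only a placeholder.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical image description; field names are illustrative only */
struct image {
	char *src;		/* where the image currently lives */
	char *dst;		/* where it should run from */
	size_t copy_size;	/* bytes of code/data to copy */
	size_t bss_ofs;		/* offset of the BSS from the image start */
	size_t bss_size;	/* size of the BSS in bytes */
};

/* Returns the number of steps performed: 3 if the image moved, else 1 */
int relocate_image(const struct image *img)
{
	int steps = 0;

	if (img->dst != img->src) {
		/* 1. copy the code */
		memcpy(img->dst, img->src, img->copy_size);
		steps++;

		/* 2. relocate the code (per-entry fix-ups would go here) */
		steps++;
	}

	/* 3. zero the BSS: required even when the image did not move */
	memset(img->dst + img->bss_ofs, 0, img->bss_size);
	steps++;

	return steps;
}
```

A board that already sits at its final address would pass dst equal to src and still get a cleared BSS.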
Regards, Simon
Best regards,
Albert.
Regards, Simon

Add a relocation implementation as the first thing in the generic board library. This library is needed by SPL also.
We create a separate header file for link symbols defined by the link scripts. It is helpful to have these all in one place and try to make them common across architectures. Since Linux already has a similar file, we bring this in even though many of the symbols there are not relevant to us.
The __relocate_code() function is what we expect all architectures which support relocation will use eventually. For now, they all override this with their own version.
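The override mechanism described above relies on a weak alias. The following is a minimal sketch of that pattern, assuming a GCC-compatible compiler targeting ELF; generic_relocate_code() and RELOC_GENERIC are hypothetical names used only for illustration, and the signature is simplified compared with the one in the patch.

```c
#include <stdio.h>

/* Hypothetical tag so a caller can tell which implementation ran */
#define RELOC_GENERIC	1

/* Default implementation, standing in for the generic version */
int generic_relocate_code(unsigned long dest_addr)
{
	printf("generic relocation to %#lx\n", dest_addr);
	return RELOC_GENERIC;
}

/*
 * Weak alias: relocate_code() resolves to the generic version unless
 * another object file supplies a strong definition of relocate_code().
 */
int relocate_code(unsigned long dest_addr)
	__attribute__((weak, alias("generic_relocate_code")));
```

An architecture that defines its own non-weak relocate_code() in a separate file would take precedence at link time, which is how all architectures can keep their existing code for now.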
Signed-off-by: Simon Glass sjg@chromium.org
---
Changes in v2:
- Import asm-generic/sections.h from Linux and add U-Boot extras
- Squash generic link symbols patch into generic relocation patch
- Move reloc.c into common/
- Add function comments
- Use memset, memcpy instead of inline code
- Add README file for relocation

common/Makefile | 4 +
common/reloc.c | 121 ++++++++++++++++++++++++++++++++++++++++
doc/README.relocation | 87 ++++++++++++++++++++++++++++
include/asm-generic/sections.h | 92 ++++++++++++++++++++++++++++++
include/reloc.h | 15 +++++
5 files changed, 319 insertions(+), 0 deletions(-)
create mode 100644 common/reloc.c
create mode 100644 doc/README.relocation
create mode 100644 include/asm-generic/sections.h
diff --git a/common/Makefile b/common/Makefile index 1be7236..f64fea8 100644 --- a/common/Makefile +++ b/common/Makefile @@ -192,6 +192,10 @@ COBJS-y += dlmalloc.o COBJS-y += memsize.o COBJS-y += stdio.o
+ifndef CONFIG_SYS_SKIP_RELOC +COBJS-y += reloc.o +endif +
COBJS := $(sort $(COBJS-y)) XCOBJS := $(sort $(XCOBJS-y)) diff --git a/common/reloc.c b/common/reloc.c new file mode 100644 index 0000000..2344e98 --- /dev/null +++ b/common/reloc.c @@ -0,0 +1,121 @@ +/* + * Copyright (c) 2011 The Chromium OS Authors. + * + * See file CREDITS for list of people who contributed to this + * project. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, + * MA 02111-1307 USA + */ + +#include <common.h> +#include <asm-generic/sections.h> +#include <asm/reloc.h> +#include <reloc.h> +#include <nand.h> + +DECLARE_GLOBAL_DATA_PTR; + +static int reloc_make_copy(void) +{ + char *dst_addr = (char *)gd->relocaddr; + + /* TODO: __text_start would be better when we have it */ + char *src_addr = (char *)_start; + /* TODO: switch over to __image_copy_end when we can */ +#ifdef CONFIG_SPL_BUILD + char *end_addr = src_addr + _image_copy_end_ofs; +#else + char *end_addr = src_addr + _rel_dyn_start_ofs; +#endif + + if (dst_addr != src_addr) { + size_t size = end_addr - src_addr; + + debug("%s: copy code %p-%p to %p-%p\n", __func__, + src_addr, end_addr, dst_addr, dst_addr + size); + memcpy(dst_addr, src_addr, size); + } + return 0; +} + +static int reloc_elf(void) +{ +#ifndef CONFIG_SPL_BUILD + const Elf32_Rel *ptr, *end; + Elf32_Addr *addr; + char *src_addr = (char *)_start; + Elf32_Sym *dynsym; + ulong 
reloc_ofs = gd->reloc_off; + + /* scan the relocation table for relevant entries */ + ptr = (Elf32_Rel *)(src_addr + _rel_dyn_start_ofs); + end = (Elf32_Rel *)(src_addr + _rel_dyn_end_ofs); + dynsym = (Elf32_Sym *)(src_addr + _dynsym_start_ofs); + debug("%s: process reloc entries %p-%p, dynsym at %p\n", __func__, + ptr, end, dynsym); + for (; ptr < end; ptr++) { + addr = (Elf32_Addr *)(ptr->r_offset + reloc_ofs); + if (arch_elf_relocate_entry(addr, ptr->r_info, dynsym, + reloc_ofs)) + return -1; + } +#endif + return 0; +} + +static int reloc_clear_bss(void) +{ + char *dst_addr = (char *)_start + _bss_start_ofs; + size_t size = _bss_end_ofs - _bss_start_ofs; + +#ifndef CONFIG_SPL_BUILD + /* No relocation for SPL (TBD: better to set reloc_off to zero) */ + dst_addr += gd->reloc_off; +#endif + + /* TODO: use memset */ + debug("%s: zero bss %p-%p\n", __func__, dst_addr, dst_addr + size); + memset(dst_addr, '\0', size); + + return 0; +} + +void __relocate_code(ulong dest_addr_sp, gd_t *new_gd, ulong dest_addr) +{ + ulong new_board_init_r = (uintptr_t)board_init_r + gd->reloc_off; + + /* TODO: It might be better to put the offsets in global data */ + debug("%s, dest_addr_sp=%lx, new_gd=%p, dest_addr=%lx\n", __func__, + dest_addr_sp, new_gd, dest_addr); + reloc_make_copy(); + reloc_elf(); + reloc_clear_bss(); + + debug("relocation complete: starting from board_init_r() at %lx\n", + new_board_init_r); + /* TODO: tidy this up since we don't want a separate nand_boot() */ +#ifdef CONFIG_NAND_SPL + nand_boot(); +#else + proc_call_board_init_r(new_gd, dest_addr, + (board_init_r_func)new_board_init_r, + dest_addr_sp); +#endif +} + +/* Allow architectures to override this function - initially they all will */ +void relocate_code(ulong dest_sp, gd_t *new_gd, ulong dest_add) + __attribute__((weak, alias("__relocate_code"))); diff --git a/doc/README.relocation b/doc/README.relocation new file mode 100644 index 0000000..6dfbe9c --- /dev/null +++ b/doc/README.relocation @@ -0,0 +1,87 
@@ +/* + * Copyright (c) 2011 The Chromium OS Authors. + * + * See file CREDITS for list of people who contributed to this + * project. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, + * MA 02111-1307 USA + */ + +Generic Relocation Framework +============================ + +Since most architectures perform relocation and mostly share the same +procedure, a generic relocation framework has been created. + + +What is Relocation? +------------------- +The basic purpose of relocation is to move U-Boot from its starting +address (probably CONFIG_SYS_TEXT_BASE) to the top the RAM. This makes +it easy to use the rest of available RAM in one chunk for things like +loading a kernel or ram disk. + +The relocation code is in common/reloc.c in a function called +__relocate_code(). It is called right at the end of board_init_f() and +performs these steps: + +- Copies U-Boot to the top of RAM +- Adjusts any code/data which needs relocation for the new position +- Clears our the BSS (so that your global variables start as zero!) +- Jumps to the new U-Boot, to a function called board_init_r() + + +How do I use the framework? 
+--------------------------- +To use the generic framework, you should define a function for your +architecture in arch/xxx/include/asm/reloc.h like this: + +/** + * Process a single ELF relocation entry + * + * @param addr Pointer to address of intruction/data to relocate + * @param info The ELF information word / flags + * @param symtab The ELF relocation symbol table + * @param reloc_off Offset of relocated U-Boot relative to load address + * @return 0 if ok, -1 on error + */ +static inline int arch_elf_relocate_entry(Elf32_Addr *addr, Elf32_Word info, + Elf32_Sym *symtab, ulong reloc_off); + + +This function should relocate the code/data at the given relocated address +based on the relocation information in 'info'. The ELF symbol table and +relocation offset (new position minus CONFIG_SYS_TEXT_BASE) are provided. + + +How fast is relocation? +----------------------- +It's pretty fast, but if you want to speed up relocation, you can define +these two CONFIGs in your board file: + +#define CONFIG_USE_ARCH_MEMSET - speeds up BSS clearing +#define CONFIG_USE_ARCH_MEMCPY - speeds up copying of code/data + +Rough benchmarks on a Tegra2x ARM system showed that using both cut the total +relocation time by 65% (from 15ms to 5ms). + + +Opting Out +---------- +If you want to do relocation yourself, you can define your own +relocate_code() function. See include/reloc.h for the prototype. You +can also define CONFIG_SYS_SKIP_RELOC to disable the generic relocation +and remove its code. diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h new file mode 100644 index 0000000..2935dc1 --- /dev/null +++ b/include/asm-generic/sections.h @@ -0,0 +1,92 @@ +/* + * Copyright (c) 2011 The Chromium OS Authors. + * See file CREDITS for list of people who contributed to this + * project. 
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, + * MA 02111-1307 USA + */ + +/* Taken from Linux kernel */ + +#ifndef _ASM_GENERIC_SECTIONS_H_ +#define _ASM_GENERIC_SECTIONS_H_ + +/* References to section boundaries */ + +extern char _text[], _stext[], _etext[]; +extern char _data[], _sdata[], _edata[]; +extern char __bss_start[], __bss_stop[]; +extern char __init_begin[], __init_end[]; +extern char _sinittext[], _einittext[]; +extern char _end[]; +extern char __per_cpu_load[], __per_cpu_start[], __per_cpu_end[]; +extern char __kprobes_text_start[], __kprobes_text_end[]; +extern char __entry_text_start[], __entry_text_end[]; +extern char __initdata_begin[], __initdata_end[]; +extern char __start_rodata[], __end_rodata[]; + +/* Start and end of .ctors section - used for constructor calls. */ +extern char __ctors_start[], __ctors_end[]; + +/* function descriptor handling (if any). Override + * in asm/sections.h */ +#ifndef dereference_function_descriptor +#define dereference_function_descriptor(p) (p) +#endif + +/* random extra sections (if any). 
Override + * in asm/sections.h */ +#ifndef arch_is_kernel_text +static inline int arch_is_kernel_text(unsigned long addr) +{ + return 0; +} +#endif + +#ifndef arch_is_kernel_data +static inline int arch_is_kernel_data(unsigned long addr) +{ + return 0; +} +#endif + +#include <elf.h> + +/* U-Boot-specific things begin here */ + +/* Start of U-Boot text region */ +extern char __text_start[]; + +/* This marks the end of the text region which must be relocated */ +extern char __image_copy_end[]; + +/* + * This is the U-Boot entry point - prior to relocation it should be same + * as __text_start + */ +extern void _start(void); + +/* Start/end of the relocation entries, as an offset from _start */ +extern ulong _rel_dyn_start_ofs; +extern ulong _rel_dyn_end_ofs; + +/* Start/end of the relocation symbol table, as an offset from _start */ +extern ulong _dynsym_start_ofs; + +/* End of the region to be relocated, as an offset form _start */ +extern ulong _image_copy_end_ofs; + +#endif /* _ASM_GENERIC_SECTIONS_H_ */ diff --git a/include/reloc.h b/include/reloc.h index 3dc7b85..79c0a24 100644 --- a/include/reloc.h +++ b/include/reloc.h @@ -23,6 +23,21 @@ #ifndef __RELOC_H #define __RELOC_H
+/* This is the prototype for the post-relocation init function */ +typedef void (*board_init_r_func)(gd_t *, ulong); + +/** + * Call the relocated U-Boot. This is the last thing that is done after + * relocation. This function does not return. + * + * @param new_gd Pointer to the relocated global data + * @param dest_addr Base code address of relocated U-Boot + * @param board_init_r_func Pointer to relocated function to call + */ +void proc_call_board_init_r(gd_t *new_gd, ulong dest_addr, + board_init_r_func board_init_r, ulong dest_addr_sp) + __attribute__ ((noreturn)); + /** * Relocate U-Boot and jump to the relocated coded *

Add a library to hold ARM assembler code which is generic across all ARM CPUs. At first it just holds some basic relocation code. The plan is to move more start.S code here.
Signed-off-by: Simon Glass sjg@chromium.org
---
Changes in v2:
- Invalidate I-cache when we jump to relocated code

arch/arm/lib/Makefile | 2 ++
arch/arm/lib/proc.S | 40 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 42 insertions(+), 0 deletions(-)
create mode 100644 arch/arm/lib/proc.S
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile index 300c8fa..213c76f 100644 --- a/arch/arm/lib/Makefile +++ b/arch/arm/lib/Makefile @@ -48,6 +48,8 @@ SOBJS-$(CONFIG_USE_ARCH_MEMSET) += memset.o SOBJS-$(CONFIG_USE_ARCH_MEMCPY) += memcpy.o endif
+SOBJS-y += proc.o + SRCS := $(GLSOBJS:.o=.S) $(GLCOBJS:.o=.c) \ $(SOBJS-y:.o=.S) $(COBJS-y:.o=.c) OBJS := $(addprefix $(obj),$(SOBJS-y) $(COBJS-y)) diff --git a/arch/arm/lib/proc.S b/arch/arm/lib/proc.S new file mode 100644 index 0000000..dba7c11 --- /dev/null +++ b/arch/arm/lib/proc.S @@ -0,0 +1,40 @@ +/* + * Copyright (c) 2011 The Chromium OS Authors. + * See file CREDITS for list of people who contributed to this + * project. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, + * MA 02111-1307 USA + */ + + +/** + * Jump to board_init_r with a new stack pointer + * + * @param gd Pointer to global data + * @param dest_addr Destination address from global data + * @param func Address of board_init_r function (relocated) + * @param sp New stack pointer + */ +.globl proc_call_board_init_r +proc_call_board_init_r: +#ifndef CONFIG_SYS_ICACHE_OFF + mcr p15, 0, r0, c7, c5, 0 @ invalidate icache + mcr p15, 0, r0, c7, c10, 4 @ DSB + mcr p15, 0, r0, c7, c5, 4 @ ISB +#endif + mov sp, r3 + /* jump to it ... */ + mov pc, r2

Hi Simon,
On 10/12/2011 20:16, Simon Glass wrote:
Add a library to hold ARM assembler code which is generic across all ARM CPUs. At first it just holds some basic relocation code. The plan is to move more start.S code here.
I still don't see the point in this new file, as this code is common across all ARM boards and start.S already holds much of this common code. I'd rather see start.S keep the common code and move up the hierarchy, with the {ISA, core, SoC, board}-specific code moved out of start.S and down into its 'natural' specific location.
Best regards,

Hi Albert,
On Sun, Dec 11, 2011 at 6:16 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Hi Simon,
On 10/12/2011 20:16, Simon Glass wrote:
Add a library to hold ARM assembler code which is generic across all ARM CPUs. At first it just holds some basic relocation code. The plan is to move more start.S code here.
I still don't see the point in this new file, as this code is common across all ARM boards and start.S already holds much of this common code. I'd rather see start.S keep the common code and move up the hierarchy, with the {ISA, core, SoC, board}-specific code moved out of start.S and down into its 'natural' specific location.
If we do this then it would be a separate series focused on refactoring start.S. Are you saying that you want that series before we address this relocation series? Where specifically should we put the new start.S and what should we call it? I assume that all the little start.S files will remain or are you wanting to rename all of those into some other filename in arch/arm/cpu/... In which case what should those be called?
It is good to be really specific so that I know the destination port before embarking on a new voyage.
Regards, Simon
Best regards,
Albert.

Hi Albert,
On Sun, Dec 11, 2011 at 9:24 PM, Simon Glass sjg@chromium.org wrote:
Hi Albert,
On Sun, Dec 11, 2011 at 6:16 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Hi Simon,
On 10/12/2011 20:16, Simon Glass wrote:
Add a library to hold ARM assembler code which is generic across all ARM CPUs. At first it just holds some basic relocation code. The plan is to move more start.S code here.
I still don't see the point in this new file, as this code is common across all ARM boards and start.S already holds much of this common code. I'd rather see start.S keep the common code and move up the hierarchy, with the {ISA, core, SoC, board}-specific code moved out of start.S and down into its 'natural' specific location.
If we do this then it would be a separate series focused on refactoring start.S. Are you saying that you want that series before we address this relocation series? Where specifically should we put the new start.S and what should we call it? I assume that all the little start.S files will remain or are you wanting to rename all of those into some other filename in arch/arm/cpu/... In which case what should those be called?
It is good to be really specific so that I know the destination port before embarking on a new voyage.
Regards, Simon
Thanks for your comments.
Still I am not sure what is needed here. Short of refactoring all of start.S (which is a separate and larger effort), I am not sure what to do. My purpose with this series is to move ARM to generic relocation, which is why I am removing all the repeated relocation code from each start.S.
I think we need an assembler file shared across all ARM. Since it cannot be start.S (certainly not now and perhaps not ever if the early code is too different between different ARM architectures or cannot easily be tested) then what should it be? I have proposed arch/arm/lib/proc.S.
(No I really don't want to provide 10 copies of my little board_init_r()-caller function in each start.S, I hope you are not suggesting that!)
My idea is that the various start.S files will shrink down with time as code in there becomes common (IMO the first priority after relocation would be exception handling). But it is going to break a few eggs, since there are subtle differences between the different copies, and some people will ask why we are breaking their chip, or introducing the possibility of breakage. At least with relocation we see a real performance benefit and substantial LOC reduction which helps people stay motivated.
I have the generic board series waiting but would like to get this one sorted out first. Looking forward to your clarification when you have time.
Again the main question is where to put the common ARM assembler code.
Regards, Simon
Best regards,
Albert.

Add a function to process a single ELF relocation and switch ARM over to use generic relocation.
Unfortunately a few boards need to be modified to make this work (mostly adding link symbols to the .lds files).
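The per-entry relocation function mentioned above can be sketched in isolation as follows. This is a simplified illustration, not the patch itself: the rel32_t and sym32_t types are local stand-ins for the ELF32 structures so the example needs no elf.h, relocate_all() is a hypothetical driver, and r_offset is treated as a byte offset into a word buffer rather than as a link-time address. The relocation type numbers 23 (R_ARM_RELATIVE) and 2 (R_ARM_ABS32) match the cases handled in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for Elf32_Rel and Elf32_Sym */
typedef struct {
	uint32_t r_offset;
	uint32_t r_info;
} rel32_t;

typedef struct {
	uint32_t st_name;
	uint32_t st_value;
	uint32_t st_size;
	uint8_t st_info;
	uint8_t st_other;
	uint16_t st_shndx;
} sym32_t;

#define REL_TYPE(info)	((info) & 0xff)	/* low byte: relocation type */
#define REL_SYM(info)	((info) >> 8)	/* upper 24 bits: symbol index */

#define R_ARM_ABS32	2
#define R_ARM_RELATIVE	23

/* Apply one relocation entry; returns 0 if ok, -1 on an unknown type */
int relocate_entry(uint32_t *addr, uint32_t info, const sym32_t *symtab,
		   uint32_t reloc_off)
{
	switch (REL_TYPE(info)) {
	case R_ARM_RELATIVE:
		/* relative fix: increase location by offset */
		*addr += reloc_off;
		return 0;
	case R_ARM_ABS32:
		/* absolute fix: set location to (offset) symbol value */
		*addr = symtab[REL_SYM(info)].st_value + reloc_off;
		return 0;
	default:
		return -1;
	}
}

/* Hypothetical driver: walk the table and stop on the first error */
int relocate_all(uint32_t *image, const rel32_t *rel, size_t count,
		 const sym32_t *symtab, uint32_t reloc_off)
{
	size_t i;

	for (i = 0; i < count; i++) {
		uint32_t *addr = image + rel[i].r_offset / sizeof(uint32_t);

		if (relocate_entry(addr, rel[i].r_info, symtab, reloc_off))
			return -1;
	}
	return 0;
}
```

Returning an error for unknown types follows the C version in this patch; the assembler loops being removed silently skipped them instead.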
Signed-off-by: Simon Glass sjg@chromium.org
---
Changes in v2:
- Use an inline relocation function to reduce code size

arch/arm/config.mk | 3 -
arch/arm/include/asm/reloc.h | 56 +++++++++++++++++++++++++++
nand_spl/board/freescale/mx31pdk/Makefile | 8 +++-
nand_spl/board/freescale/mx31pdk/u-boot.lds | 1 +
nand_spl/board/karo/tx25/Makefile | 8 +++-
nand_spl/board/karo/tx25/u-boot.lds | 1 +
6 files changed, 72 insertions(+), 5 deletions(-)
create mode 100644 arch/arm/include/asm/reloc.h
diff --git a/arch/arm/config.mk b/arch/arm/config.mk index f47d4f7..45f9dca 100644 --- a/arch/arm/config.mk +++ b/arch/arm/config.mk @@ -81,6 +81,3 @@ endif ifndef CONFIG_NAND_SPL LDFLAGS_u-boot += -pie endif - -# We use legacy relocation for now -CONFIG_SYS_SKIP_RELOC := y diff --git a/arch/arm/include/asm/reloc.h b/arch/arm/include/asm/reloc.h new file mode 100644 index 0000000..3b6491d --- /dev/null +++ b/arch/arm/include/asm/reloc.h @@ -0,0 +1,56 @@ +/* + * Copyright (c) 2011 The Chromium OS Authors. + * See file CREDITS for list of people who contributed to this + * project. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of + * the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, + * MA 02111-1307 USA + */ + +#include <common.h> +#include <elf.h> + +/** + * Process a single ELF relocation entry + * + * @param addr Pointer to address of intruction/data to relocate + * @param info The ELF information word / flags + * @param symtab The ELF relocation symbol table + * @param reloc_off Offset of relocated U-Boot relative to load address + * @return 0 if ok, -1 on error + */ +static inline int arch_elf_relocate_entry(Elf32_Addr *addr, Elf32_Word info, + Elf32_Sym *symtab, ulong reloc_off) +{ + int sym; + + switch (ELF32_R_TYPE(info)) { + /* relative fix: increase location by offset */ + case 23: /* TODO: add R_ARM_... 
defines to elf.h */ + *addr += reloc_off; + break; + + /* absolute fix: set location to (offset) symbol value */ + case 2: + sym = ELF32_R_SYM(info); + *addr = symtab[sym].st_value + reloc_off; + break; + + default: + debug("*** Invalid relocation\n"); + return -1; + } + return 0; +} diff --git a/nand_spl/board/freescale/mx31pdk/Makefile b/nand_spl/board/freescale/mx31pdk/Makefile index 87784d2..470e320 100644 --- a/nand_spl/board/freescale/mx31pdk/Makefile +++ b/nand_spl/board/freescale/mx31pdk/Makefile @@ -11,7 +11,7 @@ LDFLAGS := -T $(nandobj)u-boot.lds -Ttext $(CONFIG_SYS_TEXT_BASE) $(LDFLAGS) \ AFLAGS += -DCONFIG_SPL_BUILD -DCONFIG_NAND_SPL CFLAGS += -DCONFIG_SPL_BUILD -DCONFIG_NAND_SPL
-SOBJS = start.o lowlevel_init.o +SOBJS = start.o lowlevel_init.o proc.o reloc.o COBJS = nand_boot_fsl_nfc.o
SRCS := $(SRCTREE)/nand_spl/nand_boot_fsl_nfc.c @@ -50,6 +50,12 @@ $(obj)%.o: $(SRCTREE)/board/freescale/mx31pdk/%.S $(obj)%.o: $(SRCTREE)/nand_spl/%.c $(CC) $(CFLAGS) -c -o $@ $<
+$(obj)%.o: $(SRCTREE)/common/%.c + $(CC) $(CFLAGS) -c -o $@ $< + +$(obj)%.o: $(SRCTREE)/arch/arm/lib/%.S + $(CC) $(AFLAGS) -c -o $@ $< + # defines $(obj).depend target include $(SRCTREE)/rules.mk
diff --git a/nand_spl/board/freescale/mx31pdk/u-boot.lds b/nand_spl/board/freescale/mx31pdk/u-boot.lds index d2b08f6..2273e9b 100644 --- a/nand_spl/board/freescale/mx31pdk/u-boot.lds +++ b/nand_spl/board/freescale/mx31pdk/u-boot.lds @@ -51,6 +51,7 @@ SECTIONS __u_boot_cmd_end = .;
. = ALIGN(4); + __image_copy_end = .;
.rel.dyn : { __rel_dyn_start = .; diff --git a/nand_spl/board/karo/tx25/Makefile b/nand_spl/board/karo/tx25/Makefile index 0336346..cac5600 100644 --- a/nand_spl/board/karo/tx25/Makefile +++ b/nand_spl/board/karo/tx25/Makefile @@ -32,7 +32,7 @@ LDFLAGS := -T $(nandobj)u-boot.lds -Ttext $(CONFIG_SYS_TEXT_BASE) $(LDFLAGS) \ AFLAGS += -DCONFIG_SPL_BUILD -DCONFIG_NAND_SPL CFLAGS += -DCONFIG_SPL_BUILD -DCONFIG_NAND_SPL
-SOBJS = start.o lowlevel_init.o +SOBJS = start.o lowlevel_init.o proc.o reloc.o COBJS = nand_boot_fsl_nfc.o
SRCS := $(SRCTREE)/nand_spl/nand_boot_fsl_nfc.c @@ -71,6 +71,12 @@ $(obj)%.o: $(SRCTREE)/board/karo/tx25/%.S $(obj)%.o: $(SRCTREE)/nand_spl/%.c $(CC) $(CFLAGS) -c -o $@ $<
+$(obj)%.o: $(SRCTREE)/common/%.c + $(CC) $(CFLAGS) -c -o $@ $< + +$(obj)%.o: $(SRCTREE)/arch/arm/lib/%.S + $(CC) $(AFLAGS) -c -o $@ $< + # defines $(obj).depend target include $(SRCTREE)/rules.mk
diff --git a/nand_spl/board/karo/tx25/u-boot.lds b/nand_spl/board/karo/tx25/u-boot.lds index d2b08f6..2273e9b 100644 --- a/nand_spl/board/karo/tx25/u-boot.lds +++ b/nand_spl/board/karo/tx25/u-boot.lds @@ -51,6 +51,7 @@ SECTIONS __u_boot_cmd_end = .;
. = ALIGN(4); + __image_copy_end = .;
.rel.dyn : { __rel_dyn_start = .;

Now that we are using the generic relocation framework, we don't need this code.
Note: Here we lose the ARM1176's enable_mmu code. This seems to duplicate code already in U-Boot now. Can anyone comment on this?
Signed-off-by: Simon Glass sjg@chromium.org
---
Changes in v2:
- Make relocation symbols global so we can use them outside start.S

arch/arm/cpu/arm1136/start.S | 133 +++----------------------
arch/arm/cpu/arm1176/start.S | 214 +++-------------------------------------
arch/arm/cpu/arm720t/start.S | 127 +++---------------------
arch/arm/cpu/arm920t/start.S | 135 +++-----------------------
arch/arm/cpu/arm925t/start.S | 135 +++-----------------------
arch/arm/cpu/arm926ejs/start.S | 144 +++------------------------
arch/arm/cpu/arm946es/start.S | 130 +++----------------------
arch/arm/cpu/arm_intcm/start.S | 135 +++-----------------------
arch/arm/cpu/armv7/start.S | 141 +++-----------------------
arch/arm/cpu/ixp/start.S | 127 +++---------------------
arch/arm/cpu/lh7a40x/start.S | 124 +++---------------------
arch/arm/cpu/pxa/start.S | 138 +++-----------------------
arch/arm/cpu/s3c44b0/start.S | 127 +++---------------------
arch/arm/cpu/sa1100/start.S | 124 +++---------------------
14 files changed, 171 insertions(+), 1763 deletions(-)
diff --git a/arch/arm/cpu/arm1136/start.S b/arch/arm/cpu/arm1136/start.S index c0db96c..f1efa5e 100644 --- a/arch/arm/cpu/arm1136/start.S +++ b/arch/arm/cpu/arm1136/start.S @@ -108,6 +108,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs +_rel_dyn_start_ofs: + .word __rel_dyn_start - _start + +.globl _rel_dyn_end_ofs +_rel_dyn_end_ofs: + .word __rel_dyn_end - _start + +.globl _dynsym_start_ofs +_dynsym_start_ofs: + .word __dynsym_start - _start + #ifdef CONFIG_USE_IRQ /* IRQ stack memory (calculated at run-time) */ .globl IRQ_STACK_START @@ -169,127 +181,6 @@ call_board_init_f:
bl board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l -#endif /* #ifndef CONFIG_SPL_BUILD */ - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ -#ifdef CONFIG_NAND_SPL - ldr r0, _nand_boot_ofs - mov pc, r0 - -_nand_boot_ofs: - .word nand_boot -#else -jump_2_ram: - ldr r0, _board_init_r_ofs - ldr r1, _TEXT_BASE - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start -#endif - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* * diff --git a/arch/arm/cpu/arm1176/start.S b/arch/arm/cpu/arm1176/start.S index 848144a..a413f51 100644 --- a/arch/arm/cpu/arm1176/start.S +++ b/arch/arm/cpu/arm1176/start.S @@ -127,6 +127,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 /* IRQ stack memory (calculated at run-time) + 8 bytes */
 .globl IRQ_STACK_START_IN
 IRQ_STACK_START_IN:
@@ -231,208 +243,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -#ifdef CONFIG_ENABLE_MMU -enable_mmu: - /* enable domain access */ - ldr r5, =0x0000ffff - mcr p15, 0, r5, c3, c0, 0 /* load domain access register */ - - /* Set the TTB register */ - ldr r0, _mmu_table_base - ldr r1, =CONFIG_SYS_PHY_UBOOT_BASE - ldr r2, =0xfff00000 - bic r0, r0, r2 - orr r1, r0, r1 - mcr p15, 0, r1, c2, c0, 0 - - /* Enable the MMU */ - mrc p15, 0, r0, c1, c0, 0 - orr r0, r0, #1 /* Set CR_M to enable MMU */ - - /* Prepare to enable the MMU */ - adr r1, skip_hw_init - and r1, r1, #0x3fc - ldr r2, _TEXT_BASE - ldr r3, =0xfff00000 - and r2, r2, r3 - orr r2, r2, r1 - b mmu_enable - - .align 5 - /* Run in a single cache-line */ -mmu_enable: - - mcr p15, 0, r0, c1, c0, 0 - nop - nop - mov pc, r2 -skip_hw_init: -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l - -#ifndef CONFIG_NAND_SPL - bl coloured_LED_init - bl red_led_on -#endif -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. 
- */ -#ifdef CONFIG_NAND_SPL - ldr pc, _nand_boot - -_nand_boot: .word nand_boot -#else - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start -#endif - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - -#ifdef CONFIG_ENABLE_MMU -_mmu_table_base: - .word mmu_table -#endif - -#ifndef CONFIG_NAND_SPL -/* - * we assume that cache operation is done before. (eg. cleanup_before_linux()) - * actually, we don't need to do anything about cache if not use d-cache in - * U-Boot. So, in this function we clean only MMU. by scsuh - * - * void theLastJump(void *kernel, int arch_num, uint boot_params); - */ -#ifdef CONFIG_ENABLE_MMU - .globl theLastJump -theLastJump: - mov r9, r0 - ldr r3, =0xfff00000 - ldr r4, _TEXT_PHY_BASE - adr r5, phy_last_jump - bic r5, r5, r3 - orr r5, r5, r4 - mov pc, r5 -phy_last_jump: - /* - * disable MMU stuff - */ - mrc p15, 0, r0, c1, c0, 0 - bic r0, r0, #0x00002300 /* clear bits 13, 9:8 (--V- --RS) */ - bic r0, r0, #0x00000087 /* clear bits 7, 2:0 (B--- -CAM) */ - orr r0, r0, #0x00000002 /* set bit 2 (A) Align */ - orr r0, r0, #0x00001000 /* set bit 12 (I) I-Cache */ - mcr p15, 0, r0, c1, c0, 0 - - mcr p15, 0, r0, c8, c7, 0 /* flush v4 TLB */ - - mov r0, #0 - mov pc, r9 -#endif - - /* ************************************************************************* * diff --git a/arch/arm/cpu/arm720t/start.S b/arch/arm/cpu/arm720t/start.S index 540e3c2..c703a88 100644 --- a/arch/arm/cpu/arm720t/start.S +++ b/arch/arm/cpu/arm720t/start.S @@ -97,6 +97,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -146,121 +158,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l - - bl coloured_LED_init - bl red_led_on -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* * diff --git a/arch/arm/cpu/arm920t/start.S b/arch/arm/cpu/arm920t/start.S index 8c5612c..75f9043 100644 --- a/arch/arm/cpu/arm920t/start.S +++ b/arch/arm/cpu/arm920t/start.S @@ -93,6 +93,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -189,129 +201,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l - - bl coloured_LED_init - bl red_led_on -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ -#ifdef CONFIG_NAND_SPL - ldr r0, _nand_boot_ofs - mov pc, r0 - -_nand_boot_ofs: - .word nand_boot -#else - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start -#endif - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* * diff --git a/arch/arm/cpu/arm925t/start.S b/arch/arm/cpu/arm925t/start.S index dbb93ef..30df379 100644 --- a/arch/arm/cpu/arm925t/start.S +++ b/arch/arm/cpu/arm925t/start.S @@ -103,6 +103,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -183,129 +195,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l - - bl coloured_LED_init - bl red_led_on -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ -#ifdef CONFIG_NAND_SPL - ldr r0, _nand_boot_ofs - mov pc, r0 - -_nand_boot_ofs: - .word nand_boot -#else - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start -#endif - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* * diff --git a/arch/arm/cpu/arm926ejs/start.S b/arch/arm/cpu/arm926ejs/start.S index 6a09c02..ac358bc 100644 --- a/arch/arm/cpu/arm926ejs/start.S +++ b/arch/arm/cpu/arm926ejs/start.S @@ -160,6 +160,18 @@ _end: .word __bss_end__ #endif
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -211,138 +223,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - sub r9, r6, r0 /* r9 <- relocation offset */ - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifdef CONFIG_SPL_BUILD - /* No relocation for SPL */ - ldr r0, =__bss_start - ldr r1, =__bss_end__ -#else - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 -#endif - mov r2, #0x00000000 /* clear */ - -clbss_l:cmp r0, r1 /* clear loop... */ - bhs clbss_e /* if reached end of bss, exit */ - str r2, [r0] - add r0, r0, #4 - b clbss_l -clbss_e: - -#ifndef CONFIG_SPL_BUILD - bl coloured_LED_init - bl red_led_on -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ -#ifdef CONFIG_NAND_SPL - ldr r0, _nand_boot_ofs - mov pc, r0 - -_nand_boot_ofs: - .word nand_boot -#else - ldr r0, _board_init_r_ofs - ldr r1, _TEXT_BASE - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... 
*/ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start -#endif - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* * diff --git a/arch/arm/cpu/arm946es/start.S b/arch/arm/cpu/arm946es/start.S index 89ba558..7c5d19f 100644 --- a/arch/arm/cpu/arm946es/start.S +++ b/arch/arm/cpu/arm946es/start.S @@ -109,6 +109,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -154,124 +166,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - blo clbss_l -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ -#ifdef CONFIG_NAND_SPL - ldr pc, _nand_boot - -_nand_boot: .word nand_boot -#else - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start -#endif - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* * diff --git a/arch/arm/cpu/arm_intcm/start.S b/arch/arm/cpu/arm_intcm/start.S index 2033b36..28fa6c6 100644 --- a/arch/arm/cpu/arm_intcm/start.S +++ b/arch/arm/cpu/arm_intcm/start.S @@ -105,6 +105,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -150,129 +162,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l - - bl coloured_LED_init - bl red_led_on -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ -#ifdef CONFIG_NAND_SPL - ldr r0, _nand_boot_ofs - mov pc, r0 - -_nand_boot_ofs: - .word nand_boot -#else - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start -#endif - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* * diff --git a/arch/arm/cpu/armv7/start.S b/arch/arm/cpu/armv7/start.S index 4d8cf35..86510c7 100644 --- a/arch/arm/cpu/armv7/start.S +++ b/arch/arm/cpu/armv7/start.S @@ -96,6 +96,7 @@ _armboot_start:
 /*
  * These are defined in the board-specific linker script.
+ * TODO: move these into proc.S since they are common
  */
 .globl _bss_start_ofs
 _bss_start_ofs:
@@ -113,6 +114,20 @@ _bss_end_ofs:
 _end_ofs:
 	.word _end - _start
+#ifndef CONFIG_SPL_BUILD
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+#endif
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -176,132 +191,6 @@ call_board_init_f:
/*------------------------------------------------------------------------------*/
-/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - moveq r9, #0 /* no relocation. relocation offset(r9) = 0 */ - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _image_copy_end_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop - b clear_bss -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - -#endif /* #ifndef CONFIG_SPL_BUILD */ - -clear_bss: -#ifdef CONFIG_SPL_BUILD - /* No relocation for SPL */ - ldr r0, =__bss_start - ldr r1, =__bss_end__ -#else - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 -#endif - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ -jump_2_ram: -/* - * If I-cache is enabled invalidate it - */ -#ifndef CONFIG_SYS_ICACHE_OFF - mcr p15, 0, r0, c7, c5, 0 @ invalidate icache - mcr p15, 0, r0, c7, c10, 4 @ DSB - mcr p15, 0, r0, c7, c5, 4 @ ISB -#endif - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... 
*/ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start - - #ifndef CONFIG_SKIP_LOWLEVEL_INIT /************************************************************************* * diff --git a/arch/arm/cpu/ixp/start.S b/arch/arm/cpu/ixp/start.S index cb32121..6ecf72e 100644 --- a/arch/arm/cpu/ixp/start.S +++ b/arch/arm/cpu/ixp/start.S @@ -118,6 +118,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -252,121 +264,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l - - bl coloured_LED_init - bl red_led_on -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /****************************************************************************/ /* */ /* Interrupt handling */ diff --git a/arch/arm/cpu/lh7a40x/start.S b/arch/arm/cpu/lh7a40x/start.S index 62de8b8..80bd15b 100644 --- a/arch/arm/cpu/lh7a40x/start.S +++ b/arch/arm/cpu/lh7a40x/start.S @@ -93,6 +93,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -163,118 +175,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* * diff --git a/arch/arm/cpu/pxa/start.S b/arch/arm/cpu/pxa/start.S index ba0de8f..6255151 100644 --- a/arch/arm/cpu/pxa/start.S +++ b/arch/arm/cpu/pxa/start.S @@ -126,6 +126,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -171,132 +183,6 @@ call_board_init_f:
 	ldr	r0, =0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ -#ifndef CONFIG_SPL_BUILD -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - -/* Disable the Dcache RAM lock for stack now */ -#ifdef CONFIG_CPU_PXA25X - bl cpu_init_crit -#endif - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l -#endif /* #ifndef CONFIG_SPL_BUILD */ - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ -#ifdef CONFIG_ONENAND_SPL - ldr r0, _onenand_boot_ofs - mov pc, r0 - -_onenand_boot_ofs: - .word onenand_boot -#else -jump_2_ram: - ldr r0, _board_init_r_ofs - ldr r1, _TEXT_BASE - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start -#endif - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start -#endif /* ************************************************************************* * diff --git a/arch/arm/cpu/s3c44b0/start.S b/arch/arm/cpu/s3c44b0/start.S index a29d5b4..f52e94b 100644 --- a/arch/arm/cpu/s3c44b0/start.S +++ b/arch/arm/cpu/s3c44b0/start.S @@ -84,6 +84,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -135,121 +147,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l - - bl coloured_LED_init - bl red_led_on -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* * diff --git a/arch/arm/cpu/sa1100/start.S b/arch/arm/cpu/sa1100/start.S index 92546d8..2221788 100644 --- a/arch/arm/cpu/sa1100/start.S +++ b/arch/arm/cpu/sa1100/start.S @@ -94,6 +94,18 @@ _bss_end_ofs: _end_ofs: .word _end - _start
+.globl _rel_dyn_start_ofs
+_rel_dyn_start_ofs:
+	.word __rel_dyn_start - _start
+
+.globl _rel_dyn_end_ofs
+_rel_dyn_end_ofs:
+	.word __rel_dyn_end - _start
+
+.globl _dynsym_start_ofs
+_dynsym_start_ofs:
+	.word __dynsym_start - _start
+
 #ifdef CONFIG_USE_IRQ
 /* IRQ stack memory (calculated at run-time) */
 .globl IRQ_STACK_START
@@ -139,118 +151,6 @@ call_board_init_f:
 	ldr	r0,=0x00000000
 	bl	board_init_f
-/*------------------------------------------------------------------------------*/ - -/* - * void relocate_code (addr_sp, gd, addr_moni) - * - * This "function" does not return, instead it continues in RAM - * after relocating the monitor code. - * - */ - .globl relocate_code -relocate_code: - mov r4, r0 /* save addr_sp */ - mov r5, r1 /* save addr of gd */ - mov r6, r2 /* save addr of destination */ - - /* Set up the stack */ -stack_setup: - mov sp, r4 - - adr r0, _start - cmp r0, r6 - beq clear_bss /* skip relocation */ - mov r1, r6 /* r1 <- scratch for copy_loop */ - ldr r3, _bss_start_ofs - add r2, r0, r3 /* r2 <- source end address */ - -copy_loop: - ldmia r0!, {r9-r10} /* copy from source address [r0] */ - stmia r1!, {r9-r10} /* copy to target address [r1] */ - cmp r0, r2 /* until source end address [r2] */ - blo copy_loop - -#ifndef CONFIG_SPL_BUILD - /* - * fix .rel.dyn relocations - */ - ldr r0, _TEXT_BASE /* r0 <- Text base */ - sub r9, r6, r0 /* r9 <- relocation offset */ - ldr r10, _dynsym_start_ofs /* r10 <- sym table ofs */ - add r10, r10, r0 /* r10 <- sym table in FLASH */ - ldr r2, _rel_dyn_start_ofs /* r2 <- rel dyn start ofs */ - add r2, r2, r0 /* r2 <- rel dyn start in FLASH */ - ldr r3, _rel_dyn_end_ofs /* r3 <- rel dyn end ofs */ - add r3, r3, r0 /* r3 <- rel dyn end in FLASH */ -fixloop: - ldr r0, [r2] /* r0 <- location to fix up, IN FLASH! */ - add r0, r0, r9 /* r0 <- location to fix up in RAM */ - ldr r1, [r2, #4] - and r7, r1, #0xff - cmp r7, #23 /* relative fixup? */ - beq fixrel - cmp r7, #2 /* absolute fixup? 
*/ - beq fixabs - /* ignore unknown type of fixup */ - b fixnext -fixabs: - /* absolute fix: set location to (offset) symbol value */ - mov r1, r1, LSR #4 /* r1 <- symbol index in .dynsym */ - add r1, r10, r1 /* r1 <- address of symbol in table */ - ldr r1, [r1, #4] /* r1 <- symbol value */ - add r1, r1, r9 /* r1 <- relocated sym addr */ - b fixnext -fixrel: - /* relative fix: increase location by offset */ - ldr r1, [r0] - add r1, r1, r9 -fixnext: - str r1, [r0] - add r2, r2, #8 /* each rel.dyn entry is 8 bytes */ - cmp r2, r3 - blo fixloop -#endif - -clear_bss: -#ifndef CONFIG_SPL_BUILD - ldr r0, _bss_start_ofs - ldr r1, _bss_end_ofs - mov r4, r6 /* reloc addr */ - add r0, r0, r4 - add r1, r1, r4 - mov r2, #0x00000000 /* clear */ - -clbss_l:str r2, [r0] /* clear loop... */ - add r0, r0, #4 - cmp r0, r1 - bne clbss_l -#endif - -/* - * We are done. Do not return, instead branch to second part of board - * initialization, now running from RAM. - */ - ldr r0, _board_init_r_ofs - adr r1, _start - add lr, r0, r1 - add lr, lr, r9 - /* setup parameters for board_init_r */ - mov r0, r5 /* gd_t */ - mov r1, r6 /* dest_addr */ - /* jump to it ... */ - mov pc, lr - -_board_init_r_ofs: - .word board_init_r - _start - -_rel_dyn_start_ofs: - .word __rel_dyn_start - _start -_rel_dyn_end_ofs: - .word __rel_dyn_end - _start -_dynsym_start_ofs: - .word __dynsym_start - _start - /* ************************************************************************* *

Hi Simon,
On 10/12/2011 20:16, Simon Glass wrote:
Now that we are using the generic relocation framework, we don't need this code.
Note: Here we lose the ARM1176's enable_mmu code. This seems to duplicate code already in U-Boot now. Can anyone comment on this?
First thing would be for you to explain why it is removed when it is not related to relocation, I think.
Best regards,

Hi Albert,
On Sun, Dec 11, 2011 at 6:59 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Hi Simon,
On 10/12/2011 20:16, Simon Glass wrote:
Now that we are using the generic relocation framework, we don't need this code.
Note: Here we lose the ARM1176's enable_mmu code. This seems to duplicate code already in U-Boot now. Can anyone comment on this?
First thing would be for you to explain why it is removed when it is not related to relocation, I think.
Well, the code is unique to that CPU, with no real justification as to why only ARM1176 needs it. Also, it is buried in the relocation code, which is now becoming generic, so to keep it around we would need to re-implement in C code something that seems like a hack. Finally, we already have a way of turning on the caches and MMU in ARM now, so it seems superfluous.
Regards, Simon
Best regards,
Albert.

Hi Simon,
General comments here, detailed comments in reply to n/6 patches.
On 10/12/2011 20:16, Simon Glass wrote:
This is the second patch series aiming to unify the various board.c files in each architecture into a single one. This series implements a generic relocation feature, which is the bridge between board_init_f() and board_init_r(). It then moves ARM over to use this framework, as an example.
On ARM the relocation code is duplicated for each CPU yet it is the same. We can bring this up to the arch level. But since (I believe) Elf relocation is basically the same process for all archs, there is no reason not to bring it up to the generic level.
Actually most of start.S is very similar across all its avatars in arch/arm, and is a good candidate for being generalized. I would prefer a generalization of start.S (with the vector table, a generic startup sequence to prepare for calling board_init_f, the jump to board_init_r with a possible stack switch, and exception handlers), with anything specific moved into the appropriate subdirectory.
Each architecture which uses this framework needs to provide a function called arch_elf_relocate_entry() which processes a single relocation entry. This is a static inline function to reduce code size overhead.
I assume this is due to some arch other than ARM not using the same set of entry types, right? If so, I think it would be better to keep arch-specific entry relocation code under conditional compilation in the generic ELF relocation source file, so that all ELF-structure dependent code is in a single file.
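For readers following the thread, here is a minimal C sketch of the split under discussion: a generic loop walks the .rel.dyn entries and hands each one to a per-architecture handler. It mirrors the fixloop assembly removed by the patch (type 23 is the relative fixup, type 2 the absolute one). Only the name arch_elf_relocate_entry() comes from the cover letter; its signature here, the struct reloc_ctx type, relocate_all() and the idea of modelling the image as a flat byte buffer are all assumptions made for illustration, not the code in the series.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit ELF REL entry: 8 bytes, matching "each rel.dyn entry is 8 bytes". */
struct elf32_rel {
	uint32_t r_offset;	/* link-time address of the location to fix up */
	uint32_t r_info;	/* symbol index (high 24 bits), type (low 8 bits) */
};

/* 32-bit ELF symbol: 16 bytes, st_value at byte offset 4. */
struct elf32_sym {
	uint32_t st_name;
	uint32_t st_value;
	uint32_t st_size;
	uint8_t st_info;
	uint8_t st_other;
	uint16_t st_shndx;
};

#define R_ARM_ABS32	2
#define R_ARM_RELATIVE	23

/* Hypothetical context: the relocated image modelled as a byte buffer. */
struct reloc_ctx {
	uint8_t *image;			/* copy of the image at its new home */
	uint32_t text_base;		/* link-time base address */
	uint32_t reloc_off;		/* new address minus text_base */
	const struct elf32_sym *dynsym;	/* dynamic symbol table */
};

/* ARM flavour of the per-entry hook; returns 0 if handled, -1 if unknown. */
static inline int arch_elf_relocate_entry(const struct reloc_ctx *ctx,
					  const struct elf32_rel *rel)
{
	uint8_t *loc = ctx->image + (rel->r_offset - ctx->text_base);
	uint32_t val;

	switch (rel->r_info & 0xff) {
	case R_ARM_RELATIVE:	/* increase the stored value by the offset */
		memcpy(&val, loc, sizeof(val));
		val += ctx->reloc_off;
		break;
	case R_ARM_ABS32:	/* set location to the relocated symbol value */
		val = ctx->dynsym[rel->r_info >> 8].st_value + ctx->reloc_off;
		break;
	default:		/* this sketch leaves unknown types untouched */
		return -1;
	}
	memcpy(loc, &val, sizeof(val));
	return 0;
}

/* Generic, architecture-independent loop; returns the number of skipped entries. */
static int relocate_all(const struct reloc_ctx *ctx,
			const struct elf32_rel *rel, size_t count)
{
	int unknown = 0;

	for (size_t i = 0; i < count; i++)
		if (arch_elf_relocate_entry(ctx, &rel[i]))
			unknown++;
	return unknown;
}
```

Whether the per-entry function lives in an arch header or under #ifdef in the generic file, as suggested above, changes only where the switch statement sits; the loop is the same either way.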
For ARM, a new arch/arm/lib/proc.S file is created, which holds generic ARM assembler code (things that cannot be written in C and are common functions used by all ARM CPUs). This helps reduce duplication. Interrupt handling code and perhaps even some startup code can move there later.
It may be useful for other architectures with a lot of different CPUs to have a similar file.
Indeed, but I think start.S is a better candidate for this, see comment above.
Code size on my ARMv7 system increases by 54 bytes with generic relocation. This overhead is mostly just literal pool access and setting up to call the relocated U-Boot at the end.
On my system, execution time increases from 10.8ms to 15.6ms due to the less efficient C implementations of the copy and zero loops. If execution time is of concern, you can define CONFIG_USE_ARCH_MEMSET and CONFIG_USE_ARCH_MEMCPY to reduce it. For me this reduces relocation time to 5.4ms, i.e. twice as fast as the old system.
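As an illustration of the copy and zero steps being timed here, a rough C equivalent of the old copy_loop and clbss_l assembly might look like the sketch below. The helper name copy_and_clear() and its parameters are hypothetical, memory is modelled as ordinary buffers rather than flash and RAM, and the BSS is assumed to start directly after the copied image. With CONFIG_USE_ARCH_MEMCPY and CONFIG_USE_ARCH_MEMSET defined, the same two calls would presumably resolve to the optimised architecture versions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the image to its destination, then zero
 * the BSS region that is assumed to follow it there.
 */
static void copy_and_clear(uint8_t *dest, const uint8_t *src,
			   size_t image_len, size_t bss_len)
{
	if (dest != src)	/* already in place: skip the copy */
		memcpy(dest, src, image_len);
	memset(dest + image_len, 0, bss_len);
}
```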
One problem remains which causes mx31pdk to fail to build. It doesn't have string.c in its SPL code and the architecture-specific versions of memset()/memcpy() are too large. I propose to add a local change to reloc.c that uses inline code for boards that use the old legacy SPL framework. We can remove it later. This is not included in v2 but I am interested in comments on this approach. An alternative would be just to add simple memset()/memcpy() functions just for this board (and one other affected MX31 board).
Changes in v2:
- Use CONFIG_SYS_SKIP_RELOC instead of CONFIG_SYS_LEGACY_BOARD
- Import asm-generic/sections.h from Linux and add U-Boot extras
- Squash generic link symbols patch into generic relocation patch
- Move reloc.c into common/
- Add function comments
- Use memset, memcpy instead of inline code
- Add README file for relocation
- Invalidate I-cache when we jump to relocated code
- Use an inline relocation function to reduce code size
Seeing as inline will only avoid pushing a couple argument registers and doing a branch, and OTOH may require additional register shuffling in the calling code, how much code does this inlining save?
- Make relocation symbols global so we can use them outside start.S
Why should the relocation symbols ever be used outside of actually relocating?
Best regards,

Hi Albert,
On Sun, Dec 11, 2011 at 6:47 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Hi Simon,
General comments here, detailed comments in reply to n/6 patches.
Thanks for looking at this.
On 10/12/2011 20:16, Simon Glass wrote:
This is the second patch series aiming to unify the various board.c files in each architecture into a single one. This series implements a generic relocation feature, which is the bridge between board_init_f() and board_init_r(). It then moves ARM over to use this framework, as an example.
On ARM the relocation code is duplicated for each CPU yet it is the same. We can bring this up to the arch level. But since (I believe) Elf relocation is basically the same process for all archs, there is no reason not to bring it up to the generic level.
Actually most of start.S is very similar across all its avatars in arch/arm, and is a good candidate for being generalized. I would prefer a generalization of start.S (with the vector table, a generic startup sequence to prepare for calling board_init_f, the jump to board_init_r with a possible stack switch, and exception handlers), with anything specific moved into the appropriate subdirectory.
Yes I agree. However this is not actually the purpose of this series, which is to head (so far as is possible) towards a generic board.c file for all architectures. While I completely agree that start.S is a mess, that is a separate problem!
Each architecture which uses this framework needs to provide a function called arch_elf_relocate_entry() which processes a single relocation entry. This is a static inline function to reduce code size overhead.
I assume this is due to some arch other than ARM not using the same set of entry types, right? If so, I think it would be better to keep arch-specific entry relocation code under conditional compilation in the generic ELF relocation source file, so that all ELF-structure dependent code is in a single file.
Yes that takes things one step further. I did look at it but could not find an elf.h with defines for all the different relocations. Do you know where to find that?
For ARM, a new arch/arm/lib/proc.S file is created, which holds generic ARM assembler code (things that cannot be written in C and are common functions used by all ARM CPUs). This helps reduce duplication. Interrupt handling code and perhaps even some startup code can move there later.
It may be useful for other architectures with a lot of different CPUs to have a similar file.
Indeed, but I think start.S is a better candidate for this, see comment above.
My understanding of start.S is that it is the start-up code for the machine, and that over time we would hope to reduce it to a small amount of code which is specific to that particular cpu (i.e. the shared and common code moves out).
We can do this in many small steps - certainly I don't have the appetite for changing everything around all at once.
Anyway, as I may have mentioned before, I feel that calling back into start.S from C code is wrong. That file sits at the beginning of the image whereas the code you call back to doesn't need to. It also means that there is (currently) no generic ARM assembler file (apart from low_level.S) which C code can call to do things which are specific to the architecture (but not the cpu).
Code size on my ARMv7 system increases by 54 bytes with generic relocation. This overhead is mostly just literal pool access and setting up to call the relocated U-Boot at the end.
On my system, execution time increases from 10.8ms to 15.6ms due to the less efficient C implementations of the copy and zero loops. If execution time is of concern, you can define CONFIG_USE_ARCH_MEMSET and CONFIG_USE_ARCH_MEMCPY to reduce it. For me this reduces relocation time to 5.4ms, i.e. twice as fast as the old system.
One problem remains which causes mx31pdk to fail to build. It doesn't have string.c in its SPL code and the architecture-specific versions of memset()/memcpy() are too large. I propose to add a local change to reloc.c that uses inline code for boards that use the old legacy SPL framework. We can remove it later. This is not included in v2 but I am interested in comments on this approach. An alternative would be just to add simple memset()/memcpy() functions just for this board (and one other affected MX31 board).
Changes in v2:
- Use CONFIG_SYS_SKIP_RELOC instead of CONFIG_SYS_LEGACY_BOARD
- Import asm-generic/sections.h from Linux and add U-Boot extras
- Squash generic link symbols patch into generic relocation patch
- Move reloc.c into common/
- Add function comments
- Use memset, memcpy instead of inline code
- Add README file for relocation
- Invalidate I-cache when we jump to relocated code
- Use an inline relocation function to reduce code size
Seeing as inline will only avoid pushing a couple argument registers and doing a branch, and OTOH may require additional register shuffling in the calling code, how much code does this inlining save?
About 48 bytes all up from memory.
- Make relocation symbols global so we can use them outside start.S
Why should the relocation symbols ever be used outside of actually relocating?
Well since relocation is moving out of start.S to a generic library which is not architecture-specific, we must make these symbols available to it. Otherwise the generic code doesn't know what it is relocating.
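To make this concrete, the sketch below shows the kind of thing generic C code could do with those symbols once it can see them. The three _ofs values are the offsets from _start emitted by the .word directives in the patch; the struct and function names (reloc_offsets, reloc_tables, resolve_tables) are hypothetical, and addresses are modelled as plain 32-bit integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets from _start, as stored by the .word directives in start.S. */
struct reloc_offsets {
	uint32_t rel_dyn_start_ofs;
	uint32_t rel_dyn_end_ofs;
	uint32_t dynsym_start_ofs;
};

/* Hypothetical view of the tables once the offsets are resolved. */
struct reloc_tables {
	uint32_t rel_dyn_start;	/* address of the first .rel.dyn entry */
	uint32_t rel_dyn_end;	/* address just past the last entry */
	uint32_t dynsym_start;	/* address of the dynamic symbol table */
	size_t rel_count;	/* each .rel.dyn entry is 8 bytes */
};

/* Turn the offsets into addresses, given where _start currently lives. */
static struct reloc_tables resolve_tables(uint32_t start_addr,
					  const struct reloc_offsets *ofs)
{
	struct reloc_tables t;

	t.rel_dyn_start = start_addr + ofs->rel_dyn_start_ofs;
	t.rel_dyn_end = start_addr + ofs->rel_dyn_end_ofs;
	t.dynsym_start = start_addr + ofs->dynsym_start_ofs;
	t.rel_count = (ofs->rel_dyn_end_ofs - ofs->rel_dyn_start_ofs) / 8;
	return t;
}
```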
Best regards,
Albert.
Regards, Simon

Hi again,
On 11/12/2011 22:30, Simon Glass wrote:
Actually most of start.S is very similar across all its avatars in arch/arm, and is a good candidate for being generalized. I would prefer a generalization of start.S (with the vector table, a generic startup sequence to prepare for calling board_init_f, the jump to board_init_r with a possible stack switch, and exception handlers), with anything specific moved into the appropriate subdirectory.
Yes I agree. However this is not actually the purpose of this series, which is to head (so far as is possible) towards a generic board.c file for all architectures. While I completely agree that start.S is a mess, that is a separate problem!
Indeed, generalizing start.S is a separate problem from generalizing board.c, and that means your reboard patch series is unaffected whether you move some code into proc.S or not. Actually, that move to proc.S is just a code factorization, unrelated to relocation -- your series can work just as well without it, only it will keep on taking up code size that could be reduced regardless of the way relocation is done.
Now, seen from the reverse perspective of generalizing start.S, the move to proc.S in this reboard series is a step in the wrong direction, which is why I do not want it. Plus, it'll make your 'reboard' patch more focused on what it is about, i.e. generalizing relocation.
(BTW, even though I understand your greater goal about board.c, I still think the patch subjects should *not* contain this "reboard" prefix nor mention that greater goal. The patch series is about relocation, and should be clear about it.)
Each architecture which uses this framework needs to provide a function called arch_elf_relocate_entry() which processes a single relocation entry. This is a static inline function to reduce code size overhead.
I assume this is due to some arch other than armnot using the same set of entry types, right? If so, I think it would be better to keep arch-specific entry relocation code under conditional complilation in the generic ELF relication source file, so that all ELF-structure dependent code is in a single file.
(sorry for the typos, BTW)
Yes that takes things one step further. I did look at it but could not find an elf.h with defines for all the different relocations. Do you know where to find that?
We don't need to find a header file with all the relocation types we will ever need; we only need one with the two ARM relocation types we need now. We can add them to include/elf.h.
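As a sketch of what those additions could look like: the two type numbers below are the ones the removed assembly tests for (2 and 23), which correspond to R_ARM_ABS32 and R_ARM_RELATIVE in the ARM ELF ABI, and the two macros are the standard ELF32 r_info decoders. The helper arm_reloc_name() is hypothetical and only there to exercise the defines; whether include/elf.h already carries any of these is not something this thread establishes.

```c
#include <stdint.h>

/* Standard ELF32 r_info decoding: symbol index above, type in the low byte. */
#define ELF32_R_SYM(info)	((info) >> 8)
#define ELF32_R_TYPE(info)	((uint8_t)(info))

/* The two ARM relocation types handled by the existing relocation code. */
#define R_ARM_ABS32	2
#define R_ARM_RELATIVE	23

/* Hypothetical helper: name the relocation type of an entry's r_info. */
static const char *arm_reloc_name(uint32_t r_info)
{
	switch (ELF32_R_TYPE(r_info)) {
	case R_ARM_ABS32:
		return "R_ARM_ABS32";
	case R_ARM_RELATIVE:
		return "R_ARM_RELATIVE";
	default:
		return "unsupported";
	}
}
```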
For ARM, a new arch/arm/lib/proc.S file is created, which holds generic ARM assembler code (things that cannot be written in C and are common functions used by all ARM CPUs). This helps reduce duplication. Interrupt handling code and perhaps even some startup code can move there later.
It may be useful for other architectures with a lot of different CPUs to have a similar file.
Indeed, but I think start.S is a better candidate for this, see comment above.
My understanding of start.S is that it is the start-up code for the machine, and that over time we would hope to reduce it to a small amount of code which is specific to that particular cpu (i.e. the shared and common code moves out).
We can do this in many small steps - certainly I don't have the appetite for changing everything around all at once.
This view of start.S is kind of an /a priori/ -- just because the file is named start.S does not mean that there should only be startup code in there. Conversely, one could argue that handling the return from board_init_f and branching into board_init_r is *still* startup code.
But yes, maybe one day start.S will be split into a file with only the startup code and a file with everything else. Only this is totally unrelated to relocation, and thus will not happen in this patch series.
Anyway, as I may have mentioned before, I feel that calling back into start.S from C code is wrong. That file sits at the beginning of the image whereas the code you call back to doesn't need to. It also means that there is (currently) no generic ARM assembler file (apart from low_level.S) which C code can call to do things which are specific to the architecture (but not the cpu).
I fail to see the point in your argument about start.S being "far away" from the C code that calls into it or returns to it. Never do we consider the "distance" between C callers and callees, so why should we do it here?
As for there being no generic ARM assembler file etc., before complaining that there isn't one, I would like to see why one should be created. For me, just having a couple pieces of code that are 'generic ARM code that C can call' is not a criterion for putting them in a new file. Files must group pieces of code that have common functional purpose, not simply pieces of code that have a common characteristic.
Seeing as inline will only avoid pushing a couple argument registers and doing a branch, and OTOH may require additional register shuffling in the calling code, how much code does this inlining save?
About 48 bytes all up from memory.
About 12 instructions? Funny, I wouldn't have thought inlining a function called only once would save that much. Thanks for the info.
- Make relocation symbols global so we can use them outside start.S
Why should the relocation symbols ever be used outside of actually relocating?
Well since relocation is moving out of start.S to a generic library which is not architecture-specific, we must make these symbols available to it. Otherwise the generic code doesn't know what it is relocating.
Understood. My point is that they should not be made available to code other than the relocation code.
Regards, Simon
Amicalement,

Hi Albert,
On Sun, Dec 11, 2011 at 2:27 PM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Hi again,
On 11/12/2011 22:30, Simon Glass wrote:
Actually most of start.S is very similar across all its avatars in arch/arm, and is a good candidate for being generalized. I would prefer a generalization of start.S (with the vector table, generic startup sequence, preparation for calling board_init_f, jump to board_init_r with a possible stack switch, exception handlers), and anything specific moved into the adequate subdirectory.
Yes I agree. However this is not actually the purpose of this series, which is to head (so far as is possible) towards a generic board.c file for all architectures. While I completely agree that start.S is a mess, that is a separate problem!
Indeed, generalizing start.S is a separate problem from generalizing board.c; and that means your reboard patch series is unaffected whether you move some code into proc.S or not. Actually, that move to proc.S is just a code factorization, unrelated to relocation -- your series can work just as well without it, only it will keep on taking up code size that could be reduced regardless of the way relocation is done.
OK but sorry I am still a bit unclear.
Since I need to call from C the assembler function which sets up the stack, invalidates the I-cache and jumps to board_init_r(), I need this function to be somewhere. Perhaps in the future we might devise a way of doing some of this in C code, or we might change the API. But for now we need it.
Given that we need it, it makes little sense to me to put it in start.S. It then gets repeated 10 times throughout the code, with every cpu having its own version.
Now, seen from the reverse perspective of generalizing start.S, the move to proc.S in this reboard series is a step in the wrong direction, which is why I do not want it. Plus, it'll make your 'reboard' patch more focused on what it is about, i.e. generalizing relocation.
OK, so where should this code go? Repeated 10 times in start.S? Or are you asking for a new start.S file in arch/arm/lib/ ? If so, I would want to call it something else since link scripts may assume there is only one start.o.
(BTW, even though I understand your greater goal about board.c, I still think the patch subjects should *not* contain this "reboard" prefix nor mention that greater goal. The patch series is about relocation, and should be clear about it.)
OK. Should I use something like 'reloc', or just drop the prefix altogether?
Each architecture which uses this framework needs to provide a function called arch_elf_relocate_entry() which processes a single relocation entry. This is a static inline function to reduce code size overhead.
I assume this is due to some arch other than ARM not using the same set of entry types, right? If so, I think it would be better to keep arch-specific entry relocation code under conditional compilation in the generic ELF relocation source file, so that all ELF-structure dependent code is in a single file.
(sorry for the typoes, BTW)
And sorry for mine :-)
Yes that takes things one step further. I did look at it but could not find an elf.h with defines for all the different relocations. Do you know where to find that?
We don't need to find a header file with all the relocation types we will need; we only need one with the two ARM relocation types we need now. Just add them to include/elf.h.
OK, will do.
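To make the suggestion above concrete, here is a minimal host-side sketch of per-entry relocation kept under conditional compilation in one file. It is not the code from the patch: `relocate_entry()`, `SKETCH_ARCH_ARM` (a stand-in for a real config symbol) and the struct names are hypothetical. Only the ELF32 layouts and the two ARM type numbers (R_ARM_ABS32 = 2, R_ARM_RELATIVE = 23) come from the ELF and ARM ABI documents. The ABS32 case assumes the in-place addend is zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ELF32 REL entry and symbol layouts, as defined by the ELF specification. */
struct elf32_rel {
	uint32_t r_offset;	/* link-time address of the word to patch */
	uint32_t r_info;	/* symbol index (high 24 bits) and type (low 8) */
};

struct elf32_sym {
	uint32_t st_name;
	uint32_t st_value;
	uint32_t st_size;
	uint8_t st_info;
	uint8_t st_other;
	uint16_t st_shndx;
};

#define ELF32_R_SYM(info)	((info) >> 8)
#define ELF32_R_TYPE(info)	((info) & 0xff)

/* The two ARM relocation types discussed in the thread. */
#define R_ARM_ABS32	2
#define R_ARM_RELATIVE	23

/* Hypothetical stand-in for an architecture config symbol. */
#define SKETCH_ARCH_ARM	1

/*
 * Apply one relocation entry to an image held in a host buffer.
 * image:     buffer holding the copied image
 * link_base: address the image was linked at
 * delta:     destination address minus link address
 * Returns 0 on success, -1 for an unsupported relocation type.
 */
int relocate_entry(uint8_t *image, uint32_t link_base,
		   const struct elf32_rel *rel,
		   const struct elf32_sym *symtab, uint32_t delta)
{
	uint8_t *where;
	uint32_t value;

	assert(rel->r_offset >= link_base);
	where = image + (rel->r_offset - link_base);

#if SKETCH_ARCH_ARM
	switch (ELF32_R_TYPE(rel->r_info)) {
	case R_ARM_RELATIVE:
		/* The word already holds a link-time address: shift it. */
		memcpy(&value, where, sizeof(value));
		value += delta;
		break;
	case R_ARM_ABS32:
		/* Take the symbol's link-time value and shift it. */
		value = symtab[ELF32_R_SYM(rel->r_info)].st_value + delta;
		break;
	default:
		return -1;
	}
#else
#error "no relocation entry handler for this architecture"
#endif

	memcpy(where, &value, sizeof(value));
	return 0;
}
```

Another architecture would add its own `#elif` branch with its own type numbers, so the ELF-structure knowledge stays in one source file.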
For ARM, a new arch/arm/lib/proc.S file is created, which holds generic ARM assembler code (things that cannot be written in C and are common functions used by all ARM CPUs). This helps reduce duplication. Interrupt handling code and perhaps even some startup code can move there later.
It may be useful for other architectures with a lot of different CPUs to have a similar file.
Indeed, but I think start.S is a better candidate for this, see comment above.
My understanding of start.S is that it is the start-up code for the machine, and that over time we would hope to reduce it to a small amount of code which is specific to that particular cpu (i.e. the shared and common code moves out).
We can do this in many small steps - certainly I don't have the appetite for changing everything around all at once.
This view of start.S is kind of an /a priori/ -- just because the file is named start.S does not mean that there should only be startup code in there. Conversely, one could argue that handling the return from board_init_f and branching into board_init_r is *still* startup code.
But yes, maybe one day start.S will be split into a file with only the startup code and a file with everything else. Only this is totally unrelated to relocation, and thus will not happen in this patch series.
OK, will await your response on where this goes.
Anyway, as I may have mentioned before, I feel that calling back into start.S from C code is wrong. That file sits at the beginning of the image whereas the code you call back to doesn't need to. It also means that there is (currently) no generic ARM assembler file (apart from low_level.S) which C code can call to do things which are specific to the architecture (but not the cpu).
I fail to see the point in your argument about start.S being "far away" from the C code that calls into it or returns to it. Never do we consider the "distance" between C callers and callees, so why should we do it here?
Well let's leave that one as it is a matter of taste. For the moment we have a start.S for each cpu so my objection would be repeating this code in each one. See above.
As for there being no generic ARM assembler file etc., before complaining that there isn't one, I would like to see why one should be created. For me, just having a couple pieces of code that are 'generic ARM code that C can call' is not a criterion for putting them in a new file. Files must group pieces of code that have common functional purpose, not simply pieces of code that have a common characteristic.
OK, will await your response.
Seeing as inline will only avoid pushing a couple argument registers and doing a branch, and OTOH may require additional register shuffling in the calling code, how much code does this inlining save?
About 48 bytes all up from memory.
About 12 instructions? Funny, I wouldn't have thought inlining a function called only once would save that much. Thanks for the info.
From memory. Inlining a function with 4 args can save function entry/return, assignment of the args to registers (say 8 instructions all up), and perhaps take advantage of already-calculated partial expressions. Actually it took a bit of messing around to reduce the overhead of this patch to that level.
- Make relocation symbols global so we can use them outside start.S
Why should the relocation symbols ever be used outside of actually relocating?
Well since relocation is moving out of start.S to a generic library which is not architecture-specific, we must make these symbols available to it. Otherwise the generic code doesn't know what it is relocating.
Understood. My point is that they should not be made available to code other than the relocation code.
Oh I see. Yes of course.
Regards, Simon
Amicalement,
Albert.

Hi Simon,
On Mon, Dec 12, 2011 at 4:20 PM, Simon Glass sjg@chromium.org wrote:
Hi Albert,
On Sun, Dec 11, 2011 at 2:27 PM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Hi again,
On 11/12/2011 22:30, Simon Glass wrote:
Actually most of start.S is very similar across all its avatars in arch/arm, and is a good candidate for being generalized. I would prefer a generalization of start.S (with the vector table, generic startup sequence, preparation for calling board_init_f, jump to board_init_r with a possible stack switch, exception handlers), and anything specific moved into the adequate subdirectory.
Yes I agree. However this is not actually the purpose of this series, which is to head (so far as is possible) towards a generic board.c file for all architectures. While I completely agree that start.S is a mess, that is a separate problem!
Indeed, generalizing start.S is a separate problem from generalizing board.c; and that means your reboard patch series is unaffected whether you move some code into proc.S or not. Actually, that move to proc.S is just a code factorization, unrelated to relocation -- your series can work just as well without it, only it will keep on taking up code size that could be reduced regardless of the way relocation is done.
OK but sorry I am still a bit unclear.
Since I need to call from C the assembler function which sets up the stack, invalidates the I-cache and jumps to board_init_r(), I need this function to be somewhere. Perhaps in the future we might devise a way of doing some of this in C code, or we might change the API. But for now we need it.
Given that we need it, it makes little sense to me to put it in start.S. It then gets repeated 10 times throughout the code, with every cpu having its own version.
[snip]
The 'problem' is that in order to implement the centralisation of the relocation code, you need to do some non-relocation related fix-ups along the way.
I think it would be good to separate what you are doing into a 'prepare' series which simply shuffles code around ready for the actual relocation patches - The compiled code should be identical before and after the 'prepare' patches. If you find some code that can be optimised (consolidating duplicate code across SoCs for example) then do this either immediately before or after the 'prepare' patches (if those optimisations make no difference to the relocation changes, then you can even leave them until after the relocation patches).
After the 'prepare' patches, the relocation changes will be much clearer.
Remember, there is nothing wrong with submitting multiple series of patches where one depends on the other, but you need to make the precedence clear in the description. This approach is preferable over interleaving patches which are not, technically, related to the subject of what you are doing in the series. If it gets to the point where you simply must have some interleaving patches, then it is OK to do it within the series, but change the subject of the interleaving patches to make it clear that they are not directly related.
Remember, once the patches are applied, the concept of a 'series' will be lost, so the tag at the beginning of the subject must clearly represent what that individual patch is about - Having a 'reloc' tag on a code consolidation patch will not make sense in a year's time...
Regards,
Graeme

Hi Graeme,
On Sun, Dec 11, 2011 at 9:58 PM, Graeme Russ graeme.russ@gmail.com wrote:
Hi Simon,
On Mon, Dec 12, 2011 at 4:20 PM, Simon Glass sjg@chromium.org wrote:
Hi Albert,
On Sun, Dec 11, 2011 at 2:27 PM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Hi again,
On 11/12/2011 22:30, Simon Glass wrote:
Actually most of start.S is very similar across all its avatars in arch/arm, and is a good candidate for being generalized. I would prefer a generalization of start.S (with the vector table, generic startup sequence, preparation for calling board_init_f, jump to board_init_r with a possible stack switch, exception handlers), and anything specific moved into the adequate subdirectory.
Yes I agree. However this is not actually the purpose of this series, which is to head (so far as is possible) towards a generic board.c file for all architectures. While I completely agree that start.S is a mess, that is a separate problem!
Indeed, generalizing start.S is a separate problem from generalizing board.c; and that means your reboard patch series is unaffected whether you move some code into proc.S or not. Actually, that move to proc.S is just a code factorization, unrelated to relocation -- your series can work just as well without it, only it will keep on taking up code size that could be reduced regardless of the way relocation is done.
OK but sorry I am still a bit unclear.
Since I need to call from C the assembler function which sets up the stack, invalidates the I-cache and jumps to board_init_r(), I need this function to be somewhere. Perhaps in the future we might devise a way of doing some of this in C code, or we might change the API. But for now we need it.
Given that we need it, it makes little sense to me to put it in start.S. It then gets repeated 10 times throughout the code, with every cpu having its own version.
[snip]
The 'problem' is that in order to implement the centralisation of the relocation code, you need to do some non-relocation related fix-ups along the way.
I think it would be good to separate what you are doing into a 'prepare' series which simply shuffles code around ready for the actual relocation patches - The compiled code should be identical before and after the 'prepare' patches. If you find some code that can be optimised (consolidating duplicate code across SoCs for example) then do this either immediately before or after the 'prepare' patches (if those optimisations make no difference to the relocation changes, then you can even leave them until after the relocation patches).
After the 'prepare' patches, the relocation changes will be much clearer.
Remember, there is nothing wrong with submitting multiple series of patches where one depends on the other, but you need to make the precedence clear in the description. This approach is preferable over interleaving patches which are not, technically, related to the subject of what you are doing in the series. If it gets to the point where you simply must have some interleaving patches, then it is OK to do it within the series, but change the subject of the interleaving patches to make it clear that they are not directly related.
Remember, once the patches are applied, the concept of a 'series' will be lost, so the tag at the beginning of the subject must clearly represent what that individual patch is about - Having a 'reloc' tag on a code consolidation patch will not make sense in a year's time...
Can you please be specific about these non-relocation fix-ups? The last patch of the series removes all the relocation code from the various start.S files. Is that what you mean? There is obviously a bit of a problem here, but I just don't see what it is.
Here are the patches with my questions / comments:
reboard: Create reloc.h and include it where needed - I think this is fine - it was requested by review feedback
reboard: define CONFIG_SYS_SKIP_RELOC for all archs - This option is intended to deal with architectures which don't yet use the generic relocation method. It is waiting on Albert to say what is actually wanted here, perhaps a separate 'skip-relocation' patch which can be turned on per board (but I thought that was NAKed), or perhaps renaming this option.
reboard: Add generic relocation feature - this just adds the generic code so I think is ok
reboard: arm: Add processor function library - this adds the 'jump to board_init_r()' function, which is currently a few instructions at the end of the assembler version of relocate_code(), repeated in each start.S
reboard: arm: Move over to generic relocation - this turns off CONFIG_SYS_SKIP_RELOC for ARM and makes it use the generic reloc
reboard: arm: Remove unused code in start.S - this removes the relocate_code() implementation in each start.S, now that this is not needed
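For readers following the patch list, the overall flow being moved out of each start.S (copy the image, zero the BSS, fix up relocations, then hand off to board_init_r()) can be sketched in host-side C as below. This is an illustration only, not the code in the series: `struct reloc_image`, `relocate_and_jump()` and `sketch_board_init_r()` are hypothetical names. Only "add delta" style fix-ups are modelled, and the hand-off is a plain function call, whereas the real code switches stacks, invalidates the I-cache and never returns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of an image to relocate (host-side model). */
struct reloc_image {
	const uint8_t *src;		/* image contents at the link address */
	uint32_t link_base;		/* address the image was linked at */
	uint32_t image_len;		/* bytes of code and data to copy */
	uint32_t bss_len;		/* bytes of BSS to clear after the copy */
	const uint32_t *rel_offsets;	/* link-time addresses of words to fix */
	size_t rel_count;
};

typedef void (*board_init_r_fn)(uint32_t dest_addr);

/* Stand-in for board_init_r(): records where it was entered. */
uint32_t sketch_entered_at;

void sketch_board_init_r(uint32_t dest_addr)
{
	sketch_entered_at = dest_addr;
}

/*
 * Copy the image into dest (which models memory at dest_addr), clear the
 * BSS, add the relocation delta to every listed word, then call next().
 */
void relocate_and_jump(const struct reloc_image *img, uint8_t *dest,
		       uint32_t dest_addr, board_init_r_fn next)
{
	uint32_t delta = dest_addr - img->link_base;
	size_t i;

	assert(img && dest && next);

	/* These are the copy and zero loops mentioned in the cover letter. */
	memcpy(dest, img->src, img->image_len);
	memset(dest + img->image_len, 0, img->bss_len);

	for (i = 0; i < img->rel_count; i++) {
		uint8_t *where = dest + (img->rel_offsets[i] - img->link_base);
		uint32_t value;

		memcpy(&value, where, sizeof(value));
		value += delta;
		memcpy(where, &value, sizeof(value));
	}

	next(dest_addr);
}
```

In this model the only architecture-specific pieces are the fix-up rule inside the loop and the final hand-off; the copy and zero steps are plain memcpy()/memset() calls.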
Regards, Simon
Regards,
Graeme
participants (3)
- Albert ARIBAUD
- Graeme Russ
- Simon Glass