[U-Boot] [PATCH 1/2] [NEXT] arm: change relocation flag from -fPIC to -fPIE

Replace GOT indirect addressing with more efficient pic-base relative addressing for initialized data (uninitialized data still use GOTi indirect addressing). This also reduces code size by 0.4% compared to -fPIC.
Signed-off-by: Albert Aribaud albert.aribaud@free.fr --- arch/arm/config.mk | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/arm/config.mk b/arch/arm/config.mk index 6923f6d..138c43a 100644 --- a/arch/arm/config.mk +++ b/arch/arm/config.mk @@ -35,7 +35,7 @@ endif
ifndef CONFIG_SYS_ARM_WITHOUT_RELOC # needed for relocation -PLATFORM_RELFLAGS += -fPIC +PLATFORM_RELFLAGS += -fPIE endif
ifdef CONFIG_SYS_ARM_WITHOUT_RELOC

Add -msingle-pic-base to the relocation flags, and compute the pic base in start.S twice and for all -- once before relocation to run board_init_f, and once after relocation to run board_init_r and the rest of u-boot. This further reduces code size by 2.5% compared to -fPIE alone.
Signed-off-by: Albert Aribaud albert.aribaud@free.fr --- arch/arm/cpu/arm926ejs/config.mk | 5 ++++ arch/arm/cpu/arm926ejs/start.S | 40 ++++++++++++++++++++++++++++++++++-- arch/arm/cpu/arm926ejs/u-boot.lds | 1 + 3 files changed, 43 insertions(+), 3 deletions(-)
diff --git a/arch/arm/cpu/arm926ejs/config.mk b/arch/arm/cpu/arm926ejs/config.mk index f8ef90f..aa84706 100644 --- a/arch/arm/cpu/arm926ejs/config.mk +++ b/arch/arm/cpu/arm926ejs/config.mk @@ -23,6 +23,11 @@
PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
+ifndef CONFIG_SYS_ARM_WITHOUT_RELOC +# needed for optimal relocation +PLATFORM_RELFLAGS += -msingle-pic-base +endif + PLATFORM_CPPFLAGS += -march=armv5te # ========================================================================= # diff --git a/arch/arm/cpu/arm926ejs/start.S b/arch/arm/cpu/arm926ejs/start.S index 16ee972..739fa06 100644 --- a/arch/arm/cpu/arm926ejs/start.S +++ b/arch/arm/cpu/arm926ejs/start.S @@ -198,9 +198,23 @@ reset: bl cpu_init_crit #endif
-/* Set stackpointer in internal RAM to call board_init_f */ -call_board_init_f: + /* + * Set stack pointer in internal RAM + */ ldr sp, =(CONFIG_SYS_INIT_SP_ADDR) + + /* + * Set pic base register to the current GOT position. At the + * moment this is also our run-time position. + * Set both r10 and r9 because either could be used as pic base + * depending on whether stack checking is off or on. + */ + ldr r10, _got_base + mov r9, r10 + + /* + * Call board_init_f, passing it 0 for bootflag + */ ldr r0,=0x00000000 bl board_init_f
@@ -220,7 +234,9 @@ relocate_code: mov r6, r2 /* save addr of destination */ mov r7, r2 /* save addr of destination */
- /* Set up the stack */ + /* + * Set up the stack + */ stack_setup: mov sp, r4
@@ -282,6 +298,18 @@ clbss_l:str r2, [r0] /* clear loop... */ bl red_LED_on #endif
+ /* + * Set pic base register to the current GOT position. Since we are + * now running from RAM, we need to offset it from its link-time to + * its run-time position. + */ + ldr r9, relocate_got_base_r + sub r9, pc, r9 + ldr r10, _got_base +relocate_got_base_r: + add r10, r9, r10 + mov r9, r10 + /* * We are done. Do not return, instead branch to second part of board * initialization, now running from RAM. @@ -303,6 +331,12 @@ _nand_boot: .word nand_boot mov pc, lr
_board_init_r: .word board_init_r + +_got_base: + .word __got_base +_relocate_got_base_r: + .word relocate_got_base_r + #endif
#else /* #if !defined(CONFIG_SYS_ARM_WITHOUT_RELOC) */ diff --git a/arch/arm/cpu/arm926ejs/u-boot.lds b/arch/arm/cpu/arm926ejs/u-boot.lds index 02eb8ca..b6e21f2 100644 --- a/arch/arm/cpu/arm926ejs/u-boot.lds +++ b/arch/arm/cpu/arm926ejs/u-boot.lds @@ -53,6 +53,7 @@ SECTIONS
__got_start = .; . = ALIGN(4); + __got_base = .; .got : { *(.got) }
__got_end = .;

Bonjour Albert,
On Wed, Sep 22, 2010 at 9:57 AM, Albert Aribaud albert.aribaud@free.fr wrote:
Add -msingle-pic-base to the relocation flags, and compute the pic base in start.S twice and for all -- once before relocation to run board_init_f, and once after relocation to run board_init_r and the rest of u-boot. This further reduces code size by 2.5% compared to -fPIE alone.
Signed-off-by: Albert Aribaud albert.aribaud@free.fr
I tried tested this on da850evm -- the da850evm.h has "#undef CONFIG_SYS_ARM_WITHOUT_RELOC /* This board is tested with relocation support */".
I never got a boot-prompt. I tried mucking with the patch a little and found that if I did:
diff --git a/arch/arm/cpu/arm926ejs/config.mk b/arch/arm/cpu/arm926ejs/config.mk index aa84706..f8ef90f 100644 --- a/arch/arm/cpu/arm926ejs/config.mk +++ b/arch/arm/cpu/arm926ejs/config.mk @@ -23,11 +23,6 @@
PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
-ifndef CONFIG_SYS_ARM_WITHOUT_RELOC -# needed for optimal relocation -PLATFORM_RELFLAGS += -msingle-pic-base -endif - PLATFORM_CPPFLAGS += -march=armv5te # ========================================================================= #
Then I do get a prompt.
The gcc version here is: $arm-none-linux-gnueabi-gcc --version arm-none-linux-gnueabi-gcc (Sourcery G++ Lite 2009q1-203) 4.3.3 Copyright (C) 2008 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
I have a JTAG debugger for this board. Please let me know if there are register dumps or the like that I can get you.
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

Le 22/09/2010 20:05, Ben Gardiner a écrit :
Bonjour Albert,
:) Hi Ben,
On Wed, Sep 22, 2010 at 9:57 AM, Albert Aribaudalbert.aribaud@free.fr wrote:
Add -msingle-pic-base to the relocation flags, and compute the pic base in start.S twice and for all -- once before relocation to run board_init_f, and once after relocation to run board_init_r and the rest of u-boot. This further reduces code size by 2.5% compared to -fPIE alone.
Signed-off-by: Albert Aribaudalbert.aribaud@free.fr
I tried tested this on da850evm -- the da850evm.h has "#undef CONFIG_SYS_ARM_WITHOUT_RELOC /* This board is tested with relocation support */".
I never got a boot-prompt. I tried mucking with the patch a little and found that if I did:
diff --git a/arch/arm/cpu/arm926ejs/config.mk b/arch/arm/cpu/arm926ejs/config.mk index aa84706..f8ef90f 100644 --- a/arch/arm/cpu/arm926ejs/config.mk +++ b/arch/arm/cpu/arm926ejs/config.mk @@ -23,11 +23,6 @@
PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
-ifndef CONFIG_SYS_ARM_WITHOUT_RELOC -# needed for optimal relocation -PLATFORM_RELFLAGS += -msingle-pic-base -endif
- PLATFORM_CPPFLAGS += -march=armv5te # ========================================================================= #
Then I do get a prompt.
The gcc version here is: $arm-none-linux-gnueabi-gcc --version arm-none-linux-gnueabi-gcc (Sourcery G++ Lite 2009q1-203) 4.3.3 Copyright (C) 2008 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
I have a JTAG debugger for this board. Please let me know if there are register dumps or the like that I can get you.
Best Regards, Ben Gardiner
Basically your test seems to demonstrate show that the pic base value computed in start.S does not work for board_init_f.
Did you execute the binary at the address specified in TEXT_BASE? If not, please adjust TEXT_BASE to the location where u-boot resides when you debug it.
Otherwise, can you do the following? This will help me see if the pic base computed in start.S is the same as the one computed by each function without -msingle-pic-base.
1) build with your fix in;
2) debug (at the assembly instruction level) the start.S code and see what value ends up in r10 (aka sl) right before calling board_init_f;
3) proceed (still at the assembly instruction level) until you get within board_init_f. Among the first instructions will be the recomputation of 10/sl; see what value it is assigned;
4) compare values found in 2 and 3 with the value of __got_base in the .map file.
5) Report your findings here. :)
Thanks for your help!
Amicalement,

On Wed, Sep 22, 2010 at 3:07 PM, Albert ARIBAUD albert.aribaud@free.fr wrote:
Basically your test seems to demonstrate show that the pic base value computed in start.S does not work for board_init_f.
Did you execute the binary at the address specified in TEXT_BASE? If not, please adjust TEXT_BASE to the location where u-boot resides when you debug it.
On boot we see printed "Jumping to entry point at 0xC1080000." by UBL.
The debugger agrees: (gdb) info registers r0 0x2710 10000 r1 0x0 0 r2 0x80003478 2147497080 r3 0xc1080000 3238526976 r4 0x80003484 2147497092 r5 0x80003480 2147497088 r6 0xc7fe9760 3355350880 r7 0xc7fbd000 3355168768 r8 0xc7ea8f8c 3354038156 r9 0xa1892306 2710119174 r10 0xa1892306 2710119174 r11 0xc7ea8f84 3354038148 r12 0x30 48 sp 0x800037d0 0x800037d0 lr 0x80000120 2147483936 pc 0xc1080000 0xc1080000 <_start> fps 0x0 0 cpsr 0x600000d3 1610612947 (gdb) l 50 */ 51 52 53 .globl _start 54 _start: 55 b reset 56 #ifdef CONFIG_PRELOADER 57 /* No exception handlers in preloader */ 58 ldr pc, _hang 59 ldr pc, _hang (gdb)
It appears that TEXT_BASE Is 0xc1080040 in the binary though it is specified as 0xc1080000: $arm-none-linux-gnueabi-nm u-boot |grep TEXT c1080040 T _TEXT_BASE $cat board/davinci/da8xxevm/config.mk |grep TEXT TEXT_BASE = 0xC1080000
but I guess that's OK since _TEXT_BASE is a label at start.S:118 ...
I'm not sure if it is relevant but note that the da850 is a board for which we have "#undef CONFIG_SKIP_RELOCATE_UBOOT"
Otherwise, can you do the following? This will help me see if the pic base computed in start.S is the same as the one computed by each function without -msingle-pic-base.
build with your fix in;
debug (at the assembly instruction level) the start.S code and see what
value ends up in r10 (aka sl) right before calling board_init_f;
(gdb) info registers r0 0x0 0 r1 0x0 0 r2 0x80003478 2147497080 r3 0xc1080000 3238526976 r4 0x80003484 2147497092 r5 0x80003480 2147497088 r6 0xc7fe9ba0 3355351968 r7 0xc7fbb000 3355160576 r8 0xc7ea6f8c 3354029964 r9 0xc10ae600 3238716928 r10 0xc10ae600 3238716928 r11 0xc7ea6d3c 3354029372 r12 0x30 48 sp 0xc0000f80 0xc0000f80 lr 0x80000120 2147483936 pc 0xc1080088 0xc1080088 <reset+32> fps 0x0 0 cpsr 0x600000d3 1610612947 (gdb)
- proceed (still at the assembly instruction level) until you get within
board_init_f. Among the first instructions will be the recomputation of 10/sl; see what value it is assigned;
I couldn't see any modification of r10 in board_init_f; what follows is the assembly of the first instructions of that function: c1080730 <board_init_f>: c1080730: e92d4800 push {fp, lr} c1080734: e28db004 add fp, sp, #4 ; 0x4 c1080738: e24dd020 sub sp, sp, #32 ; 0x20 c108073c: e59f21e8 ldr r2, [pc, #488] ; c108092c <board_init_f+0x1fc> c1080740: e50b2024 str r2, [fp, #-36] c1080744: e51b3024 ldr r3, [fp, #-36] c1080748: e08f3003 add r3, pc, r3 c108074c: e50b3024 str r3, [fp, #-36] c1080750: e50b0020 str r0, [fp, #-32] c1080754: e59f81d4 ldr r8, [pc, #468] ; c1080930 <board_init_f+0x200> c1080758: e1a03008 mov r3, r8 c108075c: e1a00003 mov r0, r3 c1080760: e3a01000 mov r1, #0 ; 0x0 c1080764: e3a0205c mov r2, #92 ; 0x5c c1080768: eb007a1c bl c109efe0 <memset> c108076c: e1a01008 mov r1, r8 c1080770: e59f31bc ldr r3, [pc, #444] ; c1080934 <board_init_f+0x204> c1080774: e51b0024 ldr r0, [fp, #-36] c1080778: e7903003 ldr r3, [r0, r3] c108077c: e5932000 ldr r2, [r3] c1080780: e59f31b0 ldr r3, [pc, #432] ; c1080938 <board_init_f+0x208> c1080784: e51b0024 ldr r0, [fp, #-36] c1080788: e7903003 ldr r3, [r0, r3] c108078c: e5933000 ldr r3, [r3] c1080790: e0633002 rsb r3, r3, r2 c1080794: e5813024 str r3, [r1, #36] c1080798: e59f319c ldr r3, [pc, #412] ; c108093c <board_init_f+0x20c> c108079c: e51b2024 ldr r2, [fp, #-36] c10807a0: e0823003 add r3, r2, r3 c10807a4: e50b3014 str r3, [fp, #-20] c10807a8: ea000009 b c10807d4 <board_init_f+0xa4> c10807ac: e51b3014 ldr r3, [fp, #-20] c10807b0: e5933000 ldr r3, [r3] c10807b4: e12fff33 blx r3 c10807b8: e1a03000 mov r3, r0 c10807bc: e3530000 cmp r3, #0 ; 0x0 c10807c0: 0a000000 beq c10807c8 <board_init_f+0x98> c10807c4: eb0000d3 bl c1080b18 <hang> c10807c8: e51b3014 ldr r3, [fp, #-20] c10807cc: e2833004 add r3, r3, #4 ; 0x4 c10807d0: e50b3014 str r3, [fp, #-20] c10807d4: e51b3014 ldr r3, [fp, #-20] c10807d8: e5933000 ldr r3, [r3] c10807dc: e3530000 cmp r3, #0 ; 0x0 c10807e0: 1afffff1 bne c10807ac <board_init_f+0x7c> c10807e4: e1a03008 mov r3, r8 c10807e8: e5933020 ldr r3, [r3, #32] c10807ec: e2833103 add r3, r3, #-1073741824 ; 0xc0000000 c10807f0: e50b300c str r3, [fp, #-12] c10807f4: e51b300c ldr r3, [fp, #-12] c10807f8: e2433901 sub r3, r3, #16384 ; 0x4000 c10807fc: e50b300c str r3, [fp, #-12] c1080800: e51b300c ldr r3, [fp, #-12] c1080804: e1a03823 lsr r3, r3, #16 c1080808: e1a03803 lsl r3, r3, #16 c108080c: e50b300c str r3, [fp, #-12] c1080810: e1a02008 mov r2, r8 c1080814: e51b300c ldr r3, [fp, #-12] c1080818: e5823034 str r3, [r2, #52] c108081c: e51b300c ldr r3, [fp, #-12] c1080820: e3c33eff bic r3, r3, #4080 ; 0xff0 c1080824: e3c3300f bic r3, r3, #15 ; 0xf
just before board_init_f calls relocate_code the registers are as follows: (gdb) info registers r0 0xc7ea6f8c 3354029964 r1 0xc0000f80 3221229440 r2 0x0 0 r3 0xc7ea6f8c 3354029964 r4 0x80003484 2147497092 r5 0x80003480 2147497088 r6 0xc7fe9ba0 3355351968 r7 0xc7fbb000 3355160576 r8 0xc0000f80 3221229440 r9 0xc10ae600 3238716928 r10 0xc10ae600 3238716928 r11 0xc0000f7c 3221229436 r12 0xc10ae600 3238716928 sp 0xc0000f58 0xc0000f58 lr 0xc108091c 3238529308 pc 0xc108091c 0xc108091c <board_init_f+492> fps 0x0 0 cpsr 0x600000d3 1610612947
- compare values found in 2 and 3 with the value of __got_base in the .map
file.
In 2 and 3 r10 was 0xc10ae600; the System.map shows: $cat System.map |grep got_base c1080150 t relocate_got_base_r c108017c t _got_base c1080180 t _relocate_got_base_r c10ae600 A __got_base
Thanks for your help!
My pleasure.
Just for the sake of details: without the removal of the -msingle-pic-base I have the following register contents just before 'bl board_init_f' (start.S:219): (gdb) info registers r0 0x0 0 r1 0x0 0 r2 0x80003478 2147497080 r3 0xc1080000 3238526976 r4 0x80003484 2147497092 r5 0x80003480 2147497088 r6 0xc7fe9760 3355350880 r7 0xc7fbd000 3355168768 r8 0xc7ea8f8c 3354038156 r9 0xc10ac1c0 3238707648 r10 0xc10ac1c0 3238707648 r11 0xc7ea8f84 3354038148 r12 0x30 48 sp 0xc0000f80 0xc0000f80 lr 0x80000120 2147483936 pc 0xc1080088 0xc1080088 <reset+32> fps 0x0 0 cpsr 0x600000d3 1610612947 (gdb)
and the System.map shows: $cat System.map |grep got_base c1080150 t relocate_got_base_r c108017c t _got_base c1080180 t _relocate_got_base_r c10ac1c0 A __got_base
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

As for _TEXT_BASE vs TEXT_BASE, this is normal: TEXT_BASE is the image base, and _TEXT_BASE is the address of a variable holding the value of TEXT_BASE, so it's not surprising that they are not equal.
As for your problem: your reports show that start.S sets r10 and r9 as expected, but that board_init_f does not contain the pic prolog code. I wonder if you may have built it without -fPIC or -fPIE. Normally with -fPIE/PIC you should have seen board_init_f (and any function) start with something like:
c1080690: e59fa120 ldr sl, [pc, #288] ; c10807b8 <board_init_f+0x128> c1080694: e08fa00a add sl, pc, sl
Can you check the build logs to see which options were applied to lib/arm/board.c?
I'll try with your CodeSourcery toolchain (I'm using the ELDK 4.2 one), but there is no reason why it would not be EABI compliant.
Amicalement,

I've just recompiled with the CodeSourcery toolchain and I can see the board_init_f prolog going thus:
c1080688: e59fa118 ldr sl, [pc, #280] ; c10807a8 <board_init_f+0x120>
c108068c: e59f8118 ldr r8, [pc, #280] ; c10807ac <board_init_f+0x124>
c1080690: e08fa00a add sl, pc, sl
It is scattered, but it is there.
Also, I see board_init_f starting at c108064c. Ben, that's a different address from what you see in your map file, but maybe we're not using the same exact source tree? I tested with the current 'next' branch plus my two patches.
Amicalement,

On Wed, Sep 22, 2010 at 5:36 PM, Albert ARIBAUD albert.aribaud@free.fr wrote:
Can you check the build logs to see which options were applied to lib/arm/board.c?
make[1]: Entering directory `[...]/arch/arm/lib' arm-none-linux-gnueabi-gcc -g -Os -fPIE -fno-common -ffixed-r8 -msoft-float -msingle-pic-base -fno-common -ffixed-r8 -msoft-float -D__KERNEL__ -DTEXT_BASE=0xC1080000 -I[...]/include -fno-builtin -ffreestanding -nostdinc -isystem /opt/codesourcery-arm-none-eabi-2009q1/bin/../lib/gcc/arm-none-linux-gnueabi/4.3.3/include -pipe -DCONFIG_ARM -D__ARM__ -marm -mabi=aapcs-linux -mno-thumb-interwork -march=armv5te -march=armv5te -Wall -Wstrict-prototypes -fno-stack-protector \ -o board.o board.c -c
On Wed, Sep 22, 2010 at 6:07 PM, Albert ARIBAUD albert.aribaud@free.fr wrote:
I've just recompiled with the CodeSourcery toolchain and I can see the board_init_f prolog going thus:
c1080688: e59fa118 ldr sl, [pc, #280] ; c10807a8 <board_init_f+0x120>
c108068c: e59f8118 ldr r8, [pc, #280] ; c10807ac <board_init_f+0x124>
c1080690: e08fa00a add sl, pc, sl
It is scattered, but it is there.
Yes, I see them now that you point it out; but they are not pc-relative for here; c10806a4: e514a004 ldr sl, [r4, #-4] c10806a8: e35a0000 cmp sl, #0 ; 0x0 ...
What version of that compiler did you try? I tried both the 2009q1 and 2009q3 compilers.
Also, I see board_init_f starting at c108064c. Ben, that's a different address from what you see in your map file, but maybe we're not using the same exact source tree? I tested with the current 'next' branch plus my two patches.
I had applied your patches to d70d8ccc200db8c16a6654cb726c3d74b6640b32.
I rebased your patches onto e69e520f9d235bb7d96296081fdfc09b9fee8c46 this morning and I now have "c108064c <board_init_f>:" same as you. But I still do not get a boot prompt.
I am downloading the arm-2008-11-24.iso for ELDK-4.2 to try to reproduce your results.
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

23/09/2010 16:44, Ben Gardiner a écrit :
On Wed, Sep 22, 2010 at 5:36 PM, Albert ARIBAUDalbert.aribaud@free.fr wrote:
Can you check the build logs to see which options were applied to lib/arm/board.c?
make[1]: Entering directory `[...]/arch/arm/lib' arm-none-linux-gnueabi-gcc -g -Os -fPIE -fno-common -ffixed-r8 -msoft-float -msingle-pic-base -fno-common -ffixed-r8 -msoft-float -D__KERNEL__ -DTEXT_BASE=0xC1080000 -I[...]/include -fno-builtin -ffreestanding -nostdinc -isystem /opt/codesourcery-arm-none-eabi-2009q1/bin/../lib/gcc/arm-none-linux-gnueabi/4.3.3/include -pipe -DCONFIG_ARM -D__ARM__ -marm -mabi=aapcs-linux -mno-thumb-interwork -march=armv5te -march=armv5te -Wall -Wstrict-prototypes -fno-stack-protector \ -o board.o board.c -c
This command line has -msingle-pic-base; that's why there is no pic recomputation in board_init_f. Looks like you compiled with my patches unaltered, whereas I thought you were compiling with your fix above m patches. Can you either apply your fix above my patches or just compile with the first of my patches?
What version of that compiler did you try? I tried both the 2009q1 and 2009q3 compilers.
For this case I downloaded and used the same 2009q1 you're using.
Amicalement,

On Thu, Sep 23, 2010 at 11:13 AM, Albert ARIBAUD albert.aribaud@free.fr wrote:
This command line has -msingle-pic-base; that's why there is no pic recomputation in board_init_f. Looks like you compiled with my patches unaltered, whereas I thought you were compiling with your fix above m patches. Can you either apply your fix above my patches or just compile with the first of my patches?
Right. I forgot about that part. I also now recognize the prologue to which you were referring. I will complete the testing originally requested.
What version of that compiler did you try? I tried both the 2009q1 and 2009q3 compilers.
For this case I downloaded and used the same 2009q1 you're using.
Ok. Thank you for clarifying.
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

On Thu, Sep 23, 2010 at 10:44 AM, Ben Gardiner bengardiner@nanometrics.ca wrote:
I am downloading the arm-2008-11-24.iso for ELDK-4.2 to try to reproduce your results.
When I build u-boot/next plus your patches using the ELDK 4.2 arm toolchain I encounter the same problem as before; there is no boot prompt.
Please let me know if there is any other debugging information I can get for you.
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

On Wed, Sep 22, 2010 at 3:07 PM, Albert ARIBAUD albert.aribaud@free.fr wrote:
- build with your fix in;
$git log --format=oneline|head -n 3 f619c1537af105cd1471f9447ac9132feab79c3a arm926ejs: reduce code size with -msingle-pic-base 6285f63589a87dfd28c7d73ff68075c5d1ee7f35 arm: change relocation flag from -fPIC to -fPIE e69e520f9d235bb7d96296081fdfc09b9fee8c46 fsl: refactor MPC8610 and MPC5121 DIU code to use existing bitmap and logo features
$git diff diff --git a/arch/arm/cpu/arm926ejs/config.mk b/arch/arm/cpu/arm926ejs/config.mk index aa84706..f8ef90f 100644 --- a/arch/arm/cpu/arm926ejs/config.mk +++ b/arch/arm/cpu/arm926ejs/config.mk @@ -23,11 +23,6 @@
PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
-ifndef CONFIG_SYS_ARM_WITHOUT_RELOC -# needed for optimal relocation -PLATFORM_RELFLAGS += -msingle-pic-base -endif - PLATFORM_CPPFLAGS += -march=armv5te # ========================================================================= #
$make mrproper; make da850evm_config; make -j9 all [...] make[1]: Entering directory `[...]arch/arm/lib' arm-none-linux-gnueabi-gcc -g -Os -fPIE -fno-common -ffixed-r8 -msoft-float -fno-common -ffixed-r8 -msoft-float -D__KERNEL__ -DTEXT_BASE=0xC1080000 -I[...]/include -fno-builtin -ffreestanding -nostdinc -isystem /opt/codesourcery-arm-none-eabi-2009q1/bin/../lib/gcc/arm-none-linux-gnueabi/4.3.3/include -pipe -DCONFIG_ARM -D__ARM__ -marm -mabi=aapcs-linux -mno-thumb-interwork -march=armv5te -march=armv5te -Wall -Wstrict-prototypes -fno-stack-protector \ -o board.o board.c -c [...]
- debug (at the assembly instruction level) the start.S code and see what
value ends up in r10 (aka sl) right before calling board_init_f;
(gdb) p /x $r10 $1 = 0xc1098564 (gdb) p /x $pc $2 = 0xc1080088
- proceed (still at the assembly instruction level) until you get within
board_init_f. Among the first instructions will be the recomputation of 10/sl; see what value it is assigned;
(gdb) p /x $r10 $5 = 0xc1098564 (gdb) p /x $pc $6 = 0xc1080694 (gdb)
- compare values found in 2 and 3 with the value of __got_base in the .map
file.
$cat System.map |grep __got_base c1098564 A __got_base
I hope that helps.
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

Le 23/09/2010 18:37, Ben Gardiner a écrit :
On Wed, Sep 22, 2010 at 3:07 PM, Albert ARIBAUDalbert.aribaud@free.fr wrote:
- build with your fix in;
$git log --format=oneline|head -n 3 f619c1537af105cd1471f9447ac9132feab79c3a arm926ejs: reduce code size with -msingle-pic-base 6285f63589a87dfd28c7d73ff68075c5d1ee7f35 arm: change relocation flag from -fPIC to -fPIE e69e520f9d235bb7d96296081fdfc09b9fee8c46 fsl: refactor MPC8610 and MPC5121 DIU code to use existing bitmap and logo features
$git diff diff --git a/arch/arm/cpu/arm926ejs/config.mk b/arch/arm/cpu/arm926ejs/config.mk index aa84706..f8ef90f 100644 --- a/arch/arm/cpu/arm926ejs/config.mk +++ b/arch/arm/cpu/arm926ejs/config.mk @@ -23,11 +23,6 @@
PLATFORM_RELFLAGS += -fno-common -ffixed-r8 -msoft-float
-ifndef CONFIG_SYS_ARM_WITHOUT_RELOC -# needed for optimal relocation -PLATFORM_RELFLAGS += -msingle-pic-base -endif
- PLATFORM_CPPFLAGS += -march=armv5te # ========================================================================= #
$make mrproper; make da850evm_config; make -j9 all [...] make[1]: Entering directory `[...]arch/arm/lib' arm-none-linux-gnueabi-gcc -g -Os -fPIE -fno-common -ffixed-r8 -msoft-float -fno-common -ffixed-r8 -msoft-float -D__KERNEL__ -DTEXT_BASE=0xC1080000 -I[...]/include -fno-builtin -ffreestanding -nostdinc -isystem /opt/codesourcery-arm-none-eabi-2009q1/bin/../lib/gcc/arm-none-linux-gnueabi/4.3.3/include -pipe -DCONFIG_ARM -D__ARM__ -marm -mabi=aapcs-linux -mno-thumb-interwork -march=armv5te -march=armv5te -Wall -Wstrict-prototypes -fno-stack-protector \ -o board.o board.c -c [...]
- debug (at the assembly instruction level) the start.S code and see what
value ends up in r10 (aka sl) right before calling board_init_f;
(gdb) p /x $r10 $1 = 0xc1098564 (gdb) p /x $pc $2 = 0xc1080088
- proceed (still at the assembly instruction level) until you get within
board_init_f. Among the first instructions will be the recomputation of 10/sl; see what value it is assigned;
(gdb) p /x $r10 $5 = 0xc1098564 (gdb) p /x $pc $6 = 0xc1080694 (gdb)
- compare values found in 2 and 3 with the value of __got_base in the .map
file.
$cat System.map |grep __got_base c1098564 A __got_base
I hope that helps.
Well, it confirms that the start.S code computes a correct r10 / sl register, since it is the same value as board_init_f (and any other function) computes when -msingle-pic-base is not present, and this value is that of __got_base.
Note that if I'm not mistaken, your build uses r9 as the pic base, not 10 -- this is a possibility, and my patches are written so as to allow either register use.
Now, we need to find out why board_init_f fails to go through the init sequence and print out at least the banner and some diagnostics.
Please build with my two patches alone (i.e., put -msingle-pic-base back in place, which will remove the recomputation of the pic base in every function), and run it until board_init_f() calls its first init function through (*init_fnc_ptr)(). This call should be materialized by a 'blx <reg>' instruction; please report the value of <reg> at that point as well as the addresses of arch_cpu_init, board_early_init_f, and timer_init; <reg> should be one of these depending on the board config.
Best Regards, Ben Gardiner
Thanks again for your help in testing my patches!
Amicalement,

On Thu, Sep 23, 2010 at 1:04 PM, Albert ARIBAUD albert.aribaud@free.fr wrote:
Note that if I'm not mistaken, your build uses r9 as the pic base, not 10 -- this is a possibility, and my patches are written so as to allow either register use.
I don't know for sure so I did a little poking around:
c1080688 <board_init_f>: c1080688: e59fa118 ldr sl, [pc, #280] ; c10807a8 <board_init_f+0x120> c108068c: e59f8118 ldr r8, [pc, #280] ; c10807ac <board_init_f+0x124>
(gdb) info registers [...] r9 0xc1098564 3238626660 r10 0xc1098564 3238626660 [...] pc 0xc1080688 0xc1080688 <board_init_f> [...] (gdb) ni 514 gd = (gd_t *) (CONFIG_SYS_INIT_SP_ADDR); (gdb) info registers [...] r9 0xc1098564 3238626660 r10 0x17ecc 97996 [...] pc 0xc108068c 0xc108068c <board_init_f+4> [...] (gdb) p /x $sl $1 = 0x17ecc (gdb) p /x $r10 $2 = 0x17ecc (gdb) p /x $r9 $3 = 0xc1098564 (gdb) p *0xc10807a8 $4 = 97996 (gdb) p /x *0xc10807a8 $5 = 0x17ecc
I think that confirms that sl == r10 .
Now, we need to find out why board_init_f fails to go through the init sequence and print out at least the banner and some diagnostics.
Please build with my two patches alone (i.e., put -msingle-pic-base back in place, which will remove the recomputation of the pic base in every function), and run it until board_init_f() calls its first init function through (*init_fnc_ptr)(). This call should be materialized by a 'blx <reg>' instruction; please report the value of <reg> at that point as well as the addresses of arch_cpu_init, board_early_init_f, and timer_init; <reg> should be one of these depending on the board config.
<reg> is r10/sl :
for (init_fnc_ptr = init_sequence; *init_fnc_ptr; ++init_fnc_ptr) { if ((*init_fnc_ptr)() != 0) { c1080690: e12fff3a blx sl c1080694: e2844004 add r4, r4, #4 ; 0x4 c1080698: e3500000 cmp r0, #0 ; 0x0 c108069c: 0a000000 beq c10806a4 <board_init_f+0x58> hang (); c10806a0: ebffff76 bl c1080480 <hang>
Here's the addresses you requested for the very first time it hit that function call through (*init_fnc_prt)(): (gdb) p /x $pc $4 = 0xc1080690 (gdb) p /x $sl $5 = 0xc1091e6c (gdb) p /x $r10 $6 = 0xc1091e6c
There is no arch_cpu_init or board_early_init_f; timer_init is at 0xc1091e6c. So it is timer_init which is being called above.
The call to timer_init completes successfully; the next function pointer dereferenced and called is 0xc1087e04 == env_init. That call completes successfully. The next is 0xc10804f8 == init_baudrate; it completes successfully. The next call is 0xc108119c == serial_init; it completes successfully. The next is 0xc1086550 == console_init_f; it completes successfully. The next is 0xc10804d0 == display_banner; it completes successfully. They all did. Execution reaches the call to relocate_code and enters; the last instruction I am able to single-step is the 'ldr r10, _got_base' in the following lines from your patch:
/*
* Set pic base register to the current GOT position. Since we are
* now running from RAM, we need to offset it from its link-time to
* its run-time position.
*/
ldr r9, relocate_got_base_r
sub r9, pc, r9
ldr r10, _got_base
+relocate_got_base_r:
add r10, r9, r10
mov r9, r10
Thanks again for your help in testing my patches!
My pleasure. I hope we can get them working on the da850evm.
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

Le 23/09/2010 23:13, Ben Gardiner a écrit :
On Thu, Sep 23, 2010 at 1:04 PM, Albert ARIBAUDalbert.aribaud@free.fr wrote:
Note that if I'm not mistaken, your build uses r9 as the pic base, not 10 -- this is a possibility, and my patches are written so as to allow either register use.
I don't know for sure so I did a little poking around:
c1080688<board_init_f>: c1080688: e59fa118 ldr sl, [pc, #280] ; c10807a8 <board_init_f+0x120> c108068c: e59f8118 ldr r8, [pc, #280] ; c10807ac <board_init_f+0x124>
(gdb) info registers [...] r9 0xc1098564 3238626660 r10 0xc1098564 3238626660 [...] pc 0xc1080688 0xc1080688<board_init_f> [...] (gdb) ni 514 gd = (gd_t *) (CONFIG_SYS_INIT_SP_ADDR); (gdb) info registers [...] r9 0xc1098564 3238626660 r10 0x17ecc 97996 [...] pc 0xc108068c 0xc108068c<board_init_f+4> [...] (gdb) p /x $sl $1 = 0x17ecc (gdb) p /x $r10 $2 = 0x17ecc (gdb) p /x $r9 $3 = 0xc1098564 (gdb) p *0xc10807a8 $4 = 97996 (gdb) p /x *0xc10807a8 $5 = 0x17ecc
I think that confirms that sl == r10 .
Watch out: 'sl' is always 'r10': those are alias names of the register; I specify both names because depending on the tool you use, you might see either name. But the pic base, i.e. the register which holds the address of the GOT, is not necessarily sl! It can can be r10 *or* r9, depending on the toolchain and options.
Here your excepts show that r9 remains constant (and, I suspect, its value is the address of the symbol __got_base) while r10 varies; r9 is the pic base. Which is lucky because r10 is trashed by the loop that goes through init_sequence and runs each of the functions it points to.
There is no arch_cpu_init or board_early_init_f; timer_init is at 0xc1091e6c. So it is timer_init which is being called above.
The call to timer_init completes successfully; the next function pointer dereferenced and called is 0xc1087e04 == env_init. That call completes successfully. The next is 0xc10804f8 == init_baudrate; it completes successfully. The next call is 0xc108119c == serial_init; it completes successfully. The next is 0xc1086550 == console_init_f; it completes successfully. The next is 0xc10804d0 == display_banner; it completes successfully. They all did.
Ok, so all this goes fine, yet you do not see the banner displayed?
Execution reaches the call to relocate_code and enters; the last instruction I am able to single-step is the 'ldr r10, _got_base' in the following lines from your patch:
What happens when you single-step through it? Any kind of error message, or does the target simply not respond any more?
Thanks again for your help in testing my patches!
My pleasure. I hope we can get them working on the da850evm.
We certainly will. For me to reproduce your SW setting and build exactly the same binary as you do, can you confirm that you are again using the 2009q1 toolchain, and indicate the exact command line you use to build your target?
Amicalement,

On Thu, Sep 23, 2010 at 5:30 PM, Albert ARIBAUD albert.aribaud@free.fr wrote:
Watch out: 'sl' is always 'r10': those are alias names of the register; I specify both names because depending on the tool you use, you might see either name. But the pic base, i.e. the register which holds the address of the GOT, is not necessarily sl! It can can be r10 *or* r9, depending on the toolchain and options.
Here your excepts show that r9 remains constant (and, I suspect, its value is the address of the symbol __got_base) while r10 varies; r9 is the pic base. Which is lucky because r10 is trashed by the loop that goes through init_sequence and runs each of the functions it points to.
Ok :) So i showed that the alias is true. :) -- Thanks for the info, I'll need to do some research into what the pic base is with this toolchain.
There is no arch_cpu_init or board_early_init_f; timer_init is at 0xc1091e6c. So it is timer_init which is being called above.
The call to timer_init completes successfully; the next function pointer dereferenced and called is 0xc1087e04 == env_init. That call completes successfully. The next is 0xc10804f8 == init_baudrate; it completes successfully. The next call is 0xc108119c == serial_init; it completes successfully. The next is 0xc1086550 == console_init_f; it completes successfully. The next is 0xc10804d0 == display_banner; it completes successfully. They all did.
Ok, so all this goes fine, yet you do not see the banner displayed?
Correct.
Execution reaches the call to relocate_code and enters; the last instruction I am able to single-step is the 'ldr r10, _got_base' in the following lines from your patch:
What happens when you single-step through it? Any kind of error message, or does the target simply not respond any more?
No output on the serial console or from the debugger software. As you say, the target simply does not respond anymore.
Thanks again for your help in testing my patches!
My pleasure. I hope we can get them working on the da850evm.
We certainly will. For me to reproduce your SW setting and build exactly the same binary as you do, can you confirm that you are again using the 2009q1 toolchain, and indicate the exact command line you use to build your target?
It was the 2009q1 toolchain.
export CROSS_COMPILE=arm-none-linux-gnueabi- export PATH=/opt/codesourcery-arm-none-eabi-2009q1/bin/:${PATH} export ARCH=arm make mrproper ; make da850evm_config ; make -j9 all
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

I think I have found the root cause of your issue.
I was actually lucky that it worked at all for me; and as Heiko's board skipped relocation, it would not trigger the bug either... My computing of r9/r10 in the relocate_code stage was plain wrong, based on the FLASH execution location, not the RAM one. :/
I'll post a V2 patch set within the next hour -- Heiko, this second patchset has your fixes except I mouved the r9/r10 recomputation inside the #ifndef CONFIG_SKIP_RELOCATE_UBOOT, the logic being that if we can skip relocation, then we are already running at the right location and we don't need fixing the pic base either.
Ben and Heiko, please (re-)test the patch when it's out -- sorry for the extra workload.
Amicalement,

Dear Ben Gardiner,
In message AANLkTinK1PWePkBaBHEnxcyV6eOwtjZb-7+kA-JhHM9Z@mail.gmail.com you wrote:
I tried tested this on da850evm -- the da850evm.h has "#undef CONFIG_SYS_ARM_WITHOUT_RELOC /* This board is tested with relocation support */".
This slipped through, sorry for that.
That line must be removed.
Best regards,
Wolfgang Denk

Hi Wolfgang,
On Wed, Sep 22, 2010 at 4:30 PM, Wolfgang Denk wd@denx.de wrote:
Dear Ben Gardiner,
In message AANLkTinK1PWePkBaBHEnxcyV6eOwtjZb-7+kA-JhHM9Z@mail.gmail.com you wrote:
I tried tested this on da850evm -- the da850evm.h has "#undef CONFIG_SYS_ARM_WITHOUT_RELOC /* This board is tested with relocation support */".
This slipped through, sorry for that.
That line must be removed.
I don't understand; I thought that '#undef' _enabled_ relocation support. According to doc/README.arm-relocation on u-boot./next:
To compile a board without relocation, define CONFIG_SYS_ARM_WITHOUT_RELOC This possibility will removed!! So please fix your board to compile without CONFIG_SYS_ARM_WITHOUT_RELOC defined!!!
But I did test relocation on this board with Heiko and it is working as-is on d70d8ccc200db8c16a6654cb726c3d74b6640b32 of u-boot/next; we found that 97003756249bd790910417eb66f0039bbf06a02c made it work.
I can confirm that -- as you suggest -- the following applied to u-boot/next diff --git a/include/configs/da850evm.h b/include/configs/da850evm.h index d02b196..e0a3bae 100644 --- a/include/configs/da850evm.h +++ b/include/configs/da850evm.h @@ -138,7 +138,6 @@ #endif
/* additions for new relocation code, must added to all boards */ -#undef CONFIG_SYS_ARM_WITHOUT_RELOC /* This board is tested with relocation support */ #define CONFIG_SYS_SDRAM_BASE 0xc0000000 #define CONFIG_SYS_INIT_SP_ADDR (CONFIG_SYS_SDRAM_BASE + 0x1000 - /* Fix this */ \ CONFIG_SYS_GBL_DATA_SIZE)
also works on da850evm. But isn't this disabling the recently added relocation support?
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

Dear Ben Gardiner,
In message AANLkTinLq9XStXA=6w1BLKwCrDuXAPYpK1X3XW5MruW7@mail.gmail.com you wrote:
I tried tested this on da850evm -- the da850evm.h has "#undef CONFIG_SYS_ARM_WITHOUT_RELOC /* This board is tested with relocation support */".
This slipped through, sorry for that.
That line must be removed.
I don't understand; I thought that '#undef' _enabled_ relocation support. According to doc/README.arm-relocation on u-boot./next:
We do not #undef what is not #defined in the first place.
To compile a board without relocation, define CONFIG_SYS_ARM_WITHOUT_RELOC This possibility will removed!! So please fix your board to compile without CONFIG_SYS_ARM_WITHOUT_RELOC defined!!!
If your board is fixed, then there is no need to build it with CONFIG_SYS_ARM_WITHOUT_RELOC set.
Best regards,
Wolfgang Denk

On Wed, Sep 22, 2010 at 5:11 PM, Wolfgang Denk wd@denx.de wrote:
Dear Ben Gardiner,
In message AANLkTinLq9XStXA=6w1BLKwCrDuXAPYpK1X3XW5MruW7@mail.gmail.com you wrote:
I tried tested this on da850evm -- the da850evm.h has "#undef CONFIG_SYS_ARM_WITHOUT_RELOC /* This board is tested with relocation support */".
This slipped through, sorry for that.
That line must be removed.
I don't understand; I thought that '#undef' _enabled_ relocation support. According to doc/README.arm-relocation on u-boot./next:
We do not #undef what is not #defined in the first place.
:) Good point. Patch is coming.
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca

Hello Albert,
Albert Aribaud wrote:
Add -msingle-pic-base to the relocation flags, and compute the pic base in start.S twice and for all -- once before relocation to run board_init_f, and once after relocation to run board_init_r and the rest of u-boot. This further reduces code size by 2.5% compared to -fPIE alone.
Signed-off-by: Albert Aribaud albert.aribaud@free.fr
arch/arm/cpu/arm926ejs/config.mk | 5 ++++ arch/arm/cpu/arm926ejs/start.S | 40 ++++++++++++++++++++++++++++++++++-- arch/arm/cpu/arm926ejs/u-boot.lds | 1 + 3 files changed, 43 insertions(+), 3 deletions(-)
I get with your 2 patches for the tx25 board:
[hs@pollux u-boot]$ ./MAKEALL tx25 Configuring for tx25 board... /home/hs/celf/u-boot/arch/arm/cpu/arm926ejs/start.S: Assembler messages: /home/hs/celf/u-boot/arch/arm/cpu/arm926ejs/start.S:212: Error: internal_relocation (type: OFFSET_IMM) not fixed up /home/hs/celf/u-boot/arch/arm/cpu/arm926ejs/start.S:308: Error: internal_relocation (type: OFFSET_IMM) not fixed up make[1]: *** [start.o] Fehler 1 make: *** [nand_spl] Fehler 2 make: *** Warte auf noch nicht beendete Prozesse... make: INTERNAL: Exiting with 4 jobserver tokens available; should be 3! arm-linux-size: './u-boot': No such file
--------------------- SUMMARY ---------------------------- Boards compiled: 1 Boards with warnings or errors: 1 ( tx25 ) ---------------------------------------------------------- [hs@pollux u-boot]$
below fix fixes this issue :-)
I had to define _got_base and _relocate_got_base_r also for the NAND_SPL case, and don;t need to fix up the "pic base register", if we are at the right position.
I think we(you ;-) should add my below fix to your
[U-Boot] [PATCH 2/2] [NEXT] arm926ejs: reduce code size with -msingle-pic-base
and send a second version of it, so we are bisect compatible.
you can add my
Signed-off-by: Heiko Schocher hs@denx.de
to it.
Can you test this also on your board? Thanks!
actual next build for the tx25 board:
[hs@pollux u-boot]$ ./MAKEALL tx25 Configuring for tx25 board... text data bss dec hex filename 149540 9368 36980 195888 2fd30 ./u-boot
with your 2 patches, and my below fix:
[U-Boot] [PATCH 1/2] [NEXT] arm: change relocation flag from -fPIC to -fPIE [U-Boot] [PATCH 2/2] [NEXT] arm926ejs: reduce code size with -msingle-pic-base
[hs@pollux u-boot]$ ./MAKEALL tx25 Configuring for tx25 board... text data bss dec hex filename 146172 9172 36980 192324 2ef44 ./u-boot
So, this saves for this board 0xdec bytes!
-----------------------------------------------------------------------
here now the fix for the tx25 board:
diff --git a/arch/arm/cpu/arm926ejs/start.S b/arch/arm/cpu/arm926ejs/start.S index 739fa06..0f0a503 100644 --- a/arch/arm/cpu/arm926ejs/start.S +++ b/arch/arm/cpu/arm926ejs/start.S @@ -277,6 +277,18 @@ fixloop: #endif #endif /* #ifndef CONFIG_SKIP_RELOCATE_UBOOT */
+ /* + * Set pic base register to the current GOT position. Since we are + * now running from RAM, we need to offset it from its link-time to + * its run-time position. + */ + ldr r9, relocate_got_base_r + sub r9, pc, r9 + ldr r10, _got_base +relocate_got_base_r: + add r10, r9, r10 + mov r9, r10 + clear_bss: #ifndef CONFIG_PRELOADER ldr r0, _bss_start @@ -298,18 +310,6 @@ clbss_l:str r2, [r0] /* clear loop... */ bl red_LED_on #endif
- /* - * Set pic base register to the current GOT position. Since we are - * now running from RAM, we need to offset it from its link-time to - * its run-time position. - */ - ldr r9, relocate_got_base_r - sub r9, pc, r9 - ldr r10, _got_base -relocate_got_base_r: - add r10, r9, r10 - mov r9, r10 - /* * We are done. Do not return, instead branch to second part of board * initialization, now running from RAM. @@ -331,14 +331,13 @@ _nand_boot: .word nand_boot mov pc, lr
_board_init_r: .word board_init_r +#endif
_got_base: .word __got_base _relocate_got_base_r: .word relocate_got_base_r
-#endif - #else /* #if !defined(CONFIG_SYS_ARM_WITHOUT_RELOC) */ /* * the actual reset code diff --git a/board/karo/tx25/config.mk b/board/karo/tx25/config.mk index 51ca1ab..1a32c87 100644 --- a/board/karo/tx25/config.mk +++ b/board/karo/tx25/config.mk @@ -1,5 +1,5 @@ ifdef CONFIG_NAND_SPL TEXT_BASE = 0x810c0000 else -TEXT_BASE = 0x81fc0000 +TEXT_BASE = 0x81fc1000 endif diff --git a/include/configs/tx25.h b/include/configs/tx25.h index 7faa453..dfe8eab 100644 --- a/include/configs/tx25.h +++ b/include/configs/tx25.h @@ -41,7 +41,7 @@ #define CONFIG_SYS_NAND_U_BOOT_OFFS 0x800 #define CONFIG_SYS_NAND_U_BOOT_SIZE 0x30000
-#define CONFIG_SYS_NAND_U_BOOT_DST (0x81fc0000) +#define CONFIG_SYS_NAND_U_BOOT_DST (0x81fc1000) #define CONFIG_SYS_NAND_U_BOOT_START CONFIG_SYS_NAND_U_BOOT_DST
#define CONFIG_SYS_NAND_PAGE_SIZE 2048 diff --git a/nand_spl/board/karo/tx25/u-boot.lds b/nand_spl/board/karo/tx25/u-boot.lds index c572557..b60c0df 100644 --- a/nand_spl/board/karo/tx25/u-boot.lds +++ b/nand_spl/board/karo/tx25/u-boot.lds @@ -55,6 +55,7 @@ SECTIONS
__got_start = .; . = ALIGN(4); + __got_base = .; .got : { *(.got) }
__got_end = .;

Hello Heiko,
Thanks for the in-depth analysis.
IIUC:
Le 23/09/2010 09:12, Heiko Schocher a écrit :
/home/hs/celf/u-boot/arch/arm/cpu/arm926ejs/start.S:212: Error: internal_relocation (type: OFFSET_IMM) not fixed up /home/hs/celf/u-boot/arch/arm/cpu/arm926ejs/start.S:308: Error: internal_relocation (type: OFFSET_IMM) not fixed up
This is due to the fact you're using a specific .lds, and your fix below corrects this for your .lds indeed. However, there might be other boards in the arm926 subtree which have their own .lds and would need the same fix, so I'll find them all and extend your fix to others.
I had to define _got_base and _relocate_got_base_r also for the NAND_SPL case, and don;t need to fix up the "pic base register", if we are at the right position.
Thanks for spotting and fixing these. :)
I think we(you ;-) should add my below fix to your
[U-Boot] [PATCH 2/2] [NEXT] arm926ejs: reduce code size with -msingle-pic-base
and send a second version of it, so we are bisect compatible.
you can add my
Signed-off-by: Heiko Schocherhs@denx.de
to it.
I will -- thanks!
Can you test this also on your board? Thanks!
I will too. :)
Just one thing: are these TEXT_BASE / CONFIG_SYS_NAND_U_BOOT_DST modifications in your patch really needed for the board to run, or are they a manual layout optimization decided based on the code size reduction?
diff --git a/board/karo/tx25/config.mk b/board/karo/tx25/config.mk index 51ca1ab..1a32c87 100644 --- a/board/karo/tx25/config.mk +++ b/board/karo/tx25/config.mk @@ -1,5 +1,5 @@ ifdef CONFIG_NAND_SPL TEXT_BASE = 0x810c0000 else -TEXT_BASE = 0x81fc0000 +TEXT_BASE = 0x81fc1000 endif diff --git a/include/configs/tx25.h b/include/configs/tx25.h index 7faa453..dfe8eab 100644 --- a/include/configs/tx25.h +++ b/include/configs/tx25.h @@ -41,7 +41,7 @@ #define CONFIG_SYS_NAND_U_BOOT_OFFS 0x800 #define CONFIG_SYS_NAND_U_BOOT_SIZE 0x30000
-#define CONFIG_SYS_NAND_U_BOOT_DST (0x81fc0000) +#define CONFIG_SYS_NAND_U_BOOT_DST (0x81fc1000) #define CONFIG_SYS_NAND_U_BOOT_START CONFIG_SYS_NAND_U_BOOT_DST
#define CONFIG_SYS_NAND_PAGE_SIZE 2048
Thanks again for your work on my patch!
Amicalement,

Hello Albert,
Albert ARIBAUD wrote:
Hello Heiko,
Thanks for the in-depth analysis.
IIUC:
Le 23/09/2010 09:12, Heiko Schocher a écrit :
/home/hs/celf/u-boot/arch/arm/cpu/arm926ejs/start.S:212: Error: internal_relocation (type: OFFSET_IMM) not fixed up /home/hs/celf/u-boot/arch/arm/cpu/arm926ejs/start.S:308: Error: internal_relocation (type: OFFSET_IMM) not fixed up
This is due to the fact you're using a specific .lds, and your fix below corrects this for your .lds indeed. However, there might be other boards
Yep.
in the arm926 subtree which have their own .lds and would need the same fix, so I'll find them all and extend your fix to others.
Hmm.. I don;t know the other boards ... so, I decided for the relocation patches not to fix all arm boards, instead inroduce the CONFIG_SYS_ARM_WITHOUT_RELOC, and board maintainers (or others who have time for it) can adapt the boards to this new code. So we always have a chance to keep track which boards are converted *and* working and which boards are not working ... so, I could not say, if it is a good idea to fix other boards, you could not test (I know, usually we do this fixes, but the relocation code is such a deep change, I prefer to add only real tested patches ...)
I had to define _got_base and _relocate_got_base_r also for the NAND_SPL case, and don;t need to fix up the "pic base register", if we are at the right position.
Thanks for spotting and fixing these. :)
I think we(you ;-) should add my below fix to your
[U-Boot] [PATCH 2/2] [NEXT] arm926ejs: reduce code size with -msingle-pic-base
and send a second version of it, so we are bisect compatible.
you can add my
Signed-off-by: Heiko Schocherhs@denx.de
to it.
I will -- thanks!
Can you test this also on your board? Thanks!
I will too. :)
Thanks!
Just one thing: are these TEXT_BASE / CONFIG_SYS_NAND_U_BOOT_DST modifications in your patch really needed for the board to run, or are they a manual layout optimization decided based on the code size reduction?
Yes. I decided for this board to set TEXT_BASE = CONFIG_SYS_NAND_U_BOOT_DST = relocation address, so I can prevent one copying of u-boot code. (The u-boot-nand code copy u-boot to CONFIG_SYS_NAND_U_BOOT_DST, which is on the later calculated relocation address -> no need to copy it again ... if there is more RAM on this board, u-boot of course copy itself to the higher address)
bye, Heiko

Le 23/09/2010 12:08, Heiko Schocher a écrit :
Hmm.. I don;t know the other boards ... so, I decided for the relocation patches not to fix all arm boards, instead inroduce the CONFIG_SYS_ARM_WITHOUT_RELOC, and board maintainers (or others who have time for it) can adapt the boards to this new code. So we always have a chance to keep track which boards are converted *and* working and which boards are not working ... so, I could not say, if it is a good idea to fix other boards, you could not test (I know, usually we do this fixes, but the relocation code is such a deep change, I prefer to add only real tested patches ...)
Understood. If only to be consistent with your patches, I will not fix any other board than the ones I can test directly.
This leads me to another question: since I cannot test the tx25, I think I should take from your fix only the parts that modify my own patch set, and leave out the parts for tx25, which you would post above my patches. Is this ok?
Amicalement,

Hello Albert,
Albert ARIBAUD wrote:
Le 23/09/2010 12:08, Heiko Schocher a écrit :
Hmm.. I don;t know the other boards ... so, I decided for the relocation patches not to fix all arm boards, instead inroduce the CONFIG_SYS_ARM_WITHOUT_RELOC, and board maintainers (or others who have time for it) can adapt the boards to this new code. So we always have a chance to keep track which boards are converted *and* working and which boards are not working ... so, I could not say, if it is a good idea to fix other boards, you could not test (I know, usually we do this fixes, but the relocation code is such a deep change, I prefer to add only real tested patches ...)
Understood. If only to be consistent with your patches, I will not fix any other board than the ones I can test directly.
This leads me to another question: since I cannot test the tx25, I think
I test it ;-)
I should take from your fix only the parts that modify my own patch set, and leave out the parts for tx25, which you would post above my patches. Is this ok?
I prefer to have this change in one patch, because if we do it so, we don;t break "git bisect" functionallity.
bye, Heiko

Le 24/09/2010 07:11, Heiko Schocher a écrit :
I prefer to have this change in one patch, because if we do it so, we don;t break "git bisect" functionallity.
Not sure I follow you there, so let's dig a little bit into that point. How exactly would that break bisectability? The only difference I see is whether tx25 building succeeds or not; however if being able to build is the criterion for bisectability, this brings us back to all other boards in the tree which will also not build any more with my patch, and whether I should fix them or not.
On a side note, I do not know of a clear general definition of 'bisectability', which means I could break it yet again unkonwingly. Can you (or anyone, actually :) ) point me to a, or even the, standard definition of 'bisectable' and, if that has a specific meaning, of 'fully bisectable'?
bye, Heiko
Amicalement,

On 2010/09/24 7:47 AM, Albert ARIBAUD wrote:
On a side note, I do not know of a clear general definition of 'bisectability', which means I could break it yet again unkonwingly. Can you (or anyone, actually :) ) point me to a, or even the, standard definition of 'bisectable' and, if that has a specific meaning, of 'fully bisectable'?
Not sure if there is an agreed definition, but the main idea behind bisection stems from the "git bisect" tool, which allows you to mark one commit as "known good", and another as "known bad", and then calculates a commit in the middle to be tested for "goodness" or "badness". As commits are marked "good" or "bad", the tool halves (bisects) the range of commits until the single commit that introduced the breakage can be identified.
Of course, for the bisection process to be able to work, all points along the path must a) compile and b) be able to be tested for the breakage.
So, a fully bisectable patch series would maintain the above properties, for all boards.
This is one of the reasons that Linus likes to have code used as it is introduced, rather than building up a whole lot of unused infrastructure, only to activate it with a final commit. The bisection would then point to the final commit as the culprit, when the true failure may have been introduced by one of the preceding commits.
That's "as I understand it", of course.
Rogan

On 2010/09/24 6:45 PM, Rogan Dawes wrote:
On 2010/09/24 7:47 AM, Albert ARIBAUD wrote:
On a side note, I do not know of a clear general definition of 'bisectability', which means I could break it yet again unkonwingly. Can you (or anyone, actually :) ) point me to a, or even the, standard definition of 'bisectable' and, if that has a specific meaning, of 'fully bisectable'?
Here's a decent LWN.net article on the topic:
http://lwn.net/Articles/317154/
Rogan

Le 24/09/2010 18:58, Rogan Dawes a écrit :
On 2010/09/24 6:45 PM, Rogan Dawes wrote:
On 2010/09/24 7:47 AM, Albert ARIBAUD wrote:
On a side note, I do not know of a clear general definition of 'bisectability', which means I could break it yet again unkonwingly. Can you (or anyone, actually :) ) point me to a, or even the, standard definition of 'bisectable' and, if that has a specific meaning, of 'fully bisectable'?
Here's a decent LWN.net article on the topic:
http://lwn.net/Articles/317154/
Rogan
Thanks Rogan.
Actually I do know the git bisect command, having had to use it to track a boot issue in recent linux kernel versions; I should have made it clear that I was looking specifically for the meaning of 'bisectable' -- which you have clarified, thanks again.
So in this specific case, I guess what I should do is look up Heiko's patch series, see which arm926 boards it makes relocation-capable, and apply the .lds fx to those. The other arm926 boards, which were left untouched by Heiko, would not build with his patches anyway, so there is no real point in fixing them yet. Or is there?
So unless some strong disagreement is voiced, I'll prepare a V3 patch set which will be V2 + possible fixes stemming from Heiko's and Ben's test feedback + fixes to the .lds of boards which are made relocation capable in Heiko's patch series.
Amicalement,

On Wed, Sep 22, 2010 at 9:57 AM, Albert Aribaud albert.aribaud@free.fr wrote:
Replace GOT indirect addressing with more efficient pic-base relative addressing for initialized data (uninitialized data still use GOTi indirect addressing). This also reduces code size by 0.4% compared to -fPIC.
Signed-off-by: Albert Aribaud albert.aribaud@free.fr
Applies clean to u-boot/next.
Tested on da850evm; u-boot loads and gives the "DA850-evm >" prompt.
Tested-by: Ben Gardiner bengardiner@nanometrics.ca
Best Regards, Ben Gardiner
--- Nanometrics Inc. http://www.nanometrics.ca
participants (6)
-
Albert ARIBAUD
-
Albert Aribaud
-
Ben Gardiner
-
Heiko Schocher
-
Rogan Dawes
-
Wolfgang Denk