[U-Boot] [PATCH 0/2] Make sure 85xx bss doesn't start at 0x0

It looks like the 85xx platform is the only one which has boards with the bss at 0x0. It uses a slightly different linker script format which puts the bss after the reset vector, which is 0xfffffffc + 4 for a number of boards. Other platforms don't put their bss in a similar location, so they don't have this issue. I verified this by running MAKEALL and printing the bss address as well.
A few bytes of RAM are wasted for boards which used to have the bss at 0x0 FWIW.
These changes should be applied to the "reloc" branch.
Peter Tyser (2):
  85xx: Preprocess link scripts
  85xx: Ensure BSS segment doesn't start at address 0x0

 cpu/mpc85xx/config.mk                              |    2 +-
 cpu/mpc85xx/{u-boot-nand.lds => u-boot-nand.lds.S} |    0
 cpu/mpc85xx/{u-boot.lds => u-boot.lds.S}           |    8 ++++++++
 3 files changed, 9 insertions(+), 1 deletions(-)
 rename cpu/mpc85xx/{u-boot-nand.lds => u-boot-nand.lds.S} (100%)
 rename cpu/mpc85xx/{u-boot.lds => u-boot.lds.S} (94%)

This allows for fancy conditionals and inclusions
Signed-off-by: Peter Tyser ptyser@xes-inc.com
---
 cpu/mpc85xx/config.mk                              |    2 +-
 cpu/mpc85xx/{u-boot-nand.lds => u-boot-nand.lds.S} |    0
 cpu/mpc85xx/{u-boot.lds => u-boot.lds.S}           |    0
 3 files changed, 1 insertions(+), 1 deletions(-)
 rename cpu/mpc85xx/{u-boot-nand.lds => u-boot-nand.lds.S} (100%)
 rename cpu/mpc85xx/{u-boot.lds => u-boot.lds.S} (100%)

diff --git a/cpu/mpc85xx/config.mk b/cpu/mpc85xx/config.mk
index beb3514..03a34a9 100644
--- a/cpu/mpc85xx/config.mk
+++ b/cpu/mpc85xx/config.mk
@@ -27,4 +27,4 @@ PLATFORM_CPPFLAGS += -ffixed-r2 -Wa,-me500 -msoft-float -mno-string
 PLATFORM_CPPFLAGS +=$(call cc-option,-mno-spe)

 # Use default linker script. Board port can override in board/*/config.mk
-LDSCRIPT := $(SRCTREE)/cpu/mpc85xx/u-boot.lds
+LDSCRIPT := $(SRCTREE)/cpu/mpc85xx/u-boot.lds.S
diff --git a/cpu/mpc85xx/u-boot-nand.lds b/cpu/mpc85xx/u-boot-nand.lds.S
similarity index 100%
rename from cpu/mpc85xx/u-boot-nand.lds
rename to cpu/mpc85xx/u-boot-nand.lds.S
diff --git a/cpu/mpc85xx/u-boot.lds b/cpu/mpc85xx/u-boot.lds.S
similarity index 100%
rename from cpu/mpc85xx/u-boot.lds
rename to cpu/mpc85xx/u-boot.lds.S

Dear Peter Tyser,
In message 1254783670-21301-2-git-send-email-ptyser@xes-inc.com you wrote:
This allows for fancy conditionals and inclusions
Signed-off-by: Peter Tyser ptyser@xes-inc.com
 cpu/mpc85xx/config.mk                              |    2 +-
 cpu/mpc85xx/{u-boot-nand.lds => u-boot-nand.lds.S} |    0
 cpu/mpc85xx/{u-boot.lds => u-boot.lds.S}           |    0
 3 files changed, 1 insertions(+), 1 deletions(-)
 rename cpu/mpc85xx/{u-boot-nand.lds => u-boot-nand.lds.S} (100%)
 rename cpu/mpc85xx/{u-boot.lds => u-boot.lds.S} (100%)

diff --git a/cpu/mpc85xx/config.mk b/cpu/mpc85xx/config.mk
index beb3514..03a34a9 100644
--- a/cpu/mpc85xx/config.mk
+++ b/cpu/mpc85xx/config.mk
@@ -27,4 +27,4 @@ PLATFORM_CPPFLAGS += -ffixed-r2 -Wa,-me500 -msoft-float -mno-string
 PLATFORM_CPPFLAGS +=$(call cc-option,-mno-spe)

 # Use default linker script. Board port can override in board/*/config.mk
-LDSCRIPT := $(SRCTREE)/cpu/mpc85xx/u-boot.lds
+LDSCRIPT := $(SRCTREE)/cpu/mpc85xx/u-boot.lds.S
diff --git a/cpu/mpc85xx/u-boot-nand.lds b/cpu/mpc85xx/u-boot-nand.lds.S
similarity index 100%
rename from cpu/mpc85xx/u-boot-nand.lds
rename to cpu/mpc85xx/u-boot-nand.lds.S
diff --git a/cpu/mpc85xx/u-boot.lds b/cpu/mpc85xx/u-boot.lds.S
similarity index 100%
rename from cpu/mpc85xx/u-boot.lds
rename to cpu/mpc85xx/u-boot.lds.S
Why would such a rename be needed?
The linker scripts already get preprocessed, even without this rename. See the rule
    $(obj)u-boot.lds: $(LDSCRIPT)
            $(CPP) $(CPPFLAGS) $(LDPPFLAGS) -ansi -D__ASSEMBLY__ -P - <$^ >$@
in the top level Makefile.
Best regards,
Wolfgang Denk

On Tue, 2009-10-06 at 09:28 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254783670-21301-2-git-send-email-ptyser@xes-inc.com you wrote:
This allows for fancy conditionals and inclusions
Signed-off-by: Peter Tyser ptyser@xes-inc.com
 cpu/mpc85xx/config.mk                              |    2 +-
 cpu/mpc85xx/{u-boot-nand.lds => u-boot-nand.lds.S} |    0
 cpu/mpc85xx/{u-boot.lds => u-boot.lds.S}           |    0
 3 files changed, 1 insertions(+), 1 deletions(-)
 rename cpu/mpc85xx/{u-boot-nand.lds => u-boot-nand.lds.S} (100%)
 rename cpu/mpc85xx/{u-boot.lds => u-boot.lds.S} (100%)

diff --git a/cpu/mpc85xx/config.mk b/cpu/mpc85xx/config.mk
index beb3514..03a34a9 100644
--- a/cpu/mpc85xx/config.mk
+++ b/cpu/mpc85xx/config.mk
@@ -27,4 +27,4 @@ PLATFORM_CPPFLAGS += -ffixed-r2 -Wa,-me500 -msoft-float -mno-string
 PLATFORM_CPPFLAGS +=$(call cc-option,-mno-spe)

 # Use default linker script. Board port can override in board/*/config.mk
-LDSCRIPT := $(SRCTREE)/cpu/mpc85xx/u-boot.lds
+LDSCRIPT := $(SRCTREE)/cpu/mpc85xx/u-boot.lds.S
diff --git a/cpu/mpc85xx/u-boot-nand.lds b/cpu/mpc85xx/u-boot-nand.lds.S
similarity index 100%
rename from cpu/mpc85xx/u-boot-nand.lds
rename to cpu/mpc85xx/u-boot-nand.lds.S
diff --git a/cpu/mpc85xx/u-boot.lds b/cpu/mpc85xx/u-boot.lds.S
similarity index 100%
rename from cpu/mpc85xx/u-boot.lds
rename to cpu/mpc85xx/u-boot.lds.S
Why would such a rename be needed?
The linker scripts already get preprocessed, even without this rename. See the rule
    $(obj)u-boot.lds: $(LDSCRIPT)
            $(CPP) $(CPPFLAGS) $(LDPPFLAGS) -ansi -D__ASSEMBLY__ -P - <$^ >$@
in the top level Makefile.
Argh, I see. I was using the fancy lib_blackfin/u-boot.lds.S as a reference and assumed the .S was needed. Ignore this patch.
Best, Peter

When U-Boot is relocated from flash to RAM, pointers are modified accordingly. However, pointers initialized with NULL values should not be modified, so that they maintain their intended NULL value. The address of the BSS segment must be modified during relocation, which means that it must not have a NULL value.
Signed-off-by: Peter Tyser ptyser@xes-inc.com
---
 cpu/mpc85xx/u-boot.lds.S |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/cpu/mpc85xx/u-boot.lds.S b/cpu/mpc85xx/u-boot.lds.S
index a347cd1..ef3de66 100644
--- a/cpu/mpc85xx/u-boot.lds.S
+++ b/cpu/mpc85xx/u-boot.lds.S
@@ -131,6 +131,14 @@ SECTIONS

   . = RESET_VECTOR_ADDRESS + 0x4;

+  /*
+   * Make sure that the bss segment doesn't start at 0x0, otherwise its
+   * address won't be updated during relocation fixups
+   */
+#if !((RESET_VECTOR_ADDRESS + 0x4) & 0xffffffff)
+  . |= 0x10;
+#endif
+
   __bss_start = .;
   .bss (NOLOAD)       :
   {
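The reasoning in the commit message can be illustrated with a small C sketch. This is not U-Boot's fixup code: `relocate_entries` and the flat table of 32-bit addresses are hypothetical. The only assumption carried over from the patch description is that fixups add an offset to every non-zero entry and leave zero entries alone so that NULL stays NULL.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical fixup loop (an assumption for illustration, not U-Boot's
 * code): add `offset` to every non-zero entry and skip zero entries, so
 * that NULL pointers are still NULL after relocation.
 */
void relocate_entries(uint32_t *table, size_t count, uint32_t offset)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i] != 0)
            table[i] += offset;   /* 32-bit arithmetic, wraps on overflow */
    }
}
```

Under this model a link-time `__bss_start` of 0x0 is indistinguishable from NULL and is skipped, while a link-time value of 0x10 is adjusted like any other address.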

Dear Peter Tyser,
In message 1254783670-21301-3-git-send-email-ptyser@xes-inc.com you wrote:
When U-Boot is relocated from flash to RAM, pointers are modified accordingly. However, pointers initialized with NULL values should not be modified, so that they maintain their intended NULL value. The address of the BSS segment must be modified during relocation, which means that it must not have a NULL value.
Signed-off-by: Peter Tyser ptyser@xes-inc.com
 cpu/mpc85xx/u-boot.lds.S |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/cpu/mpc85xx/u-boot.lds.S b/cpu/mpc85xx/u-boot.lds.S
index a347cd1..ef3de66 100644
--- a/cpu/mpc85xx/u-boot.lds.S
+++ b/cpu/mpc85xx/u-boot.lds.S
@@ -131,6 +131,14 @@ SECTIONS

   . = RESET_VECTOR_ADDRESS + 0x4;

+  /*
+   * Make sure that the bss segment doesn't start at 0x0, otherwise its
+   * address won't be updated during relocation fixups
+   */
+#if !((RESET_VECTOR_ADDRESS + 0x4) & 0xffffffff)
This seems to be a pretty complicated way of writing
#if (RESET_VECTOR_ADDRESS == 0xFFFFFFFC)
?
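A short C sketch of why the two conditions agree. `RESET_VECTOR_ADDRESS` is set to 0xfffffffc here purely for illustration, and the `PATCH_FORM_TRUE`, `SIMPLE_FORM_TRUE`, `patch_form` and `simple_form` names are hypothetical. The preprocessor part holds whether `#if` arithmetic is done in 32 bits (the sum wraps to 0) or in 64 bits (the sum is 0x100000000 and the mask clears bit 32).

```c
#include <stdint.h>

/* Illustrative value only; each board defines its own reset vector. */
#define RESET_VECTOR_ADDRESS 0xfffffffc

/* Condition as written in the patch. */
#if !((RESET_VECTOR_ADDRESS + 0x4) & 0xffffffff)
#define PATCH_FORM_TRUE 1
#else
#define PATCH_FORM_TRUE 0
#endif

/* Simpler condition suggested in the review. */
#if (RESET_VECTOR_ADDRESS == 0xFFFFFFFC)
#define SIMPLE_FORM_TRUE 1
#else
#define SIMPLE_FORM_TRUE 0
#endif

/* The same two tests as functions, for any 32-bit reset vector address. */
int patch_form(uint32_t reset_vec)
{
    return !(((uint64_t)reset_vec + 0x4) & 0xffffffffu);
}

int simple_form(uint32_t reset_vec)
{
    return reset_vec == 0xFFFFFFFCu;
}
```

For a 32-bit reset vector address, both forms are true only when the address is 0xFFFFFFFC.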
+  . |= 0x10;
I'm not sure if all this is always doing what we want, or if it's always working the same way. When building on 32 bit machines, dot will wrap around for "0xFFFFFFFC + 4" and result in 0; ". |= 0x10" makes it 0x10 then.
When built using a 64 bit host, 0xFFFFFFFC + 4 = 0x100000000, and the OR makes it 0x100000010. But here this OR was not needed.
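The two cases described above can be reproduced with plain integer arithmetic. This is only a model of the location counter, not a statement about how any particular ld build computes it; `dot32` and `dot64` are hypothetical helper names.

```c
#include <stdint.h>

/* Location counter modelled as a 32-bit value: the addition wraps. */
uint32_t dot32(uint32_t reset_vec)
{
    uint32_t dot = reset_vec + 0x4u;
    dot |= 0x10u;
    return dot;
}

/* Location counter modelled as a 64-bit value: no wrap occurs. */
uint64_t dot64(uint32_t reset_vec)
{
    uint64_t dot = (uint64_t)reset_vec + 0x4u;
    dot |= 0x10u;
    return dot;
}
```

With a reset vector of 0xFFFFFFFC, `dot32` yields 0x10 and `dot64` yields 0x100000010, matching the two outcomes described in the review.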
Best regards,
Wolfgang Denk

--- a/cpu/mpc85xx/u-boot.lds.S
+++ b/cpu/mpc85xx/u-boot.lds.S
@@ -131,6 +131,14 @@ SECTIONS

   . = RESET_VECTOR_ADDRESS + 0x4;

+  /*
+   * Make sure that the bss segment doesn't start at 0x0, otherwise its
+   * address won't be updated during relocation fixups
+   */
+#if !((RESET_VECTOR_ADDRESS + 0x4) & 0xffffffff)
This seems to be a pretty complicated way of writing
#if (RESET_VECTOR_ADDRESS == 0xFFFFFFFC)
?
Good point.
+  . |= 0x10;
I'm not sure if all this is always doing what we want, or if it's always working the same way. When building on 32 bit machines, dot will wrap around for "0xFFFFFFFC + 4" and result in 0; ". |= 0x10" makes it 0x10 then.
When built using a 64 bit host, 0xFFFFFFFC + 4 = 0x100000000, and the OR makes it 0x100000010. But here this OR was not needed.
The 64-bit addresses will need to be truncated to 32-bits when the u-boot ELF is actually generated, so I think we'd get 0x10 when building on a 32 or 64 bit machine. I'll verify.
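A minimal sketch of the truncation argument, assuming that the address is simply cut down to its low 32 bits when it is stored in a 32-bit ELF field. Whether the toolchain really behaves this way is exactly what still needs to be verified; `truncate_to_elf32` is a hypothetical helper.

```c
#include <stdint.h>

/* Keep only the low 32 bits, as a 32-bit ELF address field would hold. */
uint32_t truncate_to_elf32(uint64_t addr)
{
    return (uint32_t)(addr & 0xffffffffu);
}
```

Under that assumption, 0x100000010 from a 64-bit host and 0x10 from a 32-bit host both end up as 0x10.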
Best, Peter

Dear Peter Tyser,
In message 1254783670-21301-1-git-send-email-ptyser@xes-inc.com you wrote:
It looks like the 85xx platform is the only one which has boards with the bss at 0x0. It uses a slightly different linker script format which puts the bss after the reset vector, which is 0xfffffffc + 4 for a number of boards. Other platforms don't put their bss in a similar location, so they don't have this issue. I verified this by running MAKEALL and printing the bss address as well.
A few bytes of RAM are wasted for boards which used to have the bss at 0x0 FWIW.
I never understood how this is supposed to work at all.
We have two phases of operation:
1) before relocation to RAM. Here we actually do not have a working bss segment at all, no matter what the linker may think.
2) after relocation to RAM. Here we reserve space for the BSS at the end (well, more or less) of the RAM.
This whole "bss at 0x0" is a myth to me.
Best regards,
Wolfgang Denk

On Tue, 2009-10-06 at 09:32 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254783670-21301-1-git-send-email-ptyser@xes-inc.com you wrote:
It looks like the 85xx platform is the only one which has boards with the bss at 0x0. It uses a slightly different linker script format which puts the bss after the reset vector, which is 0xfffffffc + 4 for a number of boards. Other platforms don't put their bss in a similar location, so they don't have this issue. I verified this by running MAKEALL and printing the bss address as well.
A few bytes of RAM are wasted for boards which used to have the bss at 0x0 FWIW.
I never understood how this is supposed to work at all.
We have two phases of operation:
1) before relocation to RAM. Here we actually do not have a working bss segment at all, no matter what the linker may think.
Agreed.
2) after relocation to RAM. Here we reserve space for the BSS at the end (well, more or less) of the RAM.
Agreed.
This whole "bss at 0x0" is a myth to me.
Do a readelf on most MPC8548 boards, eg MPC8548CDS. __bss_start is also located at 0x0 for these boards, which is the issue this patch attempted to address.
The current U-Boot code is already relocating this bss address higher up in SDRAM during relocation, all this patch does is add 0x10 bytes to that address. I had assumed the current code was working, but perhaps there's a bigger issue...
FWIW, non-85xx PPC boards seem to do something like:
    0xfff00000 text
    0xfff50000 data
    0xfff60000 bss
    0xffff0000 "boot page"/reset vectors
I shied away from this since as the text/data/bss grow at some point the bss is going to overlap with the boot page. I think ld would intelligently wrap the bss around the boot page, but U-Boot won't be so intelligent when the bss is zeroed out:) The bss address range would also wrap back around to 0x0. I didn't feel good about zeroing out the bootpage and wasn't sure what the ramifications of having the bss address wrap back around to 0x0 were (or if the wrapping is even a concern) so didn't use this memory layout. Other arches seem to do this though...
Anyway, this patch just adds 0x10 to boards that already have their bss at 0x0. But maybe those boards already have issues:) I'll investigate a bit and follow up.
Best, Peter

Dear Peter Tyser,
In message 1254830475.22896.43.camel@ptyser-laptop you wrote:
This whole "bss at 0x0" is a myth to me.
Do a readelf on most MPC8548 boards, eg MPC8548CDS. __bss_start is also located at 0x0 for these boards, which is the issue this patch attempted to address.
I know that this _is_ the case. My question meant: _why_ is this the case? My speculation is that it's just by accident, because the bss was located just after the instruction allocated for the reset vector; this being at 0xFFFFFFFC on most 8xxx systems, the address counter wrapped around on 32 bit tool chains, resulting in 0x0.
The current U-Boot code is already relocating this bss address higher up in SDRAM during relocation, all this patch does is add 0x10 bytes to that address. I had assumed the current code was working, but perhaps there's a bigger issue...
I don't think it's an issue. The code seems to work. But I wonder if we could not simplify all this by defining an arbitrary, non-zero address.
I shied away from this since as the text/data/bss grow at some point the bss is going to overlap with the boot page. I think ld would intelligently wrap the bss around the boot page, but U-Boot won't be so intelligent when the bss is zeroed out:) The bss address range would also wrap back around to 0x0. I didn't feel good about zeroing out the
But bss is NOLOAD, and the actual location in the flash is just a fiction - we never use anything of this but the start address.
Best regards,
Wolfgang Denk

On Oct 6, 2009, at 9:01 AM, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254830475.22896.43.camel@ptyser-laptop you wrote:
This whole "bss at 0x0" is a myth to me.
Do a readelf on most MPC8548 boards, eg MPC8548CDS. __bss_start is also located at 0x0 for these boards, which is the issue this patch attempted to address.
I know that this _is_ the case. My question meant: _why_ is this the case? My speculation is that it's just by accident, because the bss was located just after the instruction allocated for the reset vector; this being at 0xFFFFFFFC on most 8xxx systems, the address counter wrapped around on 32 bit tool chains, resulting in 0x0.
The current U-Boot code is already relocating this bss address higher up in SDRAM during relocation, all this patch does is add 0x10 bytes to that address. I had assumed the current code was working, but perhaps there's a bigger issue...
I don't think it's an issue. The code seems to work. But I wonder if we could not simplify all this by defining an arbitrary, non-zero address.
I shied away from this since as the text/data/bss grow at some point the bss is going to overlap with the boot page. I think ld would intelligently wrap the bss around the boot page, but U-Boot won't be so intelligent when the bss is zeroed out:) The bss address range would also wrap back around to 0x0. I didn't feel good about zeroing out the
But bss is NOLOAD, and the actual location in the flash is just a fiction - we never use anything of this but the start address.
Where is BSS on 44x boards? I don't see any reason we shouldn't be able to put it at the same location.
- k

On Tue, 2009-10-06 at 09:07 -0500, Kumar Gala wrote:
This whole "bss at 0x0" is a myth to me.
Do a readelf on most MPC8548 boards, eg MPC8548CDS. __bss_start is also located at 0x0 for these boards, which is the issue this patch attempted to address.
I know that this _is_ the case. My question meant: _why_ is this the case? My speculation is that it's just by accident, because the bss was located just after the instruction allocated for the reset vector; this being at 0xFFFFFFFC on most 8xxx systems, the address counter wrapped around on 32 bit tool chains, resulting in 0x0.
The current U-Boot code is already relocating this bss address higher up in SDRAM during relocation, all this patch does is add 0x10 bytes to that address. I had assumed the current code was working, but perhaps there's a bigger issue...
I don't think it's an issue. The code seems to work. But I wonder if we could not simplify all this by defining an arbitrary, non-zero address.
I shied away from this since as the text/data/bss grow at some point the bss is going to overlap with the boot page. I think ld would intelligently wrap the bss around the boot page, but U-Boot won't be so intelligent when the bss is zeroed out:) The bss address range would also wrap back around to 0x0. I didn't feel good about zeroing out the
But bss is NOLOAD, and the actual location in the flash is just a fiction - we never use anything of this but the start address.
My concern was that we use __bss_start and _end to calculate the size of the bss to zero out. If the bss wraps, I'd be concerned about what gets cleared as _end would be truncated to a low memory address while __bss_start would be a high memory address. Or other similar problems - I didn't investigate what would really happen, I was just worried what could happen:)
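The concern can be made concrete with a small sketch. It does not claim which form U-Boot's clearing code actually uses; both helpers are hypothetical and only contrast a size computed by 32-bit subtraction with a loop guarded by a `start < end` comparison.

```c
#include <stdint.h>

/* Size of [bss_start, bss_end) using 32-bit modular arithmetic. */
uint32_t bss_size(uint32_t bss_start, uint32_t bss_end)
{
    return bss_end - bss_start;
}

/* Non-zero if a loop of the form `while (p < end)` would run at all. */
int pointer_compare_loop_runs(uint32_t bss_start, uint32_t bss_end)
{
    return bss_start < bss_end;
}
```

For a bss that starts at 0xfffffff0 and wraps to end at 0x000000f0, the subtraction still gives the right size (0x100), whereas a comparison-based loop would not execute at all.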
Where is BSS on 44x boards? I don't see any reason we shouldn't be able to put it at the same location.
From the XPedite1000:
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .resetvec         PROGBITS        fffffffc 03f2e4 000004 00  AX  0   0  1
  [ 2] .bootpg           PROGBITS        fffff000 03e2e8 000250 00  AX  0   0  1
  [ 3] .text             PROGBITS        fff80000 000094 0303b0 00  AX  0   0  4
  [ 4] .rodata           PROGBITS        fffb03b0 030444 00a14c 00   A  0   0  4
  [ 5] .reloc            PROGBITS        fffba500 03a594 002280 00  WA  0   0  4
  [ 6] .data             PROGBITS        fffbc780 03c814 00088c 00  WA  0   0  4
  [ 7] .data.rel.local   PROGBITS        fffbd00c 03d0a0 000a98 00  WA  0   0  4
  [ 8] .data.rel.ro.loca PROGBITS        fffbdaa4 03db38 0000b0 00  WA  0   0  4
  [ 9] .data.rel         PROGBITS        fffbdb54 03dbe8 000100 00  WA  0   0  4
  [10] .u_boot_cmd       PROGBITS        fffbdc54 03dce8 000600 00  WA  0   0  4
  [11] .bss              NOBITS          fffbe300 03e2e8 011c44 00  WA  0   0  4
I shied away from this for the 2 reasons above - the bootpg section will be wiped out when the bss is cleared for images near their maximum size and I wasn't sure if there were any ramifications about the bss wrapping around to 0. Other arches must have a similar issue which would somewhat imply:
1. No one cares if their bootpg/reset vector is cleared
2. U-Boot works even if the bss wraps around to 0.
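Whether a given image would actually clobber the boot page can be checked from the section addresses and sizes in the listing. The helpers below are a hypothetical sketch (`section_end` and `sections_overlap` are not U-Boot functions); sums are done in 64 bits so that a section reaching past 0xffffffff does not wrap.

```c
#include <stdint.h>

/* First address past a section; 64-bit so the sum cannot wrap. */
uint64_t section_end(uint32_t addr, uint32_t size)
{
    return (uint64_t)addr + size;
}

/* Non-zero if [a, a + a_size) and [b, b + b_size) share any address. */
int sections_overlap(uint32_t a, uint32_t a_size, uint32_t b, uint32_t b_size)
{
    return a < section_end(b, b_size) && b < section_end(a, a_size);
}
```

With the XPedite1000 values, .bss (0xfffbe300, size 0x011c44) ends at 0xfffcff44, which is 0x2f0bc bytes below .bootpg at 0xfffff000, so this particular image does not overlap. The overlap only appears once text, data and bss grow into that gap.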
If everyone is OK with the limitation of #1 above I can make the 85xx act like the other PPC boards. The only downside I see is that we could never put any non-reset related code in the bootpg.
Best, Peter

Dear Peter Tyser,
In message 1254839043.24664.1890.camel@localhost.localdomain you wrote:
But bss is NOLOAD, and the actual location in the flash is just a fiction - we never use anything of this but the start address.
My concern was that we use __bss_start and _end to calculate the size of the bss to zero out. If the bss wraps, I'd be concerned about what gets cleared as _end would be truncated to a low memory address while __bss_start would be a high memory address. Or other similar problems - I didn't investigate what would really happen, I was just worried what could happen:)
So far U-Boot is actually a 32 bit boot loader; address calculations like this "just wrap around". So far this has not caused problems; what has caused problems is that we can have overlapping sections on 4xx. Also it's probably overkill that each board has its own linker script.
I would like to see this fixed in this process. Maybe Stefan finds some spare cycles to address this.
Where is BSS on 44x boards? I don't see any reason we shouldn't be able to put it at the same location.
From the XPedite1000:
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .resetvec         PROGBITS        fffffffc 03f2e4 000004 00  AX  0   0  1
  [ 2] .bootpg           PROGBITS        fffff000 03e2e8 000250 00  AX  0   0  1
  [ 3] .text             PROGBITS        fff80000 000094 0303b0 00  AX  0   0  4
  [ 4] .rodata           PROGBITS        fffb03b0 030444 00a14c 00   A  0   0  4
  [ 5] .reloc            PROGBITS        fffba500 03a594 002280 00  WA  0   0  4
  [ 6] .data             PROGBITS        fffbc780 03c814 00088c 00  WA  0   0  4
  [ 7] .data.rel.local   PROGBITS        fffbd00c 03d0a0 000a98 00  WA  0   0  4
  [ 8] .data.rel.ro.loca PROGBITS        fffbdaa4 03db38 0000b0 00  WA  0   0  4
  [ 9] .data.rel         PROGBITS        fffbdb54 03dbe8 000100 00  WA  0   0  4
  [10] .u_boot_cmd       PROGBITS        fffbdc54 03dce8 000600 00  WA  0   0  4
  [11] .bss              NOBITS          fffbe300 03e2e8 011c44 00  WA  0   0  4
I shied away from this for the 2 reasons above - the bootpg section will be wiped out when the bss is cleared for images near their maximum size
I think it will not be needed any more by then.
and I wasn't sure if there were any ramifications about the bss wrapping around to 0. Other arches must have a similar issue which would somewhat imply:
1. No one cares if their bootpg/reset vector is cleared
I think this is only relevant while running from flash, but not after relocation.
2. U-Boot works even if the bss wraps around to 0.
This is indeed the case.
If everyone is OK with the limitation of #1 above I can make the 85xx act like the other PPC boards. The only downside I see is that we could never put any non-reset related code in the bootpg.
What about my suggestion to choose a fixed (random, non-zero) address?
Best regards,
Wolfgang Denk

Hi Wolfgang,
So far U-Boot is actually a 32 bit boot loader; address calculations like this "just wrap around". So far this has not caused problems; what has caused problems is that we can have overlapping sections on 4xx. Also it's probably overkill that each board has its own linker script.
I added some debug and came to the same conclusion about the wrapping math.
Full ack on the linker script consolidation.
I would like to see this fixed in this process. Maybe Stefan finds some spare cycles to address this.
Where is BSS on 44x boards? I don't see any reason we shouldn't be able to put it at the same location.
From the XPedite1000:
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .resetvec         PROGBITS        fffffffc 03f2e4 000004 00  AX  0   0  1
  [ 2] .bootpg           PROGBITS        fffff000 03e2e8 000250 00  AX  0   0  1
  [ 3] .text             PROGBITS        fff80000 000094 0303b0 00  AX  0   0  4
  [ 4] .rodata           PROGBITS        fffb03b0 030444 00a14c 00   A  0   0  4
  [ 5] .reloc            PROGBITS        fffba500 03a594 002280 00  WA  0   0  4
  [ 6] .data             PROGBITS        fffbc780 03c814 00088c 00  WA  0   0  4
  [ 7] .data.rel.local   PROGBITS        fffbd00c 03d0a0 000a98 00  WA  0   0  4
  [ 8] .data.rel.ro.loca PROGBITS        fffbdaa4 03db38 0000b0 00  WA  0   0  4
  [ 9] .data.rel         PROGBITS        fffbdb54 03dbe8 000100 00  WA  0   0  4
  [10] .u_boot_cmd       PROGBITS        fffbdc54 03dce8 000600 00  WA  0   0  4
  [11] .bss              NOBITS          fffbe300 03e2e8 011c44 00  WA  0   0  4
I shied away from this for the 2 reasons above - the bootpg section will be wiped out when the bss is cleared for images near their maximum size
I think it will not be needed any more by then.
It's not currently used (at least on 85xx), but I know using it had been mentioned in the past. There's a >3K chunk that's sitting empty right now that could be used. All things being equal I think it would be ideal not to trash a section of U-Boot code - it could be useful and at some point someone's going to be banging their head on the wall trying to figure out why some chunk of assembly code isn't working.
If everyone is OK with the limitation of #1 above I can make the 85xx act like the other PPC boards. The only downside I see is that we could never put any non-reset related code in the bootpg.
What about my suggestion to choose a fixed (random, non-zero) address?
I'd vote against this. It'd have to be some area in low memory and people would be bound to accidentally stomp on it and cause all sorts of odd errors- like overwriting the exception vectors, but harder to debug. I personally like the current implementation of putting the bss after the entire U-Boot image. It keeps U-Boot's code, malloc pool, stack, bss, etc all in the same general area which is nice, and has the side benefit that the bootpg won't be overwritten.
I know ORing in 0x10 is a bit ugly, but what's the real downside of doing it?
Best, Peter

Dear Peter Tyser,
In message 1254843932.24664.2083.camel@localhost.localdomain you wrote:
I personally like the current implementation of putting the bss after the entire U-Boot image. It keeps U-Boot's code, malloc pool, stack, bss, etc all in the same general area which is nice, and has the side benefit that the bootpg won't be overwritten.
OK, if you think so...
I know ORing in 0x10 is a bit ugly, but what's the real downside of doing it?
Nothing. I just hate to allocate the bss at 0x0, because this is actually incorrect - it's the result of an address overflow / truncation, and pretty much misleading to someone trying to read and understand the code. For the linked image, it does not _look_ as if the bss was located _after_ the U-Boot image, it looks detached and allocated in low RAM.
Best regards,
Wolfgang Denk

On Tue, 2009-10-06 at 19:51 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254843932.24664.2083.camel@localhost.localdomain you wrote:
I personally like the current implementation of putting the bss after the entire U-Boot image. It keeps U-Boot's code, malloc pool, stack, bss, etc all in the same general area which is nice, and has the side benefit that the bootpg won't be overwritten.
OK, if you think so...
I know ORing in 0x10 is a bit ugly, but what's the real downside of doing it?
Nothing. I just hate to allocate the bss at 0x0, because this is actually incorrect - it's the result of an address overflow / truncation, and pretty much misleading to someone trying to read and understand the code. For the linked image, it does not _look_ as if the bss was located _after_ the U-Boot image, it looks detached and allocated in low RAM.
Do you have a preference Kumar? You're probably going to be the first in line to have to deal with any resulting confusion:)
I personally would rank the options:
1. OR in an offset to the bss address and leave some good comments in the linker script and commit message
2. Make the bss the last section like other PPC boards which would result in the bootpg sometimes being overwritten
3. Put the bss at an arbitrary address
Best, Peter

Peter Tyser wrote:
On Tue, 2009-10-06 at 19:51 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254843932.24664.2083.camel@localhost.localdomain you wrote:
I personally like the current implementation of putting the bss after the entire U-Boot image. It keeps U-Boot's code, malloc pool, stack, bss, etc all in the same general area which is nice, and has the side benefit that the bootpg won't be overwritten.
OK, if you think so...
I know ORing in 0x10 is a bit ugly, but what's the real downside of doing it?
Nothing. I just hate to allocate the bss at 0x0, because this is actually incorrect - it's the result of an address overflow / truncation, and pretty much misleading to someone trying to read and understand the code. For the linked image, it does not _look_ as if the bss was located _after_ the U-Boot image, it looks detached and allocated in low RAM.
Do you have a preference Kumar? You're probably going to be the first in line to have to deal with any resulting confusion:)
I personally would rank the options:
1. OR in an offset to the bss address and leave some good comments in the linker script and commit message
2. Make the bss the last section like other PPC boards which would result in the bootpg sometimes being overwritten
3. Put the bss at an arbitrary address
FWIW, I think an arbitrary address disjoint from the u-boot addresses is best. While u-boot is in ROM, you can't use the bss anyway. The bss will actually be located at an address selected by the u-boot code itself after memory is sized. All references to the bss will be re-located by subtracting the arbitrary start address and adding the run-time chosen start address. So the linked start address is not important, except that it cannot be NULL or it may confuse the relocation code that doesn't want to re-locate NULL pointers. Some of the confusion in this discussion probably stems from the fact that the linker scripts make the bss look like "part of u-boot", when it is really not. It is just a chunk of "zero'ed" ram, located anywhere the u-boot code decides to put it. An arbitrary strange address would make this more apparent.
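The subtract-then-add step described above can be written out as a one-line helper. This is a sketch of the arithmetic only; `relocate_bss_ref` and its parameter names are hypothetical and not taken from the U-Boot sources.

```c
#include <stdint.h>

/*
 * Translate a link-time bss address to its run-time address:
 * subtract the link-time bss start, add the run-time bss start.
 */
uint32_t relocate_bss_ref(uint32_t link_addr, uint32_t link_bss_start,
                          uint32_t runtime_bss_start)
{
    return link_addr - link_bss_start + runtime_bss_start;
}
```

The result depends only on the offset within the bss, so a link-time start of 0x10 and an arbitrary one such as 0xdead0000 map the same variable to the same run-time address.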
Best Regards, Bill Campbell
Best, Peter

On Tue, 2009-10-06 at 13:34 -0700, J. William Campbell wrote:
Peter Tyser wrote:
On Tue, 2009-10-06 at 19:51 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254843932.24664.2083.camel@localhost.localdomain you wrote:
I personally like the current implementation of putting the bss after the entire U-Boot image. It keeps U-Boot's code, malloc pool, stack, bss, etc all in the same general area which is nice, and has the side benefit that the bootpg won't be overwritten.
OK, if you think so...
I know ORing in 0x10 is a bit ugly, but what's the real downside of doing it?
Nothing. I just hate to allocate the bss at 0x0, because this is actually incorrect - it's the result of an address overflow / truncation, and pretty much misleading to someone trying to read and understand the code. For the linked image, it does not _look_ as if the bss was located _after_ the U-Boot image, it looks detached and allocated in low RAM.
Do you have a preference Kumar? You're probably going to be the first in line to have to deal with any resulting confusion:)
I personally would rank the options:
1. OR in an offset to the bss address and leave some good comments in the linker script and commit message
2. Make the bss the last section like other PPC boards which would result in the bootpg sometimes being overwritten
3. Put the bss at an arbitrary address
FWIW, I think an arbitrary address disjoint from the u-boot addresses is best. While u-boot is in ROM, you can't use the bss anyway. The bss will actually be located at an address selected by the u-boot code itself after memory is sized. All references to the bss will be re-located by subtracting the arbitrary start address and adding the run-time chosen start address. So the linked start address is not important, except that it cannot be NULL or it may confuse the relocation code that doesn't want to re-locate NULL pointers. Some of the confusion in this discussion probably stems from the fact that the linker scripts make the bss look like "part of u-boot", when it is really not. It is just a chunk of "zero'ed" ram, located anywhere the u-boot code decides to put it. An arbitrary strange address would make this more apparent.
Hi Bill, What's the advantage of having the bss not be located next to U-Boot? The big disadvantage of picking an arbitrary address for the bss is that there's now 1 more magical section of SDRAM that the user needs to know shouldn't be used. I already field enough questions from people who corrupt their exception vectors or stack/malloc pool/u-boot code, I don't want to add more bss questions:)
Best, Peter
PS. please keep the original email recipients on CC

Peter Tyser wrote:
On Tue, 2009-10-06 at 13:34 -0700, J. William Campbell wrote:
Peter Tyser wrote:
On Tue, 2009-10-06 at 19:51 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254843932.24664.2083.camel@localhost.localdomain you wrote:
I personally like the current implementation of putting the bss after the entire U-Boot image. It keeps U-Boot's code, malloc pool, stack, bss, etc all in the same general area which is nice, and has the side benefit that the bootpg won't be overwritten.
OK, if you think so...
I know ORing in 0x10 is a bit ugly, but what's the real downside of doing it?
Nothing. I just hate to allocate the bss at 0x0, because this is actually incorrect - it's the result of an address overflow / truncation, and pretty much misleading to someone trying to read and understand the code. For the linked image, it does not _look_ as if the bss was located _after_ the U-Boot image, it looks detached and allocated in low RAM.
Do you have a preference Kumar? You're probably going to be the first in line to have to deal with any resulting confusion:)
I personally would rank the options:
1. OR in an offset to the bss address and leave some good comments in the linker script and commit message
2. Make the bss the last section like other PPC boards which would result in the bootpg sometimes being overwritten
3. Put the bss at an arbitrary address
FWIW, I think an arbitrary address disjoint from the u-boot addresses is best. While u-boot is in ROM, you can't use the bss anyway. The bss will actually be located at an address selected by the u-boot code itself after memory is sized. All references to the bss will be re-located by subtracting the arbitrary start address and adding the run-time chosen start address. So the linked start address is not important, except that is cannot be NULL or it may confuse the relocation code that doesn't want to re-locate NULL pointers. Some of the confusion in this discussion probably stems from the fact that the linker scripts make the bss look like "part of u-boot", when it is really not. It is just a chunk of "zero'ed" ram, located anywhere the u-boot code decides to put it. An arbitrary strange address would make this more apparent.
Hi Bill, What's the advantage of having the bss not be located next to U-Boot? The big disadvantage of picking an arbitrary address for the bss is that there's now 1 more magical section of SDRAM that the user needs to know shouldn't be used. I already field enough question from people that corrupt their exception vectors or stack/malloc pool/u-boot code, I don't want to add more bss questions:)
Hi Peter, The point is that the address chosen for the ld step is NOT the address in ram where the bss will reside anyway. This address can overlap the exception vectors, stack, or even the u-boot code itself and it wouldn't matter (other than possible confusion). The actual physical address where the bss and u-boot itself reside is COMPUTED by u-boot after it sizes memory. u-boot only needs to know how big the section is in order to allow enough room. All references to the bss will then be re-located correctly. Where the bss actually ends up is a function of u-boot code. It may be on some processors that the computation of bss start is done assuming the bss is adjacent to u-boot in the original memory map, but if so, it is an unnecessary restriction. All that is required is a "safe" chunk of ram, which is also what is needed for stack and malloc area and should be chosen in a similar manner. So at run time, bss probably ends up adjacent to u-boot in ram because that's how it was coded, but at ld time it shouldn't matter.
Best Regards, Bill Campbell
Best, Peter
PS. please keep the original email recipients on CC

On Tue, 2009-10-06 at 15:34 -0700, J. William Campbell wrote:
Peter Tyser wrote:
On Tue, 2009-10-06 at 13:34 -0700, J. William Campbell wrote:
Peter Tyser wrote:
On Tue, 2009-10-06 at 19:51 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254843932.24664.2083.camel@localhost.localdomain you wrote:
I personally like the current implementation of putting the bss after the entire U-Boot image. It keeps U-Boot's code, malloc pool, stack, bss, etc all in the same general area which is nice, and has the side benefit that the bootpg won't be overwritten.
OK, if you think so...
I know ORing in 0x10 is a bit ugly, but what's the real downside of doing it?
Nothing. I just hate to allocate the bss at 0x0, because this is actually incorrect - it's the result of an address overflow / truncation, and pretty much misleading to someone trying to read and understand the code. For the linked image, it does not _look_ as if the bss was located _after_ the U-Boot image, it looks detached and allocated in low RAM.
Do you have a preference Kumar? You're probably going to be the first in line to have to deal with any resulting confusion:)
I personally would rank the options:
- OR in an offset to the bss address and leave some good comments in
the linker script and commit message
- Make the bss the last section like other PPC boards which would
result in the bootpg sometimes being overwritten
- Put the bss at an arbitrary address
FWIW, I think an arbitrary address disjoint from the u-boot addresses is best. While u-boot is in ROM, you can't use the bss anyway. The bss will actually be located at an address selected by the u-boot code itself after memory is sized. All references to the bss will be re-located by subtracting the arbitrary start address and adding the run-time chosen start address. So the linked start address is not important, except that is cannot be NULL or it may confuse the relocation code that doesn't want to re-locate NULL pointers. Some of the confusion in this discussion probably stems from the fact that the linker scripts make the bss look like "part of u-boot", when it is really not. It is just a chunk of "zero'ed" ram, located anywhere the u-boot code decides to put it. An arbitrary strange address would make this more apparent.
Hi Bill, What's the advantage of having the bss not be located next to U-Boot? The big disadvantage of picking an arbitrary address for the bss is that there's now 1 more magical section of SDRAM that the user needs to know shouldn't be used. I already field enough question from people that corrupt their exception vectors or stack/malloc pool/u-boot code, I don't want to add more bss questions:)
Hi Peter, The point is that the address chosen for the ld step is NOT the address in ram where the bss will reside anyway. This address can overlap the exception vectors, stack, or even the u-boot code itself and it wouldn't matter (other than possible confusion). The actual physical address where the bss and u-boot itself resides is COMPUTED by u-boot after it sizes memory. u-boot only needs to know how big the section is in order to allow enough room. All references to the bss will then be re-located correctly. Where the bss actually ends up is a function of u-boot code. It may be on some processors that the computation of bss start is done assuming the bss is adjacent to u-boot in the original memory map, but if so, it is an un-necessary restriction. All that is required is a "safe" chunk of ram, which is also what is needed for stack and malloc area and should be chosen in a similar manner. So at run time, bss probably ends up adjacent to u-boot in ram because that's how it was coded, but at ld time it shouldn't matter.
Hi Bill, I understand that the final addresses in RAM of all the sections are calculated by U-Boot during relocation based on memory size. However, the section addresses are the same relative to each other at link time as well as after relocation. Eg before relocation I print out:
(408) &_start = fff80000
(409) &__bss_start = (null)
(410) &_end = 00008184
Now running in RAM - U-Boot at: 7ff70000
(After relocation)
(665) &_start = 7ff70000
(666) &__bss_start = 7fff0000
(667) &_end = 7fff8184
The values all changed and are dependent on RAM size, but their relationship to one another didn't - they all just increased by 0x7fff0000. So practically speaking, we do need to know where the bss is at link time - its address is not dynamic like the malloc pool or stack - it's tied directly to the address of the other sections at link time. (Unless we added some bss-specific fixups, I imagine.)
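The arithmetic behind the addresses printed above can be checked with a small sketch. This is only an illustration, not U-Boot code: the helper names reloc_offset() and relocated() are made up for the example, and all values are treated as 32-bit unsigned numbers so that the addition wraps around modulo 2^32, which is why an image linked at 0xfff80000 can end up at 0x7ff70000.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not part of U-Boot) illustrating the uniform
 * relocation offset seen in the debug output.
 */

/* Offset implied by where _start was linked and where it ended up.
 * Unsigned subtraction wraps modulo 2^32. */
static uint32_t reloc_offset(uint32_t link_start, uint32_t run_start)
{
    return run_start - link_start;
}

/* With a single offset, every symbol moves by the same amount. */
static uint32_t relocated(uint32_t link_addr, uint32_t offset)
{
    return link_addr + offset;
}
```

Feeding in the values from the printout (_start linked at 0xfff80000 and running at 0x7ff70000) gives an offset of 0x7fff0000, and applying that offset to the link-time __bss_start (0x0) and _end (0x8184) reproduces the post-relocation values.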
Am I missing something? Is there an example that would make things clearer?
Best, Peter

Dear Peter Tyser,
In message 1254870618.24664.3061.camel@localhost.localdomain you wrote:
I understand that the final addresses in RAM of all the sections are calculated by U-Boot during relocation based on memory size. However,
True. And nothing is ever written to the bss addresses as recorded in the linked image as running from flash, may they be 0x0 or 0x10 or any other random address.
the section addresses are the same relative to each other at link time as well as after relocation. Eg before relocation I print out:
(408) &_start = fff80000
(409) &__bss_start = (null)
(410) &_end = 00008184
Now running in RAM - U-Boot at: 7ff70000
(After relocation)
(665) &_start = 7ff70000
(666) &__bss_start = 7fff0000
(667) &_end = 7fff8184
The values all changed and are dependent on RAM size, but their relationship to one another didn't - they all just increased by 0x7fff0000. So practically speaking, we do need to know where the bss is at link time - its address is not dynamic like the malloc pool or stack - its tied directly to the address of the other sections at link time. (Unless we added some bss-specific fixups I imagine)
Right, that's the current situation.
My suggestion was NOT to put the bss at a fixed _offset_ to the U-Boot image, but to a fixed absolute address. My hope is that this might simplify the linker scripts at the cost of adding a little code to the relocation routine - for addresses in the bss we would have to add a different relocation offset.
Best regards,
Wolfgang Denk

The values all changed and are dependent on RAM size, but their relationship to one another didn't - they all just increased by 0x7fff0000. So practically speaking, we do need to know where the bss is at link time - its address is not dynamic like the malloc pool or stack - its tied directly to the address of the other sections at link time. (Unless we added some bss-specific fixups I imagine)
Right, that's the current situation.
My suggestion was NOT to put the bss at a fixed _offset_ to the U-Boot image, but to a fixed absolute address. My hope is that this might simplify the linker scripts at the cost of adding a little code to the relocation routine - for addresses in the bss we would have to add a different relocation offset.
I think I see what you're getting at. If we have a bss-specific fixup routine I don't give a hoot where the bss is located at link time. It's just that that bss-aware fixup routine doesn't exist right now :)
It seems like a clean solution. Adding a bss-aware fixup routine or putting the bss after the U-Boot image both seem good to me. The bss-aware fixup routine has a clearer readelf output and slightly more complicated code while the bss-after-uboot change has a misleading readelf output and simpler code. In any case I'd give a thumbs up to either of them.
Best, Peter

On Tue, 2009-10-06 at 18:43 -0500, Peter Tyser wrote:
The values all changed and are dependent on RAM size, but their relationship to one another didn't - they all just increased by 0x7fff0000. So practically speaking, we do need to know where the bss is at link time - its address is not dynamic like the malloc pool or stack - its tied directly to the address of the other sections at link time. (Unless we added some bss-specific fixups I imagine)
Right, that's the current situation.
My suggestion was NOT to put the bss at a fixed _offset_ to the U-Boot image, but to a fixed absolute address. My hope is that this might simplify the linker scripts at the cost of adding a little code to the relocation routine - for addresses in the bss we would have to add a different relocation offset.
I think I see what you're getting at. If we have a bss-specific fixup routine I don't give a hoot where the bss is located at link time. Its just that that bss-aware fixup routine doesn't exist right now:)
It seems like a clean solution. Adding a bss-aware fixup routine or putting the bss after the U-Boot image both seem good to me. The bss-aware fixup routine has a clearer readelf output and slightly more complicated code while the bss-after-uboot change has a misleading readelf output and simpler code. In any case I'd give a thumbs up to either of them.
Sorry, just to be clear, where did you want to put the fixed up bss? Still at a low memory address, i.e. the original address at link time? I had assumed if we were adding a bss-specific fixup we'd move it to the top of memory, near U-Boot, the malloc pool, etc. I'd be all for relocating it to higher in memory, but wouldn't be too excited about leaving it at a low memory address... If we were to add bss fixups, we may as well move it to a location that lines up with the rest of U-Boot code, stack, and malloc, right?
Best, Peter

On Wed, Oct 7, 2009 at 11:09 AM, Peter Tyser ptyser@xes-inc.com wrote:
On Tue, 2009-10-06 at 18:43 -0500, Peter Tyser wrote:
The values all changed and are dependent on RAM size, but their relationship to one another didn't - they all just increased by 0x7fff0000. So practically speaking, we do need to know where the bss is at link time - its address is not dynamic like the malloc pool or stack - its tied directly to the address of the other sections at link time. (Unless we added some bss-specific fixups I imagine)
Right, that's the current situation.
My suggestion was NOT to put the bss at a fixed _offset_ to the U-Boot image, but to a fixed absolute address. My hope is that this might simplify the linker scripts at the cost of adding a little code to the relocation routine - for addresses in the bss we would have to add a different relocation offset.
I think I see what you're getting at. If we have a bss-specific fixup routine I don't give a hoot where the bss is located at link time. Its just that that bss-aware fixup routine doesn't exist right now:)
It seems like a clean solution. Adding a bss-aware fixup routine or putting the bss after the U-Boot image both seem good to me. The bss-aware fixup routine has a clearer readelf output and slightly more complicated code while the bss-after-uboot change has a misleading readelf output and simpler code. In any case I'd give a thumbs up to either of them.
Sorry, just to be clear, where did you want to put the fixed up bss? Still at a low memory address, ie the original address at link time? I had assumed if we were adding a bss-specific fixup we'd move it to the top of memory, near U-Boot, the malloc pool, etc. I'd be all for relocating it to higher in memory, but wouldn't be too excited about leaving at a low memory address... If we were to add bss fixups, we may as well move it to a location that lines up with the rest of U-Boot code, stack, and malloc, right?
Best, Peter
The longer this thread goes on, the more obvious it becomes to me that a basic dynamic loader which is 'ELF Aware' will overcome most (if not ultimately all) of the problems being discussed in relation to relocation.
I have a proof-of-concept for this (written entirely in C) which I needed to create due to the lack of a .fixup section on my arch (a limitation of all but one arch to date). More to the point, the information regarding the content of the ELF sections I was able to find wasn't even for the target arch I wrote the code for - It is all very generic (to a point).
I think that even the -mrelocatable / .fixup method may not be needed at all. -pie / -pic used by themselves create enough information for an OS dynamic loader to relocate an executable, so why not U-Boot? Given that the type and location of each section is easily determined, a stripped-down dynamic loader can provide a platform-independent relocation scheme.
I believe that relocation for all arches, and thus the permanent removal of all the relocation fix ups littering the code, can be achieved far quicker this way.
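As a rough illustration of what such a stripped-down loader loop could look like, here is a minimal sketch. It is NOT the proof-of-concept mentioned above (that code is not shown in this thread); the struct layout, the RELOC_RELATIVE type code and the function name are all assumptions made for the example. Real ELF relocation entries pack the type into an r_info field and use per-architecture type numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type code; real values differ per architecture. */
#define RELOC_RELATIVE 1u

/* Simplified relocation record (assumed layout, not the real ELF one). */
struct rela_entry {
    uint32_t offset;  /* link-time address of the word to patch */
    uint32_t type;    /* relocation type */
    uint32_t addend;  /* link-time address the word should hold */
};

/*
 * Patch every "relative" relocation in an image that was linked at
 * link_base and now runs at run_base. The image is viewed as an array
 * of 32-bit words. Returns the number of words patched.
 */
static size_t apply_relative_relocs(uint32_t *image, uint32_t link_base,
                                    uint32_t run_base,
                                    const struct rela_entry *tab, size_t n)
{
    uint32_t delta = run_base - link_base;  /* wraps modulo 2^32 */
    size_t patched = 0;

    for (size_t i = 0; i < n; i++) {
        if (tab[i].type != RELOC_RELATIVE)
            continue;  /* other types are ignored in this sketch */
        size_t word = (tab[i].offset - link_base) / sizeof(uint32_t);
        image[word] = tab[i].addend + delta;
        patched++;
    }
    return patched;
}
```

In a real boot loader the table bounds would presumably come from symbols exported by the linker script, and more relocation types would need handling; the sketch only shows the shape of one loop.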
Regards,
Graeme

Dear Graeme Russ,
In message d66caabb0910061824s4165d33bu5d5213f6783c09d0@mail.gmail.com you wrote:
I think that even the -mrelocatable / .fixup method may not be needed at all. -pie / -pic used by themselves creates enough information for an OS dynamic loader to relocate an executable, so why not U-Boot? Given that the type and location of each section is easily determined, a striped down dynamic loader can provide a platform-independent relocation scheme.
One reason for not using ELF images for the boot loader is size. The ELF header alone is often more than we would be willing to accept, not to mention the additional code.
Best regards,
Wolfgang Denk

On Wed, Oct 7, 2009 at 5:55 PM, Wolfgang Denk wd@denx.de wrote:
Dear Graeme Russ,
In message d66caabb0910061824s4165d33bu5d5213f6783c09d0@mail.gmail.com you wrote:
I think that even the -mrelocatable / .fixup method may not be needed at all. -pie / -pic used by themselves creates enough information for an OS dynamic loader to relocate an executable, so why not U-Boot? Given that the type and location of each section is easily determined, a striped down dynamic loader can provide a platform-independent relocation scheme.
One reason for not using ELF images for the boot loader is size. The ELF header alone is often more than we would be willing to accept, not to mention the additional code.
But the headers get stripped from the final binary. All we are left with in order to locate the ELF section data are the symbols exported from the linker script.
The extra code is only three very tight for-loops. I had them wrapped in functions to improve readability, but they are good inline candidates (only called once each) and I doubt they use much code space at all (I'll send through actual numbers soon).
Question is, does -mrelocatable result in a smaller .got (et al), or is the .fixup section adding extra size for the sake of ease of implementation?
Best regards,
Wolfgang Denk
Regards,
Graeme

On Wed, Oct 7, 2009 at 8:56 PM, Graeme Russ graeme.russ@gmail.com wrote:
On Wed, Oct 7, 2009 at 5:55 PM, Wolfgang Denk wd@denx.de wrote:
Dear Graeme Russ,
In message d66caabb0910061824s4165d33bu5d5213f6783c09d0@mail.gmail.com you wrote:
One reason for not using ELF images for the boot loader is size. The ELF header alone is often more than we would be willing to accept, not to mention the additional code.
[snip]
The extra code is only three very tight for-loops. I had them wrapped in functions to improve readability, but they are good inline candidates (only called once each) and I doubt they use much code space at all (I'll send through actual numbers soon)
341 bytes in the library archive for the ELF relocation code - I'm sure inlining will further reduce the footprint
Regards,
Graeme

On Wed, Oct 7, 2009 at 5:55 PM, Wolfgang Denk wd@denx.de wrote:
Dear Graeme Russ,
In message d66caabb0910061824s4165d33bu5d5213f6783c09d0@mail.gmail.com you wrote:
I think that even the -mrelocatable / .fixup method may not be needed at all. -pie / -pic used by themselves creates enough information for an OS dynamic loader to relocate an executable, so why not U-Boot? Given that the type and location of each section is easily determined, a striped down dynamic loader can provide a platform-independent relocation scheme.
One reason for not using ELF images for the boot loader is size. The ELF header alone is often more than we would be willing to accept, not to mention the additional code.
But the headers get stripped from the final binary. All we are left with in order to locate the ELF section data are the symbols exported from the linker script
The extra code is only three very tight for-loops. I had them wrapped in functions to improve readability, but they are good inline candidates (only called once each) and I doubt they use much code space at all (I'll send through actual numbers soon)
But how much space do the extra sections you link in take?
If the size is comparable with the fixup ptrs, we should probably consider using the same for ppc. Then we can use -fpic/-fpie, and that is significantly smaller than -fPIC on PPC.
Question is, does -mrelocatable result in smaller .got (et al) are is the .fixup section adding extra size for the sake of ease of implementation?
fixup section expands with lots of ptrs, then fixup is placed just after .got

Joakim Tjernlund wrote:
On Wed, Oct 7, 2009 at 5:55 PM, Wolfgang Denk wd@denx.de wrote:
Dear Graeme Russ,
In message d66caabb0910061824s4165d33bu5d5213f6783c09d0@mail.gmail.com you wrote:
I think that even the -mrelocatable / .fixup method may not be needed at all. -pie / -pic used by themselves creates enough information for an OS dynamic loader to relocate an executable, so why not U-Boot? Given that the type and location of each section is easily determined, a striped down dynamic loader can provide a platform-independent relocation scheme.
One reason for not using ELF images for the boot loader is size. The ELF header alone is often more than we would be willing to accept, not to mention the additional code.
But the headers get stripped from the final binary. All we are left with in order to locate the ELF section data are the symbols exported from the linker script
The extra code is only three very tight for-loops. I had them wrapped in functions to improve readability, but they are good inline candidates (only called once each) and I doubt they use much code space at all (I'll send through actual numbers soon)
But how much space in the extra sections you link in?
if size is comparable with fixup ptrs we should probably consider using the same for ppc. Then we can use -fpic/-fpie and that is significantly smaller then -fPIC on PPC.
Question is, does -mrelocatable result in smaller .got (et al) are is the .fixup section adding extra size for the sake of ease of implementation?
fixup section expands with lots of ptrs, then fixup is placed just after .got
There is also a trade-off in using the -mrelocatable / .fixup method that should be considered in the general case. -fPIC or even -fpic code is almost always larger than the code generated for -mrelocatable, so it is the final size of the object blob in PROM that matters, not just the length of the fixup segment. This effect is architecture dependent: PIC is much cheaper on a PPC than on an Intel 386. So the "best" combination may depend on the chip. I would also strongly encourage that the -mrelocatable method always be available as an option. Historically, -fPIC has been buggy in initial releases of GCC for new architectures, probably because you can do useful work with just relocatable code and PIC can come "later". So it may be that u-boot would need an -mrelocatable approach for some period until -fPIC/-fpic worked. In any case, it would be a good fallback if one suspected a bug in PIC.
Best Regards, Bill Campbell

Dear Peter Tyser,
In message 1254872619.24664.3159.camel@localhost.localdomain you wrote:
Right, that's the current situation.
My suggestion was NOT to put the bss at a fixed _offset_ to the U-Boot image, but to a fixed absolute address. My hope is that this might simplify the linker scripts at the cost of adding a little code to the relocation routine - for addresses in the bss we would have to add a different relocation offset.
I think I see what you're getting at. If we have a bss-specific fixup routine I don't give a hoot where the bss is located at link time. Its just that that bss-aware fixup routine doesn't exist right now:)
Right!!! Now you got it. Ufff...
It seems like a clean solution. Adding a bss-aware fixup routine or putting the bss after the U-Boot image both seem good to me. The bss-aware fixup routine has a clearer readelf output and slightly more complicated code while the bss-after-uboot change has a misleading readelf output and simpler code. In any case I'd give a thumbs up to either of them.
My vote is for the first, because otherwise we will run into situations again and again where users and/or the linker get confused about overlapping sections and/or sections wrapping around the physical end of address space.
Best regards,
Wolfgang Denk

It seems like a clean solution. Adding a bss-aware fixup routine or putting the bss after the U-Boot image both seem good to me. The bss-aware fixup routine has a clearer readelf output and slightly more complicated code while the bss-after-uboot change has a misleading readelf output and simpler code. In any case I'd give a thumbs up to either of them.
My vote is for the first, because otherwise we will run into situations again and again where users and/or the linker get confused about overlapping sections and/or sections wrapping around the physical end of address space.
Are you proposing adding this new bss fixup code to this release, or rolling it into the next release along with Jocke's addition of relocation code written in C-code? Logically it'd be much easier to add this new bss fixup logic to Jocke's 1 C-code function instead of 15 assembly files, but then we'd have to have a temporary 85xx workaround just for this release (which is fine by me).
Best, Peter

Dear Peter,
In message 1254916635.20662.22.camel@ptyser-laptop you wrote:
My vote is for the first, because otherwise we will run into situations again and again where users and/or the linker get confused about overlapping sections and/or sections wrapping around the physical end of address space.
Are you proposing adding this new bss fixup code to this release, or rolling it into the next release along with Jocke's addition of relocation code written in C-code? Logically it'd be much easier to add this new bss fixup logic to Jocke's 1 C-code function instead of 15 assembly files, but then we'd have to have a temporary 85xx workaround just for this release (which is fine by me).
I can live with a temporary workaround, too. Given that we're already in the middle of the "stabilization" period I prefer to delay this bigger change until next release.
Best regards,
Wolfgang Denk

Dear Peter Tyser,
In message 1254862383.24664.2742.camel@localhost.localdomain you wrote:
What's the advantage of having the bss not be located next to U-Boot?
One advantage is that we might choose the same address for all boards, and eventually for all Power processor families.
One disadvantage is that we need to relocate it separately, or we will have a gap in the RAM memory map which is IMO not acceptable.
The big disadvantage of picking an arbitrary address for the bss is that there's now 1 more magical section of SDRAM that the user needs to know shouldn't be used. I already field enough question from people that
Why should it not be used? You seem to be pretty fixed on that idea, which is wrong. No code will ever be written to RAM at this location.
In the current setup, we don't write any code to RAM at 0x0 either.
corrupt their exception vectors or stack/malloc pool/u-boot code, I don't want to add more bss questions:)
I cannot follow you here.
Best regards,
Wolfgang Denk

On Wed, 2009-10-07 at 01:07 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254862383.24664.2742.camel@localhost.localdomain you wrote:
What's the advantage of having the bss not be located next to U-Boot?
One advantage is that we might chose the same address for all boards, and eventually for all Power processor families.
We could achieve this wherever we end up putting the bss. eg if people think putting the bss right after the u-boot image is best, we can update the 44x linker script, etc to do the same thing. I think this discussion is applicable to most any PPC board.
One disadvantage is that we need to relocate it separately, or we will have a gap in the RAm memory map which is IMO not acceptable.
What does "relocating the bss separately" entail?
The big disadvantage of picking an arbitrary address for the bss is that there's now 1 more magical section of SDRAM that the user needs to know shouldn't be used. I already field enough question from people that
Why should it not be used? You seem to be pretty fixed on that idea, which is wrong. No code will ever be written to RAM at list location.
When I say user, I'm referring to an end user, e.g. a customer. Not a developer.
For argument's sake, let's say we developers put the bss at a "fixed (random, non-zero) address" of 0x80000. A user tftps an image to 0x80000 and suddenly their board starts acting up.
In the current setup, we don't write any code to RAM at 0x0 either.
Right, and this limitation causes headaches. I personally get lots of questions from customers about why their board hangs when they tftp an image to 0x0. In a perfect world we'd only have 1 reserved section of memory which contained the interrupt vectors, text, bss, malloc, stack, etc.
corrupt their exception vectors or stack/malloc pool/u-boot code, I don't want to add more bss questions:)
It's crappy to have 2 sections of memory that a user has to know not to touch, I don't want to have 3 :)
Maybe I'm not understanding your suggestion "to choose a fixed (random, non-zero) address" for the bss. That implies to me we choose an address in low memory (e.g. 0x10000) and put the bss there. I think it'd be more plausible for someone to blow this away accidentally than the high memory used by U-Boot, and you also couldn't use any data stored in the bss after you blow it away, e.g. right before jumping to a linux kernel.
Best, Peter

Dear Peter Tyser,
In message 1254871741.24664.3117.camel@localhost.localdomain you wrote:
One disadvantage is that we need to relocate it separately, or we will have a gap in the RAm memory map which is IMO not acceptable.
What does "relocating the bss separately" entail?
The relocation routine would have to check the relocated address; if it falls into the bss, it has to add a different offset than otherwise. This way the location of the bss after relocation is moved close to the U-Boot image, avoiding gaps that just cost memory.
Why should it not be used? You seem to be pretty fixed on that idea, which is wrong. No code will ever be written to RAM at list location.
When I say user, I'm refering to an end user, eg a customer. Not a developer.
Me too.
For arguments sake, lets say we developers put the bss at a "fixed (random, non-zero) address" of 0x80000. A user tftps an image to 0x80000 and suddenly their board starts acting up.
No, it doesn't. It works perfectly well. The 0x80000 address is something that is ONLY used in the image as created by the linker, i.e. _before_ relocation. No data ever gets written there. The end user who runs TFTP sees the system _after_ relocation, when the real bss has been allocated in high memory (just as it is now) and all symbols have been fixed up.
In the current setup, we don't write any code to RAM at 0x0 either.
Right, and this limitation causes headaches. I personally get lots of questions from customers about why their board hangs when they tftp an image to 0x0. In a perfect world we'd only have 1 reserved section of memory which contained the interrupt vectors, text, bss, malloc, stack, etc.
But this has _nothing_ to do with the fact that the bss is at 0x0. The problem is caused by overwriting the exception vectors, which is a completely different story.
corrupt their exception vectors or stack/malloc pool/u-boot code, I don't want to add more bss questions:)
Its crappy to have 2 sections of memory that a user has to know not to touch, I don't want to have 3:)
Neither do I. This would not change.
Maybe I'm not understanding your suggestion "to chose a fixed (random, non-zero) address" for the bss. That implies to me we choose an address low memory (eg 0x10000) and put the bss there. I think it'd be more
Yes, but ONLY for linking the image. Here the bss address is just _any_ address, more or less a virtual thing that just needs to be there so we can calculate the size later.
After relocation, i.e. when the end user sees it, the setup shall be exactly the same as now, i.e. allocation of memory starting from end of available RAM downward would look like this:
[End of RAM]
    if configured: area that won't get touched by U-Boot and Linux
    if configured: shared log buffer
    if configured: pRAM (Protected RAM - unchanged by reset)
    if configured: LCD or video framebuffer
    .bss
    .u_boot_cmd
    .data
    .rodata
    .text
    malloc arena
    board info struct
    stack (growing downward)
Except for gaps caused for alignment purposes, these parts shall be tightly packed to make for as small a memory footprint as possible.
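The "allocate downward from the end of RAM, tightly packed except for alignment" idea can be sketched like this. It is an illustration only, not U-Boot's actual initialization code; reserve_below() is a hypothetical helper, and the sizes used in the usage note are made-up example values.

```c
#include <stdint.h>

/*
 * Reserve `size` bytes immediately below *top, rounding the start down
 * to a multiple of `align` (which must be a power of two). Updates *top
 * to the start of the new region and returns that start address.
 */
static uint32_t reserve_below(uint32_t *top, uint32_t size, uint32_t align)
{
    uint32_t start = (*top - size) & ~(align - 1u);

    *top = start;
    return start;
}
```

Starting with top at the end of RAM and calling reserve_below() once per region (bss, image, malloc arena, board info, stack) yields a layout in which each region sits directly below the previous one, with gaps only where alignment forces them.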
plausible for someone to blow this away accidentally than high memory by U-Boot, and you also couldn't use any data stored in the bss after you blow it away, eg right before jumping to a linux kernel.
You still fail to understand that the bss "address" as used for the linking of the image is unrelated to the storage area that later gets zeroed and used as bss. The former is before, the latter after relocation to RAM.
Best regards,
Wolfgang Denk

On Oct 6, 2009, at 1:08 PM, Peter Tyser wrote:
On Tue, 2009-10-06 at 19:51 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254843932.24664.2083.camel@localhost.localdomain you wrote:
I personally like the current implementation of putting the bss after the entire U-Boot image. It keeps U-Boot's code, malloc pool, stack, bss, etc all in the same general area which is nice, and has the side benefit that the bootpg won't be overwritten.
OK, if you think so...
I know ORing in 0x10 is a bit ugly, but what's the real downside of doing it?
Nothing. I just hate to allocate the bss at 0x0, because this is actually incorrect - it's the result of an address overflow / truncation, and pretty much misleading to someone trying to read and understand the code. For the linked image, it does not _look_ as if the bss was located _after_ the U-Boot image, it looks detached and allocated in low RAM.
Do you have a preference Kumar? You're probably going to be the first in line to have to deal with any resulting confusion:)
I personally would rank the options:
- OR in an offset to the bss address and leave some good comments in the linker script and commit message
- Make the bss the last section like other PPC boards which would result in the bootpg sometimes being overwritten
- Put the bss at an arbitrary address
I don't have a preference, but maybe I missed the answer to my question about where does 44x put the BSS.
Is it possible to put it before TEXTBASE?
- k

On Tue, 2009-10-06 at 15:46 -0500, Kumar Gala wrote:
On Oct 6, 2009, at 1:08 PM, Peter Tyser wrote:
On Tue, 2009-10-06 at 19:51 +0200, Wolfgang Denk wrote:
Dear Peter Tyser,
In message 1254843932.24664.2083.camel@localhost.localdomain you wrote:
I personally like the current implementation of putting the bss after the entire U-Boot image. It keeps U-Boot's code, malloc pool, stack, bss, etc all in the same general area which is nice, and has the side benefit that the bootpg won't be overwritten.
OK, if you think so...
I know ORing in 0x10 is a bit ugly, but what's the real downside of doing it?
Nothing. I just hate to allocate the bss at 0x0, because this is actually incorrect - it's the result of an address overflow / truncation, and pretty much misleading to someone trying to read and understand the code. For the linked image, it does not _look_ as if the bss was located _after_ the U-Boot image, it looks detached and allocated in low RAM.
Do you have a preference Kumar? You're probably going to be the first in line to have to deal with any resulting confusion:)
I personally would rank the options:
- OR in an offset to the bss address and leave some good comments in the linker script and commit message
- Make the bss the last section like other PPC boards which would result in the bootpg sometimes being overwritten
- Put the bss at an arbitrary address
I don't have a preference, but maybe I missed the answer to my question about where does 44x put the BSS.
The 44x boards put the bss after "the rest" of u-boot, but before the bootpg section. Sometimes the bss might overlap the bootpg, which would mean the bootpg would get zeroed out on bootup and the bss would "wrap around" to 0 (which is fine, just confusing). Eg:
  [Nr] Name              Type     Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL     00000000 000000 000000 00      0   0  0
  [ 1] .resetvec         PROGBITS fffffffc 03f2e4 000004 00  AX  0   0  1
  [ 2] .bootpg           PROGBITS fffff000 03e2e8 000250 00  AX  0   0  1
  [ 3] .text             PROGBITS fff80000 000094 0303b0 00  AX  0   0  4
  [ 4] .rodata           PROGBITS fffb03b0 030444 00a14c 00   A  0   0  4
  [ 5] .reloc            PROGBITS fffba500 03a594 002280 00  WA  0   0  4
  [ 6] .data             PROGBITS fffbc780 03c814 00088c 00  WA  0   0  4
  [ 7] .data.rel.local   PROGBITS fffbd00c 03d0a0 000a98 00  WA  0   0  4
  [ 8] .data.rel.ro.loca PROGBITS fffbdaa4 03db38 0000b0 00  WA  0   0  4
  [ 9] .data.rel         PROGBITS fffbdb54 03dbe8 000100 00  WA  0   0  4
  [10] .u_boot_cmd       PROGBITS fffbdc54 03dce8 000600 00  WA  0   0  4
  [11] .bss              NOBITS   fffbe300 03e2e8 011c44 00  WA  0   0  4
Is it possible to put it before TEXTBASE?
I looked into that originally but couldn't get it to work via the linker script alone. If we wanted to hardcode a bss size, we could pass "-Tbss <TEXTBASE - HARDCODED_BSS_SIZE>" to ld to position it. We could allocate some relatively huge chunk of memory for it below TEXTBASE, but I'm not sure we could make it dynamically sized.
Peter
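
The overlap concern and the "-Tbss" idea above both come down to simple address arithmetic, which the following C sketch works through. It is a rough illustration under stated assumptions, not code from U-Boot: `sections_overlap()` and `bss_below_textbase()` are hypothetical helper names, and the 0x20000 reservation in the usage note is an arbitrary example value. The other addresses are taken from the section listing quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if the link-time ranges [a, a + asz) and [b, b + bsz) intersect.
 * The ends are computed in 64 bits so that a range running past
 * 0xffffffff is not silently truncated to a low address.
 */
bool sections_overlap(uint32_t a, uint32_t asz, uint32_t b, uint32_t bsz)
{
    uint64_t a_end = (uint64_t)a + asz;
    uint64_t b_end = (uint64_t)b + bsz;

    return a < b_end && b < a_end;
}

/*
 * Address that would be handed to the linker if a fixed-size bss were
 * reserved directly below the text base (word aligned). The reserved
 * size is hard-coded by the caller; it does not track the real bss size.
 */
uint32_t bss_below_textbase(uint32_t text_base, uint32_t reserved)
{
    assert(reserved <= text_base);
    return (text_base - reserved) & ~(uint32_t)3;
}
```

With the numbers from the listing, the bss (0xfffbe300, size 0x11c44) ends at 0xfffcff44 and does not reach the bootpg at 0xfffff000, so `sections_overlap()` returns false. A bss of 0x41000 bytes at the same start address would run into the bootpg. Reserving 0x20000 bytes below a text base of 0xfff80000 gives 0xfff60000.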

On Tuesday 06 October 2009 17:22:10 Wolfgang Denk wrote:
My concern was that we use __bss_start and _end to calculate the size of the bss to zero out. If the bss wraps, I'd be concerned about what gets cleared as _end would be truncated to a low memory address while __bss_start would be a high memory address. Or other similar problems - I didn't investigate what would really happen, I was just worried what could happen:)
So far U-Boot is actually a 32 bit boot loader; address calculations like this "just wrap around". This has not caused problems yet; what has caused problems is that we can have overlapping sections on 4xx. Also it's probably overkill that each board has its own linker script.
I would like to see this fixed in this process. Maybe Stefan finds some spare cycles to address this.
Yes, the consolidation of the 4xx linker scripts is definitely something that should be worked on. Again, I'll put it on my to-do list. But I won't complain if somebody else sends some patches here. ;)
Cheers, Stefan
-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-0 Fax: (+49)-8142-66989-80 Email: office@denx.de
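
The "just wrap around" remark above can be checked with a small C sketch. It assumes 32-bit addresses and uses hypothetical function names; it does not claim to reproduce how U-Boot's start-up code actually clears the bss. Subtracting the two symbols in unsigned 32-bit arithmetic still gives the right size when the end address has wrapped, whereas a loop that compares pointers with "less than" would clear nothing.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size of the bss from its start and end addresses. Unsigned 32-bit
 * subtraction is modular, so the result is still correct when the end
 * address has wrapped past 0xffffffff to a low value.
 */
uint32_t bss_size(uint32_t bss_start, uint32_t bss_end)
{
    return bss_end - bss_start;
}

/*
 * Number of 32-bit words that a naive "while (p < end)" clearing loop
 * would touch. If the end address wrapped, start >= end and the loop
 * body never runs.
 */
uint32_t words_cleared_by_compare(uint32_t bss_start, uint32_t bss_end)
{
    assert((bss_start & 3) == 0 && (bss_end & 3) == 0);
    if (bss_start >= bss_end)
        return 0;
    return (bss_end - bss_start) / 4;
}
```

For a bss starting at 0xfffff000 whose end wraps to 0x00001000, `bss_size()` returns 0x2000, while `words_cleared_by_compare()` returns 0. A size-based clear is therefore unaffected by the wrap, and a compare-based one is not.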

Dear Kumar Gala,
In message 1DE23DE0-B901-4E15-845C-43889EE0B178@kernel.crashing.org you wrote:
...
But bss is NOLOAD, and the actual location in the flash is just a fiction - we never use anything of this but the start address.
Where is BSS on 44x boards? I don't see any reason we shouldn't be able to put it at the same location.
Um... maybe Stefan should explain this. I don't want to have to ;-)
Best regards,
Wolfgang Denk

On Tue, 2009-10-06 at 17:04 +0200, Wolfgang Denk wrote:
Dear Kumar Gala,
In message 1DE23DE0-B901-4E15-845C-43889EE0B178@kernel.crashing.org you wrote:
...
But bss is NOLOAD, and the actual location in the flash is just a fiction - we never use anything of this but the start address.
Where is BSS on 44x boards? I don't see any reason we shouldn't be able to put it at the same location.
Um... maybe Stefan should explain this. I don't want to have to ;-)
The 44x boards look the same as 85xx used to be - the bss is the last section in the ELF, but it has the downside that the code in the bootpg will be zeroed out along with the bss if the U-Boot image is near its maximum size and the bss overlaps the bootpg. Kumar prevented this (whether he meant to or not:) by putting the bss after the entire U-Boot image.
Best, Peter
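
For the 85xx case discussed in this thread, where the bss follows the 4-byte reset vector at 0xfffffffc, the wrap to 0x0 and the effect of ORing in 0x10 can be shown with a short C sketch. The function names are hypothetical, and the assumption that the offset is simply ORed into the candidate start address is mine; the exact linker script expression is not quoted here.

```c
#include <assert.h>
#include <stdint.h>

/* First address after a section, in 32-bit arithmetic (may wrap to 0). */
uint32_t addr_after(uint32_t addr, uint32_t size)
{
    return addr + size;
}

/*
 * Assumed form of the workaround: OR 0x10 into the candidate bss start
 * so that it can never be exactly 0x0.
 */
uint32_t nonzero_bss_start(uint32_t candidate)
{
    uint32_t start = candidate | 0x10u;

    assert(start != 0);
    return start;
}
```

Here `addr_after(0xfffffffc, 4)` yields 0x0, and `nonzero_bss_start(0x0)` yields 0x10, which matches the few bytes that end up unused on boards whose bss previously started at 0x0.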
participants (7)
- Graeme Russ
- J. William Campbell
- Joakim Tjernlund
- Kumar Gala
- Peter Tyser
- Stefan Roese
- Wolfgang Denk