[U-Boot] arm: wrong Relocation and not cleared BSS

Hello,
to give the topic a better meaning and to summarize what I think is currently happening along with some "pictures" for a better understanding:
We are starting with code (c) and data (d) somewhere in the memory:
---------- |cd | ----------
The relocation in start.S should achieve this:
---------- | cd| ----------
That means code and data should be moved upwards. What currently is happening is the following:
---------- | d c | ----------
The code is moved upwards, but that code still uses the data at d. This results in another problem: some parts of the code assume that d is cleared (set to zero in start.S), but what start.S actually does is clear the new location (z in the picture below).
---------- | d cz| ----------
Because the code (c) still uses the data (bss) in d and not in z, some hard to find errors might occur because the used data isn't set to zero as required.
I have almost no knowledge about how gcc and binutils handle relocation, therefore I can't help much further here. What I think is part of the problem is that -fPIC was removed. Using -pie in LDFLAGS might be enough to get relocatable code, but the data will not be relocated. And I would wonder whether that is possible without instructing the compiler to build for relocation (-fPIC).
I hope that brings some light into the problem.
Regards,
Alexander

On 30/10/2010 15:08, Alexander Holler wrote:
Hello,
to give the topic a better meaning and to summarize what I think is currently happening along with some "pictures" for a better understanding:
You may be right, but for now this is not necessarily what really happens, so the subject is still somewhat of a misnomer.
We are starting with code (c) and data (d) somewhere in the memory:
|cd |
The relocation in start.S should achieve this:
| cd|
That means code and data should be moved upwards. What currently is happening is the following:
| d c |
The code is moved upwards, but that code still uses the data at d. This results in another problem: some parts of the code assume that d is cleared (set to zero in start.S), but what start.S actually does is clear the new location (z in the picture below).
Wait a minute. No parts of the code assume BSS is *cleared*, or at least no part of the code *should ever* assume that. BSS is not "zeroed data", it is "uninitialized data".
Which leads to another question: you found an issue according to which values put in nand_chip[] were not read back correctly later on, with both the writing and reading occurring after relocation.
But -- stop me if I'm wrong -- there is no reason that only one of the reading or writing used the "right" address and one used the "wrong" address, right? Both use the same address. Even if BSS does not end up where it should, it would still be at the same address for all code, so that does not explain the issue you're seeing.
| d cz|
Because the code (c) still uses the data (bss) in d and not in z, some hard to find errors might occur because the used data isn't set to zero as required.
BSS is not *required* to be zero. It is zeroed out as a courtesy, but code is expected not to depend on BSS being zeroed. Besides, in your case, the fact that it is zeroed out does not matter since you're (correctly) trying to read a BSS variable that is assumed to have been properly filled in earlier.
I have almost no knowledge about how gcc and binutils handle relocation, therefore I can't help much further here. What I think is part of the problem is that -fPIC was removed. Using -pie in LDFLAGS might be enough to get relocatable code, but the data will not be relocated. And I would wonder whether that is possible without instructing the compiler to build for relocation (-fPIC).
I hope that brings some light into the problem.
Again, when -fPIC was replaced with -pie, it was because it actually relocated much better than GOT relocation, including on NAND devices IIRC.
Could people with ARM NAND-based boards on the list test their own board with Alexander's DEBUG and printf() added, and see what this gives on their board?
Regards,
Alexander
Amicalement,

Hello,
On 30.10.2010 15:36, Albert ARIBAUD wrote:
The code is moved upwards, but that code still uses the data at d. This results in another problem: some parts of the code assume that d is cleared (set to zero in start.S), but what start.S actually does is clear the new location (z in the picture below).
Wait a minute. No parts of the code assume BSS is *cleared*, or at least no part of the code *should ever* assume that. BSS is not "zeroed data", it is "uninitialized data".
That's true for normal C, but I assume that is not true for u-boot.
This reminds me of some problems I had a long, long time ago when switching from debug to optimized code using vc++. I don't know if it is still true (this was more than 10 years ago), but in those days vc++ preset all uninitialized data to zero when optimization was turned off.
If the code in u-boot did not assume that bss is zero, I don't understand why clear_bss in start.S exists.
Regards,
Alexander

On 30/10/2010 15:45, Alexander Holler wrote:
Hello,
On 30.10.2010 15:36, Albert ARIBAUD wrote:
The code is moved upwards, but that code still uses the data at d. This results in another problem: some parts of the code assume that d is cleared (set to zero in start.S), but what start.S actually does is clear the new location (z in the picture below).
Wait a minute. No parts of the code assume BSS is *cleared*, or at least no part of the code *should ever* assume that. BSS is not "zeroed data", it is "uninitialized data".
That's true for normal C, but I assume that is not true for u-boot.
You mean the rule is not respected for u-boot? then you should point out (or better yet, submit a patch to fix) parts of code which assume BSS is zero. Besides... U-boot *is* normal C (apart from the part before relocation where constraints prevent the use of any variable other than GD).
This reminds me of some problems I had a long, long time ago when switching from debug to optimized code using vc++. I don't know if it is still true (this was more than 10 years ago), but in those days vc++ preset all uninitialized data to zero when optimization was turned off.
If the code in u-boot did not assume that bss is zero, I don't understand why clear_bss in start.S exists.
As I said, out of courtesy.
Still, BSS zeroing does not seem to relate to what you witness. You're not reading a variable that you think should be zero; you're writing then reading a BSS variable, and find that you read something different from what you wrote.
BTW, can you add printfs() with the nand init functions to see where it writes? As it uses a pointer, if this pointer is not passed correctly for some reason, we'll be able to see.
Regards,
Alexander
Amicalement,

On 30.10.2010 15:57, Albert ARIBAUD wrote:
On 30/10/2010 15:45, Alexander Holler wrote:
If the code in u-boot did not assume that bss is zero, I don't understand why clear_bss in start.S exists.
As I said, out of courtesy.
Still, BSS zeroing does not seem to relate to what you witness. You're not reading a variable that you think should be zero; you're writing then reading a BSS variable, and find that you read something different from what you wrote.
I'm not doing anything. That's just what I've concluded from reading the existing code. First, clear_bss exists. Second, code in u-boot seems to assume that bss is cleared.
I've run into this problem because of the following check in nand_set_defaults() in nand_base.c:
-------------
	/* check, if a user supplied command function given */
	if (chip->cmdfunc == NULL)
		chip->cmdfunc = nand_command;
-------------
But the board-specific function which presets those values doesn't touch chip->cmdfunc (kirkwood_nand.c):
------------------
int board_nand_init(struct nand_chip *nand)
{
	nand->options = NAND_COPYBACK | NAND_CACHEPRG | NAND_NO_PADDING;
	nand->ecc.mode = NAND_ECC_SOFT;
	nand->cmd_ctrl = kw_nand_hwcontrol;
	nand->chip_delay = 30;
	nand->select_chip = kw_nand_select_chip;
	return 0;
}
------------------
And nothing else touches those nand_chip structures before they are used.
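To make the dependency explicit, here is a minimal sketch of that pattern outside of U-Boot. All names (demo_chip, demo_board_init, demo_set_defaults, demo_default_command) are hypothetical stand-ins, not the real nand_base.c or kirkwood_nand.c code. The point is only that the NULL check can install the fallback solely when the static object really starts out zeroed:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct nand_chip. */
struct demo_chip {
	int chip_delay;
	void (*cmdfunc)(struct demo_chip *chip, unsigned int command);
};

/* Generic fallback, playing the role of nand_command(). */
static void demo_default_command(struct demo_chip *chip, unsigned int command)
{
	(void)chip;
	(void)command;
}

/* Static storage duration, no initializer: this object is placed in BSS. */
static struct demo_chip demo_chips[1];

/* Board hook that, like board_nand_init() above, never touches cmdfunc. */
static int demo_board_init(struct demo_chip *chip)
{
	assert(chip != NULL);
	chip->chip_delay = 30;
	return 0;
}

/* Same pattern as the check in nand_set_defaults(). */
static void demo_set_defaults(struct demo_chip *chip)
{
	assert(chip != NULL);
	if (chip->cmdfunc == NULL)
		chip->cmdfunc = demo_default_command;
}

/* Returns 1 if the fallback was installed, which requires a zeroed BSS. */
static int demo_probe(void)
{
	demo_board_init(&demo_chips[0]);
	demo_set_defaults(&demo_chips[0]);
	return demo_chips[0].cmdfunc == demo_default_command;
}
```

If the memory behind demo_chips held leftover garbage instead of zeros, cmdfunc would be non-NULL and the fallback would silently never be installed.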
Regards,
Alexander

Dear Albert ARIBAUD,
In message 4CCC242C.8070303@free.fr you wrote:
You mean the rule is not respected for u-boot? then you should point out (or better yet, submit a patch to fix) parts of code which assume BSS is
It's a requirement of a standard C execution environment. BSS must _always_ be cleared.
Best regards,
Wolfgang Denk

On 30/10/2010 16:39, Wolfgang Denk wrote:
Dear Albert ARIBAUD,
In message 4CCC242C.8070303@free.fr you wrote:
You mean the rule is not respected for u-boot? then you should point out (or better yet, submit a patch to fix) parts of code which assume BSS is
It's a requirement of a standard C execution environment. BSS must _always_ be cleared.
Just re-checked the C99 specs, and yes, all static scope vars must be initialized, so I stand corrected as for BSS initialization. I still think, though, that one should not count on a BSS-allocated variable to be zero at program start, and if one wants a variable to be zero, one must initialize it explicitly.
Anyway, as I said, the nand_chip[] issue is not, I believe, related to initializing BSS, as the writing and reading both occur after setting up the C environment for running from RAM.
Best regards,
Wolfgang Denk
Amicalement,

Dear Albert ARIBAUD,
In message 4CCC4161.8000807@free.fr you wrote:
Just re-checked the C99 specs, and yes, all static scope vars must be initialized, so I stand corrected as for BSS initialization. I still think, though, that one should not count on a BSS-allocated variable to be zero at program start, and if one wants a variable to be zero, one must initialize it explicitly.
The C standard guarantees that. And U-Boot guarantees that, after relocation, we have a standard C execution environment, i. e. a zeroed BSS.
Anyway, as I said, the nand_chip[] issue is not, I believe, related to initializing BSS, as the writing and reading both occur after setting up the C environment for running from RAM.
Agreed.
Best regards,
Wolfgang Denk

Dear Alexander Holler,
In message 4CCC218E.706@ahsoftware.de you wrote:
Wait a minute. No parts of the code assume BSS is *cleared*, or at least no part of the code *should ever* assume that. BSS is not "zeroed data", it is "uninitialized data".
That's true for normal C, but I assume that is not true for u-boot.
It is true for U-Boot, too. After relocation, we provide a standard C execution environment, which includes a zeroed BSS.
Best regards,
Wolfgang Denk

Dear Albert ARIBAUD,
In message 4CCC1F6C.7040603@free.fr you wrote:
Wait a minute. No parts of the code assume BSS is *cleared*, or at least no part of the code *should ever* assume that. BSS is not "zeroed data", it is "uninitialized data".
BSS _is_ zeroed data. This is a very basic assumption of the C runtime.
BSS is not *required* to be zero. It is zeroed out as a courtesy, but
No, you are wrong here.
Zeroing of the BSS is a mandatory requirement.
Best regards,
Wolfgang Denk

Hello,
On 30.10.2010 16:36, Wolfgang Denk wrote:
Dear Albert ARIBAUD,
In message 4CCC1F6C.7040603@free.fr you wrote:
Wait a minute. No parts of the code assume BSS is *cleared*, or at least no part of the code *should ever* assume that. BSS is not "zeroed data", it is "uninitialized data".
BSS _is_ zeroed data. This is a very basic assumption of the C runtime.
BSS is not *required* to be zero. It is zeroed out as a courtesy, but
No, you are wrong here.
Zeroing of the BSS is a mandatory requirement.
Being curious, I've looked up that part in the C99 standard. Just to refresh our memories:
-------------------------------------
6.7.8
10
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
- if it has pointer type, it is initialized to a null pointer;
- if it has arithmetic type, it is initialized to (positive or unsigned) zero;
- if it is an aggregate, every member is initialized (recursively) according to these rules;
- if it is a union, the first named member is initialized (recursively) according to these rules.
-------------------------------------
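As a small illustration of what that paragraph guarantees on a conforming hosted implementation, consider the sketch below. The names (demo_pair, demo_counter, demo_ptr, demo_tab) are made up for the example. Whether the guarantee holds in U-Boot after relocation depends on clear_bss zeroing the memory the code actually uses, which is the question in this thread:

```c
#include <assert.h>
#include <stddef.h>

struct demo_pair {
	int value;
	const char *name;
};

static int demo_counter;             /* arithmetic type: starts at zero */
static char *demo_ptr;               /* pointer type: starts as a null pointer */
static struct demo_pair demo_tab[4]; /* aggregate: every member zero or null */

/* Returns 1 if all uninitialized static objects hold their guaranteed values. */
static int demo_all_zero(void)
{
	size_t i;

	if (demo_counter != 0 || demo_ptr != NULL)
		return 0;
	for (i = 0; i < sizeof(demo_tab) / sizeof(demo_tab[0]); i++)
		if (demo_tab[i].value != 0 || demo_tab[i].name != NULL)
			return 0;
	return 1;
}

/* Aborts if the startup code did not provide the C99 6.7.8/10 guarantee. */
static void demo_check(void)
{
	assert(demo_all_zero());
}
```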
Regards,
Alexander

Dear Alexander Holler,
In message 4CCD4C1C.2050603@ahsoftware.de you wrote:
Zeroing of the BSS is a mandatory requirement.
Being curious, I've looked up that part in the C99 standard. Just to refresh our memories:
6.7.8
10
If an object that has automatic storage duration is not initialized explicitly, its value is
Wrong section. We are not talking about variables with "automatic storage" (which means they are allocated on the stack), but about _static_ variables.
Best regards,
Wolfgang Denk

On 31/10/2010 12:58, Wolfgang Denk wrote:
Dear Alexander Holler,
In message 4CCD4C1C.2050603@ahsoftware.de you wrote:
Zeroing of the BSS is a mandatory requirement.
Being curious, I've looked up that part in the C99 standard. Just to refresh our memories:
6.7.8
10
If an object that has automatic storage duration is not initialized explicitly, its value is
Wrong section. We are not talking about variables with "automatic storage" (which means they are allocated on the stack), but about _static_ variables.
Actually I think this is the right section (and the only one to deal with initialization IIUC); it does indeed have this one sentence about automatic objects, but what follows is about static objects:
"[...] If an object that has static storage duration is not initialized explicitly, then: [...]"
Best regards,
Wolfgang Denk
Amicalement,

Hello,
On 31.10.2010 12:58, Wolfgang Denk wrote:
If an object that has automatic storage duration is not initialized explicitly, its value is
Wrong section. We are not talking about variables with "automatic storage" (which means they are allocated on the stack), but about _static_ variables.
You should have read just one sentence further: ;)
------
If an object that has static storage duration is not initialized explicitly, then:
...
------
Regards,
Alexander

Dear Albert ARIBAUD,
In message 4CCC1F6C.7040603@free.fr you wrote:
Could people with ARM NAND-based boards on the list test their own board with Alexander's DEBUG and printf() added, and see what this gives on their board?
Here the results from tx25:
U-Boot 2010.12-rc1-00026-g0c0892b-dirty (Oct 30 2010 - 16:55:43)
U-Boot code: 81200000 -> 812291A0 BSS: -> 812322A0
CPU: Freescale i.MX25 at 399 MHz
monitor len: 000322A0
ramsize: 02000000
TLB table at: 81ff0000
Top of RAM usable for U-Boot at: 81ff0000
Reserving 200k for U-Boot at: 81fbd000
Reserving 1024k for malloc() at: 81ebd000
Reserving 24 Bytes for Board Info at: 81ebcfe8
Reserving 92 Bytes for Global Data at: 81ebcf8c
New Stack Pointer is: 81ebcf88
RAM Configuration:
Bank #0: 80000000 32 MiB
relocation Offset is: 00dbd000
monitor flash len: 000291A0
Now running in RAM - U-Boot at: 81fbd000
NAND: 128 MiB
## nand_chip: 81fe91a0
Best regards,
Wolfgang Denk

On 30/10/2010 17:00, Wolfgang Denk wrote:
U-Boot 2010.12-rc1-00026-g0c0892b-dirty (Oct 30 2010 - 16:55:43)
I assume this is the latest master, right?
U-Boot code: 81200000 -> 812291A0 BSS: -> 812322A0
relocation Offset is: 00dbd000
Now running in RAM - U-Boot at: 81fbd000
## nand_chip: 81fe91a0
This is consistent. What commit exactly did you compile, and what toolchain did you use?
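For reference, "consistent" here is plain address arithmetic on the values in the log: the link-time start plus the relocation offset gives the reported run-time address, and nand_chip falls inside the BSS range shifted by the same offset. A minimal sketch follows. The helper names are hypothetical, and reading "U-Boot code: A -> B BSS: -> C" as BSS spanning B to C is my interpretation of the log format:

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the tx25 log above. */
#define TX25_TEXT_START 0x81200000u
#define TX25_BSS_START  0x812291A0u
#define TX25_BSS_END    0x812322A0u
#define TX25_RELOC_OFF  0x00dbd000u
#define TX25_NAND_CHIP  0x81fe91a0u

/* Apply the relocation offset to a link-time address. */
static uint32_t demo_relocate(uint32_t addr, uint32_t offset)
{
	assert(addr <= UINT32_MAX - offset); /* no wrap-around */
	return addr + offset;
}

/* Half-open range test: start <= addr < end. */
static int demo_in_range(uint32_t addr, uint32_t start, uint32_t end)
{
	return addr >= start && addr < end;
}

/* Returns 1 if the log values agree with each other. */
static int demo_tx25_is_consistent(void)
{
	uint32_t bss_start = demo_relocate(TX25_BSS_START, TX25_RELOC_OFF);
	uint32_t bss_end = demo_relocate(TX25_BSS_END, TX25_RELOC_OFF);

	return demo_relocate(TX25_TEXT_START, TX25_RELOC_OFF) == 0x81fbd000u &&
	       demo_in_range(TX25_NAND_CHIP, bss_start, bss_end);
}
```

With these numbers the relocated BSS spans 81fe61a0 to 81fef2a0, and 81fe91a0 lies inside it.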
Best regards,
Wolfgang Denk
Amicalement,

Dear Albert ARIBAUD,
In message 4CCC5406.6050600@free.fr you wrote:
U-Boot 2010.12-rc1-00026-g0c0892b-dirty (Oct 30 2010 - 16:55:43)
I assume this is the latest master, right?
U-Boot code: 81200000 -> 812291A0 BSS: -> 812322A0
relocation Offset is: 00dbd000
Now running in RAM - U-Boot at: 81fbd000
## nand_chip: 81fe91a0
This is consistent. What commit exactly did you compile, and what toolchain did you use?
I used ELDK 4.2 in the arm-linux- setup, and the commit is visible in the version line above: it's 0c0892b (the "dirty" comes from adding the DEBUG in board.c and the printf() in nand.c).
Best regards,
Wolfgang Denk

Hello Wolfgang,
Wolfgang Denk wrote:
Dear Albert ARIBAUD,
In message 4CCC1F6C.7040603@free.fr you wrote:
Could people with ARM NAND-based boards on the list test their own board with Alexander's DEBUG and printf() added, and see what this gives on their board?
Here the results from tx25:
Thanks for doing this. I also made such a log for Alexander on 27.10.2010 in u-boot%irc.freenode.org (couldn't find this in the history :-( ) after we found out that on his board nand_chip resides at the old address from before relocation ...
and I told him that it works fine on this board, and ... we came to the opinion that it may be a toolchain issue ... IIRC he uses gcc-4.5x ... and he wanted to try ELDK-4.2 ...
U-Boot 2010.12-rc1-00026-g0c0892b-dirty (Oct 30 2010 - 16:55:43)
U-Boot code: 81200000 -> 812291A0 BSS: -> 812322A0
CPU: Freescale i.MX25 at 399 MHz
monitor len: 000322A0
ramsize: 02000000
TLB table at: 81ff0000
Top of RAM usable for U-Boot at: 81ff0000
Reserving 200k for U-Boot at: 81fbd000
Reserving 1024k for malloc() at: 81ebd000
Reserving 24 Bytes for Board Info at: 81ebcfe8
Reserving 92 Bytes for Global Data at: 81ebcf8c
New Stack Pointer is: 81ebcf88
RAM Configuration:
Bank #0: 80000000 32 MiB
relocation Offset is: 00dbd000
monitor flash len: 000291A0
Now running in RAM - U-Boot at: 81fbd000
NAND: 128 MiB
## nand_chip: 81fe91a0
Yep, that looks good, and I don't think it is a code problem.
Alexander, did you try (as you planned to) ELDK-4.2 with gcc-4.2.x? Are you sure your toolchain works correctly with -pie?
bye, Heiko

Hi All,
I still have the same problem with my non main-line mini6410 board (arm1176). I based my board support on newest u-boot with cleaned relocation code:
U-Boot 2010.12-rc1-00028-ga1f6774 (Oct 30 2010 - 17:44:20) for MINI6410
U-Boot code: 57E00000 -> 57E20B58 BSS: -> 57E26218
CPU: S3C6400@532MHz
Fclk = 532MHz, Hclk = 133MHz, Pclk = 66MHz (SYNC Mode)
Board: MINI6410
monitor len: 00026218
ramsize: 08000000
TLB table at: 57ff0000
Top of RAM usable for U-Boot at: 57ff0000
Reserving 152k for U-Boot at: 57fc9000
Reserving 1280k for malloc() at: 57e89000
Reserving 24 Bytes for Board Info at: 57e88fe8
Reserving 92 Bytes for Global Data at: 57e88f8c
New Stack Pointer is: 57e88f88
RAM Configuration:
Bank #0: 50000000 128 MiB
relocation Offset is: 001c9000
monitor flash len: 00020B58
Now running in RAM - U-Boot at: 57fc9000
Using default environment
Destroy Hash Table: 57e26100 table = (null)
Create Hash Table: N=67
INSERT: table 57e26100, filled 1/67 rv 57e89268 ==> name="bootdelay" value="3"
INSERT: table 57e26100, filled 2/67 rv 57e89274 ==> name="baudrate" value="115200"
INSERT: free(data = 57e89008)
INSERT: done
In: serial
Out: serial
Err: serial
Net: dm9000
### main_loop entered: bootdelay=3
### main_loop: bootcmd="<UNDEFINED>"
MINI6410 # help
Unknown command 'help' - try 'help'
MINI6410 #
It seems like the cmd table somehow isn't relocated or is corrupted. I tried changing TEXT_BASE, then the stack size, then the malloc size, but in all cases the result is the same. For now I use a non-standard nand_spl, which is only 10 lines of code to copy two nand pages to TEXT_BASE. I don't know if the gcc or binutils version could cause such a problem. Here are the versions of my tools:
$ arm-linux-gcc --version
arm-linux-gcc (Buildroot 2010.11-git) 4.4.5
$ arm-linux-ld --version
GNU ld (GNU Binutils) 2.20.1.20100303
Eric, do you still have the same problem of missing commands with your kirkwood board?
Darius.

On 30/10/2010 17:15, Darius Augulis wrote:
Hi All,
I still have the same problem with my non main-line mini6410 board (arm1176). I based my board support on newest u-boot with cleaned relocation code:
U-Boot 2010.12-rc1-00028-ga1f6774 (Oct 30 2010 - 17:44:20) for MINI6410
U-Boot code: 57E00000 -> 57E20B58 BSS: -> 57E26218
CPU: S3C6400@532MHz
Fclk = 532MHz, Hclk = 133MHz, Pclk = 66MHz (SYNC Mode)
Board: MINI6410
monitor len: 00026218
ramsize: 08000000
TLB table at: 57ff0000
Top of RAM usable for U-Boot at: 57ff0000
Reserving 152k for U-Boot at: 57fc9000
Reserving 1280k for malloc() at: 57e89000
Reserving 24 Bytes for Board Info at: 57e88fe8
Reserving 92 Bytes for Global Data at: 57e88f8c
New Stack Pointer is: 57e88f88
RAM Configuration:
Bank #0: 50000000 128 MiB
relocation Offset is: 001c9000
monitor flash len: 00020B58
Now running in RAM - U-Boot at: 57fc9000
Using default environment
Destroy Hash Table: 57e26100 table = (null)
Create Hash Table: N=67
INSERT: table 57e26100, filled 1/67 rv 57e89268 ==> name="bootdelay" value="3"
INSERT: table 57e26100, filled 2/67 rv 57e89274 ==> name="baudrate" value="115200"
INSERT: free(data = 57e89008)
INSERT: done
In: serial
Out: serial
Err: serial
Net: dm9000
### main_loop entered: bootdelay=3
### main_loop: bootcmd="<UNDEFINED>"
MINI6410 # help
Unknown command 'help' - try 'help'
MINI6410 #
It seems like the cmd table somehow isn't relocated or is corrupted. I tried changing TEXT_BASE, then the stack size, then the malloc size, but in all cases the result is the same. For now I use a non-standard nand_spl, which is only 10 lines of code to copy two nand pages to TEXT_BASE. I don't know if the gcc or binutils version could cause such a problem. Here are the versions of my tools:
$ arm-linux-gcc --version
arm-linux-gcc (Buildroot 2010.11-git) 4.4.5
$ arm-linux-ld --version
GNU ld (GNU Binutils) 2.20.1.20100303
Eric, do you still have the same problem of missing commands with your kirkwood board?
Darius.
This is not quite the same issue as Alexander has, right?
If your board has NAND support, can you try and reproduce his issue?
Amicalement,

Hello,
I've written a small patch to test whether relocation is working (at least as I understand it should work).
The patch is here:
http://lists.denx.de/pipermail/u-boot/2010-October/080798.html
This fails here with gcc 4.3.4 and 4.5.1 (and binutils 2.20.1). I currently have no other toolchain, so I can't test whether it works with e.g. gcc 4.2.x.
The output I'm getting currently here (without DEBUG in board.c) is:
---------------------------------------------
Marvell>> g 0x700000
## Starting application at 0x00700000 ...
U-Boot 2010.12-rc1-00036-g527491f (Oct 30 2010 - 21:38:17)
Seagate-DockStar
SoC: Kirkwood 88F6281_A0
DRAM: 128 MiB
(relocated) BSS is from 07fa8fa4 to 07fef0a0
&monitor_flash_len: 00759fc0
WARNING: relocation failed (&monitor_flash_len is outside reloctated BSS)!
NAND: 256 MiB
In: serial
Out: serial
Err: serial
Net: egiga0
88E1116 Initialized on egiga0
Hit any key to stop autoboot: 0
Unknown command 'run' - try 'help'
Unknown command 'usb' - try 'help'
Unknown command 'run' - try 'help'
Unknown command 'reset' - try 'help'
Marvell>>
---------------------------------------------
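The idea behind the warning is simply to ask whether the address of a known BSS variable lies inside the relocated BSS range. The following is only a sketch of that idea with hypothetical helper names, not the code from the patch linked above. The numbers are the ones printed in the log:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if addr lies in the half-open range [bss_start, bss_end). */
static int demo_addr_in_bss(uintptr_t addr, uintptr_t bss_start,
			    uintptr_t bss_end)
{
	assert(bss_start <= bss_end);
	return addr >= bss_start && addr < bss_end;
}

/* Values from the DockStar log: &monitor_flash_len vs. relocated BSS. */
static int demo_dockstar_relocated(void)
{
	return demo_addr_in_bss(0x00759fc0u, 0x07fa8fa4u, 0x07fef0a0u);
}
```

Here demo_dockstar_relocated() yields 0: the variable's address is still close to the load address 0x00700000 rather than anywhere near the top of RAM.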
Regards,
Alexander

Hello,
I've just tested a u-boot build made by Wolfgang Denk with ELDK 4.2 (using the HEAD of the current master with the same patches that I used).
Thanks for that, Wolfgang.
This one works without any problems. So it seems to be proven that it is a problem of the current relocation code in start.S in conjunction with a newer version of gcc.
Regards,
Alexander
On 30.10.2010 22:03, Alexander Holler wrote:
Hello,
I've written a small patch to test whether relocation is working (at least as I understand it should work).
The patch is here:
http://lists.denx.de/pipermail/u-boot/2010-October/080798.html
This fails here with gcc 4.3.4 and 4.5.1 (and binutils 2.20.1). I currently have no other toolchain, so I can't test whether it works with e.g. gcc 4.2.x.
The output I'm getting currently here (without DEBUG in board.c) is:
Marvell>> g 0x700000
## Starting application at 0x00700000 ...
U-Boot 2010.12-rc1-00036-g527491f (Oct 30 2010 - 21:38:17)
Seagate-DockStar
SoC: Kirkwood 88F6281_A0
DRAM: 128 MiB
(relocated) BSS is from 07fa8fa4 to 07fef0a0
&monitor_flash_len: 00759fc0
WARNING: relocation failed (&monitor_flash_len is outside reloctated BSS)!
NAND: 256 MiB
In: serial
Out: serial
Err: serial
Net: egiga0
88E1116 Initialized on egiga0
Hit any key to stop autoboot: 0
Unknown command 'run' - try 'help'
Unknown command 'usb' - try 'help'
Unknown command 'run' - try 'help'
Unknown command 'reset' - try 'help'
Marvell>>
Regards,
Alexander

Hello Alexander
Alexander Holler wrote:
I've just tested a u-boot build made by Wolfgang Denk with ELDK 4.2 (using the HEAD of the current master with the same patches that I used).
Thanks for that, Wolfgang.
This one works without any problems. So it seems to be proved, that it
Great! (And as we thought on 27.10.2010 ...)
is a problem of the current relocation code as found in start.S in conjunction with a newer version of gcc.
Yep, seems so ...
bye, Heiko

Hello Heiko,
-----Original Message-----
From: u-boot-bounces@lists.denx.de [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Heiko Schocher
Sent: Sunday, October 31, 2010 1:18 PM
To: Alexander Holler
Cc: Darius Augulis; u-boot@lists.denx.de
Subject: Re: [U-Boot] arm: wrong Relocation and not cleared BSS
Hello Alexander
Alexander Holler wrote:
I've just tested a u-boot build made by Wolfgang Denk with ELDK 4.2 (using the HEAD of the current master with the same patches that I used).
Thanks for that, Wolfgang.
This one works without any problems. So it seems to be proved, that it
Great! (And as we thought on 27.10.2010 ...)
is a problem of the current relocation code as found in start.S in conjunction with a newer version of gcc.
Yep, seems so ...
I am also facing similar issues with booting OMAP4430 SDP(Cortex-A9) and did some debugging.
I am using GCC 4.4.1.
I found some strange issues with the code generated by the compiler.
Looks like the following labels created in start.S do not work as intended. Please look at the header information and assembly listing generated by objdump.
*******************************************************************
Code:
*****
_rel_dyn_start_ofs:
	.word __rel_dyn_start - _start
_rel_dyn_end_ofs:
	.word __rel_dyn_end - _start
_dynsym_start_ofs:
	.word __dynsym_start - _start

Assembly listing:
*****************
80e8017c <_board_init_r_ofs>:
80e8017c:	00000748	.word	0x00000748
80e80180 <_rel_dyn_start_ofs>:
80e80180:	0002358c	.word	0x0002358c
80e80184 <_rel_dyn_end_ofs>:
80e80184:	0002358c	.word	0x0002358c
80e80188 <_dynsym_start_ofs>:
80e80188:	0002358c	.word	0x0002358c

Header dump:
************
u-boot: file format elf32-littlearm

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         000187b4  80e80000  80e80000  00008000  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       00005fde  80e987b4  80e987b4  000207b4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .interp       00000011  80e9e792  80e9e792  00026792  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .dynamic      00000080  80ea3500  80ea3500  0002b500  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  4 .dynsym       00000100  80ea358c  80ea358c  0002b58c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .dynstr       000000c2  80e9e7a3  80e9e7a3  000267a3  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .hash         00000054  80e9e868  80e9e868  00026868  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 .rel.dyn      00003c50  80e9e8bc  80e9e8bc  000268bc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  8 .data         00000ff4  80ea250c  80ea250c  0002a50c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  9 .got.plt      0000000c  80ea3580  80ea3580  0002b580  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 10 .u_boot_cmd   00000540  80ea368c  80ea368c  0002b68c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 11 .bss          00031220  80ea3bd0  80ea3bd0  0002bbcc  2**3
                  ALLOC
So, there seems to be a problem in the way 'as' assembles the labels.
I see the following 'warning' in the 'as' manual:
Machines with a 32-bit address space, but that do less than 32-bit addressing, require the following special treatment. If the machine of interest to you does 32-bit addressing (or doesn't require it; see Chapter 9 [Machine Dependencies], page 75), you can ignore this issue. In order to assemble compiler output into something that works, as occasionally does strange things to `.word' directives. Directives of the form `.word sym1-sym2' are often emitted by compilers as part of jump tables.
Therefore, when as assembles a directive of the form `.word sym1-sym2', and the difference between sym1 and sym2 does not fit in 16 bits, as creates a secondary jump table, immediately before the next label. This secondary jump table is preceded by a short-jump to the first byte after the secondary table. This short-jump prevents the flow of control from accidentally falling into the new table. Inside the table is a long-jump to sym2. The original `.word' contains sym1 minus the address of the long-jump to sym2.
If there were several occurrences of `.word sym1-sym2' before the secondary jump table, all of them are adjusted. If there was a `.word sym3-sym4', that also did not fit in sixteen bits, a long-jump to sym4 is included in the secondary jump table, and the .word directives are adjusted to contain sym3 minus the address of the long-jump to sym4; and so on, for as many entries in the original jump table as necessary.
**********************************************************
Looks like part of the issue is due to these labels.
However, there seems to be other problems. I noticed that the code is also not relocatable. It's making absolute references to the address before relocation.
This seems to be solved if I add -fPIE to the PLATFORM_CPPFLAGS.
However, I couldn't solve the problem with the labels yet. I will let you know if I could make some progress.
Br, Aneesh

Hi Heiko,
-----Original Message-----
From: u-boot-bounces@lists.denx.de [mailto:u-boot-bounces@lists.denx.de] On Behalf Of V, Aneesh
Sent: Tuesday, November 02, 2010 11:10 AM
To: hs@denx.de; Alexander Holler
Cc: Darius Augulis; u-boot@lists.denx.de
Subject: Re: [U-Boot] arm: wrong Relocation and not cleared BSS
Assembly listing:
80e8017c <_board_init_r_ofs>:
80e8017c:	00000748	.word	0x00000748
80e80180 <_rel_dyn_start_ofs>:
80e80180:	0002358c	.word	0x0002358c
80e80184 <_rel_dyn_end_ofs>:
80e80184:	0002358c	.word	0x0002358c
80e80188 <_dynsym_start_ofs>:
80e80188:	0002358c	.word	0x0002358c
I don't know if this was clear in the previous mail. Please note that last three labels have same value.
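To show why identical values look suspicious, here is a small sketch that recomputes the offsets one would expect from the section table in the header dump and compares them with the value found in the three .word slots. It assumes that _start sits at the .text VMA (0x80e80000) and that the start/end symbols are meant to bracket the .rel.dyn output section. Both are assumptions for illustration, and the macro and function names are made up:

```c
#include <assert.h>
#include <stdint.h>

#define OMAP4_START       0x80e80000u /* .text VMA, taken here as _start */
#define OMAP4_REL_DYN_VMA 0x80e9e8bcu /* .rel.dyn VMA from the header dump */
#define OMAP4_REL_DYN_SZ  0x00003c50u /* .rel.dyn size from the header dump */
#define OMAP4_DYNSYM_VMA  0x80ea358cu /* .dynsym VMA from the header dump */
#define OMAP4_OBSERVED    0x0002358cu /* value stored in all three slots */

/* Offset of a link-time address relative to the assumed _start. */
static uint32_t demo_offset_from_start(uint32_t vma)
{
	assert(vma >= OMAP4_START);
	return vma - OMAP4_START;
}
```

Under those assumptions the expected offsets are 0x1e8bc for the start of .rel.dyn, 0x2250c for its end and 0x2358c for .dynsym, so only the .dynsym offset matches what the listing shows.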
Best regards, Aneesh

On 02/11/2010 06:58, V, Aneesh wrote:
Hi Heiko,
-----Original Message-----
From: u-boot-bounces@lists.denx.de [mailto:u-boot-bounces@lists.denx.de] On Behalf Of V, Aneesh
Sent: Tuesday, November 02, 2010 11:10 AM
To: hs@denx.de; Alexander Holler
Cc: Darius Augulis; u-boot@lists.denx.de
Subject: Re: [U-Boot] arm: wrong Relocation and not cleared BSS
Assembly listing:
80e8017c <_board_init_r_ofs>:
80e8017c:	00000748	.word	0x00000748
80e80180 <_rel_dyn_start_ofs>:
80e80180:	0002358c	.word	0x0002358c
80e80184 <_rel_dyn_end_ofs>:
80e80184:	0002358c	.word	0x0002358c
80e80188 <_dynsym_start_ofs>:
80e80188:	0002358c	.word	0x0002358c
I don't know if this was clear in the previous mail. Please note that last three labels have same value.
Best regards, Aneesh
Aneesh,
This has been seen on tx25 and analyzed over the last few days. The root cause was found to be a change in the .rel.dyn handling by the compilers. See if the fixes in
http://article.gmane.org/gmane.comp.boot-loaders.u-boot/88156 http://article.gmane.org/gmane.comp.boot-loaders.u-boot/88157 http://article.gmane.org/gmane.comp.boot-loaders.u-boot/88158
can apply to your board; either the .lds fix as done in the first two patches, or the fix in the third one, or both.
Amicalement,

Hi Albert,
-----Original Message-----
From: Albert ARIBAUD [mailto:albert.aribaud@free.fr]
Sent: Tuesday, November 02, 2010 12:02 PM
To: V, Aneesh
Cc: hs@denx.de; Alexander Holler; Darius Augulis; u-boot@lists.denx.de
Subject: Re: arm: wrong Relocation and not cleared BSS
On 02/11/2010 06:58, V, Aneesh wrote:
Hi Heiko,
-----Original Message-----
From: u-boot-bounces@lists.denx.de [mailto:u-boot-bounces@lists.denx.de] On Behalf Of V, Aneesh
Sent: Tuesday, November 02, 2010 11:10 AM
To: hs@denx.de; Alexander Holler
Cc: Darius Augulis; u-boot@lists.denx.de
Subject: Re: [U-Boot] arm: wrong Relocation and not cleared BSS
Assembly listing:
80e8017c <_board_init_r_ofs>:
80e8017c:	00000748	.word	0x00000748
80e80180 <_rel_dyn_start_ofs>:
80e80180:	0002358c	.word	0x0002358c
80e80184 <_rel_dyn_end_ofs>:
80e80184:	0002358c	.word	0x0002358c
80e80188 <_dynsym_start_ofs>:
80e80188:	0002358c	.word	0x0002358c
I don't know if this was clear in the previous mail. Please note that last three labels have same value.
Best regards, Aneesh
Aneesh,
This has been seen on tx25 and analyzed over the last few days. The root cause was found to be a change in the .rel.dyn handling by the compilers. See if the fixes in
http://article.gmane.org/gmane.comp.boot-loaders.u-boot/88156 http://article.gmane.org/gmane.comp.boot-loaders.u-boot/88157 http://article.gmane.org/gmane.comp.boot-loaders.u-boot/88158
can apply to your board; either the .lds fix as done in the first two patches, or the fix in the third one, or both.
Thanks. This helps. I did the .lds change and it seems to be booting now.
However, I still can't explain my earlier observation, because even without this fix .rel.dyn had some content, and the offsets should have been different if I were to believe objdump.
Do you have any clue?
Best regards, Aneesh

While U-boot loads the Linux image, I have the following error. Do you have any suggestions on this?
detected lzma initramfs
initramfs: LZMA lc=3,lp=0,pb=2,dictSize=8388608,origSize=12677632
Bad page state in process 'swapper'
page:a8000000007b3418 flags:0x0000000000000000 mapping:0000000000000000 mapcount:-16711680 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Unwound Call Trace:
[<ffffffff80211dc8>] dump_stack+0x8/0x48
[<ffffffff80265c98>] bad_page+0x78/0xb0
[<ffffffff80266988>] get_page_from_freelist+0x230/0x488
[<ffffffff80266c44>] __alloc_pages+0x64/0x348
[<ffffffff8027b4ac>] __vmalloc_area_node+0x10c/0x230
[<ffffffff804741cc>] populate_rootfs+0x974/0xae0
[<ffffffff802007e4>] init+0x84/0x530
[<ffffffff8020db58>] kernel_thread_helper+0x10/0x18
Thanks! Shuyou

On 02/11/2010 08:37, sywang wrote:
While U-boot loads the Linux image, I have the following error. Do you have any suggestions on this?
detected lzma initramfs initramfs: LZMA lc=3,lp=0,pb=2,dictSize=8388608,origSize=12677632 Bad page state in process 'swapper' page:a8000000007b3418 flags:0x0000000000000000 mapping:0000000000000000 mapcount:-16711680 count:0 Trying to fix it up, but a reboot is needed Backtrace: Unwound Call Trace: [<ffffffff80211dc8>] dump_stack+0x8/0x48 [<ffffffff80265c98>] bad_page+0x78/0xb0 [<ffffffff80266988>] get_page_from_freelist+0x230/0x488 [<ffffffff80266c44>] __alloc_pages+0x64/0x348 [<ffffffff8027b4ac>] __vmalloc_area_node+0x10c/0x230 [<ffffffff804741cc>] populate_rootfs+0x974/0xae0 [<ffffffff802007e4>] init+0x84/0x530 [<ffffffff8020db58>] kernel_thread_helper+0x10/0x18
Thanks! Shuyou
Hi Shuyou,
This is not 'while u-boot loads Linux', this is 'while Linux boots after it was loaded by u-boot', because these messages are from Linux.
I'd say these messages typically occur when the image (kernel+ramfs) is corrupted. What does u-boot display before jumping to the Linux kernel?
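One way to test the corruption hypothesis is to compare the size and checksum of the file on the TFTP server with what the boot loader reports after the transfer. The following is only a host-side sketch in Python; the helper names are made up for illustration, and whether a given vendor build of u-boot offers a way to print a CRC32 of the loaded region is an assumption that has to be checked on the board.

```python
# Host-side sketch: size and CRC32 of the file served over TFTP, to compare
# with the values the boot loader reports after the transfer.
# Helper names are hypothetical.
import zlib


def size_and_crc32(data: bytes) -> tuple[int, int]:
    """Return (length in bytes, CRC32) of an image held in memory."""
    return len(data), zlib.crc32(data) & 0xFFFFFFFF


def describe(path: str) -> str:
    """Format size and CRC32 of the file at `path` for manual comparison."""
    with open(path, "rb") as f:
        size, crc = size_and_crc32(f.read())
    return f"{path}: {size} bytes (0x{size:x}), crc32 0x{crc:08x}"
```

The size printed by `describe()` can be compared directly with the "Bytes transferred" line of the TFTP output; a mismatch there already points at the transfer rather than at the kernel.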
Kind regards,

Albert,
Thanks for your reply. My log is shown below. You are right: the error information is from Linux. I guess that the parameters u-boot passes to Linux may not be right, but I don't know how to verify that.
TFTP from server 192.168.5.101; our IP address is 192.168.5.22
Filename 'mips.ari'.
Load address: 0x9f00000
Loading: #############################################
done
Bytes transferred = 2917360 (2c83f0 hex)
Image is not signed; verifying checksum... passed
do_tftpboot, Linux image has been verified: pass
do_tftpboot, going to do_bootoctlinux
octeon_phy_mem_block_free addr: 0x100000, size: 0x8000000
ELF file is 64 bit
block alloc called: req_size: 0x2b25e0, min_addr: 0xa00000, max_addr: 0x0, align: 0x0
Allocated memory for ELF segment: addr: 0xa00000, size 0x2b25e0
block alloc called: req_size: 0xd0, min_addr: 0xcb25e0, max_addr: 0x0, align: 0x0
Allocated memory for ELF segment: addr: 0xcb25e0, size 0xd0
Loading .text @ 0x80a00000 (0x2b25b8 bytes)
Clearing .bss @ 0x80cb25c0 (0x20 bytes)
Loading .data @ 0x80cb25e0 (0x30 bytes)
Loading .MIPS.options @ 0x80cb2610 (0xa0 bytes)
naddr 2
addr vec 0, 0x2b25e0 @ 0xa00000
addr vec 1, 0xd0 @ 0xcb25e0
## Loading OS kernel with entry point: 0x80a00000 ...
block alloc called: req_size: 0x77, min_addr: 0x0, max_addr: 0x7fffffff, align: 0x0
block alloc called: req_size: 0x190, min_addr: 0x0, max_addr: 0x7fffffff, align: 0x0
block alloc called: req_size: 0x98, min_addr: 0x0, max_addr: 0x7fffffff, align: 0x0
board type is: 11, CN3010_EVB_HS5
stack expected: 0x0, actual: 0x0
heap_base expected: 0x0, actual: 0x0
heap_top expected: 0x0, actual: 0x0
Entry point (virt): 0x80a00000
Address of start app: 0xffffffff80096d90
Bootloader: Done loading app on coremask: 0x1
octeon_phy_mem_block_free addr: 0x9f00000, size: 0x6000000
octeon_phy_mem_block_free addr: 0x8100000, size: 0x3200
octeon_phy_mem_block_free addr: 0x8103200, size: 0x20000
octeon_phy_mem_block_free addr: 0x8123200, size: 0x32000
do_bootoctlinux, going to start_cores
Bringing coremask: 0x1 out of reset!
Address of start app: 0xffffffff80070914
block alloc called: req_size: 0x330, min_addr: 0x0, max_addr: 0x40000000, align: 0x0
Bootloader: Starting app at cycle: 0
Welcome to start_cores. (octeon_boot.c:1547)
start_cores, going to BOOT_VECTOR_BASE
app_start_func_addr 80096d90
==== start_linux ====
printf_boot_init_vector
stack_addr:0x80062f58 code_addr: 0x80070568 k0_val:0x8003ff44 flags:0x0 boot_info_addr:0x800c2440 pad:0x0 pad2:0x0
printf_boot_info_block
entry_point:0x80a00000 boot_desc_addr: 0x100080 stack_top:0x0 exception_base:0x1000 cvmx_desc_addr:0x0 flags:0x0
Welcome to start_linux. (cmd_octeon_linux.c:596)
Uncompressing..
Welcome to start_kernel. (init/main.c:458)
XXXX Networks
XXXXOS Version (build 0000 / label #wangsy@-ENG.0000)
Built by wangsy@localhost on 2010-11-02 at 15:09:36 CST (gcc version 3.4.5 Cavium Networks Version: 1.4.0, build 58)
Welcome to start_kernel. (init/main.c:472)
prom_init(arch/mips/cavium-octeon/setup.c:783) arcs_cmdline: console=ttyS0,9600
prom_init(arch/mips/cavium-octeon/setup.c:790) para[0]: bootoctlinux
prom_init(arch/mips/cavium-octeon/setup.c:790) para[1]: 9f00200
prom_init(arch/mips/cavium-octeon/setup.c:790) para[2]: bootver= 1.1.4.0/wangsy@-ENG.0000
CVMSEG size: 2 cache lines (256 bytes)
Setting flash physical map for 4MB flash at 0x1f800000
Determined physical RAM map:
Welcome to start_kernel. (init/main.c:474)
Kernel command line: console=ttyS0,9600 rdinit=/sbin/init
Welcome to start_kernel. (init/main.c:505)
Primary instruction cache 32kB, virtually tagged, 4 way, 64 sets, linesize 128 bytes.
Primary data cache 16kB, 64-way, 2 sets, linesize 128 bytes.
Welcome to start_kernel. (init/main.c:518)
Using 500.000 MHz high precision timer. cycles_per_jiffy=1000000
Welcome to start_kernel. (init/main.c:532)
Memory: 57344k/65536k available (1918k kernel code, 8144k reserved, 561k data, 2172k init, 0k highmem)
Calibrating delay using timer specific routine.. 1000.32 BogoMIPS (lpj=1000323) available.
Checking for the multiply/shift bug... no.
Checking for the daddi bug... no.
Checking for the daddiu bug... no.
Welcome to start_kernel. (init/main.c:616)
Welcome to rest_init. (init/main.c:396)
Welcome to schedule. (kernel/sched.c:2889)
Welcome to schedule. (kernel/sched.c:3065)
detected lzma initramfs
initramfs: LZMA lc=3,lp=0,pb=2,dictSize=8388608,origSize=12677632
Bad page state in process 'swapper'
page:a8000000007b3418 flags:0x0000000000000000 mapping:0000000000000000 mapcount:-16711680 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Unwound Call Trace:
[<ffffffff80211dc8>] dump_stack+0x8/0x48
[<ffffffff80265c98>] bad_page+0x78/0xb0
[<ffffffff80266988>] get_page_from_freelist+0x230/0x488
[<ffffffff80266c44>] __alloc_pages+0x64/0x348
[<ffffffff8027b4ac>] __vmalloc_area_node+0x10c/0x230
[<ffffffff804741cc>] populate_rootfs+0x974/0xae0
[<ffffffff802007e4>] init+0x84/0x530
[<ffffffff8020db58>] kernel_thread_helper+0x10/0x18
Thanks! Shuyou

Dear "sywang",
In message 20101102081321.0571D28090@theia.denx.de you wrote:
Image is not signed; verifying checksum... passed do_tftpboot, Linux image has been verified: pass do_tftpboot, going to do_bootoctlinux octeon_phy_mem_block_free addr: 0x100000, size: 0x8000000 ELF file is 64 bit block alloc called: req_size: 0x2b25e0, min_addr: 0xa00000, max_addr:
...
You are using a proprietary version of U-Boot, where the source code is not available to the public.
We have no idea what Cavium might have changed in U-Boot (obviously they have changed a lot), or what their interface requirements are.
Please take all Cavium / Octeon related questions to Cavium technical support. The U-Boot community CANNOT help you with that as we do not have ANY information about that code.
Best regards,
Wolfgang Denk

Dear "sywang",
In message 20101102073757.0C5E828093@theia.denx.de you wrote:
While U-boot loads the Linux image, I have the following error. Do you have any suggestions on this?
detected lzma initramfs initramfs: LZMA lc=3,lp=0,pb=2,dictSize=8388608,origSize=12677632 Bad page state in process 'swapper' page:a8000000007b3418 flags:0x0000000000000000 mapping:0000000000000000 mapcount:-16711680 count:0 Trying to fix it up, but a reboot is needed Backtrace: Unwound Call Trace: [<ffffffff80211dc8>] dump_stack+0x8/0x48 [<ffffffff80265c98>] bad_page+0x78/0xb0 [<ffffffff80266988>] get_page_from_freelist+0x230/0x488 [<ffffffff80266c44>] __alloc_pages+0x64/0x348 [<ffffffff8027b4ac>] __vmalloc_area_node+0x10c/0x230 [<ffffffff804741cc>] populate_rootfs+0x974/0xae0 [<ffffffff802007e4>] init+0x84/0x530 [<ffffffff8020db58>] kernel_thread_helper+0x10/0x18
These are Linux error messages. This is the U-Boot mailing list.
Best regards,
Wolfgang Denk

Hi Wolfgang,
Although the error is reported by Linux, Linux is booted by u-boot, so I think the issue is at least related to u-boot.
I believe there are people in our community who have fixed issues like mine, so I sent this SOS email. Thanks for your understanding.
Thanks! Shuyou

On 02/11/2010 08:18, V, Aneesh wrote:
Thanks. This helps. I did the .lds change and it seems to be booting now.
Good!
However, I still can't explain my earlier observation, because even without this fix .rel.dyn had some content, and the offsets should have been different if I were to believe objdump.
Do you have any clue?
There were two issues:
First, "older" linkers always emitted input relocation sections with the name ".rel.dyn" whereas "newer" linkers emitted them with names of the form ".rel*". As our linker files only took ".rel.dyn" to form the output section, using newer linkers would produce empty .rel.dyn sections.
Second, a fix to the first issue was RFCed to the list which worked on several boards but tx25 would not boot completely. The root cause of this second issue is that CONFIG_SYS_NAND_U_BOOT_SIZE in the board config file hard-codes the size of the u-boot binary that will be read from NAND and put in RAM. When/if u-boot grows in size, this constant must be adjusted, and it was not.
What hit you was the first issue for sure, and this explains why _rel_dyn_start_ofs and _rel_dyn_end_ofs are identical. What *could* have hit you was the second issue *if* your board boots from NAND *and* if u-boot grew beyond your CONFIG_SYS_NAND_U_BOOT_SIZE.
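To make the symptom concrete: the relocation fixup walks the entries between the two offsets, so identical offsets mean that nothing gets fixed up. The following is only a sketch in Python, assuming 8-byte Elf32_Rel entries (two 32-bit words, as used on 32-bit ARM); the function name is hypothetical, and the numbers are the ones from the assembly listing and the objdump output posted in this thread.

```python
# Illustrative only: models how many relocation entries a fixup loop
# bounded by two offsets would visit.
# Assumption: each entry is an Elf32_Rel record, i.e. two 32-bit words.
REL_ENTRY_SIZE = 8


def rel_entry_count(start_ofs: int, end_ofs: int) -> int:
    """Number of Elf32_Rel entries between two offsets."""
    if end_ofs < start_ofs:
        raise ValueError("end offset lies before start offset")
    span = end_ofs - start_ofs
    if span % REL_ENTRY_SIZE:
        raise ValueError("span is not a whole number of entries")
    return span // REL_ENTRY_SIZE


# Offsets from the assembly listing: both labels hold 0x0002358c.
print(rel_entry_count(0x0002358C, 0x0002358C))  # prints 0
# Same start, but an end that spans the 0x3c50-byte .rel.dyn from objdump -h.
print(rel_entry_count(0x0002358C, 0x0002358C + 0x3C50))  # prints 1930
```

With zero entries processed, the code runs at its new address while its absolute references still point at the old location.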
BTW, Wolfgang, couldn't this 'constant' be generated once objcopy has produced u-boot.bin? A script could 'du' its size, round it up as required, and generate a .h with the result so that the SPL would always compile with a correct value.
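The script idea could look roughly like the sketch below, written in Python rather than with 'du'. The rounding granularity `ALIGN`, the function names and the output file name are all assumptions for illustration; the real rounding requirement depends on the NAND device and on how the SPL reads it.

```python
# Sketch of the idea: derive the size constant from the built binary.
# ALIGN and the file names are assumptions, not existing U-Boot conventions.
import os
import sys

ALIGN = 0x800  # hypothetical rounding granularity (e.g. one NAND page)


def round_up(size: int, align: int) -> int:
    """Round size up to the next multiple of align."""
    if align <= 0:
        raise ValueError("align must be positive")
    return (size + align - 1) // align * align


def render_header(size: int, align: int = ALIGN) -> str:
    """Return the text of a header defining the rounded-up size."""
    return (
        "/* Generated file, do not edit. */\n"
        f"#define CONFIG_SYS_NAND_U_BOOT_SIZE 0x{round_up(size, align):x}\n"
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # usage: gen_size.py u-boot.bin generated_size.h
    with open(sys.argv[2], "w") as out:
        out.write(render_header(os.path.getsize(sys.argv[1])))
```

Such a step would have to run after objcopy has produced u-boot.bin and before the SPL is compiled, so the build order matters.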
Best regards, Aneesh
Kind regards,

Albert,
-----Original Message-----
From: Albert ARIBAUD [mailto:albert.aribaud@free.fr]
Sent: Tuesday, November 02, 2010 1:11 PM
To: V, Aneesh
Cc: Darius@theia.denx.de; hs@denx.de; u-boot@lists.denx.de; Augulis
Subject: Re: arm: wrong Relocation and not cleared BSS
On 02/11/2010 08:18, V, Aneesh wrote:
Thanks. This helps. I did the .lds change and it seems to be booting now.
Good!
However, I still can't explain my earlier observation, because even without this fix .rel.dyn had some content, and the offsets should have been different if I were to believe objdump.
Do you have any clue?
There were two issues:
First, "older" linkers always emitted input relocation sections with the name ".rel.dyn" whereas "newer" linkers emitted them with names of the form ".rel*". As our linker files only took ".rel.dyn" to form the output section, using newer linkers would produce empty .rel.dyn sections.
My .rel.dyn was not empty even before taking your fix.
This is what puzzled me. When I dumped the ELF headers with 'objdump -h' .rel.dyn seemed to have some content: Please see the diff of objdump's output before and after applying your patch.
Please note that .rel.dyn was there even without the patch having the same size but at a different offset.
So, this is what it looks like to me: Even when our rule in .lds was not correct the linker generated .rel.dyn section by default putting together the 'rel*' sections that were not covered by any rules. But since it didn't use our rule for this, the labels associated with our rule indicated zero size.
****************************************************************
@@ -9,7 +9,7 @@ Idx Name Size VMA LMA File off Algn
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .interp       00000011  80e9e6d0  80e9e6d0  000266d0  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
- 3 .dynamic      00000080  80ea343c  80ea343c  0002b43c  2**2
+ 3 .dynamic      00000080  80e9f7ec  80e9f7ec  000277ec  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  4 .dynsym       00000100  80ea34c8  80ea34c8  0002b4c8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
@@ -17,12 +17,12 @@ Idx Name Size VMA LMA File off Algn
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .hash         00000054  80e9e7a4  80e9e7a4  000267a4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
- 7 .rel.dyn      00003c50  80e9e7f8  80e9e7f8  000267f8  2**2
-                 CONTENTS, ALLOC, LOAD, READONLY, DATA
- 8 .data         00000ff4  80ea2448  80ea2448  0002a448  2**2
+ 7 .data         00000ff4  80e9e7f8  80e9e7f8  000267f8  2**2
                  CONTENTS, ALLOC, LOAD, DATA
- 9 .got.plt      0000000c  80ea34bc  80ea34bc  0002b4bc  2**2
+ 8 .got.plt      0000000c  80e9f86c  80e9f86c  0002786c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
+ 9 .rel.dyn      00003c50  80e9f878  80e9f878  00027878  2**2
+                 CONTENTS, ALLOC, LOAD, READONLY, DATA
 10 .u_boot_cmd   00000540  80ea35c8  80ea35c8  0002b5c8  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 11 .bss          00031210  80ea3b08  80ea3b08  0002bb08  2**3
****************************************************************
Second, a fix to the first issue was RFCed to the list which worked on several boards but tx25 would not boot completely. The root cause of this second issue is that CONFIG_SYS_NAND_U_BOOT_SIZE in the board config file hard-codes the size of the u-boot binary that will be read from NAND and put in RAM. When/if u-boot grows in size, this constant must be adjusted, and it was not.
What hit you was the first issue for sure, and this explains why _rel_dyn_start_ofs and _rel_dyn_end_ofs are identical. What *could* have hit you was the second issue *if* your board boots from NAND *and* if u-boot grew beyond your CONFIG_SYS_NAND_U_BOOT_SIZE.
We did not face the second issue because we are loading the entire u-boot.bin using a separate preloader.
BTW, Wolfgang, couldn't this 'constant' be generated once objcopy has produced u-boot.bin? A script could 'du' its size, round it up as required, and generate a .h with the result so that the SPL would always compile with a correct value.
Best regards, Aneesh
Kind regards,
Albert.
Best regards, Aneesh

On 02/11/2010 09:53, V, Aneesh wrote:
My .rel.dyn was not empty even before taking your fix.
This is what puzzled me. When I dumped the ELF headers with 'objdump -h' .rel.dyn seemed to have some content: Please see the diff of objdump's output before and after applying your patch.
Please note that .rel.dyn was there even without the patch having the same size but at a different offset.
So, this is what it looks like to me: Even when our rule in .lds was not correct the linker generated .rel.dyn section by default putting together the 'rel*' sections that were not covered by any rules. But since it didn't use our rule for this, the labels associated with our rule indicated zero size.
Correct: the linker dumped the unused .rel* input sections so you can see them, but they were outside of _rel_dyn_start / _rel_dyn_end, so for all useful purposes, we can say that .rel.dyn was empty. What was not empty was some other place in the binary that was not .rel.dyn as such.
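This mismatch can be checked mechanically by comparing the section size that 'objdump -h' reports with the span bounded by the labels. The following is a rough sketch in Python; the helper names are hypothetical, and the sample row and label values are taken from the outputs posted earlier in this thread.

```python
# Illustrative check: compare a section's size from `objdump -h` with the
# span bounded by the linker-script labels. Helper names are hypothetical.
def section_size(objdump_h: str, name: str) -> int:
    """Return the size of section `name` parsed from `objdump -h` text."""
    for line in objdump_h.splitlines():
        fields = line.split()
        # Section rows look like: Idx Name Size VMA LMA File-off Algn
        if len(fields) >= 7 and fields[0].isdigit() and fields[1] == name:
            return int(fields[2], 16)
    raise KeyError(name)


def labels_cover_section(start_ofs: int, end_ofs: int, size: int) -> bool:
    """True if the label range is exactly as large as the emitted section."""
    return end_ofs - start_ofs == size


SAMPLE = """\
  7 .rel.dyn      00003c50  80e9e7f8  80e9e7f8  000267f8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
"""

size = section_size(SAMPLE, ".rel.dyn")
print(hex(size))  # prints 0x3c50
# The labels from the listing are equal, so they bound zero bytes.
print(labels_cover_section(0x0002358C, 0x0002358C, size))  # prints False
```

A result of False is exactly the situation described above: the section has content, but the labels that the relocation code relies on do not enclose it.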
We did not face the second issue because we are loading the entire u-boot.bin using a separate preloader.
Ok.
Best regards, Aneesh
Kind regards,

Hello Alexander,
Alexander Holler wrote:
to give the topic a better meaning and to summarize what I think is currently happening along with some "pictures" for a better understanding:
We are starting with code (c) and data (d) somewhere in the memory:
|cd |
The relocation in start.S should achieve this:
| cd|
Yep, and this works fine on the boards I have access to (arm1136 qong, arm926ejs tx25, suen3 arm926ejs kirkwood, mx25 magnesium, armv7 omap3_beagle).
That means code and data should be moved upwards. What currently is happening is the following:
| d c |
really?
On 27.10.2010 I posted you a log from the tx25 in u-boot%irc.freenode.org (couldn't find this in the history :-( ) where this works fine, and we came to the opinion that you may have problems with your toolchain! IIRC you use gcc-4.5x ... Alexander, did you try (as you planned to) ELDK-4.2 with gcc-4.2.x?
Are you sure your toolchain works correctly with -pie?
The code is moved upwards, but that code still uses the data at d. This results in another problem: Some parts in the code are assuming that d is cleared (set to zero in start.S). But what start.S does is to clear the new location (z in the picture below).
| d cz|
which is OK.
Because the code (c) still uses the data (bss) in d and not in z, some hard to find errors might occur because the used data isn't set to zero as required.
Yep, and that is what you (we?) have to find out: why this does not work with your toolchain!
I have almost no knowledge about how gcc and the binutils are handling relocation, therefore I can't help much further here. What I think is part of the problem, is that -fPIC was removed. Using -pie in LDFLAGS might be used to get relocatable code, but the data will not be relocated. And I would wonder if that is possible without instructing the compiler to build stuff for relocation (-fPIC).
Try to find out why -pie does not work with your toolchain!
bye, Heiko
participants (7)
- Albert ARIBAUD
- Alexander Holler
- Darius Augulis
- Heiko Schocher
- sywang
- V, Aneesh
- Wolfgang Denk