[U-Boot] new uboot with relocation change cannot boot when download the bin file to different address than TEXT_BASE

Hi,
I recently tried to port our board code to the new U-Boot, which has switched to the new relocation scheme. I found something very strange: if U-Boot is loaded at the TEXT_BASE address, it runs without problems, but if it is loaded anywhere else, it fails to boot...
I checked the code and found that board_init_f calls the functions in init_sequence, a table that is stored in a data section of the u-boot.bin file. The new scheme builds with -fPIC, so the code can locate the GOT correctly, but it seems to overlook that the GOT entries hold addresses that are only meaningful at TEXT_BASE, not at the actual load address. That is, if TEXT_BASE is 0xf00000 and the load address is 0x500000, I find the GOT filled with 0xf0**** values, not 0x50****. As a result the CPU loads wrong function addresses from the init_sequence table, and the PC becomes invalid...
Am I missing something in switching to the new relocation scheme?
Thanks, Lei
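
The address mismatch reported above can be sketched in C. This is only an illustration, assuming 32-bit addresses; the helper name link_to_load_addr is hypothetical and is not a U-Boot function. It shows what a table entry fixed at link time (relative to TEXT_BASE) would have to become for an image that actually sits at a different load address:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * link_addr : address stored in a GOT or function-pointer table at link time
 * link_base : address the image was linked for (TEXT_BASE in the report)
 * load_base : address the image was actually loaded at
 * Returns the address where the same object lives in the loaded image.
 */
uint32_t link_to_load_addr(uint32_t link_addr, uint32_t link_base,
                           uint32_t load_base)
{
    /* The entry must point inside the image, i.e. at or above its base. */
    assert(link_addr >= link_base);

    /* Keep the offset within the image, swap the base. */
    return (link_addr - link_base) + load_base;
}
```

With the numbers from the report, an entry of 0xf01234 linked for 0xf00000 would have to read 0x501234 once the image sits at 0x500000. A table that still holds 0xf0**** values has not had this adjustment applied.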

On 09/10/10 16:50, Lei Wen wrote:
Am I missing something to switch to the new relocation scheme?
The ARM relocation schemes (there are two being evaluated: one which uses .got and another which uses .rel.dyn) are both designed to relocate from TEXT_BASE to an upper memory location determined during DRAM init.
x86 is the only arch that I know of that implements what you are describing (in a patch series in u-boot-x86/master, but not yet in mainline).
Regards,
Graeme

On Sat, Oct 9, 2010 at 2:27 PM, Graeme Russ graeme.russ@gmail.com wrote:
The ARM relocation schemes (there are two being evaluated: one which uses .got and another which uses .rel.dyn) are both designed to relocate from TEXT_BASE to an upper memory location determined during DRAM init.
x86 is the only arch that I know of that implements what you are describing (in a patch series in u-boot-x86/master, but not yet in mainline).
Glad to learn this. :-) But it means U-Boot loses a good feature: being able to boot itself in order to debug a new feature. In the original scheme, the relocation was done before calling init_sequence, so this problem did not arise.
So with the new relocation scheme currently in mainline, can we only load u-boot.bin to TEXT_BASE in RAM and transfer control to it? If so, the relocation seems to have little point here: in our past experience, relocation was needed precisely because U-Boot could start from a different place and had to copy the rest of its code to TEXT_BASE at the very beginning.
Thanks, Lei

Dear Lei Wen,
In message AANLkTi==rkiYsGNFYiy2nO_w4uSw=Da8stZjcHt14Q9J@mail.gmail.com you wrote:
But it means U-Boot loses a good feature: being able to boot itself in order to debug a new feature.
This has never been supported, and has never worked reliably: not across architectures, and not on all ARM systems.
In the original scheme, the relocation was done before calling init_sequence, so this problem did not arise.
No. Before, there was no relocation done at all, because U-Boot was linked to a fixed RAM address and was simply copied there.
You were just lucky when you were able to start U-Boot from an already running system.
Best regards,
Wolfgang Denk

Hi Wolfgang,
On Sun, Oct 10, 2010 at 1:50 AM, Wolfgang Denk wd@denx.de wrote:
No. Before, there was no relocation done at all, because U-Boot was linked to a fixed RAM address and was simply copied there.
You were just lucky when you were able to start U-Boot from an already running system.
Well, I also think I was lucky enough. :) But I am still a little confused: even in the new scheme, there is a relocate_code at the end of board_init_f, which is exactly the same as the original "simply copied" code. How does relocation in the new scheme differ from the old simple copy? Does it have more features, or fix some bug?
Thanks, Lei

On 10/10/2010 06:33, Lei Wen wrote:
But I am still a little confused: even in the new scheme, there is a relocate_code at the end of board_init_f, which is exactly the same as the original "simply copied" code. How does relocation in the new scheme differ from the old simple copy? Does it have more features, or fix some bug?
Before relocation, for ARM926 we used to set TEXT_BASE to the RAM address where we wanted u-boot to end up once relocated, and the first thing we did was initialize RAM, move the code and jump there; so we did not really relocate the code, just moved it -- no fixups.
The new relocation allows putting u-boot as high in RAM as possible without computing this location by hand, which would be tedious (for fixed-size RAM targets) or impossible (for targets with a variable amount of RAM). But this meant u-boot might run at different places, which in turn meant truly relocating -- with fixups.
In the process, TEXT_BASE for ARM went back to what it should have been all along in order to converge with other platforms, i.e. the FLASH, not RAM, target address.
The general goal is to be able to have a single binary running on several HW variants.
Thanks, Lei
You're welcome.
Kind regards,
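
The difference between moving and truly relocating can be sketched as follows. This is a simplified illustration in C, not U-Boot's actual relocate_code: the function names, the list of fixup offsets, and the 32-bit word size are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Old behaviour (sketch): the image is linked for dst, so a plain copy
 * is enough and no stored address needs to change. */
void move_image(uint8_t *dst, const uint8_t *src, size_t len)
{
    memmove(dst, src, len);
}

/* New behaviour (sketch): copy the image, then patch every 32-bit word
 * that holds an absolute address by the distance the image moved.
 * fixup_offsets lists the byte offsets of those words (hypothetical
 * format); delta is new_base - link_base. */
void relocate_image(uint8_t *dst, const uint8_t *src, size_t len,
                    const size_t *fixup_offsets, size_t nfixups,
                    uint32_t delta)
{
    memmove(dst, src, len);

    for (size_t i = 0; i < nfixups; i++) {
        uint32_t word;

        /* Each fixup must lie entirely inside the image. */
        assert(fixup_offsets[i] + sizeof(word) <= len);

        memcpy(&word, dst + fixup_offsets[i], sizeof(word));
        word += delta;  /* the "fixup" itself */
        memcpy(dst + fixup_offsets[i], &word, sizeof(word));
    }
}
```

Words that are not listed as fixups (plain data, position-independent code) are copied unchanged; only stored absolute addresses are adjusted.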

Dear Lei Wen,
In message AANLkTi=-Ppr=4eCqT2gkpp8Tb4XrtdWNyuauD-5xzj4w@mail.gmail.com you wrote:
Well, I also think I was lucky enough. :) But I am still a little confused: even in the new scheme, there is a relocate_code at the end of board_init_f, which is exactly the same as the original "simply copied" code. How does relocation in the new scheme differ from the old simple copy? Does it have more features, or fix some bug?
Just read the mail archives, or the commit messages, or read http://elinux.org/CELF_Project_Proposal/Rework_ARM_architecture_support_in_U...
Best regards,
Wolfgang Denk

On Sun, Oct 10, 2010 at 2:50 PM, Wolfgang Denk wd@denx.de wrote:
Just read the mail archives, or the commit messages, or read http://elinux.org/CELF_Project_Proposal/Rework_ARM_architecture_support_in_U...
Got it. :)
Thanks, Lei

On 09/10/2010 07:50, Lei Wen wrote:
Am I missing something to switch to the new relocation scheme?
Thanks, Lei
Can you indicate which hardware (architecture, cpu, SoC, etc) you're running this code on?
Kind regards,

Hi Albert,
On Sat, Oct 9, 2010 at 3:43 PM, Albert ARIBAUD albert.aribaud@free.fr wrote:
Can you indicate which hardware (architecture, cpu, SoC, etc) you're running this code on?
I am running the code on a Marvell Aspen SoC, which has an ARM926EJ-S compatible core.
Best regards, Lei

On 09/10/2010 09:53, Lei Wen wrote:
I am running the code on a Marvell Aspen SoC, which has an ARM926EJ-S compatible core.
For arm926, TEXT_BASE should be the FLASH location (if booting from NOR) or a location in DRAM (for NAND and other methods).
I have had little difficulty running the .got relocation code on a Marvell orion5x (arm926ejs too), except for some functions called from board_init_f which did not respect the general rule that code run before relocation should only access gd; one place in orion5x wrote to global variables, which was always a no-no and only happened to work because the arm926ejs init sequence did not run in proper order.
However, .got relocation has shortcomings of its own; mainly, it requires manual fixups in many places within the code. I have provided patches which replace .got relocation with ELF relocation (look up [ELF-RELOC] tags in the posts), which eliminates the need for any manual fixup; you may want to try this, as it might eventually replace the .got patches.
Best regards, Lei
Kind regards,

On 09/10/2010 10:10, Albert ARIBAUD wrote:
(one coffee later)
For arm926, TEXT_BASE should be the FLASH location (if booting from NOR) or a location in DRAM (for NAND and other methods).
... or a location in DRAM away from the top of RAM; relocation can be from RAM to RAM, as long as you define CONFIG_SKIP_LOWLEVEL_INIT and as long as you don't try to relocate to an overlapping area.
Kind regards,
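
The "no overlapping area" condition for a RAM-to-RAM relocation can be checked with simple arithmetic. A minimal sketch in C, assuming source and destination regions of equal length and 32-bit addresses; reloc_regions_overlap is a hypothetical helper, not a U-Boot function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns true if [src, src + len) and [dst, dst + len) share any byte.
 * Two regions of the same length overlap exactly when their start
 * addresses are less than len apart.
 */
bool reloc_regions_overlap(uint32_t src, uint32_t dst, uint32_t len)
{
    /* A zero-length image would be a caller bug. */
    assert(len > 0);

    uint32_t gap = (src > dst) ? (src - dst) : (dst - src);
    return gap < len;
}
```

For example, an image of 0x80000 bytes starting at 0x00f00000 does not overlap a destination at 0x07f80000, but it would overlap one at 0x00f40000.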

On Sat, Oct 9, 2010 at 4:10 PM, Albert ARIBAUD albert.aribaud@free.fr wrote:
For arm926, TEXT_BASE should be the FLASH location (if booting from NOR) or a location in DRAM (for NAND and other methods).
Yeah, got that. The TEXT_BASE of 0xf00000 in my case is exactly where I want U-Boot to run at run time.
I have had little difficulty running the .got relocation code on a Marvell orion5x (arm926ejs too), except for some functions called from board_init_f which did not respect the general rule that code run before relocation should only access gd; one place in orion5x wrote to global variables, which was always a no-no and only happened to work because the arm926ejs init sequence did not run in proper order.
Have you tried loading U-Boot to a different place with tftp or something else? When I load U-Boot at TEXT_BASE and run it, there is no issue either...
However, .got relocation has shortcomings of its own; mainly, it requires manual fixups in many places within the code. I have provided patches which replace .got relocation with ELF relocation (look up [ELF-RELOC] tags in the posts), which eliminates the need for any manual fixup; you may want to try this, as it might eventually replace the .got patches.
Glad to hear this. :-) But my problem is that before relocating, the new scheme calls init_sequence in board_init_f, while the function entries are static values fixed relative to TEXT_BASE at compile time. Could the ELF relocation give us a relative jump when calling through the init_sequence table?
Thanks, Lei

On 09/10/2010 10:24, Lei Wen wrote:
For arm926, TEXT_BASE should be the FLASH location (if booting from NOR) or a location in DRAM (for NAND and other methods).
Yeah, got that. The TEXT_BASE of 0xf00000 in my case is exactly where I want U-Boot to run at run time.
Watch out: TEXT_BASE does not define where u-boot will run, only where it will *start running*. With relocation, u-boot will run as high in RAM as can be.
I have had little difficulty running the .got relocation code on a Marvell orion5x (arm926ejs too), except for some functions called from board_init_f which did not respect the general rule that code run before relocation should only access gd; one place in orion5x wrote to global variables, which was always a no-no and only happened to work because the arm926ejs init sequence did not run in proper order.
Have you tried loading U-Boot to a different place with tftp or something else? When I load U-Boot at TEXT_BASE and run it, there is no issue either...
Not sure I understand what you mean here. U-boot is assumed to *start* located at TEXT_BASE, then be moved up in RAM, so there should *never* be issues with starting u-boot at its TEXT_BASE.
However, .got relocation has shortcomings of its own; mainly, it requires manual fixups in many places within the code. I have provided patches which replace .got relocation with ELF relocation (look up [ELF-RELOC] tags in the posts), which eliminates the need for any manual fixup; you may want to try this, as it might eventually replace the .got patches.
Glad to hear this. :-) But my problem is that before relocating, the new scheme calls init_sequence in board_init_f, while the function entries are static values fixed relative to TEXT_BASE at compile time. Could the ELF relocation give us a relative jump when calling through the init_sequence table?
The short answer is that relocation brings a way to fix up all the code and data for correct relocation.
Note however, that when board_init_f runs, relocation has not happened yet, but OTOH, board_init_f is running at TEXT_BASE, so no relocation is needed yet. We need relocation fixup only once the code is moved to top of DRAM, and we want to execute board_init_r there.
If you flash u-boot at its intended TEXT_BASE location in NOR FLASH, then any issue within board_init_f will probably be caused by the code trying to write to a global other than gd.
Thanks, Lei
You're welcome.
Kind regards,
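
How "as high in RAM as can be" might be computed is sketched below. This is a deliberately simplified example in C: a real bootloader also has to leave room for other areas near the top of RAM, and pick_reloc_addr, its parameters, and the 4 KiB alignment are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * ram_base  : first address of DRAM
 * ram_size  : DRAM size in bytes, known only after DRAM init
 * image_len : size of the image to relocate
 * Returns a destination address near the top of RAM, rounded down to a
 * 4 KiB boundary, so the result depends on how much RAM was detected.
 */
uint32_t pick_reloc_addr(uint32_t ram_base, uint32_t ram_size,
                         uint32_t image_len)
{
    /* The image has to fit in RAM at all. */
    assert(image_len <= ram_size);

    uint32_t addr = ram_base + ram_size;  /* one past the last RAM byte */
    addr -= image_len;                    /* make room for the image */
    addr &= ~(uint32_t)0xfff;             /* round down to 4 KiB */
    return addr;
}
```

The relocation offset is then the difference between this address and the address the image started running from, which is why the same binary can serve boards with different amounts of RAM.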

On Sat, Oct 9, 2010 at 4:57 PM, Albert ARIBAUD albert.aribaud@free.fr wrote:
Note however, that when board_init_f runs, relocation has not happened yet, but OTOH, board_init_f is running at TEXT_BASE, so no relocation is needed yet. We need relocation fixup only once the code is moved to top of DRAM, and we want to execute board_init_r there.
OK, now I see where our views differ... You mean that U-Boot should always be loaded at TEXT_BASE (for the NAND case, as you said), so when we get to board_init_f we don't need any relocation.
I just feel that arranging the code like this loses a good feature of the original: tftp U-Boot to a different place and use the go command to debug it.
Another question: why is the original implementation now called CONFIG_SYS_ARM_WITHOUT_RELOC? I think CONFIG_SYS_ARM_WITH_RELOC would be correct, since it does the relocation at the very beginning. :-p
If you flash u-boot at its intended TEXT_BASE location in NOR FLASH, then any issue within board_init_f will probably be caused by the code trying to write to a global other than gd.
Best regards, Lei

On 09/10/2010 11:08, Lei Wen wrote:
OK, now I see where our views differ... You mean that U-Boot should always be loaded at TEXT_BASE (for the NAND case, as you said), so when we get to board_init_f we don't need any relocation.
I just feel that arranging the code like this loses a good feature of the original: tftp U-Boot to a different place and use the go command to debug it.
This does not lose said feature. You can still disable relocation altogether by defining CONFIG_SKIP_RELOCATE_UBOOT (watch out then; you're responsible for linking u-boot for its intended *final* location).
Another question: why is the original implementation now called CONFIG_SYS_ARM_WITHOUT_RELOC?
It means "this ARM build is without relocation", which is what the option does.
I think CONFIG_SYS_ARM_WITH_RELOC would be correct, since it does the relocation at the very beginning. :-p
Read the code carefully. You'll see that the old code is used when CONFIG_SYS_ARM_WITHOUT_RELOC is defined, and the new (reloc) code is used when CONFIG_SYS_ARM_WITHOUT_RELOC is not defined.
Best regards, Lei
Kind regards,
participants (4)
- Albert ARIBAUD
- Graeme Russ
- Lei Wen
- Wolfgang Denk