Re: [U-Boot] Relocation size penalty calculation

On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 11/10/2009 12:47:19:
[Massive Snip :)]
So, all that is left are .dynsym and .dynamic ...
.dynsym
 - Contains 70 entries (16 bytes each, 1120 bytes)
 - 44 entries mimic those entries in .got which are not relocated
 - 21 entries are the remaining symbols exported from the linker script
 - 4 entries are labels defined in inline asm and used in C
Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these.
Thanks - Now .dynsym contains only exports from the linker script
- 1 entry is a NULL entry
.dynamic
 - 88 bytes
 - Array of Elf32_Dyn
 - typedef struct {
       Elf32_Sword d_tag;
       union {
           Elf32_Word d_val;
           Elf32_Addr d_ptr;
       } d_un;
   } Elf32_Dyn;
 - 0x11 entries
   [00] 0x00000010, 0x00000000 DT_SYMBOLIC, (ignored)
   [01] 0x00000004, 0x38059994 DT_HASH, points to .hash
   [02] 0x00000005, 0x380595AB DT_STRTAB, points to .dynstr
   [03] 0x00000006, 0x3805BDCC DT_SYMTAB, points to .dynsym
   [04] 0x0000000A, 0x000003E6 DT_STRSZ, size of .dynstr
   [05] 0x0000000B, 0x00000010 DT_SYMENT, ???
   [06] 0x00000015, 0x00000000 DT_DEBUG, ???
   [07] 0x00000011, 0x3805A8F4 DT_REL, points to .rel.text
   [08] 0x00000012, 0x000014D8 DT_RELSZ, ???
How big DT_REL is
[09] 0x00000013, 0x00000008 DT_RELENT, ???
hmm, cannot remember :)
How big an entry in DT_REL is
[0a] 0x00000016, 0x00000000 DT_TEXTREL, ???
Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that isn't truly PIC, so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit.
Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. The x86 does not allow direct access to the IP so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack
This is not a problem because this is very low-level init that is not called once relocated into RAM - These relocations can be safely ignored
   [0b] 0x6FFFFFFA, 0x00000236 ???, Entries in .rel.dyn
   [0c] 0x00000000, 0x00000000 DT_NULL, End of Array
   [0d] 0x00000000, 0x00000000 DT_NULL, End of Array
   [0e] 0x00000000, 0x00000000 DT_NULL, End of Array
   [0f] 0x00000000, 0x00000000 DT_NULL, End of Array
   [10] 0x00000000, 0x00000000 DT_NULL, End of Array
I think some more investigation into the need for .dynsym and .dynamic is still required...
.dynsym may still be required if only for accessing the __u_boot_cmd structure. However, I may be able to hack that a little and not create a __u_boot_cmd symbol in the linker script (create some other temporary symbol) and populate __u_boot_cmd with a valid value after relocation. It will look a little weird, but may mean not loading this section into RAM
Other than that, .dynsym is now only needed to locate the sections during the relocation phase and can be kept in flash and not copied to RAM
I don't think .dynamic is needed due to the exporting of section addresses from the linker script
Regards,
Graeme
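
For reference, the DT_REL / DT_RELSZ / DT_RELENT entries in the dump above can be read back with a short walk over the Elf32_Dyn array. This is only a sketch under stated assumptions: the ELF32 types are local stand-ins for what <elf.h> normally provides, the TAG_* names, struct rel_info, find_rel_info() and rel_count() are made up for illustration (they are not U-Boot code), and sample_dynamic holds only a few entries copied from the dump plus a terminator.

```c
#include <stdint.h>

/* Local stand-ins for the ELF32 types (normally provided by <elf.h>). */
typedef int32_t  Elf32_Sword;
typedef uint32_t Elf32_Word;
typedef uint32_t Elf32_Addr;

typedef struct {
    Elf32_Sword d_tag;
    union {
        Elf32_Word d_val;
        Elf32_Addr d_ptr;
    } d_un;
} Elf32_Dyn;

/* Tag values as they appear in the dump (illustrative names). */
enum { TAG_NULL = 0x00, TAG_REL = 0x11, TAG_RELSZ = 0x12, TAG_RELENT = 0x13 };

/* Hypothetical result record, not a U-Boot structure. */
struct rel_info {
    Elf32_Addr table;    /* DT_REL: address of the relocation table */
    Elf32_Word size;     /* DT_RELSZ: total size of the table in bytes */
    Elf32_Word entsize;  /* DT_RELENT: size of one entry in bytes */
};

/* Walk the array up to the first DT_NULL and pick out the REL tags. */
static struct rel_info find_rel_info(const Elf32_Dyn *dyn)
{
    struct rel_info info = { 0, 0, 0 };

    for (; dyn->d_tag != TAG_NULL; dyn++) {
        switch (dyn->d_tag) {
        case TAG_REL:    info.table   = dyn->d_un.d_ptr; break;
        case TAG_RELSZ:  info.size    = dyn->d_un.d_val; break;
        case TAG_RELENT: info.entsize = dyn->d_un.d_val; break;
        default:         break;  /* other tags are ignored here */
        }
    }
    return info;
}

/* Number of relocation entries: table size divided by entry size. */
static Elf32_Word rel_count(const struct rel_info *info)
{
    return info->entsize ? info->size / info->entsize : 0;
}

/* A few entries copied from the dump above, plus the terminator. */
static const Elf32_Dyn sample_dynamic[] = {
    { 0x10, { 0x00000000 } },  /* DT_SYMBOLIC */
    { 0x11, { 0x3805A8F4 } },  /* DT_REL */
    { 0x12, { 0x000014D8 } },  /* DT_RELSZ */
    { 0x13, { 0x00000008 } },  /* DT_RELENT */
    { 0x00, { 0x00000000 } },  /* DT_NULL */
};
```

With the values from the dump, rel_count() gives 0x14D8 / 8 = 667 entries.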

Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 11/10/2009 12:47:19:
Thanks - Now .dynsym contains only exports from the linker script
:)
[09] 0x00000013, 0x00000008 DT_RELENT, ???
hmm, cannot remember :)
How big an entry in DT_REL is
Right, how could I forget :)
[0a] 0x00000016, 0x00000000 DT_TEXTREL, ???
Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that isn't truly PIC, so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit.
Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. The x86 does not allow direct access to the IP so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack
Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important ATM.
Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.
I think some more investigation into the need for .dynsym and .dynamic is still required...
.dynsym may still be required if only for accessing the __u_boot_cmd structure. However, I may be able to hack that a little and not create a __u_boot_cmd symbol in the linker script (create some other temporary symbol) and populate __u_boot_cmd with a valid value after relocation. It will look a little weird, but may mean not loading this section into RAM
Why do you need to muck around with u_boot_cmd at all? Now that relocation works you should be able to drop all that code/linker stuff?
Other than that, .dynsym is now only needed to locate the sections during the relocation phase and can be kept in flash and not copied to RAM
Still occupies space in the *bin image though.

Joakim Tjernlund wrote:
Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.
I think this approach will turn out to be a big win. At present, the problem with just using the relocs is that objcopy is stripping them out when u-boot.bin is created, as I understand it. It seems this can be solved by changing the command switches appropriately, like using --strip-unneeded. In any case, there is some combination of switches that will preserve the relocation data. The executable code will get smaller, there will be no .got, and the relocation data will be larger (than with -fpic). In total size, it probably will be slightly smaller, but that is a guess. The most important benefit of this approach is that it will work for all architectures, thereby solving the problem once and for all! Even if the result is a bit larger, the RAM footprint will be reduced by the smaller object code size (since the relocation data need not be copied into RAM). Having this approach as an option would be real nice, since it would always "just work".
Best Regards, Bill Campbell
U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot

"J. William Campbell" jwilliamcampbell@comcast.net wrote on 13/10/2009 18:30:43:
I think this approach will turn out to be a big win. At present, the problem with just using the relocs is that objcopy is stripping them out when u-boot.bin is created, as I understand it. It seems this can be solved by changing the command switches appropriately, like using --strip-unneeded. In any case, there is some combination of switches that will preserve the relocation data. The executable code will get smaller, there will be no .got, and the relocation data will be larger (than with -fpic). In total size, it probably will be slightly smaller, but that is a guess. The most important benefit of this approach is that it will work for all architectures, thereby solving the problem once and for all! Even if the result is a bit larger, the RAM footprint will be reduced by the smaller object code size (since the relocation data need not be copied into RAM). Having this approach as an option would be real nice, since it would always "just work".
Yes, I had this in the back of my head. I do think some other arch than ppc will have to try this out though :) I am not 100% sure this will work with my end goal, true PIC so I can load the same img anywhere in flash.
Jocke

On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.
With the following flags
PLATFORM_RELFLAGS += -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions
I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them

Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 22:06:56:
With the following flags
PLATFORM_RELFLAGS += -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions
I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them
Possibly, but I think you only need to add an offset to all those relocs.
Jocke

Joakim Tjernlund wrote:
Possibly, but I think you only need to add an offset to all those relocs.
Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.
Best Regards, Bill Campbell
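
A minimal sketch of the calculation the ELF specification gives for these two relocation types: R_386_32 stores S + A and R_386_PC32 stores S + A - P, where S is the symbol value, A is the addend (for REL-format entries, the word already stored at the place) and P is the address of the place. Everything beyond those formulas is an assumption made for illustration: apply_rel(), rd32(), wr32() and the sym_value table are hypothetical names, r_offset is treated as a byte offset into the image, and whether the in-place words of a fully linked U-Boot image still hold plain addends is not established in this thread.

```c
#include <stdint.h>

/* Local stand-in for the ELF32 REL entry (normally from <elf.h>). */
typedef struct {
    uint32_t r_offset;  /* here: byte offset of the place in the image */
    uint32_t r_info;    /* (symbol index << 8) | relocation type */
} Elf32_Rel;

enum { R_386_32 = 1, R_386_PC32 = 2 };

/* Read a little-endian 32-bit word, independent of host alignment. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a little-endian 32-bit word. */
static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Apply one REL entry.
 *   image      - bytes of the image being patched
 *   image_addr - address at which image[0] will run
 *   sym_value  - hypothetical table of symbol values, indexed by symbol number
 * Returns 0 on success, -1 for a relocation type this sketch does not handle.
 */
static int apply_rel(uint8_t *image, uint32_t image_addr,
                     const Elf32_Rel *rel, const uint32_t *sym_value)
{
    uint8_t *place = image + rel->r_offset;
    uint32_t S = sym_value[rel->r_info >> 8];  /* symbol value */
    uint32_t A = rd32(place);                  /* addend stored in place */
    uint32_t P = image_addr + rel->r_offset;   /* address of the place */

    switch (rel->r_info & 0xff) {
    case R_386_32:
        wr32(place, S + A);
        return 0;
    case R_386_PC32:
        wr32(place, S + A - P);
        return 0;
    default:
        return -1;
    }
}
```

This needs the symbol values at relocation time, which is why a handful of section-start symbols would have to survive in the binary.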

"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 22:06:56:
With the following flags
PLATFORM_RELFLAGS += -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions
I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them
BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.
Possibly, but I think you only need to add an offset to all those relocs.
Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.
Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add offset, write back the new address.
Normally one does what you describe, but here we know that the whole image has moved, so we don't have to calculate the new address from scratch.
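The "read the current address, add offset, write back the new address" step can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not U-Boot code: the helper name add_reloc_offset is made up, the image is treated as a little-endian byte buffer, and the caller is assumed to already know which byte positions hold absolute addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word (i386 byte order) from a byte buffer. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Write a 32-bit word back in little-endian order. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Hypothetical helper: patch one absolute address stored at byte position
 * `pos` of a moved image. `offset` is (new address - link address);
 * unsigned arithmetic wraps, so a move to a lower address also works.
 */
static uint32_t add_reloc_offset(uint8_t *image, size_t image_len,
                                 size_t pos, uint32_t offset)
{
    uint32_t word;

    assert(pos <= image_len && image_len - pos >= 4);

    word = load_le32(image + pos);   /* read the current address */
    word += offset;                  /* add the offset */
    store_le32(image + pos, word);   /* write back the new address */
    return word;
}
```

For instance, with a made-up offset of 0x01000000, a stored address of 0x380518a4 would become 0x390518a4. Nothing about the symbol itself is consulted.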
Jocke

On Wed, Oct 14, 2009 at 6:25 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
> Graeme Russ graeme.russ@gmail.com wrote on 11/10/2009 12:47:19:
> [Massive Snip :)]
>> So, all that is left are .dynsym and .dynamic ...
>> .dynsym
>> - Contains 70 entries (16 bytes each, 1120 bytes)
>> - 44 entries mimic those entries in .got which are not relocated
>> - 21 entries are the remaining symbols exported from the linker
>> script
>> - 4 entries are labels defined in inline asm and used in C
>>
> Try adding proper asm declarations. Look at what gcc
> generates for a function/variable and mimic these.
>
Thanks - Now .dynsym contains only exports from the linker script
:)
>> - 1 entry is a NULL entry
>>
>> .dynamic
>> - 88 bytes
>> - Array of Elf32_Dyn
>> - typedef struct {
>>       Elf32_Sword d_tag;
>>       union {
>>           Elf32_Word d_val;
>>           Elf32_Addr d_ptr;
>>       } d_un;
>>   } Elf32_Dyn;
>> - 0x11 entries
>> [00] 0x00000010, 0x00000000 DT_SYMBOLIC, (ignored)
>> [01] 0x00000004, 0x38059994 DT_HASH, points to .hash
>> [02] 0x00000005, 0x380595AB DT_STRTAB, points to .dynstr
>> [03] 0x00000006, 0x3805BDCC DT_SYMTAB, points to .dynsym
>> [04] 0x0000000A, 0x000003E6 DT_STRSZ, size of .dynstr
>> [05] 0x0000000B, 0x00000010 DT_SYMENT, ???
>> [06] 0x00000015, 0x00000000 DT_DEBUG, ???
>> [07] 0x00000011, 0x3805A8F4 DT_REL, points to .rel.text
>> [08] 0x00000012, 0x000014D8 DT_RELSZ, ???
>>
> How big DT_REL is
>
>> [09] 0x00000013, 0x00000008 DT_RELENT, ???
>>
> hmm, cannot remember :)
>
How big an entry in DT_REL is
Right, how could I forget :)
>> [0a] 0x00000016, 0x00000000 DT_TEXTREL, ???
>>
> Oops, you got text relocations. This is generally a bad thing.
> TEXTREL is commonly caused by asm code that isn't truly pic, so it needs
> to modify the .text segment to adjust for relocation.
> You should get rid of this one. Look for DT_TEXTREL in .o files to find
> the culprit.
>
Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. The x86 does not allow direct access to the IP so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack
Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important at the moment.
Evil idea: skip -fpic et al. and add the full reloc procedure, relocating by rewriting directly in the TEXT segment. Then you save space, but you need more relocation code. Something like _dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.
With the following flags
PLATFORM_RELFLAGS += -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions
I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them
BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.
Possibly, but I think you only need to add an offset to all those relocs.
Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at); it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.
Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add offset, write back the new address.
OK, I don't really get this at all....
This code:
printf ("\n\n%s\n\n", version_string);
gets compiled into:
380403e7: 68 a4 18 05 38    push $0x380518a4
380403ec: 68 de 2c 05 38    push $0x38052cde
380403f1: e8 4f 84 00 00    call 38048845 <printf>
With relocation entries in .rel.text of:
Offset    Info      Type        Sym.Value  Sym. Name
380403e8  00016201  R_386_32    380519f0   version_string
380403ed  00000201  R_386_32    380519f0   .rodata
380403f2  00016b02  R_386_PC32  38048991   printf
Now I get the first two (R_386_32) entries - Relocation involves a simple addition of an offset to the values at addresses 0x380403e8 and 0x380403ed (of course, these addresses will be offset)
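As an aside, the Info column in the listing above packs the symbol index and the relocation type into one word (symbol index in the upper 24 bits, type in the low 8 bits), which is what the ELF32_R_SYM and ELF32_R_TYPE macros extract. A small sketch, with the helper names being my own rather than anything from U-Boot:

```c
#include <assert.h>
#include <stdint.h>

/* Relocation type numbers defined for i386 ELF. */
enum { R_386_32 = 1, R_386_PC32 = 2, R_386_RELATIVE = 8 };

/* Symbol table index: upper 24 bits of r_info (cf. ELF32_R_SYM). */
static uint32_t rel_sym(uint32_t r_info)
{
    return r_info >> 8;
}

/* Relocation type: low 8 bits of r_info (cf. ELF32_R_TYPE). */
static uint32_t rel_type(uint32_t r_info)
{
    return r_info & 0xffu;
}

/* Inverse operation (cf. ELF32_R_INFO), with range checks on both fields. */
static uint32_t make_r_info(uint32_t sym, uint32_t type)
{
    assert(sym <= 0xffffffu);
    assert(type <= 0xffu);
    return (sym << 8) | type;
}
```

So 00016201 is symbol 0x162 with type 1 (R_386_32), and 00016b02 is symbol 0x16b with type 2 (R_386_PC32).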
However, the R_386_PC32 is an enigma - The call is already relative - there is no need to relocate it at all (call is a position independent opcode because it is a relative jump!)
Will all R_386_PC32 be like this? Can I simply ignore them all? If so, why do they even need to be generated?
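The arithmetic behind "the call is already relative" can be checked with the bytes from the disassembly above. The sketch below is illustrative only: the function names are invented, and the 0x01000000 move used in the note afterwards is a made-up figure.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Target of an i386 near call (opcode e8 followed by rel32). `insn_addr`
 * is the address of the e8 byte; the displacement is relative to the next
 * instruction, which starts 5 bytes later.
 */
static uint32_t call_target(uint32_t insn_addr, int32_t rel32)
{
    return insn_addr + 5u + (uint32_t)rel32;
}

/* Displacement that must be stored for a call at insn_addr to reach target. */
static int32_t call_rel32(uint32_t insn_addr, uint32_t target)
{
    return (int32_t)(target - (insn_addr + 5u));
}

/* Displacement needed once caller and callee have both moved by `delta`. */
static int32_t rel32_after_move(uint32_t insn_addr, uint32_t target,
                                uint32_t delta)
{
    int32_t moved = call_rel32(insn_addr + delta, target + delta);

    /* delta cancels out, so the stored displacement stays valid */
    assert(moved == call_rel32(insn_addr, target));
    return moved;
}
```

With the call at 0x380403f1 and displacement 0x0000844f, the target works out to 0x38048845, matching the disassembly. Moving both ends by the same amount, say 0x01000000, leaves the displacement at 0x844f.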
Hmmm
Graeme
Normally one does what you describe, but here we know that the whole image has moved, so we don't have to calculate the new address from scratch.
Jocke

Graeme Russ graeme.russ@gmail.com wrote on 14/10/2009 13:48:27:
On Wed, Oct 14, 2009 at 6:25 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 13:21:05:
Evil idea, skip -fpic et. all and add the full reloc procedure to relocate by rewriting directly in TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much I think.
With the following flags
PLATFORM_RELFLAGS += -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions
I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them
BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.
Possibly, but I think you only need to add an offset to all those relocs.
Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at); it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.
Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add offset, write back the new address.
OK, I don't really get this at all....
This code:
printf ("\n\n%s\n\n", version_string);
gets compiled into:
380403e7: 68 a4 18 05 38    push $0x380518a4
380403ec: 68 de 2c 05 38    push $0x38052cde
380403f1: e8 4f 84 00 00    call 38048845 <printf>
With relocation entries in .rel.text of:
Offset    Info      Type        Sym.Value  Sym. Name
380403e8  00016201  R_386_32    380519f0   version_string
380403ed  00000201  R_386_32    380519f0   .rodata
380403f2  00016b02  R_386_PC32  38048991   printf
Now I get the first two (R_386_32) entries - Relocation involves a simple addition of an offset to the values at addresses 0x380403e8 and 0x380403ed (of course, these addresses will be offset)
However, the R_386_PC32 is an enigma - The call is already relative - there is no need to relocate it at all (call is a position independent opcode because it is a relative jump!)
Yes, but printf is defined in glibc, so the app needs to relocate the call to glibc. U-boot has all it needs, so there you should not have PC32, I think. Try defining a local static function. For non-static functions you may need to define visibility=hidden and/or -Bsymbolic too. You also need to look at the image after final linking.
Will all R_386_PC32 be like this? Can I simply ignore them all? If so, why do they even need to be generated?
Hopefully you won't have any. Not sure about weak functions though. These might need PC32 relocs in some cases.
Also, if you look at _dl_do_reloc() in uClibc/ldso/ldso/i386/elfinterp.c I think you can replace symbol_addr with relocation offset.
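To make the "replace symbol_addr with relocation offset" idea concrete, here is a rough sketch of what a per-entry handler could look like when the whole image moves as one block. It is not the uClibc code and does not reproduce its signature; reloc_one and load_offset are invented names, and only the three relocation types discussed in this thread are handled.

```c
#include <assert.h>
#include <stdint.h>

enum { R_386_NONE = 0, R_386_32 = 1, R_386_PC32 = 2, R_386_RELATIVE = 8 };

/*
 * Hypothetical, simplified per-entry handler (NOT uClibc's code).
 * `place` points at the 32-bit word named by r_offset in the moved image;
 * `load_offset` is (run-time address - link address) of the whole image.
 * Returns 0 on success, -1 for a type this sketch does not handle.
 */
static int reloc_one(uint32_t *place, uint32_t r_info, uint32_t load_offset)
{
    assert(place != 0);

    switch (r_info & 0xffu) {        /* low 8 bits hold the type */
    case R_386_NONE:
        return 0;
    case R_386_32:                   /* absolute address of a symbol */
    case R_386_RELATIVE:             /* absolute address, no symbol needed */
        *place += load_offset;       /* no symbol lookup, just the offset */
        return 0;
    case R_386_PC32:                 /* PC-relative: both ends moved together */
        return 0;
    default:
        return -1;
    }
}
```

The point of the design is that no symbol table lookup is needed: every absolute reference gets the same adjustment, and PC-relative references get none.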
Jocke

Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 14/10/2009 13:48:27:
On Wed, Oct 14, 2009 at 6:25 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
With the following flags
PLATFORM_RELFLAGS += -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions
I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them
BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.
Possibly, but I think you only need to add an offset to all those relocs.
Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at); it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.
Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add offset, write back the new address.
OK, I don't really get this at all....
This code:
printf ("\n\n%s\n\n", version_string);
gets compiled into:
380403e7: 68 a4 18 05 38    push $0x380518a4
380403ec: 68 de 2c 05 38    push $0x38052cde
380403f1: e8 4f 84 00 00    call 38048845 <printf>
With relocation entries in .rel.text of:
Offset    Info      Type        Sym.Value  Sym. Name
380403e8  00016201  R_386_32    380519f0   version_string
380403ed  00000201  R_386_32    380519f0   .rodata
380403f2  00016b02  R_386_PC32  38048991   printf
Now I get the first two (R_386_32) entries - Relocation involves a simple addition of an offset to the values at addresses 0x380403e8 and 0x380403ed (of course, these addresses will be offset)
However, the R_386_PC32 is an enigma - The call is already relative - there is no need to relocate it at all (call is a position independent opcode because it is a relative jump!)
Yes, but printf is defined in glibc, so the app needs to relocate the call to glibc.
Actually, the reason the call is relocatable is that the compiler DOESN'T KNOW where printf is at all. If it is in a library, it will not be in the text segment and must be relocated accordingly. It may be in a different segment for some reason. In any case, the compiler doesn't know the address in the image where printf resides, so it needs a relocation entry to get the value filled in at link time. After the value is filled in, if the referenced symbol is in the same segment (probably .text) as the point of reference, the relocation reference is probably of no more use. However, there is no rule that says the linker must delete the reference from the relocation list.
U-boot has all it needs so there you should not have PC32 I think. Try defining a local static function. For non static functions you may need to define visibility=hidden and/or -Bsymbolic too.
Won't help. Any symbols referenced but not defined locally are relocatable. After linking, they MAY, but need not, go away.
You also need to look at the img after final linking.
After linking, if the symbol is defined, the R_386_PC32 is no longer important UNLESS the symbol referenced is in a different segment AND the segments are relocated with different offsets from each other than originally linked. For this reason, I think the linker will not discard these relocations. If we are not relocating the segments with different relative offsets, we can ignore these relocations as the change in offset will come out to be zero anyway. However, if you process them normally, you will just add 0 and nothing will change.
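The "you will just add 0" observation follows from how a PC-relative word is formed: symbol address plus addend minus the address of the patched word. A small sketch of the adjustment, under the assumption that each segment may move by its own amount (the function names and the example deltas are invented for illustration):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Change needed in a PC-relative word when the segment holding the symbol
 * moves by `sym_delta` and the segment holding the patched word moves by
 * `place_delta` (both measured from the link addresses).
 */
static uint32_t pc32_adjust(uint32_t sym_delta, uint32_t place_delta)
{
    return sym_delta - place_delta;
}

/* Apply that adjustment to the stored word. */
static uint32_t pc32_apply(uint32_t word, uint32_t sym_delta,
                           uint32_t place_delta)
{
    uint32_t fixed = word + pc32_adjust(sym_delta, place_delta);

    /* moving everything as one block must leave the word untouched */
    assert(sym_delta != place_delta || fixed == word);
    return fixed;
}
```

When both deltas are equal the adjustment is zero, so skipping the entry and processing it give the same result. Only a layout change between segments, for example the symbol's segment moving by 0x2000 while the caller's moves by 0x1000, produces a non-zero adjustment.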
Will all R_386_PC32 be like this? Can I simply ignore them all? If so, why do they even need to be generated?
Hopefully you won't have any.
I think they may still be there, because we ask the linker to preserve relocation information. However, if the entire image is being relocated, not changing the order or relative offset of any segments, they can be ignored, because the relative values will not change. It will be interesting to know if they remain or if the linker drops them out. For references in the same segment, we can hope that they get dropped. For references across segments (if any), or any undefined symbols, they will remain.
Not sure about weak functions though. These might need PC32 relocs in some cases.
There can be PC32 relocs referencing the weak symbol, but that symbol may be undefined.
Also, if you look at _dl_do_reloc() in uClibc/ldso/ldso/i386/elfinterp.c I think you can replace symbol_addr with relocation offset.
I agree, in the case where you are moving the entire image and ignoring PC32 relocs.
Best Regards, Bill Campbell
Jocke

On Thu, Oct 15, 2009 at 3:45 AM, J. William Campbell jwilliamcampbell@comcast.net wrote:
Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 14/10/2009 13:48:27:
On Wed, Oct 14, 2009 at 6:25 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 22:06:56:
BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.
Possibly, but I think you only need to add an offset to all those relocs.
Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at); it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.
Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add offset, write back the new address.
OK, I don't really get this at all....
This code:
printf ("\n\n%s\n\n", version_string);
gets compiled into:
380403e7: 68 a4 18 05 38    push $0x380518a4
380403ec: 68 de 2c 05 38    push $0x38052cde
380403f1: e8 4f 84 00 00    call 38048845 <printf>
With relocation entries in .rel.text of:
Offset    Info      Type        Sym.Value  Sym. Name
380403e8  00016201  R_386_32    380519f0   version_string
380403ed  00000201  R_386_32    380519f0   .rodata
380403f2  00016b02  R_386_PC32  38048991   printf
Now I get the first two (R_386_32) entries - Relocation involves a simple addition of an offset to the values at addresses 0x380403e8 and 0x380403ed (of course, these addresses will be offset)
However, the R_386_PC32 is an enigma - The call is already relative - there is no need to relocate it at all (call is a position independent opcode because it is a relative jump!)
Yes, but printf is defined in glibc, so the app needs to relocate the call to glibc.
Actually, the reason the call is relocatable is that the compiler DOESN'T KNOW where printf is at all. If it is in a library, it will not be in the text segment and must be relocated accordingly. It may be in a different segment for some reason. In any case, the compiler doesn't know the address in the image where printf resides, so it needs a relocation entry to get the value filled in at link time. After the value is filled in, if the referenced symbol is in the same segment (probably .text) as the point of reference, the relocation reference is probably of no more use. However, there is no rule that says the linker must delete the reference from the relocation list.
U-boot has all it needs so there you should not have PC32 I think. Try defining a local static function. For non static functions you may need to define visibility=hidden and/or -Bsymbolic too.
Won't help. Any symbols referenced but not defined locally are relocatable. After linking, they MAY, but need not, go away.
You also need to look at the img after final linking.
After linking, if the symbol is defined, the R_386_PC32 is no longer important UNLESS the symbol referenced is in a different segment AND the segments are relocated with different offsets from each other than originally linked. For this reason, I think the linker will not discard these relocations. If we are not relocating the segments with different relative offsets, we can ignore these relocations as the change in offset will come out to be zero anyway. However, if you process them normally, you will just add 0 and nothing will change.
Will all R_386_PC32 be like this? Can I simply ignore them all? If so, why do they even need to be generated?
Hopefully you won't have any.
I think they may still be there, because we ask the linker to preserve relocation information. However, if the entire image is being relocated, not changing the order or relative offset of any segments, they can be ignored, because the relative values will not change. It will be interesting to know if they remain or if the linker drops them out. For references in the same segment, we can hope that they get dropped. For references across segments (if any), or any undefined symbols, they will remain.
Not sure about weak functions though. These might need PC32 relocs in some cases.
There can be PC32 relocs referencing the weak symbol, but that symbol may be undefined.
Also, if you look at _dl_do_reloc() in uClibc/ldso/ldso/i386/elfinterp.c I think you can replace symbol_addr with relocation offset.
I agree, in the case where you are moving the entire image and ignoring PC32 relocs.
Best Regards, Bill Campbell
Jocke
Apologies if this is getting way off-topic for a simple boot loader, but this is information I have gathered from far and wide over the net. I am surprised that there isn't a web site out there on 'How to create a relocatable boot loader'...
OK, it's all starting to come together now - It helps when you look at the right files ;)
Firstly, u-boot.map
0x380589a0 __rel_dyn_start = .
.rel.dyn         0x380589a0   0x42b0
 *(.rel.dyn)
 .rel.got        0x00000000   0x0     cpu/i386/start.o
 .rel.plt        0x00000000   0x0     cpu/i386/start.o
 .rel.text       0x380589a0   0x2e28  cpu/i386/start.o
 .rel.start16    0x3805b7c8   0x10    cpu/i386/start.o
 .rel.data       0x3805b7d8   0xc18   cpu/i386/start.o
 .rel.rodata     0x3805c3f0   0x360   cpu/i386/start.o
 .rel.u_boot_cmd 0x3805c750   0x500   cpu/i386/start.o
                 0x3805cc50           __rel_dyn_end = .
And the output of readelf...
Section Headers:
[Nr] Name             Type      Addr     Off    Size   ES Flg Lk Inf Al
[ 0]                  NULL      00000000 000000 000000 00      0   0  0
[ 1] .text            PROGBITS  38040000 001000 0118a4 00  AX  0   0  4
[ 2] .rel.text        REL       00000000 066c68 005d00 08     40   1  4
[ 3] .rodata          PROGBITS  380518a4 0128a4 005da5 00   A  0   0  4
[ 4] .rel.rodata      REL       00000000 06c968 000360 08     40   3  4
[ 5] .interp          PROGBITS  38057649 018649 000013 00   A  0   0  1
[ 6] .dynstr          STRTAB    3805765c 01865c 0001ee 00   A  0   0  1
[ 7] .hash            HASH      3805784c 01884c 0000cc 04   A 11   0  4
[ 8] .data            PROGBITS  38057918 018918 000a3c 00  WA  0   0  4
[ 9] .rel.data        REL       00000000 06ccc8 000c18 08     40   8  4
[10] .got.plt         PROGBITS  38058354 019354 00000c 04  WA  0   0  4
[11] .dynsym          DYNSYM    38058360 019360 000200 10   A  6   1  4
[12] .dynamic         DYNAMIC   38058560 019560 000080 08  WA  6   0  4
[13] .u_boot_cmd      PROGBITS  380585e0 0195e0 0003c0 00  WA  0   0  4
[14] .rel.u_boot_cmd  REL       00000000 06d8e0 000500 08     40  13  4
[15] .bss             NOBITS    3805cc50 01ec50 001a34 00  WA  0   0  4
[16] .bios            PROGBITS  00000000 01e000 00053e 00  AX  0   0  1
[17] .rel.bios        REL       00000000 06dde0 0000c0 08     40  16  4
[18] .rel.dyn         REL       380589a0 0199a0 0042b0 08   A 11   0  4
[19] .start16         PROGBITS  0000f800 01e800 000110 00  AX  0   0  1
[20] .rel.start16     REL       00000000 06dea0 000038 08     40  19  4
[21] .resetvec        PROGBITS  0000fff0 01eff0 000010 00  AX  0   0  1
[22] .rel.resetvec    REL       00000000 06ded8 000008 08     40  21  4
...
Relocation section '.rel.text' at offset 0x66c68 contains 2976 entries: Offset Info Type Sym.Value Sym. Name 38040010 00000101 R_386_32 38040000 .text 3804001e 00000101 R_386_32 38040000 .text 38040028 00000101 R_386_32 38040000 .text 3804003f 00000101 R_386_32 38040000 .text 38040051 00000101 R_386_32 38040000 .text 38040075 00000101 R_386_32 38040000 .text 38040085 00000101 R_386_32 38040000 .text 3804009d 0003e602 R_386_PC32 380403fa load_uboot 380400a6 00000101 R_386_32 38040000 .text 38040015 00029f02 R_386_PC32 3804bdd8 early_board_init 38040023 0003f702 R_386_PC32 3804bdda show_boot_progress_asm
...
Relocation section '.rel.rodata' at offset 0x6c968 contains 108 entries: Offset Info Type Sym.Value Sym. Name 38051908 00000201 R_386_32 380518a4 .rodata 38051938 00000201 R_386_32 380518a4 .rodata 38051968 00000201 R_386_32 380518a4 .rodata 38051998 00000201 R_386_32 380518a4 .rodata 380519c8 00000201 R_386_32 380518a4 .rodata 380519f8 00000201 R_386_32 380518a4 .rodata
...
Relocation section '.rel.dyn' at offset 0x199a0 contains 2134 entries: Offset Info Type Sym.Value Sym. Name 0000f838 00000008 R_386_RELATIVE 0000f846 00000008 R_386_RELATIVE 38040010 00000008 R_386_RELATIVE 3804001e 00000008 R_386_RELATIVE 38040028 00000008 R_386_RELATIVE 3804003f 00000008 R_386_RELATIVE 38040051 00000008 R_386_RELATIVE 38040075 00000008 R_386_RELATIVE 38040085 00000008 R_386_RELATIVE
Notice that, apart from .rel.dyn, none of the .rel.* sections have the A (Allocated) flag set - They do not end up in the stripped binary image. .rel.dyn is allocated in the binary image: all the R_386_PC32 entries from the other .rel sections have been discarded, and the R_386_32 entries have been 'converted' to R_386_RELATIVE, which are simple to adjust (locate in memory and adjust by the relocation offset).
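For anyone reading the readelf dumps, the Info column packs two fields: the low byte is the relocation type and the upper 24 bits are the symbol table index. Here is a minimal sketch of that decoding, written as standalone C rather than taken from U-Boot. The helper names (rel_type, rel_sym, rel_info, needs_runtime_fixup) are made up for illustration; the type numbers are the i386 values from the ELF specification. The last helper assumes the whole image moves as one block, as discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* i386 relocation type numbers as given in the ELF specification. */
enum { R_386_32 = 1, R_386_PC32 = 2, R_386_RELATIVE = 8 };

/* The low byte of r_info is the relocation type. */
static inline uint32_t rel_type(uint32_t r_info)
{
    return r_info & 0xffu;
}

/* The upper 24 bits of r_info are the symbol table index. */
static inline uint32_t rel_sym(uint32_t r_info)
{
    return r_info >> 8;
}

/* Inverse of the two helpers above (hypothetical name). */
static inline uint32_t rel_info(uint32_t sym, uint32_t type)
{
    assert(sym < (1u << 24));
    assert(type <= 0xffu);
    return (sym << 8) | type;
}

/*
 * Assumption: the image is moved as a single block. PC-relative
 * references between two places inside the image then stay valid,
 * while absolute 32-bit references have to be patched.
 */
static inline int needs_runtime_fixup(uint32_t r_info)
{
    uint32_t type = rel_type(r_info);

    return type == R_386_32 || type == R_386_RELATIVE;
}
```

With the values from the dumps, 0x00000101 decodes to symbol 1 with type R_386_32, 0x0003e602 to symbol 0x3e6 with type R_386_PC32, and 0x00000008 to symbol 0 with type R_386_RELATIVE.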
The relocation fixup is really easy:
Elf32_Rel *rel_dyn_start = (Elf32_Rel *)&__rel_dyn_start;
Elf32_Rel *rel_dyn_end = (Elf32_Rel *)&__rel_dyn_end;
Elf32_Rel *re;

for (re = rel_dyn_start; re < rel_dyn_end; re++) {
    if (re->r_offset >= TEXT_BASE)
        if (*(ulong *)re->r_offset >= TEXT_BASE)
            *(ulong *)(re->r_offset - rel_offset) -= (Elf32_Addr)rel_offset;
}
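To try the same idea on a host machine, the loop can be sketched against a plain byte buffer instead of real memory. This is an illustration under stated assumptions, not the U-Boot code: the function name apply_relative_relocs, the rel_entry type and the delta parameter are invented here, delta is added (new base minus link base) where the loop above subtracts rel_offset, and the bounds check against the buffer length is an addition that the original does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as Elf32_Rel: where to patch, and what kind of patch. */
typedef struct {
    uint32_t r_offset;
    uint32_t r_info;
} rel_entry;

/*
 * image      copy of the image sitting at its new location
 * image_len  size of that copy in bytes
 * link_base  address the image was linked at (TEXT_BASE in the thread)
 * delta      new base minus link_base (hypothetical parameter)
 *
 * Returns the number of 32-bit words that were adjusted.
 */
static size_t apply_relative_relocs(uint8_t *image, size_t image_len,
                                    uint32_t link_base,
                                    const rel_entry *rel, size_t count,
                                    uint32_t delta)
{
    size_t patched = 0;

    assert(image != NULL && rel != NULL);
    assert(image_len >= sizeof(uint32_t));

    for (size_t i = 0; i < count; i++) {
        uint32_t off = rel[i].r_offset;
        uint32_t word;

        /* Entry lives below the link base: leave it alone. */
        if (off < link_base)
            continue;
        /* Entry would fall outside the buffer (added safety check). */
        if (off - link_base > image_len - sizeof(word))
            continue;

        memcpy(&word, image + (off - link_base), sizeof(word));

        /* Value points below the link base (NULL, fixed low memory). */
        if (word < link_base)
            continue;

        word += delta;
        memcpy(image + (off - link_base), &word, sizeof(word));
        patched++;
    }
    return patched;
}
```

The two skip conditions mirror the two TEXT_BASE tests in the original loop: the first protects locations linked at fixed low addresses, the second leaves values such as NULL untouched.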
The size penalty is ~17kB of extra data (which is not copied to RAM) and a tiny amount of relocation code (easily offset by removal of other fixups such as the command table fixup).
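The ~17kB figure can be cross-checked from the numbers in the dumps. A small sketch, assuming 8-byte Elf32_Rel entries (the ES column of the section headers); the helper name rel_count is invented for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Entry size of a REL section on i386, per the ES column above. */
enum { REL_ENTRY_SIZE = 8 };

/* Number of relocation entries in a REL section of the given size. */
static inline uint32_t rel_count(uint32_t section_bytes)
{
    assert(section_bytes % REL_ENTRY_SIZE == 0);
    return section_bytes / REL_ENTRY_SIZE;
}
```

.rel.dyn is 0x42b0 = 17072 bytes, so rel_count(0x42b0) gives 2134, which matches the entry count readelf reports. The per-section contributions listed in u-boot.map (0x2e28 + 0x10 + 0xc18 + 0x360 + 0x500) also add up to 0x42b0.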
And without using the pic flag in gcc, there is no GOT and no associated performance penalty.
Thanks for everyone's help (especially Jocke and Bill)
Regards,
Graeme

Graeme Russ graeme.russ@gmail.com wrote on 17/10/2009 07:17:04:
[SNIP]
Apologies if this is getting way off-topic for a simple boot loader, but this is information I have gathered from far and wide over the net. I am surprised that there isn't a web site out there on 'How to create a relocatable boot loader'...
:), now you can write one :)
OK, it's all starting to come together now - It helps when you look at the right files ;)
Notice that, apart from .rel.dyn, none of the .rel.* sections have the A (Allocated) flag set - They do not end up in the stripped binary image. .rel.dyn is allocated in the binary image: all the R_386_PC32 entries from the other .rel sections have been discarded, and the R_386_32 entries have been 'converted' to R_386_RELATIVE, which are simple to adjust (locate in memory and adjust by the relocation offset).
Ah, they are converted to relative. Wonder if all archs do this? If so one only will need two reloc functions, one for Rel and one for Rela relocs.
The relocation fixup is really easy:
Elf32_Rel *rel_dyn_start = (Elf32_Rel *)&__rel_dyn_start;
Elf32_Rel *rel_dyn_end = (Elf32_Rel *)&__rel_dyn_end;
Elf32_Rel *re;

for (re = rel_dyn_start; re < rel_dyn_end; re++) {
    if (re->r_offset >= TEXT_BASE)
        if (*(ulong *)re->r_offset >= TEXT_BASE)
            *(ulong *)(re->r_offset - rel_offset) -= (Elf32_Addr)rel_offset;
}
Do you need the TEXT_BASE stuff or is it just a precaution? Not sure if you need some test for NULL to handle weak undefined symbols though.
The size penalty is ~17kB of extra data (which is not copied to RAM) and a tiny amount of relocation code (easily offset by removal of other fixups such as the command table fixup).
17kB, how does that compare to the -fPIC version?
And without using the pic flag in gcc, there is no GOT and no associated performance penalty.
Yep :)
Thanks for everyone's help (especially Jocke and Bill)
NP, will we see a patch soon?
Jocke

Graeme Russ wrote:
On Thu, Oct 15, 2009 at 3:45 AM, J. William Campbell jwilliamcampbell@comcast.net wrote:
Joakim Tjernlund wrote:
<megasnip>
Great work Graeme. You have taken a lot of conjecture and guessing and converted it to actual truth!
In line with your comment about -fpic, the .text segment size goes from 000137fc down to 000118a4, or about an 8 k reduction in size. -fpic also contains a .rel_dyn segment, that presumably needs to be processed the same way as in the non -fpic case (otherwise, why would it be there?). The size of the "residual" .rel_dyn was 00001228, or 4.6 k. This means that the size penalty for not using -fpic is only about 3k bytes total in the image, and the ram footprint is actually smaller than with -fpic. So now, after Graeme's work here, it is easily possible to support three different u-boot configurations, absolute, relocatable, and relocatable with -fpic. If there are any size maniacs out there, we can reduce the size of the relocation table at the expense of some post-processing. These days, 9k of flash vs 4.5k of flash doesn't seem important, but I imagine if you are right against the stops on an existing product it can be very important!
It will be interesting to see similar numbers for other architectures. I expect similar results, but you never know. PPC relocation entries are larger, so they become more of an issue.
Still more questions for Graeme if he will indulge me! Are the if statements in the relocation code ever false? Are there relocations for stuff below TEXT_BASE in the input binary? If so, do you have any idea why? Not that two if statements are a big deal, it is just that I can't explain why there would be any relocations below TEXT_BASE, and I can't explain why there would be any relocatable references to anything below TEXT_BASE. I assume this might be related to not relocating NULL pointers. That would be reflected in the innermost if statement. I would not expect there to be any such references, as gas does know the relocation attributes of initialized data, and NULL is absolute(?) Also, if a function is not defined (weak or otherwise), the loader should give it an address of absolute 0, which would also not generate a relocation entry(?). It would be interesting to intentionally call an undefined function in u-boot and see if the call ends up relocatable. It should not, and if it does we should file a bug report for ld!
Thanks again Graeme!
Best Regards, Bill Campbell
Regards,
Graeme

On Sat, Oct 17, 2009 at 11:59 PM, J. William Campbell jwilliamcampbell@comcast.net wrote:
Graeme Russ wrote:
On Thu, Oct 15, 2009 at 3:45 AM, J. William Campbell jwilliamcampbell@comcast.net wrote:
Joakim Tjernlund wrote:
<megasnip>
[Yawn... YAS (Yet Another Snip) ;)]
The relocation fixup is really easy:
Elf32_Rel *rel_dyn_start = (Elf32_Rel *)&__rel_dyn_start;
Elf32_Rel *rel_dyn_end = (Elf32_Rel *)&__rel_dyn_end;
Elf32_Rel *re;

for (re = rel_dyn_start; re < rel_dyn_end; re++) {
    if (re->r_offset >= TEXT_BASE)
        if (*(ulong *)re->r_offset >= TEXT_BASE)
            *(ulong *)(re->r_offset - rel_offset) -= (Elf32_Addr)rel_offset;
}
The size penalty is ~17kB of extra data (which is not copied to RAM) and a tiny amount of relocation code (easily offset by removal of other fixups such as the command table fixup).
And without using the pic flag in gcc, there is no GOT and no associated performance penalty.
Thanks for everyone's help (especially Jocke and Bill)
Great work Graeme. You have taken a lot of conjecture and guessing and converted it to actual truth!
In line with your comment about -fpic, the .text segment size goes from 000137fc down to 000118a4, or about an 8 k reduction in size. -fpic also contains a .rel_dyn segment, that presumably needs to be processed the same way as in the non -fpic case (otherwise, why would it be there?). The size of the "residual" .rel_dyn was 00001228, or 4.6 k. This means that the size penalty for not using -fpic is only about 3k bytes total in the image, and the ram footprint is actually smaller than with -fpic. So now, after
Yes, especially on the x86 because with -fpic, the x86 needs to do a CALL/POP in the beginning of each function to determine the current IP in order to calculate absolute addresses using the GOT (ouch!)
Graeme's work here, it is easily possible to support three different u-boot configurations, absolute, relocatable, and relocatable with -fpic. If there are any size maniacs out there, we can reduce the size of the relocation table at the expense of some post-processing. These days, 9k of flash vs 4.5k of flash doesn't seem important, but I imagine if you are right against the stops on an existing product it can be very important!
It will be interesting to see similar numbers for other architectures. I expect similar results, but you never know. PPC relocation entries are larger, so they become more of an issue.
Still more questions for Graeme if he will indulge me! Are the if statements in the relocation code ever false? Are there relocations for stuff below TEXT_BASE in the input binary? If so, do you have any idea why? Not that two if statements are a big deal, it is just that I can't explain why there would be any relocations below TEXT_BASE, and I can't explain why there would be any relocatable references to anything below TEXT_BASE. I assume this might be related to not relocating NULL pointers. That would be reflected in the innermost if statement. I would not expect there to be any such references, as gas does know the relocation attributes of initialized data, and NULL is absolute(?) Also, if a function is not defined (weak or otherwise), the loader should give it an address of absolute 0, which would also not generate a relocation entry(?). It would be interesting to intentionally call an undefined function in u-boot and see if the call ends up relocatable. It should not, and if it does we should file a bug report for ld!
Apart from NULL pointers, there are some peculiarities for x86 that have to be dealt with. There are two sections (for BIOS and the real mode trampoline) which get linked at a hard coded memory location in the low area of memory (<16M) - The TEXT_BASE checks are to ensure these do not get trampled.
Thanks again Graeme!
NP - Just scratching an itch
Best Regards, Bill Campbell
Regards,
Graeme

Joakim Tjernlund wrote:
"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 11/10/2009 12:47:19:
[Massive Snip :)]
So, all that is left are .dynsym and .dynamic ...
.dynsym
- Contains 70 entries (16 bytes each, 1120 bytes)
- 44 entries mimic those entries in .got which are not relocated
- 21 entries are the remaining symbols exported from the linker script
- 4 entries are labels defined in inline asm and used in C
Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these.
Thanks - Now .dynsym contains only exports from the linker script
:)
- 1 entry is a NULL entry
.dynamic
- 88 bytes
- Array of Elf32_Dyn
- typedef struct {
      Elf32_Sword d_tag;
      union {
          Elf32_Word d_val;
          Elf32_Addr d_ptr;
      } d_un;
  } Elf32_Dyn;
- 0x11 entries
  [00] 0x00000010, 0x00000000 DT_SYMBOLIC, (ignored)
  [01] 0x00000004, 0x38059994 DT_HASH, points to .hash
  [02] 0x00000005, 0x380595AB DT_STRTAB, points to .dynstr
  [03] 0x00000006, 0x3805BDCC DT_SYMTAB, points to .dynsym
  [04] 0x0000000A, 0x000003E6 DT_STRSZ, size of .dynstr
  [05] 0x0000000B, 0x00000010 DT_SYMENT, ???
  [06] 0x00000015, 0x00000000 DT_DEBUG, ???
  [07] 0x00000011, 0x3805A8F4 DT_REL, points to .rel.text
  [08] 0x00000012, 0x000014D8 DT_RELSZ, ???
How big DT_REL is
  [09] 0x00000013, 0x00000008 DT_RELENT, ???
hmm, cannot remember :)
How big an entry in DT_REL is
Right, how could I forget :)
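The three relocation-related tags in the listing above (DT_REL, DT_RELSZ, DT_RELENT) are enough to locate and size the relocation table. A minimal standalone sketch of walking such an array, assuming the tag numbers from the ELF specification; the types dyn_entry and rel_table_info and both function names are invented here, and the d_un union is collapsed into a single 32-bit value for brevity:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dynamic tag numbers from the ELF specification. */
enum { DT_NULL = 0, DT_REL = 17, DT_RELSZ = 18, DT_RELENT = 19 };

/* Simplified Elf32_Dyn: d_val and d_ptr share one 32-bit slot. */
typedef struct {
    int32_t  d_tag;
    uint32_t d_val;
} dyn_entry;

typedef struct {
    uint32_t rel_addr;     /* DT_REL: address of the table        */
    uint32_t rel_size;     /* DT_RELSZ: total size in bytes       */
    uint32_t rel_entsize;  /* DT_RELENT: size of one entry        */
} rel_table_info;

/* Scan until DT_NULL or max_entries, collecting the REL tags. */
static rel_table_info find_rel_table(const dyn_entry *dyn, size_t max_entries)
{
    rel_table_info info = {0, 0, 0};

    assert(dyn != NULL);
    for (size_t i = 0; i < max_entries && dyn[i].d_tag != DT_NULL; i++) {
        switch (dyn[i].d_tag) {
        case DT_REL:    info.rel_addr = dyn[i].d_val;    break;
        case DT_RELSZ:  info.rel_size = dyn[i].d_val;    break;
        case DT_RELENT: info.rel_entsize = dyn[i].d_val; break;
        default:        break;
        }
    }
    return info;
}

/* Number of entries, or 0 if the entry size is missing. */
static uint32_t rel_table_count(const rel_table_info *info)
{
    assert(info != NULL);
    return info->rel_entsize ? info->rel_size / info->rel_entsize : 0;
}
```

Fed the values quoted above, this yields a table at 0x3805A8F4 of 0x14D8 bytes with 8-byte entries, i.e. 667 relocations.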
  [0a] 0x00000016, 0x00000000 DT_TEXTREL, ???
Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that isn't truly pic, so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit.
Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. The x86 does not allow direct access to the IP so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack
hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it but it is not important ATM I guess.
Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much I think.
With the following flags
PLATFORM_RELFLAGS += -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions
I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them
BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.
Possibly, but I think you only need to add an offset to all those relocs.
Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully the objcopy with the --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.
Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add offset, write back the new address.
Normally one does what you describe, but here we know that the whole image has moved, so we don't have to calculate the new address from scratch.
If the addresses of the bss, text, and data segments change by the same value, I think you are correct. However, if the text and data/bss segments are moved by different offsets, naturally the relocations would be different. One reason to retain this capability would be to allow the u-boot copy to execute in place in NOR flash while re-locating the read-write storage once memory has been sized. Having different relocation factors is not much worse than just one, and it may be just as easy to get working initially as a single relocation constant.
FWIW, the "ultimate" solution to minimum relocation size is a post-processing step that creates "several" arrays of relocation offsets as two byte quantities. This reduces the cost of each relocation entry to just a bit more than two bytes (there is a small overhead for array size, MSB values and relocation offset selection). Naturally, this is much less than the ELF version of the same relocations, because we do not need to retain as much information and ELF doesn't worry about size that much. This may pacify users for which the flash size of the image is critical, at the expense of an extra link step. Naturally, getting things to work with "standard ELF" is the most important step, and probably enough for most people.
I also am interested in the number of additional relocations generated without -fpic. I suspect on the 386 it can be substantial. However, for every new reloc generated, a .got reference load will probably be eliminated. This should result in a shorter text segment to balance the increased relocation segment. Adding the -fno-jump-tables gcc option may also help a bit.
Bill Campbell
Jocke

"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 17:35:44:
Joakim Tjernlund wrote:
"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 11/10/2009 12:47:19:
[Massive Snip :)]
[Yet another SNIP :)]
If the addresses of the bss, text, and data segments change by the same value, I think you are correct. However, if the text and data/bss segments are moved by different offsets, naturally the relocations would be different. One reason to retain this capability would be to allow the u-boot copy to execute in place in NOR flash while re-locating the read-write storage once memory has been sized. Having different relocation factors is not much worse than just one, and it may be just as easy to get working initially as a single relocation constant.
How do you figure that? You need to rewrite the insns that access the moved data/bss, and they are in flash. Did I miss something?
FWIW, the "ultimate" solution to minimum relocation size is a post-processing step that creates "several" arrays of relocation offsets as two byte quantities. This reduces the cost of each relocation entry to just a bit more than two bytes (there is a small overhead for array size, MSB values and relocation offset selection). Naturally, this is much less than the ELF version of the same relocations, because we do not need to retain as much information and ELF doesn't worry about size that much. This may pacify users for which the flash size of the image is critical, at the expense of an extra link step. Naturally, getting things to work with "standard ELF" is the most important step, and probably enough for most people.
That would save 2+4 bytes/reloc on REL arches and 2+4+4 on RELA(ppc) (provided one can ignore r_addend)
But yes, this is probably too "fancy" for the moment.
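The 2+4 and 2+4+4 figures follow from the ELF entry sizes: a 32-bit REL entry is 8 bytes and a RELA entry is 12, so replacing either with a single two-byte offset saves 6 or 10 bytes. A small sketch of that arithmetic, with locally defined stand-ins for the ELF structures (the type names and bytes_saved are invented here, and the per-group overhead of any packed format is ignored):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Elf32_Rel: offset plus info word. */
typedef struct {
    uint32_t r_offset;
    uint32_t r_info;
} rel32_entry;

/* Stand-in for Elf32_Rela: as above plus an explicit addend. */
typedef struct {
    uint32_t r_offset;
    uint32_t r_info;
    int32_t  r_addend;
} rela32_entry;

/* Bytes saved when each ELF entry shrinks to one 16-bit offset. */
static inline size_t bytes_saved(size_t entries, size_t elf_entry_size)
{
    assert(elf_entry_size >= sizeof(uint16_t));
    return entries * (elf_entry_size - sizeof(uint16_t));
}
```

For the 2134 entries in the x86 .rel.dyn discussed in this thread, bytes_saved(2134, sizeof(rel32_entry)) comes to 12804 bytes, roughly 12.5 kB of the original 17072.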
Jocke

Joakim Tjernlund wrote:
"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 17:35:44:
Joakim Tjernlund wrote:
"J. William Campbell" jwilliamcampbell@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernlund@transmode.se wrote:
Graeme Russ graeme.russ@gmail.com wrote on 11/10/2009 12:47:19:
[Massive Snip :)]
[Yet another SNIP :)]
How do you figure that? You need to rewrite the insns that access the moved data/bss, and they are in flash. Did I miss something?
No, I did. You are quite correct, there would be references in flash that couldn't be fixed. Sorry about that.
Best Regards, Bill Campbell
Jocke