[U-Boot] [PATCH] avr32: fix relocation address calculation

Commit 1865286466a5d0c7f2e3c37632da56556c838e9e (Introduce generic link section.h symbol files) changed the __bss_end symbol type from char[] to ulong. This led to wrong relocation parameters which ended up in a not working u-boot. Unfortunately this is not clear to see cause due to RAM aliasing we may get a 'half-working' u-boot then.
Fix this by dereferencing the __bss_end symbol where needed.
Signed-off-by: Andreas Bießmann andreas.devel@googlemail.com --- arch/avr32/lib/board.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/avr32/lib/board.c b/arch/avr32/lib/board.c index ccf862a..2e79e98 100644 --- a/arch/avr32/lib/board.c +++ b/arch/avr32/lib/board.c @@ -116,7 +116,7 @@ static int display_banner (void) printf ("\n\n%s\n\n", version_string); printf ("U-Boot code: %08lx -> %08lx data: %08lx -> %08lx\n", (unsigned long)_text, (unsigned long)_etext, - (unsigned long)_data, (unsigned long)__bss_end); + (unsigned long)_data, (unsigned long)(&__bss_end)); return 0; }
@@ -183,7 +183,7 @@ void board_init_f(ulong board_type) * - stack */ addr = CONFIG_SYS_SDRAM_BASE + sdram_size; - monitor_len = (char *)__bss_end - _text; + monitor_len = (char *)(&__bss_end) - _text;
/* * Reserve memory for u-boot code, data and bss.

Hi Andreas,
On Wed, 8 May 2013 11:25:17 +0200, Andreas Bießmann andreas.devel@googlemail.com wrote:
Commit 1865286466a5d0c7f2e3c37632da56556c838e9e (Introduce generic link section.h symbol files) changed the __bss_end symbol type from char[] to ulong. This led to wrong relocation parameters which ended up in a not working u-boot. Unfortunately this is not clear to see cause due to RAM aliasing we may get a 'half-working' u-boot then.
Fix this by dereferencing the __bss_end symbol where needed.
(cc:ing Simon and Tom)
The dereferencing is correct, so this patch seems good per se (it could actually have applied when __bss_end was still a char[]).
But the definition of __bss_end to being an ulong, as introduced by 18652864, is plain wrong. Just because it is a linker defined symbol does not mean the object it represents is a 32-bit quantity -- it is not. It still is a non-object, a pure address.
Ditto for __data_end, __rel_dyn_start, __rel_dyn_end and pretty much any symbol in sections.h which is not an offset.
Sorry for not spotting this before it was merged in; but now this must be fixed. I'm ok with the wrongly-ulong symbols being changed back to char[], though I would prefer their type to be char[0] if possible, as this is (admittedly marginally) more likely to help the compiler catch accidental dereferencings.
Amicalement,

Hi Albert,
On 05/10/2013 11:24 AM, Albert ARIBAUD wrote:
Hi Andreas,
On Wed, 8 May 2013 11:25:17 +0200, Andreas Bießmann andreas.devel@googlemail.com wrote:
Commit 1865286466a5d0c7f2e3c37632da56556c838e9e (Introduce generic link section.h symbol files) changed the __bss_end symbol type from char[] to ulong. This led to wrong relocation parameters which ended up in a not working u-boot. Unfortunately this is not clear to see cause due to RAM aliasing we may get a 'half-working' u-boot then.
Fix this by dereferencing the __bss_end symbol where needed.
(cc:ing Simon and Tom)
The dereferencing is correct, so this patch seems good per se (it could actually have applied when __bss_end was still a char[]).
well, as I understood this the __bss_end being a char[] did implicitly take the address when accessing __bss_end (as we do when we have a definition of char foo[2] and we take just 'foo'). But you say here we should reference the address of __bss_end while it was still of type char[]. Sorry, I do not understand that, can you please clarify?
Best regards
Andreas Bießmann

Hi Andreas,
On Fri, 10 May 2013 12:57:21 +0200, "Andreas Bießmann" andreas.devel@googlemail.com wrote:
Hi Albert,
On 05/10/2013 11:24 AM, Albert ARIBAUD wrote:
Hi Andreas,
On Wed, 8 May 2013 11:25:17 +0200, Andreas Bießmann andreas.devel@googlemail.com wrote:
Commit 1865286466a5d0c7f2e3c37632da56556c838e9e (Introduce generic link section.h symbol files) changed the __bss_end symbol type from char[] to ulong. This led to wrong relocation parameters which ended up in a not working u-boot. Unfortunately this is not clear to see cause due to RAM aliasing we may get a 'half-working' u-boot then.
Fix this by dereferencing the __bss_end symbol where needed.
(cc:ing Simon and Tom)
The dereferencing is correct, so this patch seems good per se (it could actually have applied when __bss_end was still a char[]).
well, as I understood this the __bss_end being a char[] did implicitly take the address when accessing __bss_end (as we do when we have a definition of char foo[2] and we take just 'foo'). But you say here we should reference the address of __bss_end while it was still of type char[]. Sorry, I do not understand that, can you please clarify?
There are several concepts here, some pertaining to the compiler, some to the linker.
From the linker viewpoint, a symbol is *always* and *only* an address, the first address of the object corresponding to the symbol, and an object is just some area in the addressable space.
From the compiler viewpoint, an object has a C type, possibly with an initial value, and a name, which is the symbol. The compiler considers the name/symbol to be value, not the address of the corresponding object... at least most of the time: as you indicate, when the symbol denotes a C array, then the C compiler understand the symbol as the address of the array.
The __bss_end symbol does not actually correspond to an object in the usual sense, since the BSS contains all sorts of data: any C global, either uninitialized or initialized with zeroes, whatever its type, ends up in BSS. The most general way one can assign a type to BSS itself is by considering it as a shapeless array of bytes -- hence the char[] definition.
Thus, the C compiler considered the name __bss_end to denote the address of the BSS "object", and the C code for AVR32 was correct as it was actually referring to the BSS "object"'s address.
When the __bss_end symbol's C type was changed to 'ulong', this changed the way the compiler understood the symbol: it now thinks __bss_end is the BSS' "value", which has no true sense, and certainly does not mean 'the first 4 bytes of BSS considered as a 32-bit value'.
To compensate this, the AVR32 code has to add an & to find the address of __bss_end, but the original error is to have changed the "type" of the BSS.
IOW, we should *always* take the address of __bss_end, since this is the only thing it was defined for. We should never give it a chance to even *have* a value at the C level, because we don't want to read, or worse, write, the BSS itself; we only want to access C globals in the BSS.
HTH,
Best regards
Andreas Bießmann
Amicalement,

On May 10, 2013, at 11:09 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
The compiler considers the name/symbol to be value, not the address of the corresponding object... at least most of the time: as you indicate, when the symbol denotes a C array, then the C compiler understand the symbol as the address of the array.
I ran into a case one on Codewarrior Mac PPC tools where there was a subtle different here. In an existing body of code there was originally a global char[] defined with a matching extern char[] declared in a header.
At some point the definition was changed to char* because the size of the array grew and we wanted it out of the global section. I traced an obscure crash to some assembly code that had worked when the definition was char[] but needed an extra level of indirection when it was char *.
During that debugging I found that the declaration had not been changed to char * but the C compiler hadn't cared. It handled the mixed forms just fine despite the clear difference in the code.
I surmised it was some subtle issue around PPC's handling of global data (or the Codewarrior PPC ABI) but still don't really know.
-Mike

Hi Michael,
On Fri, 10 May 2013 13:02:57 -0400, Michael Cashwell mboards@prograde.net wrote:
On May 10, 2013, at 11:09 AM, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
The compiler considers the name/symbol to be value, not the address of the corresponding object... at least most of the time: as you indicate, when the symbol denotes a C array, then the C compiler understand the symbol as the address of the array.
I ran into a case one on Codewarrior Mac PPC tools where there was a subtle different here. In an existing body of code there was originally a global char[] defined with a matching extern char[] declared in a header.
At some point the definition was changed to char* because the size of the array grew and we wanted it out of the global section. I traced an obscure crash to some assembly code that had worked when the definition was char[] but needed an extra level of indirection when it was char *.
Well, of course it did! char* is a pointer to char, with syntactic facilities to use it as a pointer to char array, but char* is not an array. The value of a pointer to char is a (probably 32-bit) address, and you need to dereferenc it to get a char.
During that debugging I found that the declaration had not been changed to char * but the C compiler hadn't cared. It handled the mixed forms just fine despite the clear difference in the code.
Well, the compiler would not care that one C module would know the symbol as char* and another one would know it as char[], since the compiler treats compilation units completely independently. You would end up using the same address space area for two different objects: an array of chars, overlapped with a (probably 32-bit) pointer to char.
Where I worked up until recently, we had a 'coding rule' that required us to always #include a module's .h file (its public interface) from within its .c file (its implementation) if only to make sure the compiler saw both the declarations and the definitions in a single compilation unit, and would thus bark at us for screwing up between declaration and definition.
I surmised it was some subtle issue around PPC's handling of global data (or the Codewarrior PPC ABI) but still don't really know.
This has nothing to do with PPC or CW; this is just C working as designed.
-Mike
Amicalement,

Hi Albert,
On 05/10/2013 05:09 PM, Albert ARIBAUD wrote:
Hi Andreas,
On Fri, 10 May 2013 12:57:21 +0200, "Andreas Bießmann" andreas.devel@googlemail.com wrote:
Hi Albert,
On 05/10/2013 11:24 AM, Albert ARIBAUD wrote:
Hi Andreas,
On Wed, 8 May 2013 11:25:17 +0200, Andreas Bießmann andreas.devel@googlemail.com wrote:
Commit 1865286466a5d0c7f2e3c37632da56556c838e9e (Introduce generic link section.h symbol files) changed the __bss_end symbol type from char[] to ulong. This led to wrong relocation parameters which ended up in a not working u-boot. Unfortunately this is not clear to see cause due to RAM aliasing we may get a 'half-working' u-boot then.
Fix this by dereferencing the __bss_end symbol where needed.
(cc:ing Simon and Tom)
The dereferencing is correct, so this patch seems good per se (it could actually have applied when __bss_end was still a char[]).
well, as I understood this the __bss_end being a char[] did implicitly take the address when accessing __bss_end (as we do when we have a definition of char foo[2] and we take just 'foo'). But you say here we should reference the address of __bss_end while it was still of type char[]. Sorry, I do not understand that, can you please clarify?
There are several concepts here, some pertaining to the compiler, some to the linker.
From the linker viewpoint, a symbol is *always* and *only* an address, the first address of the object corresponding to the symbol, and an object is just some area in the addressable space.
From the compiler viewpoint, an object has a C type, possibly with an initial value, and a name, which is the symbol. The compiler considers the name/symbol to be value, not the address of the corresponding object... at least most of the time: as you indicate, when the symbol denotes a C array, then the C compiler understand the symbol as the address of the array.
The __bss_end symbol does not actually correspond to an object in the usual sense, since the BSS contains all sorts of data: any C global, either uninitialized or initialized with zeroes, whatever its type, ends up in BSS. The most general way one can assign a type to BSS itself is by considering it as a shapeless array of bytes -- hence the char[] definition.
Thus, the C compiler considered the name __bss_end to denote the address of the BSS "object", and the C code for AVR32 was correct as it was actually referring to the BSS "object"'s address.
When the __bss_end symbol's C type was changed to 'ulong', this changed the way the compiler understood the symbol: it now thinks __bss_end is the BSS' "value", which has no true sense, and certainly does not mean 'the first 4 bytes of BSS considered as a 32-bit value'.
To compensate this, the AVR32 code has to add an & to find the address of __bss_end, but the original error is to have changed the "type" of the BSS.
IOW, we should *always* take the address of __bss_end, since this is the only thing it was defined for. We should never give it a chance to even *have* a value at the C level, because we don't want to read, or worse, write, the BSS itself; we only want to access C globals in the BSS.
thank you for your detailed explanation. So now its clear why referring the address of an object of type char[] will also work. Another question, wouldn't it make sense to declare these C globals as const then?
Best regards
Andreas Bießmann

Hi Andreas,
On Mon, 13 May 2013 10:35:12 +0200, "Andreas Bießmann" andreas.devel@googlemail.com wrote:
Hi Albert,
On 05/10/2013 05:09 PM, Albert ARIBAUD wrote:
Hi Andreas,
On Fri, 10 May 2013 12:57:21 +0200, "Andreas Bießmann" andreas.devel@googlemail.com wrote:
Hi Albert,
On 05/10/2013 11:24 AM, Albert ARIBAUD wrote:
Hi Andreas,
On Wed, 8 May 2013 11:25:17 +0200, Andreas Bießmann andreas.devel@googlemail.com wrote:
Commit 1865286466a5d0c7f2e3c37632da56556c838e9e (Introduce generic link section.h symbol files) changed the __bss_end symbol type from char[] to ulong. This led to wrong relocation parameters which ended up in a not working u-boot. Unfortunately this is not clear to see cause due to RAM aliasing we may get a 'half-working' u-boot then.
Fix this by dereferencing the __bss_end symbol where needed.
(cc:ing Simon and Tom)
The dereferencing is correct, so this patch seems good per se (it could actually have applied when __bss_end was still a char[]).
well, as I understood this the __bss_end being a char[] did implicitly take the address when accessing __bss_end (as we do when we have a definition of char foo[2] and we take just 'foo'). But you say here we should reference the address of __bss_end while it was still of type char[]. Sorry, I do not understand that, can you please clarify?
There are several concepts here, some pertaining to the compiler, some to the linker.
From the linker viewpoint, a symbol is *always* and *only* an address, the first address of the object corresponding to the symbol, and an object is just some area in the addressable space.
From the compiler viewpoint, an object has a C type, possibly with an initial value, and a name, which is the symbol. The compiler considers the name/symbol to be value, not the address of the corresponding object... at least most of the time: as you indicate, when the symbol denotes a C array, then the C compiler understand the symbol as the address of the array.
The __bss_end symbol does not actually correspond to an object in the usual sense, since the BSS contains all sorts of data: any C global, either uninitialized or initialized with zeroes, whatever its type, ends up in BSS. The most general way one can assign a type to BSS itself is by considering it as a shapeless array of bytes -- hence the char[] definition.
Thus, the C compiler considered the name __bss_end to denote the address of the BSS "object", and the C code for AVR32 was correct as it was actually referring to the BSS "object"'s address.
When the __bss_end symbol's C type was changed to 'ulong', this changed the way the compiler understood the symbol: it now thinks __bss_end is the BSS' "value", which has no true sense, and certainly does not mean 'the first 4 bytes of BSS considered as a 32-bit value'.
To compensate this, the AVR32 code has to add an & to find the address of __bss_end, but the original error is to have changed the "type" of the BSS.
IOW, we should *always* take the address of __bss_end, since this is the only thing it was defined for. We should never give it a chance to even *have* a value at the C level, because we don't want to read, or worse, write, the BSS itself; we only want to access C globals in the BSS.
thank you for your detailed explanation. So now its clear why referring the address of an object of type char[] will also work. Another question, wouldn't it make sense to declare these C globals as const then?
Indeed, const may help prevent these symbols from being accidentally written into, at least in the most expected cases such as passing &__bss_end to a function expecting a non-const char*.
There is, however, a much better way of preventing this and more: just give these symbols a C type of 'struct {}' (empty struct).
Since this type has absolutely no field which could be written into or read from, it is completely protected from direct write but also from direct read; and a pointer to it has type "struct {} *" which is incompatible with any other pointer, so any inconsiderate use of it is detected at compile time.
I had thought of the 'struct {}' method for linker lists refactoring, when I needed a zero-size type; I finally turned to char[0] instead (and comment on this at line 150 of include/linker_lists.h) because the struct method would cause gcc 4.4 and earlier, such as eldk42, to throw diagnostics like "warning: dereferencing type-punned pointer will break strict-aliasing rules" -- that is the incompatibility I am talking about.
Note that the diagnostics did not stem from the empty struct variable declarations as such, but type-casting the address of an empty struct into a pointer to a known, non-empty, struct; I just checked now, and doing an intermediate cast to char* or void* prevents the warnings. Why I failed to find this when I was refactoring linker lists, I'll never know.
Of course, there is always a possibility that the area around BSS end be accidentally read or written into if &__bss_end is force-cast into, say, a char*, but that's true whatever type you give __bss_end. Aside from this force-casting, the empty struct approach is IMO the most protective.
Thus if one wants to protect symbols in sections.h, I suggest giving them a type of 'struct {}', then building U-Boot with e.g. ELDK42, and fixing all type-punning diagnostics.
(and I will 'fix' include/linker_lists.h myself to use empty structs)
Best regards
Andreas Bießmann
Amicalement,

On Fri, 10 May 2013 11:24:44 +0200, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Sorry for not spotting this before it was merged in; but now this must be fixed. I'm ok with the wrongly-ulong symbols being changed back to char[], though I would prefer their type to be char[0] if possible, as this is (admittedly marginally) more likely to help the compiler catch accidental dereferencings.
Following the latest exchange with Andreas, and after some tests with eldk42 and a gcc 4.7 toolchain, I amend the previous paragraph: rather than 'char[0]', I suggest using 'struct {}'.
Amicalement,

Dear Andreas Bießmann,
Andreas Bießmann andreas.devel@googlemail.com writes:
Commit 1865286466a5d0c7f2e3c37632da56556c838e9e (Introduce generic link section.h symbol files) changed the __bss_end symbol type from char[] to ulong. This led to wrong relocation parameters which ended up in a not working u-boot. Unfortunately this is not clear to see cause due to RAM aliasing we may get a 'half-working' u-boot then.
Fix this by dereferencing the __bss_end symbol where needed.
Signed-off-by: Andreas Bießmann andreas.devel@googlemail.com
arch/avr32/lib/board.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
applied to u-boot-atmel/master, thanks!
Best regards, Andreas Bießmann

On 05/13/2013 10:38 AM, Andreas Bießmann wrote:
Dear Andreas Bießmann,
Andreas Bießmann andreas.devel@googlemail.com writes:
Commit 1865286466a5d0c7f2e3c37632da56556c838e9e (Introduce generic link section.h symbol files) changed the __bss_end symbol type from char[] to ulong. This led to wrong relocation parameters which ended up in a not working u-boot. Unfortunately this is not clear to see cause due to RAM aliasing we may get a 'half-working' u-boot then.
Fix this by dereferencing the __bss_end symbol where needed.
Signed-off-by: Andreas Bießmann andreas.devel@googlemail.com
arch/avr32/lib/board.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
applied to u-boot-atmel/master, thanks!
damn scripts ... have to rework them ;)
participants (3)
-
Albert ARIBAUD
-
Andreas Bießmann
-
Michael Cashwell