
On Sat, Jul 07, 2012 at 03:15:36AM -0700, Albert ARIBAUD wrote:
Hi Allen,
On Fri, 6 Jul 2012 16:17:19 -0700, Allen Martin <amartin@nvidia.com> wrote:
On Fri, Jul 06, 2012 at 01:44:32PM -0700, Stephen Warren wrote:
On 07/06/2012 02:33 PM, Allen Martin wrote:
On Fri, Jul 06, 2012 at 12:09:43PM -0700, Stephen Warren wrote:
On 07/06/2012 12:08 PM, Allen Martin wrote:
Rearrange the link order of libraries to avoid out-of-range relocations in Thumb mode. I have no idea how to fix this for real.
Are the relocations branches or something else? It looks like unconditional jump range is +/-4MB for Thumb1 and +/-16MB for Thumb2, so I'm surprised we'd be exceeding that, considering the U-boot binary is on the order of 256KB on Tegra right now.
This is the relocation type:
arch/arm/lib/libarm.o: In function `__flush_dcache_all':
/home/arm/u-boot/arch/arm/lib/cache.c:52: relocation truncated to fit: R_ARM_THM_JUMP11 against symbol `flush_cache' defined in .text section in arch/arm/cpu/armv7/libarmv7.o
The instruction is a "b.n" rather than a "b": R_ARM_THM_JUMP11 is the relocation for the 16-bit Thumb branch, which only has about +/-2KB of range, nowhere near the +/-4MB/+/-16MB figures above. That is what is causing the problem.
I think because of the weak alias the compiler used a short jump to the local function, but when it got linked it resolved to a function that was too far away for the short jump:
void flush_cache(unsigned long start, unsigned long size) __attribute__((weak, alias("__flush_cache")));
00000002 <__flush_dcache_all>:
       2:   2000        movs    r0, #0
       4:   f04f 31ff   mov.w   r1, #4294967295  ; 0xffffffff
       8:   e7fe        b.n     0 <__flush_cache>
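
For reference, here's a minimal standalone sketch of the pattern (file name, build commands, and the exact codegen are just for illustration, not taken from the U-Boot tree or this patch):

    /* weak_branch.c -- hypothetical standalone reproduction, not a U-Boot file.
     *
     * flush_cache() is only a weak alias for __flush_cache().  The call at the
     * end of __flush_dcache_all() is a tail call, so with sibling-call
     * optimization (enabled at -O2) gcc emits a bare branch instead of a bl.
     * An assembler affected by PR gas/12532 (binutils < 2.22) relaxes that
     * branch to a 16-bit "b.n" because the alias target sits in the same
     * section, even though the weak symbol may be preempted at link time by a
     * strong flush_cache() that lies outside the b.n's ~+/-2KB range, which
     * is what produces "relocation truncated to fit: R_ARM_THM_JUMP11".
     *
     * Illustrative build/inspect commands (toolchain name assumed):
     *   arm-none-eabi-gcc -mthumb -mcpu=cortex-a9 -O2 -c weak_branch.c
     *   arm-none-eabi-objdump -dr weak_branch.o
     */

    void __flush_cache(unsigned long start, unsigned long size)
    {
            /* default no-op implementation */
    }

    /* Boards may override this with a strong definition in another object. */
    void flush_cache(unsigned long start, unsigned long size)
            __attribute__((weak, alias("__flush_cache")));

    void __flush_dcache_all(void)
    {
            /* Tail call: compiled as a plain branch to flush_cache, not a bl. */
            flush_cache(0, ~0UL);
    }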
Ah, that explanation makes sense.
It looks like there's a "-fno-optimize-sibling-calls" option to gcc to avoid this problem, since it keeps tail calls from being emitted as bare branches. Seems a shame to disable that optimization everywhere for this one case, though.
It seems like a bug that the b-vs-b.n optimization is applied to a weak symbol, since the compiler can't possibly know the range of the jump.
Also, I've seen ld for some architectures rewrite the equivalent of b.n to plain b when needing to expand the branch target range; IIRC a process known as "relaxing"? Perhaps gcc is expecting ld to do that, but ld isn't?
And I forgot to mention: the code bloat from disabling the optimization is about 400 bytes (185136 -> 185540), so it's not bad, but it still seems a shame to turn it off tree-wide because of one misoptimized branch.
Can this not be limited to compiling the object files which are known to be sensitive to the problem?
I understand this issue fairly well now. It's a known bug in the assembler that has already been fixed:
http://sourceware.org/bugzilla/show_bug.cgi?id=12532
It only impacts preemptible symbols, and since U-Boot doesn't have any dynamically loadable objects, it's only explicitly defined weak symbols that should trigger the bug.
I built a new toolchain with binutils 2.22 and verified the bug is no longer present there. -fno-optimize-sibling-calls is the correct workaround for toolchains that do have the bug, so conditionally disabling the optimization for binutils < 2.22 seems like the right fix.
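
As a rough sketch, that conditional could live in the ARM config.mk, something like the following (the version probe via $(AS) --version, the BINUTILS_MINOR variable, and the use of PLATFORM_CPPFLAGS are illustrative assumptions, not the actual patch):

    # Hypothetical sketch only -- not the real change.  Every binutils release
    # so far has major version 2, so comparing the minor version is enough.
    BINUTILS_MINOR := $(shell $(AS) --version 2>/dev/null | head -n1 | \
                    sed -n 's/.* 2\.\([0-9][0-9]*\).*/\1/p')

    # PR gas/12532 (narrow-branch relaxation against weak symbols) is fixed
    # in binutils 2.22, so only older assemblers need the workaround.
    ifeq ($(shell [ "$(BINUTILS_MINOR)" -lt 22 ] 2>/dev/null && echo y),y)
    # Keep gcc from turning tail calls into bare branches; calls to weak
    # symbols then stay as bl and cannot overflow a 16-bit branch relocation.
    PLATFORM_CPPFLAGS += -fno-optimize-sibling-calls
    endif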
I ran a quick scrub of the U-Boot tree and there are 195 instances of __attribute__((weak)) spread across 123 source files, so I think just disabling the optimization on the failing object files may be too fragile, as code movement could cause others to crop up.
-Allen