[RFC PATCH u-boot 00/12] U-Boot LTO (Sandbox + ARM Nokia RX-51)

Hello,
I have managed to add support for building U-Boot with LTO (with GCC) in a fairly sane way (in terms of lines of code changed).
This series and its follow-ups will also be available at https://github.com/elkablo/u-boot branch lto.
I have tested these builds on Turris Omnia, Turris MOX and on Nokia N900 (via the test/nokia_rx51_test.sh script). For other tests I have created a pull request on GitHub to trigger CI (https://github.com/u-boot/u-boot/pull/57). For some reason it is still waiting; maybe Azure is not working.
My tests on Omnia and MOX show that U-Boot boots successfully, and basic commands seem to work. But of course something broken by LTO may still be found later.
So for all of you who are interested and have an ARM board, please test this on your boards by enabling the CONFIG_LTO option. Also please report code size reductions. (Chris Packham reports an error related to the jobserver, so if `make -jN` produces an error, please try without the `-jN` flag.)
I have only tested with gcc-10. There are still some warnings printed, like:
  bfd plugin: invalid symbol type found
but these don't seem to matter. I will look into this later.
Here are some results showing by how much the code size was reduced. Note that the SPL binary seems to shrink more (15.4 % on average) than the main binary (4.5 % on average).
I guess this is because of how drivers are written. The optimizer cannot know which code paths won't be used, since it does not see the device tree. Maybe this could be somehow integrated with Simon's work on OF_PLATDATA_INST in the future, to make the compiler optimize out unused code paths in drivers by understanding the device tree.
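To illustrate the guess above, here is a minimal sketch (not U-Boot code; `demo_driver`, the probe functions and the compatible strings are all hypothetical names). The driver is selected by a string that is only known at run time, so even with LTO the compiler has to keep every probe function reachable from the table:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical driver descriptor; this is not U-Boot's struct driver. */
struct demo_driver {
	const char *compatible;
	int (*probe)(void);
};

static int demo_uart_probe(void) { return 1; }
static int demo_spi_probe(void) { return 2; }

/* Every entry is referenced from this table, so none can be discarded. */
static const struct demo_driver demo_drivers[] = {
	{ "demo,uart", demo_uart_probe },
	{ "demo,spi", demo_spi_probe },
};

/*
 * The compatible string arrives at run time (in U-Boot it would come from
 * the device tree), so the optimizer cannot tell which probe is unused.
 */
int demo_probe_by_compatible(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(demo_drivers) / sizeof(demo_drivers[0]); i++) {
		if (strcmp(demo_drivers[i].compatible, compatible) == 0)
			return demo_drivers[i].probe();
	}
	return -1; /* no matching driver */
}
```

If the set of compatible strings were known at build time, the unused table entries and their probe functions could in principle be dropped; that is the kind of integration hinted at above.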
                   u-boot.bin          u-boot-spl.bin
clearfog           4.34 %  19.0 KB     13.55 %  16.8 KB
controlcenterdc    4.79 %  24.2 KB     16.27 %  21.9 KB
db-88f6820-amc     4.23 %  25.0 KB     16.17 %  22.9 KB
db-88f6820-gp      4.42 %  22.1 KB     17.00 %  23.8 KB
helios4            4.32 %  18.9 KB     13.70 %  16.8 KB
nokia_rx51         6.11 %  16.5 KB
turris_mox         4.17 %  31.8 KB
turris_omnia       4.32 %  30.2 KB     14.91 %  16.6 KB
x530               3.93 %  30.0 KB     16.26 %  23.4 KB

Marek

Marek Behún (12):
  build: use thin archives instead of incremental linking
  sandbox: errno: avoid conflict with libc's errno
  linker_lists: declare entries and lists externally visible
  efi_loader: fix warning when linking with LTO
  binman: declare symbols externally visible
  build: support building with Link Time Optimizations
  arch: sandbox: make LTO available
  sandbox: build with LTO
  ARM: make gd a function for LTO
  string: make memcpy() visible to fix LTO linking errors
  arch: ARM: make LTO available
  Nokia RX-51: build with LTO

 Kbuild                             |  2 ++
 Kconfig                            | 19 +++++++++++++++++++
 Makefile                           | 28 +++++++++++++++++++++++++++-
 arch/Kconfig                       |  2 ++
 arch/arm/include/asm/global_data.h |  2 +-
 arch/arm/lib/Makefile              |  2 ++
 arch/sandbox/config.mk             | 10 +++++-----
 configs/nokia_rx51_defconfig       |  1 +
 configs/sandbox64_defconfig        |  1 +
 configs/sandbox_defconfig          |  1 +
 configs/sandbox_flattree_defconfig |  1 +
 configs/sandbox_spl_defconfig      |  1 +
 include/binman.h                   |  1 +
 include/binman_sym.h               |  4 ++--
 include/efi_loader.h               |  4 ++--
 include/errno.h                    |  8 +++++++-
 include/linker_lists.h             |  6 ++++--
 lib/efi_loader/Makefile            |  2 +-
 lib/errno.c                        |  4 +++-
 lib/string.c                       |  3 ++-
 scripts/Makefile.build             |  5 ++---
 scripts/Makefile.lib               |  3 +++
 scripts/Makefile.spl               | 20 ++++++++++++++++----
 23 files changed, 106 insertions(+), 24 deletions(-)

Using thin archives instead of incremental linking
- saves disk space
- works better with dead code elimination
- prepares for potential LTO

Linux has been doing this for some time now; do the same in U-Boot.

Signed-off-by: Marek Behún marek.behun@nic.cz
---
 Makefile               | 2 +-
 arch/sandbox/config.mk | 6 +++---
 scripts/Makefile.build | 5 ++---
 scripts/Makefile.spl   | 7 +++----
 4 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/Makefile b/Makefile
index 6cdd3677eb..33d0b80de8 100644
--- a/Makefile
+++ b/Makefile
@@ -1750,7 +1750,7 @@ ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(ARCH)/Makefile.postlink)
 quiet_cmd_u-boot__ ?= LD $@
       cmd_u-boot__ ?= $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_u-boot) -o $@ \
 		-T u-boot.lds $(u-boot-init) \
-		--start-group $(u-boot-main) --end-group \
+		--whole-archive $(u-boot-main) --no-whole-archive \
 		$(PLATFORM_LIBS) -Map u-boot.map; \
 		$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
diff --git a/arch/sandbox/config.mk b/arch/sandbox/config.mk
index 189e9c2b0c..ebbb094744 100644
--- a/arch/sandbox/config.mk
+++ b/arch/sandbox/config.mk
@@ -17,13 +17,13 @@ PLATFORM_CPPFLAGS += $(shell $(SDL_CONFIG) --cflags)
 endif
 
 cmd_u-boot__ = $(CC) -o $@ -Wl,-T u-boot.lds $(u-boot-init) \
-	-Wl,--start-group $(u-boot-main) -Wl,--end-group \
+	-Wl,--whole-archive $(u-boot-main) -Wl,--no-whole-archive \
 	$(PLATFORM_LIBS) -Wl,-Map -Wl,u-boot.map
 
 cmd_u-boot-spl = (cd $(obj) && $(CC) -o $(SPL_BIN) -Wl,-T u-boot-spl.lds \
 	$(patsubst $(obj)/%,%,$(u-boot-spl-init)) \
-	-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-main)) \
-	$(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
+	-Wl,--whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) -Wl,--no-whole-archive \
+	-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
 	$(PLATFORM_LIBS) -Wl,-Map -Wl,u-boot-spl.map -Wl,--gc-sections)
 
 CONFIG_ARCH_DEVICE_TREE := sandbox
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 705a886cb9..3659d0af1b 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -331,11 +331,10 @@ $(sort $(subdir-obj-y)): $(subdir-ym) ;
 # Rule to compile a set of .o files into one .o file
 #
 ifdef builtin-target
-quiet_cmd_link_o_target = LD $@
+quiet_cmd_link_o_target = AR $@
 # If the list of objects to link is empty, just create an empty built-in.o
 cmd_link_o_target = $(if $(strip $(obj-y)),\
-		      $(LD) $(ld_flags) -r -o $@ $(filter $(obj-y), $^) \
-		      $(cmd_secanalysis),\
+		      rm -f $@; $(AR) cDPrsT $@ $(filter $(obj-y), $^), \
 		      rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS) $@)
 
 $(builtin-target): $(obj-y) FORCE
diff --git a/scripts/Makefile.spl b/scripts/Makefile.spl
index ea4e045769..f9faf804de 100644
--- a/scripts/Makefile.spl
+++ b/scripts/Makefile.spl
@@ -421,10 +421,9 @@ $(obj)/$(SPL_BIN).sym: $(obj)/$(SPL_BIN) FORCE
 # May be overridden by arch/$(ARCH)/config.mk
 quiet_cmd_u-boot-spl ?= LD $@
       cmd_u-boot-spl ?= (cd $(obj) && $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_$(@F)) \
-		       $(patsubst $(obj)/%,%,$(u-boot-spl-init)) --start-group \
-		       $(patsubst $(obj)/%,%,$(u-boot-spl-main)) \
-		       $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) \
-		       --end-group \
+		       $(patsubst $(obj)/%,%,$(u-boot-spl-init)) \
+		       --whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) --no-whole-archive \
+		       --start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) --end-group \
 		       $(PLATFORM_LIBS) -Map $(SPL_BIN).map -o $(SPL_BIN))
 
 $(obj)/$(SPL_BIN): $(u-boot-spl-platdata) $(u-boot-spl-init) \

Hi Marek,
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
Using thin archives instead of incremental linking
- saves disk space
- works better with dead code elimination
- prepares for potential LTO
The commit message is a little bit confusing. This commit actually does 2 things: don't do incremental linking (using --whole-archive), and use thin archive (passing T to ar). I believe they are for different purposes, so we cannot say "using thin archives instead of incremental linking".
Linux has been doing this for some time now; do the same in U-Boot.
Signed-off-by: Marek Behún marek.behun@nic.cz
 Makefile               | 2 +-
 arch/sandbox/config.mk | 6 +++---
 scripts/Makefile.build | 5 ++---
 scripts/Makefile.spl   | 7 +++----
 4 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/Makefile b/Makefile
index 6cdd3677eb..33d0b80de8 100644
--- a/Makefile
+++ b/Makefile
@@ -1750,7 +1750,7 @@ ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(ARCH)/Makefile.postlink)
 quiet_cmd_u-boot__ ?= LD $@
       cmd_u-boot__ ?= $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_u-boot) -o $@ \
 		-T u-boot.lds $(u-boot-init) \
-		--start-group $(u-boot-main) --end-group \
+		--whole-archive $(u-boot-main) --no-whole-archive \
 		$(PLATFORM_LIBS) -Map u-boot.map; \
 		$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
 
diff --git a/arch/sandbox/config.mk b/arch/sandbox/config.mk
index 189e9c2b0c..ebbb094744 100644
--- a/arch/sandbox/config.mk
+++ b/arch/sandbox/config.mk
@@ -17,13 +17,13 @@ PLATFORM_CPPFLAGS += $(shell $(SDL_CONFIG) --cflags)
 endif
 
 cmd_u-boot__ = $(CC) -o $@ -Wl,-T u-boot.lds $(u-boot-init) \
-	-Wl,--start-group $(u-boot-main) -Wl,--end-group \
+	-Wl,--whole-archive $(u-boot-main) -Wl,--no-whole-archive \
 	$(PLATFORM_LIBS) -Wl,-Map -Wl,u-boot.map
 
 cmd_u-boot-spl = (cd $(obj) && $(CC) -o $(SPL_BIN) -Wl,-T u-boot-spl.lds \
 	$(patsubst $(obj)/%,%,$(u-boot-spl-init)) \
-	-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-main)) \
-	$(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
+	-Wl,--whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) -Wl,--no-whole-archive \
+	-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \

u-boot-spl-platdata is still within --start-group, --end-group, is this intentional?

 	$(PLATFORM_LIBS) -Wl,-Map -Wl,u-boot-spl.map -Wl,--gc-sections)
 
 CONFIG_ARCH_DEVICE_TREE := sandbox
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 705a886cb9..3659d0af1b 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -331,11 +331,10 @@ $(sort $(subdir-obj-y)): $(subdir-ym) ;
 # Rule to compile a set of .o files into one .o file
 #
 ifdef builtin-target
-quiet_cmd_link_o_target = LD $@
+quiet_cmd_link_o_target = AR $@
 # If the list of objects to link is empty, just create an empty built-in.o
 cmd_link_o_target = $(if $(strip $(obj-y)),\
-		      $(LD) $(ld_flags) -r -o $@ $(filter $(obj-y), $^) \
-		      $(cmd_secanalysis),\
+		      rm -f $@; $(AR) cDPrsT $@ $(filter $(obj-y), $^), \

Is P required to make everything work?

 		      rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS) $@)
 
 $(builtin-target): $(obj-y) FORCE
diff --git a/scripts/Makefile.spl b/scripts/Makefile.spl
index ea4e045769..f9faf804de 100644
--- a/scripts/Makefile.spl
+++ b/scripts/Makefile.spl
@@ -421,10 +421,9 @@ $(obj)/$(SPL_BIN).sym: $(obj)/$(SPL_BIN) FORCE
 # May be overridden by arch/$(ARCH)/config.mk
 quiet_cmd_u-boot-spl ?= LD $@
       cmd_u-boot-spl ?= (cd $(obj) && $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_$(@F)) \
-		       $(patsubst $(obj)/%,%,$(u-boot-spl-init)) --start-group \
-		       $(patsubst $(obj)/%,%,$(u-boot-spl-main)) \
-		       $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) \
-		       --end-group \
+		       $(patsubst $(obj)/%,%,$(u-boot-spl-init)) \
+		       --whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) --no-whole-archive \
+		       --start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) --end-group \
 		       $(PLATFORM_LIBS) -Map $(SPL_BIN).map -o $(SPL_BIN))
 
 $(obj)/$(SPL_BIN): $(u-boot-spl-platdata) $(u-boot-spl-init) \

Regards, Bin

On Thu, 4 Mar 2021 18:57:11 +0800 Bin Meng bmeng.cn@gmail.com wrote:
Hi Marek,
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
Using thin archives instead of incremental linking
- saves disk space
- works better with dead code elimination
- prepares for potential LTO
The commit message is a little bit confusing. This commit actually does 2 things: don't do incremental linking (using --whole-archive), and use thin archive (passing T to ar). I believe they are for different purposes, so we cannot say "using thin archives instead of incremental linking".
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-main)) \
$(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
-Wl,--whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) -Wl,--no-whole-archive \
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
u-boot-spl-platdata is still within --start-group, --end-group, is this intentional?
I confess that I did not really study these options, I have made these changes according to old LTO patches for Linux. But you are right that it does not make sense. I have fixed this for the next version of this patch.
Is P required to make everything work?
It is not. Removed in next version.

Hi Marek,
On Fri, Mar 5, 2021 at 2:17 AM Marek Behun marek.behun@nic.cz wrote:
On Thu, 4 Mar 2021 18:57:11 +0800 Bin Meng bmeng.cn@gmail.com wrote:
Hi Marek,
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
Using thin archives instead of incremental linking
- saves disk space
- works better with dead code elimination
- prepares for potential LTO
The commit message is a little bit confusing. This commit actually does 2 things: don't do incremental linking (using --whole-archive), and use thin archive (passing T to ar). I believe they are for different purposes, so we cannot say "using thin archives instead of incremental linking".
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-main)) \
$(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
-Wl,--whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) -Wl,--no-whole-archive \
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
u-boot-spl-platdata is still within --start-group, --end-group, is this intentional?
I confess that I did not really study these options, I have made these changes according to old LTO patches for Linux. But you are right that it does not make sense. I have fixed this for the next version of this patch.
Is P required to make everything work?
It is not. Removed in next version.
I did more investigation on this.
The Linux kernel specially added P to ar, in below commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
So it looks like we should keep P here?
But I don't get the point of switching to thin archives. Based on my experiment, LTO does not rely on thin archives. The Linux kernel did not introduce thin archives for LTO. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Regards, Bin

On Fri, Mar 05, 2021 at 09:34:42PM +0800, Bin Meng wrote:
Hi Marek,
On Fri, Mar 5, 2021 at 2:17 AM Marek Behun marek.behun@nic.cz wrote:
On Thu, 4 Mar 2021 18:57:11 +0800 Bin Meng bmeng.cn@gmail.com wrote:
Hi Marek,
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
Using thin archives instead of incremental linking
- saves disk space
- works better with dead code elimination
- prepares for potential LTO
The commit message is a little bit confusing. This commit actually does 2 things: don't do incremental linking (using --whole-archive), and use thin archive (passing T to ar). I believe they are for different purposes, so we cannot say "using thin archives instead of incremental linking".
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-main)) \
$(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
-Wl,--whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) -Wl,--no-whole-archive \
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
u-boot-spl-platdata is still within --start-group, --end-group, is this intentional?
I confess that I did not really study these options, I have made these changes according to old LTO patches for Linux. But you are right that it does not make sense. I have fixed this for the next version of this patch.
Is P required to make everything work?
It is not. Removed in next version.
I did more investigation on this.
The Linux kernel specially added P to ar, in below commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
So it looks like we should keep P here?
But I don't get the point of switching to thin archives. Based on my experiment, LTO does not rely on thin archives. The Linux kernel did not introduce thin archives for LTO. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
So technically it would just be part of dealing with the backlog of kbuild-resync to take it in this series I guess.

On Fri, 5 Mar 2021 08:37:28 -0500 Tom Rini trini@konsulko.com wrote:
On Fri, Mar 05, 2021 at 09:34:42PM +0800, Bin Meng wrote:
Hi Marek,
On Fri, Mar 5, 2021 at 2:17 AM Marek Behun marek.behun@nic.cz wrote:
On Thu, 4 Mar 2021 18:57:11 +0800 Bin Meng bmeng.cn@gmail.com wrote:
Hi Marek,
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
Using thin archives instead of incremental linking
- saves disk space
- works better with dead code elimination
- prepares for potential LTO
The commit message is a little bit confusing. This commit actually does 2 things: don't do incremental linking (using --whole-archive), and use thin archive (passing T to ar). I believe they are for different purposes, so we cannot say "using thin archives instead of incremental linking".
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-main)) \
$(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
-Wl,--whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) -Wl,--no-whole-archive \
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
u-boot-spl-platdata is still within --start-group, --end-group, is this intentional?
I confess that I did not really study these options, I have made these changes according to old LTO patches for Linux. But you are right that it does not make sense. I have fixed this for the next version of this patch.
Is P required to make everything work?
It is not. Removed in next version.
I did more investigation on this.
The Linux kernel specially added P to ar, in below commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
So it looks like we should keep P here?
But I don't get the point of switching to thin archives. Based on my experiment, LTO does not rely on thin archives. The Linux kernel did not introduce thin archives for LTO. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
So technically it would just be part of dealing with the backlog of kbuild-resync to take it in this series I guess.
It seems the P flag is needed for ar, otherwise final linking may fail, for example for nokia rx51. Since Linux uses this as well I am just gonna put it there.

On Fri, 5 Mar 2021 21:34:42 +0800 Bin Meng bmeng.cn@gmail.com wrote:
Hi Marek,
On Fri, Mar 5, 2021 at 2:17 AM Marek Behun marek.behun@nic.cz wrote:
On Thu, 4 Mar 2021 18:57:11 +0800 Bin Meng bmeng.cn@gmail.com wrote:
Hi Marek,
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
Using thin archives instead of incremental linking
- saves disk space
- works better with dead code elimination
- prepares for potential LTO
The commit message is a little bit confusing. This commit actually does 2 things: don't do incremental linking (using --whole-archive), and use thin archive (passing T to ar). I believe they are for different purposes, so we cannot say "using thin archives instead of incremental linking".
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-main)) \
$(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
-Wl,--whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) -Wl,--no-whole-archive \
-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
u-boot-spl-platdata is still within --start-group, --end-group, is this intentional?
I confess that I did not really study these options, I have made these changes according to old LTO patches for Linux. But you are right that it does not make sense. I have fixed this for the next version of this patch.
Is P required to make everything work?
It is not. Removed in next version.
I did more investigation on this.
The Linux kernel specially added P to ar, in below commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
So it looks like we should keep P here?
But I don't get the point of switching to thin archives. Based on my experiment, LTO does not rely on thin archives. The Linux kernel did not introduce thin archives for LTO. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
It does not matter whether we use thin archives or real archives. But we did not use any of this before, instead we linked the object files in a directory into one object file in that directory. And to do this with LTO would cause unnecessary complications.
Marek

When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error:
  .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o

To avoid this conflict, use a different asm label for this variable when CONFIG_SANDBOX is enabled.

Signed-off-by: Marek Behún marek.behun@nic.cz
---
 include/errno.h | 8 +++++++-
 lib/errno.c     | 4 +++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/errno.h b/include/errno.h
index 3af539b9e9..652ad67306 100644
--- a/include/errno.h
+++ b/include/errno.h
@@ -8,7 +8,13 @@
 
 #include <linux/errno.h>
 
-extern int errno;
+#ifdef __SANDBOX__
+#define __errno_asm_label asm("__u_boot_errno")
+#else
+#define __errno_asm_label
+#endif
+
+extern int errno __errno_asm_label;
 
 #define __set_errno(val) do { errno = val; } while (0)
 
diff --git a/lib/errno.c b/lib/errno.c
index 8330a8fd14..ca0c756bd9 100644
--- a/lib/errno.c
+++ b/lib/errno.c
@@ -1 +1,3 @@
-int errno = 0;
+#include <errno.h>
+
+int errno __errno_asm_label = 0;
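
For readers unfamiliar with asm labels, here is a minimal standalone sketch of the technique the patch relies on (a GCC extension, also accepted by Clang). The names `demo_errno`, `DEMO_ERRNO_ASM_LABEL` and `demo_u_boot_errno` are made up for the illustration and do not appear in the patch. The C code keeps using one identifier while the object file carries a different symbol name, so the linker sees no clash with another symbol that has the C-level name:

```c
/*
 * Hypothetical names, not the ones used in the patch.
 * The asm label changes the symbol name emitted into the object file
 * without changing the identifier used in the C source.
 */
#define DEMO_ERRNO_ASM_LABEL __asm__("demo_u_boot_errno")

/* Declaration, as it would appear in a header. */
extern int demo_errno DEMO_ERRNO_ASM_LABEL;

/* Definition, as it would appear in one .c file. */
int demo_errno DEMO_ERRNO_ASM_LABEL = 0;

/* C code still refers to the variable by its source-level name. */
int demo_set_errno(int val)
{
	demo_errno = val;
	return demo_errno;
}
```

Running `nm` on the resulting object should list `demo_u_boot_errno` rather than `demo_errno`, which is the effect the patch is after for sandbox builds.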

On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
To avoid this conflict use different asm label for this variable when CONFIG_SANDBOX is enabled.
Signed-off-by: Marek Behún marek.behun@nic.cz
include/errno.h | 8 +++++++- lib/errno.c | 4 +++- 2 files changed, 10 insertions(+), 2 deletions(-)
Regards, Bin

On Fri, 5 Mar 2021 11:00:45 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
I don't, but this is a bug anyway. The symbol clashes with the symbol from glibc. Does somebody know whether the usage of this symbol in os.c does really use glibc's version or U-Boot's one?

Hi Marek,
On Fri, 5 Mar 2021 at 08:37, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 11:00:45 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
I don't, but this is a bug anyway. The symbol clashes with the symbol from glibc. Does somebody know whether the usage of this symbol in os.c does really use glibc's version or U-Boot's one?
It is intended to use glibc's version. In fact I don't think U-Boot should have an errno. We return errors in each case, as does Linux.
Regards, Simon

On Fri, 5 Mar 2021 09:39:53 -0700 Simon Glass sjg@chromium.org wrote:
Hi Marek,
On Fri, 5 Mar 2021 at 08:37, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 11:00:45 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
I don't, but this is a bug anyway. The symbol clashes with the symbol from glibc. Does somebody know whether the usage of this symbol in os.c does really use glibc's version or U-Boot's one?
It is intended to use glibc's version. In fact I don't think U-Boot should have an errno. We return errors in each case, as does Linux.
The problem is that libc defines errno as a thread-local variable or, in older versions, as a macro expanding to a function dereference, i.e.
  #define errno (*__get_threads_errno())
But U-Boot uses the errno declared in include/errno.h as a plain symbol.
So in order for these two symbols not to clash (in case libc is using thread-local symbol with name errno), we need to rename the U-Boot errno variable's symbol name.
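
A minimal sketch of the macro form described above, with hypothetical names (`demo_libc_errno`, `demo_errno_location`) standing in for whatever a given libc actually uses. It only shows why code written against such a libc never references a plain `int` symbol with that name:

```c
/* Stand-in for per-thread storage; a real libc would keep this per thread. */
static int demo_thread_errno;

/* Hypothetical accessor, playing the role of the libc-internal function. */
int *demo_errno_location(void)
{
	return &demo_thread_errno;
}

/* Every use of demo_libc_errno expands to a dereferenced function call. */
#define demo_libc_errno (*demo_errno_location())

/* A failing call reports its error the way libc functions do. */
int demo_failing_call(void)
{
	demo_libc_errno = 9; /* assigns through the returned pointer */
	return -1;
}
```

With a definition like this there is no object named after the macro for a second, ordinary variable to collide with; the clash reported in the commit message arises in the other case, where libc exports a thread-local symbol.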

Hi Marek,
On Fri, 5 Mar 2021 at 09:50, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 09:39:53 -0700 Simon Glass sjg@chromium.org wrote:
Hi Marek,
On Fri, 5 Mar 2021 at 08:37, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 11:00:45 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
I don't, but this is a bug anyway. The symbol clashes with the symbol from glibc. Does somebody know whether the usage of this symbol in os.c does really use glibc's version or U-Boot's one?
It is intended to use glibc's version. In fact I don't think U-Boot should have an errno. We return errors in each case, as does Linux.
The problem is that libc defines errno as a thread-local variable or, in older versions, as a macro expanding to a function dereference, i.e.
  #define errno (*__get_threads_errno())
But U-Boot uses the errno declared in include/errno.h as a plain symbol.
So in order for these two symbols not to clash (in case libc is using thread-local symbol with name errno), we need to rename the U-Boot errno variable's symbol name.
Rename is OK, but can we delete it instead? I really don't think it should be there.
Regards, Simon

On 05.03.21 17:58, Simon Glass wrote:
Hi Marek,
On Fri, 5 Mar 2021 at 09:50, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 09:39:53 -0700 Simon Glass sjg@chromium.org wrote:
Hi Marek,
On Fri, 5 Mar 2021 at 08:37, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 11:00:45 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
I don't, but this is a bug anyway. The symbol clashes with the symbol from glibc. Does somebody know whether the usage of this symbol in os.c does really use glibc's version or U-Boot's one?
It is intended to use glibc's version. In fact I don't think U-Boot should have an errno. We return errors in each case, as does Linux.
The problem is that libc defines errno as a thread-local variable or, in older versions, as a macro expanding to a function dereference, i.e.
  #define errno (*__get_threads_errno())
But U-Boot uses the errno declared in include/errno.h as a plain symbol.
So in order for these two symbols not to clash (in case libc is using thread-local symbol with name errno), we need to rename the U-Boot errno variable's symbol name.
Rename is OK, but can we delete it instead? I really don't think it should be there.
What makes you think so?
fs/fs.c:614: errno = -ret;
Best regards
Heinrich

On Fri, 5 Mar 2021 09:58:34 -0700 Simon Glass sjg@chromium.org wrote:
Hi Marek,
On Fri, 5 Mar 2021 at 09:50, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 09:39:53 -0700 Simon Glass sjg@chromium.org wrote:
Hi Marek,
On Fri, 5 Mar 2021 at 08:37, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 11:00:45 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
I don't, but this is a bug anyway. The symbol clashes with the symbol from glibc. Does somebody know whether the usage of this symbol in os.c does really use glibc's version or U-Boot's one?
It is intended to use glibc's version. In fact I don't think U-Boot should have an errno. We return errors in each case, as does Linux.
The problem is that libc defines errno as a thread-local variable or, in older versions, as a macro expanding to a function dereference, i.e.
  #define errno (*__get_threads_errno())
But U-Boot uses the errno declared in include/errno.h as a plain symbol.
So in order for these two symbols not to clash (in case libc is using thread-local symbol with name errno), we need to rename the U-Boot errno variable's symbol name.
Rename is OK, but can we delete it instead? I really don't think it should be there.
We can't simply delete it. The whole u-boot is using the errno symbol from include/errno.h and if we want the whole u-boot to use libc's symbol we need to code include/errno.h to declare it in the same way as libc, which may be different for different libcs.

Hi Marek,
On Fri, 5 Mar 2021 at 10:24, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 09:58:34 -0700 Simon Glass sjg@chromium.org wrote:
Hi Marek,
On Fri, 5 Mar 2021 at 09:50, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 09:39:53 -0700 Simon Glass sjg@chromium.org wrote:
Hi Marek,
On Fri, 5 Mar 2021 at 08:37, Marek Behun marek.behun@nic.cz wrote:
On Fri, 5 Mar 2021 11:00:45 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
I don't, but this is a bug anyway. The symbol clashes with the symbol from glibc. Does somebody know whether the usage of this symbol in os.c does really use glibc's version or U-Boot's one?
It is intended to use glibc's version. In fact I don't think U-Boot should have an errno. We return errors in each case, as does Linux.
The problem is that libc defines errno as a thread-local variable or, in older versions, as a macro expanding to a function dereference, i.e.
  #define errno (*__get_threads_errno())
But U-Boot uses the errno declared in include/errno.h as a plain symbol.
So in order for these two symbols not to clash (in case libc is using thread-local symbol with name errno), we need to rename the U-Boot errno variable's symbol name.
Rename is OK, but can we delete it instead? I really don't think it should be there.
We can't simply delete it. The whole u-boot is using the errno symbol from include/errno.h and if we want the whole u-boot to use libc's symbol we need to code include/errno.h to declare it in the same way as libc, which may be different for different libcs.
OK...
Heinrich, I don't think the fs needs to use errno, or perhaps it should have its own local version. It's just not nice to have a global error number IMO.
Anyway, this is for future discussion, not for Marek to worry about. I am fine with Marek's solution.
Regards, Simon

On 05.03.21 16:37, Marek Behun wrote:
On Fri, 5 Mar 2021 11:00:45 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
I don't, but this is a bug anyway. The symbol clashes with the symbol from glibc. Does somebody know whether the usage of this symbol in os.c does really use glibc's version or U-Boot's one?
Hello Marek,
Why do you resort to assembler in your patch instead of simply using:
#define errno __uboot_errno
to substitute the symbol?
Why explicitly set errno = 0? Globals are automatically initialized to zero.
@Bin: Here is an example demonstrating that glibc's errno is used in os.c:
=> host ls hostfs errno = 9 readdir: Bad file descriptor double free or corruption (top) Aborted
caused by the change below:
diff --git a/arch/sandbox/cpu/os.c b/arch/sandbox/cpu/os.c index 3d8af0a52b..5b45296c47 100644 --- a/arch/sandbox/cpu/os.c +++ b/arch/sandbox/cpu/os.c @@ -456,9 +456,12 @@ int os_dirent_ls(const char *dirname, struct os_dirent_node **headp)
for (node = head = NULL;; node = next) { errno = 0; + closedir(dir); entry = readdir(dir); if (!entry) { ret = errno; + printf("errno = %d\n", errno); + perror("readdir"); break; } next = malloc(sizeof(*node) + strlen(entry->d_name) + 1);
Best regards
Heinrich

On Fri, 5 Mar 2021 18:21:51 +0100 Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 05.03.21 16:37, Marek Behun wrote:
On Fri, 5 Mar 2021 11:00:45 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
Do you know if this is the expected behavior when enabling LTO on the compiler?
I don't, but this is a bug anyway. The symbol clashes with the symbol from glibc. Does somebody know whether the usage of this symbol in os.c does really use glibc's version or U-Boot's one?
Hello Marek,
Why do you resort to assembler in your patch instead of simply using:
#define errno __uboot_errno
to substitute the symbol?
Meeeeeh. :D That would just make error messages from gcc more complicated, if suddenly the compiler spat out 2 more lines, saying "in expansion of macro...".
I think that using attributes, static inline functions and everything else the compiler provides instead of macros is better.
Why explicitly set errno = 0? Globals are automatically initialized to zero.
I just added the symbol renaming part, the = 0 assignment was already there. I don't think this commit should remove it. If we want that, we can make it in another commit.
Marek
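
For readers following the asm-label versus macro question, here is a minimal sketch of what an asm label does: the C identifier stays the same in the source, and only the symbol name seen by the linker changes. The names `sketch_errno`, `__sketch_uboot_errno` and `sketch_fail()` are hypothetical and chosen so the example does not collide with the host libc; the actual patch applies the label to U-Boot's own `errno`, and only when CONFIG_SANDBOX is enabled.

```c
/*
 * Sketch only, with hypothetical names. The variable is called
 * sketch_errno in C, but the object file exports it under the
 * symbol name __sketch_uboot_errno, so it can no longer clash with
 * a libc symbol that happens to share the C-level name.
 */
int sketch_errno __asm__("__sketch_uboot_errno") = 0;

/* Callers keep using the C identifier; no macro expansion is involved. */
int sketch_fail(int code)
{
	sketch_errno = code;
	return -1;
}
```

Running `nm` on the resulting object should list `__sketch_uboot_errno` rather than `sketch_errno`, while compiler diagnostics still refer to the identifier as written in the source.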

On Tue, 2 Mar 2021 at 21:12, Marek Behún marek.behun@nic.cz wrote:
When building with LTO, the system libc's `errno` variable used in arch/sandbox/cpu/os.c conflicts with U-Boot's `errno` (defined in lib/errno.c) with the following error: .../ld: errno@@GLIBC_PRIVATE: TLS definition in /lib64/libc.so.6 section .tbss mismatches non-TLS reference in /tmp/u-boot.EQlEXz.ltrans0.ltrans.o
To avoid this conflict use different asm label for this variable when CONFIG_SANDBOX is enabled.
Signed-off-by: Marek Behún marek.behun@nic.cz
include/errno.h | 8 +++++++- lib/errno.c | 4 +++- 2 files changed, 10 insertions(+), 2 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org

Use the `__visible` macro to declare entries and lists declared by ll_entry_declare() and ll_entry_declare_list() externally visible, so that when building with LTO the compiler does not optimize this data away.
Signed-off-by: Marek Behún marek.behun@nic.cz --- include/linker_lists.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/linker_lists.h b/include/linker_lists.h index fd98ecd297..9d44dab2e5 100644 --- a/include/linker_lists.h +++ b/include/linker_lists.h @@ -70,7 +70,8 @@ #define ll_entry_declare(_type, _name, _list) \ _type _u_boot_list_2_##_list##_2_##_name __aligned(4) \ __attribute__((unused, \ - section(".u_boot_list_2_"#_list"_2_"#_name))) + section(".u_boot_list_2_"#_list"_2_"#_name))) \ + __visible
/** * ll_entry_declare_list() - Declare a list of link-generated array entries @@ -93,7 +94,8 @@ #define ll_entry_declare_list(_type, _name, _list) \ _type _u_boot_list_2_##_list##_2_##_name[] __aligned(4) \ __attribute__((unused, \ - section(".u_boot_list_2_"#_list"_2_"#_name))) + section(".u_boot_list_2_"#_list"_2_"#_name))) \ + __visible
/* * We need a 0-byte-size type for iterator symbols, and the compiler

We need to use the __ADDRESSABLE() macro from linux/compiler.h like Linux does in order to make it work even with clang's LTO.
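
To illustrate why such entries are at risk, here is a small host-side sketch of a section-collected list. Nothing in the C code refers to the individual entries by name; they are found only by walking the section, so an optimizer that sees the whole program can consider them dead unless they are marked. This sketch is not U-Boot's implementation: it assumes an ELF target linked with GNU ld (or a compatible linker) that synthesizes `__start_<section>` / `__stop_<section>` symbols, and it uses the `used` attribute as a stand-in for `__visible` / `__ADDRESSABLE()`. All `sketch_*` names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical entry type; 4 bytes so entries pack without padding. */
struct sketch_entry {
	int id;
};

/*
 * Place each entry in the section "sketch_entries". The "used"
 * attribute asks the compiler to emit the object even though no C
 * code references it by name.
 */
#define SKETCH_ENTRY(_name, _id)					\
	struct sketch_entry sketch_entry_##_name			\
	__attribute__((used, aligned(4), section("sketch_entries"))) = { _id }

SKETCH_ENTRY(first, 10);
SKETCH_ENTRY(second, 32);

/* Section boundary symbols provided by the linker (assumption: GNU ld). */
extern struct sketch_entry __start_sketch_entries[];
extern struct sketch_entry __stop_sketch_entries[];

/* Walk the section and add up the ids of all collected entries. */
int sketch_sum_ids(void)
{
	int sum = 0;

	for (struct sketch_entry *e = __start_sketch_entries;
	     e < __stop_sketch_entries; e++)
		sum += e->id;

	return sum;
}
```

If the entries were dropped at link time, the section would be empty (or missing) and the walk would find nothing, which is the class of failure the patch is guarding against.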

On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
Use the `__visible` macro to declare entries and lists declared by ll_entry_declare() and ll_entry_declare_list() externally visible, so that when building with LTO the compiler does not optimize this data away.
__visible is defined like this:
/* * Optional: not supported by clang * * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-ext... */ #if __has_attribute(__externally_visible__) # define __visible __attribute__((__externally_visible__)) #else # define __visible #endif
It says clang does not support this. So what about clang?
Signed-off-by: Marek Behún marek.behun@nic.cz
include/linker_lists.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
Regards, Bin

On Fri, 5 Mar 2021 11:04:08 +0800 Bin Meng bmeng.cn@gmail.com wrote:
On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
Use the `__visible` macro to declare entries and lists declared by ll_entry_declare() and ll_entry_declare_list() externally visible, so that when building with LTO the compiler does not optimize this data away.
__visible is defined like this:
/*
 * Optional: not supported by clang
 *
 * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-ext...
 */
#if __has_attribute(__externally_visible__)
# define __visible __attribute__((__externally_visible__))
#else
# define __visible
#endif
It says clang does not support this. So what about clang?
Signed-off-by: Marek Behún marek.behun@nic.cz
include/linker_lists.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
Regards, Bin
Bin, this is already changed to something different on my github. I will send new version once I am satisfied with CI tests.
Marek

When linking with LTO, the compiler complains about type mismatch of variables `__efi_runtime_start`, `__efi_runtime_stop`, `__efi_runtime_rel_start` and `__efi_runtime_rel_stop`:
include/efi_loader.h:218:21: warning: type of ‘__efi_runtime_start’ does not match original declaration [-Wlto-type-mismatch] 218 | extern unsigned int __efi_runtime_start, __efi_runtime_stop; | ^ arch/sandbox/lib/sections.c:7:6: note: ‘__efi_runtime_start’ was previously declared here 7 | char __efi_runtime_start[0] __attribute__((section(".__efi_runtime_start"))); | ^
Change the type to char[] in include/efi_loader.h.
Signed-off-by: Marek Behún marek.behun@nic.cz --- include/efi_loader.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/efi_loader.h b/include/efi_loader.h index 68daa1a4a9..fbe212e68d 100644 --- a/include/efi_loader.h +++ b/include/efi_loader.h @@ -215,8 +215,8 @@ extern const efi_guid_t efi_guid_capsule_report; /* GUID of firmware management protocol */ extern const efi_guid_t efi_guid_firmware_management_protocol;
-extern unsigned int __efi_runtime_start, __efi_runtime_stop; -extern unsigned int __efi_runtime_rel_start, __efi_runtime_rel_stop; +extern char __efi_runtime_start[], __efi_runtime_stop[]; +extern char __efi_runtime_rel_start[], __efi_runtime_rel_stop[];
/** * struct efi_open_protocol_info_item - open protocol info item

On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
When linking with LTO, the compiler complains about type mismatch of variables `__efi_runtime_start`, `__efi_runtime_stop`, `__efi_runtime_rel_start` and `__efi_runtime_rel_stop`:
include/efi_loader.h:218:21: warning: type of ‘__efi_runtime_start’ does not match original declaration [-Wlto-type-mismatch] 218 | extern unsigned int __efi_runtime_start, __efi_runtime_stop; | ^ arch/sandbox/lib/sections.c:7:6: note: ‘__efi_runtime_start’ was previously declared here 7 | char __efi_runtime_start[0] __attribute__((section(".__efi_runtime_start"))); | ^
Change the type to char[] in include/efi_loader.h.
Signed-off-by: Marek Behún marek.behun@nic.cz
include/efi_loader.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Bin Meng bmeng.cn@gmail.com

Use the `__visible` macro to declare binman symbols externally visible, so that when building with LTO the compiler does not optimize this data away.
Signed-off-by: Marek Behún marek.behun@nic.cz --- include/binman.h | 1 + include/binman_sym.h | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/binman.h b/include/binman.h index 5958dfb448..e72e85d4b7 100644 --- a/include/binman.h +++ b/include/binman.h @@ -9,6 +9,7 @@ #ifndef _BINMAN_H_ #define _BINMAN_H_
+#include <linux/compiler.h> #include <dm/ofnode.h>
/** diff --git a/include/binman_sym.h b/include/binman_sym.h index 72e6765fe5..55421f5893 100644 --- a/include/binman_sym.h +++ b/include/binman_sym.h @@ -33,7 +33,7 @@ * @_prop_name: Property value to get from that entry (e.g. 'pos') */ #define binman_sym_declare(_type, _entry_name, _prop_name) \ - _type binman_symname(_entry_name, _prop_name) \ + _type binman_symname(_entry_name, _prop_name) __visible \ __attribute__((aligned(4), unused, section(".binman_sym")))
/** @@ -58,7 +58,7 @@ * @_prop_name: Property value to get from that entry (e.g. 'pos') */ #define binman_sym_declare_optional(_type, _entry_name, _prop_name) \ - _type binman_symname(_entry_name, _prop_name) \ + _type binman_symname(_entry_name, _prop_name) __visible \ __attribute__((aligned(4), weak, unused, \ section(".binman_sym")))

On Wed, Mar 03, 2021 at 05:12:04AM +0100, Marek Behún wrote:
Use the `__visible` macro to declare binman symbols externally visible, so that when building with LTO the compiler does not optimize this data away.
Signed-off-by: Marek Behún marek.behun@nic.cz
include/binman.h | 1 + include/binman_sym.h | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/binman.h b/include/binman.h index 5958dfb448..e72e85d4b7 100644 --- a/include/binman.h +++ b/include/binman.h @@ -9,6 +9,7 @@ #ifndef _BINMAN_H_ #define _BINMAN_H_
+#include <linux/compiler.h> #include <dm/ofnode.h>
/** diff --git a/include/binman_sym.h b/include/binman_sym.h index 72e6765fe5..55421f5893 100644 --- a/include/binman_sym.h +++ b/include/binman_sym.h @@ -33,7 +33,7 @@
  * @_prop_name: Property value to get from that entry (e.g. 'pos')
  */
 #define binman_sym_declare(_type, _entry_name, _prop_name) \
-	_type binman_symname(_entry_name, _prop_name) \
+	_type binman_symname(_entry_name, _prop_name) __visible \
 		__attribute__((aligned(4), unused, section(".binman_sym")))

 /**
@@ -58,7 +58,7 @@
  * @_prop_name: Property value to get from that entry (e.g. 'pos')
  */
 #define binman_sym_declare_optional(_type, _entry_name, _prop_name) \
-	_type binman_symname(_entry_name, _prop_name) \
+	_type binman_symname(_entry_name, _prop_name) __visible \
 		__attribute__((aligned(4), weak, unused, \
 		section(".binman_sym")))
I see failure to run test suites: https://source.denx.de/u-boot/u-boot/-/jobs/232926 and adding <linux/compiler.h> to binman_sym.h leads to the same problem.

On Wed, 3 Mar 2021 19:37:09 -0500 Tom Rini trini@konsulko.com wrote:
On Wed, Mar 03, 2021 at 05:12:04AM +0100, Marek Behún wrote:
Use the `__visible` macro to declare binman symbols externally visible, so that when building with LTO the compiler does not optimize this data away.
Signed-off-by: Marek Behún marek.behun@nic.cz
include/binman.h | 1 + include/binman_sym.h | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/binman.h b/include/binman.h index 5958dfb448..e72e85d4b7 100644 --- a/include/binman.h +++ b/include/binman.h @@ -9,6 +9,7 @@ #ifndef _BINMAN_H_ #define _BINMAN_H_
+#include <linux/compiler.h> #include <dm/ofnode.h>
/** diff --git a/include/binman_sym.h b/include/binman_sym.h index 72e6765fe5..55421f5893 100644 --- a/include/binman_sym.h +++ b/include/binman_sym.h @@ -33,7 +33,7 @@
  * @_prop_name: Property value to get from that entry (e.g. 'pos')
  */
 #define binman_sym_declare(_type, _entry_name, _prop_name) \
-	_type binman_symname(_entry_name, _prop_name) \
+	_type binman_symname(_entry_name, _prop_name) __visible \
 		__attribute__((aligned(4), unused, section(".binman_sym")))

 /**
@@ -58,7 +58,7 @@
  * @_prop_name: Property value to get from that entry (e.g. 'pos')
  */
 #define binman_sym_declare_optional(_type, _entry_name, _prop_name) \
-	_type binman_symname(_entry_name, _prop_name) \
+	_type binman_symname(_entry_name, _prop_name) __visible \
 		__attribute__((aligned(4), weak, unused, \
 		section(".binman_sym")))
I see failure to run test suites: https://source.denx.de/u-boot/u-boot/-/jobs/232926 and adding <linux/compiler.h> to binman_sym.h leads to the same problem.
I have this fixed in CI already. There is new version of these patches there.

Add plumbing for building U-Boot with Link Time Optimizations (gcc only).
Signed-off-by: Marek Behún marek.behun@nic.cz --- Kbuild | 2 ++ Kconfig | 19 +++++++++++++++++++ Makefile | 26 ++++++++++++++++++++++++++ lib/efi_loader/Makefile | 2 +- scripts/Makefile.lib | 3 +++ scripts/Makefile.spl | 13 +++++++++++++ 6 files changed, 64 insertions(+), 1 deletion(-)
diff --git a/Kbuild b/Kbuild index 1eac091594..bf52e54051 100644 --- a/Kbuild +++ b/Kbuild @@ -10,6 +10,8 @@ generic-offsets-file := include/generated/generic-asm-offsets.h always := $(generic-offsets-file) targets := lib/asm-offsets.s
+CFLAGS_REMOVE_asm-offsets.o := $(LTO_CFLAGS) + $(obj)/$(generic-offsets-file): $(obj)/lib/asm-offsets.s FORCE $(call filechk,offsets,__GENERIC_ASM_OFFSETS_H__)
diff --git a/Kconfig b/Kconfig index 86f0a39bb0..ceba53926f 100644 --- a/Kconfig +++ b/Kconfig @@ -85,6 +85,25 @@ config SPL_OPTIMIZE_INLINING do what it thinks is best, which is desirable in some cases for size reasons.
+config ARCH_SUPPORTS_LTO + bool + +config LTO + bool "Enable Link Time Optimizations" + depends on ARCH_SUPPORTS_LTO + default n + help + This option enables Link Time Optimization (LTO), a mechanism which + allows the compiler to optimize between different compilation units. + + This can optimize away dead code paths, resulting in smaller binary + size (if CC_OPTIMIZE_FOR_SIZE is enabled). + + This option is not available for every architecture and may + introduce bugs. + + If unsure, say n. + config TPL_OPTIMIZE_INLINING bool "Allow compiler to uninline functions marked 'inline' in TPL" depends on TPL diff --git a/Makefile b/Makefile index 33d0b80de8..88600ec101 100644 --- a/Makefile +++ b/Makefile @@ -677,6 +677,21 @@ else KBUILD_CFLAGS += -O2 endif
+LTO_CFLAGS := +LTO_FINAL_LDFLAGS := +export LTO_CFLAGS LTO_FINAL_LDFLAGS +ifdef CONFIG_LTO + # use plugin aware tools + AR = $(CROSS_COMPILE)gcc-ar + NM = $(CROSS_COMPILE)gcc-nm + + LTO_CFLAGS := -flto + LTO_FINAL_LDFLAGS := -fuse-linker-plugin -flto=jobserver \ + -fwhole-program + + KBUILD_CFLAGS += $(LTO_CFLAGS) +endif + KBUILD_CFLAGS += $(call cc-option,-fno-stack-protector) KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks)
@@ -1748,11 +1763,22 @@ ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(ARCH)/Makefile.postlink) # Rule to link u-boot # May be overridden by arch/$(ARCH)/config.mk quiet_cmd_u-boot__ ?= LD $@ +ifdef CONFIG_LTO + cmd_u-boot__ ?= $(CC) -nostdlib -nostartfiles \ + $(LTO_FINAL_CFLAGS) $(c_flags) \ + $(KBUILD_LDFLAGS:%=-Wl,%) $(LDFLAGS_u-boot:%=-Wl,%) \ + -o $@ -T u-boot.lds $(u-boot-init) \ + -Wl,--whole-archive \ + $(u-boot-main) $(PLATFORM_LIBS) \ + -Wl,--no-whole-archive -Wl,-Map,u-boot.map; \ + $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true) +else cmd_u-boot__ ?= $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_u-boot) -o $@ \ -T u-boot.lds $(u-boot-init) \ --whole-archive $(u-boot-main) --no-whole-archive \ $(PLATFORM_LIBS) -Map u-boot.map; \ $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true) +endif
quiet_cmd_smap = GEN common/system_map.o cmd_smap = \ diff --git a/lib/efi_loader/Makefile b/lib/efi_loader/Makefile index 10b42e8847..a5a6639fd3 100644 --- a/lib/efi_loader/Makefile +++ b/lib/efi_loader/Makefile @@ -13,7 +13,7 @@ CFLAGS_efi_boottime.o += \ -DFW_VERSION="0x$(VERSION)" \ -DFW_PATCHLEVEL="0x$(PATCHLEVEL)" CFLAGS_helloworld.o := $(CFLAGS_EFI) -Os -ffreestanding -CFLAGS_REMOVE_helloworld.o := $(CFLAGS_NON_EFI) +CFLAGS_REMOVE_helloworld.o := $(CFLAGS_NON_EFI) $(LTO_CFLAGS)
ifneq ($(CONFIG_CMD_BOOTEFI_HELLO_COMPILE),) always += helloworld.efi diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index 78543c6dd1..78bbebe7e9 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -419,6 +419,9 @@ $(obj)/%_efi.so: $(obj)/%.o $(obj)/efi_crt0.o $(obj)/efi_reloc.o $(obj)/efi_free
targets += $(obj)/efi_crt0.o $(obj)/efi_reloc.o $(obj)/efi_freestanding.o
+CFLAGS_REMOVE_efi_reloc.o := $(LTO_CFLAGS) +CFLAGS_REMOVE_efi_freestanding.o := $(LTO_CFLAGS) + # ACPI # --------------------------------------------------------------------------- # diff --git a/scripts/Makefile.spl b/scripts/Makefile.spl index f9faf804de..b7cab2d302 100644 --- a/scripts/Makefile.spl +++ b/scripts/Makefile.spl @@ -420,11 +420,24 @@ $(obj)/$(SPL_BIN).sym: $(obj)/$(SPL_BIN) FORCE # Rule to link u-boot-spl # May be overridden by arch/$(ARCH)/config.mk quiet_cmd_u-boot-spl ?= LD $@ +ifdef CONFIG_LTO + cmd_u-boot-spl ?= (cd $(obj) && \ + $(CC) -nostdlib -nostartfiles $(LTO_FINAL_LDFLAGS) $(c_flags) \ + $(KBUILD_LDFLAGS:%=-Wl,%) $(LDFLAGS_$(@F):%=-Wl,%) \ + $(patsubst $(obj)/%,%,$(u-boot-spl-init)) \ + -Wl,--whole-archive \ + $(patsubst $(obj)/%,%,$(u-boot-spl-main)) $(PLATFORM_LIBS) \ + -Wl,--no-whole-archive \ + -Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \ + -Wl,-Map,$(SPL_BIN).map -o $(SPL_BIN) \ + ) +else cmd_u-boot-spl ?= (cd $(obj) && $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_$(@F)) \ $(patsubst $(obj)/%,%,$(u-boot-spl-init)) \ --whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) --no-whole-archive \ --start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) --end-group \ $(PLATFORM_LIBS) -Map $(SPL_BIN).map -o $(SPL_BIN)) +endif
$(obj)/$(SPL_BIN): $(u-boot-spl-platdata) $(u-boot-spl-init) \ $(u-boot-spl-main) $(obj)/u-boot-spl.lds FORCE

On Wed, Mar 3, 2021 at 12:13 PM Marek Behún marek.behun@nic.cz wrote:
Add plumbing for building U-Boot with Link Time Optimizations (gcc only).
Signed-off-by: Marek Behún marek.behun@nic.cz
Kbuild | 2 ++ Kconfig | 19 +++++++++++++++++++ Makefile | 26 ++++++++++++++++++++++++++ lib/efi_loader/Makefile | 2 +- scripts/Makefile.lib | 3 +++ scripts/Makefile.spl | 13 +++++++++++++ 6 files changed, 64 insertions(+), 1 deletion(-)
diff --git a/Kbuild b/Kbuild index 1eac091594..bf52e54051 100644 --- a/Kbuild +++ b/Kbuild @@ -10,6 +10,8 @@ generic-offsets-file := include/generated/generic-asm-offsets.h always := $(generic-offsets-file) targets := lib/asm-offsets.s
+CFLAGS_REMOVE_asm-offsets.o := $(LTO_CFLAGS)
$(obj)/$(generic-offsets-file): $(obj)/lib/asm-offsets.s FORCE $(call filechk,offsets,__GENERIC_ASM_OFFSETS_H__)
diff --git a/Kconfig b/Kconfig index 86f0a39bb0..ceba53926f 100644 --- a/Kconfig +++ b/Kconfig @@ -85,6 +85,25 @@ config SPL_OPTIMIZE_INLINING do what it thinks is best, which is desirable in some cases for size reasons.
+config ARCH_SUPPORTS_LTO
+	bool
+
+config LTO
+	bool "Enable Link Time Optimizations"
+	depends on ARCH_SUPPORTS_LTO
+	default n
+	help
+	  This option enables Link Time Optimization (LTO), a mechanism which
+	  allows the compiler to optimize between different compilation units.
+
+	  This can optimize away dead code paths, resulting in smaller binary
+	  size (if CC_OPTIMIZE_FOR_SIZE is enabled).
+
+	  This option is not available for every architecture and may
+	  introduce bugs.
+
+	  If unsure, say n.
+
config TPL_OPTIMIZE_INLINING bool "Allow compiler to uninline functions marked 'inline' in TPL" depends on TPL diff --git a/Makefile b/Makefile index 33d0b80de8..88600ec101 100644 --- a/Makefile +++ b/Makefile @@ -677,6 +677,21 @@ else KBUILD_CFLAGS += -O2 endif
+LTO_CFLAGS :=
+LTO_FINAL_LDFLAGS :=
+export LTO_CFLAGS LTO_FINAL_LDFLAGS
+ifdef CONFIG_LTO
+	# use plugin aware tools
+	AR = $(CROSS_COMPILE)gcc-ar
+	NM = $(CROSS_COMPILE)gcc-nm
+
+	LTO_CFLAGS := -flto
+	LTO_FINAL_LDFLAGS := -fuse-linker-plugin -flto=jobserver \
+		-fwhole-program

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html says:

-fwhole-program
Assume that the current compilation unit represents the whole program being compiled. All public functions and variables with the exception of main and those merged by attribute externally_visible become static functions and in effect are optimized more aggressively by interprocedural optimizers.
This option should not be used in combination with -flto. Instead relying on a linker plugin should provide safer and more precise information.

It suggests this option should not be used in combination with -flto.

+
+	KBUILD_CFLAGS += $(LTO_CFLAGS)
+endif
KBUILD_CFLAGS += $(call cc-option,-fno-stack-protector) KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks)
@@ -1748,11 +1763,22 @@ ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(ARCH)/Makefile.postlink) # Rule to link u-boot # May be overridden by arch/$(ARCH)/config.mk quiet_cmd_u-boot__ ?= LD $@ +ifdef CONFIG_LTO
+	cmd_u-boot__ ?= $(CC) -nostdlib -nostartfiles \
+		$(LTO_FINAL_CFLAGS) $(c_flags) \
+		$(KBUILD_LDFLAGS:%=-Wl,%) $(LDFLAGS_u-boot:%=-Wl,%) \
+		-o $@ -T u-boot.lds $(u-boot-init) \
+		-Wl,--whole-archive \
+		$(u-boot-main) $(PLATFORM_LIBS) \
+		-Wl,--no-whole-archive -Wl,-Map,u-boot.map; \
+		$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)
+else cmd_u-boot__ ?= $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_u-boot) -o $@ \ -T u-boot.lds $(u-boot-init) \ --whole-archive $(u-boot-main) --no-whole-archive \ $(PLATFORM_LIBS) -Map u-boot.map; \ $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true) +endif
quiet_cmd_smap = GEN common/system_map.o cmd_smap = \ diff --git a/lib/efi_loader/Makefile b/lib/efi_loader/Makefile index 10b42e8847..a5a6639fd3 100644 --- a/lib/efi_loader/Makefile +++ b/lib/efi_loader/Makefile @@ -13,7 +13,7 @@ CFLAGS_efi_boottime.o += \ -DFW_VERSION="0x$(VERSION)" \ -DFW_PATCHLEVEL="0x$(PATCHLEVEL)" CFLAGS_helloworld.o := $(CFLAGS_EFI) -Os -ffreestanding -CFLAGS_REMOVE_helloworld.o := $(CFLAGS_NON_EFI) +CFLAGS_REMOVE_helloworld.o := $(CFLAGS_NON_EFI) $(LTO_CFLAGS)
ifneq ($(CONFIG_CMD_BOOTEFI_HELLO_COMPILE),) always += helloworld.efi diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index 78543c6dd1..78bbebe7e9 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -419,6 +419,9 @@ $(obj)/%_efi.so: $(obj)/%.o $(obj)/efi_crt0.o $(obj)/efi_reloc.o $(obj)/efi_free
targets += $(obj)/efi_crt0.o $(obj)/efi_reloc.o $(obj)/efi_freestanding.o
+CFLAGS_REMOVE_efi_reloc.o := $(LTO_CFLAGS) +CFLAGS_REMOVE_efi_freestanding.o := $(LTO_CFLAGS)
# ACPI # --------------------------------------------------------------------------- # diff --git a/scripts/Makefile.spl b/scripts/Makefile.spl index f9faf804de..b7cab2d302 100644 --- a/scripts/Makefile.spl +++ b/scripts/Makefile.spl @@ -420,11 +420,24 @@ $(obj)/$(SPL_BIN).sym: $(obj)/$(SPL_BIN) FORCE # Rule to link u-boot-spl # May be overridden by arch/$(ARCH)/config.mk quiet_cmd_u-boot-spl ?= LD $@ +ifdef CONFIG_LTO
+	cmd_u-boot-spl ?= (cd $(obj) && \
+		$(CC) -nostdlib -nostartfiles $(LTO_FINAL_LDFLAGS) $(c_flags) \
+		$(KBUILD_LDFLAGS:%=-Wl,%) $(LDFLAGS_$(@F):%=-Wl,%) \
+		$(patsubst $(obj)/%,%,$(u-boot-spl-init)) \
+		-Wl,--whole-archive \
+		$(patsubst $(obj)/%,%,$(u-boot-spl-main)) $(PLATFORM_LIBS) \
+		-Wl,--no-whole-archive \
+		-Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \
+		-Wl,-Map,$(SPL_BIN).map -o $(SPL_BIN) \
+	)
+else cmd_u-boot-spl ?= (cd $(obj) && $(LD) $(KBUILD_LDFLAGS) $(LDFLAGS_$(@F)) \ $(patsubst $(obj)/%,%,$(u-boot-spl-init)) \ --whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) --no-whole-archive \ --start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) --end-group \ $(PLATFORM_LIBS) -Map $(SPL_BIN).map -o $(SPL_BIN)) +endif
$(obj)/$(SPL_BIN): $(u-boot-spl-platdata) $(u-boot-spl-init) \ $(u-boot-spl-main) $(obj)/u-boot-spl.lds FORCE --
Regards, Bin

Make LTO available for sandbox architecture.
Signed-off-by: Marek Behún marek.behun@nic.cz --- arch/Kconfig | 1 + arch/sandbox/config.mk | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig index 27843cd79c..a6dab3e56d 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -101,6 +101,7 @@ config RISCV
config SANDBOX bool "Sandbox" + select ARCH_SUPPORTS_LTO select BOARD_LATE_INIT select BZIP2 select CMD_POWEROFF diff --git a/arch/sandbox/config.mk b/arch/sandbox/config.mk index ebbb094744..d9c430794e 100644 --- a/arch/sandbox/config.mk +++ b/arch/sandbox/config.mk @@ -16,11 +16,11 @@ PLATFORM_LIBS += $(shell $(SDL_CONFIG) --libs) PLATFORM_CPPFLAGS += $(shell $(SDL_CONFIG) --cflags) endif
-cmd_u-boot__ = $(CC) -o $@ -Wl,-T u-boot.lds $(u-boot-init) \ +cmd_u-boot__ = $(CC) $(LTO_FINAL_CFLAGS) -o $@ -Wl,-T u-boot.lds $(u-boot-init) \ -Wl,--whole-archive $(u-boot-main) -Wl,--no-whole-archive \ $(PLATFORM_LIBS) -Wl,-Map -Wl,u-boot.map
-cmd_u-boot-spl = (cd $(obj) && $(CC) -o $(SPL_BIN) -Wl,-T u-boot-spl.lds \ +cmd_u-boot-spl = (cd $(obj) && $(CC) $(LTO_FINAL_CFLAGS) -o $(SPL_BIN) -Wl,-T u-boot-spl.lds \ $(patsubst $(obj)/%,%,$(u-boot-spl-init)) \ -Wl,--whole-archive $(patsubst $(obj)/%,%,$(u-boot-spl-main)) -Wl,--no-whole-archive \ -Wl,--start-group $(patsubst $(obj)/%,%,$(u-boot-spl-platdata)) -Wl,--end-group \

Build sandbox targets with LTO enabled.
Signed-off-by: Marek Behún marek.behun@nic.cz --- configs/sandbox64_defconfig | 1 + configs/sandbox_defconfig | 1 + configs/sandbox_flattree_defconfig | 1 + configs/sandbox_spl_defconfig | 1 + 4 files changed, 4 insertions(+)
diff --git a/configs/sandbox64_defconfig b/configs/sandbox64_defconfig index cfda83474b..9a23b0420c 100644 --- a/configs/sandbox64_defconfig +++ b/configs/sandbox64_defconfig @@ -7,6 +7,7 @@ CONFIG_PRE_CON_BUF_ADDR=0x100000 CONFIG_BOOTSTAGE_STASH_ADDR=0x0 CONFIG_DEFAULT_DEVICE_TREE="sandbox64" CONFIG_SANDBOX64=y +CONFIG_LTO=y CONFIG_DEBUG_UART=y CONFIG_DISTRO_DEFAULTS=y CONFIG_FIT=y diff --git a/configs/sandbox_defconfig b/configs/sandbox_defconfig index 5bc90d09a8..1e9bc7699e 100644 --- a/configs/sandbox_defconfig +++ b/configs/sandbox_defconfig @@ -6,6 +6,7 @@ CONFIG_ENV_SIZE=0x2000 CONFIG_PRE_CON_BUF_ADDR=0xf0000 CONFIG_BOOTSTAGE_STASH_ADDR=0x0 CONFIG_DEFAULT_DEVICE_TREE="sandbox" +CONFIG_LTO=y CONFIG_DEBUG_UART=y CONFIG_DISTRO_DEFAULTS=y CONFIG_FIT=y diff --git a/configs/sandbox_flattree_defconfig b/configs/sandbox_flattree_defconfig index 4401f33f0b..caeb6ffaf1 100644 --- a/configs/sandbox_flattree_defconfig +++ b/configs/sandbox_flattree_defconfig @@ -5,6 +5,7 @@ CONFIG_SYS_MEMTEST_END=0x00101000 CONFIG_ENV_SIZE=0x2000 CONFIG_BOOTSTAGE_STASH_ADDR=0x0 CONFIG_DEFAULT_DEVICE_TREE="sandbox" +CONFIG_LTO=y CONFIG_DEBUG_UART=y CONFIG_DISTRO_DEFAULTS=y CONFIG_FIT=y diff --git a/configs/sandbox_spl_defconfig b/configs/sandbox_spl_defconfig index c0118702a8..4760c07e7f 100644 --- a/configs/sandbox_spl_defconfig +++ b/configs/sandbox_spl_defconfig @@ -11,6 +11,7 @@ CONFIG_SPL=y CONFIG_BOOTSTAGE_STASH_ADDR=0x0 CONFIG_DEFAULT_DEVICE_TREE="sandbox" CONFIG_SANDBOX_SPL=y +CONFIG_LTO=y CONFIG_DEBUG_UART=y CONFIG_DISTRO_DEFAULTS=y CONFIG_FIT=y

On ARM the gd pointer is stored in registers r9 / x18. For this the -ffixed-r9 / -ffixed-x18 flag is used when compiling, but using global register variables causes errors when building with LTO, and these errors are very difficult to overcome.
Richard Biener says [1]: Note that global register vars shouldn't be used with LTO and if they are restricted to just a few compilation units the recommended fix is to build those CUs without -flto.
We cannot do this for U-Boot since all CUs use the -ffixed-reg flag.
It seems that with LTO we could in fact store the gd pointer differently and gain performance or size benefit by allowing the compiler to use r9 / x18. But this would need more work.
So for now, when building with LTO, go the clang way, and instead of declaring gd a global register variable we make it a function via macro.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68384
Signed-off-by: Marek Behún marek.behun@nic.cz --- arch/arm/include/asm/global_data.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h index fba655f3b9..d694074bff 100644 --- a/arch/arm/include/asm/global_data.h +++ b/arch/arm/include/asm/global_data.h @@ -91,7 +91,7 @@ struct arch_global_data {
#include <asm-generic/global_data.h>
-#ifdef __clang__ +#if defined(__clang__) || defined(CONFIG_LTO)
#define DECLARE_GLOBAL_DATA_PTR #define gd get_gd()
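
A rough sketch of the "gd as a function via macro" pattern, for readers unfamiliar with it: every use of `gd` expands to an accessor call, so no translation unit needs a global register variable declaration. In U-Boot the ARM accessor reads the reserved register; in this host-side sketch a static pointer stands in for r9 / x18 so the code runs anywhere. The struct layout and all `sketch_*` names are hypothetical.

```c
/* Hypothetical, much reduced stand-in for U-Boot's global data struct. */
struct sketch_global_data {
	unsigned long flags;
	unsigned long baudrate;
};

/* Stand-in for the reserved register (assumption: r9 / x18 on real ARM). */
static struct sketch_global_data *sketch_gd_storage;

/* Accessor: the only place that knows where the pointer is kept. */
static inline struct sketch_global_data *sketch_get_gd(void)
{
	return sketch_gd_storage;
}

/* Setter used once during early init in this sketch. */
static inline void sketch_set_gd(struct sketch_global_data *ptr)
{
	sketch_gd_storage = ptr;
}

/* With the macro, existing code written as gd->field keeps compiling. */
#define gd sketch_get_gd()

unsigned long sketch_read_baudrate(void)
{
	return gd->baudrate;
}
```

The design point is that callers are unchanged: `gd->baudrate` reads the same whether `gd` is a register variable or a macro around an inline function, which is why the patch can switch behaviour with a single preprocessor condition.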

It seems that sometimes (happening on ARM64, for example with turris_mox_defconfig) GCC, when linking with LTO, changes the name of lib/string.c's memcpy() function to memcpy.isra.0.
This is a problem, however, when GCC, for code such as this:
	struct some_struct *info = get_some_struct();
	struct some_struct tmpinfo;
	tmpinfo = *info;
emits a call to memcpy() as part of its builtin behaviour, to copy *info to tmpinfo.
This then results in the following linking error: .../lz4.c:93: undefined reference to `memcpy' .../uuid.c:206: more undefined references to `memcpy' follow
Make memcpy() visible by using the __visible macro to avoid this error.
Signed-off-by: Marek Behún marek.behun@nic.cz --- lib/string.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/string.c b/lib/string.c index 73b984123d..2290af83c4 100644 --- a/lib/string.c +++ b/lib/string.c @@ -16,6 +16,7 @@ */
#include <config.h> +#include <linux/compiler.h> #include <linux/types.h> #include <linux/string.h> #include <linux/ctype.h> @@ -529,7 +530,7 @@ void * memset(void * s,int c,size_t count) * You should not use this function to access IO space, use memcpy_toio() * or memcpy_fromio() instead. */ -void * memcpy(void *dest, const void *src, size_t count) +__visible void * memcpy(void *dest, const void *src, size_t count) { unsigned long *dl = (unsigned long *)dest, *sl = (unsigned long *)src; char *d8, *s8;
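
The situation can be sketched as follows. A plain struct assignment contains no explicit call, yet the compiler is allowed to lower it to a call to `memcpy`, so a symbol with exactly that name must survive to the final link. The sketch below uses a hypothetical `sketch_memcpy()` rather than redefining the host's `memcpy`, and a hypothetical `SKETCH_VISIBLE` macro that mirrors the `__visible` definition quoted earlier in the thread; whether a given compiler actually emits a call for the assignment depends on target, struct size and optimization level.

```c
#include <stddef.h>

/* Hypothetical equivalent of __visible; expands to nothing if unsupported. */
#if defined(__has_attribute)
# if __has_attribute(__externally_visible__)
#  define SKETCH_VISIBLE __attribute__((__externally_visible__))
# endif
#endif
#ifndef SKETCH_VISIBLE
# define SKETCH_VISIBLE
#endif

/*
 * Byte-wise copy marked externally visible, so a whole-program
 * optimizer must keep it under this exact name and signature.
 */
SKETCH_VISIBLE void *sketch_memcpy(void *dest, const void *src, size_t count)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (count--)
		*d++ = *s++;

	return dest;
}

/* Hypothetical struct, large enough that a copy is often lowered to a call. */
struct sketch_info {
	char name[64];
	unsigned long size;
};

/* No call appears in the source; the compiler may still generate one. */
void sketch_copy_info(struct sketch_info *dst, const struct sketch_info *src)
{
	*dst = *src;
}
```

If LTO renames or specializes the only definition (for example to `memcpy.isra.0`), the compiler-generated calls have nothing to bind to, which matches the undefined reference errors quoted above.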

Make LTO available for ARM architecture.
Signed-off-by: Marek Behún <marek.behun@nic.cz>
---
 arch/Kconfig          | 1 +
 arch/arm/lib/Makefile | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index a6dab3e56d..5e80bcebac 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -33,6 +33,7 @@ config ARC

 config ARM
 	bool "ARM architecture"
+	select ARCH_SUPPORTS_LTO
 	select CREATE_ARCH_SYMLINK
 	select HAVE_PRIVATE_LIBGCC if !ARM64
 	select SUPPORT_OF_CONTROL
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 27b12e7f2b..e977aacc61 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -45,6 +45,8 @@ obj-$(CONFIG_SEMIHOSTING) += semihosting.o

 obj-y += bdinfo.o
 obj-y += sections.o
+CFLAGS_REMOVE_sections.o := $(LTO_CFLAGS)
+
 obj-y += stack.o
 ifdef CONFIG_CPU_V7M
 obj-y += interrupts_m.o

Build Nokia RX-51 target with LTO enabled.
Signed-off-by: Marek Behún <marek.behun@nic.cz>
---
 configs/nokia_rx51_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/configs/nokia_rx51_defconfig b/configs/nokia_rx51_defconfig
index 9744d1c322..6b555ca748 100644
--- a/configs/nokia_rx51_defconfig
+++ b/configs/nokia_rx51_defconfig
@@ -3,6 +3,7 @@ CONFIG_ARM=y
 CONFIG_ARCH_OMAP2PLUS=y
 CONFIG_SYS_TEXT_BASE=0x80008000
 CONFIG_NR_DRAM_BANKS=2
+CONFIG_LTO=y
 CONFIG_TARGET_NOKIA_RX51=y
 # CONFIG_SYS_MALLOC_F is not set
 # CONFIG_FIT is not set

On Wed, Mar 03, 2021 at 05:11:59AM +0100, Marek Behún wrote:
> I have tested these builds on Turris Omnia, Turris MOX and on Nokia N900 (via the test/nokia_rx51_test.sh script). For other tests I have created a pull-request on github to trigger CI (https://github.com/u-boot/u-boot/pull/57) For some reason it is waiting now, maybe Azure is not working or something.
As we're on the free tier with Azure it sometimes just queues us up for a long time, this job finally started running recently.
Thanks for starting on this! It's been on my list for a long time, especially since it does give overall better reduction than function/data-sections/discard. It does seem like clang fails to build with this series. One thing I want to try locally, and I'll fire off the results once I do it, is moving to LTO by default for ARM.

On Wed, 3 Mar 2021 11:11:59 -0500 Tom Rini trini@konsulko.com wrote:
> Thanks for starting on this! It's been on my list for a long time, especially since it does give overall better reduction than function/data-sections/discard. It does seem like clang fails to build with this series. One thing I want to try locally, and I'll fire off the results once I do it, is moving to LTO by default for ARM.
Yes, it seems clang is the last thing I need to look at. I did not even try, really, my first priority was gcc. I will look into this tomorrow.
All in all I am happy with this since it seems to be running for several different boards without issue.
If you want to enable LTO by default for ARM, we probably need to determine which gcc version should be minimal for this. Because older gcc versions may have problems with LTO. What is the current minimal version of gcc for U-Boot?
Marek

On Wed, Mar 03, 2021 at 05:41:57PM +0100, Marek Behun wrote:
> Yes, it seems clang is the last thing I need to look at. I did not even try, really, my first priority was gcc. I will look into this tomorrow.
> All in all I am happy with this since it seems to be running for several different boards without issue.
Great. I'll give it a spin on my platforms as well, but I suspect things are good.
> If you want to enable LTO by default for ARM, we probably need to determine which gcc version should be minimal for this. Because older gcc versions may have problems with LTO. What is the current minimal version of gcc for U-Boot?
We, funny enough, have a check for gcc-4.0.4 on ARM, followed by gcc-6.0. That 4.0.4 check should be dropped, and gcc-6.0 is the minimum. I think gcc-7.2 or so is going to be important to keep working but given the age of the Linux Kernel LTO support, we should be fine in that regard.

On Wed, Mar 03, 2021 at 05:41:57PM +0100, Marek Behun wrote:
> Yes, it seems clang is the last thing I need to look at. I did not even try, really, my first priority was gcc. I will look into this tomorrow.
> All in all I am happy with this since it seems to be running for several different boards without issue.
> If you want to enable LTO by default for ARM, we probably need to determine which gcc version should be minimal for this. Because older gcc versions may have problems with LTO. What is the current minimal version of gcc for U-Boot?
So, as I start testing things locally with two additional changes (1. LTO by default 2. No ffunction/data-sections with LTO) we see: https://gist.github.com/trini/350ab850c42293563228b8d68a1bb89a as the detailed size reduction. This also shows that with LTO we want to turn off -ffunction-sections/etc as it's not useful now.

On Wed, Mar 3, 2021 at 3:36 PM Tom Rini trini@konsulko.com wrote:
> So, as I start testing things locally with two additional changes (1. LTO by default 2. No ffunction/data-sections with LTO) we see: https://gist.github.com/trini/350ab850c42293563228b8d68a1bb89a as the detailed size reduction. This also shows that with LTO we want to turn off -ffunction-sections/etc as it's not useful now.
Does ARM vs Thumb matter? I ask because I tried to build the da850evm which is fairly tight in SPL. It builds fine without LTO, but fails with the following:
{standard input}: Assembler messages:
{standard input}:7744: Error: selected processor does not support `mrc p15,0,r2,c1,c0,0' in Thumb mode
{standard input}:7784: Error: selected processor does not support `mcr p15,0,r3,c1,c0,0' in Thumb mode
{standard input}:13192: Error: selected processor does not support `mrc p15,0,r3,c1,c0,0' in Thumb mode
{standard input}:36752: Error: selected processor does not support `mcr p15,0,r3,c7,c7,0' in Thumb mode
lto-wrapper: fatal error: arm-linux-gnueabihf-gcc returned 1 exit status
compilation terminated.
/usr/lib/gcc-cross/arm-linux-gnueabihf/9/../../../../arm-linux-gnueabihf/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

Fix LTO build for some thumb-interwork use cases (such as da850evm_defconfig), where inline assembly such as

  mrc p15,0,r2,c1,c0,0

causes the compiler to fail during LTO linking with

  Error: selected processor does not support `mrc p15,0,r2,c1,c0,0' in Thumb mode
Signed-off-by: Marek Behún <marek.behun@nic.cz>
---
 arch/arm/cpu/arm926ejs/Makefile | 2 ++
 arch/arm/lib/Makefile           | 1 +
 2 files changed, 3 insertions(+)

diff --git a/arch/arm/cpu/arm926ejs/Makefile b/arch/arm/cpu/arm926ejs/Makefile
index af63d5cc5e..98aafe805a 100644
--- a/arch/arm/cpu/arm926ejs/Makefile
+++ b/arch/arm/cpu/arm926ejs/Makefile
@@ -25,6 +25,8 @@ ifndef CONFIG_HAS_THUMB2

 CFLAGS_cpu.o := -marm
 CFLAGS_cache.o := -marm
+CFLAGS_REMOVE_cpu.o := $(LTO_CFLAGS)
+CFLAGS_REMOVE_cache.o := $(LTO_CFLAGS)

 endif
 endif
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index e977aacc61..7f66332715 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -66,6 +66,7 @@ endif

 obj-y += cache.o
 obj-$(CONFIG_SYS_ARM_CACHE_CP15) += cache-cp15.o
+CFLAGS_REMOVE_cache-cp15.o := $(LTO_CFLAGS)

 obj-y += psci-dt.o
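
To illustrate what kind of code triggers the assembler error, here is a minimal sketch. It is not taken from U-Boot: read_cp15_control() is a hypothetical helper, and the branch returning 0 exists only so that the snippet also builds on a host machine. The mrc instruction is the one quoted in the error message. On cores without Thumb-2 it can only be assembled in ARM mode, which is presumably why these files are built with -marm and why they break once code generation is deferred to the LTO link step.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read the CP15 control register (c1, c0, 0).
 * The instruction is only emitted when compiling for 32-bit ARM in
 * ARM (not Thumb) mode.
 */
static inline uint32_t read_cp15_control(void)
{
#if defined(__arm__) && !defined(__thumb__)
	uint32_t val;

	/* Same instruction as in the error message */
	__asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r" (val));
	return val;
#else
	/* Placeholder for hosts and Thumb builds; not a real register value */
	return 0;
#endif
}
```

The preprocessor check is evaluated when the file is compiled, so it cannot protect against the instruction later being assembled in Thumb mode at link time; removing $(LTO_CFLAGS) for the affected objects, as the patch does, avoids that situation.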

On Wed, 3 Mar 2021 16:36:05 -0500 Tom Rini trini@konsulko.com wrote:
> So, as I start testing things locally with two additional changes (1. LTO by default 2. No ffunction/data-sections with LTO) we see: https://gist.github.com/trini/350ab850c42293563228b8d68a1bb89a as the detailed size reduction. This also shows that with LTO we want to turn off -ffunction-sections/etc as it's not useful now.
Tom, I have pushed another version to the GitHub PR to trigger CI, and I am still working on clang. You can look at the GitHub PR if you want to try it yourself. I have also added a patch that disables -ffunction-sections/-fdata-sections on ARM.
Once I manage to make it all work in CI, I will send v2 to the mailing list.
Marek

On Thu, Mar 4, 2021 at 4:43 AM Marek Behun marek.behun@nic.cz wrote:
> Tom, I have pushed another version to the GitHub PR to trigger CI, and I am still working on clang. You can look at the GitHub PR if you want to try it yourself. I have also added a patch that disables -ffunction-sections/-fdata-sections on ARM.
> Once I manage to make it all work in CI, I will send v2 to the mailing list.
I tested this with the imx6q_logic board. I only tested the U-Boot portion, but it appeared to work and it booted the kernel. U-Boot shrank by 7182 bytes (about 3%).
I haven't been able to successfully boot the OMAP3 boards I have yet. I'm still looking into this.
I don't think we should enable LTO by default for all boards yet.
adam

On Thu, 4 Mar 2021 07:46:18 -0600 Adam Ford aford173@gmail.com wrote:
> I tested this with the imx6q_logic board. I only tested the U-Boot portion, but it appeared to work and it booted the kernel. U-Boot shrank by 7182 bytes (about 3%).
> I haven't been able to successfully boot the OMAP3 boards I have yet. I'm still looking into this.
> I don't think we should enable LTO by default for all boards yet.
Adam, did you try the current version from github.com/elkablo/u-boot, branch lto?

On Thu, Mar 4, 2021 at 7:50 AM Marek Behun marek.behun@nic.cz wrote:
> Adam, did you try the current version from github.com/elkablo/u-boot, branch lto?
Not yet. The SPL issue that I am fighting appears to be a regression in master somewhere between 2020.04-rc1 and the current head. I want to resolve that issue before I get back to re-testing the LTO stuff. I don't really have time to git bisect now, but I'll try to work on it this weekend. This is very exciting to me because of the very limited SPL space in several of my 32-bit ARM boards. From my build-only tests, my SPL sizes are shrinking 10-20% with LTO enabled depending on the board.
Thank you for the work you've done.
adam

On Thu, Mar 04, 2021 at 07:46:18AM -0600, Adam Ford wrote:
> I tested this with the imx6q_logic board. I only tested the U-Boot portion, but it appeared to work and it booted the kernel. U-Boot shrank by 7182 bytes (about 3%).
> I haven't been able to successfully boot the OMAP3 boards I have yet. I'm still looking into this.
It boots (and pytest runs) on my Beagleboard xM, fwiw.

On Thu, Mar 4, 2021 at 8:58 AM Tom Rini trini@konsulko.com wrote:
> It boots (and pytest runs) on my Beagleboard xM, fwiw.
Interesting. With LTO enabled, the DRAM size is reported as 7.2 GiB and U-Boot hangs. Without LTO, U-Boot boots fine.

LTO:
OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 GHz
Model: LogicPD Zoom DM3730 Torpedo + Wireless Development Kit
Logic DM37x/OMAP35x reference board + LPDDR/NAND
DRAM: 7.2 GiB
<hang>

Without LTO:
OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 GHz
Model: LogicPD Zoom DM3730 Torpedo + Wireless Development Kit
Logic DM37x/OMAP35x reference board + LPDDR/NAND
DRAM: 256 MiB
NAND: 512 MiB
MMC: OMAP SD/MMC: 0
Loading Environment from NAND... OK
OMAP die ID: 619e00029ff800000168300f1502501f
Net: smc911x-0
Hit any key to stop autoboot: 0
OMAP Logic #
adam

On Thu, 4 Mar 2021 09:07:33 -0600 Adam Ford aford173@gmail.com wrote:
> Interesting. With LTO enabled, the DRAM size is reported as 7.2 GiB and U-Boot hangs. Without LTO, U-Boot boots fine.
Tom, I think this means it would take some time to make it stable enough to enable it by default on all ARM boards.
But I would really like to make it work ASAP at least for Nokia N900, because otherwise my `mtd` patches make the binary ~200 bytes too big for Nokia N900 and CI fails.
Or would you be willing to accept `mtd` patches even with CI failing?
Marek

On Thu, Mar 04, 2021 at 04:37:29PM +0100, Marek Behun wrote:
> Tom, I think this means it would take some time to make it stable enough to enable it by default on all ARM boards.
> But I would really like to make it work ASAP at least for Nokia N900, because otherwise my `mtd` patches make the binary ~200 bytes too big for Nokia N900 and CI fails.
> Or would you be willing to accept `mtd` patches even with CI failing?
You've posted this for about a day now, so I hope we can sort things out in time to merge for v2021.07 :) CI does need to pass, but we also need these savings for a variety of reasons, so let's see what needs to get tweaked to make it work.

On Thu, 4 Mar 2021 09:07:33 -0600 Adam Ford aford173@gmail.com wrote:
[...]
In which file is the code that determines the DRAM size for this board? Try adding CFLAGS_REMOVE_file.o := $(LTO_CFLAGS) to the Makefile in the directory where that file is located.
Marek

On Thu, Mar 4, 2021 at 9:59 AM Marek Behun marek.behun@nic.cz wrote:
[...]
Marek / Tom,
I modified arch/arm/mach-omap2/omap3/Makefile with the above patch, and the U-Boot portion appears correctly now when I use an SPL that was compiled without LTO:
U-Boot 2021.04-rc3-00277-ge47d3424df-dirty (Mar 04 2021 - 16:09:09 -0600)
OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 GHz
Model: LogicPD Zoom DM3730 Torpedo + Wireless Development Kit
Logic DM37x/OMAP35x reference board + LPDDR/NAND
DRAM: 256 MiB
NAND: 512 MiB
MMC: OMAP SD/MMC: 0
Loading Environment from NAND... OK
OMAP die ID: 619e00029ff800000168300f1502501f
Net: eth0: ethernet@08000000
Hit any key to stop autoboot: 0
OMAP Logic #
Unfortunately, the SPL portion doesn't boot when compiled with LTO, but I think we're getting closer. I don't have a good debugger to use, and without any serial port output it may be difficult for me to debug.
Tom,
Since you have an OMAP3 board:
diff --git a/arch/arm/mach-omap2/omap3/Makefile b/arch/arm/mach-omap2/omap3/Makefile
index 91ed8ebc9f..a2cc21c6d2 100644
--- a/arch/arm/mach-omap2/omap3/Makefile
+++ b/arch/arm/mach-omap2/omap3/Makefile
@@ -6,6 +6,8 @@
 # If clock.c is compiled for Thumb2, then it fails on OMAP3530
 CFLAGS_clock.o += -marm
 
+CFLAGS_REMOVE_file.o := $(LTO_CFLAGS)
+
 obj-y := lowlevel_init.o
 obj-y += board.o

On Thu, 4 Mar 2021 16:18:03 -0600 Adam Ford aford173@gmail.com wrote:
> diff --git a/arch/arm/mach-omap2/omap3/Makefile b/arch/arm/mach-omap2/omap3/Makefile
> index 91ed8ebc9f..a2cc21c6d2 100644
> --- a/arch/arm/mach-omap2/omap3/Makefile
> +++ b/arch/arm/mach-omap2/omap3/Makefile
> @@ -6,6 +6,8 @@
> # If clock.c is compiled for Thumb2, then it fails on OMAP3530
> CFLAGS_clock.o += -marm
>
> +CFLAGS_REMOVE_file.o := $(LTO_CFLAGS)
Eh? There is no file.c in arch/arm/mach-omap2/omap3/ directory.

On Thu, Mar 4, 2021 at 4:33 PM Marek Behun marek.behun@nic.cz wrote:
[...]
It must be from something else that had changed when I did a git pull on your repo. I guess that's better news.
adam

On 05.03.21 12:25, Adam Ford wrote:
[...]
It might be a misunderstanding. Marek means that you need to replace "file" with your real filename, like:
driver_spi.c ->
+CFLAGS_REMOVE_driver_spi.o := $(LTO_CFLAGS)
HTH, Stefan

On Fri, Mar 5, 2021 at 6:31 AM Stefan Roese sr@denx.de wrote:
[...]
I first did a git pull from his git repo, and then blindly copy-pasted the suggested line above directly. I don't play with Makefiles often, so I didn't notice that I needed to match the line to an actual file. When I rebuilt, my initial issue (too much DDR reported, followed by a hang) went away, so I wrongly assumed that this had fixed it. In reality, I think it was a patch that had been applied to his git repo and got pulled in at the same time.
On the testing front, I started testing the da850evm (ARM9) this morning. I ran into an issue that I apparently caused some time ago, and I'm trying to resolve that now. I have it working, but I need to clean it up. I'll push the clean-up to the ML then try the LTO on that board to see if U-Boot and/or SPL work with LTO enabled.
As of now, I have one board that seems to fully boot (imx6q_logic) and a family of boards that work in U-Boot but not SPL (omap3_logic).
After the da850evm, I'll test a 64-bit Renesas board without SPL and a 64-bit NXP board with SPL.
In the build testing I have done, I am seeing an average reduction of 10-20% in SPL size and about 3% in U-Boot size.
adam

On Fri, Mar 5, 2021 at 11:10 AM Adam Ford aford173@gmail.com wrote:
[...]
With a patch that I've already sent to the mailing list for the da850evm, it's booting both SPL and U-Boot.
With the compiler I have, the code went from:
SPL     24305
U-Boot  381930
To:
SPL     20937
U-Boot  358780
For a reduction of:
SPL     -3368 (-13.86%)
U-Boot  -23150 (-6.06%)
I didn't test the NOR or NAND booting versions of the da850evm, but I will try to do that when the time comes.
adam

On Fri, Mar 5, 2021 at 9:03 PM Adam Ford aford173@gmail.com wrote:
[...]
I ran the same tests on the imx8mn_beacon board, and this board boots.
Without LTO:
SPL     82487
U-Boot  704477
With LTO:
SPL     74526
U-Boot  670859
For a reduction of:
SPL     -7961 (-9.65%)
U-Boot  -33618 (-4.77%)
adam
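The reductions reported in this thread are plain arithmetic on the before/after byte counts. As a minimal sketch for rechecking them (Python is used here only for illustration; the helper name `size_change` and the `reports` table are hypothetical, and the byte counts are copied from the da850evm and imx8mn_beacon reports above):

```python
def size_change(before, after):
    """Return (byte delta, percent delta) of `after` relative to `before`."""
    delta = after - before
    # Percentage is relative to the non-LTO size, rounded to two decimals.
    return delta, round(100.0 * delta / before, 2)


# (size without LTO, size with LTO) in bytes, as reported in the thread.
reports = {
    ("da850evm", "SPL"): (24305, 20937),
    ("da850evm", "U-Boot"): (381930, 358780),
    ("imx8mn_beacon", "SPL"): (82487, 74526),
    ("imx8mn_beacon", "U-Boot"): (704477, 670859),
}

for (board, image), (before, after) in reports.items():
    delta, pct = size_change(before, after)
    print(f"{board:14} {image:7} {delta:+7d} bytes ({pct:+.2f}%)")
```

Feeding in sizes from other boards the same way makes it easy to compare results, provided the same toolchain is used for both builds.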

On Sat, 6 Mar 2021 05:12:12 -0600 Adam Ford aford173@gmail.com wrote:
[...]
Thank you, Adam, for these tests. I think you are right that we should not enable LTO for all ARM boards, only for those that have been tested, at least until, say, about 80% of them have been tested.
Marek

On Sat, Mar 06, 2021 at 06:37:49PM +0100, Marek Behun wrote:
[...]
Perhaps we'll default to yes on some SoCs. The omap3 thing is a bit odd, but we'll see what happens on real N900 hardware. I'm pretty confident that the LTO-related issues are in the SoC-specific areas and probably related to some assumption or another about where code gets located within the resulting binary.

On Saturday 06 March 2021 15:08:13 Tom Rini wrote:
Perhaps we'll default to yes on some SoCs. The omap3 thing is a bit odd, but we'll see what happens on real N900 hardware.
Hello!
Could you send me a link to the git repo / branch and tell me which commit I should test on real N900 hardware? I will test it and let you know the results.
Adding the maemo ML to the loop, as there are more people with N900 HW and U-Boot on that list.

On Sat, 6 Mar 2021 21:41:14 +0100 Pali Rohár pali@kernel.org wrote:
[...]
https://github.com/elkablo/u-boot branch lto

On Saturday 06 March 2021 21:54:00 Marek Behun wrote:
[...]
Sorry, compilation is failing :-(
$ git clone https://github.com/elkablo/u-boot -b lto --depth=100
Cloning into 'u-boot'...
remote: Enumerating objects: 33644, done.
remote: Counting objects: 100% (33644/33644), done.
remote: Compressing objects: 100% (20116/20116), done.
remote: Total 33644 (delta 15838), reused 19947 (delta 13018), pack-reused 0
Receiving objects: 100% (33644/33644), 26.28 MiB | 10.21 MiB/s, done.
Resolving deltas: 100% (15838/15838), done.
$ cd u-boot
$ make CROSS_COMPILE=arm-linux-gnueabi- nokia_rx51_config
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kconfig/conf.o
  YACC    scripts/kconfig/zconf.tab.c
  LEX     scripts/kconfig/zconf.lex.c
  HOSTCC  scripts/kconfig/zconf.tab.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
$ make CROSS_COMPILE=arm-linux-gnueabi- u-boot.bin
...
  LTO     u-boot
/usr/lib/gcc-cross/arm-linux-gnueabi/8/../../../../arm-linux-gnueabi/bin/ld: /usr/lib/gcc-cross/arm-linux-gnueabi/8/../../../../arm-linux-gnueabi/bin/ld: DWARF error: offset (1258291444) greater than or equal to .debug_str size (676)
/usr/lib/gcc-cross/arm-linux-gnueabi/8/../../../../arm-linux-gnueabi/bin/ld: DWARF error: offset (1459618036) greater than or equal to .debug_str size (676)
/usr/lib/gcc-cross/arm-linux-gnueabi/8/../../../../arm-linux-gnueabi/bin/ld: DWARF error: could not find abbrev number 48028
/tmp/cc8l0QSQ.ltrans3.ltrans.o: in function `omap3_set_aux_cr_secure':
<artificial>:(.text+0x6eb8): undefined reference to `do_omap3_emu_romcode_call'
collect2: error: ld returned 1 exit status
make: *** [Makefile:1808: u-boot] Error 1
I'm using arm-linux-gnueabi-gcc version 8.3.0 which is available in current Debian stable (Debian 10 Buster).

On Sat, 6 Mar 2021 22:00:45 +0100 Pali Rohár pali@kernel.org wrote:
[...]
Fixed and force-pushed, it seems ar needs the P flag that Bin Meng questioned.

On Saturday 06 March 2021 22:19:22 Marek Behun wrote:
[...]
The problem is fixed and compilation now succeeds. u-boot.bin has a size of 243788 bytes.
And it seems that the compiled U-Boot is working fine!
Nokia RX-51 # version
U-Boot 2021.04-rc3-00338-g88d0a5042c97 (Mar 06 2021 - 22:19:08 +0100)
arm-linux-gnueabi-gcc (Debian 8.3.0-2) 8.3.0
GNU ld (GNU Binutils for Debian) 2.31.1
I can send binary files via usbtty and 'loadb' command. I can boot linux kernel via 'bootm'. I can chainload to another U-Boot binary (loaded by 'loadb') via 'go' command. Also 'ext4ls' and 'fatls' commands are working. Also 'onenand dump bootloader' is working.
Do you need something more to test?
If not you can add my 'Tested-by: Pali Rohár pali@kernel.org' line for all patches which are up to the commit 88d0a5042c97.
Good job!

On Sat, 6 Mar 2021 22:38:52 +0100 Pali Rohár pali@kernel.org wrote:
[...]
Thanks.
I am still working on these patches, since I have discovered some more defconfigs which fail to build for one reason or another.
I will send a patch enabling LTO for Nokia N900 though.
marek

On Sat, Mar 6, 2021 at 3:49 PM Marek Behun marek.behun@nic.cz wrote:
I will send a patch enabling LTO for Nokia N900 though.
I have my DM3730 booting now
diff --git a/arch/arm/mach-omap2/omap3/Makefile b/arch/arm/mach-omap2/omap3/Makefile
index 91ed8ebc9f..4c96c81bf7 100644
--- a/arch/arm/mach-omap2/omap3/Makefile
+++ b/arch/arm/mach-omap2/omap3/Makefile
@@ -6,6 +6,8 @@
 # If clock.c is compiled for Thumb2, then it fails on OMAP3530
 CFLAGS_clock.o += -marm
 
+CFLAGS_REMOVE_board.o := $(LTO_CFLAGS)
+
 obj-y := lowlevel_init.o
 
 obj-y += board.o
The board.c file has a few functions with assembly code and some functions called by assembly. I wonder if some compiler flags need to be added to it to ensure it boots. I am not as experienced with compiler directives and Makefile tweaks, but I am willing to try stuff if people have suggestions.
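For anyone who wants to experiment with a per-function alternative to dropping the LTO flags for the whole object, here is a minimal sketch. It assumes GCC's documented `used` function attribute, which asks the compiler to emit a function even when no C caller is visible (for example, when it is only referenced from assembly). The function name rom_call_helper and its body are hypothetical and are not taken from board.c, and nothing here has been tested on OMAP3 hardware.

```c
/*
 * Illustrative sketch only, not code from U-Boot's board.c.
 * rom_call_helper() is a hypothetical stand-in for a function that,
 * in a real build, would be referenced only from an assembly file.
 */
#include <stdint.h>

/*
 * "used" asks GCC to emit the function even if it looks unreferenced
 * from C; "noinline" keeps it as a separate out-of-line symbol that
 * assembly code could branch to.
 */
__attribute__((used, noinline))
uint32_t rom_call_helper(uint32_t service_id, uint32_t arg)
{
	/* Placeholder body standing in for the real work. */
	return (service_id << 16) | (arg & 0xffffu);
}
```

Whether this would be enough for the functions in board.c is an open question; the CFLAGS_REMOVE_board.o change above is the only approach reported as booting so far.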
However, at least for now, the omap3_logic board works. I haven't tried the AM3517 yet. It's similar to the omap3 with a different memory controller so I'm more concerned about it than the OMAP3530. If I have time tomorrow, I'll run some tests on the AM3517.
adam
marek

On Sat, 6 Mar 2021 21:45:02 -0600 Adam Ford aford173@gmail.com wrote:
Adding to my patches, thanks.
Marek

On Sat, Mar 6, 2021 at 10:06 PM Marek Behun marek.behun@nic.cz wrote:
Adding to my patches, thanks.
Do you think you'll be re-submitting any of this for the next release of U-Boot? I felt like we had some good momentum going.
Marek

On Sun, 9 May 2021 09:14:14 -0500 Adam Ford aford173@gmail.com wrote:
Do you think you'll be re-submitting any of this for the next release of U-Boot? I felt like we had some good momentum going.
Marek
I shall look into this again this week. There is one problem I discovered last month and haven't yet had time to look into. It is pretty important: the network does not work with the mvneta driver on Turris Omnia when compiled with LTO...
Marek

Hi Marek,
On Sun, 9 May 2021 at 12:45, Marek Behun marek.behun@nic.cz wrote:
I shall look into this again this week. There is one problem I discovered last month and haven't yet had time to look into. It is pretty important: the network does not work with the mvneta driver on Turris Omnia when compiled with LTO...
Granted that this might be a more widespread issue, but you can make LTO depend on that driver not being present. The way to get people to try it out is to get something merged. People can turn it on and help you fix the remaining problems.
Regards, Simon

On Monday 10 May 2021 10:28:05 Simon Glass wrote:
Hi Marek,
On Sun, 9 May 2021 at 12:45, Marek Behun marek.behun@nic.cz wrote:
I shall look into this again this week. There is one problem I discovered last month and haven't yet had time to look into. It is pretty important: the network does not work with the mvneta driver on Turris Omnia when compiled with LTO...
Granted that this might be a more widespread issue, but you can make LTO depend on that driver not being present. The way to get people to try it out is to get something merged. People can turn it on and help you fix the remaining problems.
I agree. At least patches can be merged without enabling LTO by default and issues in particular drivers can be fixed later.
Regards, Simon

On Thu, Mar 04, 2021 at 09:07:33AM -0600, Adam Ford wrote:
On Thu, Mar 4, 2021 at 8:58 AM Tom Rini trini@konsulko.com wrote:
On Thu, Mar 04, 2021 at 07:46:18AM -0600, Adam Ford wrote:
On Thu, Mar 4, 2021 at 4:43 AM Marek Behun marek.behun@nic.cz wrote:
On Wed, 3 Mar 2021 16:36:05 -0500 Tom Rini trini@konsulko.com wrote:
So, as I start testing things locally with two additional changes (1. LTO by default 2. No ffunction/data-sections with LTO) we see: https://gist.github.com/trini/350ab850c42293563228b8d68a1bb89a as the detailed size reduction. This also shows that with LTO we want to turn off -ffunction-sections/etc as it's not useful now.
Tom, I have pushed another version to the github PR to trigger CI, and am still working on clang. You can look at the github PR if you want to try yourself. I have also added a patch that disables -ffunction-sections/-fdata-sections on arm.
After I manage to make it all work in CI I will send v2 to mailing list.
I tested this with the imx6q_logic board. I only tested the U-Boot portion, but it appeared to work and it booted the kernel. The U-Boot size was reduced by 7182 bytes (about 3%).
I haven't been able to successfully boot the OMAP3 boards I have yet. I'm still looking into this.
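For anyone reporting size reductions in the same form as the imx6q_logic figure above, the percentage can be computed as in the sketch below. The helper name `size_reduction_percent` is made up for illustration, and the sizes in the usage note are hypothetical values chosen only so that the difference is 7182 bytes. The thread does not state the actual binary sizes.

```c
/*
 * Percentage by which a binary shrank, given its size in bytes
 * before and after a change. Returns 0.0 for an empty "before"
 * size; a negative result means the binary grew.
 */
double size_reduction_percent(unsigned long before, unsigned long after)
{
	if (before == 0)
		return 0.0;

	return 100.0 * ((double)before - (double)after) / (double)before;
}
```

With hypothetical sizes of 239400 bytes before and 232218 bytes after, the difference is 7182 bytes and the function returns a value of about 3.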
It boots (and pytest runs) on my Beagleboard xM, fwiw.
Interesting. With LTO enabled, U-Boot reports 7.2 GiB of DRAM and hangs. Without LTO, U-Boot boots fine.
LTO:
OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 GHz
Model: LogicPD Zoom DM3730 Torpedo + Wireless Development Kit
Logic DM37x/OMAP35x reference board + LPDDR/NAND
DRAM: 7.2 GiB
<hang>

Without LTO:
OMAP3630/3730-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 GHz
Model: LogicPD Zoom DM3730 Torpedo + Wireless Development Kit
Logic DM37x/OMAP35x reference board + LPDDR/NAND
DRAM: 256 MiB
NAND: 512 MiB
MMC: OMAP SD/MMC: 0
Loading Environment from NAND... OK
OMAP die ID: 619e00029ff800000168300f1502501f
Net: smc911x-0
Hit any key to stop autoboot: 0
OMAP Logic #
I take it back, I had my bbxm disabled in the loop and indeed it doesn't boot either. What's odd there is that N900 is omap3.

On Thursday 04 March 2021 11:17:01 Tom Rini wrote:
I take it back, I had my bbxm disabled in the loop and indeed it doesn't boot either. What's odd there is that N900 is omap3.
Marek and Azure tested it only in qemu-system.
We already know that real HW can behave differently from HW emulated in qemu.
Also, the N900 U-Boot port is somewhat special compared to other OMAP3 boards, because on the N900 U-Boot is loaded by the NOLO bootloader. On other OMAP3 boards U-Boot is loaded either by SPL or by X-Load. NOLO does basically the whole HW initialization (with just a few exceptions, like eMMC...), so U-Boot, once started, skips a lot of initialization code and therefore behaves differently than on other OMAP3 boards. Because of this, U-Boot can chainload U-Boot itself without any issues (IIRC this is not possible on most boards).
During the weekend I will try to test all the patches on real N900 HW and report whether they work. Note that if U-Boot hangs before entering the monitor code, it will be very hard or impossible to debug. It takes some time for U-Boot to start the musb subsystem and export the usbtty console for debugging, so early hangs cannot be debugged via usbtty.
participants (9)
- Adam Ford
- Bin Meng
- Heinrich Schuchardt
- Marek Behun
- Marek Behún
- Pali Rohár
- Simon Glass
- Stefan Roese
- Tom Rini