[U-Boot] [PATCH] Reduce build times

U-Boot Makefiles contain a number of tests for compiler features etc. which so far are executed again and again. On some architectures (especially ARM) this results in a large number of calls to gcc.
This patch makes sure to run such tests only once, thus largely reducing the number of "execve" system calls.
Example: number of "execve" system calls for building the "P2020DS" (Power Architecture) and "qong" (ARM) boards, measured as:
	-> strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
	-> grep execve /tmp/foo | wc -l
	         Before:  After:  Reduction:
	==================================
	P2020DS  20555    15205   -26%
	qong     31692    14490   -54%
As a result, build times are significantly reduced, typically by 30...50%.
Signed-off-by: Wolfgang Denk <wd@denx.de>
Cc: Andy Fleming <afleming@gmail.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Albert Aribaud <albert.aribaud@free.fr>
cc: Graeme Russ <graeme.russ@gmail.com>
cc: Mike Frysinger <vapier@gentoo.org>
---
More detailed build results:
1) Number of "execve" system calls for two exemplary boards: "P2020DS" for Power Architecture, and "qong" for ARM. Measured as:
	-> strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
	-> grep execve /tmp/foo | wc -l
	         Before:  After:  Reduction:
	=======================================
	P2020DS  20555    15205   -26%
	qong     31692    14490   -54%
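The counting step can be tried without a full build. The following is a minimal shell sketch, assuming a log in the format that strace writes with "-f -o"; the sample log lines are invented for illustration and do not come from a real U-Boot build. Besides the total, it breaks the count down by program, which helps to see how many of the execve calls are compiler probes.

```shell
#!/bin/sh
# Count execve calls in an strace log, in total and per program.
# The sample log is made up; a real one would come from:
#   strace -f -e trace=execve -o /tmp/foo ./MAKEALL <board>
log=$(mktemp)
cat > "$log" <<'EOF'
1234 execve("/usr/bin/make", ["make", "-s"], 0x7ffc0 /* 40 vars */) = 0
1235 execve("/bin/sh", ["/bin/sh", "-c", "gcc -S -o /dev/null -xc /dev/null"], 0x55d0 /* 40 vars */) = 0
1236 execve("/usr/bin/gcc", ["gcc", "-S", "-o", "/dev/null", "-xc", "/dev/null"], 0x55d1 /* 40 vars */) = 0
1237 execve("/bin/sh", ["/bin/sh", "-c", "gcc -print-libgcc-file-name"], 0x55d2 /* 40 vars */) = 0
1238 execve("/usr/bin/gcc", ["gcc", "-print-libgcc-file-name"], 0x55d3 /* 40 vars */) = 0
EOF

# Total number of matching lines, same as "grep execve FILE | wc -l".
total=$(grep -c execve "$log")

# Calls that executed the compiler binary itself.
gcc_calls=$(grep -c 'execve("/usr/bin/gcc"' "$log")

# Per-program breakdown: take the first quoted path on each line.
breakdown=$(sed -n 's/^[0-9]* execve("\([^"]*\)".*/\1/p' "$log" \
    | sort | uniq -c | sort -rn)

echo "total execve calls: $total (gcc: $gcc_calls)"
echo "$breakdown"
rm -f "$log"
```

The path "/usr/bin/gcc" is an assumption of the sample; for a cross build the pattern would have to match the cross compiler's path instead.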
2) Build time for single boards: Measured as "time ./MAKEALL <board>" (average over 3 runs after a dummy run to populate the file system; using ELDK 4.2 tool chain; measured for "P2020DS" and "qong" on a Core2 Quad CPU at 2.83GHz, for "m28evk" on an i.MX28 at 454MHz over NFS)
	               Before:   After:    Reduction:
	=======================================
	P2020DS  real  29.429s   19.494s   -34%
	         user  64.035s   48.621s   -24%
	         sys   16.188s    9.203s   -43%
	qong:    real  34.274s   16.263s   -53%
	         user  65.014s   39.516s   -39%
	         sys   27.606s    9.551s   -65%
	m28evk:  real  45.752m   27.606m   -40%
	         user  24.868m   17.511m   -30%
	         sys   17.017m    6.642m   -61%
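The "average over 3 runs after a dummy run" procedure could be scripted roughly as follows. This is only a sketch under stated assumptions: BUILD is a placeholder for the real command (for example "./MAKEALL qong"), the helper names mean and elapsed are made up here, and the timing uses whole seconds from "date +%s", so it is coarser than the "time" output quoted above.

```shell
#!/bin/sh
# mean: print the arithmetic mean of its numeric arguments, 3 decimals.
mean() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { if (NR) printf "%.3f\n", s / NR }'
}

# elapsed: run a command and print the wall-clock seconds it took
# (whole-second resolution only).
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# BUILD is a placeholder; it defaults to "true" so the sketch runs anywhere.
BUILD=${BUILD:-true}

$BUILD >/dev/null 2>&1        # dummy run, not measured
t1=$(elapsed $BUILD)
t2=$(elapsed $BUILD)
t3=$(elapsed $BUILD)
echo "average wall-clock time: $(mean "$t1" "$t2" "$t3") s"
```

Running it as "BUILD='./MAKEALL qong' sh script.sh" would time three real builds; the user and sys columns of the table above still need "time" itself.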
3) Build time for MAKEALL: Measured as "time MAKEALL_LOGDIR=/work/wd/tmp-LOG BUILD_DIR=/work/wd/tmp ./MAKEALL <arch>" (using ELDK 4.2 tool chain; measured for "ppc" and "arm" on a Core i7 at 3.07GHz)
	           Before:    After:     Reduction:
	=======================================
	ppc  real   82.063m    66.595m   -19%
	     user  261.710m   231.429m   -12%
	     sys    61.739m    49.193m   -20%
	arm  real   55.269m    20.763m   -62%
	     user   84.302m    49.154m   -42%
	     sys    38.933m    15.028m   -61%
Note: There is further potential for build time reductions by performing similar optimizations for a number of $(shell ...) constructs in the Makefiles, but I have no good way to test these at the moment, so this is left as an exercise for the respective architecture maintainers (mostly blackfin and coldfire, AFAICT) -- wd
 Makefile                                 |    2 +-
 arch/arm/config.mk                       |   19 ++++++++++---------
 arch/arm/cpu/arm1136/config.mk           |    3 ++-
 arch/arm/cpu/arm1176/config.mk           |    4 +++-
 arch/arm/cpu/arm1176/s3c64xx/config.mk   |    4 +++-
 arch/arm/cpu/arm720t/config.mk           |    4 +++-
 arch/arm/cpu/arm920t/config.mk           |    3 ++-
 arch/arm/cpu/arm925t/config.mk           |    3 ++-
 arch/arm/cpu/arm926ejs/at91/config.mk    |    3 ++-
 arch/arm/cpu/arm926ejs/config.mk         |    3 ++-
 arch/arm/cpu/arm946es/config.mk          |    3 ++-
 arch/arm/cpu/arm_intcm/config.mk         |    3 ++-
 arch/arm/cpu/armv7/config.mk             |    4 ++--
 arch/arm/cpu/armv7/omap-common/config.mk |    5 +++--
 arch/arm/cpu/ixp/config.mk               |    3 ++-
 arch/arm/cpu/lh7a40x/config.mk           |    3 ++-
 arch/arm/cpu/pxa/config.mk               |    3 ++-
 arch/arm/cpu/s3c44b0/config.mk           |    3 ++-
 arch/arm/cpu/sa1100/config.mk            |    3 ++-
 arch/powerpc/cpu/mpc824x/Makefile        |    3 +--
 arch/powerpc/cpu/mpc85xx/config.mk       |    5 +++--
 arch/x86/config.mk                       |   10 ++++++----
 board/siemens/SCM/Makefile               |    3 +--
 config.mk                                |    8 +++++---
 examples/standalone/Makefile             |    3 ++-
 25 files changed, 67 insertions(+), 43 deletions(-)
diff --git a/Makefile b/Makefile
index 9ef33f9..82de62b 100644
--- a/Makefile
+++ b/Makefile
@@ -320,7 +320,7 @@ else
 PLATFORM_LIBGCC = -L $(USE_PRIVATE_LIBGCC) -lgcc
 endif
 else
-PLATFORM_LIBGCC = -L $(shell dirname `$(CC) $(CFLAGS) -print-libgcc-file-name`) -lgcc
+PLATFORM_LIBGCC := -L $(shell dirname `$(CC) $(CFLAGS) -print-libgcc-file-name`) -lgcc
 endif
 PLATFORM_LIBS += $(PLATFORM_LIBGCC)
 export PLATFORM_LIBS
diff --git a/arch/arm/config.mk b/arch/arm/config.mk
index 9b4e581..45f9dca 100644
--- a/arch/arm/config.mk
+++ b/arch/arm/config.mk
@@ -34,7 +34,7 @@ endif
 PLATFORM_CPPFLAGS += -DCONFIG_ARM -D__ARM__
 
 # Explicitly specifiy 32-bit ARM ISA since toolchain default can be -mthumb:
-PLATFORM_CPPFLAGS += $(call cc-option,-marm,)
+PF_CPPFLAGS_ARM := $(call cc-option,-marm,)
 
 # Try if EABI is supported, else fall back to old API,
 # i. e. for example:
@@ -44,15 +44,16 @@ PLATFORM_CPPFLAGS += $(call cc-option,-marm,)
 # -mabi=apcs-gnu -mno-thumb-interwork
 # - with ELDK 3.1 (gcc 3.x), use:
 # -mapcs-32 -mno-thumb-interwork
-PLATFORM_CPPFLAGS += $(call cc-option,\
-				-mabi=aapcs-linux -mno-thumb-interwork,\
+PF_CPPFLAGS_ABI := $(call cc-option,\
+			-mabi=aapcs-linux -mno-thumb-interwork,\
+			$(call cc-option,\
+				-mapcs-32,\
 				$(call cc-option,\
-					-mapcs-32,\
-					$(call cc-option,\
-						-mabi=apcs-gnu,\
-					)\
-				) $(call cc-option,-mno-thumb-interwork,)\
-			)
+					-mabi=apcs-gnu,\
+				)\
+			) $(call cc-option,-mno-thumb-interwork,)\
+		)
+PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_ARM) $(PF_CPPFLAGS_ABI)
 
 # For EABI, make sure to provide raise()
 ifneq (,$(findstring -mabi=aapcs-linux,$(PLATFORM_CPPFLAGS)))
diff --git a/arch/arm/cpu/arm1136/config.mk b/arch/arm/cpu/arm1136/config.mk
index 3e68535..efee0d1 100644
--- a/arch/arm/cpu/arm1136/config.mk
+++ b/arch/arm/cpu/arm1136/config.mk
@@ -29,4 +29,5 @@ PLATFORM_CPPFLAGS += -march=armv5
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm1176/config.mk b/arch/arm/cpu/arm1176/config.mk
index 14346cf..222d352 100644
--- a/arch/arm/cpu/arm1176/config.mk
+++ b/arch/arm/cpu/arm1176/config.mk
@@ -29,4 +29,6 @@ PLATFORM_CPPFLAGS += -march=armv5t
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,\
+			$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm1176/s3c64xx/config.mk b/arch/arm/cpu/arm1176/s3c64xx/config.mk
index 14346cf..222d352 100644
--- a/arch/arm/cpu/arm1176/s3c64xx/config.mk
+++ b/arch/arm/cpu/arm1176/s3c64xx/config.mk
@@ -29,4 +29,6 @@ PLATFORM_CPPFLAGS += -march=armv5t
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,\
+			$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm720t/config.mk b/arch/arm/cpu/arm720t/config.mk
index 3844c62..210c6dc 100644
--- a/arch/arm/cpu/arm720t/config.mk
+++ b/arch/arm/cpu/arm720t/config.mk
@@ -30,4 +30,6 @@ PLATFORM_CPPFLAGS += -march=armv4 -mtune=arm7tdmi
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,\
+			$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm920t/config.mk b/arch/arm/cpu/arm920t/config.mk
index 8f6c1a3..f03030a 100644
--- a/arch/arm/cpu/arm920t/config.mk
+++ b/arch/arm/cpu/arm920t/config.mk
@@ -29,4 +29,5 @@ PLATFORM_CPPFLAGS += -march=armv4
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm925t/config.mk b/arch/arm/cpu/arm925t/config.mk
index 8f6c1a3..f03030a 100644
--- a/arch/arm/cpu/arm925t/config.mk
+++ b/arch/arm/cpu/arm925t/config.mk
@@ -29,4 +29,5 @@ PLATFORM_CPPFLAGS += -march=armv4
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm926ejs/at91/config.mk b/arch/arm/cpu/arm926ejs/at91/config.mk
index 19296fd..370630d 100644
--- a/arch/arm/cpu/arm926ejs/at91/config.mk
+++ b/arch/arm/cpu/arm926ejs/at91/config.mk
@@ -1 +1,2 @@
-PLATFORM_CPPFLAGS += $(call cc-option,-mtune=arm926ejs,)
+PF_CPPFLAGS_TUNE := $(call cc-option,-mtune=arm926ejs,)
+PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_TUNE)
diff --git a/arch/arm/cpu/arm926ejs/config.mk b/arch/arm/cpu/arm926ejs/config.mk
index f8ef90f..ffb2e6c 100644
--- a/arch/arm/cpu/arm926ejs/config.mk
+++ b/arch/arm/cpu/arm926ejs/config.mk
@@ -29,4 +29,5 @@ PLATFORM_CPPFLAGS += -march=armv5te
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm946es/config.mk b/arch/arm/cpu/arm946es/config.mk
index e783f69..c2354ba 100644
--- a/arch/arm/cpu/arm946es/config.mk
+++ b/arch/arm/cpu/arm946es/config.mk
@@ -29,4 +29,5 @@ PLATFORM_CPPFLAGS += -march=armv4
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/arm_intcm/config.mk b/arch/arm/cpu/arm_intcm/config.mk
index e783f69..c2354ba 100644
--- a/arch/arm/cpu/arm_intcm/config.mk
+++ b/arch/arm/cpu/arm_intcm/config.mk
@@ -29,4 +29,5 @@ PLATFORM_CPPFLAGS += -march=armv4
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/armv7/config.mk b/arch/arm/cpu/armv7/config.mk
index 49ac9c7..83ddf10 100644
--- a/arch/arm/cpu/armv7/config.mk
+++ b/arch/arm/cpu/armv7/config.mk
@@ -29,5 +29,5 @@ PLATFORM_CPPFLAGS += -march=armv5
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,\
-		    $(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/armv7/omap-common/config.mk b/arch/arm/cpu/armv7/omap-common/config.mk
index 49ac9c7..c400dcc 100644
--- a/arch/arm/cpu/armv7/omap-common/config.mk
+++ b/arch/arm/cpu/armv7/omap-common/config.mk
@@ -29,5 +29,6 @@ PLATFORM_CPPFLAGS += -march=armv5
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,\
-		    $(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,\
+			$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/ixp/config.mk b/arch/arm/cpu/ixp/config.mk
index 5868cba..9149665 100644
--- a/arch/arm/cpu/ixp/config.mk
+++ b/arch/arm/cpu/ixp/config.mk
@@ -37,4 +37,5 @@ LDFLAGS_u-boot += --gc-sections
 # Supply options according to compiler version
 #
 # =========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/lh7a40x/config.mk b/arch/arm/cpu/lh7a40x/config.mk
index 47b2b7b..1c4aa97 100644
--- a/arch/arm/cpu/lh7a40x/config.mk
+++ b/arch/arm/cpu/lh7a40x/config.mk
@@ -29,4 +29,5 @@ PLATFORM_CPPFLAGS += -march=armv4
 # Supply options according to compiler version
 #
 # ========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/pxa/config.mk b/arch/arm/cpu/pxa/config.mk
index a05d69c..0bbe295 100644
--- a/arch/arm/cpu/pxa/config.mk
+++ b/arch/arm/cpu/pxa/config.mk
@@ -30,4 +30,5 @@ PLATFORM_CPPFLAGS += -march=armv5te -mtune=xscale
 # Supply options according to compiler version
 #
 # ========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/s3c44b0/config.mk b/arch/arm/cpu/s3c44b0/config.mk
index 7454d72..f6f6398 100644
--- a/arch/arm/cpu/s3c44b0/config.mk
+++ b/arch/arm/cpu/s3c44b0/config.mk
@@ -30,4 +30,5 @@ PLATFORM_CPPFLAGS += -march=armv4 -mtune=arm7tdmi -msoft-float
 # Supply options according to compiler version
 #
 # ========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/arm/cpu/sa1100/config.mk b/arch/arm/cpu/sa1100/config.mk
index 6f21f41..06af160 100644
--- a/arch/arm/cpu/sa1100/config.mk
+++ b/arch/arm/cpu/sa1100/config.mk
@@ -30,4 +30,5 @@ PLATFORM_CPPFLAGS += -march=armv4 -mtune=strongarm1100
 # Supply options according to compiler version
 #
 # ========================================================================
-PLATFORM_RELFLAGS +=$(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PF_RELFLAGS_SLB_AT := $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
+PLATFORM_RELFLAGS += $(PF_RELFLAGS_SLB_AT)
diff --git a/arch/powerpc/cpu/mpc824x/Makefile b/arch/powerpc/cpu/mpc824x/Makefile
index 2bfcd85..ebf4cb2 100644
--- a/arch/powerpc/cpu/mpc824x/Makefile
+++ b/arch/powerpc/cpu/mpc824x/Makefile
@@ -23,8 +23,7 @@
 
 include $(TOPDIR)/config.mk
 ifneq ($(OBJTREE),$(SRCTREE))
-$(shell mkdir -p $(obj)drivers/epic)
-$(shell mkdir -p $(obj)drivers/i2c)
+$(shell mkdir -p $(obj)drivers/epic $(obj)drivers/i2c)
 endif
 
 LIB = $(obj)lib$(CPU).o
diff --git a/arch/powerpc/cpu/mpc85xx/config.mk b/arch/powerpc/cpu/mpc85xx/config.mk
index 68ac57d..f36d823 100644
--- a/arch/powerpc/cpu/mpc85xx/config.mk
+++ b/arch/powerpc/cpu/mpc85xx/config.mk
@@ -28,5 +28,6 @@ PLATFORM_CPPFLAGS += -ffixed-r2 -Wa,-me500 -msoft-float -mno-string
 # -mspe=yes is needed to have -mno-spe accepted by a buggy GCC;
 # see "[PATCH,rs6000] make -mno-spe work as expected" on
 # http://gcc.gnu.org/ml/gcc-patches/2008-04/msg00311.html
-PLATFORM_CPPFLAGS +=$(call cc-option,-mspe=yes)
-PLATFORM_CPPFLAGS +=$(call cc-option,-mno-spe)
+PF_CPPFLAGS_SPE := $(call cc-option,-mspe=yes) \
+		   $(call cc-option,-mno-spe)
+PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_SPE)
diff --git a/arch/x86/config.mk b/arch/x86/config.mk
index ee23c9f..fe9083f 100644
--- a/arch/x86/config.mk
+++ b/arch/x86/config.mk
@@ -27,10 +27,12 @@ PLATFORM_CPPFLAGS += -fno-strict-aliasing
 PLATFORM_CPPFLAGS += -Wstrict-prototypes
 PLATFORM_CPPFLAGS += -mregparm=3
 PLATFORM_CPPFLAGS += -fomit-frame-pointer
-PLATFORM_CPPFLAGS += $(call cc-option, -ffreestanding)
-PLATFORM_CPPFLAGS += $(call cc-option, -fno-toplevel-reorder, $(call cc-option, -fno-unit-at-a-time))
-PLATFORM_CPPFLAGS += $(call cc-option, -fno-stack-protector)
-PLATFORM_CPPFLAGS += $(call cc-option, -mpreferred-stack-boundary=2)
+PF_CPPFLAGS_X86 := $(call cc-option, -ffreestanding) \
+		   $(call cc-option, -fno-toplevel-reorder, \
+		     $(call cc-option, -fno-unit-at-a-time)) \
+		   $(call cc-option, -fno-stack-protector) \
+		   $(call cc-option, -mpreferred-stack-boundary=2)
+PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_X86)
 PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
 PLATFORM_CPPFLAGS += -DREALMODE_BASE=0x7c0
 
diff --git a/board/siemens/SCM/Makefile b/board/siemens/SCM/Makefile
index 07cc5a6..07db9d4 100644
--- a/board/siemens/SCM/Makefile
+++ b/board/siemens/SCM/Makefile
@@ -24,8 +24,7 @@
 include $(TOPDIR)/config.mk
 
 ifneq ($(OBJTREE),$(SRCTREE))
-$(shell mkdir -p $(obj)../common)
-$(shell mkdir -p $(obj)../../tqc/tqm8xx)
+$(shell mkdir -p $(obj)../common $(obj)../../tqc/tqm8xx)
 endif
 
 LIB = $(obj)lib$(BOARD).o
diff --git a/config.mk b/config.mk
index 11b67e5..918cffe 100644
--- a/config.mk
+++ b/config.mk
@@ -209,11 +209,13 @@ else
 CFLAGS := $(CPPFLAGS) -Wall -Wstrict-prototypes
 endif
 
-CFLAGS += $(call cc-option,-fno-stack-protector)
+CFLAGS_SSP := $(call cc-option,-fno-stack-protector)
+CFLAGS += $(CFLAGS_SSP)
 # Some toolchains enable security related warning flags by default,
 # but they don't make much sense in the u-boot world, so disable them.
-CFLAGS += $(call cc-option,-Wno-format-nonliteral)
-CFLAGS += $(call cc-option,-Wno-format-security)
+CFLAGS_WARN := $(call cc-option,-Wno-format-nonliteral) \
+	       $(call cc-option,-Wno-format-security)
+CFLAGS += $(CFLAGS_WARN)
 
 # $(CPPFLAGS) sets -g, which causes gcc to pass a suitable -g<format>
 # option to the assembler.
diff --git a/examples/standalone/Makefile b/examples/standalone/Makefile
index b1e33fb..e23865b 100644
--- a/examples/standalone/Makefile
+++ b/examples/standalone/Makefile
@@ -85,7 +85,8 @@ endif
 # We don't want gcc reordering functions if possible. This ensures that an
 # application's entry point will be the first function in the application's
 # source file.
-CFLAGS += $(call cc-option,-fno-toplevel-reorder)
+CFLAGS_NTR := $(call cc-option,-fno-toplevel-reorder)
+CFLAGS += $(CFLAGS_NTR)
 
 all: $(obj).depend $(OBJS) $(LIB) $(SREC) $(BIN) $(ELF)
 
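Why the change from "+=" to ":=" helps: GNU make expands a recursively defined variable (one set with "=", or appended to with "+=" when it was not defined with ":=") every time it is referenced, so an embedded $(call cc-option,...) spawns a shell and a compiler on each expansion. With ":=" the right-hand side is expanded once, at the point of assignment. The same effect can be mimicked in plain shell. The sketch below is an analogy only, not U-Boot code: "probe" is a hypothetical stand-in for cc-option, and the loop stands for the repeated expansions of the flags variable.

```shell
#!/bin/sh
# "probe" counts its own invocations in a file, because $(...) runs it in
# a subshell and a plain shell variable would not survive.
counter=$(mktemp)
: > "$counter"

probe() {
    echo x >> "$counter"          # record one invocation
    echo "-fno-stack-protector"   # pretend the compiler accepted the option
}

calls() { wc -l < "$counter" | tr -d ' '; }

# Deferred style (like "VAR += $(call cc-option,...)" on a recursive
# variable): the probe runs again on every use.
for i in 1 2 3 4 5; do
    flags="-Wall $(probe)"
done
deferred_calls=$(calls)

# Immediate style (like "VAR := $(call cc-option,...)"): probe once,
# then reuse the stored result.
: > "$counter"
cached=$(probe)
for i in 1 2 3 4 5; do
    flags="-Wall $cached"
done
cached_calls=$(calls)

echo "deferred: $deferred_calls probe calls, cached: $cached_calls"
rm -f "$counter"
```

Note that the caching in the patch holds per make process: every sub-make that includes config.mk still evaluates each ":=" assignment once.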

Hi Wolfgang,
On 02/11/11 17:54, Wolfgang Denk wrote:
[snip]
Tested on x86, does what is written on the box ;)
Tested-by: Graeme Russ <graeme.russ@gmail.com>
Regards,
Graeme

On 02.11.2011 07:54, Wolfgang Denk wrote:
[snip]
Nice. Some additional numbers:
zmx25: make
-----------
real	1m47.546s	0m57.213s	-47%
user	1m39.698s	0m54.831s
sys	0m24.798s	0m9.509s
zmx25: make -j2
---------------
real	0m56.791s	0m32.187s	-43%
user	1m38.478s	0m55.571s
sys	0m24.522s	0m9.513s
Tested-by: Matthias Weisser <weisserm@arcor.de>
Matthias

> -----Original Message-----
> From: u-boot-bounces@lists.denx.de [mailto:u-boot-bounces@lists.denx.de] On Behalf Of Wolfgang Denk
> Sent: Wednesday, November 02, 2011 12:24 PM
> To: u-boot@lists.denx.de
> Cc: Graeme Russ; Kumar Gala; Albert Aribaud; Andy Fleming
> Subject: [U-Boot] [PATCH] Reduce build times
[snip]
Results for OMAP3EVM. (Tried 5 times just to be sure as I see >50% reduction.)
	Before	After
	------	------
real	109.03	49.78
user	71.43	29.06
sys	26.83	7.66
Compiled u-boot works fine on the board as well.
Tested-by: Sanjeev Premi <premi@ti.com>
[snip]...[snip]

On Wed, Nov 2, 2011 at 7:49 AM, Premi, Sanjeev <premi@ti.com> wrote:
[snip]
> Results for OMAP3EVM. (Tried 5 times just to be sure as I see >50% reduction.)
>	Before	After
>	------	------
> real	109.03	49.78
> user	71.43	29.06
> sys	26.83	7.66
Over here omap3_evm wall-clock time on make -j12 goes from 27sec to 10sec.

On Wed, Nov 2, 2011 at 8:37 AM, Tom Rini <tom.rini@gmail.com> wrote:
> On Wed, Nov 2, 2011 at 7:49 AM, Premi, Sanjeev <premi@ti.com> wrote:
[snip]
>>> Signed-off-by: Wolfgang Denk <wd@denx.de>
Tested-by: Simon Glass <sjg@chromium.org>
[snip]
> Over here omap3_evm wall-clock time on make -j12 goes from 27sec to 10sec.
For Tegra2 Seaboard (armv7) and -j15 or so, before and after times:
full build (clobber, config)	17.177s -> 7.060s
incremental build		7.432s -> 2.267s
Thank you!
Regards, Simon

Hi Wolfgang,
On Wed, Nov 2, 2011 at 7:54 AM, Wolfgang Denk <wd@denx.de> wrote:
> U-Boot Makefiles contain a number of tests for compiler features etc. which so far are executed again and again. On some architectures (especially ARM) this results in a large number of calls to gcc.
> This patch makes sure to run such tests only once, thus largely reducing the number of "execve" system calls.
maybe you want to try this experimental patch. http://patchwork.ozlabs.org/patch/123313/
It significantly reduces the count of gcc calls by caching the results. This also improves compilation times.
Best regards, Daniel

Dear Daniel Schwierzeck,
In message <CACUy__UjmnRYKMWiMB9pqr0_dS6cgiyo-MsoVY4eSH2zT6ZKHA@mail.gmail.com> you wrote:
> On Wed, Nov 2, 2011 at 7:54 AM, Wolfgang Denk <wd@denx.de> wrote:
>> [snip]
> maybe you want to try this experimental patch. http://patchwork.ozlabs.org/patch/123313/
> It significantly reduces the count of gcc calls by caching the results. This also improves compilation times.
Do you suggest this in addition to or instead of the patch I posted?
Can you provide some measurements of build times and/or execve system calls?
Best regards,
Wolfgang Denk

Hi Wolfgang,
On 02.11.2011 23:48, Wolfgang Denk wrote:
> Dear Daniel Schwierzeck,
> In message <CACUy__UjmnRYKMWiMB9pqr0_dS6cgiyo-MsoVY4eSH2zT6ZKHA@mail.gmail.com> you wrote:
>> On Wed, Nov 2, 2011 at 7:54 AM, Wolfgang Denk <wd@denx.de> wrote:
>>> [snip]
>> maybe you want to try this experimental patch. http://patchwork.ozlabs.org/patch/123313/
>> It significantly reduces the count of gcc calls by caching the results. This also improves compilation times.
> Do you suggest this in addition to or instead of the patch I posted?
as an additional but separate patch to further reduce the execution time of MAKEALL.
> Can you provide some measurements of build times and/or execve system calls?
I have attached the results of some MAKEALL runs in the patch mail (I cc-ed you).
Best regards, Daniel

Hi Wolfgang,
On Wed, Nov 2, 2011 at 11:48 PM, Wolfgang Denk <wd@denx.de> wrote:
> [snip]
> Do you suggest this in addition to or instead of the patch I posted?
> Can you provide some measurements of build times and/or execve system calls?
I ran some additional tests with interesting results.
Board: ARM, Tegra2, seaboard
Toolchain: Sourcery G++ Lite 2011.03-41 for ARM GNU/Linux
Workstation: Core 2 Duo E6600 @ 2.4 GHz, 4 GB, x86_64
I patched the cc-option macro to count all calls like this:
 cc-option = $(shell if $(CC) $(CFLAGS) $(1) -S -o /dev/null -xc /dev/null \
-		> /dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi ;)
+		> /dev/null 2>&1; then echo "$(1)"; echo "$1" >> $(OBJTREE)/cc-option; else echo "$(2)"; fi ;)
I ran the steps below for the following source trees:
- unmodified HEAD
- only your patch
- only my patch
- both patches combined
Steps:
Complete build:
-> git clean -xdf
-> CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- make seaboard_config
-> time CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- CACHE_CC_OPTIONS=y make -s
-> cat cc-option | wc -l
Incremental rebuild:
-> time CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- CACHE_CC_OPTIONS=y make -s
Complete build with strace:
-> git clean -xdf
-> CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- make seaboard_config
-> CROSS_COMPILE=/opt/codesourcery/arm-2011.03/bin/arm-none-linux-gnueabi- CACHE_CC_OPTIONS=y strace -f -e trace=execve -o strace.out make -s
-> grep execve strace.out | wc -l
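The log written by the instrumented macro holds one line per logged probe, so it can be tallied by option as well as counted. A minimal shell sketch follows, assuming a log of that shape; the sample contents are invented for illustration. As written, the instrumented macro appends to the log only in the "then" branch, so probes that the compiler rejects are not recorded.

```shell
#!/bin/sh
# Tally a cc-option log: total probes, distinct options, most frequent one.
# The sample log is made up; a real one would be $(OBJTREE)/cc-option.
log=$(mktemp)
cat > "$log" <<'EOF'
-marm
-mabi=aapcs-linux -mno-thumb-interwork
-fno-stack-protector
-marm
-fno-stack-protector
-marm
EOF

# Same number as "cat cc-option | wc -l".
total=$(wc -l < "$log" | tr -d ' ')

# How many different option strings were probed.
distinct=$(sort -u "$log" | wc -l | tr -d ' ')

# First word of the most frequently probed option string.
top=$(sort "$log" | uniq -c | sort -rn | head -n 1 | awk '{ print $2 }')

echo "total probes: $total, distinct options: $distinct, most frequent: $top"
rm -f "$log"
```

A large gap between the total and the number of distinct options indicates how many probes are repeats that caching could avoid.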
Results:
unmodified HEAD:
  complete build:       real 1m11.540s  user 2m7.170s   sys 0m19.840s
  cc-option calls:      3024
  incremental rebuild:  real 0m20.176s  user 0m39.260s  sys 0m6.480s
  execve calls:         16502
only your patch:
  complete build:       real 0m32.371s  user 0m47.440s  sys 0m7.900s
  cc-option calls:      864
  incremental rebuild:  real 0m9.606s   user 0m16.890s  sys 0m2.940s
  execve calls:         5906
only my patch:
  complete build:       real 0m28.187s  user 0m56.030s  sys 0m7.820s
  cc-option calls:      20
  incremental rebuild:  real 0m5.013s   user 0m13.300s  sys 0m2.200s
  execve calls:         7415
both patches combined:
  complete build:       real 0m19.777s  user 0m28.010s  sys 0m4.100s
  cc-option calls:      8
  incremental rebuild:  real 0m2.902s   user 0m6.400s   sys 0m1.070s
  execve calls:         3329
Conclusion:
- complete build time reduced from 1m11s to 20s
- incremental rebuild time reduced from 20s to 3s
- cc-option calls reduced from 3024 to 8
- execve calls reduced from 16502 to 3329
Best regards, Daniel
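The figures in the conclusion above can be turned into percentage reductions with a few lines of shell. This is a small sketch; the helper name "reduction" is made up here, and the times are the "real" values from the results converted to seconds.

```shell
#!/bin/sh
# reduction BEFORE AFTER: print the percentage saved, one decimal place.
reduction() {
    awk -v b="$1" -v a="$2" 'BEGIN { printf "%.1f%%\n", (b - a) * 100 / b }'
}

echo "cc-option calls:     $(reduction 3024 8)"
echo "execve calls:        $(reduction 16502 3329)"
# 1m11.540s -> 0m19.777s and 0m20.176s -> 0m2.902s, in seconds:
echo "complete build:      $(reduction 71.540 19.777)"
echo "incremental rebuild: $(reduction 20.176 2.902)"
```

The same helper reproduces the percentages in the original patch description, for example "reduction 20555 15205" for the P2020DS execve count.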

Dear Daniel Schwierzeck,
In message <CACUy__W_Z85aLiNUQXMxE3trrHm4auEqOBXBqs6DfSRFEPh9CA@mail.gmail.com> you wrote:
> Conclusion:
> - complete build time reduced from 1m11s to 20s
> - incremental rebuild time reduced from 20s to 3s
> - cc-option calls reduced from 3024 to 8
> - execve calls reduced from 16502 to 3329
That's really cool.
Can we please add another two or three of such optimizations? :-)
Best regards,
Wolfgang Denk

Hi Daniel, Wolfgang,
On Thursday 03 November 2011 08:55 PM, Daniel Schwierzeck wrote:
> Hi Wolfgang,
[snip ..]
> Conclusion:
> - complete build time reduced from 1m11s to 20s
> - incremental rebuild time reduced from 20s to 3s
> - cc-option calls reduced from 3024 to 8
> - execve calls reduced from 16502 to 3329
Results for omap4 sdp build:
Build machine: Intel Core i5 2.5 GHz, 3M cache, 4GB DDR3
Un-modified HEAD:            real 0m21.463s  user 0m31.278s  sys 0m9.281s
With only Wolfgang's patch:  real 0m11.226s  user 0m23.937s  sys 0m4.200s
With only Daniel's patch:    real 0m10.842s  user 0m21.725s  sys 0m2.532s
With both patches:           real 0m8.306s   user 0m21.201s  sys 0m2.408s
Looks like both patches are helping. Thanks!!
br, Aneesh

Hi Wolfgang,
2011/11/2 Wolfgang Denk <wd@denx.de>:
> U-Boot Makefiles contain a number of tests for compiler features etc. which so far are executed again and again. On some architectures (especially ARM) this results in a large number of calls to gcc.
board		before	after	reduction
adp-ag101	7259	7059	2.7%
Tested-by: Macpaul Lin <macpaul@gmail.com>
Thanks

On Wednesday 02 November 2011 02:54:02 Wolfgang Denk wrote:
> U-Boot Makefiles contain a number of tests for compiler features etc. which so far are executed again and again. On some architectures (especially ARM) this results in a large number of calls to gcc.
seems to shave ~10% off for Blackfin boards
Acked-by: Mike Frysinger <vapier@gentoo.org>
> Note: There is further potential for build time reductions by performing similar optimizations for a number of $(shell ...) constructs in the Makefiles, but I have no good ways to test these at the moment so this is left as exercise for the respective architecture maintainers (mostly blackfin and coldfire, AFAICT) -- wd
Blackfin does two $(shell), one of which i already cache. the other, i should be able to send a patch for.
-mike