[U-Boot] [PATCH v2 0/5] Patches to reduce TPL code size

With the rockchip 'rock' board some build and code size problems have come to light with TPL. This series provides a few ideas to improve things.

Changes in v2:
- Adjust the option to be SPL-only
- Change the option to default to off (name it CONFIG_SPL_TINY_MEMSET)
- Add a new patch to enable CONFIG_SPL_TINY_MEMSET
- Add new patch to allow driver model to be disabled for TPL
- Add new patch to allow driver-model serial to be disabled for TPL

Simon Glass (5):
  string: Provide a slimmed-down memset()
  rockchip: rock: Enable CONFIG_SPL_TINY_MEMSET
  Makefile: Provide an option to select SPL or TPL
  dm: core: Allow driver model to be disabled for TPL
  dm: serial: Allow driver-model serial to be disabled for TPL

 configs/rock_defconfig  |  1 +
 drivers/Makefile        |  2 +-
 drivers/core/Kconfig    | 14 ++++++++++++++
 drivers/serial/Kconfig  | 20 ++++++++++++++++++++
 drivers/serial/Makefile |  2 +-
 lib/Kconfig             |  8 ++++++++
 lib/string.c            |  6 ++++--
 scripts/Kbuild.include  |  6 ++++++
 scripts/Makefile.spl    |  6 ++++++
 9 files changed, 61 insertions(+), 4 deletions(-)

Most of the time the optimised memset() is what we want. For extreme situations such as TPL it may be too large. For example on the 'rock' board, using a simple loop saves a useful 48 bytes. With gcc 4.9 and the rodata bug, this patch is enough to reduce the TPL image below the limit.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

Changes in v2:
- Adjust the option to be SPL-only
- Change the option to default to off (name it CONFIG_SPL_TINY_MEMSET)

 lib/Kconfig  | 8 ++++++++
 lib/string.c | 6 ++++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/Kconfig b/lib/Kconfig
index 65c01573e1..58b5717dcd 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -52,6 +52,14 @@ config LIB_RAND
 	help
 	  This library provides pseudo-random number generator functions.
 
+config SPL_TINY_MEMSET
+	bool "Use a very small memset() in SPL"
+	help
+	  The faster memset() is the arch-specific one (if available) enabled
+	  by CONFIG_USE_ARCH_MEMSET. If that is not enabled, we can still get
+	  better performance by write a word at a time. Enable this option
+	  to reduce code size slightly at the cost of some speed.
+
 source lib/dhry/Kconfig
 
 source lib/rsa/Kconfig
diff --git a/lib/string.c b/lib/string.c
index 67d5f6a421..c1a28c14ce 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -437,8 +437,10 @@ char *strswab(const char *s)
 void * memset(void * s,int c,size_t count)
 {
 	unsigned long *sl = (unsigned long *) s;
-	unsigned long cl = 0;
 	char *s8;
+
+#if !CONFIG_IS_ENABLED(TINY_MEMSET)
+	unsigned long cl = 0;
 	int i;
 
 	/* do it one word at a time (32 bits or 64 bits) while possible */
@@ -452,7 +454,7 @@ void * memset(void * s,int c,size_t count)
 			count -= sizeof(*sl);
 		}
 	}
-	/* fill 8 bits at a time */
+#endif	/* fill 8 bits at a time */
 	s8 = (char *)sl;
 	while (count--)
 		*s8++ = c;
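
For readers who want to see the trade-off outside the U-Boot tree, here is a minimal standalone sketch of the two fill strategies the patch chooses between. It is not the code from lib/string.c: the names memset_tiny(), memset_word(), fills_match() and check_variants_agree() are made up for illustration, and the real code selects the path with CONFIG_IS_ENABLED(TINY_MEMSET) at build time rather than offering two functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time fill: roughly what remains when the option is enabled. */
static void *memset_tiny(void *s, int c, size_t count)
{
	unsigned char *s8 = s;

	while (count--)
		*s8++ = (unsigned char)c;

	return s;
}

/* Word-at-a-time fill, with a byte loop for the tail or an unaligned start. */
static void *memset_word(void *s, int c, size_t count)
{
	unsigned char *s8 = s;

	if (((uintptr_t)s & (sizeof(unsigned long) - 1)) == 0) {
		unsigned long *sl = s;
		unsigned long cl = 0;
		size_t i;

		/* Replicate the fill byte into every byte of one word. */
		for (i = 0; i < sizeof(cl); i++)
			cl = (cl << 8) | (unsigned long)(c & 0xff);

		while (count >= sizeof(*sl)) {
			*sl++ = cl;
			count -= sizeof(*sl);
		}
		s8 = (unsigned char *)sl;
	}

	while (count--)
		*s8++ = (unsigned char)c;

	return s;
}

/* Hypothetical helper: returns 1 if both variants leave identical bytes. */
static int fills_match(size_t offset, size_t count, int c)
{
	unsigned long a[8] = { 0 };
	unsigned long b[8] = { 0 };

	if (offset > sizeof(a) || count > sizeof(a) - offset)
		return 0;

	memset_word((unsigned char *)a + offset, c, count);
	memset_tiny((unsigned char *)b + offset, c, count);

	return memcmp(a, b, sizeof(a)) == 0;
}

/* Spot checks: aligned, unaligned, short and empty fills. */
static void check_variants_agree(void)
{
	assert(fills_match(0, 8 * sizeof(unsigned long), 0xab));
	assert(fills_match(0, 13, 0x5a));
	assert(fills_match(1, 20, 0xff));
	assert(fills_match(3, 0, 0x01));
}
```

Both variants produce the same bytes; the word path only adds the alignment test, the word-building loop and the word store loop, which is the code the option removes from SPL/TPL.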

On Sunday, 2 April 2017, 09:50:28 CEST, Simon Glass wrote:
> Most of the time the optimised memset() is what we want. For extreme situations such as TPL it may be too large. For example on the 'rock' board, using a simple loop saves a useful 48 bytes. With gcc 4.9 and the rodata bug, this patch is enough to reduce the TPL image below the limit.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
>
> Changes in v2:
> - Adjust the option to be SPL-only
> - Change the option to default to off (name it CONFIG_SPL_TINY_MEMSET)
>
>  lib/Kconfig  | 8 ++++++++
>  lib/string.c | 6 ++++--
>  2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/lib/Kconfig b/lib/Kconfig
> index 65c01573e1..58b5717dcd 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -52,6 +52,14 @@ config LIB_RAND
>  	help
>  	  This library provides pseudo-random number generator functions.
>
> +config SPL_TINY_MEMSET
> +	bool "Use a very small memset() in SPL"
> +	help
> +	  The faster memset() is the arch-specific one (if available) enabled
> +	  by CONFIG_USE_ARCH_MEMSET. If that is not enabled, we can still get
> +	  better performance by write a word at a time. Enable this option
> +	  to reduce code size slightly at the cost of some speed.

Wording sounds off, I guess we could do something like

[...better performance by] writing a word at a time. In very size-constrained environments even this may be too big though. [Enable this option...]

Otherwise
Reviewed-by: Heiko Stuebner <heiko@sntech.de>

> +
>  source lib/dhry/Kconfig
>
>  source lib/rsa/Kconfig
> diff --git a/lib/string.c b/lib/string.c
> index 67d5f6a421..c1a28c14ce 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -437,8 +437,10 @@ char *strswab(const char *s)
>  void * memset(void * s,int c,size_t count)
>  {
>  	unsigned long *sl = (unsigned long *) s;
> -	unsigned long cl = 0;
>  	char *s8;
> +
> +#if !CONFIG_IS_ENABLED(TINY_MEMSET)
> +	unsigned long cl = 0;
>  	int i;
>
>  	/* do it one word at a time (32 bits or 64 bits) while possible */
> @@ -452,7 +454,7 @@ void * memset(void * s,int c,size_t count)
>  			count -= sizeof(*sl);
>  		}
>  	}
> -	/* fill 8 bits at a time */
> +#endif	/* fill 8 bits at a time */
>  	s8 = (char *)sl;
>  	while (count--)
>  		*s8++ = c;

On 4 April 2017 at 03:38, Heiko Stübner <heiko@sntech.de> wrote:
> On Sunday, 2 April 2017, 09:50:28 CEST, Simon Glass wrote:
> > Most of the time the optimised memset() is what we want. For extreme situations such as TPL it may be too large. For example on the 'rock' board, using a simple loop saves a useful 48 bytes. With gcc 4.9 and the rodata bug, this patch is enough to reduce the TPL image below the limit.
> >
> > Signed-off-by: Simon Glass <sjg@chromium.org>
> >
> > Changes in v2:
> > - Adjust the option to be SPL-only
> > - Change the option to default to off (name it CONFIG_SPL_TINY_MEMSET)
> >
> >  lib/Kconfig  | 8 ++++++++
> >  lib/string.c | 6 ++++--
> >  2 files changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/Kconfig b/lib/Kconfig
> > index 65c01573e1..58b5717dcd 100644
> > --- a/lib/Kconfig
> > +++ b/lib/Kconfig
> > @@ -52,6 +52,14 @@ config LIB_RAND
> >  	help
> >  	  This library provides pseudo-random number generator functions.
> >
> > +config SPL_TINY_MEMSET
> > +	bool "Use a very small memset() in SPL"
> > +	help
> > +	  The faster memset() is the arch-specific one (if available) enabled
> > +	  by CONFIG_USE_ARCH_MEMSET. If that is not enabled, we can still get
> > +	  better performance by write a word at a time. Enable this option
> > +	  to reduce code size slightly at the cost of some speed.
>
> Wording sounds off, I guess we could do something like
>
> [...better performance by] writing a word at a time. In very size-constrained environments even this may be too big though. [Enable this option...]
>
> Otherwise
> Reviewed-by: Heiko Stuebner <heiko@sntech.de>

I am going to apply this one now and leave the rest of the series until it has had a bit more review. But this one is needed for me to enable the rock board.

Fixed this and:

Applied to u-boot-rockchip

Enable this option to shrink memset() a little. This is needed for TPL.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

Changes in v2:
- Add a new patch to enable CONFIG_SPL_TINY_MEMSET

 configs/rock_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/configs/rock_defconfig b/configs/rock_defconfig
index 153ebb5a44..0cd82114af 100644
--- a/configs/rock_defconfig
+++ b/configs/rock_defconfig
@@ -60,3 +60,4 @@ CONFIG_DEBUG_UART_NS16550=y
 CONFIG_SYS_NS16550=y
 CONFIG_CMD_DHRYSTONE=y
 CONFIG_ERRNO_STR=y
+CONFIG_SPL_TINY_MEMSET=y

At present we have SPL_ which can be used in Makefiles to select between normal and SPL CONFIGs like this:

    obj-$(CONFIG_$(SPL_)DM) += core/

When TPL is being built, SPL_ has the value 'SPL' which is generally a good idea since they tend to follow each other. But in extreme situations we may want to distinguish between SPL and TPL. For example we may not want to enable CONFIG_DM with TPL.

Add a new SPL_TPL_ variable which is set to either empty (for U-Boot proper), 'SPL' or 'TPL'. This may prove useful with TPL-specific options.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

Changes in v2: None

 scripts/Kbuild.include | 6 ++++++
 scripts/Makefile.spl   | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 1b62aedb00..a3a5c59d0d 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -321,6 +321,12 @@ endif
 
 ifdef CONFIG_SPL_BUILD
 SPL_ := SPL_
+ifeq ($(CONFIG_TPL_BUILD),y)
+SPL_TPL_ := TPL_
+else
+SPL_TPL_ := SPL_
+endif
 else
 SPL_ :=
+SPL_TPL_ :=
 endif
diff --git a/scripts/Makefile.spl b/scripts/Makefile.spl
index 5370648e85..4485ea8812 100644
--- a/scripts/Makefile.spl
+++ b/scripts/Makefile.spl
@@ -37,8 +37,14 @@ endif
 
 ifdef CONFIG_SPL_BUILD
 SPL_ := SPL_
+ifeq ($(CONFIG_TPL_BUILD),y)
+SPL_TPL_ := TPL_
+else
+SPL_TPL_ := SPL_
+endif
 else
 SPL_ :=
+SPL_TPL_ :=
 endif
 
 include $(srctree)/config.mk
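
The selection logic in the two hunks above can be modelled in a few lines of C. This is only an illustrative model of the Makefile conditionals, not anything that exists in the build system: spl_prefix(), spl_tpl_prefix(), expands_to() and check_prefixes() are hypothetical names, and the two booleans stand in for CONFIG_SPL_BUILD and CONFIG_TPL_BUILD (both of which are set during a TPL build, as the nesting of the ifeq inside the ifdef implies).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Existing variable: SPL_ is "SPL_" for both SPL and TPL builds. */
static const char *spl_prefix(bool spl_build)
{
	return spl_build ? "SPL_" : "";
}

/* New variable: SPL_TPL_ tells SPL and TPL builds apart. */
static const char *spl_tpl_prefix(bool spl_build, bool tpl_build)
{
	if (!spl_build)
		return "";	/* U-Boot proper */
	return tpl_build ? "TPL_" : "SPL_";
}

/* Returns 1 if "CONFIG_" + prefix + option equals the expected name. */
static int expands_to(const char *prefix, const char *option,
		      const char *expected)
{
	char name[64];

	snprintf(name, sizeof(name), "CONFIG_%s%s", prefix, option);
	return strcmp(name, expected) == 0;
}

/* Which CONFIG symbol guards core/ in each kind of build. */
static void check_prefixes(void)
{
	assert(expands_to(spl_prefix(false), "DM", "CONFIG_DM"));
	assert(expands_to(spl_prefix(true), "DM", "CONFIG_SPL_DM"));
	assert(expands_to(spl_tpl_prefix(false, false), "DM", "CONFIG_DM"));
	assert(expands_to(spl_tpl_prefix(true, false), "DM", "CONFIG_SPL_DM"));
	assert(expands_to(spl_tpl_prefix(true, true), "DM", "CONFIG_TPL_DM"));
}
```

In this model a Makefile line written with $(SPL_TPL_) is controlled by CONFIG_TPL_DM during a TPL build, whereas the same line written with $(SPL_) would follow CONFIG_SPL_DM.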

On Sun, Apr 02, 2017 at 09:50:30AM -0600, Simon Glass wrote:
> At present we have SPL_ which can be used in Makefiles to select between normal and SPL CONFIGs like this:
>
>     obj-$(CONFIG_$(SPL_)DM) += core/
>
> When TPL is being built, SPL_ has the value 'SPL' which is generally a good idea since they tend to follow each other. But in extreme situations we may want to distinguish between SPL and TPL. For example we may not want to enable CONFIG_DM with TPL.
>
> Add a new SPL_TPL_ variable which is set to either empty (for U-Boot proper), 'SPL' or 'TPL'. This may prove useful with TPL-specific options.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>

Applied to u-boot/master, thanks!

Since TPL often needs to be very very small it may not make sense to enable driver model. Add an option for this.

This change brings the 'rock' board under the TPL limit with gcc 4.9.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

Changes in v2:
- Add new patch to allow driver model to be disabled for TPL

 drivers/Makefile     |  2 +-
 drivers/core/Kconfig | 14 ++++++++++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index 34c55bfb2f..5d8baa5a1f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -2,7 +2,7 @@
 # SPDX-License-Identifier: GPL-2.0+
 #
 
-obj-$(CONFIG_$(SPL_)DM) += core/
+obj-$(CONFIG_$(SPL_TPL_)DM) += core/
 obj-$(CONFIG_$(SPL_)CLK) += clk/
 obj-$(CONFIG_$(SPL_)LED) += led/
 obj-$(CONFIG_$(SPL_)PINCTRL) += pinctrl/
diff --git a/drivers/core/Kconfig b/drivers/core/Kconfig
index 87495614c2..405e9ad8ef 100644
--- a/drivers/core/Kconfig
+++ b/drivers/core/Kconfig
@@ -21,6 +21,20 @@ config SPL_DM
 	  and devices in SPL, so 1KB should be enable. See
 	  CONFIG_SYS_MALLOC_F_LEN for more details on how to enable it.
 
+config TPL_DM
+	bool "Enable Driver Model for TPL"
+	depends on DM && TPL
+	help
+	  Enable driver model in TPL. You will need to provide a
+	  suitable malloc() implementation. If you are not using the
+	  full malloc() enabled by CONFIG_SYS_SPL_MALLOC_START,
+	  consider using CONFIG_SYS_MALLOC_SIMPLE. In that case you
+	  must provide CONFIG_SYS_MALLOC_F_LEN to set the size.
+	  In most cases driver model will only allocate a few uclasses
+	  and devices in SPL, so 1KB should be enough. See
+	  CONFIG_SYS_MALLOC_F_LEN for more details on how to enable it.
+	  Disable this for very small implementations.
+
 config DM_WARN
 	bool "Enable warnings in driver model"
 	depends on DM

On Sun, Apr 02, 2017 at 09:50:31AM -0600, Simon Glass wrote:
> Since TPL often needs to be very very small it may not make sense to enable driver model. Add an option for this.
>
> This change brings the 'rock' board under the TPL limit with gcc 4.9.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>

Applied to u-boot/master, thanks!

Add separate enable/disable controls for driver-model serial. While this is generally enabled in SPL it may not be in TPL, since serial output can be obtained with the debug UART with minimal code size.

Signed-off-by: Simon Glass <sjg@chromium.org>
---

Changes in v2:
- Add new patch to allow driver-model serial to be disabled for TPL

 drivers/serial/Kconfig  | 20 ++++++++++++++++++++
 drivers/serial/Makefile |  2 +-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/serial/Kconfig b/drivers/serial/Kconfig
index ca56a7e604..0900cc8acb 100644
--- a/drivers/serial/Kconfig
+++ b/drivers/serial/Kconfig
@@ -53,6 +53,26 @@ config DM_SERIAL
 	  implements serial_putc() etc. The uclass interface is
 	  defined in include/serial.h.
 
+config SPL_DM_SERIAL
+	bool "Enable Driver Model for serial drivers"
+	depends on DM_SERIAL
+	default y if SPL && DM_SERIAL
+	help
+	  Enable driver model for serial in SPL. This replaces
+	  drivers/serial/serial.c with the serial uclass, which
+	  implements serial_putc() etc. The uclass interface is
+	  defined in include/serial.h.
+
+config TPL_DM_SERIAL
+	bool "Enable Driver Model for serial drivers"
+	depends on DM_SERIAL
+	default y if TPL && DM_SERIAL
+	help
+	  Enable driver model for serial in TPL. This replaces
+	  drivers/serial/serial.c with the serial uclass, which
+	  implements serial_putc() etc. The uclass interface is
+	  defined in include/serial.h.
+
 config DEBUG_UART
 	bool "Enable an early debug UART for debugging"
 	help
diff --git a/drivers/serial/Makefile b/drivers/serial/Makefile
index 84a22ce14c..87c5f145d1 100644
--- a/drivers/serial/Makefile
+++ b/drivers/serial/Makefile
@@ -6,7 +6,7 @@
 #
 
 ifdef CONFIG_DM_SERIAL
-obj-y += serial-uclass.o
+obj-$(CONFIG_$(SPL_TPL_)DM_SERIAL) += serial-uclass.o
 obj-$(CONFIG_PL01X_SERIAL) += serial_pl01x.o
 else
 obj-y += serial.o

On Sun, Apr 02, 2017 at 09:50:32AM -0600, Simon Glass wrote:
> Add separate enable/disable controls for driver-model serial. While this is generally enabled in SPL it may not be in TPL, since serial output can be obtained with the debug UART with minimal code size.
>
> Signed-off-by: Simon Glass <sjg@chromium.org>

Applied to u-boot/master, thanks!

On Sunday, 2 April 2017, 09:50:27 CEST, Simon Glass wrote:
> With the rockchip 'rock' board some build and code size problems have come to light with TPL. This series provides a few ideas to improve things.

great stuff!

With these patches applied, rk3188-rock still boots and the TPL has come down to 616 bytes on gcc-4.9 and 592 bytes on gcc-6.3, so

Tested-by: Heiko Stuebner <heiko@sntech.de>

We have like vast amounts of free space in tpl now ;-)

I guess I should fold your TINY_MEMSET option into my rock board, once you've applied the core patch?

Heiko

On Tuesday, 4 April 2017, 11:34:52 CEST, Heiko Stübner wrote:
> On Sunday, 2 April 2017, 09:50:27 CEST, Simon Glass wrote:
> > With the rockchip 'rock' board some build and code size problems have come to light with TPL. This series provides a few ideas to improve things.
>
> great stuff!
>
> With these patches applied, rk3188-rock still boots and the TPL has come down to 616 bytes on gcc-4.9 and 592 bytes on gcc-6.3, so

Actually, after finding out that I should add a

    # CONFIG_TPL_DM_SERIAL is not set

to my defconfig, the size goes down even more - to 488 bytes on both gcc-4.9 and gcc-6.3.

Still works and all.

> Tested-by: Heiko Stuebner <heiko@sntech.de>
>
> We have like vast amounts of free space in tpl now ;-)
>
> I guess I should fold your TINY_MEMSET option into my rock board, once you've applied the core patch?
>
> Heiko

Hi Heiko,

On 4 April 2017 at 04:43, Heiko Stuebner <heiko@sntech.de> wrote:
> On Tuesday, 4 April 2017, 11:34:52 CEST, Heiko Stübner wrote:
> > On Sunday, 2 April 2017, 09:50:27 CEST, Simon Glass wrote:
> > > With the rockchip 'rock' board some build and code size problems have come to light with TPL. This series provides a few ideas to improve things.
> >
> > great stuff!
> >
> > With these patches applied, rk3188-rock still boots and the TPL has come down to 616 bytes on gcc-4.9 and 592 bytes on gcc-6.3, so
>
> Actually, after finding out that I should add a
>
>     # CONFIG_TPL_DM_SERIAL is not set
>
> to my defconfig, the size goes down even more - to 488 bytes on both gcc-4.9 and gcc-6.3.
>
> Still works and all.

OK great thanks for the report.

Regards,
Simon

> > Tested-by: Heiko Stuebner <heiko@sntech.de>
> >
> > We have like vast amounts of free space in tpl now ;-)
> >
> > I guess I should fold your TINY_MEMSET option into my rock board, once you've applied the core patch?
> >
> > Heiko