
On Sunday, 26 March 2017, 17:38:16 CEST, Simon Glass wrote:
Most of the time the optimised memset() is what we want. For extreme situations such as TPL it may be too large. For example on the 'rock' board, using a simple loop saves a useful 48 bytes. With gcc 4.9 and the rodata bug, this patch is enough to reduce the TPL image below the limit.
Signed-off-by: Simon Glass <sjg@chromium.org>
This brings the rk3188-rock TPL down from 1020 to 972 bytes (against a 1020-byte size limit for the TPL) even with gcc-4.9, and down to 748 bytes with gcc-6.3.
I was using the original memset in all my earlier tests, so I am quite sure it works without issues, but I cannot test it on actual hardware this week.
Heiko
 lib/Kconfig  | 9 +++++++++
 lib/string.c | 6 ++++--
 2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/lib/Kconfig b/lib/Kconfig
index 65c01573e1..5bf512d8c0 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -52,6 +52,15 @@ config LIB_RAND
 	help
 	  This library provides pseudo-random number generator functions.
 
+config FAST_MEMSET
+	bool "Use an optimised memset()"
+	default y
+	help
+	  The faster memset() is the arch-specific one (if available) enabled
+	  by CONFIG_USE_ARCH_MEMSET. If that is not enabled, we can still get
+	  better performance by write a word at a time. Disable this option
+	  to reduce code size slightly at the cost of some speed.
+
 source lib/dhry/Kconfig
 
 source lib/rsa/Kconfig
diff --git a/lib/string.c b/lib/string.c
index 67d5f6a421..159493ed17 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -437,8 +437,10 @@ char *strswab(const char *s)
 void * memset(void * s,int c,size_t count)
 {
 	unsigned long *sl = (unsigned long *) s;
-	unsigned long cl = 0;
 	char *s8;
+
+#ifdef CONFIG_FAST_MEMSET
+	unsigned long cl = 0;
 	int i;
 
 	/* do it one word at a time (32 bits or 64 bits) while possible */
@@ -452,7 +454,7 @@ void * memset(void * s,int c,size_t count)
 			count -= sizeof(*sl);
 		}
 	}
-	/* fill 8 bits at a time */
+#endif	/* fill 8 bits at a time */
 	s8 = (char *)sl;
 	while (count--)
 		*s8++ = c;
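
For anyone reading along without the tree at hand, here is a rough standalone sketch of what the function looks like with the two hunks applied. The lines that the diff skips between the hunks are filled in from my reading of the surrounding code, so treat them as an assumption and not as the exact source. The names sketch_memset and SKETCH_FAST_MEMSET are made up so the file builds outside U-Boot without clashing with libc; they stand in for memset() and CONFIG_FAST_MEMSET.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for CONFIG_FAST_MEMSET. Comment this out to get
 * the small byte-only variant that the patch enables for TPL.
 */
#define SKETCH_FAST_MEMSET 1

/* Hypothetical name, chosen so it does not clash with libc memset(). */
void *sketch_memset(void *s, int c, size_t count)
{
	unsigned long *sl = (unsigned long *)s;
	unsigned char *s8;

#ifdef SKETCH_FAST_MEMSET
	unsigned long cl = 0;
	size_t i;

	/* Take the word path only when the start address is word-aligned. */
	if (((uintptr_t)s & (sizeof(*sl) - 1)) == 0) {
		/* Replicate the fill byte into every byte of one word. */
		for (i = 0; i < sizeof(*sl); i++) {
			cl <<= 8;
			cl |= c & 0xff;
		}
		/* Store whole words (32 or 64 bits) while enough bytes remain. */
		while (count >= sizeof(*sl)) {
			*sl++ = cl;
			count -= sizeof(*sl);
		}
	}
#endif
	/* Fill the remainder (or everything, in the small variant) bytewise. */
	s8 = (unsigned char *)sl;
	while (count--)
		*s8++ = (unsigned char)c;

	return s;
}
```

With the macro undefined, only the final byte loop is left, which is where the size saving comes from. An unaligned start address also falls through to the byte loop for the whole buffer, so the two variants should produce identical results and differ only in speed and size.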