
On 25/11/2020 12.26, Patrick Delaunay wrote:
A common use of memmove() can be handled by memcpy(). Also memcpy() includes an optimization for large sizes: it copies a word at a time. So we can get a speed-up by calling memcpy() to handle our move in this case.
Update memmove() to call also memcpy() if the source don't overlap the destination (src + count <= dest).
Signed-off-by: Patrick Delaunay patrick.delaunay@st.com
This patch allows to save 38ms for Kernel Image extraction (7327624 Bytes) from FIT loaded at 0xC2000000 for ARMV7 board STM32MP157C-EV1, and with kernel destination = Load Address: 0xc4000000, located after the FIT without overlap, compared with destination = Load Address: 0xc0008000.
-> 14,332 us vs 52,239 in bootstage report
In this case the memmove funtion is called in common/image.c::memmove_wd() to handle overlap.
lib/string.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/string.c b/lib/string.c index ae7835f600..ef8ead976c 100644 --- a/lib/string.c +++ b/lib/string.c @@ -567,7 +567,7 @@ void * memmove(void * dest,const void *src,size_t count) { char *tmp, *s;
- if (dest <= src) {
- if (dest <= src || (src + count) <= dest) { memcpy(dest, src, count);
Hm. So the condition you add (src + count <= dest) implies no overlap at all, so that's ok. So this doesn't really have anything to do with your patch per se.
The existing condition relies on memcpy doing forward-copying. That's the case if the implementation from lib/string.c is in use, i.e. if __HAVE_ARCH_MEMCPY is not defined. And if an arch defines __HAVE_ARCH_MEMMOVE, this memmove() is not used.
But AFAICT, there's a potential problem for the case where __HAVE_ARCH_MEMCPY is defined but __HAVE_ARCH_MEMMOVE is not, and e.g. arch/arm/include/asm/string.h does
#if CONFIG_IS_ENABLED(USE_ARCH_MEMCPY) #define __HAVE_ARCH_MEMCPY #endif #undef __HAVE_ARCH_MEMMOVE
Of course, the arch-specific implementation _may_ also do forward-copying (I haven't tried to decipher it), but that seems to be a rather fragile assumption. At the very least, some comments would be in order.
Rasmus