[U-Boot] [RFC PATCH] ARM: byteorder: add optimized swab16 and swab32

Use the specialized ARM instructions for swapping 16 and 32 bit values instead of using the generic ones from include/linux/byteorder/swab.h.
The x86 version in arch/x86/include/asm/byteorder.h was taken as an example for this.
E.g. for the mx6qsabrelite target this results in ~4k less code.
Signed-off-by: Dirk Behme dirk.behme@gmail.com ---
This patch is marked as RFC due to two questions:
1) These ARM instructions are available in ARMv6 and above. Do we have to limit this code somehow to >= ARMv6? I couldn't find any #define to do this in U-Boot (?)
2) I'm not sure about the rev16. The rev16 instruction swaps *both* half words, the lower _and_ the upper one:
REV16 Rd, Rm => Rd[15:8] := Rm[7:0], Rd[7:0] := Rm[15:8], Rd[31:24] := Rm[23:16], Rd[23:16] := Rm[31:24]
I'm not sure if we want this? Most probably we only want to swap the lower half word with swap16()?
arch/arm/include/asm/byteorder.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h index c3489f1..f21baf0 100644 --- a/arch/arm/include/asm/byteorder.h +++ b/arch/arm/include/asm/byteorder.h @@ -15,6 +15,23 @@ #ifndef __ASM_ARM_BYTEORDER_H #define __ASM_ARM_BYTEORDER_H
+static __inline__ __u32 ___arch__swab32(__u32 x) +{ + __asm__("rev %0,%1" /* swap bytes */ \ + : "=r" (x) \ + : "0" (x)); \ + return x; +} +#define __arch__swab32(x) ___arch__swab32(x) + +static __inline__ __u16 ___arch__swab16(__u16 x) +{ + __asm__("rev16 %0,%1" /* swap bytes */ \ + : "=r" (x) \ + : "0" (x)); \ + return x; +} +#define __arch__swab16(x) ___arch__swab16(x)
#include <asm/types.h>

Dirk Behme dirk.behme@gmail.com writes:
Use the specialized ARM instructions for swapping 16 and 32 bit values instead of using the generic ones from include/linux/byteorder/swab.h.
The x86 version in arch/x86/include/asm/byteorder.h was taken as an example for this.
E.g. for the mx6qsabrelite target this results in ~4k less code.
Which compiler are you using? GCC 4.5 and later recognises the byteswap pattern and emits these instructions automatically. This gives better code than the inline asm since it allows the compiler to do proper instruction scheduling, and in the 16-bit case it avoids extraneous masking.

Am 10.05.2013 18:56, schrieb Måns Rullgård:
Dirk Behme dirk.behme@gmail.com writes:
Use the specialized ARM instructions for swapping 16 and 32 bit values instead of using the generic ones from include/linux/byteorder/swab.h.
The x86 version in arch/x86/include/asm/byteorder.h was taken as an example for this.
E.g. for the mx6qsabrelite target this results in ~4k less code.
Which compiler are you using?
An older gcc 4.4.4 version.
GCC 4.5 and later recognises the byteswap pattern and emits these instructions automatically. This gives better code than the inline asm since it allows the compiler to do proper instruction scheduling, and in the 16-bit case it avoids extraneous masking.
Thanks,
Dirk

Dear Dirk Behme,
In message 518D3191.7060205@gmail.com you wrote:
Which compiler are you using?
An older gcc 4.4.4 version.
We should not really care about these. And if so, then onlya fter testing we are really suing such old stuff, i. e. those who are using recent tools (eventually the majority of users) should not suffer from hacst to optimize for ancient code.
Best regards,
Wolfgang Denk
participants (3)
-
Dirk Behme
-
Måns Rullgård
-
Wolfgang Denk