[U-Boot] [PATCH] arm: Add armv6 and armv7 optimized swab functions


Rob Herring

15 Dec 2010

4:13 p.m.

From: Rob Herring rob.herring@calxeda.com

swab functions are heavily used by FDT code, so enable optimized assembly code for ARMv6 and later.

Signed-off-by: Rob Herring rob.herring@calxeda.com
---
 arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
index c3489f1..9df5844 100644
--- a/arch/arm/include/asm/byteorder.h
+++ b/arch/arm/include/asm/byteorder.h
@@ -23,6 +23,22 @@
 # define __SWAB_64_THRU_32__
 #endif
 
+#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
+static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
+{
+	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
+	return x;
+}
+#define __arch_swab16 ___arch_swab16
+
+static inline __u32 __attribute__((const)) ___arch_swab32(__u32 x)
+{
+	__asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
+	return x;
+}
+#define __arch_swab32 ___arch_swab32
+#endif
+
 #ifdef __ARMEB__
 #include <linux/byteorder/big_endian.h>
 #else

-- 1.7.1
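
For readers unfamiliar with the two instructions used in the patch, the following is a minimal portable C sketch of what "rev" and "rev16" compute on a 32-bit register. The model_* names are invented for this illustration and are not part of the patch or of U-Boot. Note that "rev16" swaps the bytes inside each halfword, so it touches the top halfword as well.

```c
#include <assert.h>
#include <stdint.h>

/* Portable model of "rev": reverse the order of all four bytes. */
static inline uint32_t model_rev(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Portable model of "rev16": swap the two bytes inside each halfword. */
static inline uint32_t model_rev16(uint32_t x)
{
	return ((x >> 8) & 0x00ff00ffu) | ((x << 8) & 0xff00ff00u);
}

/* A 16-bit swab built on the rev16 model; the cast drops the top halfword. */
static inline uint16_t model_swab16(uint16_t x)
{
	return (uint16_t)model_rev16(x);
}

/* Small usage example (hypothetical helper, for illustration only). */
static inline void model_demo(void)
{
	assert(model_rev(0x12345678u) == 0x78563412u);
	assert(model_rev16(0x12345678u) == 0x34127856u);
	assert(model_swab16(0x1234u) == 0x3412u);
}
```

The generic C version typically compiles to several shift, mask and OR instructions, which is the work the patch aims to replace with a single instruction on ARMv6 and later.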


Wolfgang Denk

17 Dec 2010

9:21 p.m.


Dear Rob Herring,

In message 1292425994-24331-1-git-send-email-robherring2@gmail.com you wrote:

...

From: Rob Herring rob.herring@calxeda.com

swab functions are heavily used by FDT code, so enable optimized assembly code for ARMv6 and later.

Signed-off-by: Rob Herring rob.herring@calxeda.com

arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)

Do you have any numbers showing whether this change gives a measurable improvement?

Best regards,

Wolfgang Denk

--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
A modem is a baudy house.

Rob Herring

9:52 p.m.


Wolfgang,

On 12/17/2010 02:21 PM, Wolfgang Denk wrote:

...

Dear Rob Herring,

In message 1292425994-24331-1-git-send-email-robherring2@gmail.com you wrote:

...
From: Rob Herring rob.herring@calxeda.com

swab functions are heavily used by FDT code, so enable optimized assembly code for ARMv6 and later.

Signed-off-by: Rob Herring rob.herring@calxeda.com

arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)

Do you have any numbers showing whether this change gives a measurable improvement?

I have an instruction trace capture that shows repeated calls to swab32 by the FDT code. It's obvious low-hanging fruit. Booting with a device tree takes noticeably longer than booting without one, but I don't have any formal measurements.
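
To illustrate why FDT code ends up calling swab32 so often: a flattened device tree stores its 32-bit cells in big-endian order, so a little-endian ARM core has to byte-swap every cell it reads. The following is only a sketch under that assumption; the sketch_* helpers are hypothetical names and this is not the libfdt implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Generic byte swap, the kind of code a single "rev" would replace. */
static inline uint32_t sketch_swab32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Runtime probe so the sketch also works on a big-endian host. */
static inline int sketch_host_is_le(void)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/* Hypothetical stand-in for a big-endian-to-CPU conversion helper. */
static inline uint32_t sketch_be32_to_cpu(uint32_t raw)
{
	return sketch_host_is_le() ? sketch_swab32(raw) : raw;
}

/* Read one big-endian 32-bit cell from a device tree blob. */
static inline uint32_t sketch_read_cell(const unsigned char *p)
{
	uint32_t raw;

	memcpy(&raw, p, sizeof(raw));	/* avoids unaligned loads */
	return sketch_be32_to_cpu(raw);
}

/* Usage example: the FDT header starts with the magic 0xd00dfeed. */
static inline void sketch_demo(void)
{
	const unsigned char magic[4] = { 0xd0, 0x0d, 0xfe, 0xed };

	assert(sketch_read_cell(magic) == 0xd00dfeedu);
}
```

Every header field, property length and cell value goes through a conversion of this kind, which is consistent with swab32 showing up repeatedly in a trace. How much of the boot time it accounts for would still need to be measured.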

Rob

Måns Rullgård

10:27 p.m.


Rob Herring robherring2@gmail.com writes:

...

From: Rob Herring rob.herring@calxeda.com

swab functions are heavily used by FDT code, so enable optimized assembly code for ARMv6 and later.

Signed-off-by: Rob Herring rob.herring@calxeda.com

arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
index c3489f1..9df5844 100644
--- a/arch/arm/include/asm/byteorder.h
+++ b/arch/arm/include/asm/byteorder.h
@@ -23,6 +23,22 @@
 # define __SWAB_64_THRU_32__
 #endif

+#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
+static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
+{
+	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
+	return x;
+}

Pay close attention to what gcc does with this as it is prone to add unnecessary masking of the low halfword. If the callers are well-behaved (argument having top halfword clear), making the parameter and return types here plain unsigned (or u32) gives better code.
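
The point about types can be sketched as follows. This is an illustration only, with hypothetical sketch_* names; whether a given gcc actually emits the extra masking instruction depends on compiler version and options, so the generated assembly (for example from gcc -O2 -S) is the thing to check. The inline asm is used only when building for 32-bit ARMv6 or later, otherwise a portable model stands in for it.

```c
#include <assert.h>
#include <stdint.h>

/* rev16 on a full 32-bit register: swap the bytes inside each halfword. */
static inline uint32_t sketch_rev16(uint32_t x)
{
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 6)
	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
	return x;
#else
	return ((x >> 8) & 0x00ff00ffu) | ((x << 8) & 0xff00ff00u);
#endif
}

/*
 * Variant with 16-bit types, as in the patch. The compiler cannot see
 * inside the asm, so to honour the 16-bit type it may have to
 * zero-extend the result, even when the top halfword is already clear.
 */
static inline uint16_t sketch_swab16_narrow(uint16_t x)
{
	return (uint16_t)sketch_rev16(x);
}

/*
 * Variant with 32-bit types. No truncation is needed, but the caller
 * must guarantee that the top halfword of the argument is clear.
 */
static inline uint32_t sketch_swab16_wide(uint32_t x)
{
	return sketch_rev16(x);
}

/* Usage example showing where the two variants differ. */
static inline void sketch_types_demo(void)
{
	assert(sketch_swab16_narrow(0x1234u) == 0x3412u);
	assert(sketch_swab16_wide(0x1234u) == 0x3412u);
	/* A badly behaved caller leaks the swapped top halfword. */
	assert(sketch_swab16_wide(0xabcd1234u) == 0xcdab3412u);
}
```

The wide variant trades a possible redundant instruction for a precondition on every caller, which is why it only pays off when the callers are known to be well-behaved.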

-- Måns Rullgård mans@mansr.com

Rob Herring

18 Dec 2010

6:12 p.m.


On 12/17/2010 03:27 PM, Måns Rullgård wrote:

...

Rob Herring robherring2@gmail.com writes:

...
From: Rob Herring rob.herring@calxeda.com

swab functions are heavily used by FDT code, so enable optimized assembly code for ARMv6 and later.

Signed-off-by: Rob Herring rob.herring@calxeda.com

arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
index c3489f1..9df5844 100644
--- a/arch/arm/include/asm/byteorder.h
+++ b/arch/arm/include/asm/byteorder.h
@@ -23,6 +23,22 @@
 # define __SWAB_64_THRU_32__
 #endif

+#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
+static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
+{
+	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
+	return x;
+}

Pay close attention to what gcc does with this as it is prone to add unnecessary masking of the low halfword. If the callers are well-behaved (argument having top halfword clear), making the parameter and return types here plain unsigned (or u32) gives better code.

This is straight from the Linux code, and there are only a few users of swab16 (none in my build).

Rob

Måns Rullgård

7:17 p.m.


Rob Herring robherring2@gmail.com writes:

...

On 12/17/2010 03:27 PM, Måns Rullgård wrote:

...
Rob Herring robherring2@gmail.com writes:

...
From: Rob Herring rob.herring@calxeda.com

swab functions are heavily used by FDT code, so enable optimized assembly code for ARMv6 and later.

Signed-off-by: Rob Herring rob.herring@calxeda.com

arch/arm/include/asm/byteorder.h | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/byteorder.h b/arch/arm/include/asm/byteorder.h
index c3489f1..9df5844 100644
--- a/arch/arm/include/asm/byteorder.h
+++ b/arch/arm/include/asm/byteorder.h
@@ -23,6 +23,22 @@
 # define __SWAB_64_THRU_32__
 #endif

+#if defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_6__)
+static inline __u16 __attribute__((const)) ___arch_swab16(__u16 x)
+{
+	__asm__ ("rev16 %0, %1" : "=r" (x) : "r" (x));
+	return x;
+}

Pay close attention to what gcc does with this as it is prone to add unnecessary masking of the low halfword. If the callers are well-behaved (argument having top halfword clear), making the parameter and return types here plain unsigned (or u32) gives better code.

This is straight from the Linux code, and there are only a few users of swab16 (none in my build).

Look at the generated code if you don't believe me.

-- Måns Rullgård mans@mansr.com

Wolfgang Denk

10:59 p.m.


Dear Rob Herring,

In message 4D0CEB67.2040502@gmail.com you wrote:

...

This is straight from the Linux code, and there are only a few users of swab16 (none in my build).

Given that we have no idea whether this code gives any measurable performance improvement, and that it appears to be dangerous as well, I tend not to include it as is.

Thanks.

Wolfgang Denk
