[U-Boot] (no subject)

The following patch series is needed to build U-Boot for ARM platforms with CONFIG_CMD_UBIFS set. The UBIFS layer uses bit operation functions which are currently only implemented for PPC, and the ARM bit operation definitions are unused and wrong.
[PATCH 1/3] ARM: remove unused bit operations
[PATCH 2/3] Add generic bit operations
[PATCH 3/3] ARM: add unaligned macros

Remove a bunch of 'extern' marked functions from include/asm-arm/bitops.h. They are not implemented anywhere in the sources, so these forward declarations are wrong.
Also remove the functions __set_bit, __clear_bit, __change_bit, __test_and_set_bit, __test_and_clear_bit and __test_and_change_bit.
All these functions can be implemented in a generic fashion which will be done in the next patch.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
---
 include/asm-arm/bitops.h |   70 ----------------------------------------------
 1 files changed, 0 insertions(+), 70 deletions(-)
diff --git a/include/asm-arm/bitops.h b/include/asm-arm/bitops.h
index 4b8bab2..3ffd4d5 100644
--- a/include/asm-arm/bitops.h
+++ b/include/asm-arm/bitops.h
@@ -20,76 +20,6 @@
 #define smp_mb__before_clear_bit()	do { } while (0)
 #define smp_mb__after_clear_bit()	do { } while (0)
 
-/*
- * Function prototypes to keep gcc -Wall happy.
- */
-extern void set_bit(int nr, volatile void * addr);
-
-static inline void __set_bit(int nr, volatile void *addr)
-{
-	((unsigned char *) addr)[nr >> 3] |= (1U << (nr & 7));
-}
-
-extern void clear_bit(int nr, volatile void * addr);
-
-static inline void __clear_bit(int nr, volatile void *addr)
-{
-	((unsigned char *) addr)[nr >> 3] &= ~(1U << (nr & 7));
-}
-
-extern void change_bit(int nr, volatile void * addr);
-
-static inline void __change_bit(int nr, volatile void *addr)
-{
-	((unsigned char *) addr)[nr >> 3] ^= (1U << (nr & 7));
-}
-
-extern int test_and_set_bit(int nr, volatile void * addr);
-
-static inline int __test_and_set_bit(int nr, volatile void *addr)
-{
-	unsigned int mask = 1 << (nr & 7);
-	unsigned int oldval;
-
-	oldval = ((unsigned char *) addr)[nr >> 3];
-	((unsigned char *) addr)[nr >> 3] = oldval | mask;
-	return oldval & mask;
-}
-
-extern int test_and_clear_bit(int nr, volatile void * addr);
-
-static inline int __test_and_clear_bit(int nr, volatile void *addr)
-{
-	unsigned int mask = 1 << (nr & 7);
-	unsigned int oldval;
-
-	oldval = ((unsigned char *) addr)[nr >> 3];
-	((unsigned char *) addr)[nr >> 3] = oldval & ~mask;
-	return oldval & mask;
-}
-
-extern int test_and_change_bit(int nr, volatile void * addr);
-
-static inline int __test_and_change_bit(int nr, volatile void *addr)
-{
-	unsigned int mask = 1 << (nr & 7);
-	unsigned int oldval;
-
-	oldval = ((unsigned char *) addr)[nr >> 3];
-	((unsigned char *) addr)[nr >> 3] = oldval ^ mask;
-	return oldval & mask;
-}
-
-extern int find_first_zero_bit(void * addr, unsigned size);
-extern int find_next_zero_bit(void * addr, int size, int offset);
-
-/*
- * This routine doesn't need to be atomic.
- */
-static inline int test_bit(int nr, const void * addr)
-{
-	return ((unsigned char *) addr)[nr >> 3] & (1U << (nr & 7));
-}
 
 /*
  * ffz = Find First Zero in word. Undefined if no zero exists,

This adds generic bit operations for all platforms and includes them from the asm-arm implementation.
Code taken from Linux kernel.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
---
 include/asm-arm/bitops.h     |    2 +
 include/asm-generic/bitops.h |  151 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 153 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-generic/bitops.h
diff --git a/include/asm-arm/bitops.h b/include/asm-arm/bitops.h
index 3ffd4d5..97abea6 100644
--- a/include/asm-arm/bitops.h
+++ b/include/asm-arm/bitops.h
@@ -15,6 +15,8 @@
 #ifndef __ASM_ARM_BITOPS_H
 #define __ASM_ARM_BITOPS_H
 
+#include "asm-generic/bitops.h"
+
 #ifdef __KERNEL__
 
 #define smp_mb__before_clear_bit()	do { } while (0)
diff --git a/include/asm-generic/bitops.h b/include/asm-generic/bitops.h
new file mode 100644
index 0000000..088e5da
--- /dev/null
+++ b/include/asm-generic/bitops.h
@@ -0,0 +1,151 @@
+#ifndef _ASM_GENERIC_BITOPS_H_
+#define _ASM_GENERIC_BITOPS_H_
+
+#include <asm/types.h>
+
+#define BIT(nr)			(1UL << (nr))
+#define BIT_MASK(nr)		(1UL << ((nr) % BITS_PER_LONG))
+#define BIT_WORD(nr)		((nr) / BITS_PER_LONG)
+#define BITS_PER_BYTE		8
+#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
+
+/**
+ * set_bit - Set a bit in memory
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * Unlike set_bit(), this function is non-atomic and may be reordered.
+ * If it's called on the same region of memory simultaneously, the effect
+ * may be that only one operation succeeds.
+ */
+static inline void set_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+
+	*p |= mask;
+}
+
+static inline void clear_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+
+	*p &= ~mask;
+}
+
+/**
+ * change_bit - Toggle a bit in memory
+ * @nr: the bit to change
+ * @addr: the address to start counting from
+ *
+ * Unlike change_bit(), this function is non-atomic and may be reordered.
+ * If it's called on the same region of memory simultaneously, the effect
+ * may be that only one operation succeeds.
+ */
+static inline void change_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+
+	*p ^= mask;
+}
+
+/**
+ * test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+	unsigned long old = *p;
+
+	*p = old | mask;
+	return (old & mask) != 0;
+}
+
+/**
+ * test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static inline int test_and_clear_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+	unsigned long old = *p;
+
+	*p = old & ~mask;
+	return (old & mask) != 0;
+}
+
+/* WARNING: non atomic and it can be reordered!
+ */
+static inline int test_and_change_bit(int nr,
+				      volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+	unsigned long old = *p;
+
+	*p = old ^ mask;
+	return (old & mask) != 0;
+}
+
+/**
+ * test_bit - Determine whether a bit is set
+ * @nr: bit number to test
+ * @addr: Address to start counting from
+ */
+static inline int test_bit(int nr, const volatile unsigned long *addr)
+{
+	return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
+}
+
+/**
+ * fls - find last (most-significant) bit set
+ * @x: the word to search
+ *
+ * This is defined the same way as ffs.
+ * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
+ */
+
+static inline int fls(int x)
+{
+	int r = 32;
+
+	if (!x)
+		return 0;
+	if (!(x & 0xffff0000u)) {
+		x <<= 16;
+		r -= 16;
+	}
+	if (!(x & 0xff000000u)) {
+		x <<= 8;
+		r -= 8;
+	}
+	if (!(x & 0xf0000000u)) {
+		x <<= 4;
+		r -= 4;
+	}
+	if (!(x & 0xc0000000u)) {
+		x <<= 2;
+		r -= 2;
+	}
+	if (!(x & 0x80000000u)) {
+		x <<= 1;
+		r -= 1;
+	}
+	return r;
+}
+
+#endif /* _ASM_GENERIC_BITOPS_H_ */
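(Editorial note, not part of the thread: a minimal usage sketch of the API above. The DIV_ROUND_UP fallback and all names other than the bitops themselves are illustrative assumptions.)

#include <asm/bitops.h>	/* pulls in asm-generic/bitops.h on ARM after this patch */

/* BITS_TO_LONGS needs DIV_ROUND_UP; assumed to come from common.h,
 * with a fallback here so the sketch stands alone */
#ifndef DIV_ROUND_UP
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#endif

#define NUM_FLAGS	64

/* bitmap storage: bit 0 is the LSB of flags[0] */
static unsigned long flags[BITS_TO_LONGS(NUM_FLAGS)];

static void flags_demo(void)
{
	set_bit(3, flags);		/* flags[0] |= 1UL << 3; non-atomic */
	if (test_bit(3, flags))
		clear_bit(3, flags);

	/* non-atomic read-modify-write: returns the old bit value */
	if (!test_and_set_bit(40, flags)) {
		/* bit 40 was clear and is now set (word 1 with
		 * 32-bit longs, still word 0 with 64-bit longs) */
	}
}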

Signed-off-by: Daniel Mack <daniel@caiaq.de>
---
 include/asm-arm/unaligned.h |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-arm/unaligned.h
diff --git a/include/asm-arm/unaligned.h b/include/asm-arm/unaligned.h
new file mode 100644
index 0000000..dd7d852
--- /dev/null
+++ b/include/asm-arm/unaligned.h
@@ -0,0 +1,14 @@
+#ifndef _ASM_ARM_UNALIGNED_H
+#define _ASM_ARM_UNALIGNED_H
+
+#ifdef __KERNEL__
+
+#include <linux/unaligned/access_ok.h>
+#include <linux/unaligned/generic.h>
+
+#define get_unaligned	__get_unaligned_le
+#define put_unaligned	__put_unaligned_le
+
+#endif /* __KERNEL__ */
+#endif /* _ASM_ARM_UNALIGNED_H */
+

On Thu, Jun 04, 2009 at 12:27:21PM +0200, Daniel Mack wrote:
 include/asm-arm/unaligned.h |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-arm/unaligned.h
This one was too easy, updated patch below.
With that one applied, the lzo1x decompressor and the whole of UBIFS work fine on an ARM PXA3xx.
Daniel
From 827de3c735a829de64a07467e7d10c07299cfe04 Mon Sep 17 00:00:00 2001
From: Daniel Mack <daniel@caiaq.de>
Date: Thu, 4 Jun 2009 12:19:52 +0200
Subject: [PATCH] ARM: add unaligned macros
Unaligned data access is evil on ARMs, especially when no magic foo like Linux' traps clean up after us, hence choose the 'packed struct' way.
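(Editorial illustration of the 'packed struct' way, not part of the patch; the function names are made up.) The packed attribute tells GCC that the member may sit at any address, so on ARM it emits byte loads plus shifts instead of a single word load, which on these cores would rotate the result or fault, depending on the core:

#include <asm/types.h>

struct una_u32 { u32 x; } __attribute__((packed));

static u32 load_any_alignment(const void *p)
{
	/* no alignment assumed: compiles to ldrb/orr sequences */
	return ((const struct una_u32 *)p)->x;
}

static u32 load_aligned_only(const u32 *p)
{
	/* single ldr: undefined result on misaligned p */
	return *p;
}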
Signed-off-by: Daniel Mack <daniel@caiaq.de>
---
 include/asm-arm/unaligned.h             |   14 ++++++++++
 include/linux/unaligned/le_struct.h     |   36 +++++++++++++++++++++++++
 include/linux/unaligned/packed_struct.h |   44 +++++++++++++++++++++++++++++++
 3 files changed, 94 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-arm/unaligned.h
 create mode 100644 include/linux/unaligned/le_struct.h
 create mode 100644 include/linux/unaligned/packed_struct.h
diff --git a/include/asm-arm/unaligned.h b/include/asm-arm/unaligned.h
new file mode 100644
index 0000000..569db55
--- /dev/null
+++ b/include/asm-arm/unaligned.h
@@ -0,0 +1,14 @@
+#ifndef _ASM_ARM_UNALIGNED_H
+#define _ASM_ARM_UNALIGNED_H
+
+#ifdef __KERNEL__
+
+#include <linux/unaligned/le_struct.h>
+#include <linux/unaligned/generic.h>
+
+#define get_unaligned	__get_unaligned_le
+#define put_unaligned	__put_unaligned_le
+
+#endif /* __KERNEL__ */
+#endif /* _ASM_ARM_UNALIGNED_H */
+
diff --git a/include/linux/unaligned/le_struct.h b/include/linux/unaligned/le_struct.h
new file mode 100644
index 0000000..088c457
--- /dev/null
+++ b/include/linux/unaligned/le_struct.h
@@ -0,0 +1,36 @@
+#ifndef _LINUX_UNALIGNED_LE_STRUCT_H
+#define _LINUX_UNALIGNED_LE_STRUCT_H
+
+#include <linux/unaligned/packed_struct.h>
+
+static inline u16 get_unaligned_le16(const void *p)
+{
+	return __get_unaligned_cpu16((const u8 *)p);
+}
+
+static inline u32 get_unaligned_le32(const void *p)
+{
+	return __get_unaligned_cpu32((const u8 *)p);
+}
+
+static inline u64 get_unaligned_le64(const void *p)
+{
+	return __get_unaligned_cpu64((const u8 *)p);
+}
+
+static inline void put_unaligned_le16(u16 val, void *p)
+{
+	__put_unaligned_cpu16(val, p);
+}
+
+static inline void put_unaligned_le32(u32 val, void *p)
+{
+	__put_unaligned_cpu32(val, p);
+}
+
+static inline void put_unaligned_le64(u64 val, void *p)
+{
+	__put_unaligned_cpu64(val, p);
+}
+
+#endif /* _LINUX_UNALIGNED_LE_STRUCT_H */
diff --git a/include/linux/unaligned/packed_struct.h b/include/linux/unaligned/packed_struct.h
new file mode 100644
index 0000000..5034257
--- /dev/null
+++ b/include/linux/unaligned/packed_struct.h
@@ -0,0 +1,44 @@
+#ifndef _LINUX_UNALIGNED_PACKED_STRUCT_H
+#define _LINUX_UNALIGNED_PACKED_STRUCT_H
+
+struct __una_u16 { u16 x __attribute__((packed)); } __attribute__((packed));
+struct __una_u32 { u32 x __attribute__((packed)); } __attribute__((packed));
+struct __una_u64 { u64 x __attribute__((packed)); } __attribute__((packed));
+
+static inline u16 __get_unaligned_cpu16(const void *p)
+{
+	const struct __una_u16 *ptr = (const struct __una_u16 *)p;
+	return ptr->x;
+}
+
+static inline u32 __get_unaligned_cpu32(const void *p)
+{
+	const struct __una_u32 *ptr = (const struct __una_u32 *)p;
+	return ptr->x;
+}
+
+static inline u64 __get_unaligned_cpu64(const void *p)
+{
+	const struct __una_u64 *ptr = (const struct __una_u64 *)p;
+	return ptr->x;
+}
+
+static inline void __put_unaligned_cpu16(u16 val, void *p)
+{
+	struct __una_u16 *ptr = (struct __una_u16 *)p;
+	ptr->x = val;
+}
+
+static inline void __put_unaligned_cpu32(u32 val, void *p)
+{
+	struct __una_u32 *ptr = (struct __una_u32 *)p;
+	ptr->x = val;
+}
+
+static inline void __put_unaligned_cpu64(u64 val, void *p)
+{
+	struct __una_u64 *ptr = (struct __una_u64 *)p;
+	ptr->x = val;
+}
+
+#endif /* _LINUX_UNALIGNED_PACKED_STRUCT_H */
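(Editorial usage sketch, not part of the patch: reading a little-endian on-disk field from an arbitrarily aligned buffer, the case UBIFS hits constantly. The offset and helper name are illustrative; 0x06101831 is the UBIFS node magic.)

#include <asm/types.h>
#include <asm/unaligned.h>

/* hypothetical on-disk header: a LE32 magic at byte offset 5 */
static int check_node_magic(const u8 *buf)
{
	u32 magic = get_unaligned_le32(buf + 5);	/* misaligned is fine */

	return magic == 0x06101831;
}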

Dear Daniel Mack,
In message 20090604174208.GM26688@buzzloop.caiaq.de you wrote:
+static inline u16 get_unaligned_le16(const void *p)
+{
+	return __get_unaligned_cpu16((const u8 *)p);
+}
+
+static inline u32 get_unaligned_le32(const void *p)
+{
+	return __get_unaligned_cpu32((const u8 *)p);
+}
+
+static inline u64 get_unaligned_le64(const void *p)
+{
+	return __get_unaligned_cpu64((const u8 *)p);
+}
Are these 3 really all "u8" pointers, or is this a copy & paste error?
Is there any guarantee that such macros are never used on device registers and the like?
Best regards,
Wolfgang Denk

On Thu, Jun 04, 2009 at 09:03:47PM +0200, Wolfgang Denk wrote:
+static inline u16 get_unaligned_le16(const void *p)
+{
+	return __get_unaligned_cpu16((const u8 *)p);
+}
+
+static inline u32 get_unaligned_le32(const void *p)
+{
+	return __get_unaligned_cpu32((const u8 *)p);
+}
+
+static inline u64 get_unaligned_le64(const void *p)
+{
+	return __get_unaligned_cpu64((const u8 *)p);
+}
Are these 3 really all "u8" pointers, or is this a copy & paste error?
Yes, this is how it came from the Linux kernel and my tests show that these access methods work well.
Is there any guarantee that such macros are never used on device registers and the like?
Well - how can I guarantee that? Anyway - the functions can be enhanced later to make them work with different types of memory. For now, they implement a working set of functions that allows ubifs (and probably other code as well) to be compiled and run on ARMs.
Daniel

On Thursday 04 June 2009 21:23:31 Daniel Mack wrote:
Is there any guarantee that such macros are never used on device registers and the like?
Well - how can I guarantee that? Anyway - the functions can be enhanced later to make them work with different types of memory. For now, they implement a working set of functions that allows ubifs (and probably other code as well) to be compiled and run on ARMs.
Yes. I suggest that we just document that these functions (and the set_bit()... ones) don't implement any memory barriers/sync operations and therefore should be handled with care when used on IO registers etc (on platforms that need such barriers like PPC).
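(Editorial sketch of what such a note and the IO-safe alternative could look like; readl/writel as in asm/io.h, the helper name is made up.)

/*
 * NOTE: the asm-generic bitops (set_bit() & co.) are non-atomic and
 * imply no memory barriers or IO ordering - never use them on
 * device registers.
 */
#include <asm/io.h>
#include <asm/types.h>

/* for IO registers, read-modify-write through the IO accessors instead */
static inline void reg_set_bits(void *reg, u32 mask)
{
	writel(readl(reg) | mask, reg);
}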
Best regards, Stefan
=====================================================================
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-0 Fax: +49-8142-66989-80 Email: office@denx.de
=====================================================================

Dear Daniel Mack,
In message 1244111241-32735-3-git-send-email-daniel@caiaq.de you wrote:
This adds generic bit operations for all platforms and includes them from the asm-arm implementation.
Be careful. I am not sure a generic definition is even possible. In any case, it is extremely easy to use in the wrong way.
It's nothing but a faux ami.
My recommendation is NEVER to use any code based on bit numbers (even if this is what some chip documentation may refer to) but ONLY use appropriate masks instead.
new file mode 100644
index 0000000..088e5da
--- /dev/null
+++ b/include/asm-generic/bitops.h
@@ -0,0 +1,151 @@
+#ifndef _ASM_GENERIC_BITOPS_H_
+#define _ASM_GENERIC_BITOPS_H_
+
+#include <asm/types.h>
+
+#define BIT(nr)			(1UL << (nr))
+#define BIT_MASK(nr)		(1UL << ((nr) % BITS_PER_LONG))
+#define BIT_WORD(nr)		((nr) / BITS_PER_LONG)
+#define BITS_PER_BYTE		8
+#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
You see, this is plain wrong on PowerPC.
On PowerPC, bit number 0 is the most significant bit, not the least significant one as you assume here. So using this well-intended code on a PowerPC system will most likely get you in trouble.
[And you cannot even use a generic definition for PowerPC, as something like "#define BIT(nr) (0x80000000 >> (nr))" will only work on 32 bit systems but be wrong on 64 bit ones.]
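(Editorial illustration of the two conventions; BIT_PPC32 is a made-up name.)

#define BIT_GENERIC(nr)	(1UL << (nr))		/* bit 0 = LSB (this patch) */
#define BIT_PPC32(nr)	(0x80000000UL >> (nr))	/* bit 0 = MSB, 32-bit only */

/*
 * BIT_GENERIC(0)  == 0x00000001	BIT_PPC32(0)  == 0x80000000
 * BIT_GENERIC(15) == 0x00008000	BIT_PPC32(15) == 0x00010000
 */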
Let's get rid of this stuff, it is confusing.
Best regards,
Wolfgang Denk

On Thu, Jun 04, 2009 at 01:45:22PM +0200, Wolfgang Denk wrote:
+#define BIT(nr)			(1UL << (nr))
+#define BIT_MASK(nr)		(1UL << ((nr) % BITS_PER_LONG))
+#define BIT_WORD(nr)		((nr) / BITS_PER_LONG)
+#define BITS_PER_BYTE		8
+#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
You see, this is plain wrong on PowerPC.
On PowerPC, bit number 0 is the most significant bit, not the least significant one as you assume here. So using this well-intended code on a PowerPC system will most likely get you in trouble.
Well, the idea is to let those platforms that do not bring their own implementations use the generic operations. The code above is not in use by PPC, so it doesn't do any harm either.
[And you cannot even use a generic definition for PowerPC, as some- thing like "#define BIT(nr) (0x80000000 >> (nr))" will only work on 32 bit systems but be wrong on 64 bit ones.]
That is why PPC implements that in its own fashion. Fair enough.
Let's get rid of this stuff, it is confusing.
Hmm, and how?
Daniel

Dear Daniel Mack,
In message 20090604114818.GH26688@buzzloop.caiaq.de you wrote:
+#define BIT(nr)			(1UL << (nr))
+#define BIT_MASK(nr)		(1UL << ((nr) % BITS_PER_LONG))
+#define BIT_WORD(nr)		((nr) / BITS_PER_LONG)
+#define BITS_PER_BYTE		8
+#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
You see, this is plain wrong on PowerPC.
On PowerPC, bit number 0 is the most significant bit, not the least significant one as you assume here. So using this well-intended code on a PowerPC system will most likely get you in trouble.
Well, the idea is to let those platforms that do not bring their own implementations use the generic operations. The code above is not in use by PPC, so it doesn't do any harm either.
But it is not a generic operation. The notion of a "bit number" is a generic concept, but here we have a machine-dependent implementation that sails under the name "asm-generic/bitops.h".
But it is NOT generic.
Let's get rid of this stuff, it is confusing.
Hmm, and how?
Just don't use it. Use masks instead of bit numbers.
What's wrong with using 0x00008000 instead of BIT(15) (which would be 0x00010000 on 32-bit Power systems)?
Best regards,
Wolfgang Denk

Dear Daniel Mack,
In message 1244111241-32735-3-git-send-email-daniel@caiaq.de you wrote:
This adds generic bit operations for all platforms and includes them from the asm-arm implementation.
I should have read the patch to the end...
+static inline void set_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+
+	*p |= mask;
+}
+
+static inline void clear_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+
+	*p &= ~mask;
+}
Such code has no chance of being accepted anyway.
It tries to be generic and does not care if you use it on memory or device addresses - but for device addresses, the use of proper I/O accessor functions is mandatory.
Full NAK.
Best regards,
Wolfgang Denk

On Thu, Jun 04, 2009 at 01:47:17PM +0200, Wolfgang Denk wrote:
+static inline void clear_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+
+	*p &= ~mask;
+}
Such code has no chance of being accepted anyway.
It tries to be generic and does not care if you use it on memory or device addresses - but for device addresses, the use of proper I/O accessor functions is mandatory.
And the functions I removed from asm-arm/bitops.h did that?

On 12:27 Thu 04 Jun, Daniel Mack wrote:
This adds generic bit operations for all platforms and includes them from the asm-arm implementation.
Code taken from Linux kernel.
the generic __set_bit, __clear_bit, __change_bit & co. are used on ARM - ok
but be aware that we use them in U-Boot, so NACK
and please do not rename __xxx to xxx - they do not mean the same thing
the __xxx variants can be re-ordered and the xxx ones can not (see the sketch below)
IMHO, please keep the Linux naming to simplify things in the future, and please specify which file this is imported from and which kernel version
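(Editorial note on the Linux convention referred to here:)

/*
 * Linux naming convention, roughly:
 *
 *   set_bit(nr, addr)   - atomic; safe against a concurrent
 *                         read-modify-write of the same word
 *   __set_bit(nr, addr) - non-atomic; a plain load/modify/store
 *                         that the compiler and CPU may reorder
 *
 * The generic bitops in this patch behave like the __xxx variants
 * even though they carry the plain names.
 */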
Best Regards, J.

On Fri, Jun 05, 2009 at 10:44:21PM +0200, Jean-Christophe PLAGNIOL-VILLARD wrote:
This adds generic bit operations for all platforms and includes them from the asm-arm implementation.
Code taken from Linux kernel.
the generic __set_bit, __clear_bit, __change_bit & co. are used on ARM - ok
Do you mean it worked before? Or are you referring to the version I posted which you think is ok?
but be aware that we use them in U-Boot, so NACK
You use what? The functions I removed? I doubt that. The functions marked 'extern' have no implementation, and I also don't believe the static inlines did the right thing (note the shift operations).
and please do not rename __xxx to xxx - they do not mean the same thing
the __xxx variants can be re-ordered and the xxx ones can not
Sorry, I don't get it - what's your point about the status quo in the sources and about the patches?
Daniel