[U-Boot] [PATCH v3] ARM: Avoid compiler optimization for usages of readb, writeb and friends.

gcc 4.5.1 seems to ignore (at least some) volatile definitions, avoid that as done in the kernel.
Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that gcc version to ignore the volatile type qualifier used e.g. in __arch_getl(). Anyway, using a definition as in the kernel headers avoids such optimizations when gcc 4.5.1 is used.
Maybe the headers as used in the current linux-kernel should be used, but to avoid large changes, I've just added a small change to the current headers.
I haven't added the definitions that use a memory barrier because I haven't found a place in the kernel where they are actually enabled (CONFIG_ARM_DMA_MEM_BUFFERABLE).
Signed-off-by: Alexander Holler holler@ahsoftware.de
---
 arch/arm/include/asm/io.h |   20 ++++++++++++++------
 1 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index ff1518e..068ed17 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -125,13 +125,21 @@ extern inline void __raw_readsl(unsigned int addr, void *data, int longlen)
 #define __raw_readw(a)	__arch_getw(a)
 #define __raw_readl(a)	__arch_getl(a)
 
-#define writeb(v,a)	__arch_putb(v,a)
-#define writew(v,a)	__arch_putw(v,a)
-#define writel(v,a)	__arch_putl(v,a)
+/*
+ * TODO: The kernel offers some more advanced versions of barriers, it might
+ * have some advantages to use them instead of the simple one here.
+ */
+#define dmb()		__asm__ __volatile__ ("" : : : "memory")
+#define __iormb()	dmb()
+#define __iowmb()	dmb()
+
+#define writeb(v,c)	do { __iowmb(); __arch_putb(v,c); } while (0)
+#define writew(v,c)	do { __iowmb(); __arch_putw(v,c); } while (0)
+#define writel(v,c)	do { __iowmb(); __arch_putl(v,c); } while (0)
 
-#define readb(a)	__arch_getb(a)
-#define readw(a)	__arch_getw(a)
-#define readl(a)	__arch_getl(a)
+#define readb(c)	({ u8  __v = __arch_getb(c); __iormb(); __v; })
+#define readw(c)	({ u16 __v = __arch_getw(c); __iormb(); __v; })
+#define readl(c)	({ u32 __v = __arch_getl(c); __iormb(); __v; })
 
 /*
  * The compiler seems to be incapable of optimising constants
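
To illustrate what the read accessors in this patch do, here is a minimal host-side sketch. It assumes __arch_getl() is a plain volatile load (written here with uint32_t instead of u32), and fake_status_reg and sample_twice() are hypothetical names standing in for a memory-mapped register and for polling code. It is not U-Boot code, only an illustration of the load-then-barrier pattern.

```c
#include <stdint.h>

/* Compiler-only barrier as in the patch: emits no instruction, but the
 * "memory" clobber forbids keeping memory values in registers across it. */
#define dmb()		__asm__ __volatile__ ("" : : : "memory")
#define __iormb()	dmb()

/* Assumed shape of the raw accessor: a single volatile 32-bit load. */
#define __arch_getl(a)	(*(volatile uint32_t *)(a))

/* Read accessor in the style of the patch: load first, then the barrier. */
#define readl(c)	({ uint32_t __v = __arch_getl(c); __iormb(); __v; })

/* Hypothetical stand-in for a memory-mapped status register. */
static uint32_t fake_status_reg;

/* Hypothetical helper: sample the register twice, as polling code would.
 * The first sample is stored in *out_first, the second one is returned. */
uint32_t sample_twice(uint32_t first, uint32_t second, uint32_t *out_first)
{
	fake_status_reg = first;
	*out_first = readl(&fake_status_reg);
	fake_status_reg = second;	/* the "hardware" changes the register */
	return readl(&fake_status_reg);
}
```

Each readl() has to produce its own load, so the two samples can differ when the register changes between them.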

On 22.12.2010 12:04, Alexander Holler wrote:
gcc 4.5.1 seems to ignore (at least some) volatile definitions, avoid that as done in the kernel.
Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that gcc version to ignore the volatile type qualifier used e.g. in __arch_getl(). Anyway, using a definition as in the kernel headers avoids such optimizations when gcc 4.5.1 is used.
Maybe the headers as used in the current linux-kernel should be used, but to avoid large changes, I've just added a small change to the current headers.
I haven't added the definitions that use a memory barrier because I haven't found a place in the kernel where they are actually enabled (CONFIG_ARM_DMA_MEM_BUFFERABLE).
Signed-off-by: Alexander Holler holler@ahsoftware.de
This patch seems to be the same as the proposal from Wolfgang
http://lists.denx.de/pipermail/u-boot/2010-December/084122.html
So we shouldn't drop his
Signed-off-by: Wolfgang Denk wd@denx.de
Besides this
Acked-by: Dirk Behme dirk.behme@googlemail.com
Thanks
Dirk

Hello,
On 22.12.2010 15:50, Dirk Behme wrote:
This patch seems to be the same as the proposal from Wolfgang
http://lists.denx.de/pipermail/u-boot/2010-December/084122.html
Exactly.
So we shouldn't drop his
Signed-off-by: Wolfgang Denk wd@denx.de
Sorry, I didn't see that Signed-off-by, and I didn't notice that the message included a complete patch. I only saw the fix for write* while (quickly) reading the message and doing the tests afterwards.
Besides this
Acked-by: Dirk Behme dirk.behme@googlemail.com
And I didn't add that one because the patch changed.
Should I send a new message with that Signed-off-by and Acked-by?
Regards,
Alexander

On 22.12.2010 12:04, Alexander Holler wrote:
gcc 4.5.1 seems to ignore (at least some) volatile definitions, avoid that as done in the kernel.
Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that gcc version to ignore the volatile type qualifier used e.g. in __arch_getl(). Anyway, using a definition as in the kernel headers avoids such optimizations when gcc 4.5.1 is used.
Maybe the headers as used in the current linux-kernel should be used, but to avoid large changes, I've just added a small change to the current headers.
I haven't added the definitions that use a memory barrier because I haven't found a place in the kernel where they are actually enabled (CONFIG_ARM_DMA_MEM_BUFFERABLE).
Signed-off-by: Alexander Holler holler@ahsoftware.de
Would you like to test the patch in the attachment? I named it 'v4'.
After some thinking and testing, it seems to me that the volatile optimization issue this patch is meant to fix only affects the readx() macros. So the idea is to drop all writex() changes made in v3 of this patch. By dropping the writex() changes, we would also drop all the issues we discussed, e.g. with the GCC statement-expression and the do-while workaround.
Thanks
Dirk
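
The thread does not spell out the write*() issues in detail, but one plausible difference between the two macro styles can be sketched as follows: a do { } while (0) macro is a statement and cannot be used where an expression is required, while a GCC statement-expression can. All names here (writel_stmt, writel_expr, fake_reg, write_both) are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

#define barrier()	__asm__ __volatile__ ("" : : : "memory")
#define put32(v, a)	(*(volatile uint32_t *)(a) = (v))

/* do-while style: a statement, so it cannot appear inside an expression. */
#define writel_stmt(v, c)	do { barrier(); put32(v, c); } while (0)

/* Statement-expression style (GNU extension): yields the written value. */
#define writel_expr(v, c) \
	({ uint32_t __v = (v); barrier(); put32(__v, c); __v; })

/* Hypothetical stand-in for a memory-mapped register. */
static uint32_t fake_reg;

uint32_t write_both(uint32_t a, uint32_t b)
{
	uint32_t echoed;

	writel_stmt(a, &fake_reg);		/* fine: used as a statement */
	/* echoed = writel_stmt(b, &fake_reg);	   would not compile */
	echoed = writel_expr(b, &fake_reg);	/* fine: used as an expression */
	return echoed + fake_reg;		/* b + b */
}
```

Code that previously used writel() inside a larger expression would stop compiling with the do-while form, which may be one of the problems referred to above.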

Hello,
On 01.01.2011 13:04, Dirk Behme wrote:
Would you like to test the patch in the attachment? I named it 'v4'.
After some thinking and testing, it seems to me that the volatile optimization issue this patch is meant to fix only affects the readx() macros. So the idea is to drop all writex() changes made in v3 of this patch. By dropping the writex() changes, we would also drop all the issues we discussed, e.g. with the GCC statement-expression and the do-while workaround.
I've come across a bug report which reads as if the problem might be fixed in gcc 4.5.2:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052
I will test gcc 4.5.2 in the next few days.
Besides that, I still think the correct solution would be to use the ARM headers from the current Linux kernel. The problem is that I don't know (and haven't looked up) the reasons for the changes to the ARM Linux headers currently found in U-Boot.
And because updating those headers might require some more changes in various other places in U-Boot, I think it would be good if one of the U-Boot ARM maintainers did that. I'm not that involved in U-Boot development, don't follow the mailing list closely, and therefore might miss necessary changes when taking the current ARM headers from the kernel and dropping them into U-Boot.
Regards,
Alexander

On 01.01.2011 18:52, Alexander Holler wrote:
I've come across a bug report which reads as if the problem might be fixed in gcc 4.5.2:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052
I will test gcc 4.5.2 in the next few days.
Have you been able to test v4 of the patch I sent with gcc 4.5.1?
Thanks
Dirk

On 01.01.2011 19:25, Dirk Behme wrote:
Have you been able to test v4 of the patch I sent with gcc 4.5.1?
No, sorry, I don't have a test case for consecutive write* and I will have to write one. I will do so when testing gcc 4.5.2 (sometime in the next few days).
Regards,
Alexander

On 01.01.2011 19:47, Alexander Holler wrote:
No, sorry, I don't have a test case for consecutive write* and I will have to write one.
?
If I remember correctly, the test case for this patch was compiling U-Boot with 4.5.1 and then checking
a) if it boots on Beagle (correct clock.c)
b) if NAND works ok (correct omap_gpmc.c)
?
Thanks
Dirk

On 01.01.2011 20:21, Dirk Behme wrote:
No, sorry, I don't have a test case for consecutive write* and I will have to write one.
?
If I remember correctly, the test case for this patch was compiling U-Boot with 4.5.1 and then checking
a) if it boots on Beagle (correct clock.c)
b) if NAND works ok (correct omap_gpmc.c)
?
No. Neither of those necessarily fails when the compiler merges consecutive write* into one write* because it ignores the volatile keyword. I've only found the problem with consecutive read* (in clock.c), but there might be problems with consecutive write* somewhere else too. So if you remove the change for write*, other problems might arise, and they might not be found just by booting a kernel. So I think it would be dangerous to remove the change for write* when using gcc 4.5.x.
And because the patch fixes only write* and read*, other code in U-Boot that uses volatile in another context might still fail. Therefore I vote for using the current kernel headers, where other things besides read* and write* use those barriers too.
Regards,
Alexander

On 02.01.2011 13:43, Alexander Holler wrote:
No. Neither of those necessarily fails when the compiler merges consecutive write* into one write* because it ignores the volatile keyword. I've only found the problem with consecutive read* (in clock.c), but there might be problems with consecutive write* somewhere else too. So if you remove the change for write*, other problems might arise, and they might not be found just by booting a kernel. So I think it would be dangerous to remove the change for write* when using gcc 4.5.x.
And because the patch fixes only write* and read*, other code in U-Boot that uses volatile in another context might still fail. Therefore I vote for using the current kernel headers, where other things besides read* and write* use those barriers too.
Just to understand correctly: are you saying that we should ignore your v3 patch
http://lists.denx.de/pipermail/u-boot/2010-December/084132.html
?
And that you didn't test the v4 patch
http://lists.denx.de/pipermail/u-boot/2011-January/084481.html
with the test you did in
http://lists.denx.de/pipermail/u-boot/2010-December/084134.html
("tested with both gcc 4.3.5 and gcc 4.5.1 using binutils 2.20.1") because you now think this test isn't sufficient?
Thanks
Dirk

On 02.01.2011 14:29, Dirk Behme wrote:
Just to understand correctly: are you saying that we should ignore your v3 patch
http://lists.denx.de/pipermail/u-boot/2010-December/084132.html
?
And that you didn't test the v4 patch
http://lists.denx.de/pipermail/u-boot/2011-January/084481.html
with the test you did in
http://lists.denx.de/pipermail/u-boot/2010-December/084134.html
("tested with both gcc 4.3.5 and gcc 4.5.1 using binutils 2.20.1") because you now think this test isn't sufficient?
Sorry, but I don't understand why you are assuming that the compiler will only use those (wrong) optimizations on reads and not writes.
If the compiler does the same wrong optimizations for writes (why not, if it ignores volatile), your v4 wouldn't fix that.
Regards,
Alexander

On 02.01.2011 22:00, Alexander Holler wrote:
Sorry, but I don't understand why you are assuming that the compiler will only use those (wrong) optimizations on reads and not writes.
If the compiler does the same wrong optimizations for writes (why not, if it ignores volatile), your v4 wouldn't fix that.
I've now done some more tests.
First, the bug is fixed in gcc 4.5.2.
And indeed, gcc 4.5.0 and gcc 4.5.1 seem to ignore volatile only for reading. At least two writel() calls are not merged into one, whether the plain volatile (as before) or the "__asm__ __volatile__ ("" : : : "memory")" is used.
Being kind of a defensive programmer, I would still prefer to have that __asm__ for write* too. That would at least protect us from a possible bug there too.
What makes me a bit nervous is that I don't have a clue how to write a test that checks whether volatile works (without looking at the generated output). Maybe others have that problem too, and therefore such a test doesn't exist in the gcc testsuite.
Regards,
Alexander
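
A run-time test on ordinary memory cannot tell whether a load was dropped, which is presumably why the check ends up being done on the generated output. A minimal sketch of a file for such an inspection might look like this; the function names are hypothetical. Compiling it with something like "gcc -O2 -S" and counting the load instructions in each function (e.g. ldr on ARM) would show whether the compiler honours volatile and the barrier.

```c
#include <stdint.h>

#define barrier()	__asm__ __volatile__ ("" : : : "memory")

/* Two loads through a volatile pointer: both must appear in the output. */
uint32_t read_twice_volatile(const volatile uint32_t *reg)
{
	uint32_t a = *reg;
	uint32_t b = *reg;

	return a + b;
}

/* Two plain loads separated by a compiler barrier: the barrier forces the
 * second load to be emitted again instead of reusing the first value. */
uint32_t read_twice_barrier(const uint32_t *reg)
{
	uint32_t a = *reg;

	barrier();
	uint32_t b = *reg;

	return a + b;
}

/* Two plain loads without a barrier: the compiler may merge them. */
uint32_t read_twice_plain(const uint32_t *reg)
{
	uint32_t a = *reg;
	uint32_t b = *reg;

	return a + b;
}
```

On normal memory all three functions return the same value; only the assembly differs, so this sketch supports manual inspection rather than an automated pass/fail test.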

Dear Alexander Holler,
In message 4D2B1D75.70809@ahsoftware.de you wrote:
Being kind of a defensive programmer, I would still prefer to have that __asm__ for write* too. That would at least protect us from a possible bug there too.
So why don't you simply test and, assuming it's working, ACK the patch I submitted yesterday? We should then be on the safe side and won't have to care about which mood the current compiler's optimizer might be in or what the POM is.
Best regards,
Wolfgang Denk

On 10.01.2011 16:05, Wolfgang Denk wrote:
Dear Alexander Holler,
In message 4D2B1D75.70809@ahsoftware.de you wrote:
Being kind of a defensive programmer, I would still prefer to have that __asm__ for write* too. That would at least protect us from a possible bug there too.
So why don't you simply test and, assuming it's working, ACK the patch I submitted yesterday? We should then be on the safe side and won't have to care about which mood the current compiler's optimizer might be in or what the POM is.
Sorry, I hadn't received your latest patch (mail) before I wrote the mail you are referencing.
I have updated my mail system at home (ARMv5 with 128 MB RAM), and the incoming queue, mainly filled by LKML, is still not completely processed. ~2000 messages (3 days) need some time to go through SpamAssassin on such low-end hardware. ;)
I've seen that you've switched from do {} while() to "something else", but I can't comment on that "something else". Because I've already switched to 4.5.2, I'll have to dig out a system where I still have 4.5.1 to test the problem that occurred with the write. If anybody else has already tested it, I'm fine with it.
Regards,
Alexander

Dear Dirk Behme,
In message 4D1F1841.5060508@googlemail.com you wrote:
Would you like to test the patch in the attachment? I named it 'v4'.
Please send patches inline.
After some thinking and testing, it seems to me that the volatile optimization issue this patch is meant to fix only affects the readx() macros. So the idea is to drop all writex() changes made in v3 of this patch. By dropping the writex() changes, we would also drop all the issues we discussed, e.g. with the GCC statement-expression and the do-while workaround.
This makes no sense. Even if we experience problems only with read*() at the moment, we should do the Right Thing (TM) and fix both the read*() and write*() functions.
Please have a look at the patch I just posted, http://patchwork.ozlabs.org/patch/78056/
Best regards,
Wolfgang Denk

Dear Wolfgang,
On 09.01.2011 23:25, Wolfgang Denk wrote:
Dear Dirk Behme,
In message 4D1F1841.5060508@googlemail.com you wrote:
Would you like to test the patch in the attachment? I named it 'v4'.
Please send patches inline.
After some thinking and testing, it seems to me that the volatile optimization issue this patch is meant to fix only affects the readx() macros. So the idea is to drop all writex() changes made in v3 of this patch. By dropping the writex() changes, we would also drop all the issues we discussed, e.g. with the GCC statement-expression and the do-while workaround.
This makes no sense. Even if we experience problems only with read*() at the moment, we should do the Right Thing (TM) and fix both the read*() and write*() functions.
The question I was thinking about with my patch was "what's Right Thing?" ;)
It's my understanding that we don't fix read*() and write*() because they are broken. We touch them to work around a broken tool chain.
We saw that this specific tool chain has issues with read*(). While working around this, we touched write*(), too. This was done in the wrong way. So while read*() was then fine, write*() was accidentally broken (with all tool chains). So we could either
(a) do write*() correctly, too (as you do in your patch below)
or
(b) just don't touch write*() as it isn't needed to work around the read*() tool chain issue (as I proposed in my patch v4)
Anyway:
Please have a look at the patch I just posted, http://patchwork.ozlabs.org/patch/78056/
I'm fine with that patch.
Thanks
Dirk

Dear Dirk Behme,
In message 4D2B3036.4010506@googlemail.com you wrote:
The question I was thinking about with my patch was "what's Right Thing?" ;)
The Right Thing is not to make specific assumptions about how the compiler might handle volatile pointers.
It's my understanding that we don't fix read*() and write*() because they are broken. We touch them to work around a broken tool chain.
No. Please re-read volatile-considered-harmful.txt in the linux/Documentation directory: "accessing I/O memory directly through pointers is frowned upon and does not work on all architectures. Those accessors are written to prevent unwanted optimization".
Best regards,
Wolfgang Denk
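
As an illustration of the accessor style recommended here, a driver-side polling loop might be sketched as follows. READY_BIT and wait_ready() are hypothetical, and readl() is a simplified host-side stand-in for the accessor from the patch (volatile load followed by a compiler barrier), not the U-Boot definition itself.

```c
#include <stdint.h>

#define dmb()		__asm__ __volatile__ ("" : : : "memory")

/* Simplified stand-in for the accessor: volatile load, then barrier. */
#define readl(c) \
	({ uint32_t __v = *(const volatile uint32_t *)(c); dmb(); __v; })

/* Hypothetical bit in a hypothetical status register. */
#define READY_BIT	(1u << 0)

/* Poll the status register through the accessor instead of dereferencing
 * a pointer directly. Returns 0 once READY_BIT is seen, or -1 if it is not
 * seen within max_polls reads. */
int wait_ready(const void *status_reg, unsigned int max_polls)
{
	while (max_polls--) {
		if (readl(status_reg) & READY_BIT)
			return 0;
	}
	return -1;
}
```

The driver code never sees the volatile qualifier; how the access is protected against unwanted optimization stays inside the accessor.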

From: Alexander Holler holler@ahsoftware.de
gcc 4.5.1 seems to ignore (at least some) volatile definitions, avoid that as done in the kernel.
Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that gcc version to ignore the volatile type qualifier used e.g. in __arch_getl(). Anyway, using a definition as in the kernel headers avoids such optimizations when gcc 4.5.1 is used.
Maybe the headers as used in the current linux-kernel should be used, but to avoid large changes, I've just added a small change to the current headers.
Signed-off-by: Alexander Holler holler@ahsoftware.de
Signed-off-by: Dirk Behme dirk.behme@googlemail.com
Signed-off-by: Wolfgang Denk wd@denx.de
Cc: Alessandro Rubini rubini-list@gnudd.com
---
 arch/arm/include/asm/io.h | 32 ++++++++++++++++++++------------
 1 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index ff1518e..3886f15 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -117,21 +117,29 @@ extern inline void __raw_readsl(unsigned int addr, void *data, int longlen)
 		*buf++ = __arch_getl(addr);
 }

-#define __raw_writeb(v,a) __arch_putb(v,a)
-#define __raw_writew(v,a) __arch_putw(v,a)
-#define __raw_writel(v,a) __arch_putl(v,a)
+#define __raw_writeb(v,a) __arch_putb(v,a)
+#define __raw_writew(v,a) __arch_putw(v,a)
+#define __raw_writel(v,a) __arch_putl(v,a)

-#define __raw_readb(a) __arch_getb(a)
-#define __raw_readw(a) __arch_getw(a)
-#define __raw_readl(a) __arch_getl(a)
+#define __raw_readb(a) __arch_getb(a)
+#define __raw_readw(a) __arch_getw(a)
+#define __raw_readl(a) __arch_getl(a)

-#define writeb(v,a) __arch_putb(v,a)
-#define writew(v,a) __arch_putw(v,a)
-#define writel(v,a) __arch_putl(v,a)
+/*
+ * TODO: The kernel offers some more advanced versions of barriers, it might
+ * have some advantages to use them instead of the simple one here.
+ */
+#define dmb() __asm__ __volatile__ ("" : : : "memory")
+#define __iormb() dmb()
+#define __iowmb() dmb()
+
+#define writeb(v,c) ({ __iowmb(); __arch_putb(v,c); v; })
+#define writew(v,c) ({ __iowmb(); __arch_putw(v,c); v; })
+#define writel(v,c) ({ __iowmb(); __arch_putl(v,c); v; })

-#define readb(a) __arch_getb(a)
-#define readw(a) __arch_getw(a)
-#define readl(a) __arch_getl(a)
+#define readb(c) ({ u8 __v = __arch_getb(c); __iormb(); __v; })
+#define readw(c) ({ u16 __v = __arch_getw(c); __iormb(); __v; })
+#define readl(c) ({ u32 __v = __arch_getl(c); __iormb(); __v; })

 /*
  * The compiler seems to be incapable of optimising constants
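
To see how the 32-bit macros from this patch behave, the following host-side sketch copies writel(), readl(), dmb(), __iormb() and __iowmb() from the diff and pairs them with stand-in definitions of __arch_getl() and __arch_putl(). Those two stand-ins are an assumption (plain volatile pointer accesses), and fake_ctrl, write_then_verify() and count_evaluations() are hypothetical names used only here. The second helper illustrates one property a reader may want to keep in mind: v appears twice in the write macros, so an argument with side effects is evaluated twice.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

/* Assumed stand-ins for the ARM helpers: plain volatile accesses. */
#define __arch_getl(a)     (*(volatile u32 *)(a))
#define __arch_putl(v, a)  (*(volatile u32 *)(a) = (v))

/* Copied from the patch. */
#define dmb()      __asm__ __volatile__ ("" : : : "memory")
#define __iormb()  dmb()
#define __iowmb()  dmb()

#define writel(v,c) ({ __iowmb(); __arch_putl(v,c); v; })
#define readl(c)    ({ u32 __v = __arch_getl(c); __iormb(); __v; })

static u32 fake_ctrl;   /* RAM standing in for a device register */

/* Returns 1 when the value yielded by writel() matches the read-back. */
static int write_then_verify(u32 value)
{
	u32 written = writel(value, &fake_ctrl);

	return written == readl(&fake_ctrl);
}

/* v is expanded twice inside writel(), so ++n runs twice. */
static u32 count_evaluations(void)
{
	u32 n = 0;

	writel(++n, &fake_ctrl);    /* stores 1, then evaluates ++n again */
	assert(fake_ctrl == 1u);
	return n;
}
```

A variant that first copies v into a temporary would evaluate it once; whether that matters depends on how callers actually use the macros.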

On 09.01.2011 23:19, Wolfgang Denk wrote:
From: Alexander Holler holler@ahsoftware.de
gcc 4.5.1 seems to ignore (at least some) volatile definitions, avoid that as done in the kernel.
Tested-by: Thomas Weber weber@corscience.de
on Devkit8000 with codesourcery arm2010.09 (gcc 4.5.1) and arm2010q1 (gcc 4.4.1)

On 09.01.2011 23:19, Wolfgang Denk wrote:
gcc 4.5.1 seems to ignore (at least some) volatile definitions, avoid that as done in the kernel.
Have had a look at the asm generated by gcc 4.5.1, looks good.
The wrong optimization in arch/arm/cpu/armv7/omap3/clock.c is gone, and the writeb in drivers/mtd/nand/omap_gpmc.c no longer has the problem it had with the v1 patch.
Regards,
Alexander

Dear Alexander Holler,
In message 4D2DCB18.20409@ahsoftware.de you wrote:
On 09.01.2011 23:19, Wolfgang Denk wrote:
gcc 4.5.1 seems to ignore (at least some) volatile definitions, avoid that as done in the kernel.
Have had a look at the asm generated by gcc 4.5.1, looks good.
The wrong optimization in arch/arm/cpu/armv7/omap3/clock.c is gone, and the writeb in drivers/mtd/nand/omap_gpmc.c no longer has the problem it had with the v1 patch.
Thanks - but please send a formal Acked-by: and/or Tested-by: .
Best regards,
Wolfgang Denk

On 12.01.2011 17:40, Wolfgang Denk wrote:
Dear Alexander Holler,
In message 4D2DCB18.20409@ahsoftware.de you wrote:
On 09.01.2011 23:19, Wolfgang Denk wrote:
gcc 4.5.1 seems to ignore (at least some) volatile definitions, avoid that as done in the kernel.
Have had a look at the asm generated by gcc 4.5.1, looks good.
The wrong optimization in arch/arm/cpu/armv7/omap3/clock.c is gone, and the writeb in drivers/mtd/nand/omap_gpmc.c no longer has the problem it had with the v1 patch.
Thanks - but please send a formal Acked-by: and/or Tested-by: .
Oh, as I'm still listed as the author, I thought that wasn't necessary.
I don't know if I should paste the whole patch (this is my first ack ;) ), but here are both:
Acked-by: Alexander Holler holler@ahsoftware.de
Tested-by: Alexander Holler holler@ahsoftware.de
Regards,
Alexander

On 12/01/2011 17:49, Alexander Holler wrote:
Signed-off-by: Alexander Holler holler@ahsoftware.de
Signed-off-by: Dirk Behme dirk.behme@googlemail.com
Signed-off-by: Wolfgang Denk wd@denx.de
Tested-by: Thomas Weber weber@corscience.de
Acked-by: Alexander Holler holler@ahsoftware.de
Tested-by: Alexander Holler holler@ahsoftware.de
Applied to u-boot-arm, thanks.
Regards,
participants (5)
- Albert ARIBAUD
- Alexander Holler
- Dirk Behme
- Thomas Weber
- Wolfgang Denk