[U-Boot] [PATCH] ARM: UniPhier: remove __packed that causes a problem on GCC 4.9

The DDR PHY training function, ddrphy_prepare_training() would not work if compiled with GCC 4.9.
The struct ddrphy (arch/arm/include/asm/arch-uniphier/ddrphy-regs.h) is specified with __packed because it represents a hardware register mapping, but it turned out to cause a problem on GCC 4.9.
If -mno-unaligned-access is specified (actually it is in arch/arm/cpu/armv7/config.mk), GCC 4.9 is aware of the __attribute__((packed)) and generates extra instructions to perform the memory access in a way that does not cause unaligned access. (Actually it is bogus here because the register base, the first argument of the ddrphy_prepare_training(), is always given with a 4-byte aligned address.)
Anyway, as a result, readl() / writel() is divided into byte-wise accesses. The problem is that this hardware only accepts 4-byte register access. Byte-wise accesses lead to unexpected behavior.
There are some options to avoid this problem.
[1] Remove -mno-unaligned-access [2] Add __aligned(4) along with __packed to struct ddrphy [3] Remove __packed from struct ddrphy
[1] solves the problem for ARMv7, but it does not for pre-ARMv6 and ARMv6-M architectures where -mno-unaligned-access is default. So, [1] does not seem reasonable in terms of code portability.
Both [2] and [3] work well, but [2] seems too much. All the members of struct ddrphy have the u32 type. No padding would be inserted even if __packed is dropped.
Signed-off-by: Masahiro Yamada yamada.m@jp.panasonic.com ---
arch/arm/include/asm/arch-uniphier/ddrphy-regs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/arch-uniphier/ddrphy-regs.h b/arch/arm/include/asm/arch-uniphier/ddrphy-regs.h index 484559c..6b7d600 100644 --- a/arch/arm/include/asm/arch-uniphier/ddrphy-regs.h +++ b/arch/arm/include/asm/arch-uniphier/ddrphy-regs.h @@ -72,7 +72,7 @@ struct ddrphy { u32 gtr; /* General Timing Register */ u32 rsv[3]; /* Reserved */ } dx[9]; -} __packed; +};
#endif /* __ASSEMBLY__ */

On Wed, Jan 07, 2015 at 07:41:38PM +0900, Masahiro Yamada wrote:
The DDR PHY training function, ddrphy_prepare_training() would not work if compiled with GCC 4.9.
The struct ddrphy (arch/arm/include/asm/arch-uniphier/ddrphy-regs.h) is specified with __packed because it represents a hardware register mapping, but it turned out to cause a problem on GCC 4.9.
If -mno-unaligned-access is specified (actually it is in arch/arm/cpu/armv7/config.mk), GCC 4.9 is aware of the __attribute__((packed)) and generates extra instructions to perform the memory access in a way that does not cause unaligned access. (Actually it is bogus here because the register base, the first argument of the ddrphy_prepare_training(), is always given with a 4-byte aligned address.)
Anyway, as a result, readl() / writel() is divided into byte-wise accesses. The problem is that this hardware only accepts 4-byte register access. Byte-wise accesses lead to unexpected behavior.
There are some options to avoid this problem.
[1] Remove -mno-unaligned-access [2] Add __aligned(4) along with __packed to struct ddrphy [3] Remove __packed from struct ddrphy
[1] solves the problem for ARMv7, but it does not for pre-ARMv6 and ARMv6-M architectures where -mno-unaligned-access is default. So, [1] does not seem reasonable in terms of code portability.
Both [2] and [3] work well, but [2] seems too much. All the members of struct ddrphy have the u32 type. No padding would be inserted even if __packed is dropped.
Signed-off-by: Masahiro Yamada yamada.m@jp.panasonic.com
I wanted to think about this for a minute. I argued with myself a bit about [2] being the best choice, and lost. I'm fairly sure that __packed on an struct of all u32 (on a 32bit platform) is in fact the wrong thing to do so yes, [3] is right.
Reviewed-by: Tom Rini trini@ti.com

On Wed, 7 Jan 2015 14:00:05 -0500 Tom Rini trini@ti.com wrote:
On Wed, Jan 07, 2015 at 07:41:38PM +0900, Masahiro Yamada wrote:
The DDR PHY training function, ddrphy_prepare_training() would not work if compiled with GCC 4.9.
The struct ddrphy (arch/arm/include/asm/arch-uniphier/ddrphy-regs.h) is specified with __packed because it represents a hardware register mapping, but it turned out to cause a problem on GCC 4.9.
If -mno-unaligned-access is specified (actually it is in arch/arm/cpu/armv7/config.mk), GCC 4.9 is aware of the __attribute__((packed)) and generates extra instructions to perform the memory access in a way that does not cause unaligned access. (Actually it is bogus here because the register base, the first argument of the ddrphy_prepare_training(), is always given with a 4-byte aligned address.)
Anyway, as a result, readl() / writel() is divided into byte-wise accesses. The problem is that this hardware only accepts 4-byte register access. Byte-wise accesses lead to unexpected behavior.
There are some options to avoid this problem.
[1] Remove -mno-unaligned-access [2] Add __aligned(4) along with __packed to struct ddrphy [3] Remove __packed from struct ddrphy
[1] solves the problem for ARMv7, but it does not for pre-ARMv6 and ARMv6-M architectures where -mno-unaligned-access is default. So, [1] does not seem reasonable in terms of code portability.
Both [2] and [3] work well, but [2] seems too much. All the members of struct ddrphy have the u32 type. No padding would be inserted even if __packed is dropped.
Signed-off-by: Masahiro Yamada yamada.m@jp.panasonic.com
I wanted to think about this for a minute. I argued with myself a bit about [2] being the best choice, and lost. I'm fairly sure that __packed on an struct of all u32 (on a 32bit platform) is in fact the wrong thing to do so yes, [3] is right.
Reviewed-by: Tom Rini trini@ti.com
There is another (and I think the best) option:
[4] Sync writeb(), writew(), writel(), readlb(), readw(), readl() with Linux
This pitfall does not occur on Linux because ARM Linux hard-codes writel()/readl() etc.
Linux:
#define __raw_writel __raw_writel static inline void __raw_writel(u32 val, volatile void __iomem *addr) { asm volatile("str %1, %0" : "+Qo" (*(volatile u32 __force *)addr) : "r" (val)); }
U-Boot:
#define __arch_putl(v,a) (*(volatile unsigned int *)(a) = (v))
#define __raw_writel(v,a) __arch_putl(v,a)
On Linux, writel() is always converted to "str" instruction.
On U-Boot, it depends on compiler's decesion.
Best Regards Masahiro Yamada

2015-01-07 19:41 GMT+09:00 Masahiro Yamada yamada.m@jp.panasonic.com:
The DDR PHY training function, ddrphy_prepare_training() would not work if compiled with GCC 4.9.
The struct ddrphy (arch/arm/include/asm/arch-uniphier/ddrphy-regs.h) is specified with __packed because it represents a hardware register mapping, but it turned out to cause a problem on GCC 4.9.
If -mno-unaligned-access is specified (actually it is in arch/arm/cpu/armv7/config.mk), GCC 4.9 is aware of the __attribute__((packed)) and generates extra instructions to perform the memory access in a way that does not cause unaligned access. (Actually it is bogus here because the register base, the first argument of the ddrphy_prepare_training(), is always given with a 4-byte aligned address.)
Anyway, as a result, readl() / writel() is divided into byte-wise accesses. The problem is that this hardware only accepts 4-byte register access. Byte-wise accesses lead to unexpected behavior.
There are some options to avoid this problem.
[1] Remove -mno-unaligned-access [2] Add __aligned(4) along with __packed to struct ddrphy [3] Remove __packed from struct ddrphy
[1] solves the problem for ARMv7, but it does not for pre-ARMv6 and ARMv6-M architectures where -mno-unaligned-access is default. So, [1] does not seem reasonable in terms of code portability.
Both [2] and [3] work well, but [2] seems too much. All the members of struct ddrphy have the u32 type. No padding would be inserted even if __packed is dropped.
Signed-off-by: Masahiro Yamada yamada.m@jp.panasonic.com
Applied to u-boot-uniphier/master
participants (3)
-
Masahiro YAMADA
-
Masahiro Yamada
-
Tom Rini