[U-Boot] [PATCH v2] arm: Switch to -mno-unaligned-access when supported by the compiler

When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
--- Changes in v2: - Drop references to README.arm-unaligned-accesses from arch/arm/lib/interrupts.c and top-level README --- README | 2 +- arch/arm/cpu/armv7/config.mk | 6 +- arch/arm/cpu/armv8/config.mk | 5 +- arch/arm/lib/interrupts.c | 1 - common/Makefile | 4 -- doc/README.arm-unaligned-accesses | 122 ------------------------------------- fs/ubifs/Makefile | 3 - lib/Makefile | 3 - 8 files changed, 6 insertions(+), 140 deletions(-) delete mode 100644 doc/README.arm-unaligned-accesses
diff --git a/README b/README index fe48ccd..1de9162 100644 --- a/README +++ b/README @@ -1725,7 +1725,7 @@ CBFS (Coreboot Filesystem) support
If this option is set, then U-Boot will prevent the environment variable "splashimage" from being set to a problematic address - (see README.displaying-bmps and README.arm-unaligned-accesses). + (see README.displaying-bmps). This option is useful for targets where, due to alignment restrictions, an improperly aligned BMP image will cause a data abort. If you think you will not have problems with unaligned diff --git a/arch/arm/cpu/armv7/config.mk b/arch/arm/cpu/armv7/config.mk index 38b7c40..ae4427c 100644 --- a/arch/arm/cpu/armv7/config.mk +++ b/arch/arm/cpu/armv7/config.mk @@ -10,9 +10,11 @@ PF_CPPFLAGS_ARMV7 := $(call cc-option, -march=armv7-a, -march=armv5) PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_ARMV7)
-# SEE README.arm-unaligned-accesses +# On supported platforms we clear the bit which allows for hardware +# unaligned access so we must tell the compiler so it can make the correct +# decision. PF_NO_UNALIGNED := $(call cc-option, -mno-unaligned-access,) -PLATFORM_NO_UNALIGNED := $(PF_NO_UNALIGNED) +PLATFORM_CPPFLAGS += $(PF_NO_UNALIGNED)
ifneq ($(CONFIG_IMX_CONFIG),) ifdef CONFIG_SPL diff --git a/arch/arm/cpu/armv8/config.mk b/arch/arm/cpu/armv8/config.mk index 027a68c..f5b9559 100644 --- a/arch/arm/cpu/armv8/config.mk +++ b/arch/arm/cpu/armv8/config.mk @@ -6,10 +6,7 @@ # PLATFORM_RELFLAGS += -fno-common -ffixed-x18
-# SEE README.arm-unaligned-accesses -PF_NO_UNALIGNED := $(call cc-option, -mstrict-align) -PLATFORM_NO_UNALIGNED := $(PF_NO_UNALIGNED) - PF_CPPFLAGS_ARMV8 := $(call cc-option, -march=armv8-a) +PF_NO_UNALIGNED := $(call cc-option, -mstrict-align) PLATFORM_CPPFLAGS += $(PF_CPPFLAGS_ARMV8) PLATFORM_CPPFLAGS += $(PF_NO_UNALIGNED) diff --git a/arch/arm/lib/interrupts.c b/arch/arm/lib/interrupts.c index 603bf14..b4e331e 100644 --- a/arch/arm/lib/interrupts.c +++ b/arch/arm/lib/interrupts.c @@ -153,7 +153,6 @@ void do_prefetch_abort (struct pt_regs *pt_regs)
void do_data_abort (struct pt_regs *pt_regs) { - printf ("data abort\n\n MAYBE you should read doc/README.arm-unaligned-accesses\n\n"); show_regs (pt_regs); bad_mode (); } diff --git a/common/Makefile b/common/Makefile index a83246e..c06d307 100644 --- a/common/Makefile +++ b/common/Makefile @@ -239,7 +239,3 @@ $(obj)env_embedded.o: $(src)env_embedded.c $(CC) $(AFLAGS) -Wa,--no-warn \ -DENV_CRC=$(shell $(obj)../tools/envcrc) \ -c -o $@ $(src)env_embedded.c - -# SEE README.arm-unaligned-accesses -$(obj)hush.o: CFLAGS += $(PLATFORM_NO_UNALIGNED) -$(obj)fdt_support.o: CFLAGS += $(PLATFORM_NO_UNALIGNED) diff --git a/doc/README.arm-unaligned-accesses b/doc/README.arm-unaligned-accesses deleted file mode 100644 index c37d135..0000000 --- a/doc/README.arm-unaligned-accesses +++ /dev/null @@ -1,122 +0,0 @@ -If you are reading this because of a data abort: the following MIGHT -be relevant to your abort, if it was caused by an alignment violation. -In order to determine this, use the PC from the abort dump along with -an objdump -s -S of the u-boot ELF binary to locate the function where -the abort happened; then compare this function with the examples below. -If they match, then you've been hit with a compiler generated unaligned -access, and you should rewrite your code or add -mno-unaligned-access -to the command line of the offending file. - -Note that the PC shown in the abort message is relocated. In order to -be able to match it to an address in the ELF binary dump, you will need -to know the relocation offset. If your target defines CONFIG_CMD_BDI -and if you can get to the prompt and enter commands before the abort -happens, then command "bdinfo" will give you the offset. Otherwise you -will need to try a build with DEBUG set, which will display the offset, -or use a debugger and set a breakpoint at relocate_code() to see the -offset (passed as an argument). - -* - -Since U-Boot runs on a variety of hardware, some only able to perform -unaligned accesses with a strong penalty, some unable to perform them -at all, the policy regarding unaligned accesses is to not perform any, -unless absolutely necessary because of hardware or standards. - -Also, on hardware which permits it, the core is configured to throw -data abort exceptions on unaligned accesses in order to catch these -unallowed accesses as early as possible. - -Until version 4.7, the gcc default for performing unaligned accesses -(-mno-unaligned-access) is to emulate unaligned accesses using aligned -loads and stores plus shifts and masks. Emulated unaligned accesses -will not be caught by hardware. These accesses may be costly and may -be actually unnecessary. In order to catch these accesses and remove -or optimize them, option -munaligned-access is explicitly set for all -versions of gcc which support it. - -From gcc 4.7 onward starting at armv7 architectures, the default for -performing unaligned accesses is to use unaligned native loads and -stores (-munaligned-access), because the cost of unaligned accesses -has dropped on armv7 and beyond. This should not affect U-Boot's -policy of controlling unaligned accesses, however the compiler may -generate uncontrolled unaligned accesses on its own in at least one -known case: when declaring a local initialized char array, e.g. - -function foo() -{ - char buffer[] = "initial value"; -/* or */ - char buffer[] = { 'i', 'n', 'i', 't', 0 }; - ... -} - -Under -munaligned-accesses with optimizations on, this declaration -causes the compiler to generate native loads from the literal string -and native stores to the buffer, and the literal string alignment -cannot be controlled. If it is misaligned, then the core will throw -a data abort exception. - -Quite probably the same might happen for 16-bit array initializations -where the constant is aligned on a boundary which is a multiple of 2 -but not of 4: - -function foo() -{ - u16 buffer[] = { 1, 2, 3 }; - ... -} - -The long term solution to this issue is to add an option to gcc to -allow controlling the general alignment of data, including constant -initialization values. - -However this will only apply to the version of gcc which will have such -an option. For other versions, there are four workarounds: - -a) Enforce as a rule that array initializations as described above - are forbidden. This is generally not acceptable as they are valid, - and usual, C constructs. The only case where they could be rejected - is when they actually equate to a const char* declaration, i.e. the - array is initialized and never modified in the function's scope. - -b) Drop the requirement on unaligned accesses at least for ARMv7, - i.e. do not throw a data abort exception upon unaligned accesses. - But that will allow adding badly aligned code to U-Boot, only for - it to fail when re-used with a stricter target, possibly once the - bad code is already in mainline. - -c) Relax the -munaligned-access rule globally. This will prevent native - unaligned accesses of course, but that will also hide any bug caused - by a bad unaligned access, making it much harder to diagnose it. It - is actually what already happens when building ARM targets with a - pre-4.7 gcc, and it may actually already hide some bugs yet unseen - until the target gets compiled with -munaligned-access. - -d) Relax the -munaligned-access rule only for for files susceptible to - the local initialized array issue and for armv7 architectures and - beyond. This minimizes the quantity of code which can hide unwanted - misaligned accesses. - -The option retained is d). - -Considering that actual occurrences of the issue are rare (as of this -writing, 5 files out of 7840 in U-Boot, or .3%, contain an initialized -local char array which cannot actually be replaced with a const char*), -contributors should not be required to systematically try and detect -the issue in their patches. - -Detecting files susceptible to the issue can be automated through a -filter installed as a hook in .git which recognizes local char array -initializations. Automation should err on the false positive side, for -instance flagging non-local arrays as if they were local if they cannot -be told apart. - -In any case, detection shall not prevent committing the patch, but -shall pre-populate the commit message with a note to the effect that -this patch contains an initialized local char or 16-bit array and thus -should be protected from the gcc 4.7 issue. - -Upon a positive detection, either $(PLATFORM_NO_UNALIGNED) should be -added to CFLAGS for the affected file(s), or if the array is a pseudo -const char*, it should be replaced by an actual one. diff --git a/fs/ubifs/Makefile b/fs/ubifs/Makefile index 389b0e3..8c8c6ac 100644 --- a/fs/ubifs/Makefile +++ b/fs/ubifs/Makefile @@ -13,6 +13,3 @@ obj-y := ubifs.o io.o super.o sb.o master.o lpt.o obj-y += lpt_commit.o scan.o lprops.o obj-y += tnc.o tnc_misc.o debug.o crc16.o budget.o obj-y += log.o orphan.o recovery.o replay.o - -# SEE README.arm-unaligned-accesses -$(obj)super.o: CFLAGS += $(PLATFORM_NO_UNALIGNED) diff --git a/lib/Makefile b/lib/Makefile index 760340f..dedb97b 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -65,6 +65,3 @@ obj-y += vsprintf.o obj-$(CONFIG_RANDOM_MACADDR) += rand.o obj-$(CONFIG_BOOTP_RANDOM_DELAY) += rand.o obj-$(CONFIG_CMD_LINK_LOCAL) += rand.o - -# SEE README.arm-unaligned-accesses -$(obj)bzlib.o: CFLAGS += $(PLATFORM_NO_UNALIGNED)

Tom Rini trini@ti.com writes:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
Acked-by: Mans Rullgard mans@mansr.com

Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Amicalement,

On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong. We aren't being clever and making problems that would appear on armv5 and lower (or non-ARM never allows unaligned access platforms) problems to appear on more common armv7 platforms. We're telling the compiler it's OK to do one thing when it's not and then getting annoying problems such as the EFI partition one where the compiler looks at everything, says we can do $this and then fails at runtime because we lied to it. The whole splashguard set of options is another place where I believe we've shot ourselves in the foot, quite likely.

Hi Tom,
On Mon, 10 Feb 2014 08:21:39 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong.
I am quite prepared to discuss whether we chose wrong or right, and to change my mind when the conditions are right, but I'll need more than such a short and simple statement. :)
We aren't being clever and making problems that would appear on armv5 and lower (or non-ARM never allows unaligned access platforms) problems to appear on more common armv7 platforms.
Agreed that we are making problems appear on ARMv7 which are not that much of an issue on ARMv7, but are a true issue on non-ARMv7 arches; that *is* the intent. We want to know as early as possible when some code which runs ok on unaligned-access-friendly platforms such as ARMv7 might cause trouble on unaligned-access-adverse platforms later, once it gets used there too.
We're telling the compiler it's OK to do one thing when it's not and then getting annoying problems such as the EFI partition one where the compiler looks at everything, says we can do $this and then fails at runtime because we lied to it. The whole splashguard set of options is another place where I believe we've shot ourselves in the foot, quite likely.
Ok, so the cause of this patch is not the apparent contradiction between the compiler and hardware setting per se; it is that the EFI code has issues which make it susceptible to crash on unaligned-access-adverse platforms.
This means the trap has worked as expected and has caught some code which does unaligned accesses. Let's analyze it: either we'll conclude it can and should be fixed through e.g. ad hoc unaligned access macros or we'll conclude it can't and we'll add -mno-unaligned-access to the files which can't work otherwise.
Amicalement,

Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Tom,
On Mon, 10 Feb 2014 08:21:39 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong.
I am quite prepared to discuss whether we chose wrong or right, and to change my mind when the conditions are right, but I'll need more than such a short and simple statement. :)
I already gave you a detailed explanation some months ago. You refused to read it.

Måns Rullgård mans@mansr.com writes:
Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Tom,
On Mon, 10 Feb 2014 08:21:39 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong.
I am quite prepared to discuss whether we chose wrong or right, and to change my mind when the conditions are right, but I'll need more than such a short and simple statement. :)
I already gave you a detailed explanation some months ago. You refused to read it.
For reference: http://article.gmane.org/gmane.comp.boot-loaders.u-boot/171876

Hi Måns,
On Mon, 10 Feb 2014 15:14:49 +0000, Måns Rullgård mans@mansr.com wrote:
Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Tom,
On Mon, 10 Feb 2014 08:21:39 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong.
I am quite prepared to discuss whether we chose wrong or right, and to change my mind when the conditions are right, but I'll need more than such a short and simple statement. :)
I already gave you a detailed explanation some months ago. You refused to read it.
I can hardly have "refused to read" a message which I *answered*, laid out how I thought the issue should be solved... and got no answer after this.
Now, are we going to discuss the technical issue or is this going to go debian-TC -- which I wouldn't see the point of.
Amicalement,

Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Måns,
On Mon, 10 Feb 2014 15:14:49 +0000, Måns Rullgård mans@mansr.com wrote:
Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Tom,
On Mon, 10 Feb 2014 08:21:39 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong.
I am quite prepared to discuss whether we chose wrong or right, and to change my mind when the conditions are right, but I'll need more than such a short and simple statement. :)
I already gave you a detailed explanation some months ago. You refused to read it.
I can hardly have "refused to read" a message which I *answered*, laid out how I thought the issue should be solved... and got no answer after this.
In your reply, you called the important parts of my explanation irrelevant. That's more or less the same thing.

On Mon, Feb 10, 2014 at 05:12:24PM +0100, Albert ARIBAUD wrote:
Hi Måns,
On Mon, 10 Feb 2014 15:14:49 +0000, Måns Rullgård mans@mansr.com wrote:
Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Tom,
On Mon, 10 Feb 2014 08:21:39 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong.
I am quite prepared to discuss whether we chose wrong or right, and to change my mind when the conditions are right, but I'll need more than such a short and simple statement. :)
I already gave you a detailed explanation some months ago. You refused to read it.
I can hardly have "refused to read" a message which I *answered*, laid out how I thought the issue should be solved... and got no answer after this.
Now, are we going to discuss the technical issue or is this going to go debian-TC -- which I wouldn't see the point of.
Well, here's the point that I haven't seen an answer to. If we tell the compiler "you may choose to use unaligned accesses as an optimization, we support this", the compiler says "OK, I shall do that", and then we fail at run time because we don't actually allow the unaligned access, how is this not a problem on our end for the first part of the equation, keeping in mind that the real world is poorly designed and when we write code to this reality the compiler does the correct thing in all cases (or it's a compiler bug).

Hi Tom,
On Mon, 10 Feb 2014 11:24:03 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 05:12:24PM +0100, Albert ARIBAUD wrote:
Hi Måns,
On Mon, 10 Feb 2014 15:14:49 +0000, Måns Rullgård mans@mansr.com wrote:
Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Tom,
On Mon, 10 Feb 2014 08:21:39 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
> When we tell the compiler to optimize for ARMv7 it assumes a default of > unaligned accesses being supported at the hardware level and can make > use of this to perform what it deems as an optimization in any case, > including allowing for data to become unaligned. We explicitly disallow > this hardware feature so we must tell the compiler. > > Cc: Albert ARIBAUD albert.u.boot@aribaud.net > Cc: Mans Rullgard mans@mansr.com > Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong.
I am quite prepared to discuss whether we chose wrong or right, and to change my mind when the conditions are right, but I'll need more than such a short and simple statement. :)
I already gave you a detailed explanation some months ago. You refused to read it.
I can hardly have "refused to read" a message which I *answered*, laid out how I thought the issue should be solved... and got no answer after this.
Now, are we going to discuss the technical issue or is this going to go debian-TC -- which I wouldn't see the point of.
Well, here's the point that I haven't seen an answer to. If we tell the compiler "you may choose to use unaligned accesses as an optimization, we support this", the compiler says "OK, I shall do that", and then we fail at run time because we don't actually allow the unaligned access, how is this not a problem on our end for the first part of the equation, keeping in mind that the real world is poorly designed and when we write code to this reality the compiler does the correct thing in all cases (or it's a compiler bug).
Both in the thread you are referring to, and in this current thread, I have already explained why we allow native unaligned accesses in the compiler and prevent them in the hardware, and how this may seem contradictory but is not ; but I'm ok with explaining it again, so here comes.
U-Boot runs on all sorts of hardware. Some hardware can perform native unaligned accesses without any penalty; some can, but with a penalty; some just can't.
Therefore, except for some parts of it which are specific to a given architecture, the U-Boot source code should not assume that native unaligned accesses are possible. In fact, it should assume they are impossible.
Now if we restrict ourselves to a subset of ARM architectures, then native unaligned accesses could be used. And if we restrict ourselves to ARMv7, they are almost costless.
So on ARMv7 we could write code which does not care about alignment because native unaligned accesses are just as cheap as aligned ones.
The problem is, since U-Boot runs on a lot of platforms, such alignment-agnostic code might end up built and run on hardware which does not allow native unaligned accesses. Granted, this hardware will detect it; but this might happen quite some time after the code was pulled in.
Therefore, we should try and detect unaligned accesses as early as possible.
To perform such a detection, there is a bit in ARM architectures which, when set, causes native unaligned accesses to trap, allowing us to see them as early as possible, which is good: we want to catch such accesses.
Now, there is also a compiler flag related to alignments, and that is -m[no-]unaligned-access. Its effect is not to allow or disallow unaligned accesses, but to allow or disallow *native* unaligned accesses. If -mno-unaligned-access is in effect, then native unaligned accesses will be replaced by software unaligned accesses, e.g. an even-address dword access will be actually performed as a combination of byte and word accesses, all aligned. This will prevent the hardware from detecting the dword unaligned access, which is bad: we don't want to miss such accesses.
*That* is the reason for forbidding unaligned accesses at the hardware level while enabling them at the compiler level: it is the only combination which ensures that unsuspected unaligned accesses are detected at run time.
Now back to your question, "how is this not a problem on our end for the first part of the equation":
- first, if "the first part of the equation" means "the compiler setting" as opposed to the hardware setting, then the question fails to realize that we don't (and should not) consider the compiler and hardware settings separately; they work in *combination*.
- second, assuming the question is "how is it not a problem on our end that some code traps due to the combined hardware and compiler settings", the answer is: because the setting was not designed to catch *ARMv7* issues; it was designed to catch *U-Boot* issues. In other words, such traps show that there is code which won't work elsewhere than on ARMv7-like hardware which does not care about alignment.
The only drawback of this setup is that while some code does unneeded unaligned accesses (and is thus obviously wrong), some other code *requires* unaligned accesses (because of standards, or of hardware). Such code will trap... but *only* if it performs unaligned accesses *natively*, which it should not since it might run on a hardware not capable of native unaligned accesses.
That is why I consider that the fault is in the trapped software, not in the trap. The solution is to make the software use software, not native, unaligned accesses. The exact solution depends on whether the code has only a few such unaligned accesses (in which case we should use explicit unaligned access macros) or many (in which case, for the file considered, we can enable -mno-unaligned-access). You'll find instances of both in the U-Boot code.
Therefore:
- I am ok with -mno-unaligned-access applied to files which *require* unaligned access and where individual access macros would be impractical.
- I am NOT OK with blanket -mno-unaligned-access applied on a file where individual macros are feasible, and
- I am certainly NOT OK with a blanket -mno-unaligned-access on all code and the removal of the whole ARM misalignment detection setup.
Regarding the EFI code, there was a patch submitted (see the thread you have pointed me to) which had the proper unaligned access macros and thus did not require -mno-unaligned-access for this file, and certainly does not require it for the whole of ARM.
Amicalement,

Dear Albert,
In message 20140210182646.2de92810@lilith you wrote: ...
- first, if "the first part of the equation" means "the compiler setting" as opposed to the hardware setting, then the question fails to realize that we don't (and should not) consider the compiler and hardware settings separately; they work in *combination*.
Right, and we have full control over both sides of this combination: we set the compiler options in the Makefiles, and adjust the hardware selections in the code.
- second, assuming the question is "how is it not a problem on our end that some code traps due to the combined hardware and compiler settings", the answer is: because the setting was not designed to catch *ARMv7* issues; it was designed to catch *U-Boot* issues. In other words, such traps show that there is code which won't work elsewhere than on ARMv7-like hardware which does not care about alignment.
Full agreement. It is wrong to write code with only the feature set off a specific architecture in mind.
That is why I consider that the fault is in the trapped software, not in the trap. The solution is to make the software use software, not native, unaligned accesses. The exact solution depends on whether the code has only a few such unaligned accesses (in which case we should use explicit unaligned access macros) or many (in which case, for the file considered, we can enable -mno-unaligned-access). You'll find instances of both in the U-Boot code.
I agree mostly here - except that I tend to be even more radical: if we need to enable -mno-unaligned-access, then the code is inherently non-portable and should rather be redesigned / rewritten.
- I am ok with -mno-unaligned-access applied to files which *require* unaligned access and where individual access macros would be impractical.
I can live with this if we raise the bar sufficiently high so that only very few exceptions will be made. I would not like to see this enabled on tons of files.
- I am NOT OK with blanket -mno-unaligned-access applied on a file
where individual macros are feasible, and
Full agreement. But even in this case we should first consider if the code can / should not rather be rewitten to avoid the problem alltogether.
- I am certainly NOT OK with a blanket -mno-unaligned-access on all code and the removal of the whole ARM misalignment detection setup.
I agree with Albert here.
Thanks for the detailed explanation, bte.
Best regards,
Wolfgang Denk

On Mon, Feb 10, 2014 at 06:26:46PM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Mon, 10 Feb 2014 11:24:03 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 05:12:24PM +0100, Albert ARIBAUD wrote:
Hi Måns,
On Mon, 10 Feb 2014 15:14:49 +0000, Måns Rullgård mans@mansr.com wrote:
Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Tom,
On Mon, 10 Feb 2014 08:21:39 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote: > Hi Tom, > > On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote: > > > When we tell the compiler to optimize for ARMv7 it assumes a default of > > unaligned accesses being supported at the hardware level and can make > > use of this to perform what it deems as an optimization in any case, > > including allowing for data to become unaligned. We explicitly disallow > > this hardware feature so we must tell the compiler. > > > > Cc: Albert ARIBAUD albert.u.boot@aribaud.net > > Cc: Mans Rullgard mans@mansr.com > > Signed-off-by: Tom Rini trini@ti.com > > NAK -- the discrepancy between the compiler being told to allow native > unaligned accesses while at the same time telling the hardware to trap > them is conscious and voluntary. It was chosen to help detect unaligned > accesses which are rarely necessary and can be explicitly performed by > software on a case by case basis. > > If and when a specific file requires unaligned access which cannot be > made by some other mean than enabling -mno-unaligned-access, then this > file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong.
I am quite prepared to discuss whether we chose wrong or right, and to change my mind when the conditions are right, but I'll need more than such a short and simple statement. :)
I already gave you a detailed explanation some months ago. You refused to read it.
I can hardly have "refused to read" a message which I *answered*, laid out how I thought the issue should be solved... and got no answer after this.
Now, are we going to discuss the technical issue or is this going to go debian-TC -- which I wouldn't see the point of.
Well, here's the point that I haven't seen an answer to. If we tell the compiler "you may choose to use unaligned accesses as an optimization, we support this", the compiler says "OK, I shall do that", and then we fail at run time because we don't actually allow the unaligned access, how is this not a problem on our end for the first part of the equation, keeping in mind that the real world is poorly designed and when we write code to this reality the compiler does the correct thing in all cases (or it's a compiler bug).
Both in the thread you are referring to, and in this current thread, I have already explained why we allow native unaligned accesses in the compiler and prevent them in the hardware, and how this may seem contradictory but is not ; but I'm ok with explaining it again, so here comes.
U-Boot runs on all sorts of hardware. Some hardware can perform native unaligned accesses without any penalty; some can, but with a penalty; some just can't.
Therefore, except for some parts of it which are specific to a given architecture, the U-Boot source code should not assume that native unaligned accesses are possible. In fact, it should assume they are impossible.
Now if we restrict ourselves to a subset of ARM architectures, then native unaligned accesses could be used. And if we restrict ourselves to ARMv7, they are almost costless.
So on ARMv7 we could write code which does not care about alignment because native unaligned accesses are just as cheap as aligned ones.
The problem is, since U-Boot runs on a lot of platforms, such alignment-agnostic code might end up built and run on hardware which does not allow native unaligned accesses. Granted, this hardware will detect it; but this might happen quite some time after the code was pulled in.
Right.
Therefore, we should try and detect unaligned accesses as early as possible.
Then what I'm saying is that our proposal for catching them isn't valid, which I'll get to.
To perform such a detection, there is a bit in ARM architectures which, when set, causes native unaligned accesses to trap, allowing us to see them as early as possible, which is good: we want to catch such accesses.
Now, there is also a compiler flag related to alignments, and that is -m[no-]unaligned-access. Its effect is not to allow or disallow unaligned accesses, but to allow or disallow *native* unaligned accesses. If -mno-unaligned-access is in effect, then native unaligned accesses will be replaced by software unaligned accesses, e.g. an even-address dword access will be actually performed as a combination of byte and word accesses, all aligned. This will prevent the hardware from detecting the dword unaligned access, which is bad: we don't want to miss such accesses.
Then gcc has a bug and you need to convince them to fix it. What gcc does, as Mans has explained, and this invalidates the "lets catch unaligned access problems" notion, is for ARMv6 and higher say "we assume by default the hardware can perform native unaligned access, so make use of this in our optimization choices".
*That* is the reason for forbidding unaligned accesses at the hardware level while enabling them at the compiler level: it is the only combination which ensures that unsuspected unaligned accesses are detected at run time.
And it has the unintended side-effect of generating native unaligned accesses in cases where we have annotated the sources correctly and when native unaligned access would be disallowed, the compiler would work correctly.
Now back to your question, "how is this not a problem on our end for the first part of the equation":
- first, if "the first part of the equation" means "the compiler setting" as opposed to the hardware setting, then the question fails to realize that we don't (and should not) consider the compiler and hardware settings separately; they work in *combination*.
That they work in combination is my point. We're saying -march=armv7a -munaligned-access (the latter implicitly as it is the default from the compiler) and _NOT_ honoring the requirements.
- second, assuming the question is "how is it not a problem on our end that some code traps due to the combined hardware and compiler settings", the answer is: because the setting was not designed to catch *ARMv7* issues; it was designed to catch *U-Boot* issues. In other words, such traps show that there is code which won't work elsewhere than on ARMv7-like hardware which does not care about alignment.
But this part isn't true. Or rather, it is (or may, at the whim of the compiler) catching every case where we say __attribute__((packed)) and then don't follow up ensuring that every access is then fixed up by hand, rather than letting the compiler do it.
We've essentially picked "lets blow things up at times" over "lets keep an eye out for __packed being used in code, and be careful there".
The only drawback of this setup is that while some code does unneeded unaligned accesses (and is thus obviously wrong), some other code *requires* unaligned accesses (because of standards, or of hardware). Such code will trap... but *only* if it performs unaligned accesses *natively*, which it should not since it might run on a hardware not capable of native unaligned accesses.
That is why I consider that the fault is in the trapped software, not in the trap. The solution is to make the software use software, not native, unaligned accesses. The exact solution depends on whether the code has only a few such unaligned accesses (in which case we should use explicit unaligned access macros) or many (in which case, for the file considered, we can enable -mno-unaligned-access). You'll find instances of both in the U-Boot code.
The problem is that __packed means we can see this problem whenever its used. This is the design practice we need to be wary of, and make sure we're coding to an unfortunate reality rather than misoptimizing things.
[snip]
Regarding the EFI code, there was a patch submitted (see the thread you have pointed me to) which had the proper unaligned access macros and thus did not require -mno-unaligned-access for this file, and certainly does not require it for the whole of ARM.
Right, the EFI patch takes valid code and makes it differently valid.

Dear Tom,
In message 20140210212630.GB7049@bill-the-cat you wrote:
Then gcc has a bug and you need to convince them to fix it. What gcc does, as Mans has explained, and this invalidates the "lets catch unaligned access problems" notion, is for ARMv6 and higher say "we assume by default the hardware can perform native unaligned access, so make use of this in our optimization choices".
GCC for ARM has often perplexed me - I remember cases where it generated 4 single-byte accesses to a u32 data type with perfectly valid 32 bit alignment (like a device register). Unfortunaltely I never was able to have this considered a bug. Everybody else thought it was perfectly normal and it had always been like that (on ARM).
But this part isn't true. Or rather, it is (or may, at the whim of the compiler) catching every case where we say __attribute__((packed)) and then don't follow up ensuring that every access is then fixed up by hand, rather than letting the compiler do it.
We've essentially picked "lets blow things up at times" over "lets keep an eye out for __packed being used in code, and be careful there".
I think many people use __packed without thinking. Some code is just horrible. The fact that ARM code is full of examples where device I/O is performed without compiler checking for data types is just an indication.
I know this is bad, but do you see a way to make the compiler issue clear warnings or errors for such "accidential" unaligned accesses?
The problem is that __packed means we can see this problem whenever its used. This is the design practice we need to be wary of, and make sure we're coding to an unfortunate reality rather than misoptimizing things.
__packed should be strictly forbidden everywhere except where mandated by definitions for example of protocol implementations etc. And even there I tend to consider it wrong to use 32 or 16 bit types for data fields that are misaligned (assum=ing the whole data structure is properly aligned).
Regarding the EFI code, there was a patch submitted (see the thread you have pointed me to) which had the proper unaligned access macros and thus did not require -mno-unaligned-access for this file, and certainly does not require it for the whole of ARM.
Right, the EFI patch takes valid code and makes it differently valid.
Takes valid code? But would this original code run on an architecture where any unaligned access causes a machine check? My understanding is it doesn't?
Best regards,
Wolfgang Denk

On Mon, Feb 10, 2014 at 11:17:23PM +0100, Wolfgang Denk wrote:
Dear Tom,
In message 20140210212630.GB7049@bill-the-cat you wrote:
Then gcc has a bug and you need to convince them to fix it. What gcc does, as Mans has explained, and this invalidates the "lets catch unaligned access problems" notion, is for ARMv6 and higher say "we assume by default the hardware can perform native unaligned access, so make use of this in our optimization choices".
GCC for ARM has often perplexed me - I remember cases where it generated 4 single-byte accesses to a u32 data type with perfectly valid 32 bit alignment (like a device register). Unfortunaltely I never was able to have this considered a bug. Everybody else thought it was perfectly normal and it had always been like that (on ARM).
But this part isn't true. Or rather, it is (or may, at the whim of the compiler) catching every case where we say __attribute__((packed)) and then don't follow up ensuring that every access is then fixed up by hand, rather than letting the compiler do it.
We've essentially picked "lets blow things up at times" over "lets keep an eye out for __packed being used in code, and be careful there".
I think many people use __packed without thinking. Some code is just horrible. The fact that ARM code is full of examples where device I/O is performed without compiler checking for data types is just an indication.
Some quick grepping around and it's not really ARM code that's full of the references, it's 1) Code shared with Linux (NAND, UBI or filesystems, JFFS2/YAFFS2). 2) USB (which is a special case of the above).
I know this is bad, but do you see a way to make the compiler issue clear warnings or errors for such "accidential" unaligned accesses?
No, but we could make checkpatch complain about it pretty easily I bet.
The problem is that __packed means we can see this problem whenever its used. This is the design practice we need to be wary of, and make sure we're coding to an unfortunate reality rather than misoptimizing things.
__packed should be strictly forbidden everywhere except where mandated by definitions for example of protocol implementations etc. And even there I tend to consider it wrong to use 32 or 16 bit types for data fields that are misaligned (assum=ing the whole data structure is properly aligned).
I agree we shouldn't use __packed without a good reason. And I don't think we've got many no-reason uses right now but I'm all in favor of dropping them and keeping an eye on new users. The problem is that unless we do something, any of the valid and we aren't going to / can't change them __packed structs will be the next place things blow up when gcc optimizes things in a new (and valid based on what we've told it!) way.
Regarding the EFI code, there was a patch submitted (see the thread you have pointed me to) which had the proper unaligned access macros and thus did not require -mno-unaligned-access for this file, and certainly does not require it for the whole of ARM.
Right, the EFI patch takes valid code and makes it differently valid.
Takes valid code? But would this original code run on an architecture where any unaligned access causes a machine check? My understanding is it doesn't?
It would run because being __packed gcc would know to do the right thing on that architecture.

Dear Tom,
In message 20140210222819.GE7049@bill-the-cat you wrote:
I agree we shouldn't use __packed without a good reason. And I don't think we've got many no-reason uses right now but I'm all in favor of dropping them and keeping an eye on new users. The problem is that unless we do something, any of the valid and we aren't going to / can't change them __packed structs will be the next place things blow up when gcc optimizes things in a new (and valid based on what we've told it!) way.
If I look around where __packed (and variants) is being used I could just cry. Just have a loog for example at include/ec_commands.h - it appears they always add __packed to all structs, including such where all entries are uint8_t - see for example struct ec_lpc_host_args :-(
Takes valid code? But would this original code run on an architecture where any unaligned access causes a machine check? My understanding is it doesn't?
It would run because being __packed gcc would know to do the right thing on that architecture.
We are not discussing application code here. We are dealing with low- level hardware related stuff. When I try to read from a 32 bit device register I must be absolutley sure that this will be a single 32 bit transfer. It is totally unacceptable if I have to fear that the compiler might break this up into 4 byte accesses behind my back. What if this happens to be an auto-incrementing register or such?
But obviously only few people share this point of view...
Best regards,
Wolfgang Denk

Wolfgang Denk wd@denx.de writes:
Dear Tom,
In message 20140210222819.GE7049@bill-the-cat you wrote:
I agree we shouldn't use __packed without a good reason. And I don't think we've got many no-reason uses right now but I'm all in favor of dropping them and keeping an eye on new users. The problem is that unless we do something, any of the valid and we aren't going to / can't change them __packed structs will be the next place things blow up when gcc optimizes things in a new (and valid based on what we've told it!) way.
If I look around where __packed (and variants) is being used I could just cry. Just have a loog for example at include/ec_commands.h - it appears they always add __packed to all structs, including such where all entries are uint8_t - see for example struct ec_lpc_host_args :-(
This is obviously insane and should be fixed. Deliberately miscompiling other code isn't going to help with that.
Takes valid code? But would this original code run on an architecture where any unaligned access causes a machine check? My understanding is it doesn't?
It would run because being __packed gcc would know to do the right thing on that architecture.
We are not discussing application code here.
What does that have to do with anything. The compiler works exactly the same whatever the code is for.
We are dealing with low- level hardware related stuff. When I try to read from a 32 bit device register I must be absolutley sure that this will be a single 32 bit transfer. It is totally unacceptable if I have to fear that the compiler might break this up into 4 byte accesses behind my back.
So because you're afraid of __packed abuse, you want to make _other_ completely valid code crash? Would it not be preferable to make the actually broken code fail? Then someone might notice and fix it. Furthermore, the scenario you seem worried about will still happen on ARMv5 and other architectures which do not support unaligned memory accesses.

Dear Måns Rullgård,
In message yw1xob2dakq4.fsf@unicorn.mansr.com you wrote:
We are not discussing application code here.
What does that have to do with anything. The compiler works exactly the same whatever the code is for.
With application code you don't really care whether a variable is read/written in one piece or broken down into several smaller reads/writes - except when you notice that performance suffers.
When accessing hardware, it often makes a fundamental difference whether you access a device register with it's correct size or not. Usually breaking down an access into smaller ones results in crashes or incorrect data or other errors.
We are dealing with low- level hardware related stuff. When I try to read from a 32 bit device register I must be absolutley sure that this will be a single 32 bit transfer. It is totally unacceptable if I have to fear that the compiler might break this up into 4 byte accesses behind my back.
So because you're afraid of __packed abuse, you want to make _other_ completely valid code crash? Would it not be preferable to make the actually broken code fail? Then someone might notice and fix it. Furthermore, the scenario you seem worried about will still happen on ARMv5 and other architectures which do not support unaligned memory accesses.
I wonder if you actually read Albert's arguments. I'll try to put it in simple words for you.
No, this is not about __packed, at least not primarily. We are talking about code that performs unaligned accesses. There are architectures where the hardware supports this just fine; there are others where you pay for this with some penalty, but it still works; and there are yet others where it just does not work. And we cannot rely on the compiler to do "the right thing" because the compiler assumes the "application" model described above, while we need the "device access" model, i. e. something different. And we want all code (unless it is not inherently deeply architecture-specific) to be working on all architectures, whether these support unaligned accesses or not.
So I would like to adjust the default behaviour of the compiler to raise errors or at least warnings for all such unaligned accesses that he is capable of detecting, and I want clear runtime errors for the rest, on all architectures. Code that causes such errors needs to be reviewed and, normally, to be fixed. In cases where there are good reasons to perform unaligned accesses, these should normally be done explicitly, without automatic help from the compiler. Only if there is such good reasons for unaligned accesses AND there are good reasons not to touch the code AND we are sure it will not break on some architectures, then we should allow the compiler to use it's automatics.
BTW: thanks for the friendly words you found for me elsewhere.
Best regards,
Wolfgang Denk

Wolfgang Denk wd@denx.de writes:
Dear Måns Rullgård,
In message yw1xob2dakq4.fsf@unicorn.mansr.com you wrote:
We are not discussing application code here.
What does that have to do with anything. The compiler works exactly the same whatever the code is for.
With application code you don't really care whether a variable is read/written in one piece or broken down into several smaller reads/writes - except when you notice that performance suffers.
When accessing hardware, it often makes a fundamental difference whether you access a device register with it's correct size or not. Usually breaking down an access into smaller ones results in crashes or incorrect data or other errors.
I'm well aware of this, but it has nothing to do with the issue at hand.
We are dealing with low- level hardware related stuff. When I try to read from a 32 bit device register I must be absolutley sure that this will be a single 32 bit transfer. It is totally unacceptable if I have to fear that the compiler might break this up into 4 byte accesses behind my back.
So because you're afraid of __packed abuse, you want to make _other_ completely valid code crash? Would it not be preferable to make the actually broken code fail? Then someone might notice and fix it. Furthermore, the scenario you seem worried about will still happen on ARMv5 and other architectures which do not support unaligned memory accesses.
I wonder if you actually read Albert's arguments. I'll try to put it in simple words for you.
No need to be condescending.
No, this is not about __packed, at least not primarily. We are talking about code that performs unaligned accesses. There are architectures where the hardware supports this just fine; there are others where you pay for this with some penalty, but it still works; and there are yet others where it just does not work.
Correct.
And we cannot rely on the compiler to do "the right thing" because the compiler assumes the "application" model described above, while we need the "device access" model, i. e. something different. And we want all code (unless it is not inherently deeply architecture-specific) to be working on all architectures, whether these support unaligned accesses or not.
Device registers are always aligned. In properly written code these are accessed using pointers the compiler knows to be aligned and thus does the right thing.
If a device register is accessed in a manner that the compiler thinks it might be unaligned, this will result in the access being split on architectures like ARMv5 that do not support any unaligned accesses. As you point out, this will break at runtime. Any code doing this is thus broken and needs to be fixed.
So I would like to adjust the default behaviour of the compiler to raise errors or at least warnings for all such unaligned accesses that he is capable of detecting, and I want clear runtime errors for the rest, on all architectures. Code that causes such errors needs to be reviewed and, normally, to be fixed. In cases where there are good reasons to perform unaligned accesses, these should normally be done explicitly, without automatic help from the compiler. Only if there is such good reasons for unaligned accesses AND there are good reasons not to touch the code AND we are sure it will not break on some architectures, then we should allow the compiler to use it's automatics.
All of that makes sense. The problem is that the current settings do the exact opposite. By using -munaligned-access by default, you are asking the compiler to go ahead and do whatever it thinks is best, which is sometimes to perform an intentional unaligned access. Exactly when this will happen is largely impossible to predict.
To get the behaviour you desire, the code should be compiled with -mno-unaligned-access. This tells the compiler to _never_ automatically perform an unaligned memory access. If it thinks an address might be unaligned, it will split the access.
The *only* case where building with -munaligned-access might be required to make some code run correctly is if said code abuses the __packed attribute on pointers referring to device registers (which are always aligned and should never have this annotation), _and_ by sheer luck no actual unaligned accesses are introduced by the compiler. In this case, those build settings conspire to cover up a bug which will manifest itself when the code is built for a target that never allows unaligned accesses.
Note also that adding -mno-unaligned-access results in exactly the same compiler behaviour as we always had prior to gcc 4.7, which presumably generated usable code.

Hi Måns,
On Tue, 11 Feb 2014 15:33:09 +0000, Måns Rullgård mans@mansr.com wrote:
The problem is that the current settings do the exact opposite. By using -munaligned-access by default, you are asking the compiler to go ahead and do whatever it thinks is best, which is sometimes to perform an intentional unaligned access. Exactly when this will happen is largely impossible to predict.
The -munaligned-access option does *not* "[ask] the compiler to go ahead and do whatever it thinks is best", it tells it to use direct native accesses when unaligned accesses are required, as opposed to splitting unaligned accesses into smaller but aligned aligned native accesses, which is what you get with -mno-unaligned-access.
To get the behaviour you desire, the code should be compiled with -mno-unaligned-access. This tells the compiler to _never_ automatically perform an unaligned memory access. If it thinks an address might be unaligned, it will split the access.
This shows that you really have not read my argument, in which I *did* explain why we tell the compiler *not* to split unaligned accesses into smaller correctly aligned accesses, i.e., why -munaligned-access is used.
Note also that adding -mno-unaligned-access results in exactly the same compiler behaviour as we always had prior to gcc 4.7, which presumably generated usable code.
And hid potential unaligned accesses which would harm some other platforms, because not all platforms have -m[no-]unaligned-access or even a concept of unaligned access.
Amicalement,

Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Måns,
On Tue, 11 Feb 2014 15:33:09 +0000, Måns Rullgård mans@mansr.com wrote:
The problem is that the current settings do the exact opposite. By using -munaligned-access by default, you are asking the compiler to go ahead and do whatever it thinks is best, which is sometimes to perform an intentional unaligned access. Exactly when this will happen is largely impossible to predict.
The -munaligned-access option does *not* "[ask] the compiler to go ahead and do whatever it thinks is best", it tells it to use direct native accesses when unaligned accesses are required, as opposed to splitting unaligned accesses into smaller but aligned aligned native accesses, which is what you get with -mno-unaligned-access.
The flag does both of those things. It even gives the compiler permission to merge multiple adjacent accesses into a single wider one.
To get the behaviour you desire, the code should be compiled with -mno-unaligned-access. This tells the compiler to _never_ automatically perform an unaligned memory access. If it thinks an address might be unaligned, it will split the access.
This shows that you really have not read my argument, in which I *did* explain why we tell the compiler *not* to split unaligned accesses into smaller correctly aligned accesses, i.e., why -munaligned-access is used.
I have read what you call your argument. Unfortunately for you, it is based on false premises, and as such any conclusions you arrive at are incorrect.

Hi Måns,
On Tue, 11 Feb 2014 16:44:46 +0000, Måns Rullgård mans@mansr.com wrote:
Albert ARIBAUD albert.u.boot@aribaud.net writes:
Hi Måns,
On Tue, 11 Feb 2014 15:33:09 +0000, Måns Rullgård mans@mansr.com wrote:
The problem is that the current settings do the exact opposite. By using -munaligned-access by default, you are asking the compiler to go ahead and do whatever it thinks is best, which is sometimes to perform an intentional unaligned access. Exactly when this will happen is largely impossible to predict.
The -munaligned-access option does *not* "[ask] the compiler to go ahead and do whatever it thinks is best", it tells it to use direct native accesses when unaligned accesses are required, as opposed to splitting unaligned accesses into smaller but aligned aligned native accesses, which is what you get with -mno-unaligned-access.
The flag does both of those things. It even gives the compiler permission to merge multiple adjacent accesses into a single wider one.
Both of which two things exactly?
Besides, the only place where we saw it merge adjacent accesses into a wider one is in some cases of array initialization which are documented in doc/README.arm-unaligned-accesses.
To get the behaviour you desire, the code should be compiled with -mno-unaligned-access. This tells the compiler to _never_ automatically perform an unaligned memory access. If it thinks an address might be unaligned, it will split the access.
This shows that you really have not read my argument, in which I *did* explain why we tell the compiler *not* to split unaligned accesses into smaller correctly aligned accesses, i.e., why -munaligned-access is used.
I have read what you call your argument. Unfortunately for you, it is based on false premises, and as such any conclusions you arrive at are incorrect.
(you have asked someone else not to be condescending. I, in turn, ask you to stop being insulting.)
If you have read my argument, then I assume you have understood that the point of it is to detect uncontrolled unaligned accesses; however, your rebuttal apparently misses this point, as it proposes to *prevent* detection of such unaligned accesses.
Amicalement,

On Tue, 11 Feb 2014 18:11:11 +0100, Albert ARIBAUD albert.u.boot@aribaud.net wrote:
Besides, the only place where we saw it merge adjacent accesses into a
s/it/-munaligned-access/
wider one is in some cases of array initialization which are documented in doc/README.arm-unaligned-accesses.
Amicalement,

On Tue, Feb 11, 2014 at 05:37:55PM +0100, Albert ARIBAUD wrote:
Hi Måns,
On Tue, 11 Feb 2014 15:33:09 +0000, Måns Rullgård mans@mansr.com wrote:
The problem is that the current settings do the exact opposite. By using -munaligned-access by default, you are asking the compiler to go ahead and do whatever it thinks is best, which is sometimes to perform an intentional unaligned access. Exactly when this will happen is largely impossible to predict.
The -munaligned-access option does *not* "[ask] the compiler to go ahead and do whatever it thinks is best", it tells it to use direct native accesses when unaligned accesses are required, as opposed to splitting unaligned accesses into smaller but aligned aligned native accesses, which is what you get with -mno-unaligned-access.
Incorrect, and gets to the heart of our problem. It says that native unaligned accesses are valid and make use of this as appropriate. So our goal of "make the compiler use native unaligned accesses so we can find bad code" is invalid. It's making properly written code fail instead and improperly written code will still be just as improper.

Hi Tom,
On Wed, 12 Feb 2014 09:35:55 -0500, Tom Rini trini@ti.com wrote:
On Tue, Feb 11, 2014 at 05:37:55PM +0100, Albert ARIBAUD wrote:
Hi Måns,
On Tue, 11 Feb 2014 15:33:09 +0000, Måns Rullgård mans@mansr.com wrote:
The problem is that the current settings do the exact opposite. By using -munaligned-access by default, you are asking the compiler to go ahead and do whatever it thinks is best, which is sometimes to perform an intentional unaligned access. Exactly when this will happen is largely impossible to predict.
The -munaligned-access option does *not* "[ask] the compiler to go ahead and do whatever it thinks is best", it tells it to use direct native accesses when unaligned accesses are required, as opposed to splitting unaligned accesses into smaller but aligned aligned native accesses, which is what you get with -mno-unaligned-access.
Incorrect, and gets to the heart of our problem. It says that native unaligned accesses are valid and make use of this as appropriate. So our goal of "make the compiler use native unaligned accesses so we can find bad code" is invalid. It's making properly written code fail instead and improperly written code will still be just as improper.
Code which translates into uncontrolled unaligned accesses is not properly written.
Amicalement,

On Wed, Feb 12, 2014 at 05:19:15PM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Wed, 12 Feb 2014 09:35:55 -0500, Tom Rini trini@ti.com wrote:
On Tue, Feb 11, 2014 at 05:37:55PM +0100, Albert ARIBAUD wrote:
Hi Måns,
On Tue, 11 Feb 2014 15:33:09 +0000, Måns Rullgård mans@mansr.com wrote:
The problem is that the current settings do the exact opposite. By using -munaligned-access by default, you are asking the compiler to go ahead and do whatever it thinks is best, which is sometimes to perform an intentional unaligned access. Exactly when this will happen is largely impossible to predict.
The -munaligned-access option does *not* "[ask] the compiler to go ahead and do whatever it thinks is best", it tells it to use direct native accesses when unaligned accesses are required, as opposed to splitting unaligned accesses into smaller but aligned aligned native accesses, which is what you get with -mno-unaligned-access.
Incorrect, and gets to the heart of our problem. It says that native unaligned accesses are valid and make use of this as appropriate. So our goal of "make the compiler use native unaligned accesses so we can find bad code" is invalid. It's making properly written code fail instead and improperly written code will still be just as improper.
Code which translates into uncontrolled unaligned accesses is not properly written.
It's not. A problem we now have is that when we want to do unaligned accesses for valid reasons the compiler generates valid code for what we told it to do, but then fails at run time because we lied to the compiler.

Hi Tom,
On Mon, 17 Feb 2014 10:45:35 -0500, Tom Rini trini@ti.com wrote:
On Wed, Feb 12, 2014 at 05:19:15PM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Wed, 12 Feb 2014 09:35:55 -0500, Tom Rini trini@ti.com wrote:
On Tue, Feb 11, 2014 at 05:37:55PM +0100, Albert ARIBAUD wrote:
Hi Måns,
On Tue, 11 Feb 2014 15:33:09 +0000, Måns Rullgård mans@mansr.com wrote:
The problem is that the current settings do the exact opposite. By using -munaligned-access by default, you are asking the compiler to go ahead and do whatever it thinks is best, which is sometimes to perform an intentional unaligned access. Exactly when this will happen is largely impossible to predict.
The -munaligned-access option does *not* "[ask] the compiler to go ahead and do whatever it thinks is best", it tells it to use direct native accesses when unaligned accesses are required, as opposed to splitting unaligned accesses into smaller but aligned aligned native accesses, which is what you get with -mno-unaligned-access.
Incorrect, and gets to the heart of our problem. It says that native unaligned accesses are valid and make use of this as appropriate. So our goal of "make the compiler use native unaligned accesses so we can find bad code" is invalid. It's making properly written code fail instead and improperly written code will still be just as improper.
Code which translates into uncontrolled unaligned accesses is not properly written.
It's not. A problem we now have is that when we want to do unaligned accesses for valid reasons the compiler generates valid code for what we told it to do, but then fails at run time because we lied to the compiler.
If we have a valid reason to perform an unaligned access, then we must make it explicit that we want it, by including <asm/unaligned.h> and using put_unaligned() or get_unaligned() or variants thereof.
Amicalement,

Dear Måns Rullgård,
In message yw1xfvnpaclm.fsf@unicorn.mansr.com you wrote:
With application code you don't really care whether a variable is read/written in one piece or broken down into several smaller reads/writes - except when you notice that performance suffers.
When accessing hardware, it often makes a fundamental difference whether you access a device register with it's correct size or not. Usually breaking down an access into smaller ones results in crashes or incorrect data or other errors.
I'm well aware of this, but it has nothing to do with the issue at hand.
This is by your definition, or based on which exact rationale?
Device registers are always aligned. In properly written code these are accessed using pointers the compiler knows to be aligned and thus does the right thing.
Maybe the device registers you have seen so far have always been perfectly aligned. Lucky you. Note that there are a number of designs around (that I do not hesitate to call broken) which have such properties. And when we include external hardware into the discussion like PCI attached devices or customer-designed FPGAs, it becomes even more "interesting".
I think we can stop the discussion here (at least I will do that now). I haven't seen any new arguments coming up.
Best regards,
Wolfgang Denk

Wolfgang Denk wd@denx.de writes:
Dear Måns Rullgård,
In message yw1xfvnpaclm.fsf@unicorn.mansr.com you wrote:
With application code you don't really care whether a variable is read/written in one piece or broken down into several smaller reads/writes - except when you notice that performance suffers.
When accessing hardware, it often makes a fundamental difference whether you access a device register with it's correct size or not. Usually breaking down an access into smaller ones results in crashes or incorrect data or other errors.
I'm well aware of this, but it has nothing to do with the issue at hand.
This is by your definition, or based on which exact rationale?
Device registers are always aligned. In properly written code these are accessed using pointers the compiler knows to be aligned and thus does the right thing.
Maybe the device registers you have seen so far have always been perfectly aligned. Lucky you. Note that there are a number of designs around (that I do not hesitate to call broken) which have such properties. And when we include external hardware into the discussion like PCI attached devices or customer-designed FPGAs, it becomes even more "interesting".
Please explain how you would access an unaligned register from a CPU that doesn't allow unaligned accesses. Clearly such a combination of hardware can never work, so there is no use trying to cater for it in haphazard ways that end up breaking perfectly normal systems.

Hi Tom,
On Mon, 10 Feb 2014 17:28:19 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 11:17:23PM +0100, Wolfgang Denk wrote:
Dear Tom,
In message 20140210212630.GB7049@bill-the-cat you wrote:
Then gcc has a bug and you need to convince them to fix it. What gcc does, as Mans has explained, and this invalidates the "lets catch unaligned access problems" notion, is for ARMv6 and higher say "we assume by default the hardware can perform native unaligned access, so make use of this in our optimization choices".
GCC for ARM has often perplexed me - I remember cases where it generated 4 single-byte accesses to a u32 data type with perfectly valid 32 bit alignment (like a device register). Unfortunaltely I never was able to have this considered a bug. Everybody else thought it was perfectly normal and it had always been like that (on ARM).
But this part isn't true. Or rather, it is (or may, at the whim of the compiler) catching every case where we say __attribute__((packed)) and then don't follow up ensuring that every access is then fixed up by hand, rather than letting the compiler do it.
We've essentially picked "lets blow things up at times" over "lets keep an eye out for __packed being used in code, and be careful there".
I think many people use __packed without thinking. Some code is just horrible. The fact that ARM code is full of examples where device I/O is performed without compiler checking for data types is just an indication.
Some quick grepping around and it's not really ARM code that's full of the references, it's
- Code shared with Linux (NAND, UBI or filesystems, JFFS2/YAFFS2).
- USB (which is a special case of the above).
Indeed, this is not *ARM-specific* code, but it is *ARM-run* code.
I know this is bad, but do you see a way to make the compiler issue clear warnings or errors for such "accidential" unaligned accesses?
No, but we could make checkpatch complain about it pretty easily I bet.
I'm not quite convinced that checkpatch would detect all cases -- granted, it could detect "packed" attributes, but not all "packed" structs are Bad(tm), and besides, not all unaligned accesses are due to packed structures (bad pointer arithmetic is also a good candidate, one which results from code semantics, not source).
The problem is that __packed means we can see this problem whenever its used. This is the design practice we need to be wary of, and make sure we're coding to an unfortunate reality rather than misoptimizing things.
__packed should be strictly forbidden everywhere except where mandated by definitions for example of protocol implementations etc. And even there I tend to consider it wrong to use 32 or 16 bit types for data fields that are misaligned (assum=ing the whole data structure is properly aligned).
I agree we shouldn't use __packed without a good reason. And I don't think we've got many no-reason uses right now but I'm all in favor of dropping them and keeping an eye on new users. The problem is that unless we do something, any of the valid and we aren't going to / can't change them __packed structs will be the next place things blow up when gcc optimizes things in a new (and valid based on what we've told it!) way.
Which is precisely why the solution is to tell GCC not to optimize, by telling it to use explicitly unaligned accesses for the data which we want to bypas normal C alignment rules.
Regarding the EFI code, there was a patch submitted (see the thread you have pointed me to) which had the proper unaligned access macros and thus did not require -mno-unaligned-access for this file, and certainly does not require it for the whole of ARM.
Right, the EFI patch takes valid code and makes it differently valid.
Takes valid code? But would this original code run on an architecture where any unaligned access causes a machine check? My understanding is it doesn't?
It would run because being __packed gcc would know to do the right thing on that architecture.
It is true that GCC might change behavior in the future and break things on us; after all, that's what 4.7 did with ARMv7 and native unaligned accesses. But then, we should expect GCC possibly doing it again in the future for new targets, causing packed code to break on us, unless we explicitly tell it not to by stating where unaligned accesses should be performed.
(IMO, making unaligned accesses explicit -- either with a macro or by breaking unaligned fields into smaller aligned fields -- goes hand in hand with adding packed attributes: if you pack a record, chances are you're *wanting* unalignd fields, and if you want them, you should make them explicit.)
Back to the issue at hand, I think we should fix it with a patch like
http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/171646/focus=179000
Also, upon re-reading doc/ARM.unaligned-accesses, I notice that the rationale for the hardware and compiler settings is not explicit; I'll add it.
Amicalement,

On Tue, Feb 11, 2014 at 09:44:50AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Mon, 10 Feb 2014 17:28:19 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 11:17:23PM +0100, Wolfgang Denk wrote:
Dear Tom,
In message 20140210212630.GB7049@bill-the-cat you wrote:
Then gcc has a bug and you need to convince them to fix it. What gcc does, as Mans has explained, and this invalidates the "lets catch unaligned access problems" notion, is for ARMv6 and higher say "we assume by default the hardware can perform native unaligned access, so make use of this in our optimization choices".
GCC for ARM has often perplexed me - I remember cases where it generated 4 single-byte accesses to a u32 data type with perfectly valid 32 bit alignment (like a device register). Unfortunaltely I never was able to have this considered a bug. Everybody else thought it was perfectly normal and it had always been like that (on ARM).
But this part isn't true. Or rather, it is (or may, at the whim of the compiler) catching every case where we say __attribute__((packed)) and then don't follow up ensuring that every access is then fixed up by hand, rather than letting the compiler do it.
We've essentially picked "lets blow things up at times" over "lets keep an eye out for __packed being used in code, and be careful there".
I think many people use __packed without thinking. Some code is just horrible. The fact that ARM code is full of examples where device I/O is performed without compiler checking for data types is just an indication.
Some quick grepping around and it's not really ARM code that's full of the references, it's
- Code shared with Linux (NAND, UBI or filesystems, JFFS2/YAFFS2).
- USB (which is a special case of the above).
Indeed, this is not *ARM-specific* code, but it is *ARM-run* code.
Right, and irrelevant.
I know this is bad, but do you see a way to make the compiler issue clear warnings or errors for such "accidential" unaligned accesses?
No, but we could make checkpatch complain about it pretty easily I bet.
I'm not quite convinced that checkpatch would detect all cases -- granted, it could detect "packed" attributes, but not all "packed" structs are Bad(tm), and besides, not all unaligned accesses are due to packed structures (bad pointer arithmetic is also a good candidate, one which results from code semantics, not source).
Right. But it's like the warnings about volatile and typedef. There's exceptions (see above), but when we see it we need to question it. And people doing stupid things to pointers isn't a good excuse and probably not even something that would be fixedup with -mno-unaligned-access.
The problem is that __packed means we can see this problem whenever its used. This is the design practice we need to be wary of, and make sure we're coding to an unfortunate reality rather than misoptimizing things.
__packed should be strictly forbidden everywhere except where mandated by definitions for example of protocol implementations etc. And even there I tend to consider it wrong to use 32 or 16 bit types for data fields that are misaligned (assum=ing the whole data structure is properly aligned).
I agree we shouldn't use __packed without a good reason. And I don't think we've got many no-reason uses right now but I'm all in favor of dropping them and keeping an eye on new users. The problem is that unless we do something, any of the valid and we aren't going to / can't change them __packed structs will be the next place things blow up when gcc optimizes things in a new (and valid based on what we've told it!) way.
Which is precisely why the solution is to tell GCC not to optimize, by telling it to use explicitly unaligned accesses for the data which we want to bypas normal C alignment rules.
No. The solution is to tell the compiler which optimizations it can, and cannot use. If we stop telling it that native unaligned accesses work it won't decide to make use of those optimizations.
Regarding the EFI code, there was a patch submitted (see the thread you have pointed me to) which had the proper unaligned access macros and thus did not require -mno-unaligned-access for this file, and certainly does not require it for the whole of ARM.
Right, the EFI patch takes valid code and makes it differently valid.
Takes valid code? But would this original code run on an architecture where any unaligned access causes a machine check? My understanding is it doesn't?
It would run because being __packed gcc would know to do the right thing on that architecture.
It is true that GCC might change behavior in the future and break things on us; after all, that's what 4.7 did with ARMv7 and native unaligned accesses. But then, we should expect GCC possibly doing it again in the future for new targets, causing packed code to break on us, unless we explicitly tell it not to by stating where unaligned accesses should be performed.
You're saying that in the future gcc might start generating broken code for a platform, but that it would not be a gcc bug? Lets step back and remember what exactly gcc did.
They decided that (and added support, and optimizations around) given that ARMv6 and later can do native unaligned accesses, they would assume that when generating for those targets, that feature would be enabled. The Linux Kernel decided "OK, lets set that bit in hardware". We decided "lets not set that bit in hardware, and do other things too".
I've decided that we made the wrong choice here and we should go with "no native unaligned access support on ARM". This puts us back to how things were prior to GCC adding support for optimizing around native unaligned access support. This does not change (nor does our current behavor!) how to deal with broken hardware or broken software.

Dear Tom,
In message 20140212142536.GA15819@bill-the-cat you wrote:
No. The solution is to tell the compiler which optimizations it can, and cannot use. If we stop telling it that native unaligned accesses work it won't decide to make use of those optimizations.
Can we do this in such a way that we can be absolutely sure that the compiler will never break up larger (say 32 bit) accesses into smaller (say, 2 x 16 bit or 4 x 8 bit or similar) accesses?
Best regards,
Wolfgang Denk

On Mon, Feb 10, 2014 at 03:57:51PM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Mon, 10 Feb 2014 08:21:39 -0500, Tom Rini trini@ti.com wrote:
On Mon, Feb 10, 2014 at 10:24:47AM +0100, Albert ARIBAUD wrote:
Hi Tom,
On Tue, 4 Feb 2014 12:05:33 -0500, Tom Rini trini@ti.com wrote:
When we tell the compiler to optimize for ARMv7 it assumes a default of unaligned accesses being supported at the hardware level and can make use of this to perform what it deems as an optimization in any case, including allowing for data to become unaligned. We explicitly disallow this hardware feature so we must tell the compiler.
Cc: Albert ARIBAUD albert.u.boot@aribaud.net Cc: Mans Rullgard mans@mansr.com Signed-off-by: Tom Rini trini@ti.com
NAK -- the discrepancy between the compiler being told to allow native unaligned accesses while at the same time telling the hardware to trap them is conscious and voluntary. It was chosen to help detect unaligned accesses which are rarely necessary and can be explicitly performed by software on a case by case basis.
If and when a specific file requires unaligned access which cannot be made by some other mean than enabling -mno-unaligned-access, then this file should have it added, not the whole of U-Boot.
Right, I recall the discussion, and we chose wrong.
I am quite prepared to discuss whether we chose wrong or right, and to change my mind when the conditions are right, but I'll need more than such a short and simple statement. :)
The problem is it really is a simple problem.
We aren't being clever and making problems that would appear on armv5 and lower (or non-ARM never allows unaligned access platforms) problems to appear on more common armv7 platforms.
Agreed that we are making problems appear on ARMv7 which are not that much of an issue on ARMv7, but are a true issue on non-ARMv7 arches;
No, this is incorrect.
that *is* the intent. We want to know as early as possible when some code which runs ok on unaligned-access-friendly platforms such as ARMv7 might cause trouble on unaligned-access-adverse platforms later, once it gets used there too.
We're telling the compiler it's OK to do one thing when it's not and then getting annoying problems such as the EFI partition one where the compiler looks at everything, says we can do $this and then fails at runtime because we lied to it. The whole splashguard set of options is another place where I believe we've shot ourselves in the foot, quite likely.
Ok, so the cause of this patch is not the apparent contradiction between the compiler and hardware setting per se; it is that the EFI code has issues which make it susceptible to crash on unaligned-access-adverse platforms.
This means the trap has worked as expected and has caught some code which does unaligned accesses. Let's analyze it: either we'll conclude it can and should be fixed through e.g. ad hoc unaligned access macros or we'll conclude it can't and we'll add -mno-unaligned-access to the files which can't work otherwise.
The conclusion is that the code was written and annotated correctly, and since we lied to the compiler we broke. We can rewrite the code to get around this, and find another problem somewhere else or we can behave correctly.
participants (4)
-
Albert ARIBAUD
-
Måns Rullgård
-
Tom Rini
-
Wolfgang Denk