
Hi Måns,
On Tue, 15 Oct 2013 16:23:44 +0100, Måns Rullgård <mans@mansr.com> wrote:
Albert ARIBAUD <albert.u.boot@aribaud.net> writes:
I sense that you have not understood the reason why I want alignment checking enabled in ARM yet also want ARMv6+ builds to emit native unaligned accesses if they consider it needed.
Your wishes are mutually exclusive. You cannot both allow hardware unaligned access AND at the same time trap them.
These are not wishes, these are actual settings chosen for the reason already laid out. They do appear contradictory if your goal is to use ARMv6+ features to their maximum, but this is not the goal here.
The reason is, if we prevent ARMv6 builds from using native unaligned accesses, they would replace *all* such accesses with smaller, aligned, ones, which would not trigger a data abort; even those unaligned accesses caused by programming errors.
If you disable unaligned accesses in hardware (as u-boot does), you have no option but to do them a byte at a time.
Indeed, but I do *not* *disable* native unaligned accesses, I *allow* them; and I do not *want* them to be replaced by byte-by-byte emulation.
Let's go back to the basics.
In ARMv6 and later there is a bit in the system control register (SCTLR.A) which decides whether or not unaligned memory accesses are allowed. The reset value of this bit allows unaligned accesses.
When unaligned accesses are allowed, word and halfword load/store instructions (LDR, STR, LDRH, LDRSH, STRH) with an unaligned address simply perform the requested memory operation. When unaligned accesses are disallowed (SCTLR.A set), these instructions cause an alignment fault if used with an unaligned address. The load/store double and multiple instructions (LDRD, STRD, LDM, STM) always trap on unaligned addresses.
This is all described in the ARM Architecture Reference Manual (DDI0406C) section A3.2.
That's the hardware side.
Your description is correct, although this bit is not specific to "ARMv6 and later", since ARMv5 has alignment checks too.
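For reference, the hardware rule described above can be modelled in a few lines of C. This is only a sketch for the sake of discussion, not U-Boot code: alignment_fault(), its parameters and alignment_model_demo() are hypothetical names, and the model assumes the behaviour exactly as stated above (SCTLR.A is bit 1 of the system control register; double and multiple transfers are treated as always checked).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SCTLR.A is bit 1 of the system control register. */
#define SCTLR_A (1u << 1)

/*
 * Hypothetical model of the rule described above, not real U-Boot code.
 *   sctlr:          value of the system control register
 *   addr:           address used by the load/store
 *   align:          alignment the access needs, in bytes (power of two)
 *   always_checked: true for LDRD/STRD/LDM/STM, which are described
 *                   above as trapping regardless of SCTLR.A
 */
static bool alignment_fault(uint32_t sctlr, uint32_t addr,
                            uint32_t align, bool always_checked)
{
    bool misaligned = (addr & (align - 1u)) != 0;
    bool checked = always_checked || (sctlr & SCTLR_A) != 0;

    return misaligned && checked;
}

/* Walks through the cases from the description above. */
static void alignment_model_demo(void)
{
    /* LDR from an odd address: allowed with SCTLR.A clear, fault with it set. */
    assert(!alignment_fault(0, 0x1001, 4, false));
    assert(alignment_fault(SCTLR_A, 0x1001, 4, false));

    /* LDM from a non-word-aligned address: fault either way. */
    assert(alignment_fault(0, 0x1002, 4, true));

    /* Aligned accesses never fault on alignment grounds. */
    assert(!alignment_fault(SCTLR_A, 0x1004, 4, false));
}
```

Note that the second case is precisely the one I rely on: with SCTLR.A set, a native access to an unaligned address is reported instead of silently performed.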
On the compiler side, gcc traditionally did not issue unaligned load/store instructions on ARM.
Please be specific: gcc did not emit *native* unaligned accesses.
Since version 4.7, gcc does issue unaligned accesses when the target is ARMv6 or later. This makes sense since a hardware unaligned access is faster than doing it byte-wise in software, and the default for the CPU is to permit unaligned accesses. Needless to say, a potentially unaligned address will only be accessed using the subset of load/store instructions for which this is supported.
Indeed. Note that this is stated in doc/README.arm-unaligned-accesses.
To support configurations where SCTLR.A is set (disallowing unaligned accesses), gcc 4.7 also adds a flag (-mno-unaligned-access) causing it to never emit potentially unaligned loads or stores.
Maybe the intent which governed the addition of this option was indeed to support configurations where alignment check is enabled; what we can tell is what this option does, and yes, it controls whether the compiler will use native unaligned accesses.
The compiler behaviour described above is true only for well-behaved code. In standard C, pointers must always be aligned according to their target type. For instance, a pointer to a 32-bit integer type must typically be 32-bit aligned. Thus, if a pointer is constructed with incorrect alignment, any attempt to use it may result in invalid memory access instructions being executed.
Correct, although I'm not sure why you're mentioning this (and, strictly speaking, all of the compiler's behavior is defined only for 'well-behaved' C code).
In practice, various situations arise where there is a need to work with unaligned data, for example when parsing some communication protocols. To simplify such code, most compilers provide some language extension allowing the programmer to annotate a type definition or pointer as being potentially unaligned. In gcc, the 'packed' attribute on struct and union types serves this purpose.
Correct, and again, I fail to see why you mention this.
Any access to a member of a 'packed' struct/union is assumed to be potentially unaligned, and instruction selection is limited accordingly. When -munaligned-access is in effect, unaligned word or halfword load/store instructions may be used here. When this feature is disabled (-mno-unaligned-access), only aligned loads and stores (typically bytes) are permitted.
Ditto.
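For readers following along, here is what the 'packed' case looks like in C. It is a minimal sketch: struct wire_hdr, its fields and the two helper functions are made-up names for illustration, and the attribute syntax is the gcc one. Which instructions get emitted for the member access depends on the target and on -m[no-]unaligned-access as described above; the C source is the same either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical protocol header. 'packed' removes the padding that
 * would normally follow 'type', so 'length' sits at offset 1 and
 * gcc treats every access to it as potentially unaligned.
 */
struct __attribute__((packed)) wire_hdr {
    uint8_t  type;
    uint32_t length;
    uint16_t checksum;
};

/*
 * With -munaligned-access on ARMv6+, gcc may load this member with a
 * single word load; with -mno-unaligned-access it has to build the
 * value from smaller aligned loads.
 */
static uint32_t wire_hdr_length(const struct wire_hdr *h)
{
    return h->length;
}

static void wire_hdr_demo(void)
{
    struct wire_hdr h = { .type = 1, .length = 0x11223344u, .checksum = 0 };

    assert(offsetof(struct wire_hdr, length) == 1);
    assert(sizeof(struct wire_hdr) == 7);
    assert(wire_hdr_length(&h) == 0x11223344u);
}
```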
The situations where the compiler will issue an unaligned memory access are generally not predictable. Currently, they tend to occur in struct/array assignment (including initialisation), inline expansion of memcpy/memset, and accesses to 'packed' struct members. As compiler optimisations improve, these cases will likely increase in number.
This last statement has no solid foundation. On the contrary, there is no reason for a compiler to emit unaligned accesses when by default it is expected to align data to their natural boundaries.
As we can see, enabling the -munaligned-access flag results in load/store instructions occasionally accessing unaligned memory, and the precise places where this happens are not predictable. It is thus a requirement that SCTLR.A be clear when this compiler flag is set. Otherwise alignment faults will occur.
As I said, and as documented in doc/README.arm-unaligned-accesses for a whole year now, the only case where native unaligned accesses are emitted is with string literals. Even char arrays are aligned. There's actually a case to be made that string literals should be aligned, too, like all other strings are.
If for whatever reason SCTLR.A is set, it is required to use the -mno-unaligned-access compiler flag in order for the code to run cleanly. Failure to do so will result in alignment faults when the code is executed.
It is never *required* to use -mno-unaligned-access unless your hardware is unable to perform unaligned accesses for some reason external to the CPU.
One reason for setting SCTLR.A is to aid in catching programming errors whereby a normal pointer is assigned an unaligned value. Since these pointers are assumed by the compiler to be correctly aligned, accesses through them are unaffected by the -m[no-]unaligned-access flag, and any such errors will thus trigger an alignment fault.
This is one use of alignment checks. Another use is to catch code which intends to do unaligned accesses even though the policy of the project says otherwise.
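To illustrate the kind of programming error that alignment checks catch, here is a small sketch in C. The names is_aligned() and misaligned_pointer_demo() are hypothetical, and the helper is only a software stand-in for what the hardware check reports; it assumes the alignment argument is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if p is a multiple of align (a power of two). */
static bool is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1u)) == 0;
}

static void misaligned_pointer_demo(void)
{
    _Alignas(4) unsigned char buf[8] = { 0 };

    const void *good = buf;      /* suitable for a uint32_t access */
    const void *bad  = buf + 1;  /* not suitable */

    /*
     * The error in question is casting 'bad' to a plain uint32_t
     * pointer and dereferencing it. The compiler assumes such a
     * pointer is aligned, so the access is not affected by
     * -m[no-]unaligned-access; with SCTLR.A set it raises an
     * alignment fault instead of going unnoticed.
     */
    assert(is_aligned(good, sizeof(uint32_t)));
    assert(!is_aligned(bad, sizeof(uint32_t)));
}
```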
If any of the above is unclear, please let me know, and I will try to explain it better.
All this is perfectly clear and essentially valid... if your goal is to use all the features of ARMv6+ to simplify the developer's life under the assumption that the code is well-behaved (and the compiler is error-free, for that matter).
It is also still not valid with respect to my goal, which is to make sure that the actual code of a multi-architecture project running on a variety of compilers will be as correct as possible.
Let me state this again: while the approach you describe is the logical one to make life as easy as possible for the developer (and compiler writer) on ARMv6+ architectures, it is not *mandated* in any sense, and it is not the approach I have chosen.
I suggest now that you leave aside any assumptions on "how things must be done", then read my answers again in the context I have just described, and find "why things are done this way" in ARM U-Boot.
Amicalement,