
Wolfgang Denk wd@denx.de writes:
Dear Måns Rullgård,
In message yw1xob2dakq4.fsf@unicorn.mansr.com you wrote:
We are not discussing application code here.
What does that have to do with anything. The compiler works exactly the same whatever the code is for.
With application code you don't really care whether a variable is read/written in one piece or broken down into several smaller reads/writes - except when you notice that performance suffers.
When accessing hardware, it often makes a fundamental difference whether you access a device register with it's correct size or not. Usually breaking down an access into smaller ones results in crashes or incorrect data or other errors.
I'm well aware of this, but it has nothing to do with the issue at hand.
We are dealing with low- level hardware related stuff. When I try to read from a 32 bit device register I must be absolutley sure that this will be a single 32 bit transfer. It is totally unacceptable if I have to fear that the compiler might break this up into 4 byte accesses behind my back.
So because you're afraid of __packed abuse, you want to make _other_ completely valid code crash? Would it not be preferable to make the actually broken code fail? Then someone might notice and fix it. Furthermore, the scenario you seem worried about will still happen on ARMv5 and other architectures which do not support unaligned memory accesses.
I wonder if you actually read Albert's arguments. I'll try to put it in simple words for you.
No need to be condescending.
No, this is not about __packed, at least not primarily. We are talking about code that performs unaligned accesses. There are architectures where the hardware supports this just fine; there are others where you pay for this with some penalty, but it still works; and there are yet others where it just does not work.
Correct.
And we cannot rely on the compiler to do "the right thing" because the compiler assumes the "application" model described above, while we need the "device access" model, i. e. something different. And we want all code (unless it is not inherently deeply architecture-specific) to be working on all architectures, whether these support unaligned accesses or not.
Device registers are always aligned. In properly written code these are accessed using pointers the compiler knows to be aligned and thus does the right thing.
If a device register is accessed in a manner that the compiler thinks it might be unaligned, this will result in the access being split on architectures like ARMv5 that do not support any unaligned accesses. As you point out, this will break at runtime. Any code doing this is thus broken and needs to be fixed.
So I would like to adjust the default behaviour of the compiler to raise errors or at least warnings for all such unaligned accesses that he is capable of detecting, and I want clear runtime errors for the rest, on all architectures. Code that causes such errors needs to be reviewed and, normally, to be fixed. In cases where there are good reasons to perform unaligned accesses, these should normally be done explicitly, without automatic help from the compiler. Only if there is such good reasons for unaligned accesses AND there are good reasons not to touch the code AND we are sure it will not break on some architectures, then we should allow the compiler to use it's automatics.
All of that makes sense. The problem is that the current settings do the exact opposite. By using -munaligned-access by default, you are asking the compiler to go ahead and do whatever it thinks is best, which is sometimes to perform an intentional unaligned access. Exactly when this will happen is largely impossible to predict.
To get the behaviour you desire, the code should be compiled with -mno-unaligned-access. This tells the compiler to _never_ automatically perform an unaligned memory access. If it thinks an address might be unaligned, it will split the access.
The *only* case where building with -munaligned-access might be required to make some code run correctly is if said code abuses the __packed attribute on pointers referring to device registers (which are always aligned and should never have this annotation), _and_ by sheer luck no actual unaligned accesses are introduced by the compiler. In this case, those build settings conspire to cover up a bug which will manifest itself when the code is built for a target that never allows unaligned accesses.
Note also that adding -mno-unaligned-access results in exactly the same compiler behaviour as we always had prior to gcc 4.7, which presumably generated usable code.