
Wolfgang Denk wrote:
In message 426DDE65.4060009@intracom.gr you wrote:
...which is broken, especially when there is no need to do this since the fields are properly aligned.
The compiler does not know this. A pointer to the structure could be pointing to an non-aligned address, and when an aligned field within the structure is accessed it will be not aligned.
This is no reason for anything. The same could be true for ANY structures, even without using the packed attribute. It is the user's problem if he passes incorrectly aligned pointers.
No, it is the compiler's problem if it incorrectly generates code which breaks given an access to an unaligned field. The pointer itself could very well be correctly aligned, and yet be pointing to an unaligned data structure. The compiler has no way of knowing this at compile-time, and hence must generate code which is the "least common denominator", i.e. - the code will work, correctly, in ALL situations, regardless of the structure's alignment.
This typically happens when networking.
Yes, and usually you will have to explicitely deal with this.
IMHO __attribute((packed)) is deceivingly named, since it implies two things.
I think you are wrong. You are trying to explain the current imple- mentation of GCC for ARM, assuming silently that the implementation is correct. I say: it is broken.
Given that a similar series of bytewise accesses will also be generated using armcc, I'd say gcc is _consistent_.
http://www.arm.com/support/faqdev/1469.html http://www.arm.com/support/faqdev/1228.html http://www.arm.com/support/faqip/3661.html
Again, the compiler MUST ensure that no incorrect code is generated to perform the access. Simply because said access is not performed in what may be the optimal way, does NOT mean the compiler is broken, or that this is a bug.
From one of the above referenced tech notes: "Unknown alignment (eg. a pointer to a word that can be at any address)
This is specified by a pointer to a __packed item, eg.
__packed int *pi;
In this case, the ARM compilers generate code which correctly accesses the value regardless of the alignment of the pointer. This code generated will be a sequence of byte accesses, or variable alignment-dependent shifting and masking (depending on the compile options) and will therefore incur a performance penalty.
Note that *any* __packed object accessed through a pointer has unknown alignment, even packed structures."
- Don't leave any space between fields of the structure.
Correct - this is what the documentation says:
The `packed' attribute specifies that a variable or structure field should have the smallest possible alignment--one byte for a variable, and one bit for a field, unless you specify a larger value with the `aligned' attribute. Here is a structure in which the field `x' is packed, so that it immediately follows `a': struct foo { char a; int x[2] __attribute__ ((packed)); };
...which further implies that ANY access to a member of a packed struct may be unaligned, even if it is aligned within the structure definition itself, if the structure itself is unaligned in memory...
struct foo { unsigned long foo1; unsigned long foo2; unsigned long foo3; } __attribute__ ((packed));
unsigned long dummy(struct foo *p) { p = (struct foo *)1; return(p->foo1 + p->foo2 + p->foo3); }
Yes, the above is contrived, but it serves to illustrate the point.
- Access the fields of the structure even when the alignment of the pointer pointing to it is wrong.
Wrong. Remember that a structure may be a representation for things like hardware registers. Registers may have funny properties, like auto-incrementing with each access, or auto-resetting on read. If I perform a 32 bit read on such an object the compiler MUST NOT break this up into any other combination of accesses.
No, right. If the _user_ chooses to implement such a structure for memory-mapped accesses, it is then up to the _user_ to validate that the accesses will be performed as intended. The _compiler_ is responsible ONLY for generating code to correctly load/store all 32 bits of data. There is nothing which requires that the load/store be atomic in nature. The _compiler_ has no knowledge it is reading/writing a register which requires atomic accesses, only the _user_ does.
Just by the name, you expect only (1).
By the name, and the documentation, and common sense.
Then I doubt we share a common sense of "common sense". ;)
Oh, the pain of the pun!
I think we have covered all relevant facts now. I understand that we won't have GCC fixed soon, and I already suggested what I think should be used to work around this compiler bug (using aligned).
gcc does not need fixed in this regard; the above is not a bug. gcc is generating correct code in all cases given the fact that the ARM architecture has no support for unaligned accesses.
Please cc: me directly if anyone wishes to discuss this further, as I should think this is now well off-topic for the list.
-Cory