[U-Boot-Users] struct NS16550 {...} __attribute__ ((packed)) at ns16550.h

Hi All,
I'm newbie to u-boot.
We are using u-boot for proprietary ARM board with NS16550 compatible serial controller. It is configured as 4bytes access to serial controller, so set CFG_NS16550_REG_SIZE to 4. But, because of "__attribute__ ((packed))" of struct NS16550, all register accesses to NS16550 are generated to byte access (LDRB, STRB in ARM).
I think it should NOT have __attribute__ ((packed)) if NS16550 is configured as 4bytes access. Am I missed something?
Thanks, Woojung

In message OF34DF8405.50E611E8-ON85256FEB.0051A68A-85256FEB.00522DF9@smsc.com you wrote:
We are using u-boot for proprietary ARM board with NS16550 compatible serial controller. It is configured as 4bytes access to serial controller, so set CFG_NS16550_REG_SIZE to 4. But, because of "__attribute__ ((packed))" of struct NS16550, all register accesses to NS16550 are generated to byte access (LDRB, STRB in ARM).
No, this staement makes no sense to me.
I think it should NOT have __attribute__ ((packed)) if NS16550 is configured as 4bytes access. Am I missed something?
I think so.
"__attribute__ ((packed))" only affects the alignment of the entries in the structure. It has no influence on how to access these entries. If your access is a 32 bit read or write access then it will not be changed whether there is a "__attribute__ ((packed))" in place or not.
Best regards,
Wolfgang Denk

Wofgang,
Did you inspect the generated assembly statemens of __attribute__ ((packed)) on the struct NS16650 when CFG_NS16550_REG_SIZE is 4? If you did then you would have noticed gcc (atleast 3.4.3 and 3.4.0) generates byte access code for it. As I mailed to this mailinglist at the first of March 2005. Subject: "[PATCH] revised word alignment fixes for word aligned NS16550 UART"
Kind greetings,
Jean-Paul Saman
Philips Semiconductors CTO/RTG Philips HighTech Campus, building WDA 3.29 Professor van den Holstlaan 4 5655 AA Eindhoven tel: +31 (0)40 27 45131
Wolfgang Denk wd@denx.de Sent by: u-boot-users-admin@lists.sourceforge.net 22-04-2005 20:39
To: Woojung.Huh@smsc.com cc: U-Boot-Users@lists.sourceforge.net (bcc: Jean-Paul Saman/EHV/SC/PHILIPS) Subject: Re: [U-Boot-Users] struct NS16550 {...} __attribute__ ((packed)) at ns16550.h Classification:
In message OF34DF8405.50E611E8-ON85256FEB.0051A68A-85256FEB.00522DF9@smsc.com you wrote:
We are using u-boot for proprietary ARM board with NS16550 compatible
serial controller.
It is configured as 4bytes access to serial controller, so set
CFG_NS16550_REG_SIZE to 4.
But, because of "__attribute__ ((packed))" of struct NS16550, all
register accesses to NS16550 are generated to byte
access (LDRB, STRB in ARM).
No, this staement makes no sense to me.
I think it should NOT have __attribute__ ((packed)) if NS16550 is
configured as 4bytes access.
Am I missed something?
I think so.
"__attribute__ ((packed))" only affects the alignment of the entries in the structure. It has no influence on how to access these entries. If your access is a 32 bit read or write access then it will not be changed whether there is a "__attribute__ ((packed))" in place or not.
Best regards,
Wolfgang Denk

[Please do not send regular messages to the u-boot-users-admin address!]
In message OFC533536B.66CF098A-ONC1256FEE.002A2AD5-C1256FEE.002A9FFB@philips.com you wrote:
Did you inspect the generated assembly statemens of __attribute__ ((packed)) on the struct NS16650 when CFG_NS16550_REG_SIZE is 4? If you
No, I did not.
did then you would have noticed gcc (atleast 3.4.3 and 3.4.0) generates byte access code for it. As I mailed to this mailinglist at the first of
If this is really true, then there is either bug in GCC or in the code. If I code a 32 bit access the compiler must not use any other accesses.
If CFG_NS16550_REG_SIZE is 4, then all entries in "struct NS16550" are of type "unsigned long". All of the compilers I'm working with generate exactly the same code whether "__attribute__ ((packed))" is used or not.
[BTW: at the moment only the U824 board uses CFG_NS16550_REG_SIZE=4, and this is a PowerPC board.]
Best regards,
Wolfgang Denk

If this is really true, then there is either bug in GCC or in the code. If I code a 32 bit access the compiler must not use any other accesses.
If you use __attribute__ ((packed)), you are telling the compiler you have an unaligned data structure (e.g. longs not by-4 aligned). Therefore, GCC will always reassemble the 32-bit value from byte accesses.
Art Shipkowski Videon Central Software Engineer (814)235-1111 x307 art@videon-central.com

In message 001501c5499f$f5364750$1780a8c0@art you wrote:
If this is really true, then there is either bug in GCC or in the code. If I code a 32 bit access the compiler must not use any other accesses.
If you use __attribute__ ((packed)), you are telling the compiler you have an unaligned data structure (e.g. longs not by-4 aligned). Therefore, GCC will always reassemble the 32-bit value from byte accesses.
You are wrong, with both of your statements.
-> cat foo.c struct foo { unsigned long foo1; unsigned long foo2; unsigned long foo3; } __attribute__ ((packed));
unsigned long dummy (struct foo *p) { return (p->foo1 + p->foo2 + p->foo3); }
Test 1: PowerPC:
-> ppc_8xx-gcc -O -S foo.c -> cat foo.s ... dummy: mr 9,3 lwz 3,0(3) lwz 0,4(9) add 3,3,0 lwz 0,8(9) add 3,3,0 blr
Test 2: MIPS:
-> mips_4KC-gcc -O -S foo.c -> cat foo.s ... dummy: .frame $sp,0,$31 # vars= 0, regs= 0/0, args= 0, extra= 0 .mask 0x00000000,0 .fmask 0x00000000,0 .set noreorder .cpload $25 .set reorder ulw $2,0($4) ulw $3,4($4) #nop addu $2,$2,$3 ulw $3,8($4) .set noreorder .set nomacro j $31 addu $2,$2,$3 .set macro .set reorder
As you can see, GCC alway uses plain word accesses (32 bit).
__attribute__ ((packed)) is telling the compiler that IN CASE of unaligned fields in the structure no padding should be performend, which will then result in unaligned fields that may cause access problems on certain architectures.
If a compiler is generating byte accesses instead (especially if the relevant data has been declared as volatile) then this compiler is broken.
Yes, I know that some versions of ARM compilers behave like that. But the fact that they do does not mean that they are correct. Accessing a (volatile) long variable though 4 byte accesses is a bug.
Period.
Best regards,
Wolfgang Denk

Wolfgang Denk wrote:
In message 001501c5499f$f5364750$1780a8c0@art you wrote:
If this is really true, then there is either bug in GCC or in the code. If I code a 32 bit access the compiler must not use any other accesses.
If you use __attribute__ ((packed)), you are telling the compiler you have an unaligned data structure (e.g. longs not by-4 aligned). Therefore, GCC will always reassemble the 32-bit value from byte accesses.
You are wrong, with both of your statements.
-> cat foo.c struct foo { unsigned long foo1; unsigned long foo2; unsigned long foo3; } __attribute__ ((packed));
unsigned long dummy (struct foo *p) { return (p->foo1 + p->foo2 + p->foo3); }
Test 1: PowerPC:
-> ppc_8xx-gcc -O -S foo.c -> cat foo.s ... dummy: mr 9,3 lwz 3,0(3) lwz 0,4(9) add 3,3,0 lwz 0,8(9) add 3,3,0 blr
Test 2: MIPS:
-> mips_4KC-gcc -O -S foo.c -> cat foo.s ... dummy: .frame $sp,0,$31 # vars= 0, regs= 0/0, args= 0, extra= 0 .mask 0x00000000,0 .fmask 0x00000000,0 .set noreorder .cpload $25 .set reorder ulw $2,0($4) ulw $3,4($4) #nop addu $2,$2,$3 ulw $3,8($4) .set noreorder .set nomacro j $31 addu $2,$2,$3 .set macro .set reorder
As you can see, GCC alway uses plain word accesses (32 bit).
Not on ARM it does not.
Best regards,
Wolfgang Denk
Both PPC & MIPS can do proper unalligned load/stores (or fault to an exception).
ARM (silently) does the wrong thing i.e. masks out the lower 2 bits.
So gcc on ARM generates byte load/stores.
Regards
Pantelis

In message 426D0A4C.5000105@intracom.gr you wrote:
Not on ARM it does not.
Yes, I know? Why did you not quote this part of my message, too?
ARM (silently) does the wrong thing i.e. masks out the lower 2 bits. So gcc on ARM generates byte load/stores.
...which is broken, especially when there is no need to do this since the fields are properly aligned.
Best regards,
Wolfgang Denk

ARM (silently) does the wrong thing i.e. masks out the lower 2 bits. So gcc on ARM generates byte load/stores.
...which is broken, especially when there is no need to do this since the fields are properly aligned.
Based on a quick Google scan of the GCC mailing list, __attribute__ ((packed)) also works (partially due to necessity and partially due to breaking old code) as a mode to allow unaligned access to a given structure. Some architectures offer no support for doing unaligned long loads and stores directly (I suspect on ARM it might be due to some sort of brokenness). On the other hand, PPC handles it in hardware, while MIPS invokes an exception handler in the Linux kernel (arch/mips/kernel/unaligned.c).
There are discussions on this on the GCC mailing list going back a ways; a notable one (that mentions m68k as another platform that generates byte accesses besides ARM) is at http://gcc.gnu.org/ml/gcc/1997-10/msg00063.html while a more recent bug report hinting at this fact is at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12200
Art

Dear Arthur,
in message 003501c549d3$7d67aa00$1780a8c0@art you wrote:
There are discussions on this on the GCC mailing list going back a ways; a notable one (that mentions m68k as another platform that generates byte accesses besides ARM) is at http://gcc.gnu.org/ml/gcc/1997-10/msg00063.html while a more recent bug report hinting at this fact is at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12200
Thanks for digging this out.
So the real fix for this problem should be to use a proper "__attribute__((aligned(?)))" where struct NS16550 is instantiated.
Best regards,
Wolfgang Denk

Wolfgang Denk wrote:
In message 426D0A4C.5000105@intracom.gr you wrote:
Not on ARM it does not.
Yes, I know? Why did you not quote this part of my message, too?
ARM (silently) does the wrong thing i.e. masks out the lower 2 bits. So gcc on ARM generates byte load/stores.
...which is broken, especially when there is no need to do this since the fields are properly aligned.
The compiler does not know this. A pointer to the structure could be pointing to an non-aligned address, and when an aligned field within the structure is accessed it will be not aligned.
This typically happens when networking.
IMHO __attribute((packed)) is deceivingly named, since it implies two things.
1) Don't leave any space between fields of the structure. 2) Access the fields of the structure even when the alignment of the pointer pointing to it is wrong.
Just by the name, you expect only (1).
Best regards,
Wolfgang Denk
Regards
Pantelis

In message 426DDE65.4060009@intracom.gr you wrote:
...which is broken, especially when there is no need to do this since the fields are properly aligned.
The compiler does not know this. A pointer to the structure could be pointing to an non-aligned address, and when an aligned field within the structure is accessed it will be not aligned.
This is no reason for anything. The same could be true for ANY structures, even without using the packed attribute. It is the user's problem if he passes incorrectly aligned pointers.
This typically happens when networking.
Yes, and usually you will have to explicitely deal with this.
IMHO __attribute((packed)) is deceivingly named, since it implies two things.
I think you are wrong. You are trying to explain the current imple- mentation of GCC for ARM, assuming silently that the implementation is correct. I say: it is broken.
- Don't leave any space between fields of the structure.
Correct - this is what the documentation says:
The `packed' attribute specifies that a variable or structure field should have the smallest possible alignment--one byte for a variable, and one bit for a field, unless you specify a larger value with the `aligned' attribute.
Here is a structure in which the field `x' is packed, so that it immediately follows `a':
struct foo { char a; int x[2] __attribute__ ((packed)); };
- Access the fields of the structure even when the alignment of the pointer pointing to it is wrong.
Wrong. Remember that a structure may be a representation for things like hardware registers. Registers may have funny properties, like auto-incrementing with each access, or auto-resetting on read. If I perform a 32 bit read on such an object the compiler MUST NOT break this up into any other combination of accesses.
Just by the name, you expect only (1).
By the name, and the documentation, and common sense.
I think we have covered all relevant facts now. I understand that we won't have GCC fixed soon, and I already suggested what I think should be used to work around this compiler bug (using aligned).
Best regards,
Wolfgang Denk

Wolfgang Denk wrote:
In message 426DDE65.4060009@intracom.gr you wrote:
...which is broken, especially when there is no need to do this since the fields are properly aligned.
The compiler does not know this. A pointer to the structure could be pointing to an non-aligned address, and when an aligned field within the structure is accessed it will be not aligned.
This is no reason for anything. The same could be true for ANY structures, even without using the packed attribute. It is the user's problem if he passes incorrectly aligned pointers.
No, it is the compiler's problem if it incorrectly generates code which breaks given an access to an unaligned field. The pointer itself could very well be correctly aligned, and yet be pointing to an unaligned data structure. The compiler has no way of knowing this at compile-time, and hence must generate code which is the "least common denominator", i.e. - the code will work, correctly, in ALL situations, regardless of the structure's alignment.
This typically happens when networking.
Yes, and usually you will have to explicitely deal with this.
IMHO __attribute((packed)) is deceivingly named, since it implies two things.
I think you are wrong. You are trying to explain the current imple- mentation of GCC for ARM, assuming silently that the implementation is correct. I say: it is broken.
Given that a similar series of bytewise accesses will also be generated using armcc, I'd say gcc is _consistent_.
http://www.arm.com/support/faqdev/1469.html http://www.arm.com/support/faqdev/1228.html http://www.arm.com/support/faqip/3661.html
Again, the compiler MUST ensure that no incorrect code is generated to perform the access. Simply because said access is not performed in what may be the optimal way, does NOT mean the compiler is broken, or that this is a bug.
From one of the above referenced tech notes: "Unknown alignment (eg. a pointer to a word that can be at any address)
This is specified by a pointer to a __packed item, eg.
__packed int *pi;
In this case, the ARM compilers generate code which correctly accesses the value regardless of the alignment of the pointer. This code generated will be a sequence of byte accesses, or variable alignment-dependent shifting and masking (depending on the compile options) and will therefore incur a performance penalty.
Note that *any* __packed object accessed through a pointer has unknown alignment, even packed structures."
- Don't leave any space between fields of the structure.
Correct - this is what the documentation says:
The `packed' attribute specifies that a variable or structure field should have the smallest possible alignment--one byte for a variable, and one bit for a field, unless you specify a larger value with the `aligned' attribute. Here is a structure in which the field `x' is packed, so that it immediately follows `a': struct foo { char a; int x[2] __attribute__ ((packed)); };
...which further implies that ANY access to a member of a packed struct may be unaligned, even if it is aligned within the structure definition itself, if the structure itself is unaligned in memory...
struct foo { unsigned long foo1; unsigned long foo2; unsigned long foo3; } __attribute__ ((packed));
unsigned long dummy(struct foo *p) { p = (struct foo *)1; return(p->foo1 + p->foo2 + p->foo3); }
Yes, the above is contrived, but it serves to illustrate the point.
- Access the fields of the structure even when the alignment of the pointer pointing to it is wrong.
Wrong. Remember that a structure may be a representation for things like hardware registers. Registers may have funny properties, like auto-incrementing with each access, or auto-resetting on read. If I perform a 32 bit read on such an object the compiler MUST NOT break this up into any other combination of accesses.
No, right. If the _user_ chooses to implement such a structure for memory-mapped accesses, it is then up to the _user_ to validate that the accesses will be performed as intended. The _compiler_ is responsible ONLY for generating code to correctly load/store all 32 bits of data. There is nothing which requires that the load/store be atomic in nature. The _compiler_ has no knowledge it is reading/writing a register which requires atomic accesses, only the _user_ does.
Just by the name, you expect only (1).
By the name, and the documentation, and common sense.
Then I doubt we share a common sense of "common sense". ;)
Oh, the pain of the pun!
I think we have covered all relevant facts now. I understand that we won't have GCC fixed soon, and I already suggested what I think should be used to work around this compiler bug (using aligned).
gcc does not need fixed in this regard; the above is not a bug. gcc is generating correct code in all cases given the fact that the ARM architecture has no support for unaligned accesses.
Please cc: me directly if anyone wishes to discuss this further, as I should think this is now well off-topic for the list.
-Cory
participants (6)
-
Arthur Shipkowski
-
Cory Tusar
-
Jean-Paul Saman
-
Pantelis Antoniou
-
Wolfgang Denk
-
Woojung.Huh@smsc.com