[U-Boot] Data Abort with gcc 7.1

Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
Thanks! Maxime

On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.

Maxime,
On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
-munaligned-access -mno-unaligned-access Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
Regards, Phil.

On Tue, Jul 11, 2017 at 07:59:21PM +0200, Dr. Philipp Tomsich wrote:
Maxime,
On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
-munaligned-access -mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
It indeed fixed the issue. There might just have been a subtle change of behaviour in GCC, and this is probably going to bite us in other areas.
I'll send a patch to add the packed attribute.
Thanks! Maxime

On Wed, Jul 12, 2017 at 04:20:52PM +0200, Maxime Ripard wrote:
On Tue, Jul 11, 2017 at 07:59:21PM +0200, Dr. Philipp Tomsich wrote:
Maxime,
On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
-munaligned-access -mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
It indeed fixed the issue. There might just have been a subtle change of behaviour in GCC, and this is probably going to bite us in other areas.
I'll send a patch to add the packed attribute.
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)

On 12 Jul 2017, at 16:34, Tom Rini trini@konsulko.com wrote:
On Wed, Jul 12, 2017 at 04:20:52PM +0200, Maxime Ripard wrote:
On Tue, Jul 11, 2017 at 07:59:21PM +0200, Dr. Philipp Tomsich wrote:
Maxime,
On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
-munaligned-access -mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
It indeed fixed the issue. There might just have been a subtle change of behaviour in GCC, and this is probably going to bite us in other areas.
I'll send a patch to add the packed attribute.
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as
Note that the requirement regarding ‘packed’ has already been in place on GCC-4.9, GCC-5 and GCC-6: https://gcc.gnu.org/onlinedocs/gcc-4.9.3/gcc/ARM-Options.html#ARM-Options https://gcc.gnu.org/onlinedocs/gcc-5.4.0/gcc/ARM-Options.html#ARM-Options https://gcc.gnu.org/onlinedocs/gcc-6.3.0/gcc/ARM-Options.html#ARM-Options
It’s just that it might not have been enforced before (i.e. that the implementation only considered the flag and didn’t take into account if a structure was declared packed or not).
I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)
I have been using a gcc-7.1 for AArch64 for a couple of wweks now and haven’t had any issues… however, we always build using crosstools and it’s configures as an aarch64-elf (and not an aarch64-linux), which can change behaviour.
Phil.

Hello,
On Wed, 12 Jul 2017 10:34:47 -0400, Tom Rini wrote:
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)
Gcc 7.x toolchains are high on the priority list for toolchains.free-electrons.com :)
Thomas

On Wed, Jul 12, 2017 at 05:30:14PM +0200, Thomas Petazzoni wrote:
Hello,
On Wed, 12 Jul 2017 10:34:47 -0400, Tom Rini wrote:
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)
Gcc 7.x toolchains are high on the priority list for toolchains.free-electrons.com :)
Cool. I really want to move buildman to fetching from toolchains.free-electrons.com rather than kernel.org so that we can have gcc-6.x and then gcc-7.x available.

Tom & Maxime,
On 12 Jul 2017, at 16:34, Tom Rini trini@konsulko.com wrote:
On Wed, Jul 12, 2017 at 04:20:52PM +0200, Maxime Ripard wrote:
On Tue, Jul 11, 2017 at 07:59:21PM +0200, Dr. Philipp Tomsich wrote:
Maxime,
On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
-munaligned-access -mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
It indeed fixed the issue. There might just have been a subtle change of behaviour in GCC, and this is probably going to bite us in other areas.
I'll send a patch to add the packed attribute.
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)
So there’s the remaining question of how to fix this permanently: — with my compiler engineering hat on, I’d recommend to mark this as a packed struct, as that’s what it is: the compiler needs to keep it packed, because that is how it is received/sent on the wire — rereading the doc/README.unaligned-memory-access.txt, the preferred solution in U-Boot would be to use put/get_unaligned to access these fields (although I have concerns with this—see below).
I honestly wonder if the recommendation (to avoid ‘packed’) from the README is appropriate for many of the data structure declarations in U-Boot which deal with the external representation of data (i.e. DMA descriptors, memory-mapped register files and data sent on a wire): the C language does not offer any protection against a compiler adding patting between structure members, as it sees fit. The original wording from the standard is: 14 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type. 15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
In other words: there’s nothing in the standard from stopping a compiler to insert additional padding within structures, unless the ‘packed’ attribute is added.
Regards, Philipp.

On Wed, Jul 12, 2017 at 05:48:16PM +0200, Dr. Philipp Tomsich wrote:
Tom & Maxime,
On 12 Jul 2017, at 16:34, Tom Rini trini@konsulko.com wrote:
On Wed, Jul 12, 2017 at 04:20:52PM +0200, Maxime Ripard wrote:
On Tue, Jul 11, 2017 at 07:59:21PM +0200, Dr. Philipp Tomsich wrote:
Maxime,
On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
-munaligned-access -mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
It indeed fixed the issue. There might just have been a subtle change of behaviour in GCC, and this is probably going to bite us in other areas.
I'll send a patch to add the packed attribute.
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)
So there’s the remaining question of how to fix this permanently: — with my compiler engineering hat on, I’d recommend to mark this as a packed struct, as that’s what it is: the compiler needs to keep it packed, because that is how it is received/sent on the wire
OK. And are there other structures that we should mark as packed as well in include/net.h? I suspect so.
— rereading the doc/README.unaligned-memory-access.txt, the preferred solution in U-Boot would be to use put/get_unaligned to access these fields (although I have concerns with this—see below).
I honestly wonder if the recommendation (to avoid ‘packed’) from the README is appropriate for many of the data structure declarations in U-Boot which deal with the external representation of data (i.e. DMA descriptors, memory-mapped register files and data sent on a wire): the C language does not offer any protection against a compiler adding patting between structure members, as it sees fit. The original wording from the standard is: 14 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type. 15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
In other words: there’s nothing in the standard from stopping a compiler to insert additional padding within structures, unless the ‘packed’ attribute is added.
So, this is partly why I think we need to perhaps add / update / change some wording in the current file. The overall desire is to avoid "doesn't work, packed fixes it, packed must be the right fix!". We want to emphasize that packed should be used when it's appropriate, ie the cases you mention above. Thanks!

On 12 Jul 2017, at 19:49, Tom Rini trini@konsulko.com wrote:
On Wed, Jul 12, 2017 at 05:48:16PM +0200, Dr. Philipp Tomsich wrote:
Tom & Maxime,
On 12 Jul 2017, at 16:34, Tom Rini trini@konsulko.com wrote:
On Wed, Jul 12, 2017 at 04:20:52PM +0200, Maxime Ripard wrote:
On Tue, Jul 11, 2017 at 07:59:21PM +0200, Dr. Philipp Tomsich wrote:
Maxime,
On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote: > Hi, > > I recently got a gcc 7.1 based toolchain, and it seems like it > generates unaligned code, specifically in the net_set_ip_header > function in my case. > > Whenever some packet is sent, this data abort is triggered: > > => setenv ipaddr 10.42.0.1; ping 10.42.0.254 > using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in > MAC de:ad:be:ef:00:01 > HOST MAC de:ad:be:af:00:00 > RNDIS ready > musb-hdrc: peripheral reset irq lost! > high speed config #2: 2 mA, Ethernet Gadget, using RNDIS > USB RNDIS network up! > Using usb_ether device > data abort > pc : [<7ff9db10>] lr : [<7ff9f00c>] > reloc pc : [<4a043b10>] lr : [<4a04500c>] > sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c > r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 > r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e > r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e > Flags: nZCv IRQs off FIQs off Mode SVC_32 > Resetting CPU ... > > > Running objdump on it gives us this: > > 4a043b04 <net_set_ip_header>: > > /* > * Construct an IP header. > */ > /* IP_HDR_SIZE / 4 (not including UDP) */ > ip->ip_hl_v = 0x45; > 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> > { > 4a043b08: e92d4013 push {r0, r1, r4, lr} > 4a043b0c: e1a04000 mov r4, r0 > ip->ip_hl_v = 0x45; > 4a043b10: e5803000 str r3, [r0] <---- Abort > ip->ip_tos = 0; > ip->ip_len = htons(IP_HDR_SIZE); > ip->ip_id = htons(net_ip_id++); > 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80> > > It seems like r0 is indeed set to an unaligned address (0x7ffed30e) > for some reason. > > Using a Linaro 6.3 toolchain works on the same commit with the same > config, so it really seems to be a compiler-related issue. > > It generates this code: > > 4a043ec4 <net_set_ip_header>: > > /* > * Construct an IP header. > */ > /* IP_HDR_SIZE / 4 (not including UDP) */ > ip->ip_hl_v = 0x45; > 4a043ec4: e3a03045 mov r3, #69 ; 0x45 > { > 4a043ec8: e92d4013 push {r0, r1, r4, lr} > 4a043ecc: e1a04000 mov r4, r0 > ip->ip_hl_v = 0x45; > 4a043ed0: e5c03000 strb r3, [r0] > ip->ip_tos = 0; > ip->ip_len = htons(IP_HDR_SIZE); > 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 > ip->ip_tos = 0; > 4a043ed8: e3a00000 mov r0, #0 > ip->ip_len = htons(IP_HDR_SIZE); > 4a043edc: e1c430b2 strh r3, [r4, #2] > ip->ip_id = htons(net_ip_id++); > 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88> > > And it seems like it's using an strb instruction to avoid the > unaligned access. > > As far as I know, we are passing --wno-unaligned-access, so the broken > situation should not arise, and yet it does, so I'm a bit confused, > and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
-munaligned-access -mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
It indeed fixed the issue. There might just have been a subtle change of behaviour in GCC, and this is probably going to bite us in other areas.
I'll send a patch to add the packed attribute.
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)
So there’s the remaining question of how to fix this permanently: — with my compiler engineering hat on, I’d recommend to mark this as a packed struct, as that’s what it is: the compiler needs to keep it packed, because that is how it is received/sent on the wire
OK. And are there other structures that we should mark as packed as well in include/net.h? I suspect so.
— rereading the doc/README.unaligned-memory-access.txt, the preferred solution in U-Boot would be to use put/get_unaligned to access these fields (although I have concerns with this—see below).
I honestly wonder if the recommendation (to avoid ‘packed’) from the README is appropriate for many of the data structure declarations in U-Boot which deal with the external representation of data (i.e. DMA descriptors, memory-mapped register files and data sent on a wire): the C language does not offer any protection against a compiler adding patting between structure members, as it sees fit. The original wording from the standard is: 14 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type. 15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
In other words: there’s nothing in the standard from stopping a compiler to insert additional padding within structures, unless the ‘packed’ attribute is added.
So, this is partly why I think we need to perhaps add / update / change some wording in the current file. The overall desire is to avoid "doesn't work, packed fixes it, packed must be the right fix!". We want to emphasize that packed should be used when it's appropriate, ie the cases you mention above. Thanks!
I’ll have a shot at rewording this. Give me a few days to get around to it.
—Phil.

On Wed, 12 Jul 2017 17:48:16 +0200 "Dr. Philipp Tomsich" philipp.tomsich@theobroma-systems.com wrote:
Tom & Maxime,
On 12 Jul 2017, at 16:34, Tom Rini trini@konsulko.com wrote:
On Wed, Jul 12, 2017 at 04:20:52PM +0200, Maxime Ripard wrote:
On Tue, Jul 11, 2017 at 07:59:21PM +0200, Dr. Philipp Tomsich wrote:
Maxime,
On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
-munaligned-access -mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
It indeed fixed the issue. There might just have been a subtle change of behaviour in GCC, and this is probably going to bite us in other areas.
I'll send a patch to add the packed attribute.
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)
So there’s the remaining question of how to fix this permanently: — with my compiler engineering hat on, I’d recommend to mark this as a packed struct, as that’s what it is: the compiler needs to keep it packed, because that is how it is received/sent on the wire — rereading the doc/README.unaligned-memory-access.txt, the preferred solution in U-Boot would be to use put/get_unaligned to access these fields (although I have concerns with this—see below).
I honestly wonder if the recommendation (to avoid ‘packed’) from the README is appropriate for many of the data structure declarations in U-Boot which deal with the external representation of data (i.e. DMA descriptors, memory-mapped register files and data sent on a wire): the C language does not offer any protection against a compiler adding patting between structure members, as it sees fit. The original wording from the standard is: 14 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type. 15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
In other words: there’s nothing in the standard from stopping a compiler to insert additional padding within structures, unless the ‘packed’ attribute is added.
I would strongly advise against adding the "packed" attribute everywhere unnecessarily. This just makes the code bigger and slower.
The ANSI/ISO C language standard indeed leaves a lot up to the implementation. But we also have ABI documents for each platform, which cover all of these details. We just need to use them.
In the case of 32-bit ARM, it is http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf
In the case of 64-bit ARM, it is http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.p...

On 12 Jul 2017, at 20:07, Siarhei Siamashka siarhei.siamashka@gmail.com wrote:
On Wed, 12 Jul 2017 17:48:16 +0200 "Dr. Philipp Tomsich" philipp.tomsich@theobroma-systems.com wrote:
Tom & Maxime,
On 12 Jul 2017, at 16:34, Tom Rini trini@konsulko.com wrote:
On Wed, Jul 12, 2017 at 04:20:52PM +0200, Maxime Ripard wrote:
On Tue, Jul 11, 2017 at 07:59:21PM +0200, Dr. Philipp Tomsich wrote:
Maxime,
On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote: > Hi, > > I recently got a gcc 7.1 based toolchain, and it seems like it > generates unaligned code, specifically in the net_set_ip_header > function in my case. > > Whenever some packet is sent, this data abort is triggered: > > => setenv ipaddr 10.42.0.1; ping 10.42.0.254 > using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in > MAC de:ad:be:ef:00:01 > HOST MAC de:ad:be:af:00:00 > RNDIS ready > musb-hdrc: peripheral reset irq lost! > high speed config #2: 2 mA, Ethernet Gadget, using RNDIS > USB RNDIS network up! > Using usb_ether device > data abort > pc : [<7ff9db10>] lr : [<7ff9f00c>] > reloc pc : [<4a043b10>] lr : [<4a04500c>] > sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c > r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 > r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e > r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e > Flags: nZCv IRQs off FIQs off Mode SVC_32 > Resetting CPU ... > > > Running objdump on it gives us this: > > 4a043b04 <net_set_ip_header>: > > /* > * Construct an IP header. > */ > /* IP_HDR_SIZE / 4 (not including UDP) */ > ip->ip_hl_v = 0x45; > 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> > { > 4a043b08: e92d4013 push {r0, r1, r4, lr} > 4a043b0c: e1a04000 mov r4, r0 > ip->ip_hl_v = 0x45; > 4a043b10: e5803000 str r3, [r0] <---- Abort > ip->ip_tos = 0; > ip->ip_len = htons(IP_HDR_SIZE); > ip->ip_id = htons(net_ip_id++); > 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80> > > It seems like r0 is indeed set to an unaligned address (0x7ffed30e) > for some reason. > > Using a Linaro 6.3 toolchain works on the same commit with the same > config, so it really seems to be a compiler-related issue. > > It generates this code: > > 4a043ec4 <net_set_ip_header>: > > /* > * Construct an IP header. > */ > /* IP_HDR_SIZE / 4 (not including UDP) */ > ip->ip_hl_v = 0x45; > 4a043ec4: e3a03045 mov r3, #69 ; 0x45 > { > 4a043ec8: e92d4013 push {r0, r1, r4, lr} > 4a043ecc: e1a04000 mov r4, r0 > ip->ip_hl_v = 0x45; > 4a043ed0: e5c03000 strb r3, [r0] > ip->ip_tos = 0; > ip->ip_len = htons(IP_HDR_SIZE); > 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 > ip->ip_tos = 0; > 4a043ed8: e3a00000 mov r0, #0 > ip->ip_len = htons(IP_HDR_SIZE); > 4a043edc: e1c430b2 strh r3, [r4, #2] > ip->ip_id = htons(net_ip_id++); > 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88> > > And it seems like it's using an strb instruction to avoid the > unaligned access. > > As far as I know, we are passing --wno-unaligned-access, so the broken > situation should not arise, and yet it does, so I'm a bit confused, > and not really sure what to do from there.
Can you reduce the code into a testcase? I think the first step is filing a bug with gcc and seeing where it goes from there as yes, we should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
-munaligned-access -mno-unaligned-access
Enables (or disables) reading and writing of 16- and 32- bit values from addresses that are not 16- or 32- bit aligned. By default unaligned access is disabled for all pre-ARMv6, all ARMv6-M and for ARMv8-M Baseline architectures, and enabled for all other architectures. If unaligned access is not enabled then words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
It indeed fixed the issue. There might just have been a subtle change of behaviour in GCC, and this is probably going to bite us in other areas.
I'll send a patch to add the packed attribute.
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)
So there’s the remaining question of how to fix this permanently: — with my compiler engineering hat on, I’d recommend to mark this as a packed struct, as that’s what it is: the compiler needs to keep it packed, because that is how it is received/sent on the wire — rereading the doc/README.unaligned-memory-access.txt, the preferred solution in U-Boot would be to use put/get_unaligned to access these fields (although I have concerns with this—see below).
I honestly wonder if the recommendation (to avoid ‘packed’) from the README is appropriate for many of the data structure declarations in U-Boot which deal with the external representation of data (i.e. DMA descriptors, memory-mapped register files and data sent on a wire): the C language does not offer any protection against a compiler adding patting between structure members, as it sees fit. The original wording from the standard is: 14 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type. 15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
In other words: there’s nothing in the standard from stopping a compiler to insert additional padding within structures, unless the ‘packed’ attribute is added.
I would strongly advise against adding the "packed" attribute everywhere unnecessarily. This just makes the code bigger and slower.
The intent was not to add it everywhere, but only to choose this solution when encountering cases like this one: i.e. cases where the source code was risking future trouble.
Clearly we can rely on ABI documents when we have code that targets a specific architecture (e.g the arch/arm), but should not do so in the architecture-independent code.
And clearly there’s also the aspect of needing the ‘packed’ to enable GCC to process the unaligned data elements with “-mno-unaligned-accesses”.
The ANSI/ISO C language standard indeed leaves a lot up to the implementation. But we also have ABI documents for each platform, which cover all of these details. We just need to use them.
In the case of 32-bit ARM, it is http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf
In the case of 64-bit ARM, it is http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.p...
Given that the common header files should be used across architectures (and also future architectures, once someone decides to port U-Boot there), we shouldn’t assume that the ABI documents for each platform will define this in such a way that we can rely on.
Cheers, Phil.

From: "Dr. Philipp Tomsich" philipp.tomsich@theobroma-systems.com Date: Wed, 12 Jul 2017 20:26:13 +0200
On 12 Jul 2017, at 20:07, Siarhei Siamashka siarhei.siamashka@gmail.com wrote:
On Wed, 12 Jul 2017 17:48:16 +0200 "Dr. Philipp Tomsich" philipp.tomsich@theobroma-systems.com wrote:
Tom & Maxime,
On 12 Jul 2017, at 16:34, Tom Rini trini@konsulko.com wrote:
On Wed, Jul 12, 2017 at 04:20:52PM +0200, Maxime Ripard wrote:
On Tue, Jul 11, 2017 at 07:59:21PM +0200, Dr. Philipp Tomsich wrote:
Maxime,
> On 11 Jul 2017, at 18:59, Tom Rini trini@konsulko.com wrote: > > On Tue, Jul 11, 2017 at 06:54:55PM +0200, Maxime Ripard wrote: >> Hi, >> >> I recently got a gcc 7.1 based toolchain, and it seems like it >> generates unaligned code, specifically in the net_set_ip_header >> function in my case. >> >> Whenever some packet is sent, this data abort is triggered: >> >> => setenv ipaddr 10.42.0.1; ping 10.42.0.254 >> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in >> MAC de:ad:be:ef:00:01 >> HOST MAC de:ad:be:af:00:00 >> RNDIS ready >> musb-hdrc: peripheral reset irq lost! >> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS >> USB RNDIS network up! >> Using usb_ether device >> data abort >> pc : [<7ff9db10>] lr : [<7ff9f00c>] >> reloc pc : [<4a043b10>] lr : [<4a04500c>] >> sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c >> r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 >> r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e >> r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e >> Flags: nZCv IRQs off FIQs off Mode SVC_32 >> Resetting CPU ... >> >> >> Running objdump on it gives us this: >> >> 4a043b04 <net_set_ip_header>: >> >> /* >> * Construct an IP header. >> */ >> /* IP_HDR_SIZE / 4 (not including UDP) */ >> ip->ip_hl_v = 0x45; >> 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> >> { >> 4a043b08: e92d4013 push {r0, r1, r4, lr} >> 4a043b0c: e1a04000 mov r4, r0 >> ip->ip_hl_v = 0x45; >> 4a043b10: e5803000 str r3, [r0] <---- Abort >> ip->ip_tos = 0; >> ip->ip_len = htons(IP_HDR_SIZE); >> ip->ip_id = htons(net_ip_id++); >> 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80> >> >> It seems like r0 is indeed set to an unaligned address (0x7ffed30e) >> for some reason. >> >> Using a Linaro 6.3 toolchain works on the same commit with the same >> config, so it really seems to be a compiler-related issue. >> >> It generates this code: >> >> 4a043ec4 <net_set_ip_header>: >> >> /* >> * Construct an IP header. >> */ >> /* IP_HDR_SIZE / 4 (not including UDP) */ >> ip->ip_hl_v = 0x45; >> 4a043ec4: e3a03045 mov r3, #69 ; 0x45 >> { >> 4a043ec8: e92d4013 push {r0, r1, r4, lr} >> 4a043ecc: e1a04000 mov r4, r0 >> ip->ip_hl_v = 0x45; >> 4a043ed0: e5c03000 strb r3, [r0] >> ip->ip_tos = 0; >> ip->ip_len = htons(IP_HDR_SIZE); >> 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 >> ip->ip_tos = 0; >> 4a043ed8: e3a00000 mov r0, #0 >> ip->ip_len = htons(IP_HDR_SIZE); >> 4a043edc: e1c430b2 strh r3, [r4, #2] >> ip->ip_id = htons(net_ip_id++); >> 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88> >> >> And it seems like it's using an strb instruction to avoid the >> unaligned access. >> >> As far as I know, we are passing --wno-unaligned-access, so the broken >> situation should not arise, and yet it does, so I'm a bit confused, >> and not really sure what to do from there. > > Can you reduce the code into a testcase? I think the first step is > filing a bug with gcc and seeing where it goes from there as yes, we > should be passing -mno-unaligned-access.
I don’t think that this is a GCC bug, as “-mno-unaligned-access” will change the behaviour for packed data-structures only. Here’s from the GCC docs:
> -munaligned-access > -mno-unaligned-access > > Enables (or disables) reading and writing of 16- and 32- bit > values from addresses that are not 16- or 32- bit aligned. By > default unaligned access is disabled for all pre-ARMv6, all > ARMv6-M and for ARMv8-M Baseline architectures, and enabled for > all other architectures. If unaligned access is not enabled then > words in packed data structures are accessed a byte at a time.
The key word seems to be “in packed data structures”. However, I don’t see an attribute “packed” for the 'struct ip_udp_hdr’ (in include/net.h).
Could you try to verify that the error reproduces with a packed variant of the ‘struct ip_udp_hdr’?
It indeed fixed the issue. There might just have been a subtle change of behaviour in GCC, and this is probably going to bite us in other areas.
I'll send a patch to add the packed attribute.
Please bear in mind that packed should be used carefully. We've had some discussions about this before and have doc/README.unaligned-memory-access.txt which may need a little more updating now as well, depending on what the final resolution here is as I seem to recall some other problem reports with gcc-7.x, but I've not personally been able to hit these just yet. But I need to get on that soon. http://toolchains.free-electrons.com/ is awesome and I eagerly await gcc-7.x toolchains there if I can't get something else spun up in a chroot soon :)
So there’s the remaining question of how to fix this permanently: — with my compiler engineering hat on, I’d recommend to mark this as a packed struct, as that’s what it is: the compiler needs to keep it packed, because that is how it is received/sent on the wire — rereading the doc/README.unaligned-memory-access.txt, the preferred solution in U-Boot would be to use put/get_unaligned to access these fields (although I have concerns with this—see below).
I honestly wonder if the recommendation (to avoid ‘packed’) from the README is appropriate for many of the data structure declarations in U-Boot which deal with the external representation of data (i.e. DMA descriptors, memory-mapped register files and data sent on a wire): the C language does not offer any protection against a compiler adding patting between structure members, as it sees fit. The original wording from the standard is: 14 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type. 15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
In other words: there’s nothing in the standard from stopping a compiler to insert additional padding within structures, unless the ‘packed’ attribute is added.
I would strongly advise against adding the "packed" attribute everywhere unnecessarily. This just makes the code bigger and slower.
The intent was not to add it everywhere, but only to choose this solution when encountering cases like this one: i.e. cases where the source code was risking future trouble.
Clearly we can rely on ABI documents when we have code that targets a specific architecture (e.g the arch/arm), but should not do so in the architecture-independent code.
Well, all ABIs I'm familliar with are "compatible" with some common sense rules about padding and alignment. Basically if one assumes natural alignment for all the types, and avoids holes in the struct, it should be fine to not mark the struct as "packed". However, one must make sure that pointers to such structs are always properly aligned.
The IP header case is notorious. The Ethernet header is 20 bytes. The IP header immediately follows te Ethernet header. So if you receive packets in buffers that are aligned, the IP header will be misaligned. The common solution is to receive packets in deliberately misaligned buffers. Not all hardware supports that though. Which means the packet has to be copied.
It doesn't help that the dominant x86 architecture doesn't really care about alignment. So people keep writing code that doesn't consider proper alignment of packet buffers. Sadly ARM has followed the x86 example and now handles misaligned loads and stores in hardware as well on ARMv7 and ARMv8.
And clearly there’s also the aspect of needing the ‘packed’ to enable GCC to process the unaligned data elements with “-mno-unaligned-accesses”.
The ANSI/ISO C language standard indeed leaves a lot up to the implementation. But we also have ABI documents for each platform, which cover all of these details. We just need to use them.
In the case of 32-bit ARM, it is http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf
In the case of 64-bit ARM, it is http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.p...
Given that the common header files should be used across architectures (and also future architectures, once someone decides to port U-Boot there), we shouldn’t assume that the ABI documents for each platform will define this in such a way that we can rely on.
I'm not sure these "alignment" semantics of the "packed" attribute can be relied on. In the past those semantics have been unreliable. The best way to avoid unaligned access is to make sure your packets are properly (mis)aligned.

On Tue, 11 Jul 2017 18:54:55 +0200 Maxime Ripard maxime.ripard@free-electrons.com wrote:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
void net_set_ip_header(uchar *pkt, struct in_addr dest, struct in_addr source) { struct ip_udp_hdr *ip = (struct ip_udp_hdr *)pkt;
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); ip->ip_off = htons(IP_FLAGS_DFRAG); /* Don't fragment */ ip->ip_ttl = 255; ip->ip_sum = 0; /* already in network byte order */ net_copy_ip((void *)&ip->ip_src, &source); /* already in network byte order */ net_copy_ip((void *)&ip->ip_dst, &dest); }
This looks like a real bug in the U-Boot code. When we are casting from "uchar *pkt" to "struct ip_udp_hdr *ip", the pointer has to be properly aligned to the struct alignment.
Then we need to refer to the AAPCS document for the size and alignment requirements used by ARM EABI. And it says the following:
4.3.1 Aggregates
* The alignment of an aggregate shall be the alignment of its most-aligned component. * The size of an aggregate shall be the smallest multiple of its alignment that is sufficient to hold all of its members when they are laid out according to these rules.
Basically, according to these rules, the alignment of the "ip_udp_hdr" must be 4 bytes because of the "in_addr" struct in it:
/* IPv4 addresses are always 32 bits in size */ struct in_addr { __be32 s_addr; };
The __be32 typedef is somewhat tricky, because it has a "bitwise" attribute, but such attribute has no real meaning in GCC:
https://gcc.gnu.org/ml/gcc-help/2008-03/msg00265.html
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Yes, if the caller passes an improperly aligned pointer, then undefined behaviour may happen. And 2 bytes alignment is not good enough for the "ip_udp_hdr" struct.
Apparently GCC 7 tries to optimize the code by doing initialization of multiple u8/u16 fields as 32-bit writes. Because it rightfully expects proper 32-bit alignment for the structure.
The 32-bit fields "ip_src" and "ip_dst" are not initialized in this function, so they did not trigger the data abort before.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
Yes, GCC 6 was not smart enough to combine initialization of multiple u8/u16 fields with a single 32-bit write.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
The --wno-unaligned-access option does not help because the compiler assumes that the struct pointer is properly aligned.
This bug can be fixed by either:
1) Always ensure proper alignment of the udp packet header pointers, which are passed to the net_set_ip_header() function and similar functions. Some further investigation is necessary.
or
2) Just add a "packed" attribute to the ip_udp_hdr struct. In this case the compiler will always assume the smallest 1 byte alignment. It may have some performance implications though.

Maxime Ripard maxime.ripard@free-electrons.com writes:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
For reference, this is the relevant code:
struct ip_udp_hdr { u8 ip_hl_v; /* header length and version */ u8 ip_tos; /* type of service */ u16 ip_len; /* total length */ u16 ip_id; /* identification */ u16 ip_off; /* fragment offset field */ u8 ip_ttl; /* time to live */ u8 ip_p; /* protocol */ u16 ip_sum; /* checksum */ struct in_addr ip_src; /* Source IP address */ struct in_addr ip_dst; /* Destination IP address */ u16 udp_src; /* UDP source port */ u16 udp_dst; /* UDP destination port */ u16 udp_len; /* Length of UDP packet */ u16 udp_xsum; /* Checksum */ };
void net_set_ip_header(uchar *pkt, struct in_addr dest, struct in_addr source) { struct ip_udp_hdr *ip = (struct ip_udp_hdr *)pkt;
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); ip->ip_off = htons(IP_FLAGS_DFRAG); /* Don't fragment */ ip->ip_ttl = 255; ip->ip_sum = 0; /* already in network byte order */ net_copy_ip((void *)&ip->ip_src, &source); /* already in network byte order */ net_copy_ip((void *)&ip->ip_dst, &dest); }
What's changed with gcc7 is that it now merges the writes to the first three fields (occupying 4 bytes) into a single 32-bit store. There is nothing wrong with this.
Now struct ip_udp_hdr includes 32-bit fields, so it's natural alignment is 4 bytes. If the 'pkt' argument is not adequately aligned, the cast in the first line of the function becomes invalid. Judging by the register dump and disassembly, the pointer is indeed not thusly aligned, and this is an error.
The misaligned pointer should still not cause a hardware abort, however, on ARMv6 or later which permits unaligned addresses with the STR instruction. That is unless some idiot has gone and enabled strict alignment checking again despite this being against all common sense. There was a lengthy argument about this a few years ago, ultimately resulting in the proper settings being put in place.
What hardware did this happen on? If it was on ARMv5, adding the packed attribute is probably the correct fix. If it was ARMv6 or later, something else is broken as well.

On Thu, 13 Jul 2017 01:43:37 +0100 Måns Rullgård mans@mansr.com wrote:
Maxime Ripard maxime.ripard@free-electrons.com writes:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
For reference, this is the relevant code:
struct ip_udp_hdr { u8 ip_hl_v; /* header length and version */ u8 ip_tos; /* type of service */ u16 ip_len; /* total length */ u16 ip_id; /* identification */ u16 ip_off; /* fragment offset field */ u8 ip_ttl; /* time to live */ u8 ip_p; /* protocol */ u16 ip_sum; /* checksum */ struct in_addr ip_src; /* Source IP address */ struct in_addr ip_dst; /* Destination IP address */ u16 udp_src; /* UDP source port */ u16 udp_dst; /* UDP destination port */ u16 udp_len; /* Length of UDP packet */ u16 udp_xsum; /* Checksum */ };
void net_set_ip_header(uchar *pkt, struct in_addr dest, struct in_addr source) { struct ip_udp_hdr *ip = (struct ip_udp_hdr *)pkt;
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); ip->ip_off = htons(IP_FLAGS_DFRAG); /* Don't fragment */ ip->ip_ttl = 255; ip->ip_sum = 0; /* already in network byte order */ net_copy_ip((void *)&ip->ip_src, &source); /* already in network byte order */ net_copy_ip((void *)&ip->ip_dst, &dest); }
What's changed with gcc7 is that it now merges the writes to the first three fields (occupying 4 bytes) into a single 32-bit store. There is nothing wrong with this.
Now struct ip_udp_hdr includes 32-bit fields, so it's natural alignment is 4 bytes. If the 'pkt' argument is not adequately aligned, the cast in the first line of the function becomes invalid. Judging by the register dump and disassembly, the pointer is indeed not thusly aligned, and this is an error.
Yes, your analysis appears to be the exact copy of mine up to this point: https://lists.denx.de/pipermail/u-boot/2017-July/298016.html
The misaligned pointer should still not cause a hardware abort, however, on ARMv6 or later which permits unaligned addresses with the STR instruction. That is unless some idiot has gone and enabled strict alignment checking again despite this being against all common sense. There was a lengthy argument about this a few years ago, ultimately resulting in the proper settings being put in place.
The trick is that unaligned memory accesses still need some special setup even on ARMv6 or later hardware. And we can't always count on having it working when running early bootloader code.
You can check sections "A3.2.2 Cases where unaligned accesses are UNPREDICTABLE" and "B3.12.4 Alignment faults" in the ARM Architecture Reference Manual.
Basically, the MMU has to be enabled. Also unaligned memory accesses can't be done for the memory pages with device or strongly-ordered attribute.
Moreover, not every ARM instruction supports unaligned memory accesses too. For example, LDM/STM instructions don't work with unaligned memory addresses regardless of the SCTLR.A bit. And if the compiler thinks that the pointer is supposed to be properly aligned, then it may use such instructions. A very good testcase is a simple function like this:
int f(int *x) { return x[0] + x[1]; }
The compiler will rightfully generate the following instructions:
00000000 <f>: 0: e8900009 ldm r0, {r0, r3} 4: e0830000 add r0, r3, r0 8: e12fff1e bx lr
If the pointer is misaligned, then it will surely fail.
What hardware did this happen on? If it was on ARMv5, adding the packed attribute is probably the correct fix. If it was ARMv6 or later, something else is broken as well.
It does not matter if this was ARMv6+ hardware or not. The current U-Boot code is wrong and we need to fix it.

Siarhei Siamashka siarhei.siamashka@gmail.com writes:
On Thu, 13 Jul 2017 01:43:37 +0100 Måns Rullgård mans@mansr.com wrote:
Maxime Ripard maxime.ripard@free-electrons.com writes:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
For reference, this is the relevant code:
struct ip_udp_hdr { u8 ip_hl_v; /* header length and version */ u8 ip_tos; /* type of service */ u16 ip_len; /* total length */ u16 ip_id; /* identification */ u16 ip_off; /* fragment offset field */ u8 ip_ttl; /* time to live */ u8 ip_p; /* protocol */ u16 ip_sum; /* checksum */ struct in_addr ip_src; /* Source IP address */ struct in_addr ip_dst; /* Destination IP address */ u16 udp_src; /* UDP source port */ u16 udp_dst; /* UDP destination port */ u16 udp_len; /* Length of UDP packet */ u16 udp_xsum; /* Checksum */ };
void net_set_ip_header(uchar *pkt, struct in_addr dest, struct in_addr source) { struct ip_udp_hdr *ip = (struct ip_udp_hdr *)pkt;
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); ip->ip_off = htons(IP_FLAGS_DFRAG); /* Don't fragment */ ip->ip_ttl = 255; ip->ip_sum = 0; /* already in network byte order */ net_copy_ip((void *)&ip->ip_src, &source); /* already in network byte order */ net_copy_ip((void *)&ip->ip_dst, &dest); }
What's changed with gcc7 is that it now merges the writes to the first three fields (occupying 4 bytes) into a single 32-bit store. There is nothing wrong with this.
Now struct ip_udp_hdr includes 32-bit fields, so it's natural alignment is 4 bytes. If the 'pkt' argument is not adequately aligned, the cast in the first line of the function becomes invalid. Judging by the register dump and disassembly, the pointer is indeed not thusly aligned, and this is an error.
Yes, your analysis appears to be the exact copy of mine up to this point: https://lists.denx.de/pipermail/u-boot/2017-July/298016.html
The misaligned pointer should still not cause a hardware abort, however, on ARMv6 or later which permits unaligned addresses with the STR instruction. That is unless some idiot has gone and enabled strict alignment checking again despite this being against all common sense. There was a lengthy argument about this a few years ago, ultimately resulting in the proper settings being put in place.
The trick is that unaligned memory accesses still need some special setup even on ARMv6 or later hardware. And we can't always count on having it working when running early bootloader code.
You can check sections "A3.2.2 Cases where unaligned accesses are UNPREDICTABLE" and "B3.12.4 Alignment faults" in the ARM Architecture Reference Manual.
Basically, the MMU has to be enabled. Also unaligned memory accesses can't be done for the memory pages with device or strongly-ordered attribute.
The MMU should already be enabled in order for the caches to work.
Moreover, not every ARM instruction supports unaligned memory accesses too. For example, LDM/STM instructions don't work with unaligned memory addresses regardless of the SCTLR.A bit. And if the compiler thinks that the pointer is supposed to be properly aligned, then it may use such instructions. A very good testcase is a simple function like this:
int f(int *x) { return x[0] + x[1]; }
The compiler will rightfully generate the following instructions:
00000000 <f>: 0: e8900009 ldm r0, {r0, r3} 4: e0830000 add r0, r3, r0 8: e12fff1e bx lr
If the pointer is misaligned, then it will surely fail.
I'm well aware of this. However, in the case at hand, it was an STR instruction that failed, and this shouldn't happen on a correctly configured ARMv6+.
What hardware did this happen on? If it was on ARMv5, adding the packed attribute is probably the correct fix. If it was ARMv6 or later, something else is broken as well.
It does not matter if this was ARMv6+ hardware or not. The current U-Boot code is wrong and we need to fix it.
The question is how many errors there are. That's why I asked about the hardware.

On Thu, Jul 13, 2017 at 11:11 AM, Måns Rullgård mans@mansr.com wrote:
Siarhei Siamashka siarhei.siamashka@gmail.com writes:
On Thu, 13 Jul 2017 01:43:37 +0100 Måns Rullgård mans@mansr.com wrote:
Maxime Ripard maxime.ripard@free-electrons.com writes:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
Running objdump on it gives us this:
4a043b04 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043b04: e59f3074 ldr r3, [pc, #116] ; 4a043b80 <net_set_ip_header+0x7c> { 4a043b08: e92d4013 push {r0, r1, r4, lr} 4a043b0c: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043b10: e5803000 str r3, [r0] <---- Abort ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); 4a043b14: e59f3068 ldr r3, [pc, #104] ; 4a043b84 <net_set_ip_header+0x80>
It seems like r0 is indeed set to an unaligned address (0x7ffed30e) for some reason.
Using a Linaro 6.3 toolchain works on the same commit with the same config, so it really seems to be a compiler-related issue.
It generates this code:
4a043ec4 <net_set_ip_header>:
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; 4a043ec4: e3a03045 mov r3, #69 ; 0x45 { 4a043ec8: e92d4013 push {r0, r1, r4, lr} 4a043ecc: e1a04000 mov r4, r0 ip->ip_hl_v = 0x45; 4a043ed0: e5c03000 strb r3, [r0] ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); 4a043ed4: e3a03b05 mov r3, #5120 ; 0x1400 ip->ip_tos = 0; 4a043ed8: e3a00000 mov r0, #0 ip->ip_len = htons(IP_HDR_SIZE); 4a043edc: e1c430b2 strh r3, [r4, #2] ip->ip_id = htons(net_ip_id++); 4a043ee0: e59f3064 ldr r3, [pc, #100] ; 4a043f4c <net_set_ip_header+0x88>
And it seems like it's using an strb instruction to avoid the unaligned access.
As far as I know, we are passing --wno-unaligned-access, so the broken situation should not arise, and yet it does, so I'm a bit confused, and not really sure what to do from there.
For reference, this is the relevant code:
struct ip_udp_hdr { u8 ip_hl_v; /* header length and version */ u8 ip_tos; /* type of service */ u16 ip_len; /* total length */ u16 ip_id; /* identification */ u16 ip_off; /* fragment offset field */ u8 ip_ttl; /* time to live */ u8 ip_p; /* protocol */ u16 ip_sum; /* checksum */ struct in_addr ip_src; /* Source IP address */ struct in_addr ip_dst; /* Destination IP address */ u16 udp_src; /* UDP source port */ u16 udp_dst; /* UDP destination port */ u16 udp_len; /* Length of UDP packet */ u16 udp_xsum; /* Checksum */ };
void net_set_ip_header(uchar *pkt, struct in_addr dest, struct in_addr source) { struct ip_udp_hdr *ip = (struct ip_udp_hdr *)pkt;
/* * Construct an IP header. */ /* IP_HDR_SIZE / 4 (not including UDP) */ ip->ip_hl_v = 0x45; ip->ip_tos = 0; ip->ip_len = htons(IP_HDR_SIZE); ip->ip_id = htons(net_ip_id++); ip->ip_off = htons(IP_FLAGS_DFRAG); /* Don't fragment */ ip->ip_ttl = 255; ip->ip_sum = 0; /* already in network byte order */ net_copy_ip((void *)&ip->ip_src, &source); /* already in network byte order */ net_copy_ip((void *)&ip->ip_dst, &dest);
}
What's changed with gcc7 is that it now merges the writes to the first three fields (occupying 4 bytes) into a single 32-bit store. There is nothing wrong with this.
Now struct ip_udp_hdr includes 32-bit fields, so it's natural alignment is 4 bytes. If the 'pkt' argument is not adequately aligned, the cast in the first line of the function becomes invalid. Judging by the register dump and disassembly, the pointer is indeed not thusly aligned, and this is an error.
Yes, your analysis appears to be the exact copy of mine up to this point: https://lists.denx.de/pipermail/u-boot/2017-July/298016.html
The misaligned pointer should still not cause a hardware abort, however, on ARMv6 or later which permits unaligned addresses with the STR instruction. That is unless some idiot has gone and enabled strict alignment checking again despite this being against all common sense. There was a lengthy argument about this a few years ago, ultimately resulting in the proper settings being put in place.
The trick is that unaligned memory accesses still need some special setup even on ARMv6 or later hardware. And we can't always count on having it working when running early bootloader code.
You can check sections "A3.2.2 Cases where unaligned accesses are UNPREDICTABLE" and "B3.12.4 Alignment faults" in the ARM Architecture Reference Manual.
Basically, the MMU has to be enabled. Also unaligned memory accesses can't be done for the memory pages with device or strongly-ordered attribute.
The MMU should already be enabled in order for the caches to work.
Moreover, not every ARM instruction supports unaligned memory accesses too. For example, LDM/STM instructions don't work with unaligned memory addresses regardless of the SCTLR.A bit. And if the compiler thinks that the pointer is supposed to be properly aligned, then it may use such instructions. A very good testcase is a simple function like this:
int f(int *x) { return x[0] + x[1]; }
The compiler will rightfully generate the following instructions:
00000000 <f>: 0: e8900009 ldm r0, {r0, r3} 4: e0830000 add r0, r3, r0 8: e12fff1e bx lr
If the pointer is misaligned, then it will surely fail.
I'm well aware of this. However, in the case at hand, it was an STR instruction that failed, and this shouldn't happen on a correctly configured ARMv6+.
What hardware did this happen on? If it was on ARMv5, adding the packed attribute is probably the correct fix. If it was ARMv6 or later, something else is broken as well.
It does not matter if this was ARMv6+ hardware or not. The current U-Boot code is wrong and we need to fix it.
The question is how many errors there are. That's why I asked about the hardware.
I've seen it on a number of devices but they were all ARMv7+ (AllWinner, Rockchips etc)

On Thu, Jul 13, 2017 at 11:20:34AM +0100, Peter Robinson wrote:
What hardware did this happen on? If it was on ARMv5, adding the packed attribute is probably the correct fix. If it was ARMv6 or later, something else is broken as well.
It does not matter if this was ARMv6+ hardware or not. The current U-Boot code is wrong and we need to fix it.
The question is how many errors there are. That's why I asked about the hardware.
I've seen it on a number of devices but they were all ARMv7+ (AllWinner, Rockchips etc)
It was on an Allwinner SoCs with a Cortex-A7 CPU, so armv7. However, as far as I know, the unaligned accesses are disable in u-boot.
Maxime

Maxime Ripard maxime.ripard@free-electrons.com writes:
On Thu, Jul 13, 2017 at 11:20:34AM +0100, Peter Robinson wrote:
What hardware did this happen on? If it was on ARMv5, adding the packed attribute is probably the correct fix. If it was ARMv6 or later, something else is broken as well.
It does not matter if this was ARMv6+ hardware or not. The current U-Boot code is wrong and we need to fix it.
The question is how many errors there are. That's why I asked about the hardware.
I've seen it on a number of devices but they were all ARMv7+ (AllWinner, Rockchips etc)
It was on an Allwinner SoCs with a Cortex-A7 CPU, so armv7. However, as far as I know, the unaligned accesses are disable in u-boot.
Yes, so it seems, although I can't fathom why.

Hi,
On Thu, 2017-07-13 at 01:43 +0100, Måns Rullgård wrote:
Maxime Ripard maxime.ripard@free-electrons.com writes:
Hi,
I recently got a gcc 7.1 based toolchain, and it seems like it generates unaligned code, specifically in the net_set_ip_header function in my case.
Whenever some packet is sent, this data abort is triggered:
=> setenv ipaddr 10.42.0.1; ping 10.42.0.254 using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in MAC de:ad:be:ef:00:01 HOST MAC de:ad:be:af:00:00 RNDIS ready musb-hdrc: peripheral reset irq lost! high speed config #2: 2 mA, Ethernet Gadget, using RNDIS USB RNDIS network up! Using usb_ether device data abort pc : [<7ff9db10>] lr : [<7ff9f00c>] reloc pc : [<4a043b10>] lr : [<4a04500c>] sp : 7bf37cc8 ip : 00000000 fp : 7ff6236c r10: 7ffed2b8 r9 : 7bf39ee8 r8 : 7ffed2b8 r7 : 00000001 r6 : 00000000 r5 : 0000002a r4 : 7ffed30e r3 : 14000045 r2 : 01002a0a r1 : fe002a0a r0 : 7ffed30e Flags: nZCv IRQs off FIQs off Mode SVC_32 Resetting CPU ...
I'm running into the same issue using a Buildroot GCC 7 toolchain.
[snip]
What hardware did this happen on? If it was on ARMv5, adding the packed attribute is probably the correct fix. If it was ARMv6 or later, something else is broken as well.
FWIW, the board I'm using is ARMv5 (an i.MX28).
Jörg Krause
participants (9)
-
Dr. Philipp Tomsich
-
Jörg Krause
-
Mark Kettenis
-
Maxime Ripard
-
Måns Rullgård
-
Peter Robinson
-
Siarhei Siamashka
-
Thomas Petazzoni
-
Tom Rini