
On Sun, Aug 6, 2017 at 10:17 AM, Rob Clark robdclark@gmail.com wrote:
On Sun, Aug 6, 2017 at 9:16 AM, Mark Kettenis mark.kettenis@xs4all.nl wrote:
From: Rob Clark robdclark@gmail.com Date: Sat, 5 Aug 2017 11:36:25 -0400
On Sat, Aug 5, 2017 at 11:24 AM, Rob Clark robdclark@gmail.com wrote:
On Sat, Aug 5, 2017 at 11:10 AM, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 08/05/2017 04:35 PM, Rob Clark wrote:
On Sat, Aug 5, 2017 at 10:28 AM, Mark Kettenis mark.kettenis@xs4all.nl wrote: >> Date: Sat, 5 Aug 2017 16:01:51 +0200 (CEST) >> From: Mark Kettenis mark.kettenis@xs4all.nl >> >> Unfortunately something in this patch series breaks things for me on a >> Banana Pi: > > And according to git bisect: > > 4e3e748a50fc3f43e20c7ff407184596d7c9a589 is the first bad commit > commit 4e3e748a50fc3f43e20c7ff407184596d7c9a589 > Author: Peter Jones pjones@redhat.com > Date: Wed Jun 21 16:39:02 2017 -0400 > > efi: add some more device path structures > > Signed-off-by: Peter Jones pjones@redhat.com > Signed-off-by: Rob Clark robdclark@gmail.com
hmm, odd.. it is only adding some #define's and structs that are not used until a later commit..
although it does also make 'struct efi_device_path_mac_addr' __packed, which it should have been before. Is this an armv7? I wonder if we have some troubles with unaligned accesses on armv7 that we don't have on aarch64 (or maybe compiler isn't turning access to device-path nodes into byte accesses if it can't do unaligned accesses. (The node in the device-path structure are byte-packed.)
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.htm...
<cite>On older processors, such as ARM9 family based processors, an unaligned load had to be synthesised in software. Typically by doing a series of small accesses, and combining the results. ... Unaligned access support must be enabled by setting the SCTLR.A bit in the system control coprocessor</cite>
Generally packing structures is not a good idea performance-wise. The sequence of fields should be carefully chosen to fill up to powers of two (2, 4 , 8).
Yeah, it was clearly a dumb idea for UEFI to not make device-path nodes word aligned. But when implementing a standard, we don't have a choice but to implement it properly, warts and all :-/
btw, just for reference (if anyone is curious), see sect 10.3.1 in UEFI spec v2.7. If you don't want to bother looking it up, here is the exact wording:
A Device Path is a series of generic Device Path nodes. The first Device Path node starts at byte offset zero of the Device Path. The next Device Path node starts at the end of the previous Device Path node. Therefore all nodes are byte-packed data structures that may appear on any byte boundary. All code references to device path notes must assume all fields are unaligned. Since every Device Path node contains a length field in a known place, it is possible to traverse Device Path nodes that are of an unknown type. There is no limit to the number, type, or sequence of nodes in a Device Path.
So clearly what we were doing before was incorrect.. but cheating w/ extra padding bytes on arch's that cannot handle unaligned accesses will avoid having to go fix grub, bootaa64/shim/fallback (and anything else that uses gnu-efi), and apparently openbsd's bootaa64.efi. It is kinda weird to be using efi on these arch's in the first place, so I guess I don't feel as badly about the padding bytes hack on those arch's. (But we should not use the hack on aarch64.)
Looking a bit more closely at the OpenBSD efiboot code, I'm pretty sure we handle the parsing of device path nodes correctly. We use an incarnation of the Intel EFI header files which have:
typedef struct _EFI_DEVICE_PATH { UINT8 Type; UINT8 SubType; UINT8 Length[2]; } EFI_DEVICE_PATH;
#define DevicePathNodeLength(a) ( ((a)->Length[0]) | ((a)->Length[1] << 8) )
so this is all done using byte access.
Hmm, I assume the OpenBSD efiboot code does look at the payload of device-path nodes, like HARDDRIVE_DEVICE_PATH.. which does use u32 and u64 fields (which would be unaligned). Although that might not be the problem here.
Going back to the original crash report:
data abort pc : [<7ef8d878>] lr : [<7ef8d874>] reloc pc : [<4a039878>] lr : [<4a039874>] sp : 7af29660 ip : 00000000 fp : 7af29774 r10: 7efec230 r9 : 7af33ee0 r8 : 00000000 r7 : 00000009 r6 : 7ef9e8b8 r5 : 7af296a0 r4 : 7efa4495 r3 : 7af296a0 r2 : 0000008c r1 : 7af29658 r0 : 00000004 Flags: nzCV IRQs off FIQs off Mode SVC_32
I think it is actually "reloc pc" instead of "pc" that we need to look at:
$ addr2line -e u-boot 0x4a039874 /home/kettenis/tmp/rclark/u-boot/include/efi_loader.h:272
which points at the guidstr() function. That code certainly looks suspicious. It will defenitely trigger alignment faults if the guid isn't 32-bit aligned.
hmm, interesting. At least gnu-efi's EFI_GUID uses the same u32+u16+u16+u8[8] layout. And iirc so did linux kernel and grub, so it seemed like u-boot was the odd one out for using u8[16]. Although maybe we are printing one of our own guid's or openbsd efiboot is also using u8[16].
Either way I've dropped this patch and instead added %pG support to vsprintf, using uuid_bin_to_str() which only does byte accesses.
The latest can be found here:
https://github.com/robclark/u-boot/commits/enough-uefi-for-shim
https://github.com/robclark/u-boot.git enough-uefi-for-shim
The relevant instruction is a 16-bit load:
4a039878: e1d430b4 ldrh r3, [r4, #4]
and with r4 = 7efa4495 that will defenitely trap.
Looking at the defenition efi_guid_t in u-boot:
typedef struct { u8 b[16]; } efi_guid_t;
there is no guarantee that GUIDs are properly aligned, so you'll need to fix the guidstr function introduced in commit b6d913c6101ba891eb2bcb08a4ee9fc8fb57367.
Things are already broken before that commit though, so there is another problem. I'll see if I can figure out what it is...
Thanks
btw, we do have some travis tests that run grub.efi (in qemu) on armv7 and others.. maybe adding OpenBSD's efiboot to the test suit would be a good idea? (And eventually shim+fallback.efi after this patchset is merged..)
BR, -R