
On 8/18/21 7:13 AM, AKASHI Takahiro wrote:
On Tue, Aug 17, 2021 at 09:20:31AM +0200, Michal Simek wrote:
On 8/12/21 11:43 AM, AKASHI Takahiro wrote:
On Fri, Jul 30, 2021 at 08:22:18AM +0200, Michal Simek wrote:
On 7/30/21 7:33 AM, AKASHI Takahiro wrote:
On Fri, Jul 30, 2021 at 06:41:01AM +0200, Michal Simek wrote:
On 7/30/21 4:35 AM, AKASHI Takahiro wrote:
> On Thu, Jul 29, 2021 at 04:09:32PM +0200, Michal Simek wrote:
>> Hi,
>>
>> On 6/10/21 2:59 PM, AKASHI Takahiro wrote:
>>> On Thu, Jun 10, 2021 at 02:31:46PM +0200, Michal Simek wrote:
>>>>
>>>>
>>>> On 6/10/21 12:51 PM, Heinrich Schuchardt wrote:
>>>>> On 6/10/21 12:04 PM, Michal Simek wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 6/10/21 11:47 AM, Heinrich Schuchardt wrote:
>>>>>>> On 6/10/21 10:44 AM, Michal Simek wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am playing with booting from USB via EFI. And I see very weird
>>>>>>>> behavior. I have burnt image with grub to USB flashdisk and I have
>>>>>>>> tested it on 3 zynqmp boards. zcu102, zcu104 and SOM Kria board.
>>>>>>>> On zcu102 grub is going to boot menu and everything is working fine as
>>>>>>>> expected.
>>>>>>>> On zcu104 and SOM Kria I am able to get grub not to menu. When I list
>>>>>>>> partitions in grub I see that only SDs are listed:
>>>>>>>> grub> ls
>>>>>>>> (hd0) (hd0,msdos1) (hd1) (hd1,msdos1)
>>>>>>>
>>>>>>> Hello Michal,
>>>>>>>
>>>>>>> thanks for sharing your observations.
>>>>>>>
>>>>>>> What devices do hd0 and hd1 relate to?
>>>>>>>
>>>>>>>>
>>>>>>>> On zcu102(working board) I also see usb(gpt) partitions and SD.
>>>>>>>> grub> ls
>>>>>>>> (hd0) (hd0,gpt2) (hd0,gpt1) (hd1) (hd1,msdos1)
>>>>>>>>
>>>>>>>
>>>>>>> GPT and MBR partitioning are independent of the device type.
>>>>>>>
>>>>>>>>
>>>>>>>> On zcu104 I see one more error message
>>>>>>>> "PE image measurement failed"
>>>>>>>
>>>>>>> This is related to CONFIG_EFI_TCG2_PROTOCOL=y. Do you have a TPMv2? This
>>>>>>> will not stop disk enumeration.
>>>>>>>
>>>>>>>> But I can't see it on SOM.
>>>>>>>>
>>>>>>>> U-Boot image is just the same for all boards. I am using generic
>>>>>>>> xilinx_zynqmp_virt_defconfig.
>>>>>>>>
>>>>>>>> When I compare DT description for USB between zcu102 and zcu104 they
>>>>>>>> are the same. SOM doesn't have usb enabled by default (but I enabled
>>>>>>>> it) but grub starts which means that communication with USB is fine.
>>>>>>>>
>>>>>>>> It is based on my latest patches available here.
>>>>>>>> u-boot/custodians/u-boot-microblaze.git (usb-efi-issue branch)
>>>>>>>>
>>>>>>>> Also when I list usb I see all partitions just fine.
>>>>>>>> ZynqMP> part list usb 0
>>>>>>>>
>>>>>>>> Partition Map for USB device 0 -- Partition Type: EFI
>>>>>>>>
>>>>>>>> Part    Start LBA       End LBA         Name
>>>>>>>>         Attributes
>>>>>>>>         Type GUID
>>>>>>>>         Partition GUID
>>>>>>>>   1     0x00000800      0x001007fe      "Microsoft basic data"
>>>>>>>>         attrs:  0x0000000000000000
>>>>>>>>         type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
>>>>>>>>         type:   data
>>>>>>>>         guid:   0e7f8b3d-296b-4720-be9d-c4687d3c4a77
>>>>>>>>   2     0x00100800      0x001197fe      "Microsoft basic data"
>>>>>>>>         attrs:  0x0000000000000000
>>>>>>>>         type:   ebd0a0a2-b9e5-4433-87c0-68b6b72699c7
>>>>>>>>         type:   data
>>>>>>>>         guid:   8892eddc-231a-4e6e-a5e1-c310f4482fb7
>>>>>>>>
>>>>>>>>
>>>>>>>> Do you have any idea why on one system is working fine to get to menu
>>>>>>>> and on others there is an issue to get all partitions even u-boot is
>>>>>>>> able to see them and can work with them.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Michal
>>>>>>>>
>>>>>>>
>>>>>>> Where is the GRUB binary? - If it is in EFI/boot/bootaa64.efi, it could
>>>>>>> be that the USB sub-system is simply not initialized yet when the boot
>>>>>>> manager is called by distroboot.
>>>>>>>
>>>>>>> For testing partition detection in the UEFI sub-system it is enough
>>>>>>> to run
>>>>>>>
>>>>>>> efidebug devices
>>>>>>>
>>>>>>> Until yesterday we had a problem with partition numbers >= 10, cf.
>>>>>>>
>>>>>>> efi_loader: partition numbers are hexadecimal
>>>>>>> https://source.denx.de/u-boot/u-boot/-/commit/3dca77b1dc1b6dbf9c8b51572fe4b0...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Block devices are enumerated in efi_disk_register(). Please, try to add
>>>>>>> debug output there to elucidate the problem.
>>>>>>
>>>>>> I found where the problem is. First of all zcu102 didn't use the same
>>>>>> image as others (it wasn't updated properly).
>>>>>> When you have CONFIG_EFI_CAPSULE_ON_DISK_EARLY that efi_disk_register()
>>>>>> is called before usb block devices are detected and registered that's
>>>>>> why grub doesn't see them.
>>>>>
>>>>> The problem is CONFIG_EFI_SETUP_EARLY=y required by
>>>>> CONFIG_EFI_CAPSULE_ON_DISK_EARLY.
>>>>>
>>>>> Why is USB initialized later then MMC?
>>>>
>>>> It is not just usb. SCSI/sata are behaving in the same way too.
>>>>
>>>>>
>>>>> Overall we have a deficiency in the UEFI implementation in that we
>>>>> cannot deal with block devices added or removed after initialization.
>>>>>
>>>>> Here integration with the driver model is missing.
>>>>
>>>> Right. And also there are commands which can create MBR partitions and I
>>>> expect when you write image to SD and then run rescan or so you could
>>>> get other partitions too.
>>>> Maybe hook via part_init()? with removing efi_disk_register.
>>>
>>> For the record, I have proposed my ideas several times[1], [2].
>>> I'm, however, no longer working on this issue as I have shifted
>>> my focus to UEFI secure boot and capsule update.
>>>
>>> -Takahiro Akashi
>>>
>>> [1] https://lists.denx.de/pipermail/u-boot/2018-November/347491.html
>>> [2] https://lists.denx.de/pipermail/u-boot/2019-February/357923.html
>>
>> I want to continue on this thread. I have disabled
>> EFI_CAPSULE_ON_DISK_EARLY some time ago and trying to workaround that
>> usb/scsi detection by simply calling usb reset and scsi reset as the
>> part of PREBOOT. Then all disks are recorded and visible by grub.
>>
>> But I found another issue which is kind of weird. We are using
>> distroboot with soft of fixed sequence. Important part of sequence is
>> sd, usb, scsi.
>>
>> I have added grub on scsi and when I boot directly via run bootcmd_scsi0
>> everything is working fine. When I let distroboot to do the job it or
>> run printenv -e before bootcmd_scsi0 I am getting exception.
>> From debug it is visible that it is exception called from
>> efi_disk_read_blocks.
>>
>> 0 0x7ff5d188 hang()+20: include/bootstage.h, line 389
>> 1 0x7ff5f908 __assert_fail(): lib/panic.c, line 25
>> 2 0x7fe976a8 do_irq(): arch/arm/lib/interrupts_64.c, line 123
>> 3 0x7fe96a0c _restore_regs()+124: arch/arm/cpu/armv8/exceptions.S, line 141
>> 4 0x7ff43740 efi_disk_read_blocks()+160: lib/efi_loader/efi_disk.c, line 102
>
> How and when did you get this stack trace?
When the abort happened, I connected the Xilinx debugger via JTAG and looked at the CPU backtrace.
OK, but we are already in grub here and such a trace (in U-Boot) doesn't make sense. Right?
Correct, grub had already started. But I expect it is still using U-Boot drivers, and all exception handlers from U-Boot are still in place.
Yeah, but what I didn't understand was:
!"Synchronous Abort" handler, esr 0x02000000
!elr: ffffffffa816c5b0 lr : 000000000805e218 (reloc)
!elr: 00000000200005b0 lr : 000000007fef2218
(snip)
!Code: 000165fa 0b2d05de 0000ffff 00000000 (20000590)
!UEFI image [0x0000000077d48000:0x0000000077de5fff] '/efi\boot\bootaa64.efi'

"Code:" at the exception doesn't seem to be sane assembler, and "elr" is within the code of neither U-Boot nor shim/grub (bootaa64.efi). ("esr" doesn't tell us anything.) So I wondered where the backtrace came from.
BTW, can you please confirm which function sits at the address of "lr" (=0x7fef2218)?
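
The range check behind the "elr is not within grub" remark can be sketched as follows. This is only an illustration: the helper and variable names are made up for this note, the image bounds are copied from the "UEFI image [...]" line of the dump, and it assumes the second elr/lr pair (the one without "(reloc)") holds the absolute addresses. U-Boot's own load range is not given in the thread, so only the bootaa64.efi range is checked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, as printed in the "UEFI image [start:end]" line. */
struct addr_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical helper: true if addr lies inside range (bounds inclusive). */
static bool addr_in_range(uint64_t addr, struct addr_range range)
{
	return addr >= range.start && addr <= range.end;
}

/* Values copied from the abort dump quoted in this thread. */
static const struct addr_range bootaa64_image = {
	UINT64_C(0x0000000077d48000), UINT64_C(0x0000000077de5fff)
};

/* Assumption: the non-"(reloc)" line carries the absolute addresses. */
static const uint64_t elr_value = UINT64_C(0x00000000200005b0);
static const uint64_t lr_value = UINT64_C(0x000000007fef2218);
```

With these values, addr_in_range() is false for both elr_value and lr_value, so neither address falls inside the loaded bootaa64.efi image.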
I don't have those images anymore.
Maybe it is just a sata/scsi related issue in EFI, but the weird thing is that when the disks are scanned just before the command, everything works fine.
What do you mean by "when disks are scanned just before the command"? The case when you ran "run bootcmd_scsi" without "printenv -e"?
Do you reproduce the problem even if you revert the patch, "xilinx: zynqmp: Initialize usb and scsi via preboot", and run the commands, "run scsi_init; [printenv -e;] run bootcmd_scsi"?
Can you also try other EFI commands, like "efidebug devices"?
I found that there is a difference if you run scsi reset or run scsi_init. When scsi_init is used I can't see any issue.
Here you have tried three cases:
(1) scsi reset; efidebug devices; boot (hence distro_bootcmd)
(2) run scsi_init; efidebug devices; boot
(3) scsi rescan; efidebug devices; boot

Only case (2) succeeded in booting the system. Right?

Please double-check that you don't see this problem in all those cases if you don't execute "efidebug devices" (or "printenv -e").
# make sure that no efi command will be executed before
# booting from scsi.
I tested these 3 cases and all of them work fine.
scsi reset
devtype=scsi
run scan_dev_for_boot_part

run scsi_init
devtype=scsi
run scan_dev_for_boot_part

scsi rescan
devtype=scsi
run scan_dev_for_boot_part
The variable looks like this:
scsi_init=if ${scsi_need_init}; then scsi_need_init=false; scsi scan; fi

And when you run scsi scan (last log), you see the problem again. It means that the issue is observed when scsi reset/scan is called twice. In all
If this is true, my guess is:
In the scenarios above, all the block devices are enumerated by scsi_scan() in the first "scsi reset" or "scsi rescan" and new blk_desc's are created.
efidebug is expected to execute efi_init_obj_list(). Please note: EFI subsystem uses U-Boot's blk_desc internally to access block devices. Mapping between U-Boot's blk_desc and UEFI's efi_disk_obj (aka handle) is created only once and statically at the initialization in efi_init_obj_list().
Now that scsi_scan() is executed again in the second scsi command, all the block devices, hence blk_desc structures, will be freed by blk_unbind_all() and blk_desc's will be *re-created* by scsi probing.
Nevertheless, the binding between blk_desc and efi_disk_obj is maintained even at this point, so any succeeding r/w operations via UEFI interfaces can point to bogus data of old blk_desc and therefore block accesses will get corrupted.
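
The stale binding described above can be modelled in a few lines of C. This is a minimal sketch under stated assumptions, not U-Boot code: every `sketch_*` name is hypothetical, the structures are simplified stand-ins for blk_desc and efi_disk_obj, and "freeing" is modelled by clearing an `alive` flag so the example stays well-defined. The `notify_efi` path shows one possible direction (dropping the EFI-side binding when the block device is unbound); it is not a fix proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DESCS 4
#define MAX_EFI_DISKS 4

/* Hypothetical stand-in for U-Boot's blk_desc; not the real structure. */
struct sketch_blk_desc {
	int alive;	/* 0 once the descriptor has been "freed" */
};

/* Hypothetical stand-in for efi_disk_obj: it caches a descriptor pointer. */
struct sketch_efi_disk {
	struct sketch_blk_desc *desc;	/* NULL when the slot is unused */
};

static struct sketch_blk_desc descs[MAX_DESCS];
static struct sketch_efi_disk efi_disks[MAX_EFI_DISKS];
static size_t next_desc;	/* slots are never reused, so addresses differ */

/* Model of creating a block device: hand out a fresh descriptor. */
static struct sketch_blk_desc *sketch_blk_create(void)
{
	assert(next_desc < MAX_DESCS);
	descs[next_desc].alive = 1;
	return &descs[next_desc++];
}

/* Model of the one-time EFI registration: remember the descriptor pointer. */
static struct sketch_efi_disk *sketch_efi_disk_add(struct sketch_blk_desc *desc)
{
	for (size_t i = 0; i < MAX_EFI_DISKS; i++) {
		if (!efi_disks[i].desc) {
			efi_disks[i].desc = desc;
			return &efi_disks[i];
		}
	}
	return NULL;
}

/* Hypothetical hook: forget every EFI disk bound to desc; returns the count. */
static int sketch_efi_disk_remove(const struct sketch_blk_desc *desc)
{
	int removed = 0;

	if (!desc)
		return 0;
	for (size_t i = 0; i < MAX_EFI_DISKS; i++) {
		if (efi_disks[i].desc == desc) {
			efi_disks[i].desc = NULL;
			removed++;
		}
	}
	return removed;
}

/* Model of unbinding a block device, optionally notifying the EFI layer. */
static void sketch_blk_unbind(struct sketch_blk_desc *desc, int notify_efi)
{
	if (notify_efi)
		sketch_efi_disk_remove(desc);
	desc->alive = 0;	/* stands in for freeing the descriptor */
}

/* An EFI access is only safe if the cached descriptor is still alive. */
static int sketch_efi_disk_is_stale(const struct sketch_efi_disk *disk)
{
	return disk->desc && !disk->desc->alive;
}
```

In this model, unbinding without the notification leaves the EFI object pointing at a dead descriptor while the re-created descriptor sits at a different address; with the notification, the EFI object no longer refers to any descriptor.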
My guess above seems likely, but it doesn't explain well why loading/starting the "grub" binary succeeds anyway.
What you described makes sense. I printed desc, and after a reset a new desc is created at a different address, while the original one is freed in device_unbind. The log is below. The question is how to fix this behavior.
Thanks,
Michal
ZynqMP> scsi reset
Reset SCSI
scanning bus for devices...
blk_unbind_all: if_type 2
SATA link 0 timeout.
Target spinup took 0 ms.
AHCI 0001.0301 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
flags: 64bit ncq pm clo only pmp fbss pio slum part ccc apst
blk_create_device: devnum -1
blk_create_device: name ahci_scsi.id1lun0, desc 000000007be21340
  Device 0: (1:0) Vendor: ATA Prod.: Maxtor 7V300F0 Rev: VA11
            Type: Hard Disk
            Capacity: 286188.8 MB = 279.4 GB (586114704 x 512)
ZynqMP> efidebug devices
Scanning disk mmc@ff170000.blk...
efi_disk_add_dev: desc 000000007be15b30
efi_disk_add_dev: desc 000000007be15b30
Scanning disk ahci_scsi.id1lun0...
efi_disk_add_dev: desc 000000007be21340
efi_disk_add_dev: desc 000000007be21340
efi_disk_add_dev: desc 000000007be21340
Found 5 disks
** Unable to read file ubootefi.var **
Failed to load EFI variables
Unable to find TPMv2 device
DFU alt info setting: done
Device           Device Path
================ ====================
000000007be21590 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)
000000007be218a0 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/SD(0)/SD(0)
000000007be21a70 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/SD(0)/SD(0)/HD(1,0x01,0,0x2000,0x1cd2000)
000000007be21f00 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/Scsi(1,0)
000000007be222e0 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/Scsi(1,0)/HD(1,GPT,85b731b6-a4b2-47f4-b1c6-aef6e0f2ce81,0x800,0xfffff)
000000007be22730 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/Scsi(1,0)/HD(2,GPT,ac600dc7-3160-4f3c-a824-496d00e3d007,0x100800,0x18fff)
000000007be22c80 /VenHw(e61d73b9-a384-4acc-aeab-82e828f3628b)/MAC(000a350370f6,1)
ZynqMP> scsi reset

Reset SCSI
scanning bus for devices...
blk_unbind_all: if_type 2
Removing/unbinding device ahci_scsi.id1lun0
device_unbind: free desc 000000007be21340
blk_create_device: devnum -1
blk_create_device: name ahci_scsi.id1lun0, desc 000000007be3e070
  Device 0: (1:0) Vendor: ATA Prod.: Maxtor 7V300F0 Rev: VA11
            Type: Hard Disk
            Capacity: 286188.8 MB = 279.4 GB (586114704 x 512)
ZynqMP>