
Hi Takahiro,
On Wed, 16 Feb 2022 at 01:31, AKASHI Takahiro takahiro.akashi@linaro.org wrote:
Hi Simon,
On Mon, Feb 14, 2022 at 11:35:06AM +0900, AKASHI Takahiro wrote:
Heinrich,
On Thu, Feb 10, 2022 at 04:20:11PM +0100, Heinrich Schuchardt wrote:
On 2/10/22 09:11, AKASHI Takahiro wrote:
Background:
The purpose of this patch is to reignite the discussion about how UEFI subystem would best be integrated into U-Boot driver model. In the past, I proposed a couple of patch series, the latest one[1], while Heinrich revealed his idea[2], and the approach taken here is something between them, with a focus on block device handlings.
Disks in UEFI world:
In general in UEFI world, accessing to any device is performed through a 'protocol' interface which are installed to (or associated with) the device's UEFI handle (or an opaque pointer to UEFI object data). Protocols are implemented by either the UEFI system itself or UEFI drivers.
For block IO's, it is a device which has EFI_BLOCK_IO_PROTOCOL (efi_disk hereafter). Currently, every efi_disk may have one of two origins:
a.U-Boot's block devices or related partitions (lib/efi_loader/efi_disk.c) b.UEFI objects which are implemented as a block device by UEFI drivers. (lib/efi_driver/efi_block_device.c)
All the efi_diskss as (a) will be enumerated and created only once at UEFI subsystem initialization (efi_disk_register()), which is triggered by first executing one of UEFI-related U-Boot commands, like "bootefi", "setenv -e" or "efidebug". EFI_BLOCK_IO_PROTOCOL is implemented by UEFI system using blk_desc(->ops) in the corresponding udevice(UCLASS_BLK).
On the other hand, efi_disk as (b) will be created each time UEFI boot services' connect_controller() is executed in UEFI app which, as a (device) controller, gives the method to access the device's data, ie. EFI_BLOCK_IO_PROTOCOL.
> more details >>>
Internally, connect_controller() search for UEFI driver that can support this controller/protocol, 'efi_block' driver(UCLASS_EFI) in this case, then calls the driver's 'bind' interface, which eventually installs the controller's EFI_BLOCK_IO_PROTOCOL to efi_disk object. 'efi_block' driver also create a corresponding udevice(UCLASS_BLK) for
- creating additional partitions efi_disk's, and
- supporting a file system (EFI_SIMPLE_FILE_SYSTEM_PROTOCOL) on it.
<<< <<<
Issues:
While an efi_disk represents a device equally for either a whole disk or a partition in UEFI world, the driver model treats only a whole disk as a real block device or udevice(UCLASS_BLK).
efi_disk holds and makes use of "blk_desc" data even though blk_desc in plat_data is supposed to be private and not to be accessed outside the driver model. # This issue, though, exists for all the implementation of U-Boot # file systems as well.
For efi_disk(a), 3. A block device can be enumerated dynamically by 'scanning' a device bus in U-Boot, but UEFI subsystem is not able to update efi_disks accordingly. For examples, => scsi rescan; efidebug devices => usb start; efidebug devices ... (A) (A) doesn't show any usb devices detected.
=> scsi rescan; efidebug boot add -b 0 TEST scsi 0:1 ... => scsi rescan ... (B) => bootefi bootmgr ... (C) (C) may de-reference a bogus blk_desc pointer which has been freed by (B). (Please note that "scsi rescan" removes all udevices/blk_desc and then re-create them even if nothing is changed on a bus.)
For efi_disk(b), 4. A "controller (handle)", combined with efi_block driver, has no corresponding udevice as a parent of efi_disks in DM tree, unlike, say, a scsi controller, even though it provides methods for block io operations. 5. There is no way supported to remove efi_disk's even after disconnect_controller() is called.
My approach:
Due to functional differences in semantics, it would be difficult to identify "udevice" structure as a handle in UEFI world. Instead, we will have to somehow maintain a relationship between a udevice and a handle.
1-1. add a dedicated uclass, UCLASS_PARTITION, for partitions Currently, the uclass for partitions is not a UCLASS_BLK. It can be possible to define partitions as UCLASS_BLK (with IF_TYPE_PARTION?), but I'm afraid that it may introduce some chaos since udevice(UCLASS_BLK) is tightly coupled with 'struct blk_desc' data which is still used as a "structure to a whole disk" in a lot of interfaces. (I hope that you understand what it means.)
In DM tree, a UCLASS_PARTITON instance has a UCLASS_BLK parent: For instance, UCLASS_SCSI --- UCLASS_BLK --- UCLASS_PARTITION (IF_TYPE_SCSI) | +- struct blk_desc +- struct disk_part +- scsi_blk_ops +- blk_part_ops
1-2. create partition udevices in the context of device_probe() part_init() is already called in blk_post_probe(). See the commit d0851c893706 ("blk: Call part_init() in the post_probe() method"). Why not enumerate partitions as well in there.
- add new block access interfaces, which takes a *udevice* as a target device, in U-Boot and use those functions to implement efi_disk operations (i.e. EFI_BLOCK_IO_PROTOCOL).
3-1. maintain a bi-directional link between a udevice and an efi_disk by adding - a UEFI handle pointer as a tag for a udevice - a udevice pointer in UEFI handle (in fact, in "struct efi_disk_obj")
3-2. synchronize the lifetime of efi_disk objects in UEFI world with the driver model using - event notification associated with device's probe/remove.
- I have no solution to issue(4) and (5) yet.
<<<Example DM tree on qemu-arm64>>> => dm tree Class Driver Name
root root_driver root_driver ... pci pci_generic_ecam |-- pcie@10000000 pci_generi pci_generic_drv | |-- pci_0:0.0 virtio virtio-pci.l | |-- virtio-pci.l#0 ethernet virtio-net | | `-- virtio-net#32 ahci ahci_pci | |-- ahci_pci scsi ahci_scsi | | `-- ahci_scsi blk scsi_blk | | |-- ahci_scsi.id0lun0 partition blk_partition | | | |-- ahci_scsi.id0lun0:1 partition blk_partition | | | `-- ahci_scsi.id0lun0:2 blk scsi_blk | | `-- ahci_scsi.id1lun0 partition blk_partition | | |-- ahci_scsi.id1lun0:1 partition blk_partition | | `-- ahci_scsi.id1lun0:2 usb xhci_pci | `-- xhci_pci usb_hub usb_hub | `-- usb_hub usb_dev_ge usb_dev_generic_drv | |-- generic_bus_0_dev_2 usb_mass_s usb_mass_storage | `-- usb_mass_storage blk usb_storage_blk | `-- usb_mass_storage.lun0 partition blk_partition | |-- usb_mass_storage.lun0:1 partition blk_partition | `-- usb_mass_storage.lun0:2 ... => efi devices Device Device Path ================ ==================== 000000013eeea8d0 /VenHw() 000000013eeed810 /VenHw()/MAC(525252525252,1) 000000013eefc460 /VenHw()/Scsi(0,0) 000000013eefc5a0 /VenHw()/Scsi(0,0)/HD(1,GPT,ce86c5a7-b32a-488f-a346-88fe698e0edc,0x22,0x4c2a) 000000013eefe320 /VenHw()/Scsi(0,0)/HD(2,GPT,aa80aab9-33e6-42b6-b5db-def2cb8d7844,0x5000,0x1a800) 000000013eeff210 /VenHw()/Scsi(1,0) 000000013eeff390 /VenHw()/Scsi(1,0)/HD(1,GPT,ce86c5a7-b32a-488f-a346-88fe698e0edc,0x22,0x4c2a) 000000013eeff7d0 /VenHw()/Scsi(1,0)/HD(2,GPT,aa80aab9-33e6-42b6-b5db-def2cb8d7844,0x5000,0x1a800) 000000013ef04c20 /VenHw()/UsbClass(0x0,0x0,0x9,0x0,0x3)/UsbClass(0x46f4,0x1,0x0,0x0,0x0) 000000013ef04da0 /VenHw()/UsbClass(0x0,0x0,0x9,0x0,0x3)/UsbClass(0x46f4,0x1,0x0,0x0,0x0)/HD(1,0x01,0,0x0,0x99800) 000000013ef04f70 /VenHw()/UsbClass(0x0,0x0,0x9,0x0,0x3)/UsbClass(0x46f4,0x1,0x0,0x0,0x0)/HD(2,0x01,0,0x99800,0x1800)
Patchs:
For easy understandings, patches may be categorized into separate groups of changes.
Patch#1-#7: DM: add device_probe() for later use of events Patch#8-#11: DM: add new features (tag and event notification) Patch#12-#16: UEFI: dynamically create/remove efi_disk's for a raw disk and its partitions For removal case, we may need more consideration since removing handles unconditionally may end up breaking integrity of handles (as some may still be held and referenced to by a UEFI app). Patch#17-#18: UEFI: use udevice read/write interfaces Patch#19-#20: UEFI: fix-up efi_driver, aligning with changes in DM integration
[1] https://lists.denx.de/pipermail/u-boot/2019-February/357923.html [2] https://lists.denx.de/pipermail/u-boot/2021-June/452297.html
This series does not pass Gitlab CI:
See https://source.denx.de/u-boot/custodians/u-boot-efi/-/jobs/391030 https://source.denx.de/u-boot/custodians/u-boot-efi/-/jobs/391031
I have noticed those errors but I didn't think that they were related to my patch set initially as I didn't touch any code in gpt driver, android/avb nor video driver.
I will set the whole series to "changes requested"
Please, run 'make tests' before resubmitting.
Best regards
Heinrich
=================================== FAILURES
________________________________ test_gpt_write ________________________________ test/py/tests/test_gpt.py:169: in test_gpt_write assert 'Writing GPT: success!' in output E AssertionError: assert 'Writing GPT: success!' in 'Writing GPT: Not a block device: rng\r\r\nsuccess!'
The reason of assertion failure here is that some log message was inserted in a output message although the test itself was finished successfully: "Writing GPT: success!" <== a correct output message ^ "Not a block device: rng"
Adding efi_disk_probe() as a callback to EVT_DM_POST_PROBE created this *log_info* message in dm_rng_read() <- get_rand_uuid() <- gen_rand_uuid_str() in "gpt write" command.
We can fix this type of failure by the hack: ===8<=== --- a/lib/efi_loader/efi_disk.c +++ b/lib/efi_loader/efi_disk.c @@ -612,8 +612,6 @@ static int efi_disk_probe(void *ctx, struct event *event)
/* TODO: We won't support partitions in a partition */ if (id != UCLASS_BLK) {
if (id != UCLASS_PARTITION)
log_info("Not a block device: %s\n", dev->name); return 0; }
===>8===
I don't think, however, that it is a good thing that test results depend on console outputs, especially *log* messages.
Furthermore, I don't know why we see *info*-level messages even under CONFIG_LOGLEVEL=4 (warning).
----------------------------- Captured stdout call
=> host bind 0 /tmp/sandbox/test_gpt_disk_image.bin
=> => gpt write host 0 "name=all,size=0"
Writing GPT: Not a block device: rng
success!
=> ___________________ test_ut[ut_dm_dm_test_video_comp_bmp32] ____________________ test/py/tests/test_ut.py:43: in test_ut assert output.endswith('Failures: 0') E AssertionError: assert False E + where False = <built-in method endswith of str object at 0x7fd72d2fc800>('Failures: 0') E + where <built-in method endswith of str object at 0x7fd72d2fc800> = 'Test: dm_test_video_comp_bmp32: video.c\r\r\nSDL renderer does not exist\r\r\ntest/dm/video.c:88, compress_frame_buff..._test_video_comp_bmp32(): 2024 == compress_frame_buffer(uts, dev): Expected 0x7e8 (2024), got 0x1 (1)\r\r\nFailures: 2'.endswith
----------------------------- Captured stdout call
=> ut dm dm_test_video_comp_bmp32
Test: dm_test_video_comp_bmp32: video.c
SDL renderer does not exist
test/dm/video.c:88, compress_frame_buffer(): !memcmp(uc_priv->fb, uc_priv->copy_fb, uc_priv->fb_size): Copy framebuffer does not match fb
test/dm/video.c:484, dm_test_video_comp_bmp32(): 2024 == compress_frame_buffer(uts, dev): Expected 0x7e8 (2024), got 0x1 (1)
Failures: 2
I don't know yet why this happened.
It seems that this error happened simply because more ut DM tests were added. Added here are DM tag tests (in my patch#14 of 20).
But what type of test is added doesn't matter. When a total number of ut DM tests is increased (and exceeds some limit?), one of tests (either video or another) may unexpectedly fail. For instance, I randomly picked up one test from test/dm/gpio.c and commented it out, and then I didn't see any error in test_ut.py.
So I suspect there may be some problem with pytest framework.
Do you have any clue, Simon?
Yes I believe it is a problem with memory allocation. Perhaps we run out of memory, or something else goes wrong. The value:
#define top (av_[2])
seems to get corrupted. I did spent some time trying to figure out what it was but have not found it yet.
Regards, Simon