
Date: Sun, 31 Oct 2021 13:04:40 +0100 From: Michael Walle michael@walle.cc
Am 2021-10-31 12:44, schrieb Mark Kettenis:
Date: Sun, 31 Oct 2021 11:43:38 +0100 From: Michael Walle michael@walle.cc
Hi,
I sometimes see a corrupted initrd during kernel boot on my board (kontron_sl28). Debugging showed that in this case the spin table for the secondary CPUs overlaps with the lmb initrd allocations (initrd_high is set).
I had a look at how the fdt and initrd are relocated (if enabled). In summary, they use lmb_alloc() which will first add all available memory and then carve out reserved regions. There are calls for arch (arch_lmb_reserve()) and board (board_lmb_reserve()) callbacks. The interesting thing is that, there is also code to carve out any reserved regions which were added to the fdt earlier (boot_fdt_add_mem_rsv_regions()). The problem here is, that the DT fixups, which might add the reserved regions are called just before jumping to linux. Thus both the allocation for the fdt (if fdt_high is set) and the ramdisk (if initrd_high is set) will ignore any reserved memory regions.
Unfortunately, I don't see any good way to fix this. You'd need all the DT fixups before we can initialize the lmb. Also, I don't know if this will affect any other areas; probably I'm the only one, who reserves an area which is outside of the u-boot code and data segment.
A hackish way would be to carve out the spin_table code in board_lmb_reserve(). But meh..
The spin table is embedded in the u-boot binary itself isn't it? But the memory occupied by the u-boot should already be reserved...
Yes it is. As long as it doen't overlap with the 64k EFI code page. See below.
Yes, I saw the comment. Which confuses me. The overlap itself shouldn't be a problem since both the spin table and the EFI runtime code page should be preserved. But I guess that the overlap causes problems later on because attempting to reserve overlapping areas fails?
Unless CONFIG_EFI_LOADER is defined. Then it relocates the spin table to memory allocated using efi_allocate_pages(). But that function only looks at the EFI memory map to figure out what memory is available. So I suspect that it might hand out the same memory as lmb_alloc(). It all looks a bit broken to me...
Yes, that is actually my code ;) The kontron_sl28 is the only board which uses spin tables as far as I know. It doesn't support PSCI; at least if you don't load a bl31 TF-A. Therefore, for SMP it uses spin tables. The relocation code work arounds a problem with the reserved EFI code, see [1].
And yes, it actually is broken. But so might be every code which is using the efi_allocate_pages(), no? LMB isn't global, but is just initialized at different places. Like before a linux kernel is booted or when you load a file (?). And everytime the whole memory is added, and then different regions are carved out (see above).
Right. So it mostly works because U-Boot either ends up using a code path where it uses LMB or a path where it ends up using the EFI memory map, but never both. Except if it hits your layerscape code.
I think the right way to solve this is to allocate the memory for the spin table through some other means and use efi_add_memory_map() to reserve that memory like what is done in board/raspberrypi/rpi/rpi.c.
Or don't relocate the spin table and fix the issue with overlapping reserves.
Cheers,
Mark