
On 2021-10-31 12:44, Mark Kettenis wrote:
Date: Sun, 31 Oct 2021 11:43:38 +0100
From: Michael Walle <michael@walle.cc>
Hi,
I sometimes see a corrupted initrd during kernel boot on my board (kontron_sl28). Debugging showed that in this case the spin table for the secondary CPUs overlaps with the lmb initrd allocations (initrd_high is set).
I had a look at how the fdt and initrd are relocated (if enabled). In summary, they use lmb_alloc() on an lmb that is set up by first adding all available memory and then carving out reserved regions. There are calls for arch (arch_lmb_reserve()) and board (board_lmb_reserve()) callbacks. Interestingly, there is also code to carve out any reserved regions which were added to the fdt earlier (boot_fdt_add_mem_rsv_regions()). The problem is that the DT fixups, which might add reserved regions, are only called just before jumping to Linux. Thus both the allocation for the fdt (if fdt_high is set) and the ramdisk (if initrd_high is set) will ignore any reserved memory regions added by those fixups.
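To illustrate the ordering problem, here is a minimal standalone sketch. It is a toy model and not U-Boot's lmb implementation: the toy_lmb_* names, the single RAM bank, the fixed-size reservation array and the top-down search are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RSV 8

struct toy_region {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical stand-in for an lmb: one RAM bank plus a list of reservations. */
struct toy_lmb {
	struct toy_region mem;
	struct toy_region rsv[TOY_MAX_RSV];
	size_t nrsv;
};

static bool toy_overlaps(struct toy_region a, struct toy_region b)
{
	return a.base < b.base + b.size && b.base < a.base + a.size;
}

/* "Add all available memory": start with one bank and no reservations. */
void toy_lmb_init(struct toy_lmb *l, uint64_t base, uint64_t size)
{
	l->mem = (struct toy_region){ base, size };
	l->nrsv = 0;
}

/* "Carve out" a region so that later allocations avoid it. */
bool toy_lmb_reserve(struct toy_lmb *l, uint64_t base, uint64_t size)
{
	if (l->nrsv == TOY_MAX_RSV)
		return false;
	l->rsv[l->nrsv++] = (struct toy_region){ base, size };
	return true;
}

/*
 * Allocate from the top of the bank downwards, skipping reservations.
 * align must be a power of two. Returns 0 on failure.
 */
uint64_t toy_lmb_alloc(struct toy_lmb *l, uint64_t size, uint64_t align)
{
	uint64_t addr;
	size_t i;

	if (size == 0 || size > l->mem.size)
		return 0;

	addr = (l->mem.base + l->mem.size - size) & ~(align - 1);
	while (addr >= l->mem.base) {
		struct toy_region cand = { addr, size };

		for (i = 0; i < l->nrsv; i++)
			if (toy_overlaps(cand, l->rsv[i]))
				break;

		if (i == l->nrsv)	/* no conflict: record it and return */
			return toy_lmb_reserve(l, addr, size) ? addr : 0;

		if (l->rsv[i].base < size)
			return 0;
		/* retry just below the conflicting reservation */
		addr = (l->rsv[i].base - size) & ~(align - 1);
	}
	return 0;
}
```

With a 256 MiB bank at 0x80000000 and a 64 KiB region at 0x8fff0000 standing in for the spin table, allocating a 1 MiB initrd before the reservation is known returns 0x8ff00000, which covers that region. Reserving first returns 0x8fef0000 instead. The late DT fixups correspond to the first case.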
Unfortunately, I don't see any good way to fix this. You'd need to run all the DT fixups before the lmb can be initialized. Also, I don't know if this will affect any other areas; probably I'm the only one who reserves an area which is outside of the U-Boot code and data segments.
A hackish way would be to carve out the spin_table code in board_lmb_reserve(). But meh..
The spin table is embedded in the U-Boot binary itself, isn't it? But the memory occupied by U-Boot should already be reserved...
Yes, it is, as long as it doesn't overlap with the 64k EFI code page. See below.
Unless CONFIG_EFI_LOADER is defined. Then it relocates the spin table to memory allocated using efi_allocate_pages(). But that function only looks at the EFI memory map to figure out what memory is available. So I suspect that it might hand out the same memory as lmb_alloc(). It all looks a bit broken to me...
Yes, that is actually my code ;) The kontron_sl28 is the only board which uses spin tables as far as I know. It doesn't support PSCI; at least if you don't load a BL31 TF-A. Therefore, for SMP it uses spin tables. The relocation code works around a problem with the reserved EFI code, see [1].
And yes, it actually is broken. But so might be any code which uses efi_allocate_pages(), no? The LMB isn't global, but is just initialized at different places, like before a Linux kernel is booted or when you load a file (?). And every time, the whole memory is added and then different regions are carved out (see above).
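A rough sketch of why two independent views of the same RAM can collide. This is a toy, not the EFI or lmb code: the toy_alloc_* names and the simple top-down bump allocation are assumptions made for illustration. One instance stands in for the EFI memory map, the other for a freshly initialized lmb.

```c
#include <stdint.h>

/* Hypothetical top-down bump allocator; each instance only knows its own allocations. */
struct toy_alloc {
	uint64_t base;	/* lowest usable address */
	uint64_t top;	/* everything at or above this address is taken */
};

void toy_alloc_init(struct toy_alloc *a, uint64_t base, uint64_t size)
{
	a->base = base;
	a->top = base + size;
}

/* Take size bytes from the top. Returns 0 if the request does not fit. */
uint64_t toy_alloc_take(struct toy_alloc *a, uint64_t size)
{
	if (size == 0 || size > a->top - a->base)
		return 0;
	a->top -= size;
	return a->top;
}

/* Tell an allocator that everything from addr upwards is already in use. */
void toy_alloc_exclude_above(struct toy_alloc *a, uint64_t addr)
{
	if (addr < a->top)
		a->top = addr < a->base ? a->base : addr;
}
```

If both instances are initialized over the same bank, a 64 KiB allocation from the first and a 1 MiB allocation from the second end up overlapping, because neither knows about the other. Only when the second is told about the first one's allocation (toy_alloc_exclude_above()) does the overlap go away, which is roughly what a global or shared reservation list would provide.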
Does your target end up with CONFIG_EFI_LOADER defined?
Yes ;)
-michael
[1] https://lore.kernel.org/u-boot/20200601195336.3237-1-michael@walle.cc/