Fwd: Re: [PATCH] efi_loader: Omit memory with "no-map" when returning memory map.

-------- Ursprüngliche Nachricht -------- Von: Ard Biesheuvel ardb@kernel.org Gesendet: 31. August 2021 12:33:56 MESZ An: Heinrich Schuchardt xypron.glpk@gmx.de CC: Kristian Amlie kristian.amlie@northern.tech Betreff: Re: [PATCH] efi_loader: Omit memory with "no-map" when returning memory map.
On Tue, 31 Aug 2021 at 08:41, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 8/27/21 9:55 AM, Kristian Amlie wrote:
You can use scripts/get_maintainer.pl to find the right addressees for your patches.
efi_reserve_memory() states that memory marked with "no-map" shall not be accessed by the UEFI payload. Make sure efi_get_memory_map() honors this.
Accessing memory and describing memory are two different things. Describing reserved memory in the memory map is important, because it helps us distinguish it from MMIO regions.
This helps the case when booting vexpress_ca9x4 under QEMU. Because the kernel wants to use an address in the lowest 128MiB of the range, but this range is reserved with "no-map", the kernel complains that it can not allocate the low memory it needs. In reality the actual usable memory starts much higher, which is reflected correctly in the memory map after this fix.
This is a u-boot patch right? (I cannot tell from the context, as there are no mailing lists on cc)
It is u-boot's job to describe all the memory, no matter how it is used. Even if the kernel itself may not use it as system memory, there are cases where kernel infrastructure is used to map these regions: for instance, the ACPI core may need to map a SystemMemory OpRegion, and we need the EFI memory map to tell us whether to use memory or I/O semantics.
As for the 128 MB issue: can you reproduce this with a recent kernel? We made some changes recently to the EFI stub as well as the decompressor code to prevent the decompressor code from relocating itself to the base of DRAM, and to allow the decompressed kernel to reside at any 2 MB aligned offset in memory.
The 'no-map' requirement needs to be fulfilled by the kernel.
The GetMemoryMap() boot time service must comply to the UEFI specification.
The Devicetree Specification has this clarification:
"Reserved regions with the no-map property must be listed in the memory map with type EfiReservedMemoryType. All other reserved regions must be listed with type EfiBootServicesData." https://devicetree-specification.readthedocs.io/en/latest/chapter3-devicenod...
Should the kernel calculate its internal 128 MiB requirement incorrectly this needs be fixed in the kernel EFI stub. Does the problem exist with kernel 5.14?
I wonder if the 128 MiB requirement of the kernel can be lifted for 32bit ARM as we already did for RISC-V. Cf.
As mentioned above, this should already be fixed, in v5.11 or later
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h...
Cc Ard, maintainer of the kernel EFI stub.
Best regards
Heinrich
Signed-off-by: Kristian Amlie kristian.amlie@northern.tech
lib/efi_loader/efi_memory.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/lib/efi_loader/efi_memory.c b/lib/efi_loader/efi_memory.c index f4acbee4f9..7f8543143a 100644 --- a/lib/efi_loader/efi_memory.c +++ b/lib/efi_loader/efi_memory.c @@ -646,8 +646,16 @@ efi_status_t efi_get_memory_map(efi_uintn_t *memory_map_size,
provided_map_size = *memory_map_size;
list_for_each(lhandle, &efi_mem)
list_for_each(lhandle, &efi_mem) {
struct efi_mem_list *lmem;
lmem = list_entry(lhandle, struct efi_mem_list, link);
if (lmem->desc.type == EFI_RESERVED_MEMORY_TYPE)
continue;
map_entries++;
} map_size = map_entries * sizeof(struct efi_mem_desc);
@@ -672,6 +680,10 @@ efi_status_t efi_get_memory_map(efi_uintn_t *memory_map_size, struct efi_mem_list *lmem;
lmem = list_entry(lhandle, struct efi_mem_list, link);
if (lmem->desc.type == EFI_RESERVED_MEMORY_TYPE)
continue;
*memory_map = lmem->desc; memory_map--; }

On 31/08/2021 12:46, Heinrich Schuchardt wrote:
*Von:* Ard Biesheuvel ardb@kernel.org *Gesendet:* 31. August 2021 12:33:56 MESZ *An:* Heinrich Schuchardt xypron.glpk@gmx.de *CC:* Kristian Amlie kristian.amlie@northern.tech *Betreff:* Re: [PATCH] efi_loader: Omit memory with "no-map" when returning memory map.
On Tue, 31 Aug 2021 at 08:41, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 8/27/21 9:55 AM, Kristian Amlie wrote: You can use scripts/get_maintainer.pl to find the right addressees for your patches. efi_reserve_memory() states that memory marked with "no-map" shall not be accessed by the UEFI payload. Make sure efi_get_memory_map() honors this.
Accessing memory and describing memory are two different things. Describing reserved memory in the memory map is important, because it helps us distinguish it from MMIO regions.
Ok, my mistake, I thought the kernel would deduce this separately through the DTB.
This helps the case when booting vexpress_ca9x4 under QEMU. Because the kernel wants to use an address in the lowest 128MiB of the range, but this range is reserved with "no-map", the kernel complains that it can not allocate the low memory it needs. In reality the actual usable memory starts much higher, which is reflected correctly in the memory map after this fix.
This is a u-boot patch right? (I cannot tell from the context, as there are no mailing lists on cc)
It is u-boot's job to describe all the memory, no matter how it is used. Even if the kernel itself may not use it as system memory, there are cases where kernel infrastructure is used to map these regions: for instance, the ACPI core may need to map a SystemMemory OpRegion, and we need the EFI memory map to tell us whether to use memory or I/O semantics.
As for the 128 MB issue: can you reproduce this with a recent kernel? We made some changes recently to the EFI stub as well as the decompressor code to prevent the decompressor code from relocating itself to the base of DRAM, and to allow the decompressed kernel to reside at any 2 MB aligned offset in memory.
I'll try this and get back to you!
-- Kristian
The 'no-map' requirement needs to be fulfilled by the kernel. The GetMemoryMap() boot time service must comply to the UEFI specification. The Devicetree Specification has this clarification: "Reserved regions with the no-map property must be listed in the memory map with type EfiReservedMemoryType. All other reserved regions must be listed with type EfiBootServicesData." https://devicetree-specification.readthedocs.io/en/latest/chapter3-devicenodes.html <https://devicetree-specification.readthedocs.io/en/latest/chapter3-devicenodes.html> Should the kernel calculate its internal 128 MiB requirement incorrectly this needs be fixed in the kernel EFI stub. Does the problem exist with kernel 5.14? I wonder if the 128 MiB requirement of the kernel can be lifted for 32bit ARM as we already did for RISC-V. Cf.
As mentioned above, this should already be fixed, in v5.11 or later
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.14&id=c79e89ecaa246c880292ba68cbe08c9c30db77e3 <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.14&id=c79e89ecaa246c880292ba68cbe08c9c30db77e3> Cc Ard, maintainer of the kernel EFI stub. Best regards Heinrich Signed-off-by: Kristian Amlie <kristian.amlie@northern.tech> ------------------------------------------------------------------------ lib/efi_loader/efi_memory.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/lib/efi_loader/efi_memory.c b/lib/efi_loader/efi_memory.c index f4acbee4f9..7f8543143a 100644 --- a/lib/efi_loader/efi_memory.c +++ b/lib/efi_loader/efi_memory.c @@ -646,8 +646,16 @@ efi_status_t efi_get_memory_map(efi_uintn_t *memory_map_size, provided_map_size = *memory_map_size; - list_for_each(lhandle, &efi_mem) + list_for_each(lhandle, &efi_mem) { + struct efi_mem_list *lmem; + + lmem = list_entry(lhandle, struct efi_mem_list, link); + + if (lmem->desc.type == EFI_RESERVED_MEMORY_TYPE) + continue; + map_entries++; + } map_size = map_entries * sizeof(struct efi_mem_desc); @@ -672,6 +680,10 @@ efi_status_t efi_get_memory_map(efi_uintn_t *memory_map_size, struct efi_mem_list *lmem; lmem = list_entry(lhandle, struct efi_mem_list, link); + + if (lmem->desc.type == EFI_RESERVED_MEMORY_TYPE) + continue; + *memory_map = lmem->desc; memory_map--; }

On 8/31/21 1:12 PM, Kristian Amlie wrote:
On 31/08/2021 12:46, Heinrich Schuchardt wrote:
*Von:* Ard Biesheuvel ardb@kernel.org *Gesendet:* 31. August 2021 12:33:56 MESZ *An:* Heinrich Schuchardt xypron.glpk@gmx.de *CC:* Kristian Amlie kristian.amlie@northern.tech *Betreff:* Re: [PATCH] efi_loader: Omit memory with "no-map" when returning memory map.
On Tue, 31 Aug 2021 at 08:41, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 8/27/21 9:55 AM, Kristian Amlie wrote: You can use scripts/get_maintainer.pl to find the right addressees for your patches. efi_reserve_memory() states that memory marked with "no-map" shall not be accessed by the UEFI payload. Make sure efi_get_memory_map() honors this.
Accessing memory and describing memory are two different things. Describing reserved memory in the memory map is important, because it helps us distinguish it from MMIO regions.
Ok, my mistake, I thought the kernel would deduce this separately through the DTB.
This helps the case when booting vexpress_ca9x4 under QEMU. Because the kernel wants to use an address in the lowest 128MiB of the range, but this range is reserved with "no-map", the kernel complains that it can not allocate the low memory it needs. In reality the actual usable memory starts much higher, which is reflected correctly in the memory map after this fix.
This is a u-boot patch right? (I cannot tell from the context, as there are no mailing lists on cc)
It is u-boot's job to describe all the memory, no matter how it is used. Even if the kernel itself may not use it as system memory, there are cases where kernel infrastructure is used to map these regions: for instance, the ACPI core may need to map a SystemMemory OpRegion, and we need the EFI memory map to tell us whether to use memory or I/O semantics.
As for the 128 MB issue: can you reproduce this with a recent kernel? We made some changes recently to the EFI stub as well as the decompressor code to prevent the decompressor code from relocating itself to the base of DRAM, and to allow the decompressed kernel to reside at any 2 MB aligned offset in memory.
This would allow putting the kernel at the very top of memory. But consider this function that misbehaves:
arch/arm/include/asm/efi.h:
76 /* on ARM, the initrd should be loaded in a lowmem region */ 77 static inline unsigned long efi_get_max_initrd_addr(unsigned long image_addr) 78 { 79 return round_down(image_addr, SZ_4M) + SZ_512M; 80 }
0x1F000000 = efi_get_max_initrd_addr(0xff000000);
And below 0x1f000000 there may be no RAM at all.
Best regards
Heinrich
I'll try this and get back to you!
-- Kristian
The 'no-map' requirement needs to be fulfilled by the kernel. The GetMemoryMap() boot time service must comply to the UEFI specification. The Devicetree Specification has this clarification: "Reserved regions with the no-map property must be listed in the memory map with type EfiReservedMemoryType. All other reserved regions must be listed with type EfiBootServicesData." https://devicetree-specification.readthedocs.io/en/latest/chapter3-devicenodes.html <https://devicetree-specification.readthedocs.io/en/latest/chapter3-devicenodes.html> Should the kernel calculate its internal 128 MiB requirement incorrectly this needs be fixed in the kernel EFI stub. Does the problem exist with kernel 5.14? I wonder if the 128 MiB requirement of the kernel can be lifted for 32bit ARM as we already did for RISC-V. Cf.
As mentioned above, this should already be fixed, in v5.11 or later
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.14&id=c79e89ecaa246c880292ba68cbe08c9c30db77e3 <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.14&id=c79e89ecaa246c880292ba68cbe08c9c30db77e3> Cc Ard, maintainer of the kernel EFI stub. Best regards Heinrich Signed-off-by: Kristian Amlie <kristian.amlie@northern.tech> ------------------------------------------------------------------------ lib/efi_loader/efi_memory.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/lib/efi_loader/efi_memory.c b/lib/efi_loader/efi_memory.c index f4acbee4f9..7f8543143a 100644 --- a/lib/efi_loader/efi_memory.c +++ b/lib/efi_loader/efi_memory.c @@ -646,8 +646,16 @@ efi_status_t efi_get_memory_map(efi_uintn_t *memory_map_size, provided_map_size = *memory_map_size; - list_for_each(lhandle, &efi_mem) + list_for_each(lhandle, &efi_mem) { + struct efi_mem_list *lmem; + + lmem = list_entry(lhandle, struct efi_mem_list, link); + + if (lmem->desc.type == EFI_RESERVED_MEMORY_TYPE) + continue; + map_entries++; + } map_size = map_entries * sizeof(struct efi_mem_desc); @@ -672,6 +680,10 @@ efi_status_t efi_get_memory_map(efi_uintn_t *memory_map_size, struct efi_mem_list *lmem; lmem = list_entry(lhandle, struct efi_mem_list, link); + + if (lmem->desc.type == EFI_RESERVED_MEMORY_TYPE) + continue; + *memory_map = lmem->desc; memory_map--; }

On Tue, 31 Aug 2021 at 18:59, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 8/31/21 1:12 PM, Kristian Amlie wrote:
On 31/08/2021 12:46, Heinrich Schuchardt wrote:
*Von:* Ard Biesheuvel ardb@kernel.org *Gesendet:* 31. August 2021 12:33:56 MESZ *An:* Heinrich Schuchardt xypron.glpk@gmx.de *CC:* Kristian Amlie kristian.amlie@northern.tech *Betreff:* Re: [PATCH] efi_loader: Omit memory with "no-map" when returning memory map.
On Tue, 31 Aug 2021 at 08:41, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 8/27/21 9:55 AM, Kristian Amlie wrote: You can use scripts/get_maintainer.pl to find the right addressees for your patches. efi_reserve_memory() states that memory marked with "no-map" shall not be accessed by the UEFI payload. Make sure efi_get_memory_map() honors this.
Accessing memory and describing memory are two different things. Describing reserved memory in the memory map is important, because it helps us distinguish it from MMIO regions.
Ok, my mistake, I thought the kernel would deduce this separately through the DTB.
This helps the case when booting vexpress_ca9x4 under QEMU. Because the kernel wants to use an address in the lowest 128MiB of the range, but this range is reserved with "no-map", the kernel complains that it can not allocate the low memory it needs. In reality the actual usable memory starts much higher, which is reflected correctly in the memory map after this fix.
This is a u-boot patch right? (I cannot tell from the context, as there are no mailing lists on cc)
It is u-boot's job to describe all the memory, no matter how it is used. Even if the kernel itself may not use it as system memory, there are cases where kernel infrastructure is used to map these regions: for instance, the ACPI core may need to map a SystemMemory OpRegion, and we need the EFI memory map to tell us whether to use memory or I/O semantics.
As for the 128 MB issue: can you reproduce this with a recent kernel? We made some changes recently to the EFI stub as well as the decompressor code to prevent the decompressor code from relocating itself to the base of DRAM, and to allow the decompressed kernel to reside at any 2 MB aligned offset in memory.
This would allow putting the kernel at the very top of memory.
No, not at all. The way Linux/ARM defines its linear map is tied to the placement of the kernel image, and any memory below it cannot be used by the OS. IOW, placing the kernel at the very top of memory would result in zero MB of lowmem being available, and therefore no successful boot.
Formerly, the decompressor would simply round down the decompressor's load address to 128 MB, and use the resulting value as the load address of the decompressed kernel. This meant that a) systems where the DRAM banks are not on a 128 MB boundary were difficult to support b) having reserved regions at the start of memory was problematic, because the decompressor did not look at the device tree at all (this is why we have all these TEXT_OFFSET hacks in the ARM kernel) c) the EFI stub was forced to relocate itself into the first 128 MB of memory, or the decompressor would misidentify the start of DRAM.
This has been fixed now: we can find the start of DRAM in the device tree when necessary, but for EFI boot, we use the memory map to define that kernel load address and pass it directly. This means we no longer need to move the decompressor before invoking it (d0f9ca9be11f25ef4151195eab7ea36d136084f6)
This uncovered another issue, though: the minimal relative alignment of physical vs. virtual addresses was 16 MB, to ensure that the PA-to-VA and vice versa routines worked correctly. So the tiniest memory reservation at the start of DRAM would mean losing ~16 MB of memory (given that DRAM below the kernel is not usable)
I fixed this in 9443076e4330a14ae2c6114307668b98a8293b77, and so now, the kernel is loaded at the lowest available 2 MB boundary.
But consider this function that misbehaves:
arch/arm/include/asm/efi.h:
76 /* on ARM, the initrd should be loaded in a lowmem region */ 77 static inline unsigned long efi_get_max_initrd_addr(unsigned long image_addr) 78 { 79 return round_down(image_addr, SZ_4M) + SZ_512M; 80 }
0x1F000000 = efi_get_max_initrd_addr(0xff000000);
And below 0x1f000000 there may be no RAM at all.
Yeah, that function is broken in more ways than one: the 512 MB is an approximation, but you could actually boot the kernel with a larger vmalloc space, in which case the lowmem region could be smaller than 512 MB.
But patches are welcome to address the issue you identified: taking the max() of (u64)(image_addr + SZ_512M) and ULONG_MAX should do the trick, I suppose.

On 31/08/2021 13:12, Kristian Amlie wrote:
On 31/08/2021 12:46, Heinrich Schuchardt wrote:
*Von:* Ard Biesheuvel ardb@kernel.org *Gesendet:* 31. August 2021 12:33:56 MESZ *An:* Heinrich Schuchardt xypron.glpk@gmx.de *CC:* Kristian Amlie kristian.amlie@northern.tech *Betreff:* Re: [PATCH] efi_loader: Omit memory with "no-map" when returning memory map.
On Tue, 31 Aug 2021 at 08:41, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 8/27/21 9:55 AM, Kristian Amlie wrote: You can use scripts/get_maintainer.pl to find the right addressees for your patches. efi_reserve_memory() states that memory marked with "no-map" shall not be accessed by the UEFI payload. Make sure efi_get_memory_map() honors this.
Accessing memory and describing memory are two different things. Describing reserved memory in the memory map is important, because it helps us distinguish it from MMIO regions.
Ok, my mistake, I thought the kernel would deduce this separately through the DTB.
This helps the case when booting vexpress_ca9x4 under QEMU. Because the kernel wants to use an address in the lowest 128MiB of the range, but this range is reserved with "no-map", the kernel complains that it can not allocate the low memory it needs. In reality the actual usable memory starts much higher, which is reflected correctly in the memory map after this fix.
This is a u-boot patch right? (I cannot tell from the context, as there are no mailing lists on cc)
It is u-boot's job to describe all the memory, no matter how it is used. Even if the kernel itself may not use it as system memory, there are cases where kernel infrastructure is used to map these regions: for instance, the ACPI core may need to map a SystemMemory OpRegion, and we need the EFI memory map to tell us whether to use memory or I/O semantics.
As for the 128 MB issue: can you reproduce this with a recent kernel? We made some changes recently to the EFI stub as well as the decompressor code to prevent the decompressor code from relocating itself to the base of DRAM, and to allow the decompressed kernel to reside at any 2 MB aligned offset in memory.
I'll try this and get back to you!
The result have been a bit inconclusive. I *think* it works, but the system later crashes because init is killed. Which may be a problem with that kernel in general, I don't know. I would require more time to figure out exactly what's causing it. The early boot and all the memory initialization seems to work just fine though.
I have pasted the log below if you want to look at it, but to me the error looks unrelated.
--- BOOT LOG ---
U-Boot 2021.07 (Jul 05 2021 - 15:11:28 +0000)
DRAM: 256 MiB WARNING: Caches not enabled Flash: 64 MiB MMC: sysctl@1000 - probe failed: -19 aaci@4000 - probe failed: -19 uart@9000 - probe failed: -19 uart@a000 - probe failed: -19 uart@b000 - probe failed: -19 uart@c000 - probe failed: -19 timer@11000 - probe failed: -19 timer@12000 - probe failed: -19 rtc@17000 - probe failed: -19 clcd@1f000 - probe failed: -19 clcd@10020000 - probe failed: -19 memory-controller@100e0000 - probe failed: -19 memory-controller@100e1000 - probe failed: -19 watchdog@100e5000 - probe failed: -19
Loading Environment from Flash... *** Warning - bad CRC, using default environment
In: serial Out: serial Err: serial Net: eth0: ethernet@3,02000000 Hit any key to stop autoboot: 0 no mmc device at slot 1 switch to partitions #0, OK mmc0 is current device Scanning mmc 0:1... 14163 bytes read in 16 ms (864.3 KiB/s) Scanning disk sysctl@1000.blk... Disk sysctl@1000.blk not ready Scanning disk aaci@4000.blk... Disk aaci@4000.blk not ready Scanning disk mmci@5000.blk... Scanning disk uart@9000.blk... Disk uart@9000.blk not ready Scanning disk uart@a000.blk... Disk uart@a000.blk not ready Scanning disk uart@b000.blk... Disk uart@b000.blk not ready Scanning disk uart@c000.blk... Disk uart@c000.blk not ready Scanning disk timer@11000.blk... Disk timer@11000.blk not ready Scanning disk timer@12000.blk... Disk timer@12000.blk not ready Scanning disk rtc@17000.blk... Disk rtc@17000.blk not ready Scanning disk clcd@1f000.blk... Disk clcd@1f000.blk not ready Scanning disk clcd@10020000.blk... Disk clcd@10020000.blk not ready Scanning disk memory-controller@100e0000.bl... Disk memory-controller@100e0000.bl not ready Scanning disk memory-controller@100e1000.bl... Disk memory-controller@100e1000.bl not ready Scanning disk watchdog@100e5000.blk... Disk watchdog@100e5000.blk not ready Found 5 disks ** Unable to read file ubootefi.var ** Failed to load EFI variables BootOrder not defined EFI boot manager: Cannot load any image Found EFI removable media binary efi/boot/bootarm.efi 536576 bytes read in 113 ms (4.5 MiB/s) Booting /efi\boot\bootarm.efi Welcome to GRUB!
lock: OK lock: OK
EFI stub: Entering in SVC mode with MMU disabled EFI stub: Booting Linux Kernel... EFI stub: Using DTB from configuration table EFI stub: Exiting boot services and installing virtual address map... [ 0.000000] Booting Linux on physical CPU 0x0 [ 0.000000] Linux version 5.10.60-yocto-standard (oe-user@oe-host) (arm-poky-linux-gnueabi-gcc (GCC) 9.3.0, GNU ld (GNU Binutils) 2.34.0.20200220) #1 SMP Mon Aug 23 03:04:08 UTC 2021 [ 0.000000] CPU: ARMv7 Processor [410fc090] revision 0 (ARMv7), cr=10c5387d [ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache [ 0.000000] OF: fdt: Machine model: V2P-CA9 [ 0.000000] OF: fdt: Ignoring memory block 0x80000000 - 0x80000004 [ 0.000000] Memory policy: Data cache writeback [ 0.000000] efi: EFI v2.80 by Das U-Boot [ 0.000000] efi: RTPROP=0x6ee57040 SMBIOS=0x6ee53000 MEMRESERVE=0x6e5a9040 [ 0.000000] OF: fdt: Ignoring memory block 0x4c000000 - 0x4c800000 [ 0.000000] efi: [Firmware Bug]: EFI stub was entered with MMU and Dcache disabled, please fix your firmware! [ 0.000000] efi: CPSR at EFI stub entry : 0x600001d3 [ 0.000000] efi: SCTLR at EFI stub entry : 0x00c51878 [ 0.000000] efi: CPSR after ExitBootServices() : 0x600001d3 [ 0.000000] efi: SCTLR after ExitBootServices(): 0x00c50878 [ 0.000000] Reserved memory: created DMA memory pool at 0x4c000000, size 8 MiB [ 0.000000] OF: reserved mem: initialized node vram@4c000000, compatible id shared-dma-pool [ 0.000000] Zone ranges: [ 0.000000] Normal [mem 0x0000000060000000-0x000000006fffffff] [ 0.000000] Movable zone start for each node [ 0.000000] Early memory node ranges [ 0.000000] node 0: [mem 0x0000000060000000-0x000000006e9b5fff] [ 0.000000] node 0: [mem 0x000000006e9b6000-0x000000006e9b7fff] [ 0.000000] node 0: [mem 0x000000006e9b8000-0x000000006ee4afff] [ 0.000000] node 0: [mem 0x000000006ee4b000-0x000000006ee51fff] [ 0.000000] node 0: [mem 0x000000006ee52000-0x000000006ee52fff] [ 0.000000] node 0: [mem 0x000000006ee53000-0x000000006ee53fff] [ 0.000000] node 0: [mem 0x000000006ee54000-0x000000006ee55fff] [ 0.000000] node 0: [mem 0x000000006ee56000-0x000000006ee59fff] [ 0.000000] node 0: [mem 0x000000006ee5a000-0x000000006ee5afff] [ 0.000000] node 0: [mem 0x000000006ee5b000-0x000000006ee5ffff] [ 0.000000] node 0: [mem 0x000000006ee60000-0x000000006ee60fff] [ 0.000000] node 0: [mem 0x000000006ee61000-0x000000006ee61fff] [ 0.000000] node 0: [mem 0x000000006ee62000-0x000000006ee62fff] [ 0.000000] node 0: [mem 0x000000006ee63000-0x000000006ee63fff] [ 0.000000] node 0: [mem 0x000000006ee64000-0x000000006ee64fff] [ 0.000000] node 0: [mem 0x000000006ee65000-0x000000006ee65fff] [ 0.000000] node 0: [mem 0x000000006ee66000-0x000000006ee67fff] [ 0.000000] node 0: [mem 0x000000006ee68000-0x000000006ee68fff] [ 0.000000] node 0: [mem 0x000000006ee69000-0x000000006ff6bfff] [ 0.000000] node 0: [mem 0x000000006ff6c000-0x000000006ff6dfff] [ 0.000000] node 0: [mem 0x000000006ff6e000-0x000000006fffffff] [ 0.000000] Initmem setup node 0 [mem 0x0000000060000000-0x000000006fffffff] [ 0.000000] CPU: All CPU(s) started in SVC mode. [ 0.000000] percpu: Embedded 15 pages/cpu s29324 r8192 d23924 u61440 [ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 65024 [ 0.000000] Kernel command line: BOOT_IMAGE=(hd0,2)/boot/zImage root=/dev/mmcblk0p2 console=ttyAMA0,115200n8 rootwait [ 0.000000] printk: log_buf_len individual max cpu contribution: 4096 bytes [ 0.000000] printk: log_buf_len total cpu_extra contributions: 12288 bytes [ 0.000000] printk: log_buf_len min size: 16384 bytes [ 0.000000] printk: log_buf_len: 32768 bytes [ 0.000000] printk: early log buf free: 13072(79%) [ 0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear) [ 0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear) [ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off [ 0.000000] Memory: 250940K/262144K available (5987K kernel code, 243K rwdata, 1412K rodata, 340K init, 176K bss, 11204K reserved, 0K cma-reserved) [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 [ 0.000000] rcu: Hierarchical RCU implementation. [ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=4. [ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies. [ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4 [ 0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16 [ 0.000000] GIC CPU mask not found - kernel will fail to boot. [ 0.000000] GIC CPU mask not found - kernel will fail to boot. [ 0.000000] L2C: platform modifies aux control register: 0x02020000 -> 0x02420000 [ 0.000000] L2C: DT/platform modifies aux control register: 0x02020000 -> 0x02420000 [ 0.000000] L2C-310 enabling early BRESP for Cortex-A9 [ 0.000000] L2C-310 full line of zeros enabled for Cortex-A9 [ 0.000000] L2C-310 dynamic clock gating disabled, standby mode disabled [ 0.000000] L2C-310 cache controller enabled, 8 ways, 128 kB [ 0.000000] L2C-310: CACHE_ID 0x410000c8, AUX_CTRL 0x46420001 [ 0.000263] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns [ 0.004179] clocksource: arm,sp804: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275 ns [ 0.005143] Failed to initialize '/bus@4000000/motherboard/iofpga@7,00000000/timer@12000': -22 [ 0.005908] smp_twd: clock not found -2 [ 0.011764] Console: colour dummy device 80x30 [ 0.012281] Calibrating local timer... 92.08MHz. [ 0.064864] Calibrating delay loop... 606.20 BogoMIPS (lpj=3031040) [ 0.165603] pid_max: default: 32768 minimum: 301 [ 0.166869] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) [ 0.166928] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) [ 0.179849] CPU: Testing write buffer coherency: ok [ 0.180657] CPU0: Spectre v2: using BPIALL workaround [ 0.189553] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000 [ 0.195577] Setting up static identity map for 0x60008280 - 0x600082e0 [ 0.196963] rcu: Hierarchical SRCU implementation. [ 0.204617] Remapping and enabling EFI services. [ 0.206930] smp: Bringing up secondary CPUs ... [ 0.213694] smp: Brought up 1 node, 1 CPU [ 0.213793] SMP: Total of 1 processors activated (606.20 BogoMIPS). [ 0.213874] CPU: All CPU(s) started in SVC mode. [ 0.221890] devtmpfs: initialized [ 0.232819] random: get_random_u32 called from bucket_table_alloc+0x58/0x14c with crng_init=0 [ 0.235806] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 0 [ 0.256031] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns [ 0.256892] futex hash table entries: 1024 (order: 4, 65536 bytes, linear) [ 0.274887] NET: Registered protocol family 16 [ 0.309743] DMA: preallocated 256 KiB pool for atomic coherent allocations [ 0.314428] cpuidle: using governor ladder [ 0.314839] hw-breakpoint: debug architecture 0x4 unsupported. [ 0.315677] Serial: AMBA PL011 UART driver [ 0.537363] SCSI subsystem initialized [ 0.546398] usbcore: registered new interface driver usbfs [ 0.546752] usbcore: registered new interface driver hub [ 0.547004] usbcore: registered new device driver usb [ 0.573627] Registered efivars operations [ 0.574847] Advanced Linux Sound Architecture Driver Initialized. [ 0.604622] clocksource: Switched to clocksource arm,sp804 [ 0.698679] NET: Registered protocol family 2 [ 0.700112] IP idents hash table entries: 4096 (order: 3, 32768 bytes, linear) [ 0.703597] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 6144 bytes, linear) [ 0.703748] TCP established hash table entries: 2048 (order: 1, 8192 bytes, linear) [ 0.703939] TCP bind hash table entries: 2048 (order: 2, 16384 bytes, linear) [ 0.704091] TCP: Hash tables configured (established 2048 bind 2048) [ 0.706092] UDP hash table entries: 256 (order: 1, 8192 bytes, linear) [ 0.706307] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear) [ 0.713435] NET: Registered protocol family 1 [ 0.720521] RPC: Registered named UNIX socket transport module. [ 0.720614] RPC: Registered udp transport module. [ 0.720644] RPC: Registered tcp transport module. [ 0.720691] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 0.734819] workingset: timestamp_bits=30 max_order=16 bucket_order=0 [ 0.753411] squashfs: version 4.0 (2009/01/31) Phillip Lougher [ 0.758316] jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc. [ 0.760873] 9p: Installing v9fs 9p2000 file system support [ 0.842588] loop: module loaded [ 1.177339] physmap-flash 40000000.flash: physmap platform flash device: [mem 0x40000000-0x43ffffff] [ 1.178885] 40000000.flash: Found 2 x16 devices at 0x0 in 32-bit bank. Manufacturer ID 0x000000 Chip ID 0x000000 [ 1.179365] Intel/Sharp Extended Query Table at 0x0031 [ 1.179912] Using buffer write method [ 1.182505] physmap-flash 40000000.flash: physmap platform flash device: [mem 0x44000000-0x47ffffff] [ 1.182916] 40000000.flash: Found 2 x16 devices at 0x0 in 32-bit bank. Manufacturer ID 0x000000 Chip ID 0x000000 [ 1.182986] Intel/Sharp Extended Query Table at 0x0031 [ 1.183307] Using buffer write method [ 1.183452] Concatenating MTD devices: [ 1.183498] (0): "40000000.flash" [ 1.183531] (1): "40000000.flash" [ 1.183557] into device "40000000.flash" [ 1.199746] physmap-flash 48000000.psram: physmap platform flash device: [mem 0x48000000-0x49ffffff] [ 1.209640] libphy: Fixed MDIO Bus: probed [ 1.274434] libphy: smsc911x-mdio: probed [ 1.276718] smsc911x 4e000000.ethernet eth0: MAC Address: 52:54:00:91:52:86 [ 1.381425] isp1760 4f000000.usb: bus width: 32, oc: digital [ 1.381905] isp1760 4f000000.usb: NXP ISP1760 USB Host Controller [ 1.382136] isp1760 4f000000.usb: new USB bus registered, assigned bus number 1 [ 1.382892] isp1760 4f000000.usb: Scratch test failed. [ 1.383047] isp1760 4f000000.usb: can't setup: -19 [ 1.383279] isp1760 4f000000.usb: USB bus 1 deregistered [ 1.386211] usbcore: registered new interface driver usb-storage [ 1.391247] mousedev: PS/2 mouse device common for all mice [ 1.425411] ledtrig-cpu: registered to indicate activity on CPUs [ 1.429427] usbcore: registered new interface driver usbhid [ 1.429482] usbhid: USB HID core driver [ 1.451206] NET: Registered protocol family 17 [ 1.452143] 9pnet: Installing 9P2000 support [ 1.452380] oprofile: hardware counters not available [ 1.452438] oprofile: using timer interrupt. [ 1.452807] Registering SWP/SWPB emulation handler [ 1.490295] aaci-pl041 10004000.aaci: ARM AC'97 Interface PL041 rev0 at 0x10004000, irq 28 [ 1.490369] aaci-pl041 10004000.aaci: FIFO 512 entries [ 1.496852] mmci-pl18x 10005000.mmci: Got CD GPIO [ 1.497048] mmci-pl18x 10005000.mmci: Got WP GPIO [ 1.505820] mmci-pl18x 10005000.mmci: mmc0: PL181 manf 41 rev0 at 0x10005000 irq 29,30 (pio) [ 1.558779] 10009000.uart: ttyAMA0 at MMIO 0x10009000 (irq = 33, base_baud = 0) is a PL011 rev1 [ 1.624113] mmc0: new SD card at address 4567 [ 1.632634] mmcblk0: mmc0:4567 QEMU! 608 MiB [ 1.645152] printk: console [ttyAMA0] enabled [ 1.649191] 1000a000.uart: ttyAMA1 at MMIO 0x1000a000 (irq = 34, base_baud = 0) is a PL011 rev1 [ 1.657083] 1000b000.uart: ttyAMA2 at MMIO 0x1000b000 (irq = 35, base_baud = 0) is a PL011 rev1 [ 1.667853] 1000c000.uart: ttyAMA3 at MMIO 0x1000c000 (irq = 36, base_baud = 0) is a PL011 rev1 [ 1.678424] input: AT Raw Set 2 keyboard as /devices/platform/bus@4000000/bus@4000000:motherboard/bus@4000000:motherboard:iofpga@7,00000000/10006000.kmi/serio0/input/input0 [ 1.696853] random: fast init done [ 1.704747] rtc-pl031 10017000.rtc: registered as rtc0 [ 1.705851] rtc-pl031 10017000.rtc: setting system clock to 2021-09-01T15:04:45 UTC (1630508685) [ 1.716470] clcd-pl11x 1001f000.clcd: PL111 designer 41 rev2 at 0x1001f000 [ 1.717596] clcd-pl11x: probe of 1001f000.clcd failed with error -2 [ 1.720860] clcd-pl11x 10020000.clcd: PL111 designer 41 rev2 at 0x10020000 [ 1.721435] clcd-pl11x: probe of 10020000.clcd failed with error -2 [ 1.732424] mmcblk0: p1 p2 p3 p4 [ 1.735612] ALSA device list: [ 1.735830] #0: ARM AC'97 Interface PL041 rev0 at 0x10004000, irq 28 [ 2.275907] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/bus@4000000/bus@4000000:motherboard/bus@4000000:motherboard:iofpga@7,00000000/10007000.kmi/serio1/input/input2 [ 2.314240] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null) [ 2.315988] VFS: Mounted root (ext4 filesystem) readonly on device 179:2. [ 2.330406] Freeing unused kernel memory: 340K [ 2.330669] Kernel memory protection not selected by kernel config. [ 2.331244] Run /sbin/init as init process [ 2.591922] random: crng init done [ 3.912835] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 [ 3.913375] CPU: 0 PID: 1 Comm: systemd Not tainted 5.10.60-yocto-standard #1 [ 3.913726] Hardware name: ARM-Versatile Express [ 3.914929] [<c0014950>] (unwind_backtrace) from [<c00118cc>] (show_stack+0x10/0x14) [ 3.915539] [<c00118cc>] (show_stack) from [<c05d6be0>] (dump_stack+0x98/0xac) [ 3.915930] [<c05d6be0>] (dump_stack) from [<c05d323c>] (panic+0xf8/0x2e8) [ 3.916324] [<c05d323c>] (panic) from [<c0025230>] (do_exit+0xa80/0xac4) [ 3.916699] [<c0025230>] (do_exit) from [<c0026100>] (do_group_exit+0x3c/0xc8) [ 3.917112] [<c0026100>] (do_group_exit) from [<c0030558>] (get_signal+0x140/0x864) [ 3.917576] [<c0030558>] (get_signal) from [<c0011220>] (do_work_pending+0xe0/0x4dc) [ 3.917963] [<c0011220>] (do_work_pending) from [<c0008344>] (slow_work_pending+0xc/0x20) [ 3.918465] Exception stack(0xc0c2dfb0 to 0xc0c2dff8) [ 3.918756] dfa0: 00000000 be929358 00000005 b6c963f0 [ 3.919102] dfc0: 00000000 b6f85010 be929590 b6f8a000 00000001 b6c94e3c b6cb0d8c be9295c8 [ 3.919547] dfe0: 00000020 be929468 ffffffff b6b69900 60000010 ffffffff [ 3.920503] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 ]---
--- END OF BOOT LOG ---
-- Kristian
The 'no-map' requirement needs to be fulfilled by the kernel. The GetMemoryMap() boot time service must comply to the UEFI specification. The Devicetree Specification has this clarification: "Reserved regions with the no-map property must be listed in the memory map with type EfiReservedMemoryType. All other reserved regions must be listed with type EfiBootServicesData." https://devicetree-specification.readthedocs.io/en/latest/chapter3-devicenodes.html <https://devicetree-specification.readthedocs.io/en/latest/chapter3-devicenodes.html> Should the kernel calculate its internal 128 MiB requirement incorrectly this needs be fixed in the kernel EFI stub. Does the problem exist with kernel 5.14? I wonder if the 128 MiB requirement of the kernel can be lifted for 32bit ARM as we already did for RISC-V. Cf.
As mentioned above, this should already be fixed, in v5.11 or later
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.14&id=c79e89ecaa246c880292ba68cbe08c9c30db77e3 <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.14&id=c79e89ecaa246c880292ba68cbe08c9c30db77e3> Cc Ard, maintainer of the kernel EFI stub. Best regards Heinrich Signed-off-by: Kristian Amlie <kristian.amlie@northern.tech> ------------------------------------------------------------------------ lib/efi_loader/efi_memory.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/lib/efi_loader/efi_memory.c b/lib/efi_loader/efi_memory.c index f4acbee4f9..7f8543143a 100644 --- a/lib/efi_loader/efi_memory.c +++ b/lib/efi_loader/efi_memory.c @@ -646,8 +646,16 @@ efi_status_t efi_get_memory_map(efi_uintn_t *memory_map_size, provided_map_size = *memory_map_size; - list_for_each(lhandle, &efi_mem) + list_for_each(lhandle, &efi_mem) { + struct efi_mem_list *lmem; + + lmem = list_entry(lhandle, struct efi_mem_list, link); + + if (lmem->desc.type == EFI_RESERVED_MEMORY_TYPE) + continue; + map_entries++; + } map_size = map_entries * sizeof(struct efi_mem_desc); @@ -672,6 +680,10 @@ efi_status_t efi_get_memory_map(efi_uintn_t *memory_map_size, struct efi_mem_list *lmem; lmem = list_entry(lhandle, struct efi_mem_list, link); + + if (lmem->desc.type == EFI_RESERVED_MEMORY_TYPE) + continue; + *memory_map = lmem->desc; memory_map--; }

On Thu, 2 Sept 2021 at 10:43, Kristian Amlie kristian.amlie@northern.tech wrote:
On 31/08/2021 13:12, Kristian Amlie wrote:
On 31/08/2021 12:46, Heinrich Schuchardt wrote:
*Von:* Ard Biesheuvel ardb@kernel.org *Gesendet:* 31. August 2021 12:33:56 MESZ *An:* Heinrich Schuchardt xypron.glpk@gmx.de *CC:* Kristian Amlie kristian.amlie@northern.tech *Betreff:* Re: [PATCH] efi_loader: Omit memory with "no-map" when returning memory map.
On Tue, 31 Aug 2021 at 08:41, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 8/27/21 9:55 AM, Kristian Amlie wrote: You can use scripts/get_maintainer.pl to find the right addressees for your patches. efi_reserve_memory() states that memory marked with "no-map" shall not be accessed by the UEFI payload. Make sure efi_get_memory_map() honors this.
Accessing memory and describing memory are two different things. Describing reserved memory in the memory map is important, because it helps us distinguish it from MMIO regions.
Ok, my mistake, I thought the kernel would deduce this separately through the DTB.
This helps the case when booting vexpress_ca9x4 under QEMU. Because the kernel wants to use an address in the lowest 128MiB of the range, but this range is reserved with "no-map", the kernel complains that it can not allocate the low memory it needs. In reality the actual usable memory starts much higher, which is reflected correctly in the memory map after this fix.
This is a u-boot patch right? (I cannot tell from the context, as there are no mailing lists on cc)
It is u-boot's job to describe all the memory, no matter how it is used. Even if the kernel itself may not use it as system memory, there are cases where kernel infrastructure is used to map these regions: for instance, the ACPI core may need to map a SystemMemory OpRegion, and we need the EFI memory map to tell us whether to use memory or I/O semantics.
As for the 128 MB issue: can you reproduce this with a recent kernel? We made some changes recently to the EFI stub as well as the decompressor code to prevent the decompressor code from relocating itself to the base of DRAM, and to allow the decompressed kernel to reside at any 2 MB aligned offset in memory.
I'll try this and get back to you!
The result have been a bit inconclusive. I *think* it works, but the system later crashes because init is killed. Which may be a problem with that kernel in general, I don't know. I would require more time to figure out exactly what's causing it. The early boot and all the memory initialization seems to work just fine though.
I have pasted the log below if you want to look at it, but to me the error looks unrelated.
Agreed. systemd-init crashes for some reason, but this is highly unlikely to be caused by the EFI memory map containing reserved memory regions.

On 02/09/2021 10:47, Ard Biesheuvel wrote:
On Thu, 2 Sept 2021 at 10:43, Kristian Amlie kristian.amlie@northern.tech wrote:
On 31/08/2021 13:12, Kristian Amlie wrote:
On 31/08/2021 12:46, Heinrich Schuchardt wrote:
*Von:* Ard Biesheuvel ardb@kernel.org *Gesendet:* 31. August 2021 12:33:56 MESZ *An:* Heinrich Schuchardt xypron.glpk@gmx.de *CC:* Kristian Amlie kristian.amlie@northern.tech *Betreff:* Re: [PATCH] efi_loader: Omit memory with "no-map" when returning memory map.
On Tue, 31 Aug 2021 at 08:41, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
On 8/27/21 9:55 AM, Kristian Amlie wrote: You can use scripts/get_maintainer.pl to find the right addressees for your patches. efi_reserve_memory() states that memory marked with "no-map" shall not be accessed by the UEFI payload. Make sure efi_get_memory_map() honors this.
Accessing memory and describing memory are two different things. Describing reserved memory in the memory map is important, because it helps us distinguish it from MMIO regions.
Ok, my mistake, I thought the kernel would deduce this separately through the DTB.
This helps the case when booting vexpress_ca9x4 under QEMU. Because the kernel wants to use an address in the lowest 128MiB of the range, but this range is reserved with "no-map", the kernel complains that it can not allocate the low memory it needs. In reality the actual usable memory starts much higher, which is reflected correctly in the memory map after this fix.
This is a u-boot patch right? (I cannot tell from the context, as there are no mailing lists on cc)
It is u-boot's job to describe all the memory, no matter how it is used. Even if the kernel itself may not use it as system memory, there are cases where kernel infrastructure is used to map these regions: for instance, the ACPI core may need to map a SystemMemory OpRegion, and we need the EFI memory map to tell us whether to use memory or I/O semantics.
As for the 128 MB issue: can you reproduce this with a recent kernel? We made some changes recently to the EFI stub as well as the decompressor code to prevent the decompressor code from relocating itself to the base of DRAM, and to allow the decompressed kernel to reside at any 2 MB aligned offset in memory.
I'll try this and get back to you!
The result have been a bit inconclusive. I *think* it works, but the system later crashes because init is killed. Which may be a problem with that kernel in general, I don't know. I would require more time to figure out exactly what's causing it. The early boot and all the memory initialization seems to work just fine though.
I have pasted the log below if you want to look at it, but to me the error looks unrelated.
Agreed. systemd-init crashes for some reason, but this is highly unlikely to be caused by the EFI memory map containing reserved memory regions.
Thanks for checking! Then from my perspective you can consider this issue resolved, and drop the patch.
participants (3)
-
Ard Biesheuvel
-
Heinrich Schuchardt
-
Kristian Amlie