
On Wed, Jul 07, 2021 at 11:37:19AM +0200, Fran??ois Ozog wrote:
On Wed, 7 Jul 2021 at 09:40, François Ozog francois.ozog@linaro.org wrote:
On Wed, 7 Jul 2021 at 05:59, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
Am 7. Juli 2021 05:18:20 MESZ schrieb Heinrich Schuchardt <
xypron.glpk@gmx.de>:
Am 7. Juli 2021 03:44:35 MESZ schrieb AKASHI Takahiro takahiro.akashi@linaro.org:
François,
On Tue, Jul 06, 2021 at 08:10:08PM +0200, Heinrich Schuchardt wrote:
On 7/6/21 6:13 PM, François Ozog wrote: > Hi Heinrich, U-Boot 2021-07rc5 does not take into account memory > description when using Qemu 5.2 NUMA configuration to adapt memory
map
> (kernel_addr_r...): > > -smp 4 \ > -m 8G,slots=2,maxmem=16G \ > -object memory-backend-ram,size=4G,id=m0 \ > -object memory-backend-ram,size=4G,id=m1 \ > -numa node,cpus=0-1,nodeid=0,memdev=m0 \ > -numa node,cpus=2-3,nodeid=1,memdev=m1 > > kernel_addr_r is still 0x4040000 and thus you can't use it to
bootefi.
> > fdt addr 0x13ede6de0; fdt print > > Displays fdt while I think it should not. > > If I load the kernel at dram.start, the load works but not boot > > U-Boot 2021.07 (Jul 06 2021 - 13:26:43 +0000) > > > DRAM:4 GiB > > Flash: 64 MiB > > Loading Environment from Flash... OK > > In:pl011@9000000 > > Out: pl011@9000000 > > Err: pl011@9000000 > > Net: eth0: virtio-net#32 > > Hit any key to stop autoboot:0 > > => > > => bdinfo > > boot_params = 0x0000000000000000 > > DRAM bank = 0x0000000000000000 > > -> start= 0x0000000140000000 > > -> size = 0x0000000100000000 > > flashstart= 0x0000000000000000 > > flashsize = 0x0000000004000000 > > flashoffset = 0x00000000000bc990 > > baudrate= 115200 bps > > relocaddr = 0x000000013ff27000 > > reloc off = 0x000000013ff27000 > > Build = 64-bit > > current eth = virtio-net#32 > > ethaddr = 52:52:52:52:52:52 > > IP addr = <NULL> > > fdt_blob= 0x000000013ede6de0 > > new_fdt = 0x000000013ede6de0 > > fdt_size= 0x0000000000100000 > > lmb_dump_all: > > memory.cnt= 0x1 > > memory.reg[0x0].base = 0x140000000 > > .size = 0x100000000 > > > reserved.cnt= 0x0 > > arch_number = 0x0000000000000000 > > TLB addr= 0x000000013fff0000 > > irq_sp= 0x000000013ede6dd0 > > sp start= 0x000000013ede6dd0 > > Early malloc usage: 3a8 / 2000 > > => load virtio 0:1 0x140000000 /oskit.efi > > 853424 bytes read in 1 ms (813.9 MiB/s) > > => bootefi0x140000000 0x13ede6dd0 > > ERROR: Failed to register WaitForKey event > > Setting OsIndications failed > > Error: Cannot initialize UEFI sub-system, r = 9 > > > I think there is a need to calculate memory map based on previous > firmware (TFA, QEMU can be considered as previous frimware)
information
> (DT or blob_list). > > What do you think ? > > Cheers > > FF > > -- > > François-Frédéric Ozog | /Director Business Development/ > T: +33.67221.6485 > francois.ozog@linaro.org mailto:francois.ozog@linaro.org
| Skype: ffozog
> >
The kernel load address is hard coded here: include/configs/qemu-arm.h:41: "kernel_addr_r=0x40400000\0" \
bdinfo shows: DRAM start = 0x140000000 DRAM size = 0x100000000
fdt addr $fdt_addr fdt printf
shows two memory areas. One at 40000000, one at 140000000.
(This shows that U-Boot receives a correct memory map via dtb.)
Is this a NUMA machine, isn't it? Why should we care of which memory region be used here? Please note that this is a virtual
machine,
there is no practical difference between two regions.
The root problem is that U-Boot did not recognize there were two memory regions. We can fix this issue in either way:
diff --git a/configs/qemu_arm64_defconfig b/configs/qemu_arm64_defconfig index f6e586627a8e..b70ffae8bf6e 100644 --- a/configs/qemu_arm64_defconfig +++ b/configs/qemu_arm64_defconfig @@ -1,7 +1,7 @@ CONFIG_ARM=y CONFIG_POSITION_INDEPENDENT=y CONFIG_ARCH_QEMU=y -CONFIG_NR_DRAM_BANKS=1 +CONFIG_NR_DRAM_BANKS=2 CONFIG_ENV_SIZE=0x40000 CONFIG_ENV_SECT_SIZE=0x40000 CONFIG_AHCI=y
diff --git a/lib/fdtdec.c b/lib/fdtdec.c index 4b097fb588ed..4067ea2dead6 100644 --- a/lib/fdtdec.c +++ b/lib/fdtdec.c @@ -1111,7 +1111,7 @@ int fdtdec_setup_memory_banksize(void) return -EINVAL; }
for (bank = 0; bank < CONFIG_NR_DRAM_BANKS; bank++) {
for (bank = 0; ; bank++) { ret = ofnode_read_resource(mem, reg++, &res); if (ret < 0) { reg = 0;
(fdtdec_setup_memory_banksize() is called in dram_init_banksize().)
(2) seems much better, but I don't know why we had to use CONFIG_NR_DRAM_BANKS here.
- alone does not work as other places in the code refer to
CONFIG_NR_DRAM_BANKS. Setting ...BANKS to 32 makes my code work and bdinfo seems now correct:
=> bdinfo
boot_params = 0x0000000000000000 DRAM bank = 0x0000000000000000 -> start = 0x0000000140000000 -> size = 0x0000000100000000 DRAM bank = 0x0000000000000001 -> start = 0x0000000040000000 -> size = 0x0000000100000000 flashstart = 0x0000000000000000 flashsize = 0x0000000004000000 flashoffset = 0x00000000000bcb88 baudrate = 115200 bps relocaddr = 0x000000013ff27000 reloc off = 0x000000013ff27000 Build = 64-bit current eth = virtio-net#32 ethaddr = 52:52:52:52:52:52 IP addr = <NULL> fdt_blob = 0x000000013ede6cf0 new_fdt = 0x000000013ede6cf0 fdt_size = 0x0000000000100000 lmb_dump_all: memory.cnt = 0x1 memory.reg[0x0].base = 0x40000000 .size = 0x200000000 reserved.cnt = 0x1 reserved.reg[0x0].base = 0x13ede58f0 .size = 0x121a710 arch_number = 0x0000000000000000 TLB addr = 0x000000013fff0000 irq_sp = 0x000000013ede6ce0 sp start = 0x000000013ede6ce0 Early malloc usage: 3a8 / 2000
May I suggest you propose a combined patch Akashi-san? If we assume NUMA systems to be tested up to 8 nodes to mimic real existing enterprise hardware and up to 4 memory slots (say for memory hot plugging tests) what about a default value of 32? Alternatively, we could set this value to a much higher one if the costs are negligible.
Well, lets not rush as there are other twists:
the 4G bank in node 1 is marked BootServicesData in the UEFI GetMemoryMap which I assume is not the case. EDK2 reports it as ConventionalMemory.
The root cause seem to be gd->ramtop not being setup properly.
Further analysis shows that the DT passed to the booted EFI payload does not seem to be correct:
DT fragment passed to U-Boot
memory@140000000 { numa-node-id = <0x00000001>; reg = <0x00000001 0x40000000 0x00000001 0x00000000>; device_type = "memory"; }; memory@40000000 { numa-node-id = <0x00000000>; reg = <0x00000000 0x40000000 0x00000001 0x00000000>; device_type = "memory"; };
DT passed to payload (as per my debug code):
memory@140000000: memory
numa-node-id 1 reg (len= 32) 140000000 100000000 40000000 100000000
memory@40000000: memory
numa-node-id 0 reg (len= 16) 40000000 100000000
I am investigating this further...
You should check the logic of fdt_fixup_memory_banks() which is called this way: efi_dt_fixup() image_setup_libfdt() arch_fixup_fdt() fdt_fixup_memory_banks()
What it does is to put *all* the memory regions unconditionally as a single "reg" array into the *first-detected* "memory" node, which is "memory@140000000" in this case. It means that this function doesn't respect NUMA configuration.
-Takahiro Akashi
In this case, other occurrences of CONFIG_NR_DRAM_BANKS in this file
should be replaced with a variable for it.
Your use case is well beyond the typical U-Boot usage. So I guess it will be up to Linaro to provide the necessary patches:
- determine the active CPU
- determine the RAM assigned to the active CPU according to the numa-node-id in the device-tree
- make sure that U-Boot only uses the memory of the active CPU internally
- make sure that the UEFI memory map contains a compliant
description
- possibly, dynamically set up the environment variables
+CC Tuomas Tynkkynen (maintainer for qemu_arm64_defconfig)
For (1), we'd better have a different config, or increase the value of CONFIG_NR_DRAM_BANKS to a bigger number?
Is the system configured such that each CPU can access the others CPU's RAM when entering U-Boot?
Best regards
Heinrich
At least the comments for this patch sound as if on a physical system
cross NUMA node memory access is only available after full SMP initialization:
https://patchwork.kernel.org/project/linux-acpi/patch/20180625130552.5636-1-...
QEMU may be less restrictive.
QEMU allows the node distance to be 255 indicating that cross node
access is infeasible.
Best regards
Heinrich
-Takahiro Akashi
Best regards
Heinrich
-- François-Frédéric Ozog | Director Business Development T: +33.67221.6485 francois.ozog@linaro.org | Skype: ffozog
-- François-Frédéric Ozog | *Director Business Development* T: +33.67221.6485 francois.ozog@linaro.org | Skype: ffozog