v2023.07-rc5 regression: Image overlaps SPL

Hello Marek, as briefly discussed off-list it looks like commit 77aed22b48ab ("spl: spl_legacy: Add extra address checks") introduces a regression on some board/arch, at least colibri and apalis imx6 fails to boot now
``` Trying to boot from MMC1 SPL: Image overlaps SPL resetting ... ```
From a quick check verdin imx8mm and imx8mp are not affected.
I also noticed something weird on a colibri imx7s, this is not using SPL, likely something completly different, however given this is new also from rc5 I thought it's valuable to report:
``` U-Boot 2023.07-rc5-0.0.0-devel+git.580eb31199be (Jun 27 2023 - 13:39:58 +0000) CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 30C Reset cause: POR DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3) ### ERROR ### Please RESET the board ### ```
Currently at the EOSS in Prague, so no way to do more testing/debugging, but I tought it was better to start reporting this on the list since we are getting close to the actual release.
Francesco

On Thu, Jun 29, 2023 at 04:19:22PM +0200, Francesco Dolcini wrote:
Hello Marek, as briefly discussed off-list it looks like commit 77aed22b48ab ("spl: spl_legacy: Add extra address checks") introduces a regression on some board/arch, at least colibri and apalis imx6 fails to boot now
Trying to boot from MMC1 SPL: Image overlaps SPL resetting ...
From a quick check verdin imx8mm and imx8mp are not affected.
I also noticed something weird on a colibri imx7s, this is not using SPL, likely something completly different, however given this is new also from rc5 I thought it's valuable to report:
U-Boot 2023.07-rc5-0.0.0-devel+git.580eb31199be (Jun 27 2023 - 13:39:58 +0000) CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 30C Reset cause: POR DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3) ### ERROR ### Please RESET the board ###
Currently at the EOSS in Prague, so no way to do more testing/debugging, but I tought it was better to start reporting this on the list since we are getting close to the actual release.
Well ugh. And my mx6cuboxi was out of my CI loop so I didn't see this at the time. Does reverting just the legacy checks patch fix the i.MX7 case as well?

On Thu, Jun 29, 2023 at 06:58:26PM -0400, Tom Rini wrote:
On Thu, Jun 29, 2023 at 04:19:22PM +0200, Francesco Dolcini wrote:
I also noticed something weird on a colibri imx7s, this is not using SPL, likely something completly different, however given this is new also from rc5 I thought it's valuable to report:
U-Boot 2023.07-rc5-0.0.0-devel+git.580eb31199be (Jun 27 2023 - 13:39:58 +0000) CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 30C Reset cause: POR DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3) ### ERROR ### Please RESET the board ###
Currently at the EOSS in Prague, so no way to do more testing/debugging, but I tought it was better to start reporting this on the list since we are getting close to the actual release.
Well ugh. And my mx6cuboxi was out of my CI loop so I didn't see this at the time. Does reverting just the legacy checks patch fix the i.MX7 case as well?
I would assume this is unrelated, since there is no SPL involved there. I'll try to get more information in the next days.
Francesco

Hi Francesco and Marek,
On Thu, Jun 29, 2023 at 11:19 AM Francesco Dolcini francesco@dolcini.it wrote:
Hello Marek, as briefly discussed off-list it looks like commit 77aed22b48ab ("spl: spl_legacy: Add extra address checks") introduces a regression on some board/arch, at least colibri and apalis imx6 fails to boot now
Trying to boot from MMC1 SPL: Image overlaps SPL resetting ...
I was able to reproduce this problem on an imx6dl colibri and added some debug information:
Trying to boot from USB SDP SDP: initialize... SDP: handle requests... Downloading file of size 551176 to 0x177fffc0... done Jumping to header at 0x177fffc0 Header Tag is not an IMX image Found header at 0x177fffc0 ********* start: 0x177fffc0 ********* size: 0x86908 ********* spl_start: 0x908000 ********* spl_end: 0x18200480 ********* end: 0x178868c8 SPL: Image overlaps SPL
Please note that spl_start is inside internal RAM space, but spl_end points to a DDR address.
Should we fix spl_end like this?
--- a/common/spl/spl_legacy.c +++ b/common/spl/spl_legacy.c @@ -19,9 +19,17 @@ static void spl_parse_legacy_validate(uintptr_t start, uintptr_t size) { uintptr_t spl_start = (uintptr_t)_start; - uintptr_t spl_end = (uintptr_t)__bss_end; + uintptr_t spl_end = spl_start + size; uintptr_t end = start + size;

On Fri, Jun 30, 2023 at 10:22 AM Fabio Estevam festevam@gmail.com wrote:
Should we fix spl_end like this?
Looking at u-boot-spl.map:
0x0000000000912264 0x0 common/spl/spl.o 0x0000000000912264 . = ALIGN (0x4) 0x0000000000912264 __image_copy_end = .
.end *(.__end) 0x0000000000912264 _image_binary_end = .
__image_copy_end is the last address of the SPL inside the internal RAM.
So shouldn't we do this instead?
--- a/common/spl/spl_legacy.c +++ b/common/spl/spl_legacy.c @@ -19,9 +19,17 @@ static void spl_parse_legacy_validate(uintptr_t start, uintptr_t size) { uintptr_t spl_start = (uintptr_t)_start; - uintptr_t spl_end = (uintptr_t)__bss_end; + uintptr_t spl_end = (uintptr_t)__image_copy_end; uintptr_t end = start + size;
Which results in:
********* start: 0x177fffc0 ********* size: 0x86908 ********* spl_start: 0x908000 ********* spl_end: 0x912264 ********* end: 0x178868c8 ********* start: 0x17800000 ********* size: 0x0 ********* spl_start: 0x908000 ********* spl_end: 0x912264 ********* end: 0x17800000
and allows the boot to proceed:
U-Boot 2023.07-rc5-00012-g5fa30f2351ac-dirty (Jun 30 2023 - 10:45:40 -0300)
CPU: Freescale i.MX6DL rev1.4 at 792MHz CPU: Industrial temperature grade (-40C to 105C) at 46C

On Fri, Jun 30, 2023 at 10:51:43AM -0300, Fabio Estevam wrote:
On Fri, Jun 30, 2023 at 10:22 AM Fabio Estevam festevam@gmail.com wrote:
Should we fix spl_end like this?
Looking at u-boot-spl.map:
0x0000000000912264 0x0 common/spl/spl.o 0x0000000000912264 . = ALIGN (0x4) 0x0000000000912264 __image_copy_end = .
.end *(.__end) 0x0000000000912264 _image_binary_end = .
__image_copy_end is the last address of the SPL inside the internal RAM.
So shouldn't we do this instead?
--- a/common/spl/spl_legacy.c +++ b/common/spl/spl_legacy.c @@ -19,9 +19,17 @@ static void spl_parse_legacy_validate(uintptr_t start, uintptr_t size) { uintptr_t spl_start = (uintptr_t)_start;
uintptr_t spl_end = (uintptr_t)__bss_end;
uintptr_t spl_end = (uintptr_t)__image_copy_end; uintptr_t end = start + size;
I think this breaks x86, without updating their linker scripts at least.

On Fri, Jun 30, 2023 at 11:05 AM Tom Rini trini@konsulko.com wrote:
I think this breaks x86, without updating their linker scripts at least.
What about this instead?
--- a/common/spl/spl_legacy.c +++ b/common/spl/spl_legacy.c @@ -19,7 +19,7 @@ static void spl_parse_legacy_validate(uintptr_t start, uintptr_t size) { uintptr_t spl_start = (uintptr_t)_start; - uintptr_t spl_end = (uintptr_t)__bss_end; + uintptr_t spl_end = (uintptr_t)_image_binary_end; uintptr_t end = start + size;

Short update on this regression.
On Thu, Jun 29, 2023 at 04:19:22PM +0200, Francesco Dolcini wrote:
I also noticed something weird on a colibri imx7s, this is not using SPL, likely something completly different, however given this is new also from rc5 I thought it's valuable to report:
U-Boot 2023.07-rc5-0.0.0-devel+git.580eb31199be (Jun 27 2023 - 13:39:58 +0000) CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 30C Reset cause: POR DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3) ### ERROR ### Please RESET the board ###
This is unrelated to v2023.07-rc5, the issue was introduced with 8c103c33fb14 ("dm: dts: Convert driver model tags to use new schema"), that is in since v2023.07-rc1.
Sorry about the initial incomplete/wrong information.
For some reason this affects only imx7s, while imx7d is fine (single core, only one USB host interface, while the dual variant seems just fine).
Not many ideas for the moment, we might as well be doing something plain wrong in our board file.
I quickly tried removing the whole `&lcdif` and it does not seems to help. I do not think we even enable display output support at the moment.
Francesco

On 7/3/23 18:49, Francesco Dolcini wrote:
Short update on this regression.
On Thu, Jun 29, 2023 at 04:19:22PM +0200, Francesco Dolcini wrote:
I also noticed something weird on a colibri imx7s, this is not using SPL, likely something completly different, however given this is new also from rc5 I thought it's valuable to report:
U-Boot 2023.07-rc5-0.0.0-devel+git.580eb31199be (Jun 27 2023 - 13:39:58 +0000) CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 30C Reset cause: POR DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3)
If you have the u-boot (elf file, unstripped), then check which initcall it is that failed . Just run arm-...-objdump -lSD on it and search for the 8781b350: address ; remember to clear the LSbit of the address in case this is thumb mode of the CPU.

On Mon, Jul 03, 2023 at 09:40:51PM +0200, Marek Vasut wrote:
On 7/3/23 18:49, Francesco Dolcini wrote:
Short update on this regression.
On Thu, Jun 29, 2023 at 04:19:22PM +0200, Francesco Dolcini wrote:
I also noticed something weird on a colibri imx7s, this is not using SPL, likely something completly different, however given this is new also from rc5 I thought it's valuable to report:
U-Boot 2023.07-rc5-0.0.0-devel+git.580eb31199be (Jun 27 2023 - 13:39:58 +0000) CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 30C Reset cause: POR DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3)
If you have the u-boot (elf file, unstripped), then check which initcall it is that failed . Just run arm-...-objdump -lSD on it and search for the 8781b350: address ; remember to clear the LSbit of the address in case this is thumb mode of the CPU.
U-Boot 2023.07-rc6 (Jul 03 2023 - 21:50:58 +0200)
CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 35C Reset cause: POR DRAM: initcall sequence 8786fb00 failed at call 8781b715 (err=-3) ### ERROR ### Please RESET the board ###
u-boot/common/board_f.c:523 } 8781b710: 2000 movs r0, #0 8781b712: bd08 pop {r3, pc}
8781b714 <fix_fdt>: fix_fdt(): u-boot/common/board_f.c:729 return board_fix_fdt((void *)gd->fdt_blob); 8781b714: f8d9 0068 ldr.w r0, [r9, #104] ; 0x68 8781b718: f7ea bafe b.w 87805d18 <board_fix_fdt>

On Mon, Jul 03, 2023 at 09:54:40PM +0200, Francesco Dolcini wrote:
On Mon, Jul 03, 2023 at 09:40:51PM +0200, Marek Vasut wrote:
On 7/3/23 18:49, Francesco Dolcini wrote:
Short update on this regression.
On Thu, Jun 29, 2023 at 04:19:22PM +0200, Francesco Dolcini wrote:
I also noticed something weird on a colibri imx7s, this is not using SPL, likely something completly different, however given this is new also from rc5 I thought it's valuable to report:
U-Boot 2023.07-rc5-0.0.0-devel+git.580eb31199be (Jun 27 2023 - 13:39:58 +0000) CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 30C Reset cause: POR DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3)
If you have the u-boot (elf file, unstripped), then check which initcall it is that failed . Just run arm-...-objdump -lSD on it and search for the 8781b350: address ; remember to clear the LSbit of the address in case this is thumb mode of the CPU.
U-Boot 2023.07-rc6 (Jul 03 2023 - 21:50:58 +0200)
CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 35C Reset cause: POR DRAM: initcall sequence 8786fb00 failed at call 8781b715 (err=-3) ### ERROR ### Please RESET the board ###
u-boot/common/board_f.c:523 } 8781b710: 2000 movs r0, #0 8781b712: bd08 pop {r3, pc}
8781b714 <fix_fdt>: fix_fdt(): u-boot/common/board_f.c:729 return board_fix_fdt((void *)gd->fdt_blob); 8781b714: f8d9 0068 ldr.w r0, [r9, #104] ; 0x68 8781b718: f7ea bafe b.w 87805d18 <board_fix_fdt>
so, this seems to be the broken code:
/* i.MX 7Solo has only one single USB OTG1 but no USB host port */ if (is_cpu_type(MXC_CPU_MX7S)) { int offset = fdt_path_offset(rw_fdt_blob, "/soc/bus@30800000/usb@30b20000");
return fdt_status_disabled(rw_fdt_blob, offset); }
now I wonder how is this related to that other change ...

On Mon, Jul 03, 2023 at 09:54:40PM +0200, Francesco Dolcini wrote:
On Mon, Jul 03, 2023 at 09:40:51PM +0200, Marek Vasut wrote:
On 7/3/23 18:49, Francesco Dolcini wrote:
Short update on this regression.
On Thu, Jun 29, 2023 at 04:19:22PM +0200, Francesco Dolcini wrote:
I also noticed something weird on a colibri imx7s, this is not using SPL, likely something completly different, however given this is new also from rc5 I thought it's valuable to report:
U-Boot 2023.07-rc5-0.0.0-devel+git.580eb31199be (Jun 27 2023 - 13:39:58 +0000) CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 30C Reset cause: POR DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3)
If you have the u-boot (elf file, unstripped), then check which initcall it is that failed . Just run arm-...-objdump -lSD on it and search for the 8781b350: address ; remember to clear the LSbit of the address in case this is thumb mode of the CPU.
U-Boot 2023.07-rc6 (Jul 03 2023 - 21:50:58 +0200)
CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 35C Reset cause: POR DRAM: initcall sequence 8786fb00 failed at call 8781b715 (err=-3) ### ERROR ### Please RESET the board ###
u-boot/common/board_f.c:523 } 8781b710: 2000 movs r0, #0 8781b712: bd08 pop {r3, pc}
8781b714 <fix_fdt>: fix_fdt(): u-boot/common/board_f.c:729 return board_fix_fdt((void *)gd->fdt_blob); 8781b714: f8d9 0068 ldr.w r0, [r9, #104] ; 0x68 8781b718: f7ea bafe b.w 87805d18 <board_fix_fdt>
So, that's in your board code then: board/toradex/colibri_imx7/colibri_imx7.c:int board_fix_fdt(void *rw_fdt_blob)

On Mon, Jul 03, 2023 at 04:01:15PM -0400, Tom Rini wrote:
On Mon, Jul 03, 2023 at 09:54:40PM +0200, Francesco Dolcini wrote:
On Mon, Jul 03, 2023 at 09:40:51PM +0200, Marek Vasut wrote:
On 7/3/23 18:49, Francesco Dolcini wrote:
Short update on this regression.
On Thu, Jun 29, 2023 at 04:19:22PM +0200, Francesco Dolcini wrote:
I also noticed something weird on a colibri imx7s, this is not using SPL, likely something completly different, however given this is new also from rc5 I thought it's valuable to report:
U-Boot 2023.07-rc5-0.0.0-devel+git.580eb31199be (Jun 27 2023 - 13:39:58 +0000) CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 30C Reset cause: POR DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3)
If you have the u-boot (elf file, unstripped), then check which initcall it is that failed . Just run arm-...-objdump -lSD on it and search for the 8781b350: address ; remember to clear the LSbit of the address in case this is thumb mode of the CPU.
U-Boot 2023.07-rc6 (Jul 03 2023 - 21:50:58 +0200)
CPU: Freescale i.MX7S rev1.2 800 MHz (running at 792 MHz) CPU: Extended Commercial temperature grade (-20C to 105C) at 35C Reset cause: POR DRAM: initcall sequence 8786fb00 failed at call 8781b715 (err=-3) ### ERROR ### Please RESET the board ###
u-boot/common/board_f.c:523 } 8781b710: 2000 movs r0, #0 8781b712: bd08 pop {r3, pc}
8781b714 <fix_fdt>: fix_fdt(): u-boot/common/board_f.c:729 return board_fix_fdt((void *)gd->fdt_blob); 8781b714: f8d9 0068 ldr.w r0, [r9, #104] ; 0x68 8781b718: f7ea bafe b.w 87805d18 <board_fix_fdt>
So, that's in your board code then: board/toradex/colibri_imx7/colibri_imx7.c:int board_fix_fdt(void *rw_fdt_blob)
Yep, this is the problematic code:
board/toradex/colibri_imx7/colibri_imx7.c: int board_fix_fdt(void *rw_fdt_blob) { /* i.MX 7Solo has only one single USB OTG1 but no USB host port */ if (is_cpu_type(MXC_CPU_MX7S)) { int offset = fdt_path_offset(rw_fdt_blob, "/soc/bus@30800000/usb@30b20000");
return fdt_status_disabled(rw_fdt_blob, offset); }
return 0; }
For what is worth in this node "/soc/bus@30800000/usb@30b20000" the `status="ok"` property is already present.
Before 8c103c33fb14 ("dm: dts: Convert driver model tags to use new schema") fdt_status_disabled() did return 0, so all good no issue.
If I do this small partial revert
--- a/arch/arm/dts/imx7d-colibri-eval-v3-u-boot.dtsi +++ b/arch/arm/dts/imx7d-colibri-eval-v3-u-boot.dtsi @@ -15,7 +15,8 @@ pinctrl-0 = <&pinctrl_lcdif_dat &pinctrl_lcdif_ctrl>; display = <&display0>; - u-boot,dm-pre-reloc; + bootph-all;
fdt_status_disabled() returns 0 again.
With the current master, fdt_status_disabled() returns -3, -FDT_ERR_NOSPACE, and I assume I could just change my code to call fdt_increase_size() and call it done.
However, what the reason for this different behavior triggered by that change in the *-u-boot.dtsi ? Is this expected?
From a quick check multiple place in the code just disable/enable nodes
or set properties without taking care of this, are those going to be affected by that commit that created the regression? Are those all buggy?
$ git grep fdt_setprop |wc -l 461
We have some helper around, for example arch/arm/mach-imx/imx8/fdt.c:disable_fdt_node(), but this is for example just specific on that file.
Simon: any suggestion?
Thanks, Francesco

Hi Francesco,
On Mon, Jul 3, 2023 at 5:49 PM Francesco Dolcini francesco@dolcini.it wrote:
If I do this small partial revert
--- a/arch/arm/dts/imx7d-colibri-eval-v3-u-boot.dtsi +++ b/arch/arm/dts/imx7d-colibri-eval-v3-u-boot.dtsi @@ -15,7 +15,8 @@ pinctrl-0 = <&pinctrl_lcdif_dat &pinctrl_lcdif_ctrl>; display = <&display0>;
u-boot,dm-pre-reloc;
bootph-all;
I managed to reproduce the behavior on a imx7d-sabresd by doing:
--- a/board/freescale/mx7dsabresd/mx7dsabresd.c +++ b/board/freescale/mx7dsabresd/mx7dsabresd.c @@ -25,6 +25,7 @@ #include <i2c.h> #include <asm/mach-imx/mxc_i2c.h> #include <asm/arch/crm_regs.h> +#include <fdt_support.h>
DECLARE_GLOBAL_DATA_PTR;
@@ -289,6 +290,20 @@ int power_init_board(void) } #endif
+int board_fix_fdt(void *rw_fdt_blob) +{ + int ret; + + int offset = fdt_path_offset(rw_fdt_blob, "/soc/bus@30800000/usb@30b20000"); + + ret = fdt_status_disabled(rw_fdt_blob, offset); + + printf("******** offset is 0x%x\n", offset); + printf("******** ret is %d\n", ret); + + return 0; +} + int board_late_init(void)
--- a/configs/mx7dsabresd_defconfig +++ b/configs/mx7dsabresd_defconfig @@ -86,3 +86,4 @@ CONFIG_USB_GADGET_PRODUCT_NUM=0xa4a5 CONFIG_CI_UDC=y CONFIG_USB_GADGET_DOWNLOAD=y CONFIG_ERRNO_STR=y +CONFIG_OF_BOARD_FIXUP=y
With top of tree U-Boot it fails with:
******** offset is 0x7ba4 ******** ret is -3
If u-boot.dtsi gets smaller, for example, by reverting 0aea5dda2928 then it succeeds:
******** offset is 0x7988 ******** ret is 0
fdt_status_disabled() returns 0 again.
With the current master, fdt_status_disabled() returns -3, -FDT_ERR_NOSPACE, and I assume I could just change my code to call fdt_increase_size() and call it done.
However, what the reason for this different behavior triggered by that change in the *-u-boot.dtsi ? Is this expected?
From a quick check multiple place in the code just disable/enable nodes or set properties without taking care of this, are those going to be affected by that commit that created the regression? Are those all buggy?
$ git grep fdt_setprop |wc -l 461
We have some helper around, for example arch/arm/mach-imx/imx8/fdt.c:disable_fdt_node(), but this is for example just specific on that file.
Here is my suggestion: let's increase the fdt size locally on your board for now, just like turris_omnia.c:
--- a/board/toradex/colibri_imx7/colibri_imx7.c +++ b/board/toradex/colibri_imx7/colibri_imx7.c @@ -315,6 +315,15 @@ int board_fix_fdt(void *rw_fdt_blob) if (is_cpu_type(MXC_CPU_MX7S)) { int offset = fdt_path_offset(rw_fdt_blob, "/soc/bus@30800000/usb@30b20000");
+ /* + * We're either adding status = "disabled" property, or changing + * status = "okay" to status = "disabled". In both cases we'll need more + * space. Increase the size a little. + */ + if (fdt_increase_size(rw_fdt_blob, 32) < 0) { + printf("Cannot increase FDT size.\n"); + return -ENOMEM; + } return fdt_status_disabled(rw_fdt_blob, offset); }
Then for the next cycle, we should plan on adding this fdt_increase_size() into the common fdt_status_disabled().
Does it work?

On Tue, Jul 04, 2023 at 11:39:59AM -0300, Fabio Estevam wrote:
Hi Francesco,
On Mon, Jul 3, 2023 at 5:49 PM Francesco Dolcini francesco@dolcini.it wrote:
If I do this small partial revert
--- a/arch/arm/dts/imx7d-colibri-eval-v3-u-boot.dtsi +++ b/arch/arm/dts/imx7d-colibri-eval-v3-u-boot.dtsi @@ -15,7 +15,8 @@ pinctrl-0 = <&pinctrl_lcdif_dat &pinctrl_lcdif_ctrl>; display = <&display0>;
u-boot,dm-pre-reloc;
bootph-all;
I managed to reproduce the behavior on a imx7d-sabresd by doing:
--- a/board/freescale/mx7dsabresd/mx7dsabresd.c +++ b/board/freescale/mx7dsabresd/mx7dsabresd.c @@ -25,6 +25,7 @@ #include <i2c.h> #include <asm/mach-imx/mxc_i2c.h> #include <asm/arch/crm_regs.h> +#include <fdt_support.h>
DECLARE_GLOBAL_DATA_PTR;
@@ -289,6 +290,20 @@ int power_init_board(void) } #endif
+int board_fix_fdt(void *rw_fdt_blob) +{
int ret;
int offset = fdt_path_offset(rw_fdt_blob,
"/soc/bus@30800000/usb@30b20000");
ret = fdt_status_disabled(rw_fdt_blob, offset);
printf("******** offset is 0x%x\n", offset);
printf("******** ret is %d\n", ret);
return 0;
+}
int board_late_init(void)
--- a/configs/mx7dsabresd_defconfig +++ b/configs/mx7dsabresd_defconfig @@ -86,3 +86,4 @@ CONFIG_USB_GADGET_PRODUCT_NUM=0xa4a5 CONFIG_CI_UDC=y CONFIG_USB_GADGET_DOWNLOAD=y CONFIG_ERRNO_STR=y +CONFIG_OF_BOARD_FIXUP=y
With top of tree U-Boot it fails with:
******** offset is 0x7ba4 ******** ret is -3
If u-boot.dtsi gets smaller, for example, by reverting 0aea5dda2928 then it succeeds:
******** offset is 0x7988 ******** ret is 0
fdt_status_disabled() returns 0 again.
With the current master, fdt_status_disabled() returns -3, -FDT_ERR_NOSPACE, and I assume I could just change my code to call fdt_increase_size() and call it done.
However, what the reason for this different behavior triggered by that change in the *-u-boot.dtsi ? Is this expected?
From a quick check multiple place in the code just disable/enable nodes or set properties without taking care of this, are those going to be affected by that commit that created the regression? Are those all buggy?
$ git grep fdt_setprop |wc -l 461
We have some helper around, for example arch/arm/mach-imx/imx8/fdt.c:disable_fdt_node(), but this is for example just specific on that file.
Here is my suggestion: let's increase the fdt size locally on your board for now, just like turris_omnia.c:
--- a/board/toradex/colibri_imx7/colibri_imx7.c +++ b/board/toradex/colibri_imx7/colibri_imx7.c @@ -315,6 +315,15 @@ int board_fix_fdt(void *rw_fdt_blob) if (is_cpu_type(MXC_CPU_MX7S)) { int offset = fdt_path_offset(rw_fdt_blob, "/soc/bus@30800000/usb@30b20000");
/*
* We're either adding status = "disabled" property, or changing
* status = "okay" to status = "disabled". In both cases we'll need more
* space. Increase the size a little.
*/
if (fdt_increase_size(rw_fdt_blob, 32) < 0) {
printf("Cannot increase FDT size.\n");
return -ENOMEM;
} return fdt_status_disabled(rw_fdt_blob, offset); }
Then for the next cycle, we should plan on adding this fdt_increase_size() into the common fdt_status_disabled().
I'm a little leary of generic changes here having an unexpected size / performance impact. The API specifically does not "just" handle this case like it does for some others. We should update the docs around fdt_set_node_status and fdt_status_disabled to reference the return codes of fdt_setprop_string itself and check for anyone else that hasn't been considering this possible failure case.

On Tue, Jul 04, 2023 at 11:09:06AM -0400, Tom Rini wrote:
On Tue, Jul 04, 2023 at 11:39:59AM -0300, Fabio Estevam wrote:
Then for the next cycle, we should plan on adding this fdt_increase_size() into the common fdt_status_disabled().
I'm a little leary of generic changes here having an unexpected size / performance impact. The API specifically does not "just" handle this case like it does for some others. We should update the docs around fdt_set_node_status and fdt_status_disabled to reference the return codes of fdt_setprop_string itself and check for anyone else that hasn't been considering this possible failure case.
Maybe we can just add a new additional small wrapper that does the fdt_increase_size() if required? This is a generic need, not something every board should implement on its own.
Francesco

On Tue, Jul 04, 2023 at 11:39:59AM -0300, Fabio Estevam wrote:
Then for the next cycle, we should plan on adding this fdt_increase_size() into the common fdt_status_disabled().
Does it work?
Now that the situation is pretty much clear I am not overly concerned for colibri-imx7s.
Do we consider this something to be worried about for other boards?
Francesco

On Tue, Jul 04, 2023 at 05:12:42PM +0200, Francesco Dolcini wrote:
On Tue, Jul 04, 2023 at 11:39:59AM -0300, Fabio Estevam wrote:
Then for the next cycle, we should plan on adding this fdt_increase_size() into the common fdt_status_disabled().
Does it work?
Now that the situation is pretty much clear I am not overly concerned for colibri-imx7s.
Do we consider this something to be worried about for other boards?
The cases calling fdt_status_disabled itself are small enough that I think a quick audit isn't unreasonable (and PowerPC should probably mirror the kernel and pass -p 0x1000 in DTC_FLAGS so it becomes a non-issue).

On Tue, Jul 04, 2023 at 11:14:57AM -0400, Tom Rini wrote:
On Tue, Jul 04, 2023 at 05:12:42PM +0200, Francesco Dolcini wrote:
On Tue, Jul 04, 2023 at 11:39:59AM -0300, Fabio Estevam wrote:
Then for the next cycle, we should plan on adding this fdt_increase_size() into the common fdt_status_disabled().
Does it work?
Now that the situation is pretty much clear I am not overly concerned for colibri-imx7s.
Do we consider this something to be worried about for other boards?
The cases calling fdt_status_disabled itself are small enough that I think a quick audit isn't unreasonable (and PowerPC should probably mirror the kernel and pass -p 0x1000 in DTC_FLAGS so it becomes a non-issue).
Do we expect this to not affect fdt_setprop() or other fdt_*() APIs? fdt_increase_size() seems not to be used most of the time, unless I am missing something here.
Francesco

On Tue, Jul 04, 2023 at 05:23:03PM +0200, Francesco Dolcini wrote:
On Tue, Jul 04, 2023 at 11:14:57AM -0400, Tom Rini wrote:
On Tue, Jul 04, 2023 at 05:12:42PM +0200, Francesco Dolcini wrote:
On Tue, Jul 04, 2023 at 11:39:59AM -0300, Fabio Estevam wrote:
Then for the next cycle, we should plan on adding this fdt_increase_size() into the common fdt_status_disabled().
Does it work?
Now that the situation is pretty much clear I am not overly concerned for colibri-imx7s.
Do we consider this something to be worried about for other boards?
The cases calling fdt_status_disabled itself are small enough that I think a quick audit isn't unreasonable (and PowerPC should probably mirror the kernel and pass -p 0x1000 in DTC_FLAGS so it becomes a non-issue).
Do we expect this to not affect fdt_setprop() or other fdt_*() APIs? fdt_increase_size() seems not to be used most of the time, unless I am missing something here.
It depends on the case. For example, fdt_node_set_part_info is big/complex enough that it does call fdt_increase_size (which is not cheap) when there's no space. Other cases already do check for no space and grow+retry. And then the general answer is that if you're expecting needing to make "big" changes to the dtb you should pad it to start with as that's a free operation at run time.

On Tue, Jul 4, 2023 at 12:12 PM Francesco Dolcini francesco@dolcini.it wrote:
Now that the situation is pretty much clear I am not overly concerned for colibri-imx7s.
Do we consider this something to be worried about for other boards?
There are only three boards that check the return value from fdt_status_disabled():
- board/toradex/colibri_imx7/colibri_imx7.c - board/gdsys/a38x/dt_helpers.c - board/CZ.NIC/turris_omnia/turris_omnia.c
As of today, only turris_omnia calls the fdt_increase_size() prior to fdt_status_disabled().

On Tue, Jul 04, 2023 at 12:57:34PM -0300, Fabio Estevam wrote:
On Tue, Jul 4, 2023 at 12:12 PM Francesco Dolcini francesco@dolcini.it wrote:
Now that the situation is pretty much clear I am not overly concerned for colibri-imx7s.
Do we consider this something to be worried about for other boards?
There are only three boards that check the return value from
The boards that do not check the return value might start to behave wrongly without an obvious error to help the debugging.
Francesco

On 04/07/2023 14:04, Francesco Dolcini wrote:
The boards that do not check the return value might start to behave wrongly without an obvious error to help the debugging.
Yes, the current implementation of fdt_status_disabled() is fragile, but there's not so much we can do for the upcoming 2023.07.
I can try to prepare a patch to improve it after 2023.07 is out next week.

On Tue, Jul 04, 2023 at 02:13:01PM -0300, Fabio Estevam wrote:
On 04/07/2023 14:04, Francesco Dolcini wrote:
The boards that do not check the return value might start to behave wrongly without an obvious error to help the debugging.
Yes, the current implementation of fdt_status_disabled() is fragile, but there's not so much we can do for the upcoming 2023.07.
I can try to prepare a patch to improve it after 2023.07 is out next week.
It's also been fragile-as-designed since inception. The trigger here was that renaming a ton of properties _reduced_ the overall dtb size itself and so some edge cases got pushed over the edge. I'm honestly not sure if it's better to: - Give everyone a small padding by default - Make the platforms which may call one of these functions a small padding by default - Audit the callers and make them handle -FDT_ERR_NOSPACE like the other cases where we grow the dtb size do. Even if yes, many of those other cases are for non-trivial growth rather than just adding 4 more letters. - Make fdt_set_node_status() handle the disabled case when out of space and grow/retry.

Hi,
On Wed, 5 Jul 2023 at 09:35, Tom Rini trini@konsulko.com wrote:
On Tue, Jul 04, 2023 at 02:13:01PM -0300, Fabio Estevam wrote:
On 04/07/2023 14:04, Francesco Dolcini wrote:
The boards that do not check the return value might start to behave wrongly without an obvious error to help the debugging.
Yes, the current implementation of fdt_status_disabled() is fragile, but there's not so much we can do for the upcoming 2023.07.
I can try to prepare a patch to improve it after 2023.07 is out next week.
It's also been fragile-as-designed since inception. The trigger here was that renaming a ton of properties _reduced_ the overall dtb size itself and so some edge cases got pushed over the edge. I'm honestly not sure if it's better to:
- Give everyone a small padding by default
- Make the platforms which may call one of these functions a small padding by default
One of the above seems good to me. Perhaps we could add a default padding, with the ability to reduce it to 0, or increase it. I suspect that not many boards update the FDT in SPL.
- Audit the callers and make them handle -FDT_ERR_NOSPACE like the other cases where we grow the dtb size do. Even if yes, many of those other cases are for non-trivial growth rather than just adding 4 more letters.
- Make fdt_set_node_status() handle the disabled case when out of space and grow/retry.
I don't like that since:
- it adds to complexity in that a 'leaf' function is messing with stuff it shouldn't - In generate we don't know if there is space to increase the FDT size - It likely increase code size for a case that cannot happen if people have tested it properly - It affects all boards
If boards are ignoring errors, that is up to the boards. We do encourage people (in core review) to check all errors. This is an example of why.
Regards, Simon
participants (6)
-
Fabio Estevam
-
Fabio Estevam
-
Francesco Dolcini
-
Marek Vasut
-
Simon Glass
-
Tom Rini