
Hi Rasmus,
On 15.03.23 16:59, Rasmus Villemoes wrote:
On 15/03/2023 16.24, Frieder Schrempf wrote:
On 15.03.23 15:42, Frieder Schrempf wrote:
On 15.03.23 15:17, Michael Nazzareno Trimarchi wrote:
Hi
On Wed, Mar 15, 2023 at 3:13 PM Frieder Schrempf <frieder.schrempf@kontron.de> wrote:
Hi,
I'm trying to bring up a new board based on the i.MX8MP and I have an issue I'm hoping someone can help me solve.
I'm seeing failures in the early SPL code, usually in the DDR initialization. Often they look like:
U-Boot SPL 2023.04-rc3 (Mar 07 2023 - 14:32:34 +0000)
Training FAILED
Failed to initialize DDR RAM!
### ERROR ### Please RESET the board ###

But sometimes ddr_init() doesn't even return an error, and only the subsequent get_ram_size(), which tries to allocate the memory, fails.
In my experience, this means you have run out of space in the CPU's internal memory, so the stack overlaps part of the code. Adding or changing a printf just shifts things around a bit. The problem is always there, but the symptom depends on what ends up being overwritten.
Thanks for your reply. That's exactly what I'm thinking, too.
The strange thing is that the issues appear or disappear deterministically on the binary level. This means I sometimes get a U-Boot binary which runs just fine in 100% of cases. Then I change for example one of the following:
- Adding a single printf() somewhere in the board's spl.c
- Using the same binary but booting from SD card instead of USB loader
- Using the same source but switching from the OS cross compiler to the one from Yocto/OE
And afterwards I get 100% failure rate with an error as described above.
My suspicion is that there is some memory corruption/conflict. My SPL is quite large and I wonder if it exceeds some limit.
SPL is loaded to 0x920000 and CONFIG_SPL_STACK is set to 0x960000, which leaves 256 KiB in between for the SPL. But all i.MX8MP boards seem to set CONFIG_SPL_MAX_SIZE=0x26000 (152 KiB) for some reason. My u-boot-spl-ddr.bin is currently around 193 KiB, but I don't get any warning about exceeding SPL_MAX_SIZE.
I also ran into this problem a while back, but that was back when the ddr firmware files were padded to 16K and 32K each to make the magic offset computations work; now that binman symbols are used, they only take up as much space as they actually use (give or take some 4-byte padding perhaps), and I no longer need the debug code I put in place in our 2022.07 branch.
Remember that from the stack, the initial (and in SPL only) malloc arena is carved out, and if you haven't adjusted SPL_SYS_MALLOC_F_LEN, you probably have that set to the default SYS_MALLOC_F_LEN, which in turn (on imx8m) defaults to 0x10000 aka 64KiB. So that could easily explain why you collide with the firmware.
Ok, that's something I missed before and it provides a good explanation for my problems.
Maybe you can use the debug code I added to our copy of spl.c; I also include most of my commit-message-for-future-me. But just something as simple as
int dummy; printf("stack is around %p\n", &dummy);
can be quite valuable.
Thanks for all the valuable information and explanations. This helps a lot. As a first step, I disabled some DM drivers in SPL and now use legacy implementations for the PMIC, GPIO, etc., just as other i.MX8MP boards do. This seems to shrink the SPL enough to avoid collisions.
But I will also try to optimize SPL_SYS_MALLOC_F_LEN now that I know its role.
Thanks
Frieder