Boot failure on Rock-pi-4-a when compiling with clang.

Hi all,
I'd like some ideas for debugging this condition. When compiling with the default GNU toolchain, it works great.
When compiling with clang (clang-14 or clang-15), the boot sometimes hangs. The following are two boot logs:
1.
    U-Boot SPL 2023.10-rc1-00207-g38dedebc54-dirty (Aug 03 2023 - 08:08:00 +0800)
    Trying to boot from MMC1
    ## Checking hash(es) for config config-1 ... OK
    ## Checking hash(es) for Image atf-1 ... sha256+ OK
    ## Checking hash(es) for Image u-boot ... sha256+ OK
    ## Checking hash(es) for Image fdt-1 ... sha256+ OK
    ## Checking hash(es) for Image atf-2 ... sha256+ OK
    ## Checking hash(es) for Image atf-3 ... sha256+ OK
    ## Checking hash(es) for Image atf-4 ... sha256+ OK
    spl_load_fit_image: Skip load 'atf-5': image size is 0!

    U-Boot 2023.10-rc1-00207-g38dedebc54-dirty (Aug 03 2023 - 08:08:00 +0800)

    SoC: Rockchip rk3399
    Reset cause: POR
    Model: Radxa ROCK Pi 4A
    DRAM: initcall sequence 00000000002aeb80 failed at call 00000000002042a8 (err=-19)
    ### ERROR ### Please RESET the board ###
2.
    U-Boot TPL 2023.04-maybe-dirty (Jan 01 1970 - 00:00:00)
    lpddr4_set_rate: change freq to 400MHz 0, 1
    Channel 0: LPDDR4, 400MHz
    BW=32 Col=10 Bk=8 CS0 Row=16/15 CS=1 Die BW=16 Size=2048MB
    Channel 1: LPDDR4, 400MHz
    BW=32 Col=10 Bk=8 CS0 Row=16/15 CS=1 Die BW=16 Size=2048MB
    256B stride
    lpddr4_set_rate: change freq to 800MHz 1, 0
    Trying to boot from BOOTROM
    Returning to boot ROM...

    U-Boot SPL 2023.04-maybe-dirty (Jan 01 1970 - 00:00:00 +0000)
    Trying to boot from MMC1
    fdt_addr: 0x2e9c88
    Device tree error at node '__symbols__'
    Some drivers failed to bind
    initcall sequence 00000000002a95f8 failed at call 0000000000224138 (err=-11)
    ### ERROR ### Please RESET the board ###
As you can see, the bug happens at different places. As for the difference between 1 and 2, I just turned on the "fastboot over UDP" function.
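One way to narrow down the two "failed at call" addresses in the logs is to map them back to function names using an nm-style symbol listing of the matching binary (for example the output of the toolchain's nm on the U-Boot or SPL ELF file). The following is only a sketch in Python: the sample symbol names are made up for illustration, and it assumes the printed addresses are link-time addresses (that is, the failure happens before relocation). In the Linux-style errno numbering that U-Boot uses, -19 is ENODEV and -11 is EAGAIN.

```python
import bisect


def parse_symbols(nm_text):
    """Parse nm-style lines ('<hex addr> <type> <name>') into a sorted list."""
    symbols = []
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip undefined symbols and malformed lines
        try:
            addr = int(parts[0], 16)
        except ValueError:
            continue
        symbols.append((addr, parts[2]))
    symbols.sort()
    return symbols


def lookup(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base


# Hypothetical listing: these names and addresses are NOT from a real build.
SAMPLE = """\
0000000000204000 T hypothetical_init_a
0000000000204280 T hypothetical_init_b
0000000000204400 T hypothetical_init_c
"""

if __name__ == "__main__":
    syms = parse_symbols(SAMPLE)
    # 0x2042a8 is the "failed at call" address from log 1.
    print(lookup(syms, 0x2042A8))
```

With a real listing in place of SAMPLE, comparing the function found for a failing clang build against the same function in a working GNU build may show whether the failing call differs between them.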
Bisecting does not work, because if I keep turning extra functions (like fastboot over USB) on and off, it can boot sometimes. It seems to me that this could be an alignment or code size problem, because turning something on changes the size of the binaries, and some of the resulting builds work while others do not.
So far I only know that v2023.01 works well and v2023.04 is unstable. I tried to bisect between them but landed on an unrelated commit, because a bad commit can boot fine just by chance.
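If the failure really is random from boot to boot for a given binary, a bisect step can be made more trustworthy by booting each candidate several times before marking it good. The sketch below estimates how many clean boots are needed; it assumes each boot of a bad build fails independently with the same probability, which is an assumption and not something the logs establish. If the failure is instead fixed per binary (for example tied to layout or size), repeated boots of the same image will not help.

```python
import math


def boots_needed(fail_rate, max_false_good):
    """Smallest n such that a bad build passes n boots in a row with
    probability at most max_false_good, i.e. (1 - fail_rate)**n <= max_false_good.

    fail_rate: assumed per-boot failure probability of a bad build.
    max_false_good: acceptable chance of wrongly marking a bad build as good.
    """
    if not 0.0 < fail_rate < 1.0:
        raise ValueError("fail_rate must be strictly between 0 and 1")
    if not 0.0 < max_false_good < 1.0:
        raise ValueError("max_false_good must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_good) / math.log(1.0 - fail_rate))


if __name__ == "__main__":
    # Example figures only: a bad build that hangs on 30% of boots,
    # and a 1% tolerance for mislabelling it as good.
    print(boots_needed(0.3, 0.01))
```

The fail_rate value has to be estimated first, for example by booting a known-bad build many times and counting the hangs.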
Any ideas on how to debug this further?
I'm building U-Boot with:

    make O="/tmp/a1" \
         CROSS_COMPILE="aarch64-linux-gnu-" \
         CC="clang -target aarch64-linux-gnu" \
         HOSTCC="clang" \
         rock-pi-4-rk3399_defconfig
    make O="/tmp/a1" \
         CROSS_COMPILE="aarch64-linux-gnu-" \
         CC="clang -target aarch64-linux-gnu" \
         HOSTCC="clang"

on Debian Trixie.
Yours, Paul

On Thu, Aug 03, 2023 at 08:30:20AM +0800, Ying-Chun Liu (PaulLiu) wrote:
> [...]
> Any ideas on how to debug this further?
The only hint I have for the moment is to check what's changed in the rockchip code, assuming this is still a problem on v2023.07. Have you tried v2023.07 / current? It has a number of minor changes for using clang on ARM, and it is when I put clang+arm in my CI loop for a few platforms, including rpi_3 and rpi_arm64.
Participants (2): Tom Rini, Ying-Chun Liu (PaulLiu)