Re: [U-Boot] sunxi: kernel crash on shutdown due to locked i2c bus depending on u-boot version

Hi,
On 03-07-16 16:04, Karsten Merker wrote:
Hello,
I am experiencing a kernel panic on shutdown due to killing init on various Allwinner A20-based systems (e.g. Cubietruck and Lime2). Killing init in turn appears to be the result of a locked i2c bus. This happens only on shutdown, not on reboot. The problem has existed for quite some time, but I had not found the time to debug it properly until now.
Following is a log from shutting down a Cubietruck, running kernel 4.6.2 (Debian/sid kernel package) and u-boot v2016.07-rc3 (plain upstream build):
systemd-shutdown[1]: Sending SIGTERM to remaining processes...
systemd-journald[522]: Received SIGTERM from PID 1 (systemd-shutdow).
systemd-shutdown[1]: Sending SIGKILL to remaining processes...
systemd-shutdown[1]: Unmounting file systems.
systemd-shutdown[1]: Remounting '/' read-only with options 'errors=remount-ro,data=ordered'.
EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro,data=ordered
systemd-shutdown[1]: Remounting '/' read-only with options 'errors=remount-ro,data=ordered'.
EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro,data=ordered
systemd-shutdown[1]: All filesystems unmounted.
systemd-shutdown[1]: Deactivating swaps.
systemd-shutdown[1]: All swaps deactivated.
systemd-shutdown[1]: Detaching loop devices.
systemd-shutdown[1]: All loop devices detached.
systemd-shutdown[1]: Detaching DM devices.
systemd-shutdown[1]: All DM devices detached.
systemd-shutdown[1]: Powering off.
kvm: exiting hardware virtualization
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
musb-hdrc musb-hdrc.1.auto: remove, state 4
usb usb3: USB disconnect, device number 1
musb-hdrc musb-hdrc.1.auto: USB bus 3 deregistered
reboot: Power down
i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: G E 4.6.0-1-armmp-lpae #1 Debian 4.6.2-2
Hardware name: Allwinner sun7i (A20) Family
[<c022efdc>] (unwind_backtrace) from [<c02290c8>] (show_stack+0x20/0x24)
[<c02290c8>] (show_stack) from [<c051a66c>] (dump_stack+0x98/0xac)
[<c051a66c>] (dump_stack) from [<c0374004>] (panic+0xfc/0x28c)
[<c0374004>] (panic) from [<c026bcb8>] (complete_and_exit+0x0/0x2c)
[<c026bcb8>] (complete_and_exit) from [<c028ac44>] (SyS_reboot+0x1c8/0x238)
[<c028ac44>] (SyS_reboot) from [<c0224aa0>] (ret_fast_syscall+0x0/0x34)
---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
What makes this strange is that whether it happens appears to depend on the u-boot version I use. Plain upstream v2016.03, v2016.05 and v2016.07-rc3 all cause a crash, while the Debian version 2016.03+dfsg1-6 works without problems:
systemd-shutdown[1]: Sending SIGTERM to remaining processes...
systemd-journald[534]: Received SIGTERM from PID 1 (systemd-shutdow).
systemd-shutdown[1]: Sending SIGKILL to remaining processes...
systemd-shutdown[1]: Unmounting file systems.
systemd-shutdown[1]: Remounting '/' read-only with options 'errors=remount-ro,data=ordered'.
EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro,data=ordered
systemd-shutdown[1]: Remounting '/' read-only with options 'errors=remount-ro,data=ordered'.
EXT4-fs (sda2): re-mounted. Opts: errors=remount-ro,data=ordered
systemd-shutdown[1]: All filesystems unmounted.
systemd-shutdown[1]: Deactivating swaps.
systemd-shutdown[1]: All swaps deactivated.
systemd-shutdown[1]: Detaching loop devices.
systemd-shutdown[1]: All loop devices detached.
systemd-shutdown[1]: Detaching DM devices.
systemd-shutdown[1]: All DM devices detached.
systemd-shutdown[1]: Powering off.
kvm: exiting hardware virtualization
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
musb-hdrc musb-hdrc.1.auto: remove, state 4
usb usb2: USB disconnect, device number 1
musb-hdrc musb-hdrc.1.auto: USB bus 2 deregistered
reboot: Power down
The only possibly relevant change in the Debian build compared to upstream v2016.03 that I can see is cherry-picking commit affa020559bca31d6531e19cb1f009c22705a73d from upstream u-boot v2016.05 (which addresses an i2c-related hang on Lime/Lime2 boards by setting certain regulator voltages, but doesn't touch e.g. the Cubietruck at all):
https://anonscm.debian.org/cgit/collab-maint/u-boot.git/tree/debian/patches/...
I have cross-checked that by building upstream v2016.03 with this patch on top - no panic happens, while with plain upstream v2016.03 the system does panic.
So on a Cubietruck, plain upstream v2016.03 results in a panic; plain upstream v2016.03 with the aforementioned commit cherry-picked on top works fine, even though the commit is completely irrelevant to the Cubietruck. Plain upstream v2016.05, which contains this commit, crashes again, as does v2016.07-rc3.
Can anybody reproduce this problem or does anybody have an idea what could be happening here?
You're likely being hit by the irq affinity problem I've been discussing upstream. Try disabling irqbalance. It seems that the latest version ties the i2c controller irq to cpu-1; before power down cpu-1 is shut down, leaving only cpu-0, so the interrupts go nowhere and the bus appears stuck to the kernel.
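One way to check whether this is what is happening on an affected board is to look at which CPUs the i2c interrupt is allowed to run on before shutting down. The following is only a rough sketch: the function names are made up for illustration, and the "i2c" match pattern is an assumption about how the controller shows up in /proc/interrupts on a given kernel. It reads the standard /proc/interrupts and /proc/irq/<N>/smp_affinity_list files.

```python
import re
from pathlib import Path


def find_irqs(interrupts_text, pattern):
    """Return IRQ numbers from /proc/interrupts-style text whose line matches pattern."""
    irqs = []
    for line in interrupts_text.splitlines():
        m = re.match(r"\s*(\d+):", line)  # numbered IRQ lines only, skips the CPU header
        if m and re.search(pattern, line, re.IGNORECASE):
            irqs.append(int(m.group(1)))
    return irqs


def parse_cpu_list(text):
    """Expand an smp_affinity_list string such as '0-1,3' into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return sorted(cpus)


def irqs_missing_cpu0(proc_root="/proc", pattern=r"i2c"):
    """Map each matching IRQ whose affinity excludes CPU 0 to its allowed CPUs.

    The default pattern is an assumption; adjust it to the name shown
    in /proc/interrupts on the board in question.
    """
    root = Path(proc_root)
    result = {}
    for irq in find_irqs((root / "interrupts").read_text(), pattern):
        affinity = (root / "irq" / str(irq) / "smp_affinity_list").read_text()
        cpus = parse_cpu_list(affinity)
        if 0 not in cpus:
            result[irq] = cpus
    return result
```

Calling irqs_missing_cpu0() on the running system shortly before shutdown and getting a non-empty result for the i2c controller would be consistent with the explanation above; an empty result would suggest looking elsewhere.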
Regards,
Hans