
On Mon, Oct 10, 2022 at 07:44:05PM +0200, Pali Rohár wrote:
On Monday 10 October 2022 13:40:38 Tom Rini wrote:
On Mon, Oct 10, 2022 at 07:22:56PM +0200, Pali Rohár wrote:
On Monday 10 October 2022 12:28:18 Tom Rini wrote:
On Sun, Oct 09, 2022 at 09:12:25PM +0200, Pali Rohár wrote:
Hello! The watchdog code seems to be broken in the U-Boot master branch. On the Nokia N900 I'm getting the following message in QEMU:
cyclic function rx51_watchdog took too long: 10000us vs 1000us max, disabling
It seems that the watchdog core code is not prepared for "slower" watchdogs which communicate over a slow I2C bus, as is the case for the N900.
Disabling a slower watchdog is a bad idea, as it would result in a reboot loop instead of slower but working code.
So, looking at this in more detail, we have CONFIG_CYCLIC_MAX_CPU_TIME_US as a configuration option (which is where the "too long" comes from). And picking a random CI run: https://source.denx.de/u-boot/u-boot/-/jobs/511177 I do see we hit this in CI once, but not every time QEMU runs here. Is the max time being configurable enough to satisfy your concerns here?
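For readers following the thread, here is a minimal standalone sketch of the kind of check that produces the message above: measure how long one run of a cyclic function took and disable the function if that exceeds a configured limit. This is an illustration only, not the U-Boot implementation; `struct cyclic_entry`, `cyclic_account_run()` and the `max_us` parameter (standing in for CONFIG_CYCLIC_MAX_CPU_TIME_US) are hypothetical names.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for one registered cyclic function (hypothetical). */
struct cyclic_entry {
	const char *name;
	bool enabled;
	uint64_t run_cnt;
};

/*
 * Account for one run that took cpu_time_us and disable the entry if
 * that exceeds max_us (assumed to play the role of
 * CONFIG_CYCLIC_MAX_CPU_TIME_US). Returns true while the entry is
 * still enabled after this run.
 */
static bool cyclic_account_run(struct cyclic_entry *e, uint64_t cpu_time_us,
			       uint64_t max_us)
{
	/* A disabled entry is never run again, so nothing to account. */
	if (!e->enabled)
		return false;

	e->run_cnt++;
	if (cpu_time_us > max_us) {
		printf("cyclic function %s took too long: %" PRIu64
		       "us vs %" PRIu64 "us max, disabling\n",
		       e->name, cpu_time_us, max_us);
		e->enabled = false;
	}
	return e->enabled;
}
```

With the numbers from the log, a 10000us run against a 1000us limit disables the entry, and it stays disabled afterwards. The same run against a hypothetical 20000us limit would leave it enabled, which is the effect of raising the configured maximum.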
We need to investigate how to _properly_ fix this issue, not just work around it. Other boards are probably affected too.
So it's the cyclic watchdog code, which we merged as early as possible, that's the reason here. And it was merged as early as we could to see if there are problems. Are there problems? We're seeing "system too slow, disabling" on QEMU, sometimes, and the value of "too slow" is configurable. I know you reported other problems with N900 HW, so we can't see if it's failing there.
I tested it with the older asm code (as described in that other email, via git checkout commit -- file) on N900 HW and the watchdog problem is there too. The phone reboots in about 20 seconds. But as I do not have a serial console, I do not know whether that "disabling" message is printed there too (but I guess it is).
I think I'm a bit baffled at this point, honestly. The watchdog timeout is 60 seconds. If you're confident in it being about 20 seconds, consistently, changing WATCHDOG_TIMEOUT_MSECS to, say, 10000 (so, 10 seconds) should let you see whether U-Boot has configured the watchdog and it's being tripped, or whether it's still at the value set by the prior stage.
I would have expected that QEMU would see problems that real HW doesn't (the value in your log is much higher than the one in CI), but I could be wrong here.