
On Wed, Oct 26, 2022 at 5:33 AM Rasmus Villemoes rasmus.villemoes@prevas.dk wrote:
On 25/10/2022 18.32, Tim Harvey wrote:
Greetings,
I've noticed a regression since the merge of the cyclic framework use for watchdog on my imx8m boards:
cyclic_register for watchdog@30280000 failed WDT: Failed to start watchdog@30280000
A bisect lead me to the following 3 sequential patches: 29caf9305b6f cyclic: Use schedule() instead of WATCHDOG_RESET() ^^^ bad 881d4108257a cyclic: Introduce schedule() function ^^^ unbuildable c2fd0ca1a822 watchdog: Integrate watchdog triggering into the cyclic framework ^^^ unbootable
Can you provide some more details on "unbootable" and "unbuildable"? I.e., what build error do you get for the middle patch, and what do you see on the console with the "unbootable" image?
Rasmus,
Sure. I'm building and testing for imx8mm_venice.
Here are the patches in question listed in commit order: d219fc06b30d configs: Resync with savedefconfig ^^^ no issues found
c2fd0ca1a822 watchdog: Integrate watchdog triggering into the cyclic framework ^^^ unbootable for imx8mm_venice - hangs after SPL banner: U-Boot SPL 2022.10-rc3-00241-gc2fd0ca1a822 (Oct 26 2022 - 10:08:54 -0700) ^^^ this occurs because 'uclass_get_device_by_driver(UCLASS_MISC, DM_DRIVER_GET(gsc), &dev))' in gateworks/venice/spl.c board_init_f hangs and I'm not clear why this patch affects that
881d4108257a cyclic: Introduce schedule() function ^^^ unbuildable for imx8mm_venice_defconfig CC arch/arm/lib/sections.o In file included from include/asm-generic/global_data.h:23, from ./arch/arm/include/asm/global_data.h:102, from include/dm/of.h:11, from include/dm/ofnode.h:12, from include/dm/device.h:13, from include/linux/mtd/mtd.h:26, from include/nand.h:17, from cmd/bootm.c:17: include/cyclic.h:116:19: error: macro "schedule" passed 1 arguments, but takes j ust 0 void schedule(void);
29caf9305b6f cyclic: Use schedule() instead of WATCHDOG_RESET() ^^^ build issue resolved, boot issue on imx8mm-venice resolved, but this is where we now encounter watchdog failures in both the SPL and U-Boot:
SPL: cyclic_register for watchdog@30280000 failed WDT: Failed to start watchdog@30280000
The failure here is 'Cyclic IF not ready yet' (which should probably be an error print not a debug print). In this case the watchdog gets started but never reset via cyclic. This is due to cyclic_init never being called in the SPL - where is that supposed to occur?
U-Boot: cyclic function watchdog@30280000 took too long: 2976us vs 1000us max, disabling
Here also the watchdog gets started but never reset via cyclic so the board will reset itself after 60s sitting in U-Boot for example.
The issue here is that for some reason the first call to wdt_cyclic() takes about 3ms vs subsequent calls taking about 185us on this platform which exceeds the 1ms default max of CONFIG_CYCLIC_MAX_CPU_TIME_US and thus the watchdog reset is disabled and the board resets in 60s. Setting CONFIG_CYCLIC_MAX_CPU_TIME_US=5000 resolves that issue but this feels like a workaround that perhaps shouldn't be required. I assume the extra time in the first call is the probing of the device. So I think the fix for this U-Boot regression is to move the init part of wdt_cyclic to wdt_start() and have wdt_cyclic() only call wdt_reset()
Fabio, I assume you see this on other imx8m boards?
Also, the .configs used in each case might be interesting, certainly all lines containing CYCLIC, WATCHDOG or WDT.
$ grep CYCLIC .config CONFIG_CYCLIC=y CONFIG_CYCLIC_MAX_CPU_TIME_US=1000 # CONFIG_CMD_MX_CYCLIC is not set CONFIG_CMD_CYCLIC=y $ grep WATCHDOG .config CONFIG_SPL_WATCHDOG=y CONFIG_SYSRESET_WATCHDOG=y # CONFIG_SYSRESET_WATCHDOG_AUTO is not set CONFIG_WATCHDOG=y CONFIG_WATCHDOG_AUTOSTART=y CONFIG_WATCHDOG_TIMEOUT_MSECS=60000 CONFIG_IMX_WATCHDOG=y # CONFIG_WATCHDOG_RESET_DISABLE is not set # CONFIG_ULP_WATCHDOG is not set # CONFIG_DESIGNWARE_WATCHDOG is not set # CONFIG_XILINX_TB_WATCHDOG is not set $ grep WDT .config # CONFIG_CMD_WDT is not set CONFIG_WDT=y # CONFIG_WDT_APPLE is not set # CONFIG_WDT_ASPEED is not set # CONFIG_WDT_AST2600 is not set # CONFIG_WDT_AT91 is not set # CONFIG_WDT_CDNS is not set # CONFIG_WDT_CORTINA is not set # CONFIG_WDT_GPIO is not set # CONFIG_WDT_MAX6370 is not set # CONFIG_WDT_MESON_GXBB is not set # CONFIG_WDT_ORION is not set # CONFIG_WDT_SBSA is not set # CONFIG_WDT_SP805 is not set # CONFIG_WDT_STM32MP is not set CONFIG_SPL_WDT=y
One thing I notice is that I think wdt_stop should unregister the cyclic function; there's really no point having the cyclic framework keep calling wdt_cyclic() only to bail out at "if (!priv->running)" - moreover, it's almost certainly a bug if the device is started again via another wdt_start(), because then we have two cyclic instances registered for the same device.
I also think the cyclic API is somewhat misdesigned. The storage for the cyclic metadata should be provided by the caller (e.g. in the watchdog case just embedded as part of struct wdt_priv), so that cyclic_register() can never fail. Otherwise we have this awkward situation where ops->start() have already been called and succeeded, but then perhaps we fail to register the cyclic instance; should we then do wdt_stop(), and what if _that_ then fails?
This also allows the callback to be a little more type-safe; let the callback take a "struct cyclic_info *" as argument, and then the callback can use e.g. container_of to get the containing wdt_priv.
I really didn't follow the various submissions for the cyclic framework so I don't have any input on that end.
Best Regards,
Tim