[PATCH] watchdog: allow overriding the rate-limiting logic

Rasmus Villemoes

12 Mar 2020 12 Mar '20

12:40 p.m.

On the MPC8309-derived board I'm currently working on, the current rate-limiting logic in the generic watchdog_reset function poses two problems:

First, the hard-coded limit of 1000ms is too large for the external GPIO-triggered watchdog we have.

Second, in the SPL, the decrementer interrupt is not enabled, so the get_timer() call calls always returns 0, so wdt_reset() is never actually called. Enabling that interrupt (i.e. calling interrupt_init() somewhere in my early board code) is a bit problematic, since U-Boot proper gets loaded to address 0, hence overwriting exception vectors - and the reason I even need to care about the watchdog in the SPL is that the signature verification takes a rather long time, so just disabling interrupts before loading U-Boot proper doesn't work.

Both problems can be solved by allowing the board to override the rate-limiting logic. For example, in my case I will implement the function in terms of get_ticks() (i.e. the time base registers on PPC) which do not depend on interrupts, and use a threshold of gd->bus_clk / (4*16) ticks (~62ms).

Signed-off-by: Rasmus Villemoes rasmus.villemoes@prevas.dk ---

This is on top of https://patchwork.ozlabs.org/patch/1242772/, but it's obviously trivial to do on master instead.

drivers/watchdog/wdt-uclass.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c index 309a0e9c5b..ad53e86b80 100644 --- a/drivers/watchdog/wdt-uclass.c +++ b/drivers/watchdog/wdt-uclass.c @@ -67,6 +67,19 @@ int wdt_expire_now(struct udevice *dev, ulong flags) }

#if defined(CONFIG_WATCHDOG) +__weak int watchdog_reset_ratelimit(void) +{ + static ulong next_reset; + ulong now; + + now = get_timer(0); + if (time_after(now, next_reset)) { + next_reset = now + 1000; /* reset every 1000ms */ + return 1; + } + return 0; +} + /* * Called by macro WATCHDOG_RESET. This function be called *very* early, * so we need to make sure, that the watchdog driver is ready before using @@ -74,19 +87,13 @@ int wdt_expire_now(struct udevice *dev, ulong flags) */ void watchdog_reset(void) { - static ulong next_reset; - ulong now; - /* Exit if GD is not ready or watchdog is not initialized yet */ if (!gd || !(gd->flags & GD_FLG_WDT_READY)) return;

/* Do not reset the watchdog too often */ - now = get_timer(0); - if (time_after(now, next_reset)) { - next_reset = now + 1000; /* reset every 1000ms */ + if (watchdog_reset_ratelimit()) wdt_reset(gd->watchdog_dev); - } } #endif

-- 2.23.0

Show replies by date

Stefan Roese

12 Mar 12 Mar

12:58 p.m.

Hi Rasmus,

(added Christophe to Cc)

On 12.03.20 12:40, Rasmus Villemoes wrote:

...

On the MPC8309-derived board I'm currently working on, the current rate-limiting logic in the generic watchdog_reset function poses two problems:

First, the hard-coded limit of 1000ms is too large for the external GPIO-triggered watchdog we have.

I remember a discussion a few weeks ago with Christophe, where you pointed out, that there is already the official "hw_margin_ms" DT property for this delay here. Why don't you introduce it here so that all boards can use it as well?

...

Second, in the SPL, the decrementer interrupt is not enabled, so the get_timer() call calls always returns 0, so wdt_reset() is never actually called. Enabling that interrupt (i.e. calling interrupt_init() somewhere in my early board code) is a bit problematic, since U-Boot proper gets loaded to address 0, hence overwriting exception vectors - and the reason I even need to care about the watchdog in the SPL is that the signature verification takes a rather long time, so just disabling interrupts before loading U-Boot proper doesn't work.

Both problems can be solved by allowing the board to override the rate-limiting logic. For example, in my case I will implement the function in terms of get_ticks() (i.e. the time base registers on PPC) which do not depend on interrupts, and use a threshold of gd->bus_clk / (4*16) ticks (~62ms).

I would assume that you might run into multiple issues, when your timer infrastructure is not working correctly in SPL. Can't you instead "fix" this by using this get_ticks() option for the get_timer() functions in SPL so that you can use the common functions here and in all other places in SPL as well?

...

Signed-off-by: Rasmus Villemoes rasmus.villemoes@prevas.dk

This is on top of https://patchwork.ozlabs.org/patch/1242772/, but it's obviously trivial to do on master instead.

drivers/watchdog/wdt-uclass.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c index 309a0e9c5b..ad53e86b80 100644 --- a/drivers/watchdog/wdt-uclass.c +++ b/drivers/watchdog/wdt-uclass.c @@ -67,6 +67,19 @@ int wdt_expire_now(struct udevice *dev, ulong flags) }

#if defined(CONFIG_WATCHDOG) +__weak int watchdog_reset_ratelimit(void) +{
static ulong next_reset;

ulong now;

now = get_timer(0);

if (time_after(now, next_reset)) {
next_reset = now + 1000;	/* reset every 1000ms */
return 1;
}

return 0;
+}

/*

Called by macro WATCHDOG_RESET. This function be called *very* early,

so we need to make sure, that the watchdog driver is ready before using

@@ -74,19 +87,13 @@ int wdt_expire_now(struct udevice *dev, ulong flags) */ void watchdog_reset(void) {
static ulong next_reset;

ulong now;

/* Exit if GD is not ready or watchdog is not initialized yet */ if (!gd || !(gd->flags & GD_FLG_WDT_READY)) return;

/* Do not reset the watchdog too often */

now = get_timer(0);

if (time_after(now, next_reset)) {
next_reset = now + 1000;	/* reset every 1000ms */
if (watchdog_reset_ratelimit()) wdt_reset(gd->watchdog_dev);

} } #endif

Thanks, Stefan

Rasmus Villemoes

4:36 p.m.

On 12/03/2020 12.58, Stefan Roese wrote:

...

Hi Rasmus,

(added Christophe to Cc)

On 12.03.20 12:40, Rasmus Villemoes wrote:

...
On the MPC8309-derived board I'm currently working on, the current rate-limiting logic in the generic watchdog_reset function poses two problems:

First, the hard-coded limit of 1000ms is too large for the external GPIO-triggered watchdog we have.

I remember a discussion a few weeks ago with Christophe, where you pointed out, that there is already the official "hw_margin_ms" DT property for this delay here. Why don't you introduce it here so that all boards can use it as well?

Well, I considered that. Reading the value is probably just adding a dev_read_u32_default(..., 1000) next to the one reading the timeout-sec property in initr_watchdog(). But what is a good place to stash the result? It really belongs with the device, but as long as only one is supported I suppose one could move initr_watchdog to wdt-uclass.c and use a static variable in that file.

I can certainly do that. That leaves the other problem.

...

...
Second, in the SPL, the decrementer interrupt is not enabled, so the get_timer() call calls always returns 0, so wdt_reset() is never actually called. Enabling that interrupt (i.e. calling interrupt_init() somewhere in my early board code) is a bit problematic, since U-Boot proper gets loaded to address 0, hence overwriting exception vectors - and the reason I even need to care about the watchdog in the SPL is that the signature verification takes a rather long time, so just disabling interrupts before loading U-Boot proper doesn't work.

Both problems can be solved by allowing the board to override the rate-limiting logic. For example, in my case I will implement the function in terms of get_ticks() (i.e. the time base registers on PPC) which do not depend on interrupts, and use a threshold of gd->bus_clk / (4*16) ticks (~62ms).

I would assume that you might run into multiple issues, when your timer infrastructure is not working correctly in SPL.

None so far, except for the ratelimiting in terms of get_timer() not working.

...

Can't you instead "fix" this by using this get_ticks() option for the get_timer() functions in SPL so that you can use the common functions here and in all other places in SPL as well?

I don't understand what you mean. On PPC, get_timer(0) just returns a static variable which is incremented from the timer_interrupt(). When that interrupt is not enabled, that variable is always 0.

I can enable the timer interrupt and get the time (in terms of get_timer) going, but as I wrote, I would have to disable that interrupt again before actually loading U-Boot [you can probably imagine the fun of debugging what happens when one overwrites the exception vectors], so time would stop passing as far as get_timer() is concerned, which means that again the watchdog would no longer be serviced. So if there is to be any ratelimiting at all, it really has to be based on a time that does tick without relying on interrupts.

Rasmus

Rasmus Villemoes

13 Mar 13 Mar

5:04 p.m.

New subject: [PATCH 0/3] watchdog: honour hw_margin_ms property

Some watchdogs must be reset more often than the once-per-second ratelimit used by the generic watchdog_reset function in wdt-uclass.c. There's precedent (from the gpio-wdt driver in linux) for using a property called hw_margin_ms to let the device tree tell the driver how often the device needs resetting. So use that generically. No change in default behaviour.

On top of https://patchwork.ozlabs.org/patch/1242772/ .

Stefan, something like this? That at least solves half my problems and might be useful to others as well. Then I'll have to figure out the time-stands-still problem in some other way.

Rasmus Villemoes (3): watchdog: remove stale ifndef CONFIG_WATCHDOG_TIMEOUT_MSECS from wdt.h watchdog: move initr_watchdog() to wdt-uclass.c watchdog: honour hw_margin_ms DT property

drivers/watchdog/wdt-uclass.c | 43 ++++++++++++++++++++++++++++++++++- include/wdt.h | 38 +------------------------------ 2 files changed, 43 insertions(+), 38 deletions(-)

-- 2.23.0

Rasmus Villemoes

5:04 p.m.

New subject: [PATCH 1/3] watchdog: remove stale ifndef CONFIG_WATCHDOG_TIMEOUT_MSECS from wdt.h

Since WATCHDOG_TIMEOUT_MSECS was converted to Kconfig (commit ca51ef7c0c), CONFIG_WATCHDOG_TIMEOUT_MSECS has been guaranteed to be defined. So remove the dead fallback ifdeffery.

Signed-off-by: Rasmus Villemoes rasmus.villemoes@prevas.dk --- include/wdt.h | 3 --- 1 file changed, 3 deletions(-)

diff --git a/include/wdt.h b/include/wdt.h index 5bcff24ab3..e833d3a772 100644 --- a/include/wdt.h +++ b/include/wdt.h @@ -107,9 +107,6 @@ struct wdt_ops { };

#if CONFIG_IS_ENABLED(WDT) -#ifndef CONFIG_WATCHDOG_TIMEOUT_MSECS -#define CONFIG_WATCHDOG_TIMEOUT_MSECS (60 * 1000) -#endif #define WATCHDOG_TIMEOUT_SECS (CONFIG_WATCHDOG_TIMEOUT_MSECS / 1000)

static inline int initr_watchdog(void)

-- 2.23.0

Stefan Roese

14 Mar 14 Mar

12:55 p.m.

New subject: [PATCH 1/3] watchdog: remove stale ifndef CONFIG_WATCHDOG_TIMEOUT_MSECS from wdt.h

On 13.03.20 17:04, Rasmus Villemoes wrote:

...

Since WATCHDOG_TIMEOUT_MSECS was converted to Kconfig (commit ca51ef7c0c), CONFIG_WATCHDOG_TIMEOUT_MSECS has been guaranteed to be defined. So remove the dead fallback ifdeffery.

Signed-off-by: Rasmus Villemoes rasmus.villemoes@prevas.dk

include/wdt.h | 3 --- 1 file changed, 3 deletions(-)

diff --git a/include/wdt.h b/include/wdt.h index 5bcff24ab3..e833d3a772 100644 --- a/include/wdt.h +++ b/include/wdt.h @@ -107,9 +107,6 @@ struct wdt_ops { };

#if CONFIG_IS_ENABLED(WDT) -#ifndef CONFIG_WATCHDOG_TIMEOUT_MSECS -#define CONFIG_WATCHDOG_TIMEOUT_MSECS (60 * 1000) -#endif #define WATCHDOG_TIMEOUT_SECS (CONFIG_WATCHDOG_TIMEOUT_MSECS / 1000)

static inline int initr_watchdog(void)

Reviewed-by: Stefan Roese sr@denx.de

Thanks, Stefan

Rasmus Villemoes

13 Mar 13 Mar

5:04 p.m.

New subject: [PATCH 2/3] watchdog: move initr_watchdog() to wdt-uclass.c

This function is a bit large for an inline function, and for U-Boot proper, it is called via a function pointer anyway (in board_r.c), so cannot be inlined.

It will shortly set a global variable to be used by the watchdog_reset() function in wdt-uclass.c, so this also allows making that variable local to wdt-uclass.c.

The WATCHDOG_TIMEOUT_SECS define is not used elsewhere.

Signed-off-by: Rasmus Villemoes rasmus.villemoes@prevas.dk --- drivers/watchdog/wdt-uclass.c | 33 +++++++++++++++++++++++++++++++++ include/wdt.h | 35 +---------------------------------- 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c index 309a0e9c5b..fb3e247c5f 100644 --- a/drivers/watchdog/wdt-uclass.c +++ b/drivers/watchdog/wdt-uclass.c @@ -13,6 +13,39 @@

DECLARE_GLOBAL_DATA_PTR;

+#define WATCHDOG_TIMEOUT_SECS (CONFIG_WATCHDOG_TIMEOUT_MSECS / 1000) + +int initr_watchdog(void) +{ + u32 timeout = WATCHDOG_TIMEOUT_SECS; + + /* + * Init watchdog: This will call the probe function of the + * watchdog driver, enabling the use of the device + */ + if (uclass_get_device_by_seq(UCLASS_WDT, 0, + (struct udevice **)&gd->watchdog_dev)) { + debug("WDT: Not found by seq!\n"); + if (uclass_get_device(UCLASS_WDT, 0, + (struct udevice **)&gd->watchdog_dev)) { + printf("WDT: Not found!\n"); + return 0; + } + } + + if (CONFIG_IS_ENABLED(OF_CONTROL)) { + timeout = dev_read_u32_default(gd->watchdog_dev, "timeout-sec", + WATCHDOG_TIMEOUT_SECS); + } + + wdt_start(gd->watchdog_dev, timeout * 1000, 0); + gd->flags |= GD_FLG_WDT_READY; + printf("WDT: Started with%s servicing (%ds timeout)\n", + IS_ENABLED(CONFIG_WATCHDOG) ? "" : "out", timeout); + + return 0; +} + int wdt_start(struct udevice *dev, u64 timeout_ms, ulong flags) { const struct wdt_ops *ops = device_get_ops(dev); diff --git a/include/wdt.h b/include/wdt.h index e833d3a772..aea5abc768 100644 --- a/include/wdt.h +++ b/include/wdt.h @@ -106,39 +106,6 @@ struct wdt_ops { int (*expire_now)(struct udevice *dev, ulong flags); };

-#if CONFIG_IS_ENABLED(WDT) -#define WATCHDOG_TIMEOUT_SECS (CONFIG_WATCHDOG_TIMEOUT_MSECS / 1000) - -static inline int initr_watchdog(void) -{ - u32 timeout = WATCHDOG_TIMEOUT_SECS; - - /* - * Init watchdog: This will call the probe function of the - * watchdog driver, enabling the use of the device - */ - if (uclass_get_device_by_seq(UCLASS_WDT, 0, - (struct udevice **)&gd->watchdog_dev)) { - debug("WDT: Not found by seq!\n"); - if (uclass_get_device(UCLASS_WDT, 0, - (struct udevice **)&gd->watchdog_dev)) { - printf("WDT: Not found!\n"); - return 0; - } - } - - if (CONFIG_IS_ENABLED(OF_CONTROL)) { - timeout = dev_read_u32_default(gd->watchdog_dev, "timeout-sec", - WATCHDOG_TIMEOUT_SECS); - } - - wdt_start(gd->watchdog_dev, timeout * 1000, 0); - gd->flags |= GD_FLG_WDT_READY; - printf("WDT: Started with%s servicing (%ds timeout)\n", - IS_ENABLED(CONFIG_WATCHDOG) ? "" : "out", timeout); - - return 0; -} -#endif +int initr_watchdog(void);

#endif /* _WDT_H_ */

-- 2.23.0

Stefan Roese

14 Mar 14 Mar

12:55 p.m.

New subject: [PATCH 2/3] watchdog: move initr_watchdog() to wdt-uclass.c

On 13.03.20 17:04, Rasmus Villemoes wrote:

...

This function is a bit large for an inline function, and for U-Boot proper, it is called via a function pointer anyway (in board_r.c), so cannot be inlined.

It will shortly set a global variable to be used by the watchdog_reset() function in wdt-uclass.c, so this also allows making that variable local to wdt-uclass.c.

The WATCHDOG_TIMEOUT_SECS define is not used elsewhere.

Signed-off-by: Rasmus Villemoes rasmus.villemoes@prevas.dk

drivers/watchdog/wdt-uclass.c | 33 +++++++++++++++++++++++++++++++++ include/wdt.h | 35 +---------------------------------- 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c index 309a0e9c5b..fb3e247c5f 100644 --- a/drivers/watchdog/wdt-uclass.c +++ b/drivers/watchdog/wdt-uclass.c @@ -13,6 +13,39 @@

DECLARE_GLOBAL_DATA_PTR;

+#define WATCHDOG_TIMEOUT_SECS (CONFIG_WATCHDOG_TIMEOUT_MSECS / 1000)

+int initr_watchdog(void) +{
u32 timeout = WATCHDOG_TIMEOUT_SECS;

/*
* Init watchdog: This will call the probe function of the
* watchdog driver, enabling the use of the device
*/
if (uclass_get_device_by_seq(UCLASS_WDT, 0,
		     (struct udevice **)&gd->watchdog_dev)) {
debug("WDT:   Not found by seq!\n");
if (uclass_get_device(UCLASS_WDT, 0,
		      (struct udevice **)&gd->watchdog_dev)) {
	printf("WDT:   Not found!\n");
	return 0;
}
}

if (CONFIG_IS_ENABLED(OF_CONTROL)) {
timeout = dev_read_u32_default(gd->watchdog_dev, "timeout-sec",
			       WATCHDOG_TIMEOUT_SECS);
}

wdt_start(gd->watchdog_dev, timeout * 1000, 0);

gd->flags |= GD_FLG_WDT_READY;

printf("WDT: Started with%s servicing (%ds timeout)\n",
      IS_ENABLED(CONFIG_WATCHDOG) ? "" : "out", timeout);
return 0;
+}

int wdt_start(struct udevice *dev, u64 timeout_ms, ulong flags) { const struct wdt_ops *ops = device_get_ops(dev);

diff --git a/include/wdt.h b/include/wdt.h index e833d3a772..aea5abc768 100644 --- a/include/wdt.h +++ b/include/wdt.h @@ -106,39 +106,6 @@ struct wdt_ops { int (*expire_now)(struct udevice *dev, ulong flags); };

-#if CONFIG_IS_ENABLED(WDT) -#define WATCHDOG_TIMEOUT_SECS (CONFIG_WATCHDOG_TIMEOUT_MSECS / 1000)

-static inline int initr_watchdog(void) -{
u32 timeout = WATCHDOG_TIMEOUT_SECS;

/*
* Init watchdog: This will call the probe function of the
* watchdog driver, enabling the use of the device
*/
if (uclass_get_device_by_seq(UCLASS_WDT, 0,
		     (struct udevice **)&gd->watchdog_dev)) {
debug("WDT:   Not found by seq!\n");
if (uclass_get_device(UCLASS_WDT, 0,
		      (struct udevice **)&gd->watchdog_dev)) {
	printf("WDT:   Not found!\n");
	return 0;
}
}

if (CONFIG_IS_ENABLED(OF_CONTROL)) {
timeout = dev_read_u32_default(gd->watchdog_dev, "timeout-sec",
			       WATCHDOG_TIMEOUT_SECS);
}

wdt_start(gd->watchdog_dev, timeout * 1000, 0);

gd->flags |= GD_FLG_WDT_READY;

printf("WDT: Started with%s servicing (%ds timeout)\n",
      IS_ENABLED(CONFIG_WATCHDOG) ? "" : "out", timeout);
return 0;
-} -#endif +int initr_watchdog(void);

#endif /* _WDT_H_ */

Reviewed-by: Stefan Roese sr@denx.de

Thanks, Stefan

Rasmus Villemoes

13 Mar 13 Mar

5:04 p.m.

New subject: [PATCH 3/3] watchdog: honour hw_margin_ms DT property

Some watchdog devices, e.g. external gpio-triggered ones, must be reset more often than once per second, which means that the current rate-limiting logic in watchdog_reset() fails to keep the board alive.

gpio-wdt.txt in the linux source tree defines a "hw_margin_ms" property used to specifiy the maximum time allowed between resetting the device. Allow any watchdog device to specify such a property, and then use a reset period of one quarter of that. We keep the current default of resetting once every 1000ms.

Signed-off-by: Rasmus Villemoes rasmus.villemoes@prevas.dk --- drivers/watchdog/wdt-uclass.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c index fb3e247c5f..436a52fd08 100644 --- a/drivers/watchdog/wdt-uclass.c +++ b/drivers/watchdog/wdt-uclass.c @@ -15,6 +15,12 @@ DECLARE_GLOBAL_DATA_PTR;

#define WATCHDOG_TIMEOUT_SECS (CONFIG_WATCHDOG_TIMEOUT_MSECS / 1000)

+/* + * Reset every 1000ms, or however often is required as indicated by a + * hw_margin_ms property. + */ +static ulong reset_period = 1000; + int initr_watchdog(void) { u32 timeout = WATCHDOG_TIMEOUT_SECS; @@ -36,6 +42,8 @@ int initr_watchdog(void) if (CONFIG_IS_ENABLED(OF_CONTROL)) { timeout = dev_read_u32_default(gd->watchdog_dev, "timeout-sec", WATCHDOG_TIMEOUT_SECS); + reset_period = dev_read_u32_default(gd->watchdog_dev, "hw_margin_ms", + 4*reset_period)/4; }

wdt_start(gd->watchdog_dev, timeout * 1000, 0); @@ -117,7 +125,7 @@ void watchdog_reset(void) /* Do not reset the watchdog too often */ now = get_timer(0); if (time_after(now, next_reset)) { - next_reset = now + 1000; /* reset every 1000ms */ + next_reset = now + reset_period; wdt_reset(gd->watchdog_dev); } }

-- 2.23.0

Stefan Roese

14 Mar 14 Mar

12:57 p.m.

New subject: [PATCH 3/3] watchdog: honour hw_margin_ms DT property

On 13.03.20 17:04, Rasmus Villemoes wrote:

...

Some watchdog devices, e.g. external gpio-triggered ones, must be reset more often than once per second, which means that the current rate-limiting logic in watchdog_reset() fails to keep the board alive.

gpio-wdt.txt in the linux source tree defines a "hw_margin_ms" property used to specifiy the maximum time allowed between resetting the device. Allow any watchdog device to specify such a property, and then use a reset period of one quarter of that. We keep the current default of resetting once every 1000ms.

Signed-off-by: Rasmus Villemoes rasmus.villemoes@prevas.dk

drivers/watchdog/wdt-uclass.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c index fb3e247c5f..436a52fd08 100644 --- a/drivers/watchdog/wdt-uclass.c +++ b/drivers/watchdog/wdt-uclass.c @@ -15,6 +15,12 @@ DECLARE_GLOBAL_DATA_PTR;

#define WATCHDOG_TIMEOUT_SECS (CONFIG_WATCHDOG_TIMEOUT_MSECS / 1000)

+/*

Reset every 1000ms, or however often is required as indicated by a

hw_margin_ms property.

*/

+static ulong reset_period = 1000;

int initr_watchdog(void) { u32 timeout = WATCHDOG_TIMEOUT_SECS;

@@ -36,6 +42,8 @@ int initr_watchdog(void) if (CONFIG_IS_ENABLED(OF_CONTROL)) { timeout = dev_read_u32_default(gd->watchdog_dev, "timeout-sec", WATCHDOG_TIMEOUT_SECS);
reset_period = dev_read_u32_default(gd->watchdog_dev, "hw_margin_ms",
				    4*reset_period)/4;

Nitpicking: checkpatch will most likely complain about missing spaces here.

...

}

wdt_start(gd->watchdog_dev, timeout * 1000, 0); @@ -117,7 +125,7 @@ void watchdog_reset(void) /* Do not reset the watchdog too often */ now = get_timer(0); if (time_after(now, next_reset)) {
next_reset = now + 1000;	/* reset every 1000ms */
next_reset = now + reset_period;
wdt_reset(gd->watchdog_dev); } }

Other than that:

Reviewed-by: Stefan Roese sr@denx.de

Thanks, Stefan

Rasmus Villemoes

16 Mar 16 Mar

4:39 p.m.

New subject: [PATCH 3/3] watchdog: honour hw_margin_ms DT property

On 14/03/2020 12.57, Stefan Roese wrote:

...

On 13.03.20 17:04, Rasmus Villemoes wrote:

...
int initr_watchdog(void) { u32 timeout = WATCHDOG_TIMEOUT_SECS; @@ -36,6 +42,8 @@ int initr_watchdog(void) if (CONFIG_IS_ENABLED(OF_CONTROL)) { timeout = dev_read_u32_default(gd->watchdog_dev, "timeout-sec", WATCHDOG_TIMEOUT_SECS); + reset_period = dev_read_u32_default(gd->watchdog_dev, "hw_margin_ms", + 4*reset_period)/4;

Nitpicking: checkpatch will most likely complain about missing spaces here.

Right. Want me to resend, or can you fix if/when applying?

Rasmus

Stefan Roese

17 Mar 17 Mar

8:25 a.m.

New subject: [PATCH 3/3] watchdog: honour hw_margin_ms DT property

On 16.03.20 16:39, Rasmus Villemoes wrote:

...

On 14/03/2020 12.57, Stefan Roese wrote:

...
On 13.03.20 17:04, Rasmus Villemoes wrote:

...
int initr_watchdog(void) { u32 timeout = WATCHDOG_TIMEOUT_SECS; @@ -36,6 +42,8 @@ int initr_watchdog(void) if (CONFIG_IS_ENABLED(OF_CONTROL)) { timeout = dev_read_u32_default(gd->watchdog_dev, "timeout-sec", WATCHDOG_TIMEOUT_SECS); + reset_period = dev_read_u32_default(gd->watchdog_dev, "hw_margin_ms", + 4*reset_period)/4;

Nitpicking: checkpatch will most likely complain about missing spaces here.

Right. Want me to resend, or can you fix if/when applying?

If there will be no other changes requiring a v2, then I can fix this while applying.

Thanks, Stefan

Stefan Roese

14 Mar 14 Mar

1:04 p.m.

New subject: [PATCH 0/3] watchdog: honour hw_margin_ms property

On 13.03.20 17:04, Rasmus Villemoes wrote:

...

Some watchdogs must be reset more often than the once-per-second ratelimit used by the generic watchdog_reset function in wdt-uclass.c. There's precedent (from the gpio-wdt driver in linux) for using a property called hw_margin_ms to let the device tree tell the driver how often the device needs resetting. So use that generically. No change in default behaviour.

On top of https://patchwork.ozlabs.org/patch/1242772/ .

Stefan, something like this?

Yes, thanks for looking into this.

...

That at least solves half my problems and might be useful to others as well. Then I'll have to figure out the time-stands-still problem in some other way.

If its too hard to enable interrupts in SPL for you or to provide some other means of a working get_timer() API, then we needto find another solution. You started with this weak function, which of course works. What other options are there? Adding a callback mechanism to register platform specific callback functions? Even though this might get a little bit too complicated.

Thanks, Stefan

Rasmus Villemoes

16 Mar 16 Mar

4:52 p.m.

New subject: [PATCH 0/3] watchdog: honour hw_margin_ms property

On 14/03/2020 13.04, Stefan Roese wrote:

...

On 13.03.20 17:04, Rasmus Villemoes wrote:

...

...
That at least solves half my problems and might be useful to others as well. Then I'll have to figure out the time-stands-still problem in some other way.

If its too hard to enable interrupts in SPL for you or to provide some other means of a working get_timer() API, then we needto find another solution. You started with this weak function, which of course works. What other options are there? Adding a callback mechanism to register platform specific callback functions? Even though this might get a little bit too complicated.

Now that I dig a bit more into this, I seem to remember that we actually also had problems in U-Boot proper when loading a compressed kernel, so for now we're using an uncompressed kernel in our FIT images. I will have to re-investigate, but it now occurs to me that it might be due to the fact that interrupts get disabled during bootm (which makes sense, the same reason I stated previously of interrupt vectors about to be overwritten), so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored.

So it may be necessary to have some wdt_ratelimit_disable() hook that can be called from bootm_disable_interrupts() and e.g. some board-specific SPL code. I'll do some experiments and figure out if I do indeed need such a hook.

Rasmus

Rasmus Villemoes

2 Jun 2 Jun

3:29 p.m.

New subject: [PATCH 0/3] watchdog: honour hw_margin_ms property

On 16/03/2020 16.52, Rasmus Villemoes wrote:

...

On 14/03/2020 13.04, Stefan Roese wrote:

...
On 13.03.20 17:04, Rasmus Villemoes wrote:

...
...
That at least solves half my problems and might be useful to others as well. Then I'll have to figure out the time-stands-still problem in some other way.

If its too hard to enable interrupts in SPL for you or to provide some other means of a working get_timer() API, then we needto find another solution. You started with this weak function, which of course works. What other options are there? Adding a callback mechanism to register platform specific callback functions? Even though this might get a little bit too complicated.

Now that I dig a bit more into this, I seem to remember that we actually also had problems in U-Boot proper when loading a compressed kernel, so for now we're using an uncompressed kernel in our FIT images. I will have to re-investigate, but it now occurs to me that it might be due to the fact that interrupts get disabled during bootm (which makes sense, the same reason I stated previously of interrupt vectors about to be overwritten), so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored.

So it may be necessary to have some wdt_ratelimit_disable() hook that can be called from bootm_disable_interrupts() and e.g. some board-specific SPL code. I'll do some experiments and figure out if I do indeed need such a hook.

OK, I have now had time to do some more experiments. I have enabled the timer tick in SPL, so get_timer() now "normally" works. Together with the .dts based read of the hardware margin, that makes the watchdog handling mostly work.

But, as I suspected, I do have a problem when loading a compressed kernel image - what I write above "so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored." is indeed the case.

So, what's the best way to proceed? Should there be a hook disabling the rate-limiting logic that bootm_disable_interrupts() can call? Or must get_timer() always return a sensible result even with interrupts disabled?

Rasmus

Stefan Roese

4:53 p.m.

New subject: [PATCH 0/3] watchdog: honour hw_margin_ms property

On 02.06.20 15:29, Rasmus Villemoes wrote:

...

On 16/03/2020 16.52, Rasmus Villemoes wrote:

...
On 14/03/2020 13.04, Stefan Roese wrote:

...
On 13.03.20 17:04, Rasmus Villemoes wrote:

...
...
That at least solves half my problems and might be useful to others as well. Then I'll have to figure out the time-stands-still problem in some other way.

If its too hard to enable interrupts in SPL for you or to provide some other means of a working get_timer() API, then we needto find another solution. You started with this weak function, which of course works. What other options are there? Adding a callback mechanism to register platform specific callback functions? Even though this might get a little bit too complicated.

Now that I dig a bit more into this, I seem to remember that we actually also had problems in U-Boot proper when loading a compressed kernel, so for now we're using an uncompressed kernel in our FIT images. I will have to re-investigate, but it now occurs to me that it might be due to the fact that interrupts get disabled during bootm (which makes sense, the same reason I stated previously of interrupt vectors about to be overwritten), so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored.

So it may be necessary to have some wdt_ratelimit_disable() hook that can be called from bootm_disable_interrupts() and e.g. some board-specific SPL code. I'll do some experiments and figure out if I do indeed need such a hook.

OK, I have now had time to do some more experiments. I have enabled the timer tick in SPL, so get_timer() now "normally" works. Together with the .dts based read of the hardware margin, that makes the watchdog handling mostly work.

But, as I suspected, I do have a problem when loading a compressed kernel image - what I write above "so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored." is indeed the case.

So, what's the best way to proceed? Should there be a hook disabling the rate-limiting logic that bootm_disable_interrupts() can call? Or must get_timer() always return a sensible result even with interrupts disabled?

Wouldn't it make sense to move the bootm_disable_interrupts() call to after loading and uncompressing the OS image? To right before jumping to the OS?

Thanks, Stefan

Rasmus Villemoes

5:38 p.m.

New subject: [PATCH 0/3] watchdog: honour hw_margin_ms property

On 02/06/2020 16.53, Stefan Roese wrote:

...

On 02.06.20 15:29, Rasmus Villemoes wrote:

...
On 16/03/2020 16.52, Rasmus Villemoes wrote:

...
On 14/03/2020 13.04, Stefan Roese wrote:

...
On 13.03.20 17:04, Rasmus Villemoes wrote:

...
...
That at least solves half my problems and might be useful to others as well. Then I'll have to figure out the time-stands-still problem in some other way.

If its too hard to enable interrupts in SPL for you or to provide some other means of a working get_timer() API, then we needto find another solution. You started with this weak function, which of course works. What other options are there? Adding a callback mechanism to register platform specific callback functions? Even though this might get a little bit too complicated.

Now that I dig a bit more into this, I seem to remember that we actually also had problems in U-Boot proper when loading a compressed kernel, so for now we're using an uncompressed kernel in our FIT images. I will have to re-investigate, but it now occurs to me that it might be due to the fact that interrupts get disabled during bootm (which makes sense, the same reason I stated previously of interrupt vectors about to be overwritten), so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored.

So it may be necessary to have some wdt_ratelimit_disable() hook that can be called from bootm_disable_interrupts() and e.g. some board-specific SPL code. I'll do some experiments and figure out if I do indeed need such a hook.

OK, I have now had time to do some more experiments. I have enabled the timer tick in SPL, so get_timer() now "normally" works. Together with the .dts based read of the hardware margin, that makes the watchdog handling mostly work.

But, as I suspected, I do have a problem when loading a compressed kernel image - what I write above "so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored." is indeed the case.

So, what's the best way to proceed? Should there be a hook disabling the rate-limiting logic that bootm_disable_interrupts() can call? Or must get_timer() always return a sensible result even with interrupts disabled?

Wouldn't it make sense to move the bootm_disable_interrupts() call to after loading and uncompressing the OS image? To right before jumping to the OS?

No, because the point of disabling interrupts is that we may start writing to physical address 0 (e.g. if that's the load= address in the FIT image), which is also where the interrupt vectors reside - i.e., we're about to overwrite 0x900 (the decrementer interrupt vector), so if we don't disable interrupts, we'll crash on the very next decrementer interrupt (i.e., within one millisecond).

Rasmus

Rasmus Villemoes

4 Jun 4 Jun

10:28 a.m.

New subject: [PATCH 0/3] watchdog: honour hw_margin_ms property

On 02/06/2020 17.38, Rasmus Villemoes wrote:

...

On 02/06/2020 16.53, Stefan Roese wrote:

...
On 02.06.20 15:29, Rasmus Villemoes wrote:

...
On 16/03/2020 16.52, Rasmus Villemoes wrote:

...
On 14/03/2020 13.04, Stefan Roese wrote:

...
On 13.03.20 17:04, Rasmus Villemoes wrote:

...

...
...
But, as I suspected, I do have a problem when loading a compressed kernel image - what I write above "so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored." is indeed the case.

So, what's the best way to proceed? Should there be a hook disabling the rate-limiting logic that bootm_disable_interrupts() can call? Or must get_timer() always return a sensible result even with interrupts disabled?

Wouldn't it make sense to move the bootm_disable_interrupts() call to after loading and uncompressing the OS image? To right before jumping to the OS?

No, because the point of disabling interrupts is that we may start writing to physical address 0 (e.g. if that's the load= address in the FIT image), which is also where the interrupt vectors reside - i.e., we're about to overwrite 0x900 (the decrementer interrupt vector), so if we don't disable interrupts, we'll crash on the very next decrementer interrupt (i.e., within one millisecond).

FWIW, the below very hacky patch makes get_timer() return sensible results on ppc even when interrupts are disabled, and hence ensures that the watchdog does get petted. It's obviously not meant for inclusion as is (it's prepared for being a proper config option, but for toying around it's easier to have it all in one file - also, I don't really like the name of the config knob). But it's also kind of expensive to do that do_div(), so I'm not sure I think this is even the right approach.

You previously rejected allowing the board to provide an override for the rate-limiting, and at least the "hw_margin_ms" parsing now solves part of what I wanted to use that for. What about implementing the rate-limiting instead in terms of get_ticks() (the hw_margin_ms can trivially be translated to a number of ticks at init time - there's already a usec_to_tick helper)? Are there any boards where get_ticks() doesn't return something sensible?

Rasmus

_Not_ for inclusion:

diff --git a/arch/powerpc/lib/interrupts.c b/arch/powerpc/lib/interrupts.c index 23ac5bca1e..e6b6a967ae 100644 --- a/arch/powerpc/lib/interrupts.c +++ b/arch/powerpc/lib/interrupts.c @@ -10,11 +10,14 @@ #include <common.h> #include <irq_func.h> #include <asm/processor.h> +#include <div64.h> #include <watchdog.h> #ifdef CONFIG_LED_STATUS #include <status_led.h> #endif

+#define CONFIG_GET_TIMER_IRQ 1 + #ifndef CONFIG_MPC83XX_TIMER #ifndef CONFIG_SYS_WATCHDOG_FREQ #define CONFIG_SYS_WATCHDOG_FREQ (CONFIG_SYS_HZ / 2) @@ -39,8 +42,33 @@ static __inline__ void set_dec (unsigned long val) } #endif /* !CONFIG_MPC83XX_TIMER */

+static u64 irq_off_ticks = 0; +static int interrupts_enabled = 0; +static volatile ulong timestamp = 0; + +static u32 irq_off_msecs(void) +{ + u64 t; + u32 d = get_tbclk(); + + if (!d) + return 0; + t = get_ticks() - irq_off_ticks; + t *= 1000; + do_div(t, d); + return t; +} + void enable_interrupts(void) { + ulong msr = get_msr (); + + if (IS_ENABLED(CONFIG_GET_TIMER_IRQ) && !(msr & MSR_EE)) { + /* Account for the time interrupts were off. */ + timestamp += irq_off_msecs(); + interrupts_enabled = 1; + } + set_msr (get_msr () | MSR_EE); }

@@ -50,6 +78,13 @@ int disable_interrupts(void) ulong msr = get_msr ();

set_msr (msr & ~MSR_EE); + + if (IS_ENABLED(CONFIG_GET_TIMER_IRQ) && (msr & MSR_EE)) { + /* Record when interrupts were disabled. */ + irq_off_ticks = get_ticks(); + interrupts_enabled = 0; + } + return ((msr & MSR_EE) != 0); }

@@ -61,13 +96,11 @@ int interrupt_init(void)

set_dec (decrementer_count);

- set_msr (get_msr () | MSR_EE); + enable_interrupts();

return (0); }

-static volatile ulong timestamp = 0; - void timer_interrupt(struct pt_regs *regs) { /* call cpu specific function from $(CPU)/interrupts.c */ @@ -90,6 +123,12 @@ void timer_interrupt(struct pt_regs *regs)

ulong get_timer (ulong base) { - return (timestamp - base); + ulong ts = timestamp; + + /* If called within an irq-off section, account for the time since irqs were turned off. */ + if (IS_ENABLED(CONFIG_GET_TIMER_IRQ) && !interrupts_enabled) + ts += irq_off_msecs(); + + return (ts - base); } #endif /* !CONFIG_MPC83XX_TIMER */

-- Rasmus Villemoes Software Developer Prevas A/S Hedeager 3 DK-8200 Aarhus N +45 51210274 rasmus.villemoes@prevas.dk www.prevas.dk

Stefan Roese

10:41 a.m.

New subject: [PATCH 0/3] watchdog: honour hw_margin_ms property

On 04.06.20 10:28, Rasmus Villemoes wrote:

...

On 02/06/2020 17.38, Rasmus Villemoes wrote:

...
On 02/06/2020 16.53, Stefan Roese wrote:

...
On 02.06.20 15:29, Rasmus Villemoes wrote:

...
On 16/03/2020 16.52, Rasmus Villemoes wrote:

...
On 14/03/2020 13.04, Stefan Roese wrote:

...
On 13.03.20 17:04, Rasmus Villemoes wrote:

...
...
...
But, as I suspected, I do have a problem when loading a compressed kernel image - what I write above "so even in U-Boot proper, time as measured by get_timer() ceases to pass after that point, so all the WATCHDOG_RESET() calls from the inflate code effectively get ignored." is indeed the case.

So, what's the best way to proceed? Should there be a hook disabling the rate-limiting logic that bootm_disable_interrupts() can call? Or must get_timer() always return a sensible result even with interrupts disabled?

Wouldn't it make sense to move the bootm_disable_interrupts() call to after loading and uncompressing the OS image? To right before jumping to the OS?

No, because the point of disabling interrupts is that we may start writing to physical address 0 (e.g. if that's the load= address in the FIT image), which is also where the interrupt vectors reside - i.e., we're about to overwrite 0x900 (the decrementer interrupt vector), so if we don't disable interrupts, we'll crash on the very next decrementer interrupt (i.e., within one millisecond).

Ah, thanks for refreshing me on this.

...

FWIW, the below very hacky patch makes get_timer() return sensible results on ppc even when interrupts are disabled, and hence ensures that the watchdog does get petted. It's obviously not meant for inclusion as is (it's prepared for being a proper config option, but for toying around it's easier to have it all in one file - also, I don't really like the name of the config knob). But it's also kind of expensive to do that do_div(), so I'm not sure I think this is even the right approach.

I agree. This does not look as its going into the right direction. But thanks for working on this anyways.

...

You previously rejected allowing the board to provide an override for the rate-limiting,

From my memory, I "just" suggested working on a different, more generic approach. But if this fails, I'm open to re-visit the options.

...

and at least the "hw_margin_ms" parsing now solves part of what I wanted to use that for. What about implementing the rate-limiting instead in terms of get_ticks() (the hw_margin_ms can trivially be translated to a number of ticks at init time - there's already a usec_to_tick helper)? Are there any boards where get_ticks() doesn't return something sensible?

Could you please send a new version of a patch(set) to address these issues by using the "override for the rate-limiting" or some other idea you have right now for this? I'll review it soon'ish.

Thanks, Stefan

...

Rasmus

_Not_ for inclusion:

diff --git a/arch/powerpc/lib/interrupts.c b/arch/powerpc/lib/interrupts.c index 23ac5bca1e..e6b6a967ae 100644 --- a/arch/powerpc/lib/interrupts.c +++ b/arch/powerpc/lib/interrupts.c @@ -10,11 +10,14 @@ #include <common.h> #include <irq_func.h> #include <asm/processor.h> +#include <div64.h> #include <watchdog.h> #ifdef CONFIG_LED_STATUS #include <status_led.h> #endif

+#define CONFIG_GET_TIMER_IRQ 1

#ifndef CONFIG_MPC83XX_TIMER #ifndef CONFIG_SYS_WATCHDOG_FREQ #define CONFIG_SYS_WATCHDOG_FREQ (CONFIG_SYS_HZ / 2)

@@ -39,8 +42,33 @@ static __inline__ void set_dec (unsigned long val) } #endif /* !CONFIG_MPC83XX_TIMER */

+static u64 irq_off_ticks = 0; +static int interrupts_enabled = 0; +static volatile ulong timestamp = 0;

+static u32 irq_off_msecs(void) +{
u64 t;

u32 d = get_tbclk();

if (!d)
return 0;
t = get_ticks() - irq_off_ticks;

t *= 1000;

do_div(t, d);

return t;
+}
void enable_interrupts(void) {

ulong msr = get_msr ();

if (IS_ENABLED(CONFIG_GET_TIMER_IRQ) && !(msr & MSR_EE)) {
/* Account for the time interrupts were off. */
timestamp += irq_off_msecs();
interrupts_enabled = 1;
}

set_msr (get_msr () | MSR_EE); }
@@ -50,6 +78,13 @@ int disable_interrupts(void) ulong msr = get_msr ();

set_msr (msr & ~MSR_EE);
if (IS_ENABLED(CONFIG_GET_TIMER_IRQ) && (msr & MSR_EE)) {
/* Record when interrupts were disabled. */
irq_off_ticks = get_ticks();
interrupts_enabled = 0;
}

return ((msr & MSR_EE) != 0); }
@@ -61,13 +96,11 @@ int interrupt_init(void)

set_dec (decrementer_count);

set_msr (get_msr () | MSR_EE);

enable_interrupts();

return (0); }

-static volatile ulong timestamp = 0;

void timer_interrupt(struct pt_regs *regs) { /* call cpu specific function from $(CPU)/interrupts.c */

@@ -90,6 +123,12 @@ void timer_interrupt(struct pt_regs *regs)

ulong get_timer (ulong base) {

return (timestamp - base);

ulong ts = timestamp;

/* If called within an irq-off section, account for the time since

irqs were turned off. */
if (IS_ENABLED(CONFIG_GET_TIMER_IRQ) && !interrupts_enabled)
ts += irq_off_msecs();
return (ts - base); } #endif /* !CONFIG_MPC83XX_TIMER */

Viele Grüße, Stefan

-- DENX Software Engineering GmbH, Managing Director: Wolfgang Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-51 Fax: (+49)-8142-66989-80 Email: sr@denx.de

Stefan Roese

15 Apr 15 Apr

11:38 a.m.

New subject: [PATCH 0/3] watchdog: honour hw_margin_ms property

On 13.03.20 17:04, Rasmus Villemoes wrote:

...

Some watchdogs must be reset more often than the once-per-second ratelimit used by the generic watchdog_reset function in wdt-uclass.c. There's precedent (from the gpio-wdt driver in linux) for using a property called hw_margin_ms to let the device tree tell the driver how often the device needs resetting. So use that generically. No change in default behaviour.

On top of https://patchwork.ozlabs.org/patch/1242772/ .

Stefan, something like this? That at least solves half my problems and might be useful to others as well. Then I'll have to figure out the time-stands-still problem in some other way.

Rasmus Villemoes (3): watchdog: remove stale ifndef CONFIG_WATCHDOG_TIMEOUT_MSECS from wdt.h watchdog: move initr_watchdog() to wdt-uclass.c watchdog: honour hw_margin_ms DT property

drivers/watchdog/wdt-uclass.c | 43 ++++++++++++++++++++++++++++++++++- include/wdt.h | 38 +------------------------------ 2 files changed, 43 insertions(+), 38 deletions(-)

Series applied to u-boot-marvell/master

Thanks, Stefan

1803

Age (days ago)

1887

Last active (days ago)

List overview

Download

19 comments

2 participants

tags (0)

participants (2)

Rasmus Villemoes
Stefan Roese