[U-Boot] [PATCH] to correct Issue with PowerPc implementation of resetting the watchdog in the decrementer exception handler

Wolfgang,
Sorry about not being clear in my earlier message. Let me attempt to point out why I think the current code is incorrect.
From a systems point of view watchdog is used to protect against a system
becoming hung by bad code or corrupted memory or files. By automatically feeding them the code has circumvented this important feature.
So some examples of where I would expect the watchdog to recover from but with the current code it will not. If u-boot loads a corrupted OS image, when the code starts but before it gets to the point that it has replaced u-boots exception handler it falls into an infinite loop it will hang and the uboot exception handler will continue to keep the system "running". Another example and how we actually discovered this bug due to a leak in a network driver that after many loops would run u-boot out of memory. Once this occurs the hush parser drops into a "for ;;" loop again this is not recovered by the watchdog.
So while I understand that some users may want to have the cpu automatically reset the watchdog, We don't. I believe that setting the default value of CONFIG_SYS_WATCHDOG_FREQ if it was not defined is a serious bug. The feature should not be enabled if I did not define this variable.
I have attached what we believe is a patch to address our concerns and still allow this to be set by a target that wishes to do so.
Comments are appreciated.
--eric --------------------------------- Eric Olsen Staff Software Engineer Google, Inc. 650-253-2767
On Mon, Feb 13, 2012 at 12:36 PM, Wolfgang Denk wd@denx.de wrote:
Dear Eric,
In message <CAOzkcmEydTOcGQ7= LL9NHkyuUtJJbE73DYceOsffphtrGU7ZYg@mail.gmail.com> you wrote:
Today we discovered an issue with the implementation of watchdog on the PowerPc. In the file arch/powerpc/lib/interrupt.c, the code currently automatically feeds the watchdog via the timer interrupt handler if
either
CONFIG_WATCHDOG or CONFIG_HW_WATCHDOG are defined. I am not exactly sure when this got added or why, but I'd guess that for most systems this is
not
a good feature. For us it is exactly the opposite of what we want.
Would
either of you object to a patch that uses a different config flag to
enable
this.
You don't mention it, but I assume you are referring to U-Boot code here.
Please always poost U-Boot related questions to the U-Boot mailing list only (unless you are asking for commercial support from DENX).
I do not exactly understand why you think the current implementation is not OK as it - my guess is that your expectations are higher than what U-Boot offers. U-Boot is a boot loader, not an OS, and we use a lot of simplistic concepts - trading complexity and features which you can expect from a full-flavored OS for small code size and simple, easily maintainable code. U-Boot is capable to triggering a hardware watchdog (which is nevessary, as there are systems where this cannot be switched off or suspended), but we make no attempts to implement watchdog monitoring of the U-Boot code itself. If U-Boot should hang or crash, this is not expected to be detected or recovered by the watchdog.
As such, the existing code does what it is supposed to do.
Patches to add additional, new features to U-Boot (like full watchdog monitoring) are of course welcome. Please post these on the mailing list, then.
Thanks.
Best regards,
Wolfgang Denk
-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de Is a computer language with goto's totally Wirth-less?

Dear Eric,
please don't top-post / full quote!
In message CAOzkcmGNEmMGUvp7ETPTZnWj7ApiMRRD0=7JJG8vXmcNex2gnw@mail.gmail.com you wrote:
Sorry about not being clear in my earlier message. Let me attempt to point out why I think the current code is incorrect.
From a systems point of view watchdog is used to protect against a system becoming hung by bad code or corrupted memory or files. By automatically feeding them the code has circumvented this important feature.
As I explained before, U-Boot does not make any attempts to implement watchdog monitoring of the U-Boot code itself.
So some examples of where I would expect the watchdog to recover from but with the current code it will not. If u-boot loads a corrupted OS image, when the code starts but before it gets to the point that it has replaced u-boots exception handler it falls into an infinite loop it will hang and the uboot exception handler will continue to keep the system "running".
No, this is not correct. U-Boot will disable all interrupts before booting Linux, so the watdoch would reset the board in such situations.
Note that when we load the Linux kernel, we will overwrite all U-Boot exception handlers, so we could not execute these even if we tried.
Another example and how we actually discovered this bug due to a leak in a network driver that after many loops would run u-boot out of memory. Once this occurs the hush parser drops into a "for ;;" loop again this is not recovered by the watchdog.
Indeed - this may happen, and U-Boot currently does not attempt to protect against this.
So while I understand that some users may want to have the cpu automatically reset the watchdog, We don't. I believe that setting the default value of CONFIG_SYS_WATCHDOG_FREQ if it was not defined is a serious bug. The feature should not be enabled if I did not define this variable.
You misunderstand. There is not a feature we can enable: such a feature has not been implemented yet.
If you need it, you must implement it, before you can enable and use it.
As mentioned: patches welcome.
I have attached what we believe is a patch to address our concerns and still allow this to be set by a target that wishes to do so.
Ptaches must be inline (no MIME attachments, and folow certain rules. Please see http://www.denx.de/wiki/U-Boot/Patches for details.
Make automatic watchdog reset optional. Based upon defining CONFIG_SYS_WATCHDOG_FREQ
Signed-off-by: Eric Olsen eolsen@google.com
Sorry, but this will most likely break a numbr of boards.
On which exact hardware configurations did you test your patch?
Best regards,
Wolfgang Denk
participants (2)
-
Eric Olsen
-
Wolfgang Denk