[U-Boot] [PATCH] PowerPC MPC85xx: don't hang on read exception

Set HID1[RFXE] = 1 in cpu/mpc85xx/start.S. When this bit is 0, any condition that asserts the internal core_fault_in* signal will result in a processor hang, recoverable only with reset. When this bit is 1, such a condition will cause a machine check exception and software will have a chance to print an error message.
Conditions that can assert core_fault_in* include ECM local access error (read an unmapped target address), multi-bit ECC error in L2 cache or DDR RAM, localbus parity error, and a variety of PCI errors.
A long discussion of why this bit must be set can be found in, among other places, the "MPC8548E PowerQUICC III Integrated Processor Family Reference Manual" section 6.10.2, table 6-19 "HID1 Field Descriptions." It says that leaving the bit 0 "is not a recommended configuration. The processor may stall indefinitely due to an unreported error."
We have tested the use of this bit for two years, both in u-boot/Linux and in a proprietary operating system, in systems using MPC8541, MPC8545/8, and MPC8536.
Signed-off-by: Andrew Klossner andrew@cesa.opbu.xerox.com
---
 cpu/mpc85xx/start.S |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/cpu/mpc85xx/start.S b/cpu/mpc85xx/start.S
index 80f9677..8dfbc81 100644
--- a/cpu/mpc85xx/start.S
+++ b/cpu/mpc85xx/start.S
@@ -166,6 +166,7 @@ _start_e500:
 #ifndef CONFIG_E500MC
 	li	r0,(HID1_ASTME|HID1_ABE)@l	/* Addr streaming & broadcast */
+	ori	r0,r0,HID1_RFXE@h	/* Enable read fault exceptions */
 	mtspr	HID1,r0
 #endif
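
The halfword arithmetic behind the li/ori pair in the diff can be modelled in C. This is only a sketch: the three mask values are assumptions (they are not stated in this thread and should be checked against the HID1 definitions in the tree being patched), and the helper names are hypothetical. It shows what `@l` and `@h` select and where `ori` and `oris` place a 16-bit immediate.

```c
#include <stdint.h>

/* Assumed mask values, not taken from this thread. */
#define HID1_RFXE  0x00020000u  /* assumption: read fault exception enable */
#define HID1_ASTME 0x00002000u  /* assumption: address streaming mode */
#define HID1_ABE   0x00001000u  /* assumption: address broadcast enable */

/* Assembler modifiers: @l is the low 16 bits, @h the high 16 bits. */
static uint32_t lo16(uint32_t v) { return v & 0xffffu; }
static uint32_t hi16(uint32_t v) { return (v >> 16) & 0xffffu; }

/* li rD,imm: load a sign-extended 16-bit immediate. */
static uint32_t li(uint32_t imm16)
{
    return (uint32_t)(int32_t)(int16_t)(uint16_t)imm16;
}

/* ori rA,rS,imm: OR the immediate into the low halfword. */
static uint32_t ori(uint32_t rs, uint32_t imm16)
{
    return rs | (imm16 & 0xffffu);
}

/* oris rA,rS,imm: OR the immediate into the high halfword. */
static uint32_t oris(uint32_t rs, uint32_t imm16)
{
    return rs | ((imm16 & 0xffffu) << 16);
}

/* Value left in r0 by: li r0,(HID1_ASTME|HID1_ABE)@l */
static uint32_t hid1_base(void)
{
    return li(lo16(HID1_ASTME | HID1_ABE));
}

/* Value after: ori r0,r0,HID1_RFXE@h (the form used in the diff) */
static uint32_t hid1_via_ori(void)
{
    return ori(hid1_base(), hi16(HID1_RFXE));
}

/* Value after: oris r0,r0,HID1_RFXE@h (shifted variant, for comparison) */
static uint32_t hid1_via_oris(void)
{
    return oris(hid1_base(), hi16(HID1_RFXE));
}
```

If HID1_RFXE does lie in the upper halfword, as assumed here, only the oris form yields a value with the RFXE mask set; pairing ori with `@h` ORs the shifted-down constant into the low halfword instead. Whether that applies to this patch depends on the real definition of HID1_RFXE.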

Why not use the interrupt mode? That mode gives you more valuable information. Currently, U-Boot has a small error detection mechanism that works if you configure the interrupts and enable the error interrupts.
Thanks, Dave