[U-Boot] [PATCH 0/3] Fix for rare beaglebone ethernet failures

There have been reports of ethernet network failures on beagle bone black. These have been root caused to ethernet PHY latching to a different address on boot than what is programmed in DT.
See: https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/1FuI_i5KW10J
This patch series tries to workaround this suitaiton by patching device-tree passed to kernel with the correct address. As well as updating the PHY address U-Boot drivers themselves use.
Tested on beaglebone black rev A4 by simulating the error condition I was also able to hit the actual hardware error once in a while.
Sekhar Nori (3): drivers: net: cpsw: add support to update phy address board: ti: am335x: add support to fixup phy address configs: am335x_evm: enable OF_BOARD_SETUP
board/ti/am335x/board.c | 78 ++++++++++++++++++++++++++++++++++++++++++++ configs/am335x_evm_defconfig | 1 + drivers/net/cpsw.c | 29 ++++++++++++++++ include/cpsw.h | 1 + 4 files changed, 109 insertions(+)

On some boards using TI CPSW, it may be possible that PHY address was not latched correctly, and the actual address that the phy responds on is different from that set in device-tree. For example, see this problem report on beaglebone black:
https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/1FuI_i5KW10J
Add support to check for this condition and use the detected phy address when its safe to do so.
Also, add a public API that exposes the phy address of a given slave. This can be used to update device-tree that is passed to Linux kernel.
Signed-off-by: Sekhar Nori nsekhar@ti.com --- drivers/net/cpsw.c | 29 +++++++++++++++++++++++++++++ include/cpsw.h | 1 + 2 files changed, 30 insertions(+)
diff --git a/drivers/net/cpsw.c b/drivers/net/cpsw.c index c31695eba9dd..8e2a48cfd678 100644 --- a/drivers/net/cpsw.c +++ b/drivers/net/cpsw.c @@ -1008,6 +1008,25 @@ static int cpsw_phy_init(struct cpsw_priv *priv, struct cpsw_slave *slave) return 1; }
+static void cpsw_phy_addr_update(struct cpsw_priv *priv) +{ + struct cpsw_platform_data *data = &priv->data; + u16 alive = mdio_regs->alive & GENMASK(15, 0); + int active = data->active_slave; + int new_addr = ffs(alive) - 1; + + /* + * If there is only one phy alive and its address does not match + * that of active slave, then phy address can safely be updated. + */ + if (hweight16(alive) == 1 && + data->slave_data[active].phy_addr != new_addr) { + printf("Updated phy address for CPSW#%d, old: %d, new: %d\n", + active, data->slave_data[active].phy_addr, new_addr); + data->slave_data[active].phy_addr = new_addr; + } +} + int _cpsw_register(struct cpsw_priv *priv) { struct cpsw_slave *slave; @@ -1034,6 +1053,9 @@ int _cpsw_register(struct cpsw_priv *priv) }
cpsw_mdio_init(priv->dev->name, data->mdio_base, data->mdio_div); + + cpsw_phy_addr_update(priv); + priv->bus = miiphy_get_dev_by_name(priv->dev->name); for_active_slave(slave, priv) cpsw_phy_init(priv, slave); @@ -1458,6 +1480,13 @@ static int cpsw_eth_ofdata_to_platdata(struct udevice *dev) return 0; }
+int cpsw_get_slave_phy_addr(struct udevice *dev, int slave) +{ + struct cpsw_priv *priv = dev_get_priv(dev); + struct cpsw_platform_data *data = &priv->data; + + return data->slave_data[slave].phy_addr; +}
static const struct udevice_id cpsw_eth_ids[] = { { .compatible = "ti,cpsw" }, diff --git a/include/cpsw.h b/include/cpsw.h index f135e7bfe0cc..9f8ce8850f51 100644 --- a/include/cpsw.h +++ b/include/cpsw.h @@ -54,5 +54,6 @@ struct cpsw_platform_data {
int cpsw_register(struct cpsw_platform_data *data); int ti_cm_get_macid(struct udevice *dev, int slave, u8 *mac_addr); +int cpsw_get_slave_phy_addr(struct udevice *dev, int slave);
#endif /* _CPSW_H_ */

On Thu, Aug 23, 2018 at 05:11:29PM +0530, Sekhar Nori wrote:
On some boards using TI CPSW, it may be possible that PHY address was not latched correctly, and the actual address that the phy responds on is different from that set in device-tree. For example, see this problem report on beaglebone black:
https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/1FuI_i5KW10J
Add support to check for this condition and use the detected phy address when its safe to do so.
Also, add a public API that exposes the phy address of a given slave. This can be used to update device-tree that is passed to Linux kernel.
Signed-off-by: Sekhar Nori nsekhar@ti.com
Reviewed-by: Tom Rini trini@konsulko.com

On Thu, Aug 23, 2018 at 05:11:29PM +0530, Sekhar Nori wrote:
On some boards using TI CPSW, it may be possible that PHY address was not latched correctly, and the actual address that the phy responds on is different from that set in device-tree. For example, see this problem report on beaglebone black:
https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/1FuI_i5KW10J
Add support to check for this condition and use the detected phy address when its safe to do so.
Also, add a public API that exposes the phy address of a given slave. This can be used to update device-tree that is passed to Linux kernel.
Signed-off-by: Sekhar Nori nsekhar@ti.com Reviewed-by: Tom Rini trini@konsulko.com
Applied to u-boot/master, thanks!

On beaglebone black, it can so happen that PHY address is not latched correctly on reset and board boots with PHY responding to a different address than that programmed in device-tree. For example, see this report:
https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/1FuI_i5KW10J
Workaround this by fixing up device-tree passed to kernel by using the PHY address detected in hardware.
Beaglebone itself uses only one ethernet port and its DT currently uses phy_id (obsoleted). But the function has been written to handle multiple ports and phy_id as well as phy-handle to make the function more generically useful.
Signed-off-by: Sekhar Nori nsekhar@ti.com --- board/ti/am335x/board.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 78 insertions(+)
diff --git a/board/ti/am335x/board.c b/board/ti/am335x/board.c index a359d20021fd..13845251afb5 100644 --- a/board/ti/am335x/board.c +++ b/board/ti/am335x/board.c @@ -608,6 +608,84 @@ static struct clk_synth cdce913_data = { }; #endif
+#if defined(CONFIG_OF_BOARD_SETUP) && defined(CONFIG_OF_CONTROL) && \ + defined(CONFIG_DM_ETH) && defined(CONFIG_DRIVER_TI_CPSW) + +#define MAX_CPSW_SLAVES 2 + +/* At the moment, we do not want to stop booting for any failures here */ +int ft_board_setup(void *fdt, bd_t *bd) +{ + const char *slave_path, *enet_name; + int enetnode, slavenode, phynode; + struct udevice *ethdev; + char alias[16]; + u32 phy_id[2]; + int phy_addr; + int i, ret; + + /* phy address fixup needed only on beagle bone family */ + if (!board_is_beaglebonex()) + goto done; + + for (i = 0; i < MAX_CPSW_SLAVES; i++) { + sprintf(alias, "ethernet%d", i); + + slave_path = fdt_get_alias(fdt, alias); + if (!slave_path) + continue; + + slavenode = fdt_path_offset(fdt, slave_path); + if (slavenode < 0) + continue; + + enetnode = fdt_parent_offset(fdt, slavenode); + enet_name = fdt_get_name(fdt, enetnode, NULL); + + ethdev = eth_get_dev_by_name(enet_name); + if (!ethdev) + continue; + + phy_addr = cpsw_get_slave_phy_addr(ethdev, i); + + /* check for phy_id as well as phy-handle properties */ + ret = fdtdec_get_int_array_count(fdt, slavenode, "phy_id", + phy_id, 2); + if (ret == 2) { + if (phy_id[1] != phy_addr) { + printf("fixing up phy_id for %s, old: %d, new: %d\n", + alias, phy_id[1], phy_addr); + + phy_id[0] = cpu_to_fdt32(phy_id[0]); + phy_id[1] = cpu_to_fdt32(phy_addr); + do_fixup_by_path(fdt, slave_path, "phy_id", + phy_id, sizeof(phy_id), 0); + } + } else { + phynode = fdtdec_lookup_phandle(fdt, slavenode, + "phy-handle"); + if (phynode < 0) + continue; + + ret = fdtdec_get_int(fdt, phynode, "reg", -ENOENT); + if (ret < 0) + continue; + + if (ret != phy_addr) { + printf("fixing up phy-handle for %s, old: %d, new: %d\n", + alias, ret, phy_addr); + + fdt_setprop_u32(fdt, phynode, "reg", + cpu_to_fdt32(phy_addr)); + } + } + } + +done: + return 0; +} +#endif + /* * Basic board specific setup. Pinmux has been handled already. */

On Thu, Aug 23, 2018 at 05:11:30PM +0530, Sekhar Nori wrote:
On beaglebone black, it can so happen that PHY address is not latched correctly on reset and board boots with PHY responding to a different address than that programmed in device-tree. For example, see this report:
https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/1FuI_i5KW10J
Workaround this by fixing up device-tree passed to kernel by using the PHY address detected in hardware.
Beaglebone itself uses only one ethernet port and its DT currently uses phy_id (obsoleted). But the function has been written to handle multiple ports and phy_id as well as phy-handle to make the function more generically useful.
Signed-off-by: Sekhar Nori nsekhar@ti.com
Reviewed-by: Tom Rini trini@konsulko.com

On Thu, Aug 23, 2018 at 05:11:30PM +0530, Sekhar Nori wrote:
On beaglebone black, it can so happen that PHY address is not latched correctly on reset and board boots with PHY responding to a different address than that programmed in device-tree. For example, see this report:
https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/1FuI_i5KW10J
Workaround this by fixing up device-tree passed to kernel by using the PHY address detected in hardware.
Beaglebone itself uses only one ethernet port and its DT currently uses phy_id (obsoleted). But the function has been written to handle multiple ports and phy_id as well as phy-handle to make the function more generically useful.
Signed-off-by: Sekhar Nori nsekhar@ti.com Reviewed-by: Tom Rini trini@konsulko.com
Applied to u-boot/master, thanks!

Enable CONFIG_OF_BOARD_SETUP as it is needed for Beaglebone black to overwrite the Ethernet phy address present in DT in case the phy latches on to a different address.
Signed-off-by: Sekhar Nori nsekhar@ti.com --- configs/am335x_evm_defconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/configs/am335x_evm_defconfig b/configs/am335x_evm_defconfig index 06fdcd7a86a8..8627d75b1de3 100644 --- a/configs/am335x_evm_defconfig +++ b/configs/am335x_evm_defconfig @@ -6,6 +6,7 @@ CONFIG_SPL=y CONFIG_DEFAULT_DEVICE_TREE="am335x-evm" CONFIG_DISTRO_DEFAULTS=y CONFIG_SPL_LOAD_FIT=y +CONFIG_OF_BOARD_SETUP=y CONFIG_BOOTCOMMAND="if test ${boot_fit} -eq 1; then run update_to_fit; fi; run findfdt; run init_console; run envboot; run distro_bootcmd" CONFIG_SYS_CONSOLE_INFO_QUIET=y CONFIG_VERSION_VARIABLE=y

On Thu, Aug 23, 2018 at 05:11:31PM +0530, Sekhar Nori wrote:
Enable CONFIG_OF_BOARD_SETUP as it is needed for Beaglebone black to overwrite the Ethernet phy address present in DT in case the phy latches on to a different address.
Signed-off-by: Sekhar Nori nsekhar@ti.com
Reviewed-by: Tom Rini trini@konsulko.com

On Thu, Aug 23, 2018 at 05:11:31PM +0530, Sekhar Nori wrote:
Enable CONFIG_OF_BOARD_SETUP as it is needed for Beaglebone black to overwrite the Ethernet phy address present in DT in case the phy latches on to a different address.
Signed-off-by: Sekhar Nori nsekhar@ti.com Reviewed-by: Tom Rini trini@konsulko.com
Applied to u-boot/master, thanks!

On 08/23/2018 06:41 AM, Sekhar Nori wrote:
There have been reports of ethernet network failures on beagle bone black. These have been root caused to ethernet PHY latching to a different address on boot than what is programmed in DT.
See: https://groups.google.com/d/msg/beagleboard/9mctrG26Mc8/1FuI_i5KW10J
This patch series tries to workaround this suitaiton by patching device-tree passed to kernel with the correct address. As well as updating the PHY address U-Boot drivers themselves use.
Tested on beaglebone black rev A4 by simulating the error condition I was also able to hit the actual hardware error once in a while.
Sekhar Nori (3): drivers: net: cpsw: add support to update phy address board: ti: am335x: add support to fixup phy address configs: am335x_evm: enable OF_BOARD_SETUP
board/ti/am335x/board.c | 78 ++++++++++++++++++++++++++++++++++++++++++++ configs/am335x_evm_defconfig | 1 + drivers/net/cpsw.c | 29 ++++++++++++++++ include/cpsw.h | 1 + 4 files changed, 109 insertions(+)
Reviewed-by: Grygorii Strashko grygorii.strashko@ti.com
participants (3)
-
Grygorii Strashko
-
Sekhar Nori
-
Tom Rini