[PATCH 0/2] net: Fix Ethernet PHY detection on Beagleplay

Hi,
Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
One theory is that the PHY is configured to operate on MDIO address 0 which it treats as a special broadcast address.
Datasheet states: "PHYAD (config pins) sets the PHY address for the device. The RTL8211F(I)/RTL8211FD(I) supports PHY addresses from 0x01 to 0x07. Note 1: An MDIO command with PHY address=0 is a broadcast from the MAC; each PHY device should respond."
This issue is not seen with the other PHY (different make) on the board which is configured for address 0x1.
As a woraround we try to probe the PHY multiple times instead of giving up on the first attempt.
cheers, -roger
Roger Quadros (2): net: ti: am65-cpsw-nuss: Workaround for buggy PHY/Board net: phy: Change "PHY not found" message to debug()
drivers/net/phy/phy.c | 2 +- drivers/net/ti/am65-cpsw-nuss.c | 19 ++++++++++++++----- 2 files changed, 15 insertions(+), 6 deletions(-)
base-commit: a169438411f9277cc689c14078151aa1d1caae3c

Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
One theory is that the PHY is configured to operate on MDIO address 0 which it treats as a special broadcast address.
Datasheet states: "PHYAD (config pins) sets the PHY address for the device. The RTL8211F(I)/RTL8211FD(I) supports PHY addresses from 0x01 to 0x07. Note 1: An MDIO command with PHY address=0 is a broadcast from the MAC; each PHY device should respond."
This issue is not seen with the other PHY (different make) on the board which is configured for address 0x1.
As a woraround we try to probe the PHY multiple times instead of giving up on the first attempt.
Signed-off-by: Roger Quadros rogerq@kernel.org --- drivers/net/ti/am65-cpsw-nuss.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ti/am65-cpsw-nuss.c b/drivers/net/ti/am65-cpsw-nuss.c index 51a8167d14..4f8a2f9b93 100644 --- a/drivers/net/ti/am65-cpsw-nuss.c +++ b/drivers/net/ti/am65-cpsw-nuss.c @@ -27,6 +27,7 @@ #include <syscon.h> #include <linux/bitops.h> #include <linux/soc/ti/ti-udma.h> +#include <linux/delay.h>
#include "cpsw_mdio.h"
@@ -678,14 +679,22 @@ static int am65_cpsw_phy_init(struct udevice *dev) struct am65_cpsw_priv *priv = dev_get_priv(dev); struct am65_cpsw_common *cpsw_common = priv->cpsw_common; struct eth_pdata *pdata = dev_get_plat(dev); - struct phy_device *phydev; u32 supported = PHY_GBIT_FEATURES; + struct phy_device *phydev; + int tries; int ret;
- phydev = phy_connect(cpsw_common->bus, - priv->phy_addr, - priv->dev, - pdata->phy_interface); + /* Some boards (e.g. beagleplay) don't work on first go */ + for (tries = 1; tries <= 5; tries++) { + phydev = phy_connect(cpsw_common->bus, + priv->phy_addr, + priv->dev, + pdata->phy_interface); + if (phydev) + break; + + mdelay(10); + }
if (!phydev) { dev_err(dev, "phy_connect() failed\n");

Roger,
On 22/08/23 17:43, Roger Quadros wrote:
Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
Based on the datasheet at: https://datasheet.lcsc.com/lcsc/1912111437_Realtek-Semicon-RTL8211F-CG_C1879... in the section 7.17. PHY Reset (Hardware Reset), it is stated that software has to wait for at least 50 ms before accessing the PHY registers. Is this why the for-loop in the code below tries for at most 5 times with a delay of 10 ms before the next try? If yes, then isn't it better to move the delay into the realtek PHY driver at drivers/net/phy/realtek.c Shouldn't it be the PHY driver which ensures that any reads/writes to the PHY registers are valid? It can ensure this by having a one time 50ms delay for the very first access to the PHY registers.
One theory is that the PHY is configured to operate on MDIO address 0 which it treats as a special broadcast address.
Datasheet states: "PHYAD (config pins) sets the PHY address for the device. The RTL8211F(I)/RTL8211FD(I) supports PHY addresses from 0x01 to 0x07. Note 1: An MDIO command with PHY address=0 is a broadcast from the MAC; each PHY device should respond."
This issue is not seen with the other PHY (different make) on the board which is configured for address 0x1.
As a woraround we try to probe the PHY multiple times instead of giving up on the first attempt.
Signed-off-by: Roger Quadros rogerq@kernel.org
drivers/net/ti/am65-cpsw-nuss.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ti/am65-cpsw-nuss.c b/drivers/net/ti/am65-cpsw-nuss.c index 51a8167d14..4f8a2f9b93 100644 --- a/drivers/net/ti/am65-cpsw-nuss.c +++ b/drivers/net/ti/am65-cpsw-nuss.c @@ -27,6 +27,7 @@ #include <syscon.h> #include <linux/bitops.h> #include <linux/soc/ti/ti-udma.h> +#include <linux/delay.h>
#include "cpsw_mdio.h"
@@ -678,14 +679,22 @@ static int am65_cpsw_phy_init(struct udevice *dev) struct am65_cpsw_priv *priv = dev_get_priv(dev); struct am65_cpsw_common *cpsw_common = priv->cpsw_common; struct eth_pdata *pdata = dev_get_plat(dev);
- struct phy_device *phydev; u32 supported = PHY_GBIT_FEATURES;
- struct phy_device *phydev;
- int tries; int ret;
- phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
/* Some boards (e.g. beagleplay) don't work on first go */
for (tries = 1; tries <= 5; tries++) {
phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
if (phydev)
break;
mdelay(10);
}
if (!phydev) { dev_err(dev, "phy_connect() failed\n");

Hi Siddharth,
On 23/08/2023 07:35, Siddharth Vadapalli wrote:
Roger,
On 22/08/23 17:43, Roger Quadros wrote:
Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
Based on the datasheet at: https://datasheet.lcsc.com/lcsc/1912111437_Realtek-Semicon-RTL8211F-CG_C1879... in the section 7.17. PHY Reset (Hardware Reset), it is stated that software has to wait for at least 50 ms before accessing the PHY registers. Is this why the for-loop in the code below tries for at most 5 times with a delay of 10 ms before the next try? If yes, then isn't it
Good catch. This might be the reason why PHY is not responding in the first few attempts.
better to move the delay into the realtek PHY driver at drivers/net/phy/realtek.c
We are currently at the MDIO bus driver where we don't even know what PHY is on the bus. So this delay cannot come at the realtec PHY driver. It has to come at the MDIO bus driver level.
Shouldn't it be the PHY driver which ensures that any reads/writes to the PHY registers are valid? It can ensure this by having a one time 50ms delay for the very first access to the PHY registers.>
One theory is that the PHY is configured to operate on MDIO address 0 which it treats as a special broadcast address.
Datasheet states: "PHYAD (config pins) sets the PHY address for the device. The RTL8211F(I)/RTL8211FD(I) supports PHY addresses from 0x01 to 0x07. Note 1: An MDIO command with PHY address=0 is a broadcast from the MAC; each PHY device should respond."
This issue is not seen with the other PHY (different make) on the board which is configured for address 0x1.
As a woraround we try to probe the PHY multiple times instead of giving up on the first attempt.
Signed-off-by: Roger Quadros rogerq@kernel.org
drivers/net/ti/am65-cpsw-nuss.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ti/am65-cpsw-nuss.c b/drivers/net/ti/am65-cpsw-nuss.c index 51a8167d14..4f8a2f9b93 100644 --- a/drivers/net/ti/am65-cpsw-nuss.c +++ b/drivers/net/ti/am65-cpsw-nuss.c @@ -27,6 +27,7 @@ #include <syscon.h> #include <linux/bitops.h> #include <linux/soc/ti/ti-udma.h> +#include <linux/delay.h>
#include "cpsw_mdio.h"
@@ -678,14 +679,22 @@ static int am65_cpsw_phy_init(struct udevice *dev) struct am65_cpsw_priv *priv = dev_get_priv(dev); struct am65_cpsw_common *cpsw_common = priv->cpsw_common; struct eth_pdata *pdata = dev_get_plat(dev);
- struct phy_device *phydev; u32 supported = PHY_GBIT_FEATURES;
- struct phy_device *phydev;
- int tries; int ret;
- phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
/* Some boards (e.g. beagleplay) don't work on first go */
for (tries = 1; tries <= 5; tries++) {
phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
if (phydev)
break;
mdelay(10);
}
if (!phydev) { dev_err(dev, "phy_connect() failed\n");

On 23/08/23 13:22, Roger Quadros wrote:
Hi Siddharth,
On 23/08/2023 07:35, Siddharth Vadapalli wrote:
Roger,
On 22/08/23 17:43, Roger Quadros wrote:
Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
Based on the datasheet at: https://datasheet.lcsc.com/lcsc/1912111437_Realtek-Semicon-RTL8211F-CG_C1879... in the section 7.17. PHY Reset (Hardware Reset), it is stated that software has to wait for at least 50 ms before accessing the PHY registers. Is this why the for-loop in the code below tries for at most 5 times with a delay of 10 ms before the next try? If yes, then isn't it
Good catch. This might be the reason why PHY is not responding in the first few attempts.
better to move the delay into the realtek PHY driver at drivers/net/phy/realtek.c
We are currently at the MDIO bus driver where we don't even know what PHY is on the bus. So this delay cannot come at the realtec PHY driver. It has to come at the MDIO bus driver level.
Thank you for clarifying.
Shouldn't it be the PHY driver which ensures that any reads/writes to the PHY registers are valid? It can ensure this by having a one time 50ms delay for the very first access to the PHY registers.>
...

On 23/08/2023 11:02, Siddharth Vadapalli wrote:
On 23/08/23 13:22, Roger Quadros wrote:
Hi Siddharth,
On 23/08/2023 07:35, Siddharth Vadapalli wrote:
Roger,
On 22/08/23 17:43, Roger Quadros wrote:
Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
Based on the datasheet at: https://datasheet.lcsc.com/lcsc/1912111437_Realtek-Semicon-RTL8211F-CG_C1879... in the section 7.17. PHY Reset (Hardware Reset), it is stated that software has to wait for at least 50 ms before accessing the PHY registers. Is this why the for-loop in the code below tries for at most 5 times with a delay of 10 ms before the next try? If yes, then isn't it
Good catch. This might be the reason why PHY is not responding in the first few attempts.
better to move the delay into the realtek PHY driver at drivers/net/phy/realtek.c
We are currently at the MDIO bus driver where we don't even know what PHY is on the bus. So this delay cannot come at the realtec PHY driver. It has to come at the MDIO bus driver level.
Thank you for clarifying.
Looking closer, this is already being addressed by drivers/net/eth-phy-uclass.c in eth_phy_pre_probe()
linux/Documentation/devicetree/bindings/net/ethernet-phy.yaml
reset-assert-us: description: Delay after the reset was asserted in microseconds. If this property is missing the delay will be skipped.
reset-deassert-us: description: Delay after the reset was deasserted in microseconds. If this property is missing the delay will be skipped.
So we can drop this patch once we implement proper uclass driver for cpsw-mdio.
Shouldn't it be the PHY driver which ensures that any reads/writes to the PHY registers are valid? It can ensure this by having a one time 50ms delay for the very first access to the PHY registers.>
...

On 23/08/2023 11:14, Roger Quadros wrote:
On 23/08/2023 11:02, Siddharth Vadapalli wrote:
On 23/08/23 13:22, Roger Quadros wrote:
Hi Siddharth,
On 23/08/2023 07:35, Siddharth Vadapalli wrote:
Roger,
On 22/08/23 17:43, Roger Quadros wrote:
Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
Based on the datasheet at: https://datasheet.lcsc.com/lcsc/1912111437_Realtek-Semicon-RTL8211F-CG_C1879... in the section 7.17. PHY Reset (Hardware Reset), it is stated that software has to wait for at least 50 ms before accessing the PHY registers. Is this why the for-loop in the code below tries for at most 5 times with a delay of 10 ms before the next try? If yes, then isn't it
At power on, the constraint is even larger. Please see Section 9.3 Power sequence.
PHY registers can be accessed at T2 which is sum of Rt3, Rt4, Rt5 which is about 146.5ms after Power Active and PHY Reset de-asserted.
Good catch. This might be the reason why PHY is not responding in the first few attempts.
better to move the delay into the realtek PHY driver at drivers/net/phy/realtek.c
We are currently at the MDIO bus driver where we don't even know what PHY is on the bus. So this delay cannot come at the realtec PHY driver. It has to come at the MDIO bus driver level.
Thank you for clarifying.
Looking closer, this is already being addressed by drivers/net/eth-phy-uclass.c in eth_phy_pre_probe()
linux/Documentation/devicetree/bindings/net/ethernet-phy.yaml
reset-assert-us: description: Delay after the reset was asserted in microseconds. If this property is missing the delay will be skipped.
reset-deassert-us: description: Delay after the reset was deasserted in microseconds. If this property is missing the delay will be skipped.
So we can drop this patch once we implement proper uclass driver for cpsw-mdio.
Shouldn't it be the PHY driver which ensures that any reads/writes to the PHY registers are valid? It can ensure this by having a one time 50ms delay for the very first access to the PHY registers.>
...

Hello Roger,
On 22-08-2023 17:43, Roger Quadros wrote:
Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
One theory is that the PHY is configured to operate on MDIO address 0 which it treats as a special broadcast address.
Datasheet states: "PHYAD (config pins) sets the PHY address for the device. The RTL8211F(I)/RTL8211FD(I) supports PHY addresses from 0x01 to 0x07. Note 1: An MDIO command with PHY address=0 is a broadcast from the MAC; each PHY device should respond."
This issue is not seen with the other PHY (different make) on the board which is configured for address 0x1.
As a woraround we try to probe the PHY multiple times instead of giving up on the first attempt.
Signed-off-by: Roger Quadros rogerq@kernel.org
drivers/net/ti/am65-cpsw-nuss.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ti/am65-cpsw-nuss.c b/drivers/net/ti/am65-cpsw-nuss.c index 51a8167d14..4f8a2f9b93 100644 --- a/drivers/net/ti/am65-cpsw-nuss.c +++ b/drivers/net/ti/am65-cpsw-nuss.c @@ -27,6 +27,7 @@ #include <syscon.h> #include <linux/bitops.h> #include <linux/soc/ti/ti-udma.h> +#include <linux/delay.h>
#include "cpsw_mdio.h"
@@ -678,14 +679,22 @@ static int am65_cpsw_phy_init(struct udevice *dev) struct am65_cpsw_priv *priv = dev_get_priv(dev); struct am65_cpsw_common *cpsw_common = priv->cpsw_common; struct eth_pdata *pdata = dev_get_plat(dev);
- struct phy_device *phydev; u32 supported = PHY_GBIT_FEATURES;
- struct phy_device *phydev;
- int tries; int ret;
- phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
- /* Some boards (e.g. beagleplay) don't work on first go */
- for (tries = 1; tries <= 5; tries++) {
phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
if (phydev)
break;
mdelay(10);
- }
After rethinking about the above implementation and the second patch of this series, the second patch could be dropped altogether if the following implementation is acceptable:
phydev = phy_connect(cpsw_common->bus, priv->phy_addr, priv->dev, pdata->phy_interface);
if (!phydev) { /* Some boards (e.g. beagleplay) don't work on first go */ mdelay(50); phydev = phy_connect(cpsw_common->bus, priv->phy_addr, priv->dev, pdata->phy_interface); }
if (!phydev) { dev_err(dev, "phy_connect() failed\n"); ...
With this, there would be at most one "PHY not found" print, which should be fine. The mdelay value of 50 could be replaced with a sufficiently large value which guarantees success for Beagleplay.

On Thu, Aug 24, 2023 at 11:34:29PM +0530, Siddharth Vadapalli wrote:
Hello Roger,
On 22-08-2023 17:43, Roger Quadros wrote:
Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
One theory is that the PHY is configured to operate on MDIO address 0 which it treats as a special broadcast address.
Datasheet states: "PHYAD (config pins) sets the PHY address for the device. The RTL8211F(I)/RTL8211FD(I) supports PHY addresses from 0x01 to 0x07. Note 1: An MDIO command with PHY address=0 is a broadcast from the MAC; each PHY device should respond."
This issue is not seen with the other PHY (different make) on the board which is configured for address 0x1.
As a woraround we try to probe the PHY multiple times instead of giving up on the first attempt.
Signed-off-by: Roger Quadros rogerq@kernel.org
drivers/net/ti/am65-cpsw-nuss.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ti/am65-cpsw-nuss.c b/drivers/net/ti/am65-cpsw-nuss.c index 51a8167d14..4f8a2f9b93 100644 --- a/drivers/net/ti/am65-cpsw-nuss.c +++ b/drivers/net/ti/am65-cpsw-nuss.c @@ -27,6 +27,7 @@ #include <syscon.h> #include <linux/bitops.h> #include <linux/soc/ti/ti-udma.h> +#include <linux/delay.h>
#include "cpsw_mdio.h"
@@ -678,14 +679,22 @@ static int am65_cpsw_phy_init(struct udevice *dev) struct am65_cpsw_priv *priv = dev_get_priv(dev); struct am65_cpsw_common *cpsw_common = priv->cpsw_common; struct eth_pdata *pdata = dev_get_plat(dev);
- struct phy_device *phydev; u32 supported = PHY_GBIT_FEATURES;
- struct phy_device *phydev;
- int tries; int ret;
- phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
- /* Some boards (e.g. beagleplay) don't work on first go */
- for (tries = 1; tries <= 5; tries++) {
phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
if (phydev)
break;
mdelay(10);
- }
After rethinking about the above implementation and the second patch of this series, the second patch could be dropped altogether if the following implementation is acceptable:
phydev = phy_connect(cpsw_common->bus, priv->phy_addr, priv->dev, pdata->phy_interface);
if (!phydev) { /* Some boards (e.g. beagleplay) don't work on first go */ mdelay(50); phydev = phy_connect(cpsw_common->bus, priv->phy_addr, priv->dev, pdata->phy_interface); }
if (!phydev) { dev_err(dev, "phy_connect() failed\n"); ...
With this, there would be at most one "PHY not found" print, which should be fine. The mdelay value of 50 could be replaced with a sufficiently large value which guarantees success for Beagleplay.
Ramon, thoughts?

On 24/08/2023 21:24, Tom Rini wrote:
On Thu, Aug 24, 2023 at 11:34:29PM +0530, Siddharth Vadapalli wrote:
Hello Roger,
On 22-08-2023 17:43, Roger Quadros wrote:
Beagleplay has a buggy Ethernet PHY implementation for the Gigabit PHY in the sense that it is non responsive over MDIO immediately after power-up/reset.
We need to either try multiple times or wait sufficiently long enough (couple of 10s of ms?) before the PHY begins to respond correctly.
One theory is that the PHY is configured to operate on MDIO address 0 which it treats as a special broadcast address.
Datasheet states: "PHYAD (config pins) sets the PHY address for the device. The RTL8211F(I)/RTL8211FD(I) supports PHY addresses from 0x01 to 0x07. Note 1: An MDIO command with PHY address=0 is a broadcast from the MAC; each PHY device should respond."
This issue is not seen with the other PHY (different make) on the board which is configured for address 0x1.
As a woraround we try to probe the PHY multiple times instead of giving up on the first attempt.
Signed-off-by: Roger Quadros rogerq@kernel.org
drivers/net/ti/am65-cpsw-nuss.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ti/am65-cpsw-nuss.c b/drivers/net/ti/am65-cpsw-nuss.c index 51a8167d14..4f8a2f9b93 100644 --- a/drivers/net/ti/am65-cpsw-nuss.c +++ b/drivers/net/ti/am65-cpsw-nuss.c @@ -27,6 +27,7 @@ #include <syscon.h> #include <linux/bitops.h> #include <linux/soc/ti/ti-udma.h> +#include <linux/delay.h>
#include "cpsw_mdio.h"
@@ -678,14 +679,22 @@ static int am65_cpsw_phy_init(struct udevice *dev) struct am65_cpsw_priv *priv = dev_get_priv(dev); struct am65_cpsw_common *cpsw_common = priv->cpsw_common; struct eth_pdata *pdata = dev_get_plat(dev);
- struct phy_device *phydev; u32 supported = PHY_GBIT_FEATURES;
- struct phy_device *phydev;
- int tries; int ret;
- phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
- /* Some boards (e.g. beagleplay) don't work on first go */
- for (tries = 1; tries <= 5; tries++) {
phydev = phy_connect(cpsw_common->bus,
priv->phy_addr,
priv->dev,
pdata->phy_interface);
if (phydev)
break;
mdelay(10);
- }
After rethinking about the above implementation and the second patch of this series, the second patch could be dropped altogether if the following implementation is acceptable:
phydev = phy_connect(cpsw_common->bus, priv->phy_addr, priv->dev, pdata->phy_interface);
if (!phydev) { /* Some boards (e.g. beagleplay) don't work on first go */ mdelay(50); phydev = phy_connect(cpsw_common->bus, priv->phy_addr, priv->dev, pdata->phy_interface); }
if (!phydev) { dev_err(dev, "phy_connect() failed\n"); ...
With this, there would be at most one "PHY not found" print, which should be fine. The mdelay value of 50 could be replaced with a sufficiently large value which guarantees success for Beagleplay.
Ramon, thoughts?
Even a single "PHY not found" print is not OK. It looks like an error while it should not.
The correct solution is to use the MDIO uclass framework and add some generic handling in the class driver. drivers/net/eth-phy-uclass.c
We could provide the delay time in the 'reset-deassert-us' property. Although I'm not sure if this is the correct property for this case as there is no RESET GPIO on the board. What we really want is delay from power-on-reset. Which means we might have to introduce a new property and use time from boot to determine if PHY is ready or not?
NOTE: PHY ready time is different for hardware reset and power-on-reset. 50ms vs 150ms

Roger,
On 25/08/23 00:39, Roger Quadros wrote:
On 24/08/2023 21:24, Tom Rini wrote:
On Thu, Aug 24, 2023 at 11:34:29PM +0530, Siddharth Vadapalli wrote:
Hello Roger,
On 22-08-2023 17:43, Roger Quadros wrote:
...
Even a single "PHY not found" print is not OK. It looks like an error while it should not.
The correct solution is to use the MDIO uclass framework and add some generic handling in the class driver. drivers/net/eth-phy-uclass.c
We could provide the delay time in the 'reset-deassert-us' property. Although I'm not sure if this is the correct property for this case as there is no RESET GPIO on the board. What we really want is delay from power-on-reset. Which means we might have to introduce a new property and use time from boot to determine if PHY is ready or not?
NOTE: PHY ready time is different for hardware reset and power-on-reset. 50ms vs 150ms
Another alternative could be that of implementing the for-loop within phy_connect() along with the delay, by updating the function with a new parameter "tries" which indicates the number of retries before finally printing "PHY not found" in case of an error. Optionally, another parameter "delay" can be added, which indicates the delay within the for-loop.
This implementation will require updating all drivers in drivers/net which use phy_connect(), with the "tries" parameter set to 1 for them. The am65-cpsw-nuss driver can set "tries" to 5 as done in the current patch.
The idea behind moving the for-loop within phy_connect() is that it will help solve the issue for other drivers as well, if they potentially encounter such buggy PHYs in future boards.

Siddharth,
On 25/08/2023 08:42, Siddharth Vadapalli wrote:
Roger,
On 25/08/23 00:39, Roger Quadros wrote:
On 24/08/2023 21:24, Tom Rini wrote:
On Thu, Aug 24, 2023 at 11:34:29PM +0530, Siddharth Vadapalli wrote:
Hello Roger,
On 22-08-2023 17:43, Roger Quadros wrote:
...
Even a single "PHY not found" print is not OK. It looks like an error while it should not.
The correct solution is to use the MDIO uclass framework and add some generic handling in the class driver. drivers/net/eth-phy-uclass.c
We could provide the delay time in the 'reset-deassert-us' property. Although I'm not sure if this is the correct property for this case as there is no RESET GPIO on the board. What we really want is delay from power-on-reset. Which means we might have to introduce a new property and use time from boot to determine if PHY is ready or not?
NOTE: PHY ready time is different for hardware reset and power-on-reset. 50ms vs 150ms
Another alternative could be that of implementing the for-loop within phy_connect() along with the delay, by updating the function with a new parameter "tries" which indicates the number of retries before finally printing "PHY not found" in case of an error. Optionally, another parameter "delay" can be added, which indicates the delay within the for-loop.
This implementation will require updating all drivers in drivers/net which use phy_connect(), with the "tries" parameter set to 1 for them. The am65-cpsw-nuss driver can set "tries" to 5 as done in the current patch.
The idea behind moving the for-loop within phy_connect() is that it will help solve the issue for other drivers as well, if they potentially encounter such buggy PHYs in future boards.
This doesn't sound like a clean solution. All hardware specific details (like reset ready-time) should come from the device tree and not passed around in PHY APIs.
Why to do such an intrusive change instead of fixing the am65-cpsw-nuss driver to properly use the PHY uclass driver model and encode the required delays in the device tree?

On 25/08/23 13:22, Roger Quadros wrote:
Siddharth,
On 25/08/2023 08:42, Siddharth Vadapalli wrote:
Roger,
On 25/08/23 00:39, Roger Quadros wrote:
On 24/08/2023 21:24, Tom Rini wrote:
On Thu, Aug 24, 2023 at 11:34:29PM +0530, Siddharth Vadapalli wrote:
Hello Roger,
On 22-08-2023 17:43, Roger Quadros wrote:
...
This doesn't sound like a clean solution. All hardware specific details (like reset ready-time) should come from the device tree and not passed around in PHY APIs.
Why to do such an intrusive change instead of fixing the am65-cpsw-nuss driver to properly use the PHY uclass driver model and encode the required delays in the device tree?
Encoding delays in device-tree and using them is better. Kindly ignore my suggestion.

Some boards (e.g. Beagleplay) need multiple attempts to detect the PHY and the multiple "PHY not found" prints are not nice.
Change them to debug().
Signed-off-by: Roger Quadros rogerq@kernel.org --- drivers/net/phy/phy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index ae21acb059..3a524bcd81 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -928,7 +928,7 @@ struct phy_device *phy_connect(struct mii_dev *bus, int addr, if (phydev) phy_connect_dev(phydev, dev, interface); else - printf("Could not get PHY for %s: addr %d\n", bus->name, addr); + debug("Could not get PHY for %s: addr %d\n", bus->name, addr); return phydev; }

On 22/08/23 17:43, Roger Quadros wrote:
Some boards (e.g. Beagleplay) need multiple attempts to detect the PHY and the multiple "PHY not found" prints are not nice.
I tried grepping for calls to "phy_connect" across the drivers present in drivers/net. Most of them simply return -ENODEV on failure. The ones which add prints indicating failure, don't use debug(). For this reason, I believe that the drivers which don't have any prints in case of failure might be relying on the printf() within phy_connect() to indicate failure. Therefore, if the printf() is being changed to debug() in phy_connect(), maybe all the drivers which currently return -ENODEV, should have a print added to them as a part of this patch.
Change them to debug().
Signed-off-by: Roger Quadros rogerq@kernel.org
drivers/net/phy/phy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index ae21acb059..3a524bcd81 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -928,7 +928,7 @@ struct phy_device *phy_connect(struct mii_dev *bus, int addr, if (phydev) phy_connect_dev(phydev, dev, interface); else
printf("Could not get PHY for %s: addr %d\n", bus->name, addr);
return phydev;debug("Could not get PHY for %s: addr %d\n", bus->name, addr);
}

On 23/08/2023 07:54, Siddharth Vadapalli wrote:
On 22/08/23 17:43, Roger Quadros wrote:
Some boards (e.g. Beagleplay) need multiple attempts to detect the PHY and the multiple "PHY not found" prints are not nice.
I tried grepping for calls to "phy_connect" across the drivers present in drivers/net. Most of them simply return -ENODEV on failure. The ones which add prints indicating failure, don't use debug(). For this reason, I believe that the drivers which don't have any prints in case of failure might be relying on the printf() within phy_connect() to indicate failure. Therefore, if the printf() is being changed to debug() in phy_connect(), maybe all the drivers which currently return -ENODEV, should have a print added to them as a part of this patch.
Sounds reasonable. I can do that.
Change them to debug().
Signed-off-by: Roger Quadros rogerq@kernel.org
drivers/net/phy/phy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index ae21acb059..3a524bcd81 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -928,7 +928,7 @@ struct phy_device *phy_connect(struct mii_dev *bus, int addr, if (phydev) phy_connect_dev(phydev, dev, interface); else
printf("Could not get PHY for %s: addr %d\n", bus->name, addr);
return phydev;debug("Could not get PHY for %s: addr %d\n", bus->name, addr);
}
participants (3)
-
Roger Quadros
-
Siddharth Vadapalli
-
Tom Rini