[U-Boot] Intel e1000 PHY HW reset timeout reasons?

Hello,
I am currently working on U-Boot mainline code supporting Marvell Armada ARMv8 platforms (80x0, 70x0, 37x0). During coding and testing I found a very strange issue related to the U-Boot e1000 network driver. The PCIe e1000 NIC was tested on two platforms based on the same SoC (8040). On one of these platforms the e1000 card is properly detected on the PCIe bus, but the driver fails to bring up the hardware because of a PHY HW reset timeout. If I bypass the timeout error, the e1000 card works without any problem, and I am able to run a full Linux system with root on NFS using this network card. Below is the (ugly) change that works around the PHY reset timeout for me:
diff --git a/drivers/net/e1000.c b/drivers/net/e1000.c
index 3332ad9..a40b009 100644
--- a/drivers/net/e1000.c
+++ b/drivers/net/e1000.c
@@ -4361,7 +4361,7 @@ e1000_get_phy_cfg_done(struct e1000_hw *hw)
 		if (!timeout) {
 			DEBUGOUT("MNG configuration cycle has not "
 					"completed.\n");
-			return -E1000_ERR_RESET;
+//			return -E1000_ERR_RESET;
 		}
 		break;
I found that a similar issue has already been discussed on this mailing list, but the driver code was not changed and the timeout still seems to happen in some test cases:
http://lists.denx.de/pipermail/u-boot/2014-September/188544.html
I would appreciate any ideas about what causes this timeout and how to solve the issue on my development platform.
Both of my platforms are based on the Marvell Armada 8040 SoC.
1. Platform-1: PCIe x1, card reset connected to the system reset wire. No driver workaround (WA) required.
2. Platform-2: PCIe x4, card reset connected to a GPIO, reset released upon PCIe bus probe. The e1000 driver WA is required for normal operation.
==============================================
E1000 debug log on Platform-2:
==============================================
=> pci enum
PCIE-0: Link up (Gen1-x1, Bus0)
=> pci 1
Scanning PCI devices on bus 1
BusDevFun  VendorId   DeviceId   Device Class         Sub-Class
_____________________________________________________________
01.00.00   0x8086     0x107d     Network controller   0x00
=> ping 192.168.100.200
e1000: e1000#0: DEBUG: iobase 0xf6000000
e1000_set_mac_type
e1000_set_media_type
copper interface
e1000_reset_hw
Masking off all interrupts
Issuing a global reset to MAC
Masking off all interrupts
e1000_init_eeprom_params
e1000_read_mac_addr
e1000_read_eeprom
e1000_is_onboard_nvm_eeprom
e1000_acquire_eeprom
e1000_swfw_sync_acquire
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_spi_eeprom_ready
e1000_release_eeprom
e1000_swfw_sync_release
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_read_eeprom
e1000_is_onboard_nvm_eeprom
e1000_acquire_eeprom
e1000_swfw_sync_acquire
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_spi_eeprom_ready
e1000_release_eeprom
e1000_swfw_sync_release
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_read_eeprom
e1000_is_onboard_nvm_eeprom
e1000_acquire_eeprom
e1000_swfw_sync_acquire
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_spi_eeprom_ready
e1000_release_eeprom
e1000_swfw_sync_release
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000: 68:05:ca:12:4d:dc
Warning: e1000#0 MAC addresses don't match:
Address in SROM is 68:05:ca:12:4d:dc
Address in environment is 00:00:00:00:51:81
e1000_reset_hw
Masking off all interrupts
Issuing a global reset to MAC
Masking off all interrupts
e1000_init_hw
e1000_set_media_type
Initializing the IEEE VLAN
e1000_init_rx_addrs
Programming MAC Address into RAR[0]
Clearing RAR[1-15]
Zeroing the MTA
e1000_setup_link
e1000_read_eeprom
e1000_is_onboard_nvm_eeprom
e1000_acquire_eeprom
e1000_swfw_sync_acquire
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_spi_eeprom_ready
e1000_release_eeprom
e1000_swfw_sync_release
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_read_eeprom
e1000_is_onboard_nvm_eeprom
e1000_acquire_eeprom
e1000_swfw_sync_acquire
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_spi_eeprom_ready
e1000_release_eeprom
e1000_swfw_sync_release
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
After fix-ups FlowControl is now = 3
e1000_setup_copper_link
e1000_copper_link_preconfig
e1000_detect_gig_phy
Phy ID = 2a80380
e1000_set_phy_mode
e1000_copper_link_igp_setup
e1000_phy_reset
e1000_phy_hw_reset
Resetting Phy...
e1000_swfw_sync_acquire
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_swfw_sync_release
e1000_get_hw_eeprom_semaphore
e1000_put_hw_eeprom_semaphore
e1000_get_phy_cfg_done
MNG configuration cycle has not completed.
Error Resetting the PHY
e1000: e1000#0: ERROR: Hardware Initialization Failed
ping failed; host 192.168.100.200 is not alive
==============================================
Thank you in advance for your help.
Best Regards
Konstantin Porotchkin

Hi Kostya,
On Wed, Dec 14, 2016 at 10:51 AM, Kostya Porotchkin <kostap@marvell.com> wrote:
> Hello,
> I am currently working on U-Boot mainline code supporting Marvell Armada ARMv8 platforms (80x0, 70x0, 37x0). During coding and testing I found a very strange issue related to the U-Boot e1000 network driver. The PCIe e1000 NIC was tested on two platforms based on the same SoC (8040). On one of these platforms the e1000 card is properly detected on the PCIe bus, but the driver fails to bring up the hardware because of a PHY HW reset timeout. If I bypass the timeout error, the e1000 card works without any problem, and I am able to run a full Linux system with root on NFS using this network card. Below is the (ugly) change that works around the PHY reset timeout for me:
> diff --git a/drivers/net/e1000.c b/drivers/net/e1000.c
> index 3332ad9..a40b009 100644
> --- a/drivers/net/e1000.c
> +++ b/drivers/net/e1000.c
> @@ -4361,7 +4361,7 @@ e1000_get_phy_cfg_done(struct e1000_hw *hw)
>  		if (!timeout) {
>  			DEBUGOUT("MNG configuration cycle has not "
>  					"completed.\n");
> -			return -E1000_ERR_RESET;
> +//			return -E1000_ERR_RESET;
>  		}
>  		break;
So basically if you have the driver lie about the reset succeeding, then it's OK. Maybe the reset timeout needs to be longer? Maybe the hardware isn't reporting the reset completion properly?
> I found that a similar issue has already been discussed on this mailing list, but the driver code was not changed and the timeout still seems to happen in some test cases:
> http://lists.denx.de/pipermail/u-boot/2014-September/188544.html
> I would appreciate any ideas about what causes this timeout and how to solve the issue on my development platform.
> Both of my platforms are based on the Marvell Armada 8040 SoC.
Maybe Prafulla has some ideas?
U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot

Hi, Joe,
Thank you for your response! I actually tried increasing the after-reset delay to 500 ms and even to 1000 ms, but got the same result. It looks like I have hit a corner case in which the U-Boot e1000 driver expects a semaphore release from the e1000 FW and, for some reason, it does not happen. My guess is that when the e1000 reset signal is released at the same time as the board reset, the U-Boot driver has not started yet, so the e1000 FW becomes ready before the driver starts talking to the NIC HW. It would be interesting to know whether Intel has a FW update for this card type. Such an update might fix this driver-FW lock issue.
Regards
Kosta
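If the timeout does turn out to be harmless on one board only, a less invasive form of the workaround from the first message would be to make it an explicit, per-board opt-in instead of commenting out the error return for everyone. The sketch below only illustrates that decision logic under stated assumptions: `struct cfg_done_quirk`, `tolerate_timeout`, `cfg_done_result` and the `ERR_RESET` placeholder are hypothetical names, not part of the U-Boot driver, and no such option is known to exist there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Placeholder error code standing in for the driver's reset error. */
#define ERR_RESET 9

/* Hypothetical per-board quirk; strict behaviour is the default. */
struct cfg_done_quirk {
	bool tolerate_timeout; /* true: warn and continue on timeout */
};

/*
 * Map the outcome of the configuration-done poll to a return code.
 * done_seen is true when the done bit was observed before the timeout.
 */
static int cfg_done_result(bool done_seen, const struct cfg_done_quirk *quirk)
{
	assert(quirk != NULL);

	if (done_seen)
		return 0;

	if (quirk->tolerate_timeout) {
		/* Keep the condition visible instead of hiding it. */
		fprintf(stderr,
			"warning: MNG configuration cycle has not completed, continuing\n");
		return 0;
	}

	return -ERR_RESET;
}
```

Boards that do not set the flag keep the current failure behaviour, and a board that does set it still logs the condition on every initialization, so the underlying problem is not silently masked.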
________________________________________
From: Joe Hershberger <joe.hershberger@gmail.com>
Sent: Thursday, February 2, 2017 20:46
To: Kostya Porotchkin
Cc: u-boot@lists.denx.de; joe.hershberger@ni.com; rabeeh@solid-run.com; Prafulla Wadaskar
Subject: [EXT] Re: [U-Boot] Intel e1000 PHY HW reset timeout reasons?
Participants (2): Joe Hershberger, Kostya Porotchkin