
Hi Stefan,
Make use of this observation then and attempt to detect the inability to negotiate the link speed automatically, and then handle it by hand. Use the Data Link Layer Link Active status flag as the primary indicator of successful link speed negotiation, but given that the flag is optional for hardware to implement (the ASM2824 does have it though), resort to checking for the mandatory Link Bandwidth Management Status flag showing that the link speed or width has been changed in an attempt to correct unreliable link operation (the ASM2824 does set it too).
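As a rough illustration of the check described in this paragraph, here is a minimal sketch in C. The bit masks follow the PCI Express register layout; the helper name link_looks_broken() is hypothetical, and the exact way the two flags are combined is only one reading of the description, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as laid out by the PCI Express Base Specification. */
#define LNKCAP_DLLLARC 0x00100000u /* Link Capabilities: DLL Link Active Reporting Capable */
#define LNKSTA_DLLLA   0x2000u     /* Link Status: Data Link Layer Link Active */
#define LNKSTA_LBMS    0x4000u     /* Link Status: Link Bandwidth Management Status */

/*
 * Hypothetical helper: decide from a downstream port's Link Capabilities
 * and Link Status values whether speed negotiation may have failed.
 */
static bool link_looks_broken(uint32_t lnkcap, uint16_t lnksta)
{
    /* Primary indicator, where implemented: the data link layer is not up. */
    if (lnkcap & LNKCAP_DLLLARC)
        return (lnksta & LNKSTA_DLLLA) == 0;

    /* Fallback: hardware has changed speed or width to cope with errors. */
    return (lnksta & LNKSTA_LBMS) != 0;
}
```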
If these checks indicate that the link may not operate correctly, then poll the Data Link Layer Link Active status flag along with the Link Training flag for the duration of 200ms to see if the link has stabilised, that is, either the Data Link Layer Link Active status flag has been set or Link Training has been inactive during at least the second half of the interval.
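The stabilisation criterion can be sketched as a pure function over Link Status samples. This assumes the samples were taken at a fixed interval across the 200ms window (say 200 samples 1ms apart); the sampling scheme and the name link_stabilised() are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LNKSTA_LT    0x0800u /* Link Status: Link Training */
#define LNKSTA_DLLLA 0x2000u /* Link Status: Data Link Layer Link Active */

/*
 * Hypothetical helper: given n Link Status samples spread evenly over the
 * polling window, report whether the link has stabilised.
 */
static bool link_stabilised(const uint16_t *lnksta, size_t n)
{
    bool quiet = n > 0; /* no Link Training seen in the second half so far */
    size_t i;

    for (i = 0; i < n; i++) {
        /* The data link layer coming up settles the matter at once. */
        if (lnksta[i] & LNKSTA_DLLLA)
            return true;
        /* Training activity only counts against us in the second half. */
        if (i >= n / 2 && (lnksta[i] & LNKSTA_LT))
            quiet = false;
    }
    return quiet;
}
```

A real implementation would more likely read the register in a timed loop and stop early, rather than collect samples first; the array form just makes the rule easy to see.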
If that has indicated failure, reduce the target speed, request a link retrain and check again if the link has stabilised. Repeat until either successful or the link speeds supported by the downstream port have been exhausted.
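The retry loop might look roughly like the sketch below. The port_ops accessors and the fake_port toy model are hypothetical stand-ins so that the example is self-contained; real code would use the config space accessors of U-Boot or Linux, and the starting point (the current Target Link Speed) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LNKCTL2_TLS 0x000fu /* Link Control 2: Target Link Speed */

/* Hypothetical accessors; real code would go through config space. */
struct port_ops {
    uint16_t (*read_lnkctl2)(void *ctx);
    void (*write_lnkctl2)(void *ctx, uint16_t val);
    void (*retrain)(void *ctx);    /* set Retrain Link in Link Control */
    bool (*stabilised)(void *ctx); /* poll Link Status for up to 200ms */
};

/* Returns the speed code that worked, or 0 if all speeds were exhausted. */
static unsigned int retrain_downwards(const struct port_ops *ops, void *ctx)
{
    uint16_t lnkctl2 = ops->read_lnkctl2(ctx);
    unsigned int speed = lnkctl2 & LNKCTL2_TLS;

    while (speed > 1) {
        speed--; /* 3 = 8GT/s, 2 = 5GT/s, 1 = 2.5GT/s */
        lnkctl2 = (uint16_t)((lnkctl2 & ~LNKCTL2_TLS) | speed);
        ops->write_lnkctl2(ctx, lnkctl2);
        ops->retrain(ctx);
        if (ops->stabilised(ctx))
            return speed;
    }
    return 0;
}

/* Toy model for demonstration only: training succeeds at or below max_ok. */
struct fake_port {
    uint16_t lnkctl2;
    unsigned int max_ok;
    unsigned int retrains;
    bool up;
};

static uint16_t fake_read(void *ctx)
{
    return ((struct fake_port *)ctx)->lnkctl2;
}

static void fake_write(void *ctx, uint16_t val)
{
    ((struct fake_port *)ctx)->lnkctl2 = val;
}

static void fake_retrain(void *ctx)
{
    struct fake_port *f = ctx;

    f->retrains++;
    f->up = (f->lnkctl2 & LNKCTL2_TLS) <= f->max_ok;
}

static bool fake_stabilised(void *ctx)
{
    return ((struct fake_port *)ctx)->up;
}

static const struct port_ops fake_ops = {
    fake_read, fake_write, fake_retrain, fake_stabilised
};
```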
So in such cases, the link speed will be downgraded? I would expect at least a big warning in such cases.
I had mixed feelings about such extra clutter and chose not to include it, but perhaps it's worth adding after all, especially with the most recent findings, noted below.
Did you try to change some other configuration options for the link establishment? I remember that on some hardware we were able to get better "link-up results" by setting the de-emphasis level to -3.5dB instead of -6dB (in the link control status register 2), before trying to re-establish the link. Did you also test with "tuning" such parameters? There might be others that I'm missing right now.
Thank you for the suggestion. I've never been too familiar with analogue electronics engineering, so I didn't consider working at that level.
As it has turned out, the ASM2824 has the de-emphasis level already set to -3.5dB by default (at power-up or reset; the power-up default is 0063, and some bits are sticky, so may not change at reset). Interestingly, the bit is defined as HwInit, and as such it is meant to be "read-only after initialization"; however, with the ASM2824 it appears freely writable at any time, and the state reported in other registers and at the other end of the link indicates these changes do take effect. They do not fix the issue with link training though; I have tried both settings to no avail.
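For reference, a small sketch of how that default decodes, assuming 0063 is the hexadecimal value of the Link Control 2 register: Target Link Speed is 3 (8GT/s), Hardware Autonomous Speed Disable is set, and Selectable De-emphasis is set, which selects -3.5dB. The masks follow the PCI Express register layout; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Link Control 2 fields per the PCI Express Base Specification. */
#define LNKCTL2_TLS  0x000fu /* Target Link Speed */
#define LNKCTL2_HASD 0x0020u /* Hardware Autonomous Speed Disable */
#define LNKCTL2_SD   0x0040u /* Selectable De-emphasis: 1 = -3.5dB, 0 = -6dB */

/* Speed code: 1 = 2.5GT/s, 2 = 5GT/s, 3 = 8GT/s. */
static unsigned int target_speed(uint16_t lnkctl2)
{
    return lnkctl2 & LNKCTL2_TLS;
}

static bool deemph_is_3_5db(uint16_t lnkctl2)
{
    return (lnkctl2 & LNKCTL2_SD) != 0;
}

/* Return lnkctl2 with the de-emphasis level switched as requested. */
static uint16_t set_deemph(uint16_t lnkctl2, bool minus_3_5db)
{
    if (minus_3_5db)
        return (uint16_t)(lnkctl2 | LNKCTL2_SD);
    return (uint16_t)(lnkctl2 & ~LNKCTL2_SD);
}
```

With these helpers, switching the default value to -6dB gives 0x0023, and switching back restores 0x0063.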
However while fiddling with the register I have discovered an interesting phenomenon in that the link will actually switch to 5GT/s and then work reliably, provided that it is done in two steps: first clamping the target link speed to 2.5GT/s and letting link training succeed at it, and only then switching the target link speed to 5GT/s (or for that matter 8GT/s). At that point the link changes to 5GT/s instantaneously (there's no Link Training reported active, not even momentarily, or Data Link Layer Link Active reported inactive), as shown by the Link Status Register at both ends (and the de-emphasis level does not matter; it works at either value, as reported in the Link Status 2 register, again at both ends).
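The two-step sequence can be sketched as follows. The port_ops accessors and the fake_port recorder are hypothetical and exist only to keep the example self-contained. Whether a retrain request is also needed in the second step is not stated above, so this sketch issues none there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LNKCTL2_TLS       0x000fu /* Link Control 2: Target Link Speed */
#define LNKCTL2_TLS_2_5GT 0x0001u /* encoding for 2.5GT/s */

/* Hypothetical accessors; real code would go through config space. */
struct port_ops {
    uint16_t (*read_lnkctl2)(void *ctx);
    void (*write_lnkctl2)(void *ctx, uint16_t val);
    void (*retrain)(void *ctx);     /* set Retrain Link in Link Control */
    bool (*link_active)(void *ctx); /* wait for Data Link Layer Link Active */
};

static void set_tls(const struct port_ops *ops, void *ctx, unsigned int speed)
{
    uint16_t v = ops->read_lnkctl2(ctx);

    ops->write_lnkctl2(ctx, (uint16_t)((v & ~LNKCTL2_TLS) | speed));
}

/* Step 1: train at 2.5GT/s.  Step 2: only then raise the target speed. */
static bool two_step_speed_up(const struct port_ops *ops, void *ctx,
                              unsigned int final_speed)
{
    assert(final_speed >= 1 && final_speed <= LNKCTL2_TLS);

    set_tls(ops, ctx, LNKCTL2_TLS_2_5GT);
    ops->retrain(ctx);
    if (!ops->link_active(ctx))
        return false; /* leave the 2.5GT/s clamp in place */

    set_tls(ops, ctx, final_speed);
    return true;
}

/* Recorder for demonstration only; it does not model real link behaviour. */
struct fake_port {
    uint16_t lnkctl2;
    unsigned int retrains;
    unsigned int tls_at_retrain;
    bool trains_ok;
};

static uint16_t fake_read(void *ctx)
{
    return ((struct fake_port *)ctx)->lnkctl2;
}

static void fake_write(void *ctx, uint16_t val)
{
    ((struct fake_port *)ctx)->lnkctl2 = val;
}

static void fake_retrain(void *ctx)
{
    struct fake_port *f = ctx;

    f->retrains++;
    f->tls_at_retrain = f->lnkctl2 & LNKCTL2_TLS;
}

static bool fake_link_active(void *ctx)
{
    return ((struct fake_port *)ctx)->trains_ok;
}

static const struct port_ops fake_ops = {
    fake_read, fake_write, fake_retrain, fake_link_active
};
```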
It makes me suspect that the problem with link negotiation is at the data link layer, rather than at the physical layer as I originally thought. IOW the two devices disagree at the protocol rather than electrical level, and only allowing a higher link speed once the data link layer has gone up somehow avoids the incompatibility.
It works the same regardless of whether I change the target link speed in U-Boot (whether by firmware code itself or by poking with commands entered at the prompt by hand) or in Linux (with `setpci'). However, changing the target link speed back to anything beyond 2.5GT/s in U-Boot has the unfortunate side effect of devices behind the problematic link being accessible only until Linux boots. This is because Linux issues a reset to the PCIe tree, which causes the link to be reinitialised with the target link speed set beyond 2.5GT/s, and that brings the problem back. The reset does not cause an issue, however, and lets devices behind the problematic link continue working, if the target link speed has been set by U-Boot to 2.5GT/s. This is because the Target Link Speed field in the Link Control 2 register is sticky, so the clamp continues to be applied.
So I think my observations above have implications as follows:
1. We don't need to try lower and lower target link speeds as a workaround in U-Boot. It is enough if we force any link found problematic just to 2.5GT/s, as the minimal requirement to make such a link work and also the speed all PCIe devices must support.
2. We don't want to try to switch to any higher speed afterwards in U-Boot, as it will prevent an OS that does not have a workaround in place, but issues a PCIe reset, from working with devices behind such a problematic link. I think we ought to do our best to prevent that from happening, i.e. have the most robust workaround possible.
3. We do want to have a more sophisticated workaround in Linux (and other OSes, as someone steps in to implement one) that will ensure correct hot-plug operation and also give better performance. With hot-plug events possible at any time, however, an OS driver cannot do aggressive polling, so I think it will have to be structured differently from the workaround in U-Boot. E.g. if it is vendor:device-specific, then it can rely on the ASM2824's Data Link Layer Link Active reporting facility and sleep instead, as it doesn't have to do the link training detection dance.
4. This all means that the changes for U-Boot and Linux will definitely have to differ from each other.
I'll work on simplifying the U-Boot change along the lines outlined above then, and also look into a corresponding workaround for Linux.
Thank you very much indeed for your feedback, I think I have now made good progress here.
Maciej