[U-Boot] mvebu spl: translation offset set too late?

[cc: mvebu maintainers]
I've spent the best part of today trying to get upstream u-boot running on my Armada 835 device (Turris Omnia). I think in the process I might have uncovered a bug with SPL u-boot on these SoC.
mvebu is "special" in having a different memory map in SPL vs. "main" mode. The arch SPL initialization code calls dm_set_translation_offset to tell the DM subsystem how to translate addresses. However, this is called *after* spl_init, which triggers a DM scan. So at the point where the DM subsystem is aware of the translation offset, drivers might have already cached addresses (priv->base) or even performed initialization (the TWSI i2c module does some configuration at bind time).
This seems broken but I'm not experienced enough with u-boot to suggest a good fix here. Could someone confirm that at least I'm not completely off base with this analysis?
Thanks, Best,

Hi Pierre,
On 10.04.19 22:19, Pierre Bourdon wrote:
[cc: mvebu maintainers]
I've spent the best part of today trying to get upstream u-boot running on my Armada 835 device (Turris Omnia). I think in the process I might have uncovered a bug with SPL u-boot on these SoC.
mvebu is "special" in having a different memory map in SPL vs. "main" mode. The arch SPL initialization code calls dm_set_translation_offset to tell the DM subsystem how to translate addresses.
I remember working on this a few years ago, so my memory is still at bit foggy here. ;)
However, this is called *after* spl_init, which triggers a DM scan. So at the point where the DM subsystem is aware of the translation offset, drivers might have already cached addresses (priv->base) or even performed initialization (the TWSI i2c module does some configuration at bind time).
Yes, you might be correct here. Thanks for spotting. This might be the root cause for the I2C binding hang reported by Baruch (added to Cc):
https://marc.info/?l=u-boot&m=155462955329578&w=2
Is this the same issue that you have noticed?
This seems broken but I'm not experienced enough with u-boot to suggest a good fix here. Could someone confirm that at least I'm not completely off base with this analysis?
Yes, please see above. I believe you are correct here. This issue was hidden until patch 173ec35191 (i2c: mvtwsi: disable i2c slave on Armada 38x) introduced a driver bind function, which is called very early, before the address translation has been changed.
I'm also not sure, how to best solve this issue (if I am correct). We could rework patch 173ec35191 to move this out of this driver into board specific code (or some other way). But this would not solve the main issue, with the incorrect address translation at binding time.
Does anyone else have an idea on this?
Thanks, Stefan

On Thu, Apr 11, 2019 at 11:21 AM Stefan Roese sr@denx.de wrote:
However, this is called *after* spl_init, which triggers a DM scan. So at the point where the DM subsystem is aware of the translation offset, drivers might have already cached addresses (priv->base) or even performed initialization (the TWSI i2c module does some configuration at bind time).
Yes, you might be correct here. Thanks for spotting. This might be the root cause for the I2C binding hang reported by Baruch (added to Cc):
https://marc.info/?l=u-boot&m=155462955329578&w=2
Is this the same issue that you have noticed?
Yes, the original symptom I was debugging was indeed the mvtwsi I2C bind handler causing an early freeze in SPL. That patch from Baruch seems like it's trying to address the exact same issue. Thanks for pointing it out.
Note that on Turris Omnia the SPL needs a working I2C0 to fetch RAM configuration from an EEPROM. The "band-aid" fix of disabling the debug register write in SPL mode would break Omnia boards in 2GB RAM configuration.

Hi Pierre,
On Thu, Apr 11, 2019 at 01:45:51PM +0200, Pierre Bourdon wrote:
On Thu, Apr 11, 2019 at 11:21 AM Stefan Roese sr@denx.de wrote:
However, this is called *after* spl_init, which triggers a DM scan. So at the point where the DM subsystem is aware of the translation offset, drivers might have already cached addresses (priv->base) or even performed initialization (the TWSI i2c module does some configuration at bind time).
Yes, you might be correct here. Thanks for spotting. This might be the root cause for the I2C binding hang reported by Baruch (added to Cc):
https://marc.info/?l=u-boot&m=155462955329578&w=2
Is this the same issue that you have noticed?
Yes, the original symptom I was debugging was indeed the mvtwsi I2C bind handler causing an early freeze in SPL. That patch from Baruch seems like it's trying to address the exact same issue. Thanks for pointing it out.
Note that on Turris Omnia the SPL needs a working I2C0 to fetch RAM configuration from an EEPROM. The "band-aid" fix of disabling the debug register write in SPL mode would break Omnia boards in 2GB RAM configuration.
Would it? You should be able to communicate with the EEPROM at 100KHz without any problem.
My use case is also EEPROM access from SPL for RAM configuration on a future revision (2.1) of the SolidRun Armada 388 SOM. It works here with the patch applied.
I guess that calling dm_set_translation_offset() before spl_init would fail because dm_root is not initialized yet. Is that correct? I could not find the code that initializes dm_root. The correct fix would probably involve setting translation_offset at the earliest possible point.
baruch

On Thu, Apr 11, 2019 at 2:07 PM Baruch Siach baruch@tkos.co.il wrote:
Note that on Turris Omnia the SPL needs a working I2C0 to fetch RAM configuration from an EEPROM. The "band-aid" fix of disabling the debug register write in SPL mode would break Omnia boards in 2GB RAM configuration.
Would it? You should be able to communicate with the EEPROM at 100KHz without any problem.
My use case is also EEPROM access from SPL for RAM configuration on a future revision (2.1) of the SolidRun Armada 388 SOM. It works here with the patch applied.
Cc-ing Tomas who wrote the original code in the Turris u-boot fork [1] and may have a deeper understanding of why it is needed.
My understanding from the comments and description of the patch is that this might be i2c address dependent. I will do some more testing tonight, but I'm almost sure that removing this code causes my Turris Omnia to fail talking to any device on i2c0 (MCU, EEPROM, atsha204a).
[1] https://gitlab.labs.nic.cz/turris/turris-omnia-uboot/commit/590200da03fc0c04...
I guess that calling dm_set_translation_offset() before spl_init would fail because dm_root is not initialized yet. Is that correct? I could not find the code that initializes dm_root. The correct fix would probably involve setting translation_offset at the earliest possible point.
Correct. dm_root is initialized by dm_init (look for DM_ROOT_NON_CONST). But the way dm_init is called in SPL is via dm_init_and_scan, which does not let boards / machine definitions configure the translation offset before driver->bind is called. -- Pierre Bourdon delroth@gmail.com Software Engineer @ Zürich, Switzerland https://delroth.net/

Adding Simon to Cc (again)...
On 11.04.19 14:26, Pierre Bourdon wrote:
On Thu, Apr 11, 2019 at 2:07 PM Baruch Siach baruch@tkos.co.il wrote:
Note that on Turris Omnia the SPL needs a working I2C0 to fetch RAM configuration from an EEPROM. The "band-aid" fix of disabling the debug register write in SPL mode would break Omnia boards in 2GB RAM configuration.
Would it? You should be able to communicate with the EEPROM at 100KHz without any problem.
My use case is also EEPROM access from SPL for RAM configuration on a future revision (2.1) of the SolidRun Armada 388 SOM. It works here with the patch applied.
Cc-ing Tomas who wrote the original code in the Turris u-boot fork [1] and may have a deeper understanding of why it is needed.
My understanding from the comments and description of the patch is that this might be i2c address dependent. I will do some more testing tonight, but I'm almost sure that removing this code causes my Turris Omnia to fail talking to any device on i2c0 (MCU, EEPROM, atsha204a).
[1] https://gitlab.labs.nic.cz/turris/turris-omnia-uboot/commit/590200da03fc0c04...
I guess that calling dm_set_translation_offset() before spl_init would fail because dm_root is not initialized yet. Is that correct? I could not find the code that initializes dm_root. The correct fix would probably involve setting translation_offset at the earliest possible point.
Correct. dm_root is initialized by dm_init (look for DM_ROOT_NON_CONST). But the way dm_init is called in SPL is via dm_init_and_scan, which does not let boards / machine definitions configure the translation offset before driver->bind is called.
I also see no easy way to solve this "hen egg problem" with the translation offset before bind for the drivers is being called with the current API. Perhaps Simon (or someone else) has an idea?
As mentioned in my earlier reply, moving to a callback from the DM code into a board specific code or using a Kconfig option could solve this issue.
Thanks, Stefan

Hi,
On Thu, 11 Apr 2019 at 06:55, Stefan Roese sr@denx.de wrote:
Adding Simon to Cc (again)...
On 11.04.19 14:26, Pierre Bourdon wrote:
On Thu, Apr 11, 2019 at 2:07 PM Baruch Siach baruch@tkos.co.il wrote:
Note that on Turris Omnia the SPL needs a working I2C0 to fetch RAM configuration from an EEPROM. The "band-aid" fix of disabling the debug register write in SPL mode would break Omnia boards in 2GB RAM configuration.
Would it? You should be able to communicate with the EEPROM at 100KHz without any problem.
My use case is also EEPROM access from SPL for RAM configuration on a future revision (2.1) of the SolidRun Armada 388 SOM. It works here with the patch applied.
Cc-ing Tomas who wrote the original code in the Turris u-boot fork [1] and may have a deeper understanding of why it is needed.
My understanding from the comments and description of the patch is that this might be i2c address dependent. I will do some more testing tonight, but I'm almost sure that removing this code causes my Turris Omnia to fail talking to any device on i2c0 (MCU, EEPROM, atsha204a).
[1] https://gitlab.labs.nic.cz/turris/turris-omnia-uboot/commit/590200da03fc0c04...
I guess that calling dm_set_translation_offset() before spl_init would fail because dm_root is not initialized yet. Is that correct? I could not find the code that initializes dm_root. The correct fix would probably involve setting translation_offset at the earliest possible point.
Correct. dm_root is initialized by dm_init (look for DM_ROOT_NON_CONST). But the way dm_init is called in SPL is via dm_init_and_scan, which does not let boards / machine definitions configure the translation offset before driver->bind is called.
I also see no easy way to solve this "hen egg problem" with the translation offset before bind for the drivers is being called with the current API. Perhaps Simon (or someone else) has an idea?
As mentioned in my earlier reply, moving to a callback from the DM code into a board specific code or using a Kconfig option could solve this issue.
I don't like dm_set_translation_offset() very much. As some said, you have to call it after DM is up but before any affected drivers are bound. Tricky.
How about adding a field in global_data, which can be set early, them have dm_init() pick it up?
Regards, Simon

Hi Simon,
On 12.04.19 14:35, Simon Glass wrote:
Hi,
On Thu, 11 Apr 2019 at 06:55, Stefan Roese sr@denx.de wrote:
Adding Simon to Cc (again)...
On 11.04.19 14:26, Pierre Bourdon wrote:
On Thu, Apr 11, 2019 at 2:07 PM Baruch Siach baruch@tkos.co.il wrote:
Note that on Turris Omnia the SPL needs a working I2C0 to fetch RAM configuration from an EEPROM. The "band-aid" fix of disabling the debug register write in SPL mode would break Omnia boards in 2GB RAM configuration.
Would it? You should be able to communicate with the EEPROM at 100KHz without any problem.
My use case is also EEPROM access from SPL for RAM configuration on a future revision (2.1) of the SolidRun Armada 388 SOM. It works here with the patch applied.
Cc-ing Tomas who wrote the original code in the Turris u-boot fork [1] and may have a deeper understanding of why it is needed.
My understanding from the comments and description of the patch is that this might be i2c address dependent. I will do some more testing tonight, but I'm almost sure that removing this code causes my Turris Omnia to fail talking to any device on i2c0 (MCU, EEPROM, atsha204a).
[1] https://gitlab.labs.nic.cz/turris/turris-omnia-uboot/commit/590200da03fc0c04...
I guess that calling dm_set_translation_offset() before spl_init would fail because dm_root is not initialized yet. Is that correct? I could not find the code that initializes dm_root. The correct fix would probably involve setting translation_offset at the earliest possible point.
Correct. dm_root is initialized by dm_init (look for DM_ROOT_NON_CONST). But the way dm_init is called in SPL is via dm_init_and_scan, which does not let boards / machine definitions configure the translation offset before driver->bind is called.
I also see no easy way to solve this "hen egg problem" with the translation offset before bind for the drivers is being called with the current API. Perhaps Simon (or someone else) has an idea?
As mentioned in my earlier reply, moving to a callback from the DM code into a board specific code or using a Kconfig option could solve this issue.
I don't like dm_set_translation_offset() very much. As some said, you have to call it after DM is up but before any affected drivers are bound. Tricky.
Yes. We missing this very important fact while implementing this solution.
How about adding a field in global_data, which can be set early, them have dm_init() pick it up?
Okay. I could come up with a patch for this quickly, I think. I would prefer to add a Kconfig option for this, to not add size to GD for platforms not needed this option. This could also save a bit of SPL size as well.
Let me see...
Thanks, Stefan

Adding Simon to Cc...
On 11.04.19 14:07, Baruch Siach wrote:
Hi Pierre,
On Thu, Apr 11, 2019 at 01:45:51PM +0200, Pierre Bourdon wrote:
On Thu, Apr 11, 2019 at 11:21 AM Stefan Roese sr@denx.de wrote:
However, this is called *after* spl_init, which triggers a DM scan. So at the point where the DM subsystem is aware of the translation offset, drivers might have already cached addresses (priv->base) or even performed initialization (the TWSI i2c module does some configuration at bind time).
Yes, you might be correct here. Thanks for spotting. This might be the root cause for the I2C binding hang reported by Baruch (added to Cc):
https://marc.info/?l=u-boot&m=155462955329578&w=2
Is this the same issue that you have noticed?
Yes, the original symptom I was debugging was indeed the mvtwsi I2C bind handler causing an early freeze in SPL. That patch from Baruch seems like it's trying to address the exact same issue. Thanks for pointing it out.
Note that on Turris Omnia the SPL needs a working I2C0 to fetch RAM configuration from an EEPROM. The "band-aid" fix of disabling the debug register write in SPL mode would break Omnia boards in 2GB RAM configuration.
Would it? You should be able to communicate with the EEPROM at 100KHz without any problem.
My use case is also EEPROM access from SPL for RAM configuration on a future revision (2.1) of the SolidRun Armada 388 SOM. It works here with the patch applied.
I guess that calling dm_set_translation_offset() before spl_init would fail because dm_root is not initialized yet. Is that correct?
Yes, I pretty much suspect so too.
I could not find the code that initializes dm_root. The correct fix would probably involve setting translation_offset at the earliest possible point.
Something like this, yes. Or use a callback in the DM infrastructure at the time where this translation offset is needed to get a board specific value here. I remember that I first had another solution for this (perhaps similar to a callback or some Kconfig option). But Simon suggested to move into this API that we have right now (IIRC).
Simon, what are your thoughts about this issue here with the translation offset (hen egg problem)?
Thanks, Stefan
participants (4)
-
Baruch Siach
-
Pierre Bourdon
-
Simon Glass
-
Stefan Roese