[PATCH RFC] gpio: Fix probing of gpio-hogs

48b3ecbe replumbed the gpio-hog probing to use DM_FLAG_PROBE_AFTER_BIND.
Unfortunately gpio_post_bind is called after the non-preloc recursive dm_probe_devices completes, so setting this flag does not have the intended effect and the gpio-hogs never get probed. With instrumentation:
[...] CPU: MediaTek MT7981 Model: GL.iNet GL-X3000 DRAM: 512 MiB <mtk_pinctrl_mt7981_bind called> <dm_probe_devices called: root root_driver root_driver [+] [ ]> <dm_probe_devices called: clk fixed_clock gpt_dummy20m [ ] [ ]> [...] <dm_probe_devices called: led gpio_led signal-4 [ ] [ ]> Core: 34 devices, 14 uclasses, devicetree: separate MMC: <gpio_post_bind called> mmc@11230000: 0 [...]
Probe them directly in gpio_post_bind instead.
Signed-off-by: Chris Webb chris@arachsys.com --- drivers/gpio/gpio-uclass.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-uclass.c b/drivers/gpio/gpio-uclass.c index 4234cd91..1c6e1715 100644 --- a/drivers/gpio/gpio-uclass.c +++ b/drivers/gpio/gpio-uclass.c @@ -1539,7 +1539,9 @@ static int gpio_post_bind(struct udevice *dev) * since hogs can be essential to the hardware * system. */ - dev_or_flags(child, DM_FLAG_PROBE_AFTER_BIND); + ret = device_probe(child); + if (ret) + return ret; } } }

On Thu, Jun 13, 2024 at 11:59:05AM +0100, Chris Webb wrote:
48b3ecbe replumbed the gpio-hog probing to use DM_FLAG_PROBE_AFTER_BIND.
Unfortunately gpio_post_bind is called after the non-preloc recursive dm_probe_devices completes, so setting this flag does not have the intended effect and the gpio-hogs never get probed. With instrumentation:
[...] CPU: MediaTek MT7981 Model: GL.iNet GL-X3000 DRAM: 512 MiB
<mtk_pinctrl_mt7981_bind called> <dm_probe_devices called: root root_driver root_driver [+] [ ]> <dm_probe_devices called: clk fixed_clock gpt_dummy20m [ ] [ ]> [...] <dm_probe_devices called: led gpio_led signal-4 [ ] [ ]> Core: 34 devices, 14 uclasses, devicetree: separate MMC: <gpio_post_bind called> mmc@11230000: 0 [...]
Probe them directly in gpio_post_bind instead.
Signed-off-by: Chris Webb chris@arachsys.com
drivers/gpio/gpio-uclass.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-uclass.c b/drivers/gpio/gpio-uclass.c index 4234cd91..1c6e1715 100644 --- a/drivers/gpio/gpio-uclass.c +++ b/drivers/gpio/gpio-uclass.c @@ -1539,7 +1539,9 @@ static int gpio_post_bind(struct udevice *dev) * since hogs can be essential to the hardware * system. */
dev_or_flags(child, DM_FLAG_PROBE_AFTER_BIND);
ret = device_probe(child);
if (ret)
} }return ret; }
Adding Marek, as the author of commit 48b3ecbedf82 ("gpio: Get rid of gpio_hog_probe_all()").

Tom Rini trini@konsulko.com wrote:
Adding Marek, as the author of commit 48b3ecbedf82 ("gpio: Get rid of gpio_hog_probe_all()").
Thanks! I don't claim this is the correct way to fix this, just that it works.
Specifically, the two things I found that got gpio-hog working were
(a) adding an explicit probe instead of DM_FLAG_PROBE_AFTER_BIND in gpio_post_bind(), or
(b) adding a .bind function in U_BOOT_DRIVER(mt7981_pinctrl) like
static int mtk_pinctrl_mt7981_bind(struct udevice *dev) { dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND); return 0; }
However, presumably (b) isn't right as it would (presumably) need repeating in lots of other pinctrl drivers?
Best wishes,
Chris.

Chris Webb chris@arachsys.com wrote:
Tom Rini trini@konsulko.com wrote:
Adding Marek, as the author of commit 48b3ecbedf82 ("gpio: Get rid of gpio_hog_probe_all()").
Thanks! I don't claim this is the correct way to fix this, just that it works.
Specifically, the two things I found that got gpio-hog working were
(a) adding an explicit probe instead of DM_FLAG_PROBE_AFTER_BIND in gpio_post_bind(), or
(b) adding a .bind function in U_BOOT_DRIVER(mt7981_pinctrl) like
static int mtk_pinctrl_mt7981_bind(struct udevice *dev) { dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND); return 0; }
However, presumably (b) isn't right as it would (presumably) need repeating in lots of other pinctrl drivers?
Now the release is out, I'd be really keen to pick this one up and get it fixed upstream if possible.
The device I originally discovered this on is now deployed, but I could probably grab it back for a bit and resolder a serial console onto it for further testing if neither of the above is correct and a third alternative I didn't try needs confirming on real hardware.
Best wishes,
Chris.

Chris Webb chris@arachsys.com wrote:
Now the release is out, I'd be really keen to pick this one up and get it fixed upstream if possible.
Hi Tom, is there anything more I can do to help out here? I'd love upstream 2024.10 to ship with gpio-hog that works again.
Best wishes,
Chris.

Hi Chris,
On Thu, 13 Jun 2024 at 04:59, Chris Webb chris@arachsys.com wrote:
48b3ecbe replumbed the gpio-hog probing to use DM_FLAG_PROBE_AFTER_BIND.
Unfortunately gpio_post_bind is called after the non-preloc recursive dm_probe_devices completes, so setting this flag does not have the intended effect and the gpio-hogs never get probed. With instrumentation:
[...] CPU: MediaTek MT7981 Model: GL.iNet GL-X3000 DRAM: 512 MiB
<mtk_pinctrl_mt7981_bind called> <dm_probe_devices called: root root_driver root_driver [+] [ ]> <dm_probe_devices called: clk fixed_clock gpt_dummy20m [ ] [ ]> [...] <dm_probe_devices called: led gpio_led signal-4 [ ] [ ]> Core: 34 devices, 14 uclasses, devicetree: separate MMC: <gpio_post_bind called> mmc@11230000: 0 [...]
Probe them directly in gpio_post_bind instead.
We cannot probe devices when they are bound since it breaks the ordering of driver model.
From your trace it looks like everything is happening after
relocation. I can't quite see what is actually going wrong. But if you look at dm_init_and_scan(), it does the probe at the end, immediately after all devices have been bound. So it should do what you want.
Is the GPIO device not being bound? There is something strange here.
Signed-off-by: Chris Webb chris@arachsys.com
drivers/gpio/gpio-uclass.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-uclass.c b/drivers/gpio/gpio-uclass.c index 4234cd91..1c6e1715 100644 --- a/drivers/gpio/gpio-uclass.c +++ b/drivers/gpio/gpio-uclass.c @@ -1539,7 +1539,9 @@ static int gpio_post_bind(struct udevice *dev) * since hogs can be essential to the hardware * system. */
dev_or_flags(child, DM_FLAG_PROBE_AFTER_BIND);
ret = device_probe(child);
if (ret)
return ret; } } }
Regards, Simon

Simon Glass sjg@chromium.org wrote:
We cannot probe devices when they are bound since it breaks the ordering of driver model.
From your trace it looks like everything is happening after relocation. I can't quite see what is actually going wrong. But if you look at dm_init_and_scan(), it does the probe at the end, immediately after all devices have been bound. So it should do what you want.
Is the GPIO device not being bound? There is something strange here.
Hi Simon, many thanks for your follow up. Yes I wasn't convinced the patch was the correct fix (hence the RFC) but posted as it was one of the two ways I found to make gpio-hog work, the other being adding a .bind function in U_BOOT_DRIVER(mt7981_pinctrl) like
static int mtk_pinctrl_mt7981_bind(struct udevice *dev) { dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND); return 0; }
to force a probe after bind in the parent pinctrl device. I was hoping someone with more clue than me might go 'Aha! This is just...' :)
The device I tested on has been deployed but I can probably get it back for a bit and resolder a serial console on to test again if that would be helpful. Are there other significant places I should be adding some traces that would make the problem clearer?
Is it significant/relevant that the gpio device is a child of the pinctrl device in the mt7981 device tree?
I think the gpio device must be getting bound, because otherwise my trace in gpio_post_bind() wouldn't get called at all, but perhaps it's bound too late somehow?
Best wishes,
Chris.

Hi Chris,
On Mon, 29 Jul 2024 at 09:44, Chris Webb chris@arachsys.com wrote:
Simon Glass sjg@chromium.org wrote:
We cannot probe devices when they are bound since it breaks the ordering of driver model.
From your trace it looks like everything is happening after relocation. I can't quite see what is actually going wrong. But if you look at dm_init_and_scan(), it does the probe at the end, immediately after all devices have been bound. So it should do what you want.
Is the GPIO device not being bound? There is something strange here.
Hi Simon, many thanks for your follow up. Yes I wasn't convinced the patch was the correct fix (hence the RFC) but posted as it was one of the two ways I found to make gpio-hog work, the other being adding a .bind function in U_BOOT_DRIVER(mt7981_pinctrl) like
static int mtk_pinctrl_mt7981_bind(struct udevice *dev) { dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND); return 0; }
to force a probe after bind in the parent pinctrl device. I was hoping someone with more clue than me might go 'Aha! This is just...' :)
?
The device I tested on has been deployed but I can probably get it back for a bit and resolder a serial console on to test again if that would be helpful. Are there other significant places I should be adding some traces that would make the problem clearer?
Is it significant/relevant that the gpio device is a child of the pinctrl device in the mt7981 device tree?
I think the gpio device must be getting bound, because otherwise my trace in gpio_post_bind() wouldn't get called at all, but perhaps it's bound too late somehow?
Well, yes, mt7981_pinctrl is wrong since it is not actually binding the GPIO devices until it itself is probed. It should do it when it is bound.
Better still, those GPIO devices should be in the devicetree and bound automatically by driver model. But, sigh, I see that there is no compatible string in the gpio subnode of pinctrl@11d00000. It should really have one and avoid all this pointless code and problems.
mtk_pinctrl_common_probe() is misnamed, as it actually binds and then probes.
So (unless Linux allows a patch to add a compatible string) it needs a new mtk_pinctrl_common_bind() (called from mtk_pinctrl_mt7981_bind()) which calls mtk_gpiochip_register(). Then you won't need to add your dev_or_flags() into mtk_pinctrl_mt7981_bind().
Regards, Simon

Hi Simon,
Simon Glass sjg@chromium.org wrote:
Well, yes, mt7981_pinctrl is wrong since it is not actually binding the GPIO devices until it itself is probed. It should do it when it is bound.
Oh I see! Yes, I can see the mtk_gpiochip_register(dev) in mtk_pinctrl_common_probe() exactly as you say.
Better still, those GPIO devices should be in the devicetree and bound automatically by driver model. But, sigh, I see that there is no compatible string in the gpio subnode of pinctrl@11d00000. It should really have one and avoid all this pointless code and problems.
mtk_pinctrl_common_probe() is misnamed, as it actually binds and then probes.
So (unless Linux allows a patch to add a compatible string) it needs a new mtk_pinctrl_common_bind() (called from mtk_pinctrl_mt7981_bind()) which calls mtk_gpiochip_register(). Then you won't need to add your dev_or_flags() into mtk_pinctrl_mt7981_bind().
Yes, that makes complete sense. Many thanks! I'm very happy to write that patch and grab back the physical hardware to double-check on if you like? (Or equally happy to leave it if you'd prefer to fix yourself?)
Presumably it needs to apply to every mtk soc that uses mtk_pinctrl_common_probe() as they'll all be affected by this problem.
Best wishes,
Chris.

Hi Chris,
On Mon, 29 Jul 2024 at 10:45, Chris Webb chris@arachsys.com wrote:
Hi Simon,
Simon Glass sjg@chromium.org wrote:
Well, yes, mt7981_pinctrl is wrong since it is not actually binding the GPIO devices until it itself is probed. It should do it when it is bound.
Oh I see! Yes, I can see the mtk_gpiochip_register(dev) in mtk_pinctrl_common_probe() exactly as you say.
Better still, those GPIO devices should be in the devicetree and bound automatically by driver model. But, sigh, I see that there is no compatible string in the gpio subnode of pinctrl@11d00000. It should really have one and avoid all this pointless code and problems.
mtk_pinctrl_common_probe() is misnamed, as it actually binds and then probes.
So (unless Linux allows a patch to add a compatible string) it needs a new mtk_pinctrl_common_bind() (called from mtk_pinctrl_mt7981_bind()) which calls mtk_gpiochip_register(). Then you won't need to add your dev_or_flags() into mtk_pinctrl_mt7981_bind().
Yes, that makes complete sense. Many thanks! I'm very happy to write that patch and grab back the physical hardware to double-check on if you like? (Or equally happy to leave it if you'd prefer to fix yourself?)
OK good. Please go ahead!
Presumably it needs to apply to every mtk soc that uses mtk_pinctrl_common_probe() as they'll all be affected by this problem.
Yes I suppose so.
Regards, Simon

Hi Simon,
Simon Glass sjg@chromium.org wrote:
Presumably it needs to apply to every mtk soc that uses mtk_pinctrl_common_probe() as they'll all be affected by this problem.
Yes I suppose so.
As well as the mediatek case (patch just sent), I thought I should look through the other pinctrl drivers for other examples of this problem you explained to me.
Both starfive/pinctrl-starfive.c and mvebu/pinctrl-armada-37xx.c do the same thing, calling their gpiochip_register as part of the driver probe method. The pinctrl-armada-37xx.c driver also has a bind action:
static int armada_37xx_pinctrl_bind(struct udevice *dev) { /* * Make sure that the pinctrl driver gets probed after binding * as on A37XX the pinctrl driver is the one that is also * registering the GPIO one during probe, so if its not probed * GPIO-s are not registered as well. */ dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND);
return 0; }
which presumably wouldn't be needed if the gpiochip were bound at pinctrl bind time instead of pinctrl probe time?
Alas I don't have any boards to test on for either of these platforms.
Best wishes,
Chris.

Hi Chris,
On Wed, 31 Jul 2024 at 04:14, Chris Webb chris@arachsys.com wrote:
Hi Simon,
Simon Glass sjg@chromium.org wrote:
Presumably it needs to apply to every mtk soc that uses mtk_pinctrl_common_probe() as they'll all be affected by this problem.
Yes I suppose so.
As well as the mediatek case (patch just sent), I thought I should look through the other pinctrl drivers for other examples of this problem you explained to me.
Both starfive/pinctrl-starfive.c and mvebu/pinctrl-armada-37xx.c do the same thing, calling their gpiochip_register as part of the driver probe method. The pinctrl-armada-37xx.c driver also has a bind action:
static int armada_37xx_pinctrl_bind(struct udevice *dev) { /* * Make sure that the pinctrl driver gets probed after binding * as on A37XX the pinctrl driver is the one that is also * registering the GPIO one during probe, so if its not probed * GPIO-s are not registered as well. */ dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND);
return 0;
}
which presumably wouldn't be needed if the gpiochip were bound at pinctrl bind time instead of pinctrl probe time?
Alas I don't have any boards to test on for either of these platforms.
If you have the inclination it is still worth sending a patch. The maintainer can check it. These sorts of counter-examples can be copied and soon everyone is making the same mistake!
Regards, Simon

Hi Simon,
Simon Glass sjg@chromium.org wrote:
On Wed, 31 Jul 2024 at 04:14, Chris Webb chris@arachsys.com wrote:
Alas I don't have any boards to test on for either of these platforms.
If you have the inclination it is still worth sending a patch. The maintainer can check it. These sorts of counter-examples can be copied and soon everyone is making the same mistake!
Makes sense! If the Mediatek patch is okay, I'll write the equivalents for the other two platforms and put a comment after the --- to warn the maintainers that I haven't been able to test on real hardware. I can do a compile test of them both at least, and they're simple, easy to verify changes.
Best wishes,
Chris.
participants (3)
-
Chris Webb
-
Simon Glass
-
Tom Rini