
On 15/10/2022 18:53, Simon Glass wrote:
Hi Paul,
On Sat, 15 Oct 2022 at 03:19, Paul Barker paul.barker@sancloud.com wrote:
We should only perform additional iteration steps when needed to initialize the parent of a device. Other binding errors (such as a missing driver) should not lead to additional iteration steps.
Unnecessary iteration steps can cause issues when memory is tightly constrained (such as in the TPL/SPL) since device_bind_by_name() unconditionally allocates memory for a struct udevice. On the SanCloud BBE this led to boot failure caused by memory exhaustion in the SPL when booting from SPI flash.
Signed-off-by: Paul Barker paul.barker@sancloud.com
drivers/core/lists.c | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-)
Are you able to construct a test for this? See test/dm/core.c or test-fdt.c
I can't see how to construct a test case for this issue. What I observed was the SPL crashing during the pre-relocation lists_bind_drivers() call due to exhaustion of the available memory. On the AM335x SoC this is 64k of on-board SRAM which is available before main DRAM is initialised. I can't see how to reproduce this issue within the u-boot test framework as significantly more memory is available when running the u-boot sandbox and so memory exhaustion does not occur. If there are other side effects than memory leaking here then I'm not familiar enough with the u-boot device model to spot them.
Each call to bind_drivers_pass() calls device_bind_by_name() for every driver_info record, with no way of checking whether a particular driver_info record has already been handled by a previous call to bind_drivers_pass(). This leaks memory as each call to device_bind_by_name() allocates a new device object.
Also, I think bind_drivers_pass() needs a function comment that describes in detail what is going on.
Agreed, I'll see what I can add when I revisit this.
Would it be possible to achieve the same effect while keeping that function the same as now? See below.
diff --git a/drivers/core/lists.c b/drivers/core/lists.c index c49695b24f00..82d479564121 100644 --- a/drivers/core/lists.c +++ b/drivers/core/lists.c @@ -51,13 +51,13 @@ struct uclass_driver *lists_uclass_lookup(enum uclass_id id) return NULL; }
-static int bind_drivers_pass(struct udevice *parent, bool pre_reloc_only) +static bool bind_drivers_pass(struct udevice *parent, bool pre_reloc_only,
int *result)
{ struct driver_info *info = ll_entry_start(struct driver_info, driver_info); const int n_ents = ll_entry_count(struct driver_info, driver_info); bool missing_parent = false;
int result = 0; int idx; /*
@@ -98,12 +98,12 @@ static int bind_drivers_pass(struct udevice *parent, bool pre_reloc_only) drt->dev = dev; } else if (ret != -EPERM) { dm_warn("No match for driver '%s'\n", entry->name);
if (!result || ret != -ENOENT)
result = ret;
if (!*result || ret != -ENOENT)
*result = ret; } }
return result ? result : missing_parent ? -EAGAIN : 0;
return missing_parent;
}
int lists_bind_drivers(struct udevice *parent, bool pre_reloc_only) @@ -117,13 +117,8 @@ int lists_bind_drivers(struct udevice *parent, bool pre_reloc_only) * always succeed on the first pass. */ for (pass = 0; pass < 10; pass++) {
int ret;
ret = bind_drivers_pass(parent, pre_reloc_only);
if (!ret)
if (!bind_drivers_pass(parent, pre_reloc_only, &result)) break;
if (ret != -EAGAIN && !result)
result = ret;
If there were something like this, could we drop the other changes?
if (ret) { if (ret == -EAGAIN) continue; else if (!result)
result = ret;
}
I don't think this works exactly, as we want to break when (ret == -EAGAIN) instead of continue. However I have a variant which I think is working which I can send as a v2.
} return result;
-- 2.25.1