Re: [PATCH 0/8] efi_loader: Complete the bootflow_efi() test

9 Jan 2025

      On Thu, Jan 09, 2025 at 08:02:01AM -0700, Simon Glass wrote:
...
Hi Tom,
On Wed, 8 Jan 2025 at 12:15, Tom Rini trini@konsulko.com wrote:
...
On Wed, Jan 08, 2025 at 10:02:52AM -0700, Simon Glass wrote:
...
Hi Heinrich, Tom,
On Tue, 7 Jan 2025 at 08:47, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
...
On 07.01.25 16:11, Tom Rini wrote:
...
On Tue, Jan 07, 2025 at 06:57:50AM -0700, Simon Glass wrote:
...
Hi Heinrich,
On Tue, 7 Jan 2025 at 06:11, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
>
> On 07.01.25 13:15, Simon Glass wrote:
>> Hi Heinrich,
>>
>> On Mon, 6 Jan 2025 at 10:00, Heinrich Schuchardt xypron.glpk@gmx.de wrote:
>>>
>>> On 06.01.25 15:47, Simon Glass wrote:
>>>> This test was hamstrung in code review so this series is an attempt to
>>>> complete the intended functionality:
>>>>
>>>> - Check memory allocations look correct
>>>> - Check that exit-boot-services removes active-DMA devices
>>>> - Check that the bootflow is still present after testapp finishes
>>>>
>>>> The EFI functionality duplicates bootm_announce_and_cleanup() and still
>>>> uses the defunct board_quiesce_devices() so a nice cleanup would be to
>>>> call the bootm function instead, with suitable modifications. That would
>>>> allow bootstage to work too.
>>>>
>>>> This series is based on sjg/master since the EFI logging was rejected so
>>>> far.
>>>
>>> Yes, it was rejected because a solution at the lib/log.c level would be
>>> more generic.
>>
>> As I mentioned, that idea isn't suitable for programmatic use.
>
> What can be done with show_addr("mem", rec->memory); that log_debug()
> does not offer or which you could not do with a new log function in
> lib/log.c that takes variadic arguments?
There are asserts in [1], for example. How do you propose to handle
that? See [2] for my previous explanation, quoted here:
> CONFIG_LOG with a bloblist option would be a great idea, but it's hard
> to programmatically scan text...plus only the external call sites are
> actually logged.
Also see the discussion on the original patch [3]. There was also your
reply at [4], but I think you missed that this is intended for use in
unit tests (i.e. with ut_assert()).
You also requested that this be generalised, rather than being
EFI-loader-specific. I have no objection to that, but don't have a use
case for it yet, so have deferred that to later. It's a fairly simple
change, if/when needed. If the series was not NAKed, I'd be happy to
do it now.
>>
>>>
>>> Tom suggested not to send patches that are for private enjoyment to the
>>> mailing list.
>>
>> My contributions to U-Boot are only ever about private enjoyment :-)
>>
>> Do you have any comments on the patches?
Regards,
Simon
[1] https://patchwork.ozlabs.org/project/uboot/patch/20250106144755.3054780-6-sj...
[2] https://lore.kernel.org/u-boot/CAFLszTjxOE_037+kR0jgdax80sBombYo_k0YgiuVnP=K...
[3] https://lore.kernel.org/u-boot/CAC_iWjKtaN54B98OKbkoXkC_GmKJ=x+M4=UY_O6roSOp...
[4] https://lore.kernel.org/u-boot/D513D326-41A6-425E-B11F-85958065BCD2@gmx.de/
Looking at the logging portions of the original series again, especially
if this was made generic, we probably don't want to print to actual
console every time we're making a note of some memory allocation for
example, that would be unreadable outside of a debug context. The point
of this really seems to be "log things for verifying in tests later".
Does that end up being useful? I don't know. Heinrich or Ilias, do the
tests in [1] look generally useful?
The tests in [1] are not documented, not even in the commit message. So
the reasoning behind the tests remains Simon's secret.
Are you asking for code comments in the test? If so, I can add some.
...
At first sight the tests in [1] don't make much sense. E.g. that only a
subset of memory types have been used does not tell that the right
memory type has been used for the right object.
It is a pretty good start, though. It makes sure that the memory types
are sane, checks addresses are within DRAM, etc. With [5] it makes
sure that devices are removed.
...
Implementing a specific tracing functionality for EFI is definitively
the wrong way forward as it will lead to code duplication.
We can cross that bridge when we come to it.
Well, no. It's backwards to make a bridge in one place when everyone
agrees it needs to be moved somewhere else. I mean [5] is a generic
issue and test/py/tests/test_net_boot.py or some other test we already
have which tests booting an OS should confirm that we've quiesced
devices before moving on. And as a bonus it's in python where dealing
with strings doesn't suck.
I really don't want to write C tests in Python. CI is slow enough as
it is, something realy want to fix. I'm also not sure how you can tell
if a device has been removed. Run 'dm tree' and look for the missing
'star' in the resulting 300 lines of text?
As I'm in a bisect-hell in our C tests you'll have to forgive me for not
thinking the C tests are noticeably faster than python tests. Or that
they aren't their own potential source of corner-case bugs. But I
digress..
And yes, taking a bunch of text and parsing it, is what python is fast
at. And easier to write.
...
But actually [5] is not generic, since EFI uses its own code to remove
devices. This test is solely focussed on EFI.
Yes, you're testing the EFI version of the code in
arch/$(ARCH)/lib/bootm.c. The remove devices functions being called in
both cases are generic.
...
If you want the logging to be renamed and placed centrally I don't
mind doing it now. But note that only EFI will use it for now.
...
...
...
We already have function _log() which is variadic.
Simon could write a new log driver that parses the `format` parameter
and saves the binary data in an appropriate format for analysis by the
unit tests:

For %s the driver should save the string and not the address of the

string.

For %pD the driver should save the device path instead of the pointer.
...

Some changes to the log driver interface will be needed to pass the
variadic arguments instead of the formatted message.
Perhaps the word 'log' is confusing people. But the above suggestion
is quite a complicated way of handling things. We have no way to
decode printf() strings in this way. See log_dispatch() for how this
is handled today. It uses sprintf(). Trying to test based on text
output would be very clumsy (lots of regexes and sscan() calls?) and
result in a huge amount of parsing code, highly dependent on the
printf() format, etc.
I very-much doubt that would produce a useful implementation, but if
you would like to try it out then I would be happy to look at it.
I mentioned this several times, but even if we did go that way, we
only have logging on the external calls, so much of the EFI-memory
allocation in U-Boot would not be logged.
Regards,
Simon
[5] https://patchwork.ozlabs.org/project/uboot/patch/20250106144755.3054780-9-sj...
Yes, calling this a "log" when it's intended for capturing information
for tests got some of this off on the wrong track. But that also helps
explain now that this is still on the wrong track and should instead be
following normal design practices for testing and expanding existing
infrastructure and not inventing a new everything. So if you don't like
Heinrich's suggestion, take a look at Caleb's suggestion.
I don't have the energy to port the tracing framework from Linux to
U-Boot, although I agree it would be useful. Still, function tracing
is quite fragile and confusing to work with when refactoring code. I
don't like that idea much for this use case, although if function
tracing did exist in U-Boot I would likely have used it.
I mean yes, it would be good if you went back and expanded on the trace
functionality you did before.
...
...
And if you
don't like Caleb's suggestion, go put this in a topic branch you can
merge when you need to debug some problem that seemingly nothing else
will catch.
Here we are over a year after I reported the bug and we still don't
have a test to cover it. This series is better than the available
alternatives, IMO.
Well, no. We have commit dabaa4ae3206 ("dm: Add
dm_remove_devices_active() for ordered device removal") we have a test
for the underlying problem. We need more functional boot tests, but we
need those to be in python too, and not more C code.
And you're not just coming up with a test, you're refactoring a bunch of
code and introducing new subsystems in order to do that. When as I keep
pointing out, we don't need that. We could easily extend the existing OS
boot tests we have to script booting an ISO. And we only run those when
say "ENABLE_SLOW_TESTS" is set, and only do that on tagged releases.
-- 
Tom