
On 12/10/2022 16.52, Tom Rini wrote:
Option 1 has the benefit that we don't do any of the blob handling, so it just dies right away. Perhaps this is a philosophical question, but it is a little subtle and I'm not actually sure people would notice the difference so long as they get the errors as expected.
The way I'm thinking of it is that there are two cases. The first case is someone who is testing on the hardware that requires these files. Stop ASAP, because either they'll know they forgot to pass the X/Y/Z files, or they'll re-read the board instructions page and see, oh, they need to grab binaries X/Y/Z too. Waiting to collect all the missing-file errors doesn't save time.
Indeed. If I forgot to put lpddr4_pmu_train_1d_dmem_202006.bin in my build folder, it is extremely unlikely that I wouldn't remember the other three lpddr*.bin files at the same time. And even if there's some fifth file I also forgot, re-running make (yes, make) after putting the lpddr*.bin files in place doesn't take long, since the whole tree is already built. I don't think any board really needs more than a handful of blobs (hopefully), and certainly not from more than a few sources (i.e., I count the imx8 lpddr*.bin files as one), so in practice there's not much difference between "stop as soon as one is missing" and "give an error at the end and print all the missing stuff".
Personally, I usually prefer the "stop at the first fatal error" model, because that makes it easier to see the actual problem - some of the follow-on errors/warnings could be due to the first problem, and sometimes not stopping on that first error means the same problem gets printed over and over, making it hard to scroll back to find the first occurrence. Somewhat hand-wavy, yes, and I can't give any good examples.
The counter-problem is that this isn't the first time someone has come up and noted how much time they've wasted because we defaulted to fake binaries. I think we've optimized too much for the people that build a thousand configs all the time (us) instead of the people that build one or two configs at a time (most people?).
To that end, I am really curious what Rasmus has to say here, or anyone else that has a different workflow from you and me.
Indeed, my workflow never involves building a thousand configs; I leave that to upstream's CI.
I have roughly four different ways of building. All of them must fail if the resulting binary is known to be unusable (and I think we all agree on that part), and preferably without having to pass special flags, however the build is done (i.e., failing must be the default, and I also think we're in agreement there), because otherwise I know those flags would sometimes be missed. Just to enumerate:
(1) I do local development, building in my xterm, and testing on target, either by booting the binary with uuu or (for some other boards/SoCs) scp'ing it to the target, or however it can most easily be deployed and tested.
(2) Once this has stabilized, I update our bitbake metadata to point at the new branch/commit, and do a local build with Yocto (which is always done inside a Docker container based on a specific image that we and our customers agree on using). That primarily catches things like missing host tools or libraries that I may happen to have on my dev machine but which are not in the Docker image, or not exposed by Yocto. This can then mean either that the recipe needs to grow a new DEPENDS (a sketch of what that looks like follows this list), or (thankfully pretty rarely) that our Docker image definition needs to be updated.
(3) When I can build with Yocto, it's time for my customer's CI to chew on it. Depending on the project, that sometimes involves automatically deploying the new bootloader to a test rack - which is why it's so important that a build does not pass if the binary is known to be broken.
(4) [And we're not there yet, but pretty close, which is why I've been rather actively pushing stuff upstream over the past few months.] We want to have a CI job set up that automatically merges upstream master into our private branch, at least at every -rc release, build-tests that, and, if successful, deploys it to target; if not (or if the merge cannot be done automatically in the first place), it sends an email so we're aware of the problem before the next release happens. So far, I've found three bugs in v2022.10 that could have been avoided (i.e., fixed before release) if we/I had had this in place.
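
To make the DEPENDS case in (2) concrete, here's a minimal sketch; the tool name is just an illustrative example, not necessarily what our recipe actually needed:

    # In a u-boot_%.bbappend (hypothetical example): the build needs a
    # host tool that my dev machine happens to have but the Docker
    # image / Yocto sysroot does not, so declare it explicitly.
    DEPENDS:append = " swig-native"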
Since I'm writing this wall of text anyway, let me explain how I was bitten by the build not failing: I had added a new blob requirement (not a "proprietary" thing, just changing the logic so that the boot script is included in the U-Boot FIT image as a "loadable", and thus automatically loaded to a known location by SPL, instead of having U-Boot itself load it from somewhere), and locally added that bootscript.itb to my build folder when testing. I had also duly updated the bitbake recipe to copy bootscript.itb to "${B}" before do_compile, but failed to remember that that was not the right place to put it, because the actual build folder is "${B}/<config name>". My own Yocto build succeeded, I deployed the binary to the board on my desk, and it didn't work... Only then did I go look in do_compile.log and find the warning(s).
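
For reference, what the recipe needed was something along these lines - a sketch only, assuming the usual UBOOT_CONFIG/UBOOT_MACHINE setup from u-boot.inc (so each config builds in its own subdirectory of ${B}) and that bootscript.itb is listed in SRC_URI so it ends up in ${WORKDIR}:

    # Hypothetical fix in the u-boot .bbappend: put bootscript.itb into
    # each per-config build directory rather than ${B} itself, since
    # that is the directory the build actually runs in.
    do_compile:prepend() {
        for config in ${UBOOT_MACHINE}; do
            install -m 0644 ${WORKDIR}/bootscript.itb ${B}/${config}/
        done
    }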
Rasmus