Re: [PATCH 10/25] binman: Move section-building code into a function

23 Jan 2021

      Hi Alper,
On Wed, 4 Nov 2020 at 14:51, Alper Nebi Yasak alpernebiyasak@gmail.com wrote:
...
On 03/11/2020 18:11, Simon Glass wrote:
...
Hi Alper,
On Mon, 26 Oct 2020 at 17:20, Alper Nebi Yasak alpernebiyasak@gmail.com wrote:
...
On 26/10/2020 22:22, Simon Glass wrote:
...
I've added a few test cases along these lines in v2, and one of them
certainly different behaviour. This is good actually since it shows a
simple case of what these padding changes are intended to fix.
See what you think of the above test cases - testSectionPad() and
testSectionAlign()
I've tried to visualize those tests a bit, in the following:

The vertical line of numbers is the offsets, usually starts with 0 and
ends with entry.size of that entry.
The offset to the upper-left of a block is that entry's entry.offset
The "*" offsets are unconstrained, determined as parts are fitted
together.
The "k*n" are offsets for alignment that must be a multiple of k
The vertical "[...]" line is what the entry.data returns for that
entry.
The horizontal line from a value is what the value corresponds to.

Hope things make sense. I kind of started drawing things to gather my
thoughts and improve my understanding, but didn't want to discard them.
Please do tell if anything's wrong.
== 177_skip_at_start.dts ==
binman {
[ 0
. | section {
. | [ 0
. | . | 16 -------------------- binman/section:skip-at-start
. | . | | u-boot {
. | . | | [ 0
. | . | | . | U_BOOT_DATA
. | . | | ] *
. | . | | }
. | . | *
. | ] *
. | }
] *
}

I understand skip-at-start as if it's creating a frame of reference for
alignments. Then, that frame is mapped back to starting from zero.
It looks weird here for me to use two nested offset-lines here. I can't
use one line that starts at skip-at-start, because that changes the
entry.offset if the section has pad-before.
== 178_skip_at_start_pad.dts ==
binman {
[ 0
. | section {
. | [ 0
. | . | 16 -------------------- binman/section:skip-at-start
. | . | | u-boot {
. | . | |   0
. | . | |   | 8 x <0x00>
. | . | |   |  \--------------- binman/section/u-boot:pad-before
. | . | | [ *
. | . | | . | U_BOOT_DATA
. | . | | ] *
. | . | |   | 4 x <0x00>
. | . | |   |  \--------------- binman/section/u-boot:pad-after
. | . | |   * ----------------- binman/section/u-boot:size
. | . | | }
. | . | *
. | ] *
. | }
] *
}

This is like the above, just adds padding to u-boot. I have to visualize
the padding as something inside the entry block, since alignments and
entry.size is calculated for the padded data, not the raw U_BOOT_DATA.
It's a bit weird but understandable that len(entry.data) != entry.size.
== 179_skip_at_start_section_pad.dts ==
binman {
[ 0
. | section {
. |   0
. |   | 8 x <0x00>
. |   |  \--------------------- binman/section:pad-before
. | [ *
. | . | 16 -------------------- binman/section:skip-at-start
. | . | | u-boot {
. | . | | [ 0
. | . | | . | U_BOOT_DATA
. | . | | ] *
. | . | | }
. | . | *
. | ] *
. |   | 4 x <0x00>
. |   |  \--------------------- binman/section:pad-after
. |   *
. | }
] *
}

I'm still having trouble with this. In the old code:
...
base = self.pad_before + (entry.offset or 0) - self._skip_at_start
     8               + 16                  - 16

...
pad = base - len(section_data) + (entry.pad_before or 0)
    8    - 0                 +  0

...
if pad > 0:
 8

...
section_data += tools.GetBytes(self._pad_byte, pad)

                                                 8

So, why was it prepending 16 bytes? The only way I see is if u-boot
entry.offset is 24, but that's explicitly checked to be 16 in the test.
However, it's clear to me now that the fragment I sent wouldn't result
in different padding between two versions, because there entry.offset =
section.skip_at_start so the negative padding never happens.
Then, what does an entry.offset < section.skip-at-start mean? That's
what was missing for the actual thing I was trying to trigger:
section {
    skip-at-start = <16>;

    blob {
        offset = <0>;
        pad-before = <16>;
        filename = "foo";
    };
};

== 180_section_pad.dts ==
binman {
  [ 0
  . | section@0 {
  . |   0
  . |   | 3 x <0x26> -------------- binman:pad-byte
  . |   |  ----------------------- binman/section@0:pad-before
  . | [ *
  . | . | u-boot {
  . | . |   0
  . | . |   | 5 x <0x21> ---------- binman/section@0:pad-byte
  . | . |   |  ------------------- binman/section@0/u-boot:pad-before
  . | . | [ *
  . | . | . | U_BOOT_DATA
  . | . | ] *
  . | . |   | 1 x <0x21> ---------- binman/section@0:pad-byte
  . | . |   *  ------------------- binman/section@0/u-boot:pad-after
  . | . | }
  . | ] *
  . |   | 2 x <0x26> -------------- binman:pad-byte
  . |   |  ----------------------- binman/section@0:pad-after
  . |   *
  . | }
  ] *
  }
It looks like paddings are part of the entry:

entry.offset and entry.image_pos point to pad-before padding
entry.size includes both paddings
pad-before, pad-after properties belong to the entry
entry's parent aligns the entry with respect to the padded-data

But, also the opposite:

entry.data doesn't include padding bytes
it's entirely added by the entry's parent
pad-byte property belongs to the parent

I have no idea which way things should go towards. I think padding could
be completely handled by the entries themselves. Section's
GetPaddedDataForEntry(entry, entry_data) could be moved to Entry as
GetPaddedData(pad_byte), which the parent section would use while
assembling itself. The pad_byte argument could be dropped by making the
entries find it by traversing upwards in the tree starting from the
entry itself (and not just checking the immediate parent).
I appreciate all your thinking on this!
I originally had it so len(entry.data) == entry.size (and also
entry.contents_size). But the problem is that for compressed data we
don't want to compress the padding. Say we have an entry of size 100
with only e0 bytes used (hex used througout):
0   ---
    <data>
e0  <empty>
100 ---
We can certainly add the extra 20 bytes onto the end of the entry
data, as was done previously. Let's now compress it. We want to
compress the actual contents of the entry, excluding any padding,
since the padding is for the entry, not its contents:
0   ---
    <compressed data>
80  <empty>
100 ---
Binman has always had the concept of Entry.contents_size separate from
Entry.size. But they have typically been the same. With compression,
that breaks down, since the Entry.data is the compressed data
(Entry.uncomp_data is the uncompressed) and Entry.contents_size is the
size of the compressed data. In fact the uncompressed data might not
even fit in the entry.
In this case it does not make sense to have the Entry pad its own
data, since then it would be compressing part of it with padding
around. In trying to think this through and actually implement it, I
came to the point where it is today.
...
== 181_section_align.dts ==
binman {
  [ 0
  . | fill {
  . | [ 0
  . | . | <0x00>
  . | ] 1 ------------------------- binman/fill:size
  . | }
  . *
  . | <0x26> ---------------------- binman:pad-byte
  . 2*n --------------------------- binman/section@1:align
  . | section@1 {
  . | [ 0
  . | . | fill {
  . | . | [ 0
  . | . | . | <0x00>
  . | . | ] 1 --------------------- binman/section@1/fill:size
  . | . | }
  . | . *
  . | . | <0x21> ------------------ binman/section@1:pad-byte
  . | . 4*n ----------------------- binman/section@1/u-boot:align
  . | . | u-boot {
  . | . | [ 0
  . | . | . | U_BOOT_DATA
  . | . | ] *
  . | . |   | <0x21> -------------- binman/section@1:pad-byte
  . | . |   8*n ------------------- binman/section@1/u-boot:size
  . | . |    ----------------------binman/section@1/u-boot:align-size
  . | . | }
  . | ] *
  . |   | <0x21> ------------------ binman/section@1:pad-byte
  . |   0x10*n -------------------- binman/section@1:size
  . |     ------------------------ binman/section@1:align-size
  . | }
  ] *
  }
The pad-byte values here surprise me a bit. I'd say they should be the
parent's pad-byte, since I think this in-section alignment padding is
the same kind of thing as the pad-before and pad-after, and those use
the parent's. However, like what I said above, the latter two could
instead be changed to use the entry's pad-byte like this one.
With the design decision above it makes sense for the parent to
provide the pad data, since it is doing the padding. Were it to come
from the Entry itself, then every entry would need to specify the
padding byte to use. But if you think about it at a high level, it is
the parent section that is providing the space for the Entries to be
packed into, so the parent section should provide the padding byte.
Again, I have tried it both ways...
This stuff is surprisingly complicated. It will be interesting to see
if it stands the test of time. One reason for all the tests is so the
behaviour is as fully defined as possible.
Regards,
Simon