Re: [PATCH v5 00/27] Integrate MbedTLS v3.6 LTS with U-Boot

2 Aug 2024

Hi Tom,
On Thu, 1 Aug 2024 at 16:46, Tom Rini trini@konsulko.com wrote:
...
On Wed, Jul 31, 2024 at 10:25:10AM -0700, Raymond Mao wrote:
...
Integrate MbedTLS v3.6 LTS (currently v3.6.0) with U-Boot.
Motivations:

MbedTLS is well maintained with LTS versions.
LWIP is integrated with MbedTLS, which makes it easy to enable HTTPS.
MbedTLS recently switched license back to GPLv2.

Prerequisite:
This patch series requires mbedtls git repo to be added as a
subtree to the main U-Boot repo via:
    $ git subtree add --prefix lib/mbedtls/external/mbedtls \
          https://github.com/Mbed-TLS/mbedtls.git \
          v3.6.0 --squash
Moreover, due to the Windows-style files from mbedtls git repo,
we need to convert the CRLF endings to LF and do a commit manually:
    $ git add --renormalize .
    $ git commit
New Kconfig options:
`MBEDTLS_LIB` is for MbedTLS general switch.
`MBEDTLS_LIB_CRYPTO` is for replacing original digest and crypto libs
with MbedTLS.
`MBEDTLS_LIB_X509` is for replacing original X509, PKCS7, MSCode, ASN1,
and Pubkey parser with MbedTLS.
`LEGACY_CRYPTO` is introduced as a main switch for legacy crypto library.
`LEGACY_CRYPTO_BASIC` is for the basic crypto functionalities and
`LEGACY_CRYPTO_CERT` is for the certificate related functionalities.
For each algorithm, a pair of `<alg>_LEGACY` and `<alg>_MBEDTLS`
Kconfig options is introduced. `SPL_` Kconfig options are introduced
as well.
In this patch set, MBEDTLS_LIB, MBEDTLS_LIB_CRYPTO and MBEDTLS_LIB_X509
are enabled by default in qemu_arm64_defconfig and sandbox_defconfig
for testing purposes.
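As an illustration of the options listed above, a defconfig fragment that enables the MbedTLS path might look like the following. This is only a sketch: it applies the usual `CONFIG_` prefix to the option names given in this cover letter and does not show any dependencies between them.

```
CONFIG_MBEDTLS_LIB=y
CONFIG_MBEDTLS_LIB_CRYPTO=y
CONFIG_MBEDTLS_LIB_X509=y
```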
Patches for external MbedTLS project:
Since U-Boot uses Microsoft Authentication Code to verify PE/COFF
executables, which MbedTLS does not support at the moment,
additional patches for MbedTLS were created to adapt it to the EFI loader:

Decoding of Microsoft Authentication Code.
Decoding of PKCS#9 Authenticate Attributes.
Extending MbedTLS PKCS#7 lib to support multiple signer's certificates.
MbedTLS native test suites for PKCS#7 signer's info.

All 4 patches above (tagged with `mbedtls/external`) have been submitted to
the MbedTLS project and are under review; eventually they should be part of
an MbedTLS LTS release.
Until then, please merge them into U-Boot, otherwise the build
will break when MBEDTLS_LIB_X509 is enabled.
See the PR link below for reference:
https://github.com/Mbed-TLS/mbedtls/pull/9001
Miscellaneous:
Optimized MbedTLS library size by tailoring the config file
and disabling all unnecessary features for EFI loader.
From v2, original libs (rsa, asn1_decoder, rsa_helper, md5, sha1, sha256,
sha512) are completely replaced when MbedTLS is enabled.
From v3, the size-growth is slightly reduced by refactoring Hash
functions.
...
Target(QEMU arm64) size-growth when enabling MbedTLS:
v1: 6.03%
v2: 4.66%
From v3: 4.55%
Please see the latest output from buildman for size-growth on QEMU arm64,
Sandbox and Nanopi A64. [1]
Let us inline the growth on qemu_arm64 for a moment:
   aarch64: (for 1/1 boards) all +6916.0 bss -32.0 data -64.0 rodata +200.0 text +6812.0
            qemu_arm64     : all +6916 bss -32 data -64 rodata +200 text +6812
               u-boot: add: 28/-17, grow: 12/-16 bytes: 15492/-8304 (7188)
  function                             old     new   delta
  mbedtls_internal_sha1_process          -    4540   +4540
  mbedtls_internal_md5_process           -    2928   +2928
  mbedtls_internal_sha256_process        -    2052   +2052
  mbedtls_internal_sha512_process        -    1056   +1056
  K                                      -     896    +896
  mbedtls_sha512_finish                  -     556    +556
  mbedtls_sha256_finish                  -     484    +484
  mbedtls_sha1_finish                    -     420    +420
  mbedtls_sha512_starts                  -     340    +340
  mbedtls_md5_finish                     -     336    +336
  mbedtls_sha512_update                  -     264    +264
  mbedtls_sha256_update                  -     252    +252
  mbedtls_sha1_update                    -     236    +236
  mbedtls_md5_update                     -     236    +236
  mbedtls_sha512                         -     148    +148
  mbedtls_sha256_starts                  -     124    +124
  hash_init_sha512                      52     128     +76
  hash_init_sha256                      52     128     +76
  mbedtls_sha1_starts                    -      72     +72
  mbedtls_md5_starts                     -      60     +60
  hash_init_sha1                        52     112     +60
  mbedtls_platform_zeroize               -      56     +56
  mbedtls_sha512_free                    -      16     +16
  mbedtls_sha256_free                    -      16     +16
  mbedtls_sha1_free                      -      16     +16
  mbedtls_md5_free                       -      16     +16
  hash_finish_sha512                    72      88     +16
  hash_finish_sha256                    72      88     +16
  hash_finish_sha1                      72      88     +16
  sha512_csum_wd                        68      80     +12
  sha256_csum_wd                        68      80     +12
  sha1_csum_wd                          68      80     +12
  md5_wd                                68      80     +12
  mbedtls_sha512_init                    -      12     +12
  mbedtls_sha256_init                    -      12     +12
  mbedtls_sha1_init                      -      12     +12
  mbedtls_md5_init                       -      12     +12
  memset_func                            -       8      +8
  sha512_update                          4       8      +4
  sha384_update                          4       8      +4
  sha256_update                         12       8      -4
  sha1_update                           12       8      -4
  sha256_process                        16       -     -16
  sha1_process                          16       -     -16
  hash_update_sha512                    36      16     -20
  hash_update_sha256                    36      16     -20
  hash_update_sha1                      36      16     -20
  MD5Init                               56      36     -20
  sha1_starts                           60      36     -24
  hash_update_sha384                    36       -     -36
  hash_init_sha384                      52       -     -52
  sha384_csum_wd                        68      12     -56
  sha256_starts                        104      40     -64
  sha256_padding                        64       -     -64
  sha1_padding                          64       -     -64
  hash_finish_sha384                    72       -     -72
  sha512_finish                        152      36    -116
  sha512_starts                        168      40    -128
  sha384_starts                        168      40    -128
  sha384_finish                        152       4    -148
  MD5Final                             196      44    -152
  sha512_base_do_finalize              160       -    -160
  static.sha256_update                 228       -    -228
  static.sha1_update                   240       -    -240
  sha512_base_do_update                244       -    -244
  MD5Update                            260       -    -260
  sha1_finish                          300      36    -264
  sha256_finish                        404      36    -368
  sha256_armv8_ce_process              428       -    -428
  sha1_armv8_ce_process                484       -    -484
  sha512_K                             640       -    -640
  sha512_block_fn                     1212       -   -1212
  MD5Transform                        2552       -   -2552
And to start with, that's not bad. In fact, tossing LTO in before mbedTLS
only changes the top-line a little:
   aarch64: (for 1/1 boards) all +5120.0 bss -16.0 data -64.0 rodata +200.0 text +5000.0
            qemu_arm64     : all +5120 bss -16 data -64 rodata +200 text +5000
               u-boot: add: 19/-18, grow: 11/-7 bytes: 14696/-7884 (6812)
But is there still something we can do? mbedTLS is a more robust
solution and I accept there will be growth. Still, the
process/starts/finish functions are much larger. Is there something
configurable there?
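The two buildman summaries quoted above can be cross-checked with a little shell arithmetic. The sketch below only re-adds numbers copied from those summaries; the helper name `sum` and the variable names are made up for illustration.

```shell
# Add up a list of signed integers (hypothetical helper, not part of buildman).
sum() {
    total=0
    for n in "$@"; do
        total=$((total + $n))
    done
    echo "$total"
}

# Section deltas (bss, data, rodata, text) for qemu_arm64.
all_no_lto=$(sum -32 -64 200 6812)
all_lto=$(sum -16 -64 200 5000)

# Function-level bytes added minus bytes removed, from the "u-boot:" lines.
net_no_lto=$((15492 - 8304))
net_lto=$((14696 - 7884))

echo "all (no LTO): $all_no_lto, function-level net: $net_no_lto"
echo "all (LTO):    $all_lto, function-level net: $net_lto"
echo "difference in growth: $((all_no_lto - all_lto)) bytes"
```

The section deltas add up to the reported `all` figures (+6916 and +5120), so the growth from enabling MbedTLS is 1796 bytes smaller in the LTO build.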
I have investigated the large MbedTLS native functions
(_process/_update/_finish).
For MD5 and SHA1, there are no tunable configs.
For SHA256 and SHA512, there are a few configs:
1. Performance configs, only for Armv8/a64.
    I didn't turn these on; they might affect the target size as well.
2. A smaller implementation (only for non-Armv8/a64) that trades
    performance for size.
I enabled neither: #1 is aimed at performance and might increase the
target size, while #2 costs performance and applies only to non-Armv8/a64.
So it looks like neither helps reduce the size of qemu_arm64.
But I will try #1 on qemu_arm64 and #2 on sandbox and let you know
the size impact soon.
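For reference, the two groups of options described above most likely correspond to the following macros in the MbedTLS config header. This is an assumption based on a reading of the MbedTLS 3.6 `mbedtls_config.h`; this message does not name the options, so the names should be checked against the subtree before use. The two groups are alternatives to try separately, not a combined setting.

```
/* #1 (assumed names): Armv8-A crypto-extension acceleration, performance oriented */
#define MBEDTLS_SHA256_USE_ARMV8_A_CRYPTO_IF_PRESENT
#define MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT

/* #2 (assumed names): smaller but slower software implementations */
#define MBEDTLS_SHA256_SMALLER
#define MBEDTLS_SHA512_SMALLER
```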
Thanks.
Regards,
Raymond