
Hi Tom,
On Thu, 1 Aug 2024 at 16:46, Tom Rini <trini@konsulko.com> wrote:
On Wed, Jul 31, 2024 at 10:25:10AM -0700, Raymond Mao wrote:
Integrate MbedTLS v3.6 LTS (currently v3.6.0) with U-Boot.
Motivations:
- MbedTLS is well maintained with LTS versions.
- LWIP is integrated with MbedTLS, which makes HTTPS easy to enable.
- MbedTLS recently switched license back to GPLv2.
Prerequisite:
This patch series requires the mbedtls git repo to be added as a subtree to the main U-Boot repo via:

    $ git subtree add --prefix lib/mbedtls/external/mbedtls \
          https://github.com/Mbed-TLS/mbedtls.git \
          v3.6.0 --squash

Moreover, due to the Windows-style files in the mbedtls git repo, we need to convert the CRLF endings to LF and do a commit manually:

    $ git add --renormalize .
    $ git commit
New Kconfig options:
`MBEDTLS_LIB` is the general switch for MbedTLS.
`MBEDTLS_LIB_CRYPTO` replaces the original digest and crypto libs with MbedTLS.
`MBEDTLS_LIB_X509` replaces the original X509, PKCS7, MSCode, ASN1, and Pubkey parsers with MbedTLS.
`LEGACY_CRYPTO` is introduced as the main switch for the legacy crypto library.
`LEGACY_CRYPTO_BASIC` covers the basic crypto functionality and `LEGACY_CRYPTO_CERT` covers the certificate-related functionality.
For each algorithm, a pair of `<alg>_LEGACY` and `<alg>_MBEDTLS` Kconfig options is introduced.
`SPL_` Kconfig options are introduced as well.
In this patch set, MBEDTLS_LIB, MBEDTLS_LIB_CRYPTO and MBEDTLS_LIB_X509 are enabled by default in qemu_arm64_defconfig and sandbox_defconfig for testing purposes.
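As a rough illustration of how a `<alg>_LEGACY` / `<alg>_MBEDTLS` pair could pick an implementation at compile time, here is a minimal standalone sketch. The `CONFIG_SHA256_*` names simply apply the pattern above to SHA256, and `sha256_backend()` is a hypothetical helper; none of this is taken from the patches themselves, and a real U-Boot build gets these symbols from the generated config rather than from hand-written defines.

```c
/*
 * Hypothetical stand-ins for Kconfig symbols. In U-Boot these would come
 * from the generated configuration, not from this file.
 */
#define CONFIG_MBEDTLS_LIB 1
#define CONFIG_MBEDTLS_LIB_CRYPTO 1
#define CONFIG_SHA256_MBEDTLS 1
/* #define CONFIG_SHA256_LEGACY 1 */

/*
 * Return the name of the SHA256 backend selected at compile time.
 * Illustrative helper only; it does not exist in U-Boot.
 */
static const char *sha256_backend(void)
{
#if defined(CONFIG_MBEDTLS_LIB_CRYPTO) && defined(CONFIG_SHA256_MBEDTLS)
	return "mbedtls";
#elif defined(CONFIG_SHA256_LEGACY)
	return "legacy";
#else
	return "none";
#endif
}
```

With the defines as written, `sha256_backend()` returns "mbedtls"; commenting out `CONFIG_SHA256_MBEDTLS` and enabling `CONFIG_SHA256_LEGACY` switches it to "legacy".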
Patches for external MbedTLS project:
Since U-Boot uses Microsoft Authentication Code to verify PE/COFF executables, which is not supported by MbedTLS at the moment, additional patches for MbedTLS are created to adapt it to the EFI loader:
- Decoding of Microsoft Authentication Code.
- Decoding of PKCS#9 Authenticated Attributes.
- Extending MbedTLS PKCS#7 lib to support multiple signer's certificates.
- MbedTLS native test suites for PKCS#7 signer's info.
All 4 patches above (tagged with `mbedtls/external`) have been submitted to the MbedTLS project and are under review; eventually they should be part of an MbedTLS LTS release. Until then, please merge them into U-Boot, otherwise the build will break when MBEDTLS_LIB_X509 is enabled.
See the PR link below for reference: https://github.com/Mbed-TLS/mbedtls/pull/9001
Miscellaneous:
Optimized MbedTLS library size by tailoring the config file and disabling all unnecessary features for EFI loader.
From v2, original libs (rsa, asn1_decoder, rsa_helper, md5, sha1, sha256, sha512) are completely replaced when MbedTLS is enabled.
From v3, the size-growth is slightly reduced by refactoring Hash functions.

Target (QEMU arm64) size-growth when enabling MbedTLS:
v1: 6.03%
v2: 4.66%
From v3: 4.55%
Please see the latest output from buildman for size-growth on QEMU arm64, Sandbox and Nanopi A64. [1]
Let us inline the growth on qemu_arm64 for a moment:

aarch64: (for 1/1 boards) all +6916.0 bss -32.0 data -64.0 rodata +200.0 text +6812.0
   qemu_arm64     : all +6916 bss -32 data -64 rodata +200 text +6812
      u-boot: add: 28/-17, grow: 12/-16 bytes: 15492/-8304 (7188)
         function                            old     new   delta
         mbedtls_internal_sha1_process         -    4540   +4540
         mbedtls_internal_md5_process          -    2928   +2928
         mbedtls_internal_sha256_process       -    2052   +2052
         mbedtls_internal_sha512_process       -    1056   +1056
         K                                     -     896    +896
         mbedtls_sha512_finish                 -     556    +556
         mbedtls_sha256_finish                 -     484    +484
         mbedtls_sha1_finish                   -     420    +420
         mbedtls_sha512_starts                 -     340    +340
         mbedtls_md5_finish                    -     336    +336
         mbedtls_sha512_update                 -     264    +264
         mbedtls_sha256_update                 -     252    +252
         mbedtls_sha1_update                   -     236    +236
         mbedtls_md5_update                    -     236    +236
         mbedtls_sha512                        -     148    +148
         mbedtls_sha256_starts                 -     124    +124
         hash_init_sha512                     52     128     +76
         hash_init_sha256                     52     128     +76
         mbedtls_sha1_starts                   -      72     +72
         mbedtls_md5_starts                    -      60     +60
         hash_init_sha1                       52     112     +60
         mbedtls_platform_zeroize              -      56     +56
         mbedtls_sha512_free                   -      16     +16
         mbedtls_sha256_free                   -      16     +16
         mbedtls_sha1_free                     -      16     +16
         mbedtls_md5_free                      -      16     +16
         hash_finish_sha512                   72      88     +16
         hash_finish_sha256                   72      88     +16
         hash_finish_sha1                     72      88     +16
         sha512_csum_wd                       68      80     +12
         sha256_csum_wd                       68      80     +12
         sha1_csum_wd                         68      80     +12
         md5_wd                               68      80     +12
         mbedtls_sha512_init                   -      12     +12
         mbedtls_sha256_init                   -      12     +12
         mbedtls_sha1_init                     -      12     +12
         mbedtls_md5_init                      -      12     +12
         memset_func                           -       8      +8
         sha512_update                         4       8      +4
         sha384_update                         4       8      +4
         sha256_update                        12       8      -4
         sha1_update                          12       8      -4
         sha256_process                       16       -     -16
         sha1_process                         16       -     -16
         hash_update_sha512                   36      16     -20
         hash_update_sha256                   36      16     -20
         hash_update_sha1                     36      16     -20
         MD5Init                              56      36     -20
         sha1_starts                          60      36     -24
         hash_update_sha384                   36       -     -36
         hash_init_sha384                     52       -     -52
         sha384_csum_wd                       68      12     -56
         sha256_starts                       104      40     -64
         sha256_padding                       64       -     -64
         sha1_padding                         64       -     -64
         hash_finish_sha384                   72       -     -72
         sha512_finish                       152      36    -116
         sha512_starts                       168      40    -128
         sha384_starts                       168      40    -128
         sha384_finish                       152       4    -148
         MD5Final                            196      44    -152
         sha512_base_do_finalize             160       -    -160
         static.sha256_update                228       -    -228
         static.sha1_update                  240       -    -240
         sha512_base_do_update               244       -    -244
         MD5Update                           260       -    -260
         sha1_finish                         300      36    -264
         sha256_finish                       404      36    -368
         sha256_armv8_ce_process             428       -    -428
         sha1_armv8_ce_process               484       -    -484
         sha512_K                            640       -    -640
         sha512_block_fn                    1212       -   -1212
         MD5Transform                       2552       -   -2552
And to start with, that's not bad. In fact, tossing LTO in before mbedTLS only changes the top-line a little:

aarch64: (for 1/1 boards) all +5120.0 bss -16.0 data -64.0 rodata +200.0 text +5000.0
   qemu_arm64     : all +5120 bss -16 data -64 rodata +200 text +5000
      u-boot: add: 19/-18, grow: 11/-7 bytes: 14696/-7884 (6812)
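For anyone wanting to sanity-check the buildman summary lines quoted above, here is a small sketch of the arithmetic: "all" should equal the sum of the bss, data, rodata and text deltas, and the figure in parentheses on the "bytes:" line is added minus removed. The struct and helper names are made up for this illustration; only the numbers come from the output in this thread.

```c
#include <stdbool.h>

/* One buildman summary line: per-section size deltas in bytes. */
struct section_delta {
	long all;
	long bss;
	long data;
	long rodata;
	long text;
};

/* Check that "all" equals the sum of the per-section deltas. */
static bool sections_add_up(const struct section_delta *d)
{
	return d->all == d->bss + d->data + d->rodata + d->text;
}

/* Net change as shown in "bytes: added/-removed (net)". */
static long net_bytes(long added, long removed)
{
	return added - removed;
}

/* qemu_arm64 figures quoted in this thread, without and with LTO. */
static const struct section_delta no_lto = {
	.all = 6916, .bss = -32, .data = -64, .rodata = 200, .text = 6812,
};
static const struct section_delta with_lto = {
	.all = 5120, .bss = -16, .data = -64, .rodata = 200, .text = 5000,
};
```

Both summary lines are internally consistent under this check, and the "bytes:" lines give 15492 - 8304 = 7188 without LTO and 14696 - 7884 = 6812 with LTO.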
But is there still something we can do? mbedTLS is a more robust solution and I accept that there will be growth. Still, the process/start/finish functions are much larger. Is there something configurable there?
I have investigated all of the large MbedTLS native functions (_process/_update/_finish).
For MD5 and SHA1, there are no tunable configs.
For SHA256 and SHA512, there are a few configs:
1. Performance configs, only for Armv8/a64. I didn't turn these on; they might affect the target size as well.
2. A smaller implementation (only for non-Armv8/a64) that reduces size at the expense of performance.
I enabled neither: #1 is aimed at performance and might increase the target size, while #2 sacrifices performance and applies only to non-Armv8/a64. It looks like neither helps reduce the size of qemu_arm64. But I will try #1 on qemu_arm64 and #2 on sandbox and let you know the size impact soon.
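For reference, a sketch of what trying the two kinds of option could look like in an MbedTLS config header. The macro names are my recollection of the upstream `mbedtls_config.h` options for v3.6 and are an assumption about which options are meant here; please check them against the imported tree before relying on them. The `__aarch64__` split is only for illustration.

```c
/*
 * Hypothetical excerpt of an MbedTLS config header.
 * Macro names are assumed from upstream mbedtls_config.h; verify them
 * against the imported v3.6 tree.
 */
#if defined(__aarch64__)
/* #1: use the Armv8-A crypto extensions when the CPU provides them. */
#define MBEDTLS_SHA256_USE_ARMV8_A_CRYPTO_IF_PRESENT
#define MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT
#else
/* #2: smaller software implementations, trading speed for size. */
#define MBEDTLS_SHA256_SMALLER
#define MBEDTLS_SHA512_SMALLER
#endif
```

Comparing buildman output with and without each group should show whether either one moves the numbers for qemu_arm64 or sandbox.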
Thanks.

Regards,
Raymond