[PATCH 00/15] vbe: Series part F

This includes various patches towards implementing the VBE abrec bootmeth in U-Boot. It mostly focuses on introducing a relocating SPL-loader so that VBE can run in the limited amount of SRAM available on many devices.
Another minor new feature is support in VBE for specifying the image phase when loading from a FIT. This allows a single FIT to include images for several boot phases, thus simplifying image-creation.
One lingering niggle in this series is that it has a different code path for sandbox, since it does not support the relocating jump. It should be possible to resolve this with additional work, but I have not attempted this so far.
Again, only MMC is supported so far.
Looking ahead, series G will have some more plumbing and H some rk3399 pieces. That should be enough to complete these feature.
Here is a run in my lab, with the VBE ABrec bootmeth. You can see that VPL runs before memory is set up. SPL sets up memory and can be upgraded in the field reliably.
$ ub-int vbe Building U-Boot in sourcedir for rk3399-generic Bootstrapping U-Boot from dir /tmp/b/rk3399-generic Writing U-Boot using method rockchip
U-Boot TPL 2025.01-rc3-00345-gdfbdbf1eb56c-dirty (Jan 08 2025 - 10:47:58) Trying to boot from vbe_abrec load: Firefly-RK3399 Board Using 'config-3' configuration Trying 'image-vpl' firmware subimage Using 'config-3' configuration Trying 'fdt-3' fdt subimage
U-Boot VPL 2025.01-rc3-00345-gdfbdbf1eb56c-dirty (Jan 08 2025 - 10:47:58) Trying to boot from vbe_abrec load: Firefly-RK3399 Board Starting with empty state VBE: Firmware pick A at 800000 Using 'config-3' configuration Trying 'spl' firmware subimage Using 'config-3' configuration Trying 'fdt-3' fdt subimage Channel 0: DDR3, 800MHz BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB Channel 1: DDR3, 800MHz BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB 256B stride
U-Boot SPL 2025.01-rc3-00345-gdfbdbf1eb56c-dirty (Jan 08 2025 - 10:47:58 -0700) Trying to boot from vbe_abrec load: Firefly-RK3399 Board VBE: Firmware pick A at 900000 load_simple_fit: Skip load 'atf-5': image size is 0! Relocating bloblist ff8eff00 to 100000: done ns16550_serial serial@ff1a0000: pinctrl_select_state_full: uclass_get_device_by_phandle_id: err=-19
U-Boot 2025.01-rc3-00345-gdfbdbf1eb56c-dirty (Jan 08 2025 - 10:47:58 -0700)
SoC: Rockchip rk3399 Reset cause: POR Model: Firefly-RK3399 Board DRAM: 4 GiB (effective 3.9 GiB) Core: 314 devices, 33 uclasses, devicetree: separate MMC: mmc@fe310000: 3, mmc@fe320000: 1, mmc@fe330000: 0 Loading Environment from SPIFlash... Invalid bus 0 (err=-19) *** Warning - spi_flash_probe_bus_cs() failed, using default environment
In: serial,usbkbd Out: serial,vidconsole Err: serial,vidconsole Model: Firefly-RK3399 Board Net: PMIC: RK808 eth0: ethernet@fe300000
starting USB... Bus usb@fe380000: USB EHCI 1.00 Bus usb@fe3a0000: USB OHCI 1.0 Bus usb@fe3c0000: USB EHCI 1.00 Bus usb@fe3e0000: USB OHCI 1.0 Bus usb@fe900000: Register 2000140 NbrPorts 2 Starting the controller USB XHCI 1.10 scanning bus usb@fe380000 for devices... 1 USB Device(s) found scanning bus usb@fe3a0000 for devices... 1 USB Device(s) found scanning bus usb@fe3c0000 for devices... 2 USB Device(s) found scanning bus usb@fe3e0000 for devices... 1 USB Device(s) found scanning bus usb@fe900000 for devices... 1 USB Device(s) found scanning usb for storage devices... 0 Storage Device(s) found Hit any key to stop autoboot: 0
Simon Glass (15): vbe: Split out some VBE code into a common file vbe: Split out reading a FIT into a common file vbe: Allocate space for the FIT header vbe: Allow VBE to load FITs on any architecture vbe: Tidy up error checking with blk_read() vbe: Handle loading from an unaligned offset vbe: Allow loading loadables if there is no firmware vbe: Support loading an FDT from the FIT spl: Add fields for VBE spl: Add a type for the jumper function spl: Add support for a relocating jump to the next phase spl: Plumb in the relocating loader vbe: Support loading an FDT with the relocating loader vbe: Support loading SPL images vbe: Update simple-fw to support using the SPL loader
boot/Makefile | 2 +- boot/vbe_common.c | 362 +++++++++++++++++++++++++++++++++++++++++ boot/vbe_common.h | 173 ++++++++++++++++++++ boot/vbe_simple.c | 99 +++-------- boot/vbe_simple.h | 15 +- boot/vbe_simple_fw.c | 207 +++++++++-------------- common/spl/Kconfig | 8 + common/spl/Kconfig.tpl | 8 + common/spl/Kconfig.vpl | 8 + common/spl/Makefile | 1 + common/spl/spl.c | 15 +- common/spl/spl_reloc.c | 183 +++++++++++++++++++++ include/spl.h | 90 +++++++++- 13 files changed, 948 insertions(+), 223 deletions(-) create mode 100644 boot/vbe_common.c create mode 100644 boot/vbe_common.h create mode 100644 common/spl/spl_reloc.c

Loading a FIT is useful for other VBE methods, such as ABrec, so start a new common file. Add functions for reading the version and nvdata. Also add a function to get the block device.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/Makefile | 2 +- boot/vbe_common.c | 93 ++++++++++++++++++++++++++++++ boot/vbe_common.h | 143 ++++++++++++++++++++++++++++++++++++++++++++++ boot/vbe_simple.c | 99 +++++++------------------------- boot/vbe_simple.h | 15 +---- 5 files changed, 258 insertions(+), 94 deletions(-) create mode 100644 boot/vbe_common.c create mode 100644 boot/vbe_common.h
diff --git a/boot/Makefile b/boot/Makefile index 9446c6b82a9..30529bac367 100644 --- a/boot/Makefile +++ b/boot/Makefile @@ -66,7 +66,7 @@ endif
obj-$(CONFIG_$(PHASE_)BOOTMETH_VBE) += vbe.o obj-$(CONFIG_$(PHASE_)BOOTMETH_VBE_REQUEST) += vbe_request.o -obj-$(CONFIG_$(PHASE_)BOOTMETH_VBE_SIMPLE) += vbe_simple.o +obj-$(CONFIG_$(PHASE_)BOOTMETH_VBE_SIMPLE) += vbe_simple.o vbe_common.o obj-$(CONFIG_$(PHASE_)BOOTMETH_VBE_SIMPLE_FW) += vbe_simple_fw.o obj-$(CONFIG_$(PHASE_)BOOTMETH_VBE_SIMPLE_OS) += vbe_simple_os.o
diff --git a/boot/vbe_common.c b/boot/vbe_common.c new file mode 100644 index 00000000000..5f31cde0288 --- /dev/null +++ b/boot/vbe_common.c @@ -0,0 +1,93 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Verified Boot for Embedded (VBE) common functions + * + * Copyright 2024 Google LLC + * Written by Simon Glass sjg@chromium.org + */ + +#include <blk.h> +#include <bootstage.h> +#include <display_options.h> +#include <dm.h> +#include <image.h> +#include <spl.h> +#include <mapmem.h> +#include <memalign.h> +#include <linux/types.h> +#include <u-boot/crc.h> +#include "vbe_common.h" + +int vbe_read_version(struct udevice *blk, ulong offset, char *version, + int max_size) +{ + ALLOC_CACHE_ALIGN_BUFFER(u8, buf, MMC_MAX_BLOCK_LEN); + + if (max_size > MMC_MAX_BLOCK_LEN) + return log_msg_ret("ver", -E2BIG); + + if (offset & (MMC_MAX_BLOCK_LEN - 1)) + return log_msg_ret("get", -EBADF); + offset /= MMC_MAX_BLOCK_LEN; + + if (blk_read(blk, offset, 1, buf) != 1) + return log_msg_ret("read", -EIO); + strlcpy(version, buf, max_size); + + return 0; +} + +int vbe_read_nvdata(struct udevice *blk, ulong offset, ulong size, u8 *buf) +{ + uint hdr_ver, hdr_size, data_size, crc; + const struct vbe_nvdata *nvd; + + if (size > MMC_MAX_BLOCK_LEN) + return log_msg_ret("state", -E2BIG); + + if (offset & (MMC_MAX_BLOCK_LEN - 1)) + return log_msg_ret("get", -EBADF); + offset /= MMC_MAX_BLOCK_LEN; + + if (blk_read(blk, offset, 1, buf) != 1) + return log_msg_ret("read", -EIO); + nvd = (struct vbe_nvdata *)buf; + hdr_ver = (nvd->hdr & NVD_HDR_VER_MASK) >> NVD_HDR_VER_SHIFT; + hdr_size = (nvd->hdr & NVD_HDR_SIZE_MASK) >> NVD_HDR_SIZE_SHIFT; + if (hdr_ver != NVD_HDR_VER_CUR) + return log_msg_ret("hdr", -EPERM); + data_size = 1 << hdr_size; + if (data_size > sizeof(*nvd)) + return log_msg_ret("sz", -EPERM); + + crc = crc8(0, buf + 1, data_size - 1); + if (crc != nvd->crc8) + return log_msg_ret("crc", -EPERM); + + return 0; +} + +int vbe_get_blk(const char *storage, struct udevice **blkp) +{ + struct blk_desc *desc; + char devname[16]; + const char *end; + int devnum; + + /* First figure out the block device */ + log_debug("storage=%s\n", storage); + devnum = trailing_strtoln_end(storage, NULL, &end); + if (devnum == -1) + return log_msg_ret("num", -ENODEV); + if (end - storage >= sizeof(devname)) + return log_msg_ret("end", -E2BIG); + strlcpy(devname, storage, end - storage + 1); + log_debug("dev=%s, %x\n", devname, devnum); + + desc = blk_get_dev(devname, devnum); + if (!desc) + return log_msg_ret("get", -ENXIO); + *blkp = desc->bdev; + + return 0; +} diff --git a/boot/vbe_common.h b/boot/vbe_common.h new file mode 100644 index 00000000000..df133fbf05f --- /dev/null +++ b/boot/vbe_common.h @@ -0,0 +1,143 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Verified Boot for Embedded (VBE) common functions + * + * Copyright 2024 Google LLC + * Written by Simon Glass sjg@chromium.org + */ + +#ifndef __VBE_COMMON_H +#define __VBE_COMMON_H + +#include <linux/types.h> + +struct udevice; + +/* + * Controls whether we use a full bootmeth driver with VBE in this phase, or + * just access the information directly. + * + * For now VBE-simple uses the full bootmeth, but VBE-abrec does not, to reduce + * code size + */ +#define USE_BOOTMETH CONFIG_IS_ENABLED(BOOTMETH_VBE_SIMPLE) + +struct spl_image_info; +struct spl_load_info; + +enum { + MAX_VERSION_LEN = 256, + + NVD_HDR_VER_SHIFT = 0, + NVD_HDR_VER_MASK = 0xf, + NVD_HDR_SIZE_SHIFT = 4, + NVD_HDR_SIZE_MASK = 0xf << NVD_HDR_SIZE_SHIFT, + + /* Firmware key-version is in the top 16 bits of fw_ver */ + FWVER_KEY_SHIFT = 16, + FWVER_FW_MASK = 0xffff, + + NVD_HDR_VER_CUR = 1, /* current version */ +}; + +/** + * enum vbe_try_result - result of trying a firmware pick + * + * @VBETR_UNKNOWN: Unknown / invalid result + * @VBETR_TRYING: Firmware pick is being tried + * @VBETR_OK: Firmware pick is OK and can be used from now on + * @VBETR_BAD: Firmware pick is bad and should be removed + */ +enum vbe_try_result { + VBETR_UNKNOWN, + VBETR_TRYING, + VBETR_OK, + VBETR_BAD, +}; + +/** + * enum vbe_flags - flags controlling operation + * + * @VBEF_TRY_COUNT_MASK: mask for the 'try count' value + * @VBEF_TRY_B: Try the B slot + * @VBEF_RECOVERY: Use recovery slot + */ +enum vbe_flags { + VBEF_TRY_COUNT_MASK = 0x3, + VBEF_TRY_B = BIT(2), + VBEF_RECOVERY = BIT(3), + + VBEF_RESULT_SHIFT = 4, + VBEF_RESULT_MASK = 3 << VBEF_RESULT_SHIFT, + + VBEF_PICK_SHIFT = 6, + VBEF_PICK_MASK = 3 << VBEF_PICK_SHIFT, +}; + +/** + * struct vbe_nvdata - basic storage format for non-volatile data + * + * This is used for all VBE methods + * + * @crc8: crc8 for the entire record except @crc8 field itself + * @hdr: header size and version (NVD_HDR_...) + * @spare1: unused, must be 0 + * @fw_vernum: version and key version (FWVER_...) + * @flags: Flags controlling operation (enum vbe_flags) + */ +struct vbe_nvdata { + u8 crc8; + u8 hdr; + u16 spare1; + u32 fw_vernum; + u32 flags; + u8 spare2[0x34]; +}; + +/** + * vbe_read_nvdata() - Read non-volatile data from a block device + * + * Reads the VBE nvdata from a device. This function reads a single block from + * the device, so the nvdata cannot be larger than that. + * + * @blk: Device to read from + * @offset: Offset to read, in bytes + * @size: Number of bytes to read + * @buf: Buffer to hold the data + * Return: 0 if OK, -E2BIG if @size > block size, -EBADF if the offset is not + * block-aligned, -EIO if an I/O error occurred, -EPERM if the header version is + * incorrect, the header size is invalid or the data fails its CRC check + */ +int vbe_read_nvdata(struct udevice *blk, ulong offset, ulong size, u8 *buf); + +/** + * vbe_read_version() - Read version-string from a block device + * + * Reads the VBE version-string from a device. This function reads a single + * block from the device, so the string cannot be larger than that. It uses a + * temporary buffer for the read, then copies in up to @size bytes + * + * @blk: Device to read from + * @offset: Offset to read, in bytes + * @version: Place to put the string + * @max_size: Maximum size of @version + * Return: 0 if OK, -E2BIG if @max_size > block size, -EBADF if the offset is + * not block-aligned, -EIO if an I/O error occurred + */ +int vbe_read_version(struct udevice *blk, ulong offset, char *version, + int max_size); + +/** + * vbe_get_blk() - Obtain the block device to use for VBE + * + * Decodes the string to produce a block device + * + * @storage: String indicating the device to use, e.g. "mmc1" + * @blkp: Returns associated block device, on success + * Return 0 if OK, -ENODEV if @storage does not end with a number, -E2BIG if + * the device name is more than 15 characters, -ENXIO if the block device could + * not be found + */ +int vbe_get_blk(const char *storage, struct udevice **blkp); + +#endif /* __VBE_ABREC_H */ diff --git a/boot/vbe_simple.c b/boot/vbe_simple.c index ed7b9598e38..792750f7ef6 100644 --- a/boot/vbe_simple.c +++ b/boot/vbe_simple.c @@ -18,70 +18,23 @@ #include <vbe.h> #include <dm/device-internal.h> #include <dm/ofnode.h> -#include <u-boot/crc.h> +#include "vbe_common.h" #include "vbe_simple.h"
-/** struct simple_nvdata - storage format for non-volatile data */ -struct simple_nvdata { - u8 crc8; - u8 hdr; - u16 spare1; - u32 fw_vernum; - u8 spare2[0x38]; -}; - -static int simple_read_version(struct udevice *dev, struct blk_desc *desc, - u8 *buf, struct simple_state *state) +static int simple_read_nvdata(struct udevice *dev, struct udevice *blk, + struct simple_state *state) { + ALLOC_CACHE_ALIGN_BUFFER(u8, buf, MMC_MAX_BLOCK_LEN); struct simple_priv *priv = dev_get_priv(dev); - int start; - - if (priv->version_size > MMC_MAX_BLOCK_LEN) - return log_msg_ret("ver", -E2BIG); - - start = priv->area_start + priv->version_offset; - if (start & (MMC_MAX_BLOCK_LEN - 1)) - return log_msg_ret("get", -EBADF); - start /= MMC_MAX_BLOCK_LEN; - - if (blk_dread(desc, start, 1, buf) != 1) - return log_msg_ret("read", -EIO); - strlcpy(state->fw_version, buf, MAX_VERSION_LEN); - log_debug("version=%s\n", state->fw_version); + const struct vbe_nvdata *nvd; + int ret;
- return 0; -} + ret = vbe_read_nvdata(blk, priv->area_start + priv->state_offset, + priv->state_size, buf); + if (ret) + return log_msg_ret("nv", ret);
-static int simple_read_nvdata(struct udevice *dev, struct blk_desc *desc, - u8 *buf, struct simple_state *state) -{ - struct simple_priv *priv = dev_get_priv(dev); - uint hdr_ver, hdr_size, size, crc; - const struct simple_nvdata *nvd; - int start; - - if (priv->state_size > MMC_MAX_BLOCK_LEN) - return log_msg_ret("state", -E2BIG); - - start = priv->area_start + priv->state_offset; - if (start & (MMC_MAX_BLOCK_LEN - 1)) - return log_msg_ret("get", -EBADF); - start /= MMC_MAX_BLOCK_LEN; - - if (blk_dread(desc, start, 1, buf) != 1) - return log_msg_ret("read", -EIO); - nvd = (struct simple_nvdata *)buf; - hdr_ver = (nvd->hdr & NVD_HDR_VER_MASK) >> NVD_HDR_VER_SHIFT; - hdr_size = (nvd->hdr & NVD_HDR_SIZE_MASK) >> NVD_HDR_SIZE_SHIFT; - if (hdr_ver != NVD_HDR_VER_CUR) - return log_msg_ret("hdr", -EPERM); - size = 1 << hdr_size; - if (size > sizeof(*nvd)) - return log_msg_ret("sz", -ENOEXEC); - - crc = crc8(0, buf + 1, size - 1); - if (crc != nvd->crc8) - return log_msg_ret("crc", -EPERM); + nvd = (struct vbe_nvdata *)buf; state->fw_vernum = nvd->fw_vernum;
log_debug("version=%s\n", state->fw_version); @@ -91,33 +44,21 @@ static int simple_read_nvdata(struct udevice *dev, struct blk_desc *desc,
int vbe_simple_read_state(struct udevice *dev, struct simple_state *state) { - ALLOC_CACHE_ALIGN_BUFFER(u8, buf, MMC_MAX_BLOCK_LEN); struct simple_priv *priv = dev_get_priv(dev); - struct blk_desc *desc; - char devname[16]; - const char *end; - int devnum; + struct udevice *blk; int ret;
- /* First figure out the block device */ - log_debug("storage=%s\n", priv->storage); - devnum = trailing_strtoln_end(priv->storage, NULL, &end); - if (devnum == -1) - return log_msg_ret("num", -ENODEV); - if (end - priv->storage >= sizeof(devname)) - return log_msg_ret("end", -E2BIG); - strlcpy(devname, priv->storage, end - priv->storage + 1); - log_debug("dev=%s, %x\n", devname, devnum); - - desc = blk_get_dev(devname, devnum); - if (!desc) - return log_msg_ret("get", -ENXIO); - - ret = simple_read_version(dev, desc, buf, state); + ret = vbe_get_blk(priv->storage, &blk); + if (ret) + return log_msg_ret("blk", ret); + + ret = vbe_read_version(blk, priv->area_start + priv->version_offset, + state->fw_version, MAX_VERSION_LEN); if (ret) return log_msg_ret("ver", ret); + log_debug("version=%s\n", state->fw_version);
- ret = simple_read_nvdata(dev, desc, buf, state); + ret = simple_read_nvdata(dev, blk, state); if (ret) return log_msg_ret("nvd", ret);
diff --git a/boot/vbe_simple.h b/boot/vbe_simple.h index 56d319206f2..8eb3ddb61ba 100644 --- a/boot/vbe_simple.h +++ b/boot/vbe_simple.h @@ -9,20 +9,7 @@ #ifndef __VBE_SIMPLE_H #define __VBE_SIMPLE_H
-enum { - MAX_VERSION_LEN = 256, - - NVD_HDR_VER_SHIFT = 0, - NVD_HDR_VER_MASK = 0xf, - NVD_HDR_SIZE_SHIFT = 4, - NVD_HDR_SIZE_MASK = 0xf << NVD_HDR_SIZE_SHIFT, - - /* Firmware key-version is in the top 16 bits of fw_ver */ - FWVER_KEY_SHIFT = 16, - FWVER_FW_MASK = 0xffff, - - NVD_HDR_VER_CUR = 1, /* current version */ -}; +#include "vbe_common.h"
/** struct simple_priv - information read from the device tree */ struct simple_priv {

On Thu, Jan 09, 2025 at 05:29:56AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec, so start a new common file. Add functions for reading the version and nvdata. Also add a function to get the block device.
This is not a good commit message when you're also introducing functional changes (blk_dread -> blk_read). Further, this functional change seems to introduce some other code being pulled in very unexpectedly, and that needs to be explained. For example, phycore_am62x_r5_usbdfu is another case and that has CONFIG_BLOCK_CACHE=y but CONFIG_SPL_BLOCK_CACHE=n, and yet the size growth is in full U-Boot. Just moving code should not result in unexpected size change.

Hi Tom,
On Mon, 13 Jan 2025 at 18:22, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:56AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec, so start a new common file. Add functions for reading the version and nvdata. Also add a function to get the block device.
This is not a good commit message when you're also introducing functional changes (blk_dread -> blk_read).
Yes, I did not actually notice this size change at all, as mentioned, as I only built for a few boards (firefly rk3399/rk3288 and a few sandbox ones). The blk_dread() function is the old API for reading from a block. It didn't occur to me that blk_dread() would bypass the cache.
Further, this functional change seems to introduce some other code being pulled in very unexpectedly, and that needs to be explained. For example, phycore_am62x_r5_usbdfu is another case and that has CONFIG_BLOCK_CACHE=y but CONFIG_SPL_BLOCK_CACHE=n, and yet the size growth is in full U-Boot.
This board currently uses blk_dread(), but not blk_read(), so the addition of a blk_read() call in non-SPL causes the BLK cache to be pulled in.
Just moving code should not result in unexpected size change.
That's correct, so long as 'expected' size changes includes the normal function-call overhead and loss of intra-module optimisation.
So, what to do here? I certainly don't want to use the old blk API. But I can perhaps do a patch which changes VBE to use the new one, first, so the size growth is within that commit. I looked up BLOCK_CACHE and it is used by 90% of boards.
Another thing I could do (down the track) is a separate series to remove blk_d...() functions.
Regards, Simon

Hi Tom,
On Tue, 14 Jan 2025 at 05:58, Simon Glass sjg@chromium.org wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 18:22, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:56AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec, so start a new common file. Add functions for reading the version and nvdata. Also add a function to get the block device.
This is not a good commit message when you're also introducing functional changes (blk_dread -> blk_read).
Yes, I did not actually notice this size change at all, as mentioned, as I only built for a few boards (firefly rk3399/rk3288 and a few sandbox ones). The blk_dread() function is the old API for reading from a block. It didn't occur to me that blk_dread() would bypass the cache.
Further, this functional change seems to introduce some other code being pulled in very unexpectedly, and that needs to be explained. For example, phycore_am62x_r5_usbdfu is another case and that has CONFIG_BLOCK_CACHE=y but CONFIG_SPL_BLOCK_CACHE=n, and yet the size growth is in full U-Boot.
This board currently uses blk_dread(), but not blk_read(), so the addition of a blk_read() call in non-SPL causes the BLK cache to be pulled in.
Just moving code should not result in unexpected size change.
That's correct, so long as 'expected' size changes includes the normal function-call overhead and loss of intra-module optimisation.
So, what to do here? I certainly don't want to use the old blk API. But I can perhaps do a patch which changes VBE to use the new one, first, so the size growth is within that commit. I looked up BLOCK_CACHE and it is used by 90% of boards.
Another thing I could do (down the track) is a separate series to remove blk_d...() functions.
Further to this I've created a v2 series which splits the first commit into more pieces, so we can more clearly see the code-size increase. I'll hold off sending it for now, in case something else comes up in this discussion.
Regards, Simon

On Tue, Jan 14, 2025 at 05:58:04AM -0700, Simon Glass wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 18:22, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:56AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec, so start a new common file. Add functions for reading the version and nvdata. Also add a function to get the block device.
This is not a good commit message when you're also introducing functional changes (blk_dread -> blk_read).
Yes, I did not actually notice this size change at all, as mentioned, as I only built for a few boards (firefly rk3399/rk3288 and a few sandbox ones). The blk_dread() function is the old API for reading
You have much faster build machines available than I do, it would be good to run a world build before/after (it might take an hour on alexandra) before posting these big series.
from a block. It didn't occur to me that blk_dread() would bypass the cache.
Yes, how is that happening? That's what doesn't make sense.
Further, this functional change seems to introduce some other code being pulled in very unexpectedly, and that needs to be explained. For example, phycore_am62x_r5_usbdfu is another case and that has CONFIG_BLOCK_CACHE=y but CONFIG_SPL_BLOCK_CACHE=n, and yet the size growth is in full U-Boot.
This board currently uses blk_dread(), but not blk_read(), so the addition of a blk_read() call in non-SPL causes the BLK cache to be pulled in.
But, how? I mean, this sounds like some underlying bug that needs to be fixed. The platform sets CONFIG_BLK=y and CONFIG_BLOCK_CACHE=y and drivers/block/blk-uclass.c has: ulong blk_dread(struct blk_desc *desc, lbaint_t start, lbaint_t blkcnt, void *buffer) { return blk_read(desc->bdev, start, blkcnt, buffer); }
Or perhaps the answer / problem here is that I need to track down what's going wrong here instead of asking you to track this down.
Just moving code should not result in unexpected size change.
That's correct, so long as 'expected' size changes includes the normal function-call overhead and loss of intra-module optimisation.
Yes, but that's not what this series shows. There's (a) the above bug and (b) you might be allocating buffers twice now instead of once?
So, what to do here? I certainly don't want to use the old blk API.
Yes. I'm not 100% sure, especially after https://patchwork.ozlabs.org/project/uboot/list/?series=437816&state=* which I need to v2 what the non-SPL case is. And for new features, sure, saying SPL_BLK is required too is fine. Heck, it *may* even be the case that only TPL_BLK=n is required to be valid, all of that would require other investigation and removals.
But I can perhaps do a patch which changes VBE to use the new one, first, so the size growth is within that commit. I looked up BLOCK_CACHE and it is used by 90% of boards.
Yes, BLOCK_CACHE should be used by everyone at this point, it was one of those cases of "this grows the code but it's a universal speedup".

On Tue, Jan 14, 2025 at 10:58:01AM -0600, Tom Rini wrote:
On Tue, Jan 14, 2025 at 05:58:04AM -0700, Simon Glass wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 18:22, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:56AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec, so start a new common file. Add functions for reading the version and nvdata. Also add a function to get the block device.
This is not a good commit message when you're also introducing functional changes (blk_dread -> blk_read).
Yes, I did not actually notice this size change at all, as mentioned, as I only built for a few boards (firefly rk3399/rk3288 and a few sandbox ones). The blk_dread() function is the old API for reading
You have much faster build machines available than I do, it would be good to run a world build before/after (it might take an hour on alexandra) before posting these big series.
from a block. It didn't occur to me that blk_dread() would bypass the cache.
Yes, how is that happening? That's what doesn't make sense.
Further, this functional change seems to introduce some other code being pulled in very unexpectedly, and that needs to be explained. For example, phycore_am62x_r5_usbdfu is another case and that has CONFIG_BLOCK_CACHE=y but CONFIG_SPL_BLOCK_CACHE=n, and yet the size growth is in full U-Boot.
This board currently uses blk_dread(), but not blk_read(), so the addition of a blk_read() call in non-SPL causes the BLK cache to be pulled in.
But, how? I mean, this sounds like some underlying bug that needs to be fixed. The platform sets CONFIG_BLK=y and CONFIG_BLOCK_CACHE=y and drivers/block/blk-uclass.c has: ulong blk_dread(struct blk_desc *desc, lbaint_t start, lbaint_t blkcnt, void *buffer) { return blk_read(desc->bdev, start, blkcnt, buffer); }
Or perhaps the answer / problem here is that I need to track down what's going wrong here instead of asking you to track this down.
So, digging in to this, what's going on is that the platforms I've noted here (and a few others) don't actually enable any block devices. That's why, today, with VBE on still, they discard all of the code still. Your rework means that while there's still no block devices, we're now including the code. What to do? Both of: - These platforms *should* disable the assorted block device related library options they have enabled if / when possible (it might not be a neat and easy un-tangle). - VBE needs to handle the case where there's not a block device enabled and not fail the build. Part of this requires the Kconfig rework I linked to earlier (so that we can have BLK off) to be complete, but having the VBE code check for BLK being enabled either in Makefile or C code should be the same regardless.

Hi Tom,
On Tue, 14 Jan 2025 at 10:33, Tom Rini trini@konsulko.com wrote:
On Tue, Jan 14, 2025 at 10:58:01AM -0600, Tom Rini wrote:
On Tue, Jan 14, 2025 at 05:58:04AM -0700, Simon Glass wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 18:22, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:56AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec, so start a new common file. Add functions for reading the version and nvdata. Also add a function to get the block device.
This is not a good commit message when you're also introducing functional changes (blk_dread -> blk_read).
Yes, I did not actually notice this size change at all, as mentioned, as I only built for a few boards (firefly rk3399/rk3288 and a few sandbox ones). The blk_dread() function is the old API for reading
You have much faster build machines available than I do, it would be good to run a world build before/after (it might take an hour on alexandra) before posting these big series.
OK. I used to do this, but with CI sometimes grabbing the machines it ends up with a huge load and sometimes runs out of memory. I suppose I can stop the runner before doing it and remember to restart it afterwards. Or perhaps Alex has enough memory per thread? (96GB).
from a block. It didn't occur to me that blk_dread() would bypass the cache.
Yes, how is that happening? That's what doesn't make sense.
OK, I see. It is actually one level deeper than I thought.
PARTITIONS is not enabled, for example on phycore_am62x_r5_usbdfu which means that blk_get_dev() returns NULL
So long as all the code is in one module, blk_get_dev() returning NULL causes the calls to vbe_simple_read_state() and simple_read_nvdata() to be dropped, so their code is dropped too, so there is no blk_read()/blk_dread() call. It is optimised away. It just happens to be the only call in U-Boot
Further, this functional change seems to introduce some other code being pulled in very unexpectedly, and that needs to be explained. For example, phycore_am62x_r5_usbdfu is another case and that has CONFIG_BLOCK_CACHE=y but CONFIG_SPL_BLOCK_CACHE=n, and yet the size growth is in full U-Boot.
This board currently uses blk_dread(), but not blk_read(), so the addition of a blk_read() call in non-SPL causes the BLK cache to be pulled in.
But, how? I mean, this sounds like some underlying bug that needs to be fixed. The platform sets CONFIG_BLK=y and CONFIG_BLOCK_CACHE=y and drivers/block/blk-uclass.c has: ulong blk_dread(struct blk_desc *desc, lbaint_t start, lbaint_t blkcnt, void *buffer) { return blk_read(desc->bdev, start, blkcnt, buffer); }
Or perhaps the answer / problem here is that I need to track down what's going wrong here instead of asking you to track this down.
So, digging in to this, what's going on is that the platforms I've noted here (and a few others) don't actually enable any block devices. That's why, today, with VBE on still, they discard all of the code still. Your rework means that while there's still no block devices, we're now including the code. What to do? Both of:
- These platforms *should* disable the assorted block device related library options they have enabled if / when possible (it might not be a neat and easy un-tangle).
As above, the issue seems to be PARTITIONS
- VBE needs to handle the case where there's not a block device enabled and not fail the build. Part of this requires the Kconfig rework I linked to earlier (so that we can have BLK off) to be complete, but having the VBE code check for BLK being enabled either in Makefile or C code should be the same regardless.
I thought you NAK'd my patch to drop the 'depends on BLK' for bootstd?
From what I can tell EFI_LOADER doesn't do any block access unless
PARTITIONS is enabled. I don't want that for VBE because the VBE data is not actually in a partition (although perhaps it could be in future).
Anyway, from what I can tell, the code-size increase is explainable and perhaps we should just accept it.
Regards, Simon

On Tue, Jan 14, 2025 at 06:18:26PM -0700, Simon Glass wrote:
Hi Tom,
On Tue, 14 Jan 2025 at 10:33, Tom Rini trini@konsulko.com wrote:
On Tue, Jan 14, 2025 at 10:58:01AM -0600, Tom Rini wrote:
On Tue, Jan 14, 2025 at 05:58:04AM -0700, Simon Glass wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 18:22, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:56AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec, so start a new common file. Add functions for reading the version and nvdata. Also add a function to get the block device.
This is not a good commit message when you're also introducing functional changes (blk_dread -> blk_read).
Yes, I did not actually notice this size change at all, as mentioned, as I only built for a few boards (firefly rk3399/rk3288 and a few sandbox ones). The blk_dread() function is the old API for reading
You have much faster build machines available than I do, it would be good to run a world build before/after (it might take an hour on alexandra) before posting these big series.
OK. I used to do this, but with CI sometimes grabbing the machines it ends up with a huge load and sometimes runs out of memory. I suppose I can stop the runner before doing it and remember to restart it afterwards. Or perhaps Alex has enough memory per thread? (96GB).
Personally, I would make the script that does the world build for size test turn off the runner before starting and turn it back on when done otherwise build time will suffer greatly and you would have been better served doing it all on a slower class of machine.
from a block. It didn't occur to me that blk_dread() would bypass the cache.
Yes, how is that happening? That's what doesn't make sense.
OK, I see. It is actually one level deeper than I thought.
PARTITIONS is not enabled, for example on phycore_am62x_r5_usbdfu which means that blk_get_dev() returns NULL
So long as all the code is in one module, blk_get_dev() returning NULL causes the calls to vbe_simple_read_state() and simple_read_nvdata() to be dropped, so their code is dropped too, so there is no blk_read()/blk_dread() call. It is optimised away. It just happens to be the only call in U-Boot
Ah, that is all also true, thanks.
Further, this functional change seems to introduce some other code being pulled in very unexpectedly, and that needs to be explained. For example, phycore_am62x_r5_usbdfu is another case and that has CONFIG_BLOCK_CACHE=y but CONFIG_SPL_BLOCK_CACHE=n, and yet the size growth is in full U-Boot.
This board currently uses blk_dread(), but not blk_read(), so the addition of a blk_read() call in non-SPL causes the BLK cache to be pulled in.
But, how? I mean, this sounds like some underlying bug that needs to be fixed. The platform sets CONFIG_BLK=y and CONFIG_BLOCK_CACHE=y and drivers/block/blk-uclass.c has: ulong blk_dread(struct blk_desc *desc, lbaint_t start, lbaint_t blkcnt, void *buffer) { return blk_read(desc->bdev, start, blkcnt, buffer); }
Or perhaps the answer / problem here is that I need to track down what's going wrong here instead of asking you to track this down.
So, digging in to this, what's going on is that the platforms I've noted here (and a few others) don't actually enable any block devices. That's why, today, with VBE on still, they discard all of the code still. Your rework means that while there's still no block devices, we're now including the code. What to do? Both of:
- These platforms *should* disable the assorted block device related library options they have enabled if / when possible (it might not be a neat and easy un-tangle).
As above, the issue seems to be PARTITIONS
Yes, I think we both agree there's a number of useless functionality enabled on these handful of platforms. PARTITIONS explains how it gets optimized away and all of the useless block-related things like MMC core should also be dropped.
- VBE needs to handle the case where there's not a block device enabled and not fail the build. Part of this requires the Kconfig rework I linked to earlier (so that we can have BLK off) to be complete, but having the VBE code check for BLK being enabled either in Makefile or C code should be the same regardless.
I thought you NAK'd my patch to drop the 'depends on BLK' for bootstd?
I know I NAK'd https://patchwork.ozlabs.org/project/uboot/patch/20241113150938.1534931-2-sj... as that should be instead done with https://patchwork.ozlabs.org/project/uboot/list/?series=440336 instead.
With that series then yes, you should be able to have BOOTSTD not depend on BLK. But in a practical sense I worry that your changes in this series will break with BLK=n. My point above is to please make sure it still works in that case.
From what I can tell EFI_LOADER doesn't do any block access unless PARTITIONS is enabled. I don't want that for VBE because the VBE data is not actually in a partition (although perhaps it could be in future).
In a practical sense EFI_LOADER depends on BLK and I've fixed the dependency with my series and talked with Ilias and Heinrich about it. Making EFI_LOADER buildable without BLK is a mechanical effort, but making it usefully functional still might require a bit more work and they'll cross that bridge when they come to it.
Anyway, from what I can tell, the code-size increase is explainable and perhaps we should just accept it.
Other than the double buffer one, yes, I suppose it's not unreasonable to ask me to fixup the oddball platforms with extraneous PARTITIONS and other block type things enabled. Still need to get back to the buffer one as 200 bytes isn't normal for re-organizing things such that the compiler can't decide to inline something.

Hi Tom,
On Tue, 14 Jan 2025 at 18:41, Tom Rini trini@konsulko.com wrote:
On Tue, Jan 14, 2025 at 06:18:26PM -0700, Simon Glass wrote:
Hi Tom,
On Tue, 14 Jan 2025 at 10:33, Tom Rini trini@konsulko.com wrote:
On Tue, Jan 14, 2025 at 10:58:01AM -0600, Tom Rini wrote:
On Tue, Jan 14, 2025 at 05:58:04AM -0700, Simon Glass wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 18:22, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:56AM -0700, Simon Glass wrote:
> Loading a FIT is useful for other VBE methods, such as ABrec,
so start
> a new common file. Add functions for reading the version and
nvdata.
> Also add a function to get the block device.
This is not a good commit message when you're also introducing functional changes (blk_dread -> blk_read).
Yes, I did not actually notice this size change at all, as
mentioned,
as I only built for a few boards (firefly rk3399/rk3288 and a few sandbox ones). The blk_dread() function is the old API for reading
You have much faster build machines available than I do, it would be good to run a world build before/after (it might take an hour on alexandra) before posting these big series.
OK. I used to do this, but with CI sometimes grabbing the machines it ends up with a huge load and sometimes runs out of memory. I suppose I can stop the runner before doing it and remember to restart it afterwards. Or perhaps Alex has enough memory per thread? (96GB).
Personally, I would make the script that does the world build for size test turn off the runner before starting and turn it back on when done otherwise build time will suffer greatly and you would have been better served doing it all on a slower class of machine.
from a block. It didn't occur to me that blk_dread() would bypass
the
cache.
Yes, how is that happening? That's what doesn't make sense.
OK, I see. It is actually one level deeper than I thought.
PARTITIONS is not enabled, for example on phycore_am62x_r5_usbdfu which means that blk_get_dev() returns NULL
So long as all the code is in one module, blk_get_dev() returning NULL causes the calls to vbe_simple_read_state() and simple_read_nvdata() to be dropped, so their code is dropped too, so there is no blk_read()/blk_dread() call. It is optimised away. It just happens to be the only call in U-Boot
Ah, that is all also true, thanks.
Further, this functional change seems to introduce some other code being pulled in very unexpectedly, and that needs to be explained. For example, phycore_am62x_r5_usbdfu is another case and that has CONFIG_BLOCK_CACHE=y but CONFIG_SPL_BLOCK_CACHE=n, and yet the
size
growth is in full U-Boot.
This board currently uses blk_dread(), but not blk_read(), so the addition of a blk_read() call in non-SPL causes the BLK cache to
be
pulled in.
But, how? I mean, this sounds like some underlying bug that needs
to be
fixed. The platform sets CONFIG_BLK=y and CONFIG_BLOCK_CACHE=y and drivers/block/blk-uclass.c has: ulong blk_dread(struct blk_desc *desc, lbaint_t start, lbaint_t
blkcnt,
void *buffer)
{ return blk_read(desc->bdev, start, blkcnt, buffer); }
Or perhaps the answer / problem here is that I need to track down
what's
going wrong here instead of asking you to track this down.
So, digging in to this, what's going on is that the platforms I've
noted
here (and a few others) don't actually enable any block devices.
That's
why, today, with VBE on still, they discard all of the code still.
Your
rework means that while there's still no block devices, we're now including the code. What to do? Both of:
- These platforms *should* disable the assorted block device related library options they have enabled if / when possible (it might not
be
a neat and easy un-tangle).
As above, the issue seems to be PARTITIONS
Yes, I think we both agree there's a number of useless functionality enabled on these handful of platforms. PARTITIONS explains how it gets optimized away and all of the useless block-related things like MMC core should also be dropped.
- VBE needs to handle the case where there's not a block device
enabled
and not fail the build. Part of this requires the Kconfig rework I linked to earlier (so that we can have BLK off) to be complete, but having the VBE code check for BLK being enabled either in Makefile
or
C code should be the same regardless.
I thought you NAK'd my patch to drop the 'depends on BLK' for bootstd?
I know I NAK'd
https://patchwork.ozlabs.org/project/uboot/patch/20241113150938.1534931-2-sj...
as that should be instead done with https://patchwork.ozlabs.org/project/uboot/list/?series=440336 instead.
With that series then yes, you should be able to have BOOTSTD not depend on BLK. But in a practical sense I worry that your changes in this series will break with BLK=n. My point above is to please make sure it still works in that case.
From what I can tell EFI_LOADER doesn't do any block access unless PARTITIONS is enabled. I don't want that for VBE because the VBE data is not actually in a partition (although perhaps it could be in future).
In a practical sense EFI_LOADER depends on BLK and I've fixed the dependency with my series and talked with Ilias and Heinrich about it. Making EFI_LOADER buildable without BLK is a mechanical effort, but making it usefully functional still might require a bit more work and they'll cross that bridge when they come to it.
Anyway, from what I can tell, the code-size increase is explainable and perhaps we should just accept it.
Other than the double buffer one, yes, I suppose it's not unreasonable to ask me to fixup the oddball platforms with extraneous PARTITIONS and other block type things enabled. Still need to get back to the buffer one as 200 bytes isn't normal for re-organizing things such that the compiler can't decide to inline something.
What is the double buffer one?
I have a v2 of this series which splits the first patch up a bit more. But the end result is the same.
Regards, Simon

Hi Tom,
On Tue, 14 Jan 2025 at 19:48, Simon Glass sjg@chromium.org wrote:
Hi Tom,
On Tue, 14 Jan 2025 at 18:41, Tom Rini trini@konsulko.com wrote:
On Tue, Jan 14, 2025 at 06:18:26PM -0700, Simon Glass wrote:
Hi Tom,
On Tue, 14 Jan 2025 at 10:33, Tom Rini trini@konsulko.com wrote:
On Tue, Jan 14, 2025 at 10:58:01AM -0600, Tom Rini wrote:
On Tue, Jan 14, 2025 at 05:58:04AM -0700, Simon Glass wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 18:22, Tom Rini trini@konsulko.com wrote: > > On Thu, Jan 09, 2025 at 05:29:56AM -0700, Simon Glass wrote: > > > Loading a FIT is useful for other VBE methods, such as ABrec, so start > > a new common file. Add functions for reading the version and nvdata. > > Also add a function to get the block device. > > This is not a good commit message when you're also introducing > functional changes (blk_dread -> blk_read).
Yes, I did not actually notice this size change at all, as mentioned, as I only built for a few boards (firefly rk3399/rk3288 and a few sandbox ones). The blk_dread() function is the old API for reading
You have much faster build machines available than I do, it would be good to run a world build before/after (it might take an hour on alexandra) before posting these big series.
OK. I used to do this, but with CI sometimes grabbing the machines it ends up with a huge load and sometimes runs out of memory. I suppose I can stop the runner before doing it and remember to restart it afterwards. Or perhaps Alex has enough memory per thread? (96GB).
Personally, I would make the script that does the world build for size test turn off the runner before starting and turn it back on when done otherwise build time will suffer greatly and you would have been better served doing it all on a slower class of machine.
from a block. It didn't occur to me that blk_dread() would bypass the cache.
Yes, how is that happening? That's what doesn't make sense.
OK, I see. It is actually one level deeper than I thought.
PARTITIONS is not enabled, for example on phycore_am62x_r5_usbdfu which means that blk_get_dev() returns NULL
So long as all the code is in one module, blk_get_dev() returning NULL causes the calls to vbe_simple_read_state() and simple_read_nvdata() to be dropped, so their code is dropped too, so there is no blk_read()/blk_dread() call. It is optimised away. It just happens to be the only call in U-Boot
Ah, that is all also true, thanks.
> Further, this functional > change seems to introduce some other code being pulled in very > unexpectedly, and that needs to be explained. For example, > phycore_am62x_r5_usbdfu is another case and that has > CONFIG_BLOCK_CACHE=y but CONFIG_SPL_BLOCK_CACHE=n, and yet the size > growth is in full U-Boot.
This board currently uses blk_dread(), but not blk_read(), so the addition of a blk_read() call in non-SPL causes the BLK cache to be pulled in.
But, how? I mean, this sounds like some underlying bug that needs to be fixed. The platform sets CONFIG_BLK=y and CONFIG_BLOCK_CACHE=y and drivers/block/blk-uclass.c has: ulong blk_dread(struct blk_desc *desc, lbaint_t start, lbaint_t blkcnt, void *buffer) { return blk_read(desc->bdev, start, blkcnt, buffer); }
Or perhaps the answer / problem here is that I need to track down what's going wrong here instead of asking you to track this down.
So, digging in to this, what's going on is that the platforms I've noted here (and a few others) don't actually enable any block devices. That's why, today, with VBE on still, they discard all of the code still. Your rework means that while there's still no block devices, we're now including the code. What to do? Both of:
- These platforms *should* disable the assorted block device related library options they have enabled if / when possible (it might not be a neat and easy un-tangle).
As above, the issue seems to be PARTITIONS
Yes, I think we both agree there's a number of useless functionality enabled on these handful of platforms. PARTITIONS explains how it gets optimized away and all of the useless block-related things like MMC core should also be dropped.
- VBE needs to handle the case where there's not a block device enabled and not fail the build. Part of this requires the Kconfig rework I linked to earlier (so that we can have BLK off) to be complete, but having the VBE code check for BLK being enabled either in Makefile or C code should be the same regardless.
I thought you NAK'd my patch to drop the 'depends on BLK' for bootstd?
I know I NAK'd https://patchwork.ozlabs.org/project/uboot/patch/20241113150938.1534931-2-sj... as that should be instead done with https://patchwork.ozlabs.org/project/uboot/list/?series=440336 instead.
With that series then yes, you should be able to have BOOTSTD not depend on BLK. But in a practical sense I worry that your changes in this series will break with BLK=n. My point above is to please make sure it still works in that case.
From what I can tell EFI_LOADER doesn't do any block access unless PARTITIONS is enabled. I don't want that for VBE because the VBE data is not actually in a partition (although perhaps it could be in future).
In a practical sense EFI_LOADER depends on BLK and I've fixed the dependency with my series and talked with Ilias and Heinrich about it. Making EFI_LOADER buildable without BLK is a mechanical effort, but making it usefully functional still might require a bit more work and they'll cross that bridge when they come to it.
Anyway, from what I can tell, the code-size increase is explainable and perhaps we should just accept it.
Other than the double buffer one, yes, I suppose it's not unreasonable to ask me to fixup the oddball platforms with extraneous PARTITIONS and other block type things enabled. Still need to get back to the buffer one as 200 bytes isn't normal for re-organizing things such that the compiler can't decide to inline something.
What is the double buffer one?
I have a v2 of this series which splits the first patch up a bit more. But the end result is the same.
I can send that if you like, but I just want to make sure I haven't missed anything from this thread.
Regards, SImon

Loading a FIT is useful for other VBE methods, such as ABrec. Create a new function to handling reading it.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 103 +++++++++++++++++++++++++++++++++++++++++++ boot/vbe_common.h | 28 ++++++++++++ boot/vbe_simple_fw.c | 94 ++------------------------------------- 3 files changed, 135 insertions(+), 90 deletions(-)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index 5f31cde0288..c009337ef3f 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -18,6 +18,109 @@ #include <u-boot/crc.h> #include "vbe_common.h"
+int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, + ulong *load_addrp, ulong *lenp, char **namep) +{ + ALLOC_CACHE_ALIGN_BUFFER(u8, sbuf, MMC_MAX_BLOCK_LEN); + ulong size, blknum, addr, len, load_addr, num_blks; + const char *fit_uname, *fit_uname_config; + struct bootm_headers images = {}; + enum image_phase_t phase; + struct blk_desc *desc; + int node, ret; + void *buf; + + desc = dev_get_uclass_plat(blk); + + /* read in one block to find the FIT size */ + blknum = area_offset / desc->blksz; + log_debug("read at %lx, blknum %lx\n", area_offset, blknum); + ret = blk_read(blk, blknum, 1, sbuf); + if (ret < 0) + return log_msg_ret("rd", ret); + + ret = fdt_check_header(sbuf); + if (ret < 0) + return log_msg_ret("fdt", -EINVAL); + size = fdt_totalsize(sbuf); + if (size > area_size) + return log_msg_ret("fdt", -E2BIG); + log_debug("FIT size %lx\n", size); + + /* + * Load the FIT into the SPL memory. This is typically a FIT with + * external data, so this is quite small, perhaps a few KB. + */ + addr = CONFIG_VAL(TEXT_BASE); + buf = map_sysmem(addr, size); + num_blks = DIV_ROUND_UP(size, desc->blksz); + log_debug("read %lx, %lx blocks to %lx / %p\n", size, num_blks, addr, + buf); + ret = blk_read(blk, blknum, num_blks, buf); + if (ret < 0) + return log_msg_ret("rd", ret); + + /* figure out the phase to load */ + phase = IS_ENABLED(CONFIG_VPL_BUILD) ? IH_PHASE_SPL : IH_PHASE_U_BOOT; + + /* + * Load the image from the FIT. We ignore any load-address information + * so in practice this simply locates the image in the external-data + * region and returns its address and size. Since we only loaded the FIT + * itself, only a part of the image will be present, at best. + */ + fit_uname = NULL; + fit_uname_config = NULL; + log_debug("loading FIT\n"); + ret = fit_image_load(&images, addr, &fit_uname, &fit_uname_config, + IH_ARCH_SANDBOX, image_ph(phase, IH_TYPE_FIRMWARE), + BOOTSTAGE_ID_FIT_SPL_START, FIT_LOAD_IGNORED, + &load_addr, &len); + if (ret < 0) + return log_msg_ret("ld", ret); + node = ret; + log_debug("loaded to %lx\n", load_addr); + + /* For FIT external data, read in the external data */ + if (load_addr + len > addr + size) { + ulong base, full_size; + void *base_buf; + + /* Find the start address to load from */ + base = ALIGN_DOWN(load_addr, desc->blksz); + + /* + * Get the total number of bytes to load, taking care of + * block alignment + */ + full_size = load_addr + len - base; + + /* + * Get the start block number, number of blocks and the address + * to load to, then load the blocks + */ + blknum = (area_offset + base - addr) / desc->blksz; + num_blks = DIV_ROUND_UP(full_size, desc->blksz); + base_buf = map_sysmem(base, full_size); + ret = blk_read(blk, blknum, num_blks, base_buf); + log_debug("read %lx %lx, %lx blocks to %lx / %p: ret=%d\n", + blknum, full_size, num_blks, base, base_buf, ret); + if (ret < 0) + return log_msg_ret("rd", ret); + } + if (load_addrp) + *load_addrp = load_addr; + if (lenp) + *lenp = len; + if (namep) { + *namep = strdup(fdt_get_name(buf, node, NULL)); + if (!namep) + return log_msg_ret("nam", -ENOMEM); + } + + return 0; +} + int vbe_read_version(struct udevice *blk, ulong offset, char *version, int max_size) { diff --git a/boot/vbe_common.h b/boot/vbe_common.h index df133fbf05f..403391ff1ea 100644 --- a/boot/vbe_common.h +++ b/boot/vbe_common.h @@ -140,4 +140,32 @@ int vbe_read_version(struct udevice *blk, ulong offset, char *version, */ int vbe_get_blk(const char *storage, struct udevice **blkp);
+/** + * vbe_read_fit() - Read an image from a FIT + * + * This handles most of the VBE logic for reading from a FIT. It reads the FIT + * metadata, decides which image to load and loads it to a suitable address, + * ready for jumping to the next phase of VBE. + * + * This supports transition from VPL to SPL as well as SPL to U-Boot proper. For + * now, TPL->VPL is not supported. + * + * Both embedded and external data are supported for the FIT + * + * @blk: Block device containing FIT + * @area_offset: Byte offset of the VBE area in @blk containing the FIT + * @area_size: Size of the VBE area + * @load_addrp: If non-null, returns the address where the image was loaded + * @lenp: If non-null, returns the size of the image loaded, in bytes + * @namep: If non-null, returns the name of the FIT-image node that was loaded + * (allocated by this function) + * Return: 0 if OK, -EINVAL if the area does not contain an FDT (the underlying + * format for FIT), -E2BIG if the FIT extends past @area_size, -ENOMEM if there + * was not space to allocate the image-node name, other error if a read error + * occurred (see blk_read()), or something went wrong with the actually + * FIT-parsing (see fit_image_load()). + */ +int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, + ulong *load_addrp, ulong *lenp, char **namep); + #endif /* __VBE_ABREC_H */ diff --git a/boot/vbe_simple_fw.c b/boot/vbe_simple_fw.c index da9701f9eb9..0bf25ccad23 100644 --- a/boot/vbe_simple_fw.c +++ b/boot/vbe_simple_fw.c @@ -38,109 +38,23 @@ */ int vbe_simple_read_bootflow_fw(struct udevice *dev, struct bootflow *bflow) { - ALLOC_CACHE_ALIGN_BUFFER(u8, sbuf, MMC_MAX_BLOCK_LEN); struct udevice *media = dev_get_parent(bflow->dev); struct udevice *meth = bflow->method; struct simple_priv *priv = dev_get_priv(meth); - const char *fit_uname, *fit_uname_config; - struct bootm_headers images = {}; - ulong offset, size, blknum, addr, len, load_addr, num_blks; - enum image_phase_t phase; - struct blk_desc *desc; + ulong len, load_addr; struct udevice *blk; - int node, ret; - void *buf; + int ret;
log_debug("media=%s\n", media->name); ret = blk_get_from_parent(media, &blk); if (ret) return log_msg_ret("med", ret); log_debug("blk=%s\n", blk->name); - desc = dev_get_uclass_plat(blk); - - offset = priv->area_start + priv->skip_offset; - - /* read in one block to find the FIT size */ - blknum = offset / desc->blksz; - log_debug("read at %lx, blknum %lx\n", offset, blknum); - ret = blk_read(blk, blknum, 1, sbuf); - if (ret < 0) - return log_msg_ret("rd", ret); - - ret = fdt_check_header(sbuf); - if (ret < 0) - return log_msg_ret("fdt", -EINVAL); - size = fdt_totalsize(sbuf); - if (size > priv->area_size) - return log_msg_ret("fdt", -E2BIG); - log_debug("FIT size %lx\n", size); - - /* - * Load the FIT into the SPL memory. This is typically a FIT with - * external data, so this is quite small, perhaps a few KB. - */ - addr = CONFIG_VAL(TEXT_BASE); - buf = map_sysmem(addr, size); - num_blks = DIV_ROUND_UP(size, desc->blksz); - log_debug("read %lx, %lx blocks to %lx / %p\n", size, num_blks, addr, - buf); - ret = blk_read(blk, blknum, num_blks, buf); - if (ret < 0) - return log_msg_ret("rd", ret);
- /* figure out the phase to load */ - phase = IS_ENABLED(CONFIG_VPL_BUILD) ? IH_PHASE_SPL : IH_PHASE_U_BOOT; - - /* - * Load the image from the FIT. We ignore any load-address information - * so in practice this simply locates the image in the external-data - * region and returns its address and size. Since we only loaded the FIT - * itself, only a part of the image will be present, at best. - */ - fit_uname = NULL; - fit_uname_config = NULL; - log_debug("loading FIT\n"); - ret = fit_image_load(&images, addr, &fit_uname, &fit_uname_config, - IH_ARCH_SANDBOX, image_ph(phase, IH_TYPE_FIRMWARE), - BOOTSTAGE_ID_FIT_SPL_START, FIT_LOAD_IGNORED, - &load_addr, &len); - if (ret < 0) - return log_msg_ret("ld", ret); - node = ret; - log_debug("loaded to %lx\n", load_addr); - - /* For FIT external data, read in the external data */ - if (load_addr + len > addr + size) { - ulong base, full_size; - void *base_buf; - - /* Find the start address to load from */ - base = ALIGN_DOWN(load_addr, desc->blksz); - - /* - * Get the total number of bytes to load, taking care of - * block alignment - */ - full_size = load_addr + len - base; - - /* - * Get the start block number, number of blocks and the address - * to load to, then load the blocks - */ - blknum = (offset + base - addr) / desc->blksz; - num_blks = DIV_ROUND_UP(full_size, desc->blksz); - base_buf = map_sysmem(base, full_size); - ret = blk_read(blk, blknum, num_blks, base_buf); - log_debug("read %lx %lx, %lx blocks to %lx / %p: ret=%d\n", - blknum, full_size, num_blks, base, base_buf, ret); - if (ret < 0) - return log_msg_ret("rd", ret); - } + ret = vbe_read_fit(blk, priv->area_start + priv->skip_offset, + priv->area_size, &load_addr, &len, &bflow->name);
/* set up the bootflow with the info we obtained */ - bflow->name = strdup(fdt_get_name(buf, node, NULL)); - if (!bflow->name) - return log_msg_ret("name", -ENOMEM); bflow->blk = blk; bflow->buf = map_sysmem(load_addr, len); bflow->size = len;

On Thu, Jan 09, 2025 at 05:29:57AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec. Create a new function to handling reading it.
Signed-off-by: Simon Glass sjg@chromium.org
This causes a bunch of growth: a3y17lte : all +1328 text +1328 u-boot: add: 8/0, grow: 1/0 bytes: 1328/0 (1328) function old new delta blkcache_fill - 332 +332 blkcache_read - 240 +240 blk_read - 188 +188 vbe_read_nvdata - 156 +156 vbe_read_version - 140 +140 vbe_get_blk - 100 +100 simple_read_nvdata - 96 +96 crc8 - 72 +72 vbe_simple_read_state 108 112 +4
Which is unexpected for just moving code around that's not newly used. And even when it's just a move it's still growing: xilinx_zynqmp_virt: all +128 bss -72 text +200 u-boot: add: 4/0, grow: 0/-1 bytes: 540/-340 (200) function old new delta vbe_read_nvdata - 156 +156 vbe_get_blk - 148 +148 vbe_read_version - 140 +140 simple_read_nvdata - 96 +96 vbe_simple_read_state 452 112 -340

Hi Tom,
On Sat, 11 Jan 2025 at 15:54, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:57AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec. Create a new function to handling reading it.
Signed-off-by: Simon Glass sjg@chromium.org
This causes a bunch of growth: a3y17lte : all +1328 text +1328 u-boot: add: 8/0, grow: 1/0 bytes: 1328/0 (1328) function old new delta blkcache_fill - 332 +332 blkcache_read - 240 +240 blk_read - 188 +188 vbe_read_nvdata - 156 +156 vbe_read_version - 140 +140 vbe_get_blk - 100 +100 simple_read_nvdata - 96 +96 crc8 - 72 +72 vbe_simple_read_state 108 112 +4
Which is unexpected for just moving code around that's not newly used.
I hadn't noticed that on the boards I was trying, so thank you for spotting it.
This is because it now uses blk_read() instead of blk_dread(), so if BLOCK_CACHE is enabled, it will use the block cache. We could disable BLOCK_CACHE on those boards perhaps? It is a speed optimisation so shouldn't be used by boards which care about code size.
And even when it's just a move it's still growing: xilinx_zynqmp_virt: all +128 bss -72 text +200 u-boot: add: 4/0, grow: 0/-1 bytes: 540/-340 (200) function old new delta vbe_read_nvdata - 156 +156 vbe_get_blk - 148 +148 vbe_read_version - 140 +140 simple_read_nvdata - 96 +96 vbe_simple_read_state 452 112 -340
Unfortunately this one is hard to fix. As you know, whenever you take code from a single module and put it into another, the compiler cannot optimise away the function-call overhead. I'll note that there is no increase when LTO is used, e.g. with xilinx_versal_net_mini_qspi
So let me know what you think.
Regards, Simon

On Mon, Jan 13, 2025 at 01:03:52PM -0700, Simon Glass wrote:
Hi Tom,
On Sat, 11 Jan 2025 at 15:54, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:57AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec. Create a new function to handling reading it.
Signed-off-by: Simon Glass sjg@chromium.org
This causes a bunch of growth: a3y17lte : all +1328 text +1328 u-boot: add: 8/0, grow: 1/0 bytes: 1328/0 (1328) function old new delta blkcache_fill - 332 +332 blkcache_read - 240 +240 blk_read - 188 +188 vbe_read_nvdata - 156 +156 vbe_read_version - 140 +140 vbe_get_blk - 100 +100 simple_read_nvdata - 96 +96 crc8 - 72 +72 vbe_simple_read_state 108 112 +4
Which is unexpected for just moving code around that's not newly used.
I hadn't noticed that on the boards I was trying, so thank you for spotting it.
This is because it now uses blk_read() instead of blk_dread(), so if
That's not what this patch does? There's no caller before or after in this patch of "blk_dread". Just moving functions around should not increase size on platforms that weren't using the existing functionality. Why is vbe_simple_read_state changing at all here, when it's not being touched?
BLOCK_CACHE is enabled, it will use the block cache. We could disable BLOCK_CACHE on those boards perhaps? It is a speed optimisation so shouldn't be used by boards which care about code size.
And even when it's just a move it's still growing: xilinx_zynqmp_virt: all +128 bss -72 text +200 u-boot: add: 4/0, grow: 0/-1 bytes: 540/-340 (200) function old new delta vbe_read_nvdata - 156 +156 vbe_get_blk - 148 +148 vbe_read_version - 140 +140 simple_read_nvdata - 96 +96 vbe_simple_read_state 452 112 -340
Unfortunately this one is hard to fix. As you know, whenever you take code from a single module and put it into another, the compiler cannot optimise away the function-call overhead. I'll note that there is no increase when LTO is used, e.g. with xilinx_versal_net_mini_qspi
So let me know what you think.
You likely need to re-think your refactor a bit then. If it's in part G or H that we have more than one caller of any of these functions, that's perhaps where it's time to refactor and expose them?

Hi Tom,
On Mon, 13 Jan 2025 at 13:44, Tom Rini trini@konsulko.com wrote:
On Mon, Jan 13, 2025 at 01:03:52PM -0700, Simon Glass wrote:
Hi Tom,
On Sat, 11 Jan 2025 at 15:54, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:57AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec. Create a new function to handling reading it.
Signed-off-by: Simon Glass sjg@chromium.org
This causes a bunch of growth: a3y17lte : all +1328 text +1328 u-boot: add: 8/0, grow: 1/0 bytes: 1328/0 (1328) function old new delta blkcache_fill - 332 +332 blkcache_read - 240 +240 blk_read - 188 +188 vbe_read_nvdata - 156 +156 vbe_read_version - 140 +140 vbe_get_blk - 100 +100 simple_read_nvdata - 96 +96 crc8 - 72 +72 vbe_simple_read_state 108 112 +4
Which is unexpected for just moving code around that's not newly used.
I hadn't noticed that on the boards I was trying, so thank you for spotting it.
This is because it now uses blk_read() instead of blk_dread(), so if
That's not what this patch does? There's no caller before or after in this patch of "blk_dread". Just moving functions around should not increase size on platforms that weren't using the existing functionality.
Firstly, are we looking at the same patch? Here is the one I am looking at:
https://patchwork.ozlabs.org/project/uboot/patch/20250109123010.4005298-2-sj...
It has blk_dread() in the old code and blk_read() in the new.
Why is vbe_simple_read_state changing at all here, when it's not being touched?
Again, I'm not sure we are looking at the same code. In my version, it does get changed, since it now calls vbe_get_blk() rather than finding the block device itself. This is to avoid duplicating that code in simple and abrec.
BLOCK_CACHE is enabled, it will use the block cache. We could disable BLOCK_CACHE on those boards perhaps? It is a speed optimisation so shouldn't be used by boards which care about code size.
And even when it's just a move it's still growing: xilinx_zynqmp_virt: all +128 bss -72 text +200 u-boot: add: 4/0, grow: 0/-1 bytes: 540/-340 (200) function old new delta vbe_read_nvdata - 156 +156 vbe_get_blk - 148 +148 vbe_read_version - 140 +140 simple_read_nvdata - 96 +96 vbe_simple_read_state 452 112 -340
Unfortunately this one is hard to fix. As you know, whenever you take code from a single module and put it into another, the compiler cannot optimise away the function-call overhead. I'll note that there is no increase when LTO is used, e.g. with xilinx_versal_net_mini_qspi
So let me know what you think.
You likely need to re-think your refactor a bit then. If it's in part G or H that we have more than one caller of any of these functions, that's perhaps where it's time to refactor and expose them?
Yes of course I can move things around. I would need to drop the FIT loader as well, so this series would become quite small. Shall I do that for the next version?
But just so that I understand...in the next series, when abrec is added, and I have these same patches, will you accept this size increase?
Regards, Simon

On Mon, Jan 13, 2025 at 05:13:46PM -0700, Simon Glass wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 13:44, Tom Rini trini@konsulko.com wrote:
On Mon, Jan 13, 2025 at 01:03:52PM -0700, Simon Glass wrote:
Hi Tom,
On Sat, 11 Jan 2025 at 15:54, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:57AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec. Create a new function to handling reading it.
Signed-off-by: Simon Glass sjg@chromium.org
This causes a bunch of growth: a3y17lte : all +1328 text +1328 u-boot: add: 8/0, grow: 1/0 bytes: 1328/0 (1328) function old new delta blkcache_fill - 332 +332 blkcache_read - 240 +240 blk_read - 188 +188 vbe_read_nvdata - 156 +156 vbe_read_version - 140 +140 vbe_get_blk - 100 +100 simple_read_nvdata - 96 +96 crc8 - 72 +72 vbe_simple_read_state 108 112 +4
Which is unexpected for just moving code around that's not newly used.
I hadn't noticed that on the boards I was trying, so thank you for spotting it.
This is because it now uses blk_read() instead of blk_dread(), so if
That's not what this patch does? There's no caller before or after in this patch of "blk_dread". Just moving functions around should not increase size on platforms that weren't using the existing functionality.
Firstly, are we looking at the same patch? Here is the one I am looking at:
https://patchwork.ozlabs.org/project/uboot/patch/20250109123010.4005298-2-sj...
You're right, I replied to the wrong patch here, sorry for the confusion. I'll move some of my comments in reply to the correct patch now.
[snip]
And even when it's just a move it's still growing: xilinx_zynqmp_virt: all +128 bss -72 text +200 u-boot: add: 4/0, grow: 0/-1 bytes: 540/-340 (200) function old new delta vbe_read_nvdata - 156 +156 vbe_get_blk - 148 +148 vbe_read_version - 140 +140 simple_read_nvdata - 96 +96 vbe_simple_read_state 452 112 -340
Unfortunately this one is hard to fix. As you know, whenever you take code from a single module and put it into another, the compiler cannot optimise away the function-call overhead. I'll note that there is no increase when LTO is used, e.g. with xilinx_versal_net_mini_qspi
Yes, but 200 bytes isn't just function call overhead. Some of that might be from going from one ALLOC_CACHE_ALIGN_BUFFER(u8, buf, MMC_MAX_BLOCK_LEN) to two?
So let me know what you think.
You likely need to re-think your refactor a bit then. If it's in part G or H that we have more than one caller of any of these functions, that's perhaps where it's time to refactor and expose them?
Yes of course I can move things around. I would need to drop the FIT loader as well, so this series would become quite small. Shall I do that for the next version?
But just so that I understand...in the next series, when abrec is added, and I have these same patches, will you accept this size increase?
It sounds like you're talking about re-ordering patches and I'm asking you to re-work your re-work of the code so that it doesn't grow things without explanation, and minimizes growth. Maybe that's code changes, maybe that's better commit messages, likely it's a combination of both.

On Mon, Jan 13, 2025 at 07:22:19PM -0600, Tom Rini wrote:
On Mon, Jan 13, 2025 at 05:13:46PM -0700, Simon Glass wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 13:44, Tom Rini trini@konsulko.com wrote:
On Mon, Jan 13, 2025 at 01:03:52PM -0700, Simon Glass wrote:
Hi Tom,
On Sat, 11 Jan 2025 at 15:54, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:57AM -0700, Simon Glass wrote:
Loading a FIT is useful for other VBE methods, such as ABrec. Create a new function to handling reading it.
Signed-off-by: Simon Glass sjg@chromium.org
This causes a bunch of growth: a3y17lte : all +1328 text +1328 u-boot: add: 8/0, grow: 1/0 bytes: 1328/0 (1328) function old new delta blkcache_fill - 332 +332 blkcache_read - 240 +240 blk_read - 188 +188 vbe_read_nvdata - 156 +156 vbe_read_version - 140 +140 vbe_get_blk - 100 +100 simple_read_nvdata - 96 +96 crc8 - 72 +72 vbe_simple_read_state 108 112 +4
Which is unexpected for just moving code around that's not newly used.
I hadn't noticed that on the boards I was trying, so thank you for spotting it.
This is because it now uses blk_read() instead of blk_dread(), so if
That's not what this patch does? There's no caller before or after in this patch of "blk_dread". Just moving functions around should not increase size on platforms that weren't using the existing functionality.
Firstly, are we looking at the same patch? Here is the one I am looking at:
https://patchwork.ozlabs.org/project/uboot/patch/20250109123010.4005298-2-sj...
You're right, I replied to the wrong patch here, sorry for the confusion. I'll move some of my comments in reply to the correct patch now.
[snip]
And even when it's just a move it's still growing: xilinx_zynqmp_virt: all +128 bss -72 text +200 u-boot: add: 4/0, grow: 0/-1 bytes: 540/-340 (200) function old new delta vbe_read_nvdata - 156 +156 vbe_get_blk - 148 +148 vbe_read_version - 140 +140 simple_read_nvdata - 96 +96 vbe_simple_read_state 452 112 -340
Unfortunately this one is hard to fix. As you know, whenever you take code from a single module and put it into another, the compiler cannot optimise away the function-call overhead. I'll note that there is no increase when LTO is used, e.g. with xilinx_versal_net_mini_qspi
Yes, but 200 bytes isn't just function call overhead. Some of that might be from going from one ALLOC_CACHE_ALIGN_BUFFER(u8, buf, MMC_MAX_BLOCK_LEN) to two?
This is the double buffer I was referring to.

Hi Tom,
On Wed, 15 Jan 2025 at 07:43, Tom Rini trini@konsulko.com wrote:
On Mon, Jan 13, 2025 at 07:22:19PM -0600, Tom Rini wrote:
On Mon, Jan 13, 2025 at 05:13:46PM -0700, Simon Glass wrote:
Hi Tom,
On Mon, 13 Jan 2025 at 13:44, Tom Rini trini@konsulko.com wrote:
On Mon, Jan 13, 2025 at 01:03:52PM -0700, Simon Glass wrote:
Hi Tom,
On Sat, 11 Jan 2025 at 15:54, Tom Rini trini@konsulko.com wrote:
On Thu, Jan 09, 2025 at 05:29:57AM -0700, Simon Glass wrote:
> Loading a FIT is useful for other VBE methods, such as ABrec. Create a > new function to handling reading it. > > Signed-off-by: Simon Glass sjg@chromium.org
This causes a bunch of growth: a3y17lte : all +1328 text +1328 u-boot: add: 8/0, grow: 1/0 bytes: 1328/0 (1328) function old new delta blkcache_fill - 332 +332 blkcache_read - 240 +240 blk_read - 188 +188 vbe_read_nvdata - 156 +156 vbe_read_version - 140 +140 vbe_get_blk - 100 +100 simple_read_nvdata - 96 +96 crc8 - 72 +72 vbe_simple_read_state 108 112 +4
Which is unexpected for just moving code around that's not newly used.
I hadn't noticed that on the boards I was trying, so thank you for spotting it.
This is because it now uses blk_read() instead of blk_dread(), so if
That's not what this patch does? There's no caller before or after in this patch of "blk_dread". Just moving functions around should not increase size on platforms that weren't using the existing functionality.
Firstly, are we looking at the same patch? Here is the one I am looking at:
https://patchwork.ozlabs.org/project/uboot/patch/20250109123010.4005298-2-sj...
You're right, I replied to the wrong patch here, sorry for the confusion. I'll move some of my comments in reply to the correct patch now.
[snip]
And even when it's just a move it's still growing: xilinx_zynqmp_virt: all +128 bss -72 text +200 u-boot: add: 4/0, grow: 0/-1 bytes: 540/-340 (200) function old new delta vbe_read_nvdata - 156 +156 vbe_get_blk - 148 +148 vbe_read_version - 140 +140 simple_read_nvdata - 96 +96 vbe_simple_read_state 452 112 -340
Unfortunately this one is hard to fix. As you know, whenever you take code from a single module and put it into another, the compiler cannot optimise away the function-call overhead. I'll note that there is no increase when LTO is used, e.g. with xilinx_versal_net_mini_qspi
Yes, but 200 bytes isn't just function call overhead. Some of that might be from going from one ALLOC_CACHE_ALIGN_BUFFER(u8, buf, MMC_MAX_BLOCK_LEN) to two?
This is the double buffer I was referring to.
OK, well that is just a declaration, so doesn't add any code, so far as I understand it.
I ran out of time this morning, but did find that the compiler cannot (of course) optimise the common range-checks when the code is in different files. So I fiddled with removing some of them and that gets the growth down, about 40 bytes on Thumb2 and 130 on aarch64, both without LTO.
I'll send a new series out later today.
Regards, Simon

It is convenient to use TEXT_BASE as a place to hold the FIT header, but this does not work in VPL, since SDRAM is not inited yet.
Allocate the memory instead. Ensure the size is aligned to the media block-size so that it can be read in directly. Improve the error-checking for blk_read() and add some more debugging.
Keep the existing TEXT_BASE mechanism in sandbox to avoid an 'Exec format error' when trying to run the image.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 36 ++++++++++++++++++++++++++++-------- 1 file changed, 28 insertions(+), 8 deletions(-)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index c009337ef3f..d2d20cabe00 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -27,6 +27,7 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, struct bootm_headers images = {}; enum image_phase_t phase; struct blk_desc *desc; + ulong aligned_size; int node, ret; void *buf;
@@ -46,20 +47,37 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, if (size > area_size) return log_msg_ret("fdt", -E2BIG); log_debug("FIT size %lx\n", size); + aligned_size = ALIGN(size, desc->blksz);
/* * Load the FIT into the SPL memory. This is typically a FIT with * external data, so this is quite small, perhaps a few KB. */ - addr = CONFIG_VAL(TEXT_BASE); - buf = map_sysmem(addr, size); - num_blks = DIV_ROUND_UP(size, desc->blksz); - log_debug("read %lx, %lx blocks to %lx / %p\n", size, num_blks, addr, - buf); + if (IS_ENABLED(CONFIG_SANDBOX)) { + addr = CONFIG_VAL(TEXT_BASE); + buf = map_sysmem(addr, size); + } else { + buf = malloc(aligned_size); + if (!buf) + return log_msg_ret("fit", -ENOMEM); + addr = map_to_sysmem(buf); + } + num_blks = aligned_size / desc->blksz; + log_debug("read %lx, %lx blocks to %lx / %p\n", aligned_size, num_blks, + addr, buf); ret = blk_read(blk, blknum, num_blks, buf); if (ret < 0) - return log_msg_ret("rd", ret); - + return log_msg_ret("rd3", ret); + else if (ret != num_blks) + return log_msg_ret("rd4", -EIO); + log_debug("check total size %x off_dt_strings %x\n", fdt_totalsize(buf), + fdt_off_dt_strings(buf)); + +#if CONFIG_IS_ENABLED(SYS_MALLOC_F) + log_debug("malloc base %lx ptr %x limit %x top %lx\n", + gd->malloc_base, gd->malloc_ptr, gd->malloc_limit, + gd->malloc_base + gd->malloc_limit); +#endif /* figure out the phase to load */ phase = IS_ENABLED(CONFIG_VPL_BUILD) ? IH_PHASE_SPL : IH_PHASE_U_BOOT;
@@ -82,7 +100,9 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, log_debug("loaded to %lx\n", load_addr);
/* For FIT external data, read in the external data */ - if (load_addr + len > addr + size) { + log_debug("load_addr %lx len %lx addr %lx aligned_size %lx\n", + load_addr, len, addr, aligned_size); + if (load_addr + len > addr + aligned_size) { ulong base, full_size; void *base_buf;

At present the VBE implementation is limited to sandbox only. Adjust the call to fit_image_load() to remove this limitation.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index d2d20cabe00..dfe44822430 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -91,7 +91,7 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, fit_uname_config = NULL; log_debug("loading FIT\n"); ret = fit_image_load(&images, addr, &fit_uname, &fit_uname_config, - IH_ARCH_SANDBOX, image_ph(phase, IH_TYPE_FIRMWARE), + IH_ARCH_DEFAULT, image_ph(phase, IH_TYPE_FIRMWARE), BOOTSTAGE_ID_FIT_SPL_START, FIT_LOAD_IGNORED, &load_addr, &len); if (ret < 0)

This function can read fewer blocks than requested, so update the checks to handle this.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index dfe44822430..c7f09c6d090 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -39,6 +39,8 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, ret = blk_read(blk, blknum, 1, sbuf); if (ret < 0) return log_msg_ret("rd", ret); + else if (ret != 1) + return log_msg_ret("rd2", -EIO);
ret = fdt_check_header(sbuf); if (ret < 0) @@ -127,6 +129,8 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, blknum, full_size, num_blks, base, base_buf, ret); if (ret < 0) return log_msg_ret("rd", ret); + if (ret != num_blks) + return log_msg_ret("rd", -EIO); } if (load_addrp) *load_addrp = load_addr;

There is no guarantee that an FIT image starts on a block boundary. When it doesn't, the image starts part-way through the first block.
Add logic to detect this and copy the image down into place.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index c7f09c6d090..02e2c4ae066 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -105,32 +105,41 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, log_debug("load_addr %lx len %lx addr %lx aligned_size %lx\n", load_addr, len, addr, aligned_size); if (load_addr + len > addr + aligned_size) { - ulong base, full_size; + ulong base, full_size, offset, extra; void *base_buf;
/* Find the start address to load from */ base = ALIGN_DOWN(load_addr, desc->blksz);
+ offset = area_offset + load_addr - addr; + blknum = offset / desc->blksz; + extra = offset % desc->blksz; + /* * Get the total number of bytes to load, taking care of * block alignment */ - full_size = load_addr + len - base; + full_size = len + extra;
/* * Get the start block number, number of blocks and the address * to load to, then load the blocks */ - blknum = (area_offset + base - addr) / desc->blksz; num_blks = DIV_ROUND_UP(full_size, desc->blksz); base_buf = map_sysmem(base, full_size); ret = blk_read(blk, blknum, num_blks, base_buf); - log_debug("read %lx %lx, %lx blocks to %lx / %p: ret=%d\n", - blknum, full_size, num_blks, base, base_buf, ret); + log_debug("read foffset %lx blknum %lx full_size %lx num_blks %lx to %lx / %p: ret=%d\n", + offset - 0x8000, blknum, full_size, num_blks, base, base_buf, + ret); if (ret < 0) return log_msg_ret("rd", ret); if (ret != num_blks) return log_msg_ret("rd", -EIO); + if (extra && !IS_ENABLED(CONFIG_SANDBOX)) { + log_debug("move %p %p %lx\n", base_buf, + base_buf + extra, len); + memmove(base_buf, base_buf + extra, len); + } } if (load_addrp) *load_addrp = load_addr;

In some cases only the 'loadable' property is present in the FIT. Handle this by loading the first such image.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index 02e2c4ae066..4b297e780b3 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -96,6 +96,13 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, IH_ARCH_DEFAULT, image_ph(phase, IH_TYPE_FIRMWARE), BOOTSTAGE_ID_FIT_SPL_START, FIT_LOAD_IGNORED, &load_addr, &len); + if (ret == -ENOENT) { + ret = fit_image_load(&images, addr, &fit_uname, + &fit_uname_config, IH_ARCH_DEFAULT, + image_ph(phase, IH_TYPE_LOADABLE), + BOOTSTAGE_ID_FIT_SPL_START, + FIT_LOAD_IGNORED, &load_addr, &len); + } if (ret < 0) return log_msg_ret("ld", ret); node = ret;

In many cases the FIT includes a devicetree. Add support for loading this into a suitable place in memory.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 33 ++++++++++++++++++++++++++++++--- 1 file changed, 30 insertions(+), 3 deletions(-)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index 4b297e780b3..742e754b133 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -23,11 +23,11 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, { ALLOC_CACHE_ALIGN_BUFFER(u8, sbuf, MMC_MAX_BLOCK_LEN); ulong size, blknum, addr, len, load_addr, num_blks; + ulong aligned_size, fdt_load_addr, fdt_size; const char *fit_uname, *fit_uname_config; struct bootm_headers images = {}; enum image_phase_t phase; struct blk_desc *desc; - ulong aligned_size; int node, ret; void *buf;
@@ -108,12 +108,16 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, node = ret; log_debug("loaded to %lx\n", load_addr);
+ fdt_load_addr = 0; + fdt_size = 0; + /* For FIT external data, read in the external data */ log_debug("load_addr %lx len %lx addr %lx aligned_size %lx\n", load_addr, len, addr, aligned_size); if (load_addr + len > addr + aligned_size) { - ulong base, full_size, offset, extra; - void *base_buf; + ulong base, full_size, offset, extra, fdt_base, fdt_full_size; + ulong fdt_offset; + void *base_buf, *fdt_base_buf;
/* Find the start address to load from */ base = ALIGN_DOWN(load_addr, desc->blksz); @@ -147,6 +151,29 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, base_buf + extra, len); memmove(base_buf, base_buf + extra, len); } + + /* now the FDT */ + if (fdt_size) { + fdt_offset = area_offset + fdt_load_addr - addr; + blknum = fdt_offset / desc->blksz; + extra = fdt_offset % desc->blksz; + fdt_full_size = fdt_size + extra; + num_blks = DIV_ROUND_UP(fdt_full_size, desc->blksz); + fdt_base = ALIGN(base + len, 4); + fdt_base_buf = map_sysmem(fdt_base, fdt_size); + ret = blk_read(blk, blknum, num_blks, fdt_base_buf); + log_debug("fdt read foffset %lx blknum %lx full_size %lx num_blks %lx to %lx / %p: ret=%d\n", + fdt_offset - 0x8000, blknum, fdt_full_size, num_blks, + fdt_base, fdt_base_buf, ret); + if (ret != num_blks) + return log_msg_ret("rdf", -EIO); + if (extra) { + log_debug("move %p %p %lx\n", fdt_base_buf, + fdt_base_buf + extra, fdt_size); + memmove(fdt_base_buf, fdt_base_buf + extra, + fdt_size); + } + } } if (load_addrp) *load_addrp = load_addr;

Add some fields to track the VBE state in SPL.
Signed-off-by: Simon Glass sjg@chromium.org ---
include/spl.h | 41 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-)
diff --git a/include/spl.h b/include/spl.h index 43b344dbc55..781e5a2d638 100644 --- a/include/spl.h +++ b/include/spl.h @@ -14,6 +14,7 @@ #include <asm/global_data.h> #include <asm/spl.h> #include <handoff.h> +#include <image.h> #include <mmc.h>
struct blk_desc; @@ -265,6 +266,11 @@ enum spl_sandbox_flags { SPL_SANDBOXF_ARG_IS_BUF, };
+/** + * struct spl_image_info - Information about the SPL image being loaded + * + * @fdt_size: Size of the FDT for the image (0 if none) + */ struct spl_image_info { const char *name; u8 os; @@ -276,6 +282,7 @@ struct spl_image_info { u32 boot_device; u32 offset; u32 size; + ulong fdt_size; u32 flags; void *arg; #ifdef CONFIG_SPL_LEGACY_IMAGE_CRC_CHECK @@ -316,12 +323,18 @@ typedef ulong (*spl_load_reader)(struct spl_load_info *load, ulong sector, * @read: Function to call to read from the device * @priv: Private data for the device * @bl_len: Block length for reading in bytes + * @phase: Image phase to load + * @fit_loaded: true if the FIT has been loaded, except for external data */ struct spl_load_info { spl_load_reader read; void *priv; #if IS_ENABLED(CONFIG_SPL_LOAD_BLOCK) - int bl_len; + u16 bl_len; +#endif +#if CONFIG_IS_ENABLED(BOOTMETH_VBE) + u8 phase; + u8 fit_loaded; #endif };
@@ -344,6 +357,32 @@ static inline void spl_set_bl_len(struct spl_load_info *info, int bl_len) #endif }
+static inline void xpl_set_phase(struct spl_load_info *info, + enum image_phase_t phase) +{ +#if CONFIG_IS_ENABLED(BOOTMETH_VBE) + info->phase = phase; +#endif +} + +static inline enum image_phase_t xpl_get_phase(struct spl_load_info *info) +{ +#if CONFIG_IS_ENABLED(BOOTMETH_VBE) + return info->phase; +#else + return IH_PHASE_NONE; +#endif +} + +static inline bool xpl_get_fit_loaded(struct spl_load_info *info) +{ +#if CONFIG_IS_ENABLED(BOOTMETH_VBE) + return info->fit_loaded; +#else + return false; +#endif +} + /** * spl_load_init() - Set up a new spl_load_info structure */

This function will be used by the relocating jumper too, so add a typedef to the header file to avoid mismatches.
Signed-off-by: Simon Glass sjg@chromium.org ---
common/spl/spl.c | 3 +-- include/spl.h | 3 +++ 2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/common/spl/spl.c b/common/spl/spl.c index ad31a2f8b6c..09e6dc26f5e 100644 --- a/common/spl/spl.c +++ b/common/spl/spl.c @@ -671,8 +671,7 @@ void board_init_r(gd_t *dummy1, ulong dummy2) BOOT_DEVICE_NONE, BOOT_DEVICE_NONE, }; - typedef void __noreturn (*jump_to_image_t)(struct spl_image_info *); - jump_to_image_t jump_to_image = &jump_to_image_no_args; + spl_jump_to_image_t jump_to_image = &jump_to_image_no_args; struct spl_image_info spl_image; int ret, os;
diff --git a/include/spl.h b/include/spl.h index 781e5a2d638..488adbeb1bc 100644 --- a/include/spl.h +++ b/include/spl.h @@ -292,6 +292,9 @@ struct spl_image_info { #endif };
+/* function to jump to an image from SPL */ +typedef void __noreturn (*spl_jump_to_image_t)(struct spl_image_info *); + static inline void *spl_image_fdt_addr(struct spl_image_info *info) { #if CONFIG_IS_ENABLED(LOAD_FIT) || CONFIG_IS_ENABLED(LOAD_FIT_FULL)

When one xPL phase wants to jump to the next, the next phase must be loaded into its required address. This means that the TEXT_BASE for the two phases must be different and there cannot be any memory overlap between the code used by the two phases. It also can mean that phases need to be moved around to accommodate any size growth.
Having two xPL phases in SRAM at the same time can be tricky if SRAM is limited, which it often is. It would be better if the second phase could be loaded somewhere else, then decompressed into place over the top of the first phase.
Introduce a relocating jump for xPL to support this. This selects a suitable place to load the (typically compressed) next phase, copies some decompression code out of the first phase, then jumps to this code to decompress and start the next phase.
This feature makes it much easier to support Verified Boot for Embedded (VBE) on RK3399 boards, which have 192KB of SRAM.
Signed-off-by: Simon Glass sjg@chromium.org ---
common/spl/Kconfig | 8 ++ common/spl/Kconfig.tpl | 8 ++ common/spl/Kconfig.vpl | 8 ++ common/spl/Makefile | 1 + common/spl/spl_reloc.c | 183 +++++++++++++++++++++++++++++++++++++++++ include/spl.h | 45 ++++++++++ 6 files changed, 253 insertions(+) create mode 100644 common/spl/spl_reloc.c
diff --git a/common/spl/Kconfig b/common/spl/Kconfig index 045fcac10a5..a250bde779f 100644 --- a/common/spl/Kconfig +++ b/common/spl/Kconfig @@ -981,6 +981,14 @@ config SPL_NAND_IDENT help SPL uses the chip ID list to identify the NAND flash.
+config SPL_RELOC_LOADER + bool "Allow relocating the next phase" + help + In some cases multiple U-Boot phases need to run in SRAM, typically + at the same address. Enable this to support loading the next phase + to temporary memory, then copying it into place afterwards, then + jumping to it. + config SPL_UBI bool "Support UBI" help diff --git a/common/spl/Kconfig.tpl b/common/spl/Kconfig.tpl index 92d4d43ec87..22ca7016453 100644 --- a/common/spl/Kconfig.tpl +++ b/common/spl/Kconfig.tpl @@ -268,6 +268,14 @@ config TPL_RAM_DEVICE be already in memory when TPL takes over, e.g. loaded by the boot ROM.
+config TPL_RELOC_LOADER + bool "Allow relocating the next phase" + help + In some cases multiple U-Boot phases need to run in SRAM, typically + at the same address. Enable this to support loading the next phase + to temporary memory, then copying it into place afterwards, then + jumping to it. + config TPL_RTC bool "Support RTC drivers" help diff --git a/common/spl/Kconfig.vpl b/common/spl/Kconfig.vpl index eb57dfabea5..97dfc630152 100644 --- a/common/spl/Kconfig.vpl +++ b/common/spl/Kconfig.vpl @@ -181,6 +181,14 @@ config VPL_PCI necessary driver support. This enables the drivers in drivers/pci as part of a VPL build.
+config VPL_RELOC_LOADER + bool "Allow relocating the next phase" + help + In some cases multiple U-Boot phases need to run in SRAM, typically + at the same address. Enable this to support loading the next phase + to temporary memory, then copying it into place afterwards, then + jumping to it. + config VPL_RTC bool "Support RTC drivers" help diff --git a/common/spl/Makefile b/common/spl/Makefile index 75123eb666b..4c9482bd309 100644 --- a/common/spl/Makefile +++ b/common/spl/Makefile @@ -12,6 +12,7 @@ obj-$(CONFIG_$(PHASE_)BOOTROM_SUPPORT) += spl_bootrom.o obj-$(CONFIG_$(PHASE_)LOAD_FIT) += spl_fit.o obj-$(CONFIG_$(PHASE_)BLK_FS) += spl_blk_fs.o obj-$(CONFIG_$(PHASE_)LEGACY_IMAGE_FORMAT) += spl_legacy.o +obj-$(CONFIG_$(PHASE_)RELOC_LOADER) += spl_reloc.o obj-$(CONFIG_$(PHASE_)NOR_SUPPORT) += spl_nor.o obj-$(CONFIG_$(PHASE_)XIP_SUPPORT) += spl_xip.o obj-$(CONFIG_$(PHASE_)YMODEM_SUPPORT) += spl_ymodem.o diff --git a/common/spl/spl_reloc.c b/common/spl/spl_reloc.c new file mode 100644 index 00000000000..be8349b535b --- /dev/null +++ b/common/spl/spl_reloc.c @@ -0,0 +1,183 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright 2024 Google LLC + * Written by Simon Glass sjg@chromium.org + */ + +#include <gzip.h> +#include <image.h> +#include <log.h> +#include <mapmem.h> +#include <spl.h> +#include <asm/global_data.h> +#include <asm/io.h> +#include <asm/sections.h> +#include <asm/unaligned.h> +#include <linux/types.h> +#include <lzma/LzmaTypes.h> +#include <lzma/LzmaDec.h> +#include <lzma/LzmaTools.h> +#include <u-boot/crc.h> +#include <u-boot/lz4.h> + +DECLARE_GLOBAL_DATA_PTR; + +/* provide a way to jump straight into the relocation code, for debugging */ +#define DEBUG_JUMP 0 + +enum { + /* margin to allow for stack growth */ + RELOC_STACK_MARGIN = 0x800, + + /* align base address for DMA controllers which require it */ + BASE_ALIGN = 0x200, + + STACK_PROT_VALUE = 0x51ce4697, +}; + +typedef int (*rcode_func)(struct spl_image_info *image); + +static int setup_layout(struct spl_image_info *image, ulong *addrp) +{ + ulong base, fdt_size; + ulong limit, rcode_base; + uint rcode_size; + int buf_size, margin; + char *rcode_buf; + + limit = ALIGN(map_to_sysmem(&limit) - RELOC_STACK_MARGIN, 8); + image->stack_prot = map_sysmem(limit, sizeof(uint)); + *image->stack_prot = STACK_PROT_VALUE; + + fdt_size = fdt_totalsize(gd->fdt_blob); + base = ALIGN(map_to_sysmem(gd->fdt_blob) + fdt_size + BASE_ALIGN - 1, + BASE_ALIGN); + + rcode_size = _rcode_end - _rcode_start; + rcode_base = limit - rcode_size; + buf_size = rcode_base - base; + uint need_size = image->size + image->fdt_size; + margin = buf_size - need_size; + log_debug("spl_reloc %s->%s: margin%s%lx limit %lx fdt_size %lx base %lx avail %x image %x fdt %lx need %x\n", + spl_phase_name(spl_phase()), spl_phase_name(spl_phase() + 1), + margin >= 0 ? " " : " -", abs(margin), limit, fdt_size, base, + buf_size, image->size, image->fdt_size, need_size); + if (margin < 0) { + log_err("Image size %x but buffer is only %x\n", need_size, + buf_size); + return -ENOSPC; + } + + rcode_buf = map_sysmem(rcode_base, rcode_size); + log_debug("_rcode_start %p: %x -- func %p %x\n", _rcode_start, + *(uint *)_rcode_start, setup_layout, *(uint *)setup_layout); + + image->reloc_offset = rcode_buf - _rcode_start; + log_debug("_rcode start %lx base %lx size %x offset %lx\n", + (ulong)map_to_sysmem(_rcode_start), rcode_base, rcode_size, + image->reloc_offset); + + memcpy(rcode_buf, _rcode_start, rcode_size); + + image->buf = map_sysmem(base, need_size); + image->fdt_buf = image->buf + image->size; + image->rcode_buf = rcode_buf; + *addrp = base; + + return 0; +} + +int spl_reloc_prepare(struct spl_image_info *image, ulong *addrp) +{ + int ret; + + ret = setup_layout(image, addrp); + if (ret) + return ret; + + return 0; +} + +typedef void __noreturn (*image_entry_noargs_t)(uint crc, uint unc_len); + +/* this is the relocation + jump code that is copied to the top of memory */ +__rcode int rcode_reloc_and_jump(struct spl_image_info *image) +{ + image_entry_noargs_t entry = (image_entry_noargs_t)image->entry_point; + u32 *dst; + ulong image_len; + size_t unc_len; + int ret, crc; + uint magic; + + dst = map_sysmem(image->load_addr, image->size); + unc_len = (void *)image->rcode_buf - (void *)dst; + image_len = image->size; + if (*image->stack_prot != STACK_PROT_VALUE) + return -EFAULT; + magic = get_unaligned_le32(image->buf); + if (CONFIG_IS_ENABLED(LZMA)) { + SizeT lzma_len = unc_len; + + ret = lzmaBuffToBuffDecompress((u8 *)dst, &lzma_len, + image->buf, image_len); + unc_len = lzma_len; + } else if (CONFIG_IS_ENABLED(GZIP)) { + ret = gunzip(dst, unc_len, image->buf, &image_len); + } else if (CONFIG_IS_ENABLED(LZ4) && magic == LZ4F_MAGIC) { + ret = ulz4fn(image->buf, image_len, dst, &unc_len); + if (ret) + return ret; + } else { + u32 *src, *end, *ptr; + + unc_len = image->size; + for (src = image->buf, end = (void *)src + image->size, + ptr = dst; src < end;) + *ptr++ = *src++; + } + if (*image->stack_prot != STACK_PROT_VALUE) + return -EFAULT; + + /* copy in the FDT if needed */ + if (image->fdt_size) + memcpy(image->fdt_start, image->fdt_buf, image->fdt_size); + + crc = crc8(0, (u8 *)dst, unc_len); + + /* jump to the entry point */ + entry(crc, unc_len); +} + +int spl_reloc_jump(struct spl_image_info *image, spl_jump_to_image_t jump) +{ + rcode_func loader; + int ret; + + log_debug("malloc usage %lx bytes (%ld KB of %d KB)\n", gd->malloc_ptr, + gd->malloc_ptr / 1024, CONFIG_VAL(SYS_MALLOC_F_LEN) / 1024); + + if (*image->stack_prot != STACK_PROT_VALUE) { + log_err("stack busted, cannot continue\n"); + return -EFAULT; + } + loader = (rcode_func)(void *)rcode_reloc_and_jump + image->reloc_offset; + log_debug("Jumping via %p to %lx - image %p size %x load %lx\n", loader, + image->entry_point, image, image->size, image->load_addr); + + log_debug("unc_len %lx\n", + image->rcode_buf - map_sysmem(image->load_addr, image->size)); + if (DEBUG_JUMP) { + rcode_reloc_and_jump(image); + } else { + /* + * Must disable LOG_DEBUG since the decompressor cannot call + * log functions, printf(), etc. + */ + _Static_assert(DEBUG_JUMP || !_DEBUG, + "Cannot have debug output from decompressor"); + ret = loader(image); + } + + return -EFAULT; +} diff --git a/include/spl.h b/include/spl.h index 488adbeb1bc..9cfba98db55 100644 --- a/include/spl.h +++ b/include/spl.h @@ -270,6 +270,16 @@ enum spl_sandbox_flags { * struct spl_image_info - Information about the SPL image being loaded * * @fdt_size: Size of the FDT for the image (0 if none) + * @buf: Buffer where the image should be loaded + * @fdt_buf: Buffer where the FDT will be copied by spl_reloc_jump(), only used + * if @fdt_size is non-zero + * @fdt_start: Pointer to the FDT to be copied (must be set up before calling + * spl_reloc_jump() + * @rcode_buf: Buffer to hold the relocating-jump code + * @stack_prot: Pointer to the stack-protection value, used to ensure the stack + * does not overflow + * @reloc_offset: offset between the relocating-jump code and its place in the + * currently running image */ struct spl_image_info { const char *name; @@ -290,6 +300,14 @@ struct spl_image_info { ulong dcrc_length; ulong dcrc; #endif +#if CONFIG_IS_ENABLED(RELOC_LOADER) + void *buf; + void *fdt_buf; + void *fdt_start; + void *rcode_buf; + uint *stack_prot; + ulong reloc_offset; +#endif };
/* function to jump to an image from SPL */ @@ -1155,4 +1173,31 @@ int spl_write_upl_handoff(struct spl_image_info *spl_image); */ void spl_upl_init(void);
+/** + * spl_reloc_prepare() - Prepare the relocating loader ready for use + * + * Sets up the relocating loader ready for use. This must be called before + * spl_reloc_jump() can be used. + * + * The memory layout is figured out, making use of the space between the top of + * the current image and the top of memory. + * + * Once this is done, the relocating-jump code is copied into place at + * image->rcode_buf + * + * @image: SPL image containing information. This is updated with various + * necessary values. On entry, the size and fdt_size fields must be valid + * @addrp: Returns the address to which the image should be loaded into memory + * Return 0 if OK, -ENOSPC if there is not enough memory available + */ +int spl_reloc_prepare(struct spl_image_info *image, ulong *addrp); + +/** + * spl_reloc_jump() - Jump to an image, via a 'relocating-jump' region + * + * @image: SPL image to jump to + * @func: Function to call in the final image + */ +int spl_reloc_jump(struct spl_image_info *image, spl_jump_to_image_t func); + #endif

This is fairly easy to use. The SPL loader sets up some fields in the spl_image_info struct and calls spl_reloc_prepare(). When SPL is ready to do the jump it must call spl_reloc_jump() instead of jump_to_image().
Add this logic.
Signed-off-by: Simon Glass sjg@chromium.org ---
common/spl/spl.c | 12 ++++++++++++ include/spl.h | 1 + 2 files changed, 13 insertions(+)
diff --git a/common/spl/spl.c b/common/spl/spl.c index 09e6dc26f5e..7cfbab06419 100644 --- a/common/spl/spl.c +++ b/common/spl/spl.c @@ -826,6 +826,18 @@ void board_init_r(gd_t *dummy1, ulong dummy2) }
spl_board_prepare_for_boot(); + + if (CONFIG_IS_ENABLED(RELOC_LOADER)) { + int ret; + + ret = spl_reloc_jump(&spl_image, jump_to_image); + if (ret) { + if (xpl_phase() == PHASE_VPL) + printf("jump failed %d\n", ret); + hang(); + } + } + jump_to_image(&spl_image); }
diff --git a/include/spl.h b/include/spl.h index 9cfba98db55..7155e9c67aa 100644 --- a/include/spl.h +++ b/include/spl.h @@ -414,6 +414,7 @@ static inline void spl_load_init(struct spl_load_info *load, load->read = h_read; load->priv = priv; spl_set_bl_len(load, bl_len); + xpl_set_phase(load, IH_PHASE_NONE); }
/*

Add FDT support so that this can be copied down in memory after loading and made available to the new image.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index 742e754b133..46b39cf8910 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -173,6 +173,26 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, memmove(fdt_base_buf, fdt_base_buf + extra, fdt_size); } +#if CONFIG_IS_ENABLED(RELOC_LOADER) + image->fdt_buf = fdt_base_buf; + + ulong xpl_size; + ulong xpl_pad; + ulong fdt_start; + + if (xpl_phase() == PHASE_TPL) { + xpl_size = binman_sym(ulong, u_boot_vpl_nodtb, size); + xpl_pad = binman_sym(ulong, u_boot_vpl_bss_pad, size); + } else { + xpl_size = binman_sym(ulong, u_boot_spl_nodtb, size); + xpl_pad = binman_sym(ulong, u_boot_spl_bss_pad, size); + } + fdt_start = image->load_addr + xpl_size + xpl_pad; + log_debug("load_addr %lx xpl_size %lx copy-to %lx\n", + image->load_addr, xpl_size + xpl_pad, + fdt_start); + image->fdt_start = map_sysmem(fdt_start, fdt_size); +#endif } } if (load_addrp)

VBE needs to load different images from a FIT depending on the xPL phase in use. The IH_PHASE value is used to select the image to load.
Add the required logic to handle this. For compatibility with the SPL-loader driver, fill out a struct spl_image_info with the details needed to boot the next phase.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 85 ++++++++++++++++++++++++++++++++++++++++++-- boot/vbe_common.h | 4 ++- boot/vbe_simple_fw.c | 3 +- 3 files changed, 87 insertions(+), 5 deletions(-)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index 46b39cf8910..799f33d9ffa 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -18,17 +18,48 @@ #include <u-boot/crc.h> #include "vbe_common.h"
+binman_sym_declare(ulong, u_boot_vpl_nodtb, size); +binman_sym_declare(ulong, u_boot_vpl_bss_pad, size); +binman_sym_declare(ulong, u_boot_spl_nodtb, size); +binman_sym_declare(ulong, u_boot_spl_bss_pad, size); + +/** + * h_vbe_load_read() - Handler for reading an SPL image from a FIT + * + * See spl_load_reader for the definition + */ +ulong h_vbe_load_read(struct spl_load_info *load, ulong off, ulong size, + void *buf) +{ + struct blk_desc *desc = load->priv; + lbaint_t sector = off >> desc->log2blksz; + lbaint_t count = size >> desc->log2blksz; + int ret; + + log_debug("vbe read log2blksz %x offset %lx sector %lx count %lx\n", + desc->log2blksz, (ulong)off, (long)sector, (ulong)count); + + ret = blk_dread(desc, sector, count, buf); + log_debug("ret=%x\n", ret); + if (ret < 0) + return ret; + + return ret << desc->log2blksz; +} + int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, - ulong *load_addrp, ulong *lenp, char **namep) + struct spl_image_info *image, ulong *load_addrp, ulong *lenp, + char **namep) { ALLOC_CACHE_ALIGN_BUFFER(u8, sbuf, MMC_MAX_BLOCK_LEN); - ulong size, blknum, addr, len, load_addr, num_blks; + ulong size, blknum, addr, len, load_addr, num_blks, spl_load_addr; ulong aligned_size, fdt_load_addr, fdt_size; const char *fit_uname, *fit_uname_config; struct bootm_headers images = {}; enum image_phase_t phase; struct blk_desc *desc; int node, ret; + bool for_xpl; void *buf;
desc = dev_get_uclass_plat(blk); @@ -81,7 +112,8 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, gd->malloc_base + gd->malloc_limit); #endif /* figure out the phase to load */ - phase = IS_ENABLED(CONFIG_VPL_BUILD) ? IH_PHASE_SPL : IH_PHASE_U_BOOT; + phase = IS_ENABLED(CONFIG_TPL_BUILD) ? IH_PHASE_NONE : + IS_ENABLED(CONFIG_VPL_BUILD) ? IH_PHASE_SPL : IH_PHASE_U_BOOT;
/* * Load the image from the FIT. We ignore any load-address information @@ -92,6 +124,20 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, fit_uname = NULL; fit_uname_config = NULL; log_debug("loading FIT\n"); + + if (xpl_phase() == PHASE_SPL && !IS_ENABLED(CONFIG_SANDBOX)) { + struct spl_load_info info; + + spl_load_init(&info, h_vbe_load_read, desc, desc->blksz); + xpl_set_phase(&info, IH_PHASE_U_BOOT); + log_debug("doing SPL from %s blksz %lx log2blksz %x area_offset %lx + fdt_size %lx\n", + blk->name, desc->blksz, desc->log2blksz, area_offset, ALIGN(size, 4)); + ret = spl_load_simple_fit(image, &info, area_offset, buf); + log_debug("spl_load_abrec_fit() ret=%d\n", ret); + + return ret; + } + ret = fit_image_load(&images, addr, &fit_uname, &fit_uname_config, IH_ARCH_DEFAULT, image_ph(phase, IH_TYPE_FIRMWARE), BOOTSTAGE_ID_FIT_SPL_START, FIT_LOAD_IGNORED, @@ -110,6 +156,31 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size,
fdt_load_addr = 0; fdt_size = 0; + if ((xpl_phase() == PHASE_TPL || xpl_phase() == PHASE_VPL) && + !IS_ENABLED(CONFIG_SANDBOX)) { + /* allow use of a different image from the configuration node */ + fit_uname = NULL; + ret = fit_image_load(&images, addr, &fit_uname, + &fit_uname_config, IH_ARCH_DEFAULT, + image_ph(phase, IH_TYPE_FLATDT), + BOOTSTAGE_ID_FIT_SPL_START, + FIT_LOAD_IGNORED, &fdt_load_addr, + &fdt_size); + fdt_size = ALIGN(fdt_size, desc->blksz); + log_debug("FDT noload to %lx size %lx\n", fdt_load_addr, + fdt_size); + } + + for_xpl = !USE_BOOTMETH && CONFIG_IS_ENABLED(RELOC_LOADER); + if (for_xpl) { + image->size = len; + image->fdt_size = fdt_size; + ret = spl_reloc_prepare(image, &spl_load_addr); + if (ret) + return log_msg_ret("spl", ret); + } + if (!IS_ENABLED(CONFIG_SANDBOX)) + image->os = IH_OS_U_BOOT;
/* For FIT external data, read in the external data */ log_debug("load_addr %lx len %lx addr %lx aligned_size %lx\n", @@ -137,6 +208,8 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, * to load to, then load the blocks */ num_blks = DIV_ROUND_UP(full_size, desc->blksz); + if (for_xpl) + base = spl_load_addr; base_buf = map_sysmem(base, full_size); ret = blk_read(blk, blknum, num_blks, base_buf); log_debug("read foffset %lx blknum %lx full_size %lx num_blks %lx to %lx / %p: ret=%d\n", @@ -152,6 +225,12 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, memmove(base_buf, base_buf + extra, len); }
+ if ((xpl_phase() == PHASE_VPL || xpl_phase() == PHASE_TPL) && + !IS_ENABLED(CONFIG_SANDBOX)) { + image->load_addr = spl_get_image_text_base(); + image->entry_point = image->load_addr; + } + /* now the FDT */ if (fdt_size) { fdt_offset = area_offset + fdt_load_addr - addr; diff --git a/boot/vbe_common.h b/boot/vbe_common.h index 403391ff1ea..98b7aff0b5d 100644 --- a/boot/vbe_common.h +++ b/boot/vbe_common.h @@ -155,6 +155,7 @@ int vbe_get_blk(const char *storage, struct udevice **blkp); * @blk: Block device containing FIT * @area_offset: Byte offset of the VBE area in @blk containing the FIT * @area_size: Size of the VBE area + * @image: SPL image to fill in with details of the loaded image, or NULL * @load_addrp: If non-null, returns the address where the image was loaded * @lenp: If non-null, returns the size of the image loaded, in bytes * @namep: If non-null, returns the name of the FIT-image node that was loaded @@ -166,6 +167,7 @@ int vbe_get_blk(const char *storage, struct udevice **blkp); * FIT-parsing (see fit_image_load()). */ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, - ulong *load_addrp, ulong *lenp, char **namep); + struct spl_image_info *image, ulong *load_addrp, ulong *lenp, + char **namep);
#endif /* __VBE_ABREC_H */ diff --git a/boot/vbe_simple_fw.c b/boot/vbe_simple_fw.c index 0bf25ccad23..9da3e49a66e 100644 --- a/boot/vbe_simple_fw.c +++ b/boot/vbe_simple_fw.c @@ -52,7 +52,8 @@ int vbe_simple_read_bootflow_fw(struct udevice *dev, struct bootflow *bflow) log_debug("blk=%s\n", blk->name);
ret = vbe_read_fit(blk, priv->area_start + priv->skip_offset, - priv->area_size, &load_addr, &len, &bflow->name); + priv->area_size, NULL, &load_addr, &len, + &bflow->name);
/* set up the bootflow with the info we obtained */ bflow->blk = blk;

For a sandbox implementation, where code size is no object, it makes sense to use the full bootstd drivers to load images.
For real boards, running from SRAM, this adds quite a bit of overhead.
Add a way to load the next phase using just the underlying storage driver, to reduce code size. For now, only MMC is supported.
Change the log_debug() to show the load address and size in a more neutral way, rather than suggesting that the load has already happened.
Signed-off-by: Simon Glass sjg@chromium.org ---
boot/vbe_common.c | 2 +- boot/vbe_simple_fw.c | 112 +++++++++++++++++++++++++++++-------------- 2 files changed, 77 insertions(+), 37 deletions(-)
diff --git a/boot/vbe_common.c b/boot/vbe_common.c index 799f33d9ffa..549a3f955d1 100644 --- a/boot/vbe_common.c +++ b/boot/vbe_common.c @@ -152,7 +152,7 @@ int vbe_read_fit(struct udevice *blk, ulong area_offset, ulong area_size, if (ret < 0) return log_msg_ret("ld", ret); node = ret; - log_debug("loaded to %lx\n", load_addr); + log_debug("load %lx size %lx\n", load_addr, len);
fdt_load_addr = 0; fdt_size = 0; diff --git a/boot/vbe_simple_fw.c b/boot/vbe_simple_fw.c index 9da3e49a66e..cb5534fc731 100644 --- a/boot/vbe_simple_fw.c +++ b/boot/vbe_simple_fw.c @@ -8,6 +8,7 @@
#define LOG_CATEGORY LOGC_BOOT
+#include <binman_sym.h> #include <bloblist.h> #include <bootdev.h> #include <bootflow.h> @@ -17,13 +18,24 @@ #include <image.h> #include <log.h> #include <mapmem.h> -#include <memalign.h> #include <mmc.h> #include <spl.h> #include <vbe.h> #include <dm/device-internal.h> +#include "vbe_common.h" #include "vbe_simple.h"
+#ifdef CONFIG_BOOTMETH_VBE_SIMPLE +binman_sym_extern(ulong, vbe_a, image_pos); +binman_sym_extern(ulong, vbe_a, size); +#else +binman_sym_declare(ulong, vbe_a, image_pos); +binman_sym_declare(ulong, vbe_a, size); +#endif + +binman_sym_declare(ulong, vpl, image_pos); +binman_sym_declare(ulong, vpl, size); + /** * vbe_simple_read_bootflow_fw() - Create a bootflow for firmware * @@ -54,6 +66,8 @@ int vbe_simple_read_bootflow_fw(struct udevice *dev, struct bootflow *bflow) ret = vbe_read_fit(blk, priv->area_start + priv->skip_offset, priv->area_size, NULL, &load_addr, &len, &bflow->name); + if (ret) + return log_msg_ret("vbe", ret);
/* set up the bootflow with the info we obtained */ bflow->blk = blk; @@ -63,16 +77,14 @@ int vbe_simple_read_bootflow_fw(struct udevice *dev, struct bootflow *bflow) return 0; }
-static int simple_load_from_image(struct spl_image_info *spl_image, +static int simple_load_from_image(struct spl_image_info *image, struct spl_boot_device *bootdev) { - struct udevice *meth, *bdev; - struct simple_priv *priv; - struct bootflow bflow; struct vbe_handoff *handoff; int ret;
- if (xpl_phase() != PHASE_VPL && xpl_phase() != PHASE_SPL) + if (xpl_phase() != PHASE_VPL && xpl_phase() != PHASE_SPL && + xpl_phase() != PHASE_TPL) return -ENOENT;
ret = bloblist_ensure_size(BLOBLISTT_VBE, sizeof(struct vbe_handoff), @@ -80,36 +92,64 @@ static int simple_load_from_image(struct spl_image_info *spl_image, if (ret) return log_msg_ret("ro", ret);
- vbe_find_first_device(&meth); - if (!meth) - return log_msg_ret("vd", -ENODEV); - log_debug("vbe dev %s\n", meth->name); - ret = device_probe(meth); - if (ret) - return log_msg_ret("probe", ret); - - priv = dev_get_priv(meth); - log_debug("simple %s\n", priv->storage); - ret = bootdev_find_by_label(priv->storage, &bdev, NULL); - if (ret) - return log_msg_ret("bd", ret); - log_debug("bootdev %s\n", bdev->name); - - bootflow_init(&bflow, bdev, meth); - ret = bootmeth_read_bootflow(meth, &bflow); - log_debug("\nfw ret=%d\n", ret); - if (ret) - return log_msg_ret("rd", ret); - - /* jump to the image */ - spl_image->flags = SPL_SANDBOXF_ARG_IS_BUF; - spl_image->arg = bflow.buf; - spl_image->size = bflow.size; - log_debug("Image: %s at %p size %x\n", bflow.name, bflow.buf, - bflow.size); - - /* this is not used from now on, so free it */ - bootflow_free(&bflow); + if (USE_BOOTMETH) { + struct udevice *meth, *bdev; + struct simple_priv *priv; + struct bootflow bflow; + + vbe_find_first_device(&meth); + if (!meth) + return log_msg_ret("vd", -ENODEV); + log_debug("vbe dev %s\n", meth->name); + ret = device_probe(meth); + if (ret) + return log_msg_ret("probe", ret); + + priv = dev_get_priv(meth); + log_debug("simple %s\n", priv->storage); + ret = bootdev_find_by_label(priv->storage, &bdev, NULL); + if (ret) + return log_msg_ret("bd", ret); + log_debug("bootdev %s\n", bdev->name); + + bootflow_init(&bflow, bdev, meth); + ret = bootmeth_read_bootflow(meth, &bflow); + log_debug("\nfw ret=%d\n", ret); + if (ret) + return log_msg_ret("rd", ret); + + /* jump to the image */ + image->flags = SPL_SANDBOXF_ARG_IS_BUF; + image->arg = bflow.buf; + image->size = bflow.size; + log_debug("Image: %s at %p size %x\n", bflow.name, bflow.buf, + bflow.size); + + /* this is not used from now on, so free it */ + bootflow_free(&bflow); + } else { + struct udevice *media, *blk; + ulong offset, size; + + ret = uclass_get_device_by_seq(UCLASS_MMC, 1, &media); + if (ret) + return log_msg_ret("vdv", ret); + ret = blk_get_from_parent(media, &blk); + if (ret) + return log_msg_ret("med", ret); + if (xpl_phase() == PHASE_TPL) { + offset = binman_sym(ulong, vpl, image_pos); + size = binman_sym(ulong, vpl, size); + } else { + offset = binman_sym(ulong, vbe_a, image_pos); + size = binman_sym(ulong, vbe_a, size); + printf("offset=%lx\n", offset); + } + + ret = vbe_read_fit(blk, offset, size, image, NULL, NULL, NULL); + if (ret) + return log_msg_ret("vbe", ret); + }
/* Record that VBE was used in this phase */ handoff->phases |= 1 << xpl_phase();
participants (2)
-
Simon Glass
-
Tom Rini