[U-Boot] [PATCH 1/3] dm: Add migration plan for CONFIG_BLK

The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
Set a deadline of 9 months for this work, rounded up to the next release.
Signed-off-by: Simon Glass sjg@chromium.org ---
 doc/driver-model/MIGRATION.txt | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
 create mode 100644 doc/driver-model/MIGRATION.txt

diff --git a/doc/driver-model/MIGRATION.txt b/doc/driver-model/MIGRATION.txt
new file mode 100644
index 0000000000..d2fe027249
--- /dev/null
+++ b/doc/driver-model/MIGRATION.txt
@@ -0,0 +1,20 @@
+Migration Schedule
+====================
+
+U-Boot has been migrating to a new driver model since its introduction in
+2014. This file describes the schedule for deprecation of pre-driver-model
+features.
+
+
+CONFIG_BLK
+----------
+
+Status: In progress
+Deadline: 2018.05
+
+Maintainers should submit patches for enabling CONFIG_BLK on all boards in
+time for inclusion in the 2018.05 release. Boards not converted by this
+time may be removed in a subsequent release.
+
+Note that this implies use of driver model for all block devices (e.g.
+MMC, USB, SCSI, SATA).

Add a convenience macro to iterate over subnodes of a node. Make use of this where appropriate in the code.
Signed-off-by: Simon Glass sjg@chromium.org ---
 arch/arm/mach-tegra/xusb-padctl-common.c |  8 ++------
 drivers/core/ofnode.c                    |  9 +++++----
 drivers/misc/cros_ec.c                   |  3 +--
 drivers/power/pmic/pmic-uclass.c         |  4 +---
 include/dm/ofnode.h                      | 24 ++++++++++++++++++++++++
 5 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/arch/arm/mach-tegra/xusb-padctl-common.c b/arch/arm/mach-tegra/xusb-padctl-common.c
index 37b5b8fb5b..abc18c03a5 100644
--- a/arch/arm/mach-tegra/xusb-padctl-common.c
+++ b/arch/arm/mach-tegra/xusb-padctl-common.c
@@ -224,9 +224,7 @@ tegra_xusb_padctl_config_parse_dt(struct tegra_xusb_padctl *padctl,
 
 	config->name = ofnode_get_name(node);
 
-	for (subnode = ofnode_first_subnode(node);
-	     ofnode_valid(subnode);
-	     subnode = ofnode_next_subnode(subnode)) {
+	ofnode_for_each_subnode(subnode, node) {
 		struct tegra_xusb_padctl_group *group;
 		int err;
 
@@ -256,9 +254,7 @@ static int tegra_xusb_padctl_parse_dt(struct tegra_xusb_padctl *padctl,
 		return err;
 	}
 
-	for (subnode = ofnode_first_subnode(node);
-	     ofnode_valid(subnode);
-	     subnode = ofnode_next_subnode(subnode)) {
+	ofnode_for_each_subnode(subnode, node) {
 		struct tegra_xusb_padctl_config *config = &padctl->config;
 
 		debug("%s: subnode=%s\n", __func__, ofnode_get_name(subnode));
diff --git a/drivers/core/ofnode.c b/drivers/core/ofnode.c
index c1a2e9f0da..ae7cc833b9 100644
--- a/drivers/core/ofnode.c
+++ b/drivers/core/ofnode.c
@@ -390,10 +390,11 @@ int ofnode_decode_display_timing(ofnode parent, int index,
 	if (!ofnode_valid(timings))
 		return -EINVAL;
 
-	for (i = 0, node = ofnode_first_subnode(timings);
-	     ofnode_valid(node) && i != index;
-	     node = ofnode_first_subnode(node))
-		i++;
+	i = 0;
+	ofnode_for_each_subnode(node, timings) {
+		if (i++ == index)
+			break;
+	}
 
 	if (!ofnode_valid(node))
 		return -EINVAL;
diff --git a/drivers/misc/cros_ec.c b/drivers/misc/cros_ec.c
index feaa5d8567..eefaaa53ad 100644
--- a/drivers/misc/cros_ec.c
+++ b/drivers/misc/cros_ec.c
@@ -1038,8 +1038,7 @@ int cros_ec_decode_ec_flash(struct udevice *dev, struct fdt_cros_ec *config)
 
 	config->flash_erase_value = ofnode_read_s32_default(flash_node,
 							    "erase-value", -1);
-	for (node = ofnode_first_subnode(flash_node); ofnode_valid(node);
-	     node = ofnode_next_subnode(node)) {
+	ofnode_for_each_subnode(node, flash_node) {
 		const char *name = ofnode_get_name(node);
 		enum ec_flash_region region;
 
diff --git a/drivers/power/pmic/pmic-uclass.c b/drivers/power/pmic/pmic-uclass.c
index 953bbe5026..64964e4e96 100644
--- a/drivers/power/pmic/pmic-uclass.c
+++ b/drivers/power/pmic/pmic-uclass.c
@@ -34,9 +34,7 @@ int pmic_bind_children(struct udevice *pmic, ofnode parent,
 	debug("%s for '%s' at node offset: %d\n", __func__, pmic->name,
 	      dev_of_offset(pmic));
 
-	for (node = ofnode_first_subnode(parent);
-	     ofnode_valid(node);
-	     node = ofnode_next_subnode(node)) {
+	ofnode_for_each_subnode(node, parent) {
 		node_name = ofnode_get_name(node);
 
 		debug("* Found child node: '%s'\n", node_name);
diff --git a/include/dm/ofnode.h b/include/dm/ofnode.h
index 210ddb2e5d..22bece0a60 100644
--- a/include/dm/ofnode.h
+++ b/include/dm/ofnode.h
@@ -626,4 +626,28 @@ bool ofnode_pre_reloc(ofnode node);
 
 int ofnode_read_resource(ofnode node, uint index, struct resource *res);
 
+/**
+ * ofnode_for_each_subnode() - iterate over all subnodes of a parent
+ *
+ * @node:	child node (ofnode, lvalue)
+ * @parent:	parent node (ofnode)
+ *
+ * This is a wrapper around a for loop and is used like so:
+ *
+ *	ofnode node;
+ *
+ *	ofnode_for_each_subnode(node, parent) {
+ *		Use node
+ *		...
+ *	}
+ *
+ * Note that this is implemented as a macro and @node is used as
+ * iterator in the loop. The parent variable can be a constant or even a
+ * literal.
+ */
+#define ofnode_for_each_subnode(node, parent) \
+	for (node = ofnode_first_subnode(parent); \
+	     ofnode_valid(node); \
+	     node = ofnode_next_subnode(node))
+
 #endif
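For readers who want to try the iteration pattern outside a U-Boot tree, here is a minimal host-side sketch. Only the macro body is copied from the patch above. The ofnode type, the demo_tree table and the ofnode_valid() / ofnode_first_subnode() / ofnode_next_subnode() bodies are simplified stand-ins written for this illustration, not U-Boot's implementations, and the demo_*() helpers are hypothetical names. demo_nth_subnode() follows the same "break when the index matches, then check validity" shape as the ofnode_decode_display_timing() hunk.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for U-Boot's ofnode: an index into demo_tree, < 0 means invalid */
typedef struct {
	int idx;
} ofnode;

/* Hypothetical tree storage: each entry links to its first child and sibling */
struct demo_node {
	const char *name;
	int first_child;
	int next_sibling;
};

static const struct demo_node demo_tree[] = {
	{ "pmic",  1, -1 },	/* 0: parent with three subnodes */
	{ "ldo1", -1,  2 },	/* 1 */
	{ "ldo2", -1,  3 },	/* 2 */
	{ "buck1", -1, -1 },	/* 3 */
};

/* Simplified stand-ins for the real ofnode helpers */
bool ofnode_valid(ofnode node)
{
	return node.idx >= 0;
}

ofnode ofnode_first_subnode(ofnode parent)
{
	return (ofnode){ demo_tree[parent.idx].first_child };
}

ofnode ofnode_next_subnode(ofnode node)
{
	return (ofnode){ demo_tree[node.idx].next_sibling };
}

/* Macro body as added by the patch */
#define ofnode_for_each_subnode(node, parent) \
	for (node = ofnode_first_subnode(parent); \
	     ofnode_valid(node); \
	     node = ofnode_next_subnode(node))

/* Count the direct subnodes of parent */
int demo_count_subnodes(ofnode parent)
{
	ofnode node;
	int count = 0;

	ofnode_for_each_subnode(node, parent)
		count++;

	return count;
}

/* Return the index-th subnode, or an invalid node if there are too few */
ofnode demo_nth_subnode(ofnode parent, int index)
{
	ofnode node;
	int i = 0;

	ofnode_for_each_subnode(node, parent) {
		if (i++ == index)
			break;
	}

	return node;
}
```

Because the macro expands to a plain for statement, break and continue behave as usual, and the iterator is left invalid when the loop runs to completion. That is what makes the ofnode_valid() check after the loop meaningful.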

Add some documentation for the live device tree support in U-Boot. This was missing from the initial series.
Signed-off-by: Simon Glass sjg@chromium.org Suggested-by: Lukasz Majewski lukma@denx.de ---
 doc/driver-model/livetree.txt | 272 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 272 insertions(+)
 create mode 100644 doc/driver-model/livetree.txt

diff --git a/doc/driver-model/livetree.txt b/doc/driver-model/livetree.txt
new file mode 100644
index 0000000000..630f70bb85
--- /dev/null
+++ b/doc/driver-model/livetree.txt
@@ -0,0 +1,272 @@
+Driver Model with Live Device Tree
+==================================
+
+
+Introduction
+------------
+
+Traditionally U-Boot has used a 'flat' device tree. This means that it
+reads directly from the device tree binary structure. It is called a flat
+device tree because nodes are listed one after the other, with the
+hierarchy detected by tags in the format.
+
+This document describes U-Boot's support for a 'live' device tree, meaning
+that the tree is loaded into a hierarchical data structure within U-Boot.
+
+
+Motivation
+----------
+
+The flat device tree has several advantages:
+
+- it is the format produced by the device tree compiler, so no translation
+is needed
+
+- it is fairly compact (e.g. there is no need for pointers)
+
+- it is accessed by the libfdt library, which is well tested and stable
+
+
+However the flat device tree does have some limitations. Adding new
+properties can involve copying large amounts of data around to make room.
+The overall tree has a fixed maximum size so sometimes the tree must be
+rebuilt in a new location to create more space. Even if not adding new
+properties or nodes, scanning the tree can be slow. For example, finding
+the parent of a node is a slow process. Reading from nodes involves a
+small amount of parsing which takes a little time.
+
+Driver model scans the entire device tree sequentially on start-up which
+avoids the worst of the flat tree's limitations. But if the tree is to be
+modified at run-time, a live tree is much faster. Even if no modification
+is necessary, parsing the tree once and using a live tree from then on
+seems to save a little time.
+
+
+Implementation
+--------------
+
+In U-Boot a live device tree ('livetree') is currently supported only
+after relocation. Therefore we need a mechanism to specify a device
+tree node regardless of whether it is in the flat tree or livetree.
+
+The 'ofnode' type provides this. An ofnode can point to either a flat tree
+node (when the live tree node is not yet set up) or a livetree node. The
+caller of an ofnode function does not need to worry about these details.
+
+The main users of the information in a device tree are drivers. These have
+a 'struct udevice *' which is attached to a device tree node. Therefore it
+makes sense to be able to read device tree properties using the
+'struct udevice *', rather than having to obtain the ofnode first.
+
+The 'dev_read_...()' interface provides this. It allows properties to be
+easily read from the device tree using only a device pointer. Under the
+hood it uses ofnode so it works with both flat and live device trees.
+
+
+Enabling livetree
+-----------------
+
+CONFIG_OF_LIVE enables livetree. When this option is enabled, the flat
+tree will be used in SPL and before relocation in U-Boot proper. Just
+before relocation a livetree is built, and this is used for U-Boot proper
+after relocation.
+
+Most checks for livetree use CONFIG_IS_ENABLED(OF_LIVE). This means that
+for SPL, the CONFIG_SPL_OF_LIVE option is checked. At present this does
+not exist, since SPL does not support livetree.
+
+
+Porting drivers
+---------------
+
+Many existing drivers use the fdtdec interface to read device tree
+properties. This only works with a flat device tree. The drivers should be
+converted to use the dev_read_...() interface.
+
+For example, the old code may be like this:
+
+	struct udevice *bus;
+	const void *blob = gd->fdt_blob;
+	int node = dev_of_offset(bus);
+
+	i2c_bus->regs = (struct i2c_ctlr *)devfdt_get_addr(bus);
+	plat->frequency = fdtdec_get_int(blob, node, "spi-max-frequency", 500000);
+
+The new code is:
+
+	struct udevice *bus;
+
+	i2c_bus->regs = (struct i2c_ctlr *)dev_read_addr(bus);
+	plat->frequency = dev_read_u32_default(bus, "spi-max-frequency", 500000);
+
+The dev_read_...() interface is more convenient and works with both the
+flat and live device trees. See include/dm/read.h for a list of functions.
+
+Where properties must be read from sub-nodes or other nodes, you must fall
+back to using ofnode. For example, for old code like this:
+
+	const void *blob = gd->fdt_blob;
+	int subnode;
+
+	fdt_for_each_subnode(subnode, blob, dev_of_offset(dev)) {
+		freq = fdtdec_get_int(blob, subnode, "spi-max-frequency", 500000);
+		...
+	}
+
+you should use:
+
+	ofnode subnode;
+
+	ofnode_for_each_subnode(subnode, dev_ofnode(dev)) {
+		freq = ofnode_read_u32_default(subnode, "spi-max-frequency", 500000);
+		...
+	}
+
+
+Useful ofnode functions
+-----------------------
+
+The internal data structures of the livetree are defined in include/dm/of.h :
+
+   struct device_node - holds information about a device tree node
+   struct property - holds information about a property within a node
+
+Nodes have pointers to their first property, their parent, their first child
+and their sibling. This allows nodes to be linked together in a hierarchical
+tree.
+
+Properties have pointers to the next property. This allows all properties of
+a node to be linked together in a chain.
+
+It should not be necessary to use these data structures in normal code. In
+particular, you should refrain from using functions which access the livetree
+directly, such as of_read_u32(). Use ofnode functions instead, to allow your
+code to work with a flat tree also.
+
+Some conversion functions are used internally. Generally these are not needed
+for driver code. Note that they will not work if called in the wrong context.
+For example it is invalid to call ofnode_to_np() when a flat tree is being
+used. Similarly it is not possible to call ofnode_to_offset() on a livetree
+node.
+
+   ofnode_to_np() - converts ofnode to struct device_node *
+   ofnode_to_offset() - converts ofnode to offset
+
+   np_to_ofnode() - converts node pointer to ofnode
+   offset_to_ofnode() - converts offset to ofnode
+
+
+Other useful functions:
+
+   of_live_active() returns true if livetree is in use, false if flat tree
+   ofnode_valid() returns true if a given node is valid
+   ofnode_is_np() returns true if a given node is a livetree node
+   ofnode_equal() compares two ofnodes
+   ofnode_null() returns a null ofnode (for which ofnode_valid() returns false)
+
+
+Phandles
+--------
+
+There is full phandle support for live tree. All functions make use of
+struct ofnode_phandle_args, which has an ofnode within it. This supports both
+livetree and flat tree transparently. See for example
+ofnode_parse_phandle_with_args().
+
+
+Reading addresses
+-----------------
+
+You should use dev_read_addr() and friends to read addresses from device-tree
+nodes.
+
+
+fdtdec
+------
+
+The existing fdtdec interface will eventually be retired. Please try to avoid
+using it in new code.
+
+
+Modifying the livetree
+----------------------
+
+This is not currently supported. Once implemented it should provide a much
+more efficient implementation for modification of the device tree than using
+the flat tree.
+
+
+Internal implementation
+-----------------------
+
+The dev_read_...() functions have two implementations. When
+CONFIG_DM_DEV_READ_INLINE is enabled, these functions simply call the ofnode
+functions directly. This is useful when livetree is not enabled. The ofnode
+functions call ofnode_is_np(node) which will always return false if livetree
+is disabled, just falling back to flat tree code.
+
+This optimisation means that without livetree enabled, the dev_read_...() and
+ofnode interfaces do not noticeably add to code size.
+
+The CONFIG_DM_DEV_READ_INLINE option defaults to enabled when livetree is
+disabled.
+
+Most livetree code comes directly from Linux and is modified as little as
+possible. This is deliberate since this code is fairly stable and does what
+we want. Some features (such as get/put) are not supported. Internal macros
+take care of removing these features silently.
+
+Within the of_access.c file there are pointers to the alias node, the chosen
+node and the stdout-path alias.
+
+
+Errors
+------
+
+With a flat device tree, libfdt errors are returned (e.g. -FDT_ERR_NOTFOUND).
+For livetree normal 'errno' errors are returned (e.g. -ENOTFOUND). At present
+the ofnode and dev_read_...() functions return either one or other type of
+error. This is clearly not desirable. Once tests are added for all the
+functions this can be tidied up.
+
+
+Adding new access functions
+---------------------------
+
+Adding a new function for device-tree access involves the following steps:
+
+   - Add two dev_read() functions:
+      - inline version in the read.h header file, which calls an ofnode
+        function
+      - standard version in the read.c file (or perhaps another file), which
+        also calls an ofnode function
+
+     The implementations of these functions can be the same. The purpose
+     of the inline version is purely to reduce code size impact.
+
+   - Add an ofnode function. This should call ofnode_is_np() to work out
+     whether a livetree or flat tree is used. For the livetree it should
+     call an of_...() function. For the flat tree it should call an
+     fdt_...() function. The livetree version will be optimised out at
+     compile time if livetree is not enabled.
+
+   - Add an of_...() function for the livetree implementation. If a similar
+     function is available in Linux, the implementation should be taken
+     from there and modified as little as possible (generally not at all).
+
+
+Future work
+-----------
+
+Live tree support was introduced in U-Boot 2017.07. There is still quite a bit
+of work to do to flesh this out:
+
+- tests for all access functions
+- support for livetree modification
+- addition of more access functions as needed
+- support for livetree in SPL and before relocation (if desired)
+
+
+--
+Simon Glass sjg@chromium.org
+5-Aug-17
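The layering described under "Adding new access functions" can be sketched on a host machine roughly as follows. This is an illustration under stated assumptions, not U-Boot code: every demo_* name is hypothetical, the "flat tree" is just a small table indexed by offset, and each node carries a single u32 property. It only shows an ofnode-style wrapper choosing between a livetree-style accessor and a flat-tree-style accessor. Mapping the libfdt-style error to an errno value is one possible way to address the mixed error types mentioned under "Errors"; the document above does not say this is how it will be tidied up.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FDT_ERR_NOTFOUND	1	/* same value as libfdt's FDT_ERR_NOTFOUND */

/* Hypothetical livetree node holding exactly one u32 property */
struct demo_device_node {
	const char *prop_name;
	uint32_t value;
};

/* Hypothetical flat "blob": the node at offset N is entry N of this table */
static const struct demo_device_node demo_blob[] = {
	{ "spi-max-frequency", 1000000 },
	{ "reg", 0x1000 },
};

/* A node reference is either a livetree pointer or a flat-tree offset */
typedef union {
	const struct demo_device_node *np;	/* livetree */
	long of_offset;				/* flat tree */
} demo_ofnode;

bool demo_live_active;	/* stand-in for of_live_active() */

bool demo_ofnode_is_np(demo_ofnode node)
{
	(void)node;
	return demo_live_active;
}

/* Livetree accessor (the "of_...()" layer): errno-style errors */
int demo_of_read_u32(const struct demo_device_node *np, const char *name,
		     uint32_t *outp)
{
	if (!np || strcmp(np->prop_name, name))
		return -EINVAL;
	*outp = np->value;

	return 0;
}

/* Flat-tree accessor (the "fdt_...()" layer): libfdt-style errors */
int demo_fdt_read_u32(long offset, const char *name, uint32_t *outp)
{
	long count = sizeof(demo_blob) / sizeof(demo_blob[0]);

	if (offset < 0 || offset >= count ||
	    strcmp(demo_blob[offset].prop_name, name))
		return -DEMO_FDT_ERR_NOTFOUND;
	*outp = demo_blob[offset].value;

	return 0;
}

/* The "ofnode" layer: pick the implementation, return one error type */
int demo_ofnode_read_u32(demo_ofnode node, const char *name, uint32_t *outp)
{
	if (demo_ofnode_is_np(node))
		return demo_of_read_u32(node.np, name, outp);

	/* Assumption: fold libfdt errors into errno for the caller */
	return demo_fdt_read_u32(node.of_offset, name, outp) ? -EINVAL : 0;
}

/* Convenience wrapper in the style of the ..._default() helpers */
uint32_t demo_ofnode_read_u32_default(demo_ofnode node, const char *name,
				      uint32_t def)
{
	uint32_t val;

	return demo_ofnode_read_u32(node, name, &val) ? def : val;
}
```

In this sketch demo_live_active is a run-time flag. The text above describes the real check as something that always returns false when livetree is disabled, which is what lets the livetree branch be optimised out at compile time.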

On 08/05/2017 11:45 PM, Simon Glass wrote:
+- it ia accessed by the libfdt library, which is well tested and stable
      ^^^ is
+Most checks for livetree use CONFIG_IS_ENABLED(OF_LIVE). This means that
+for SPL, the CONFIG_SPL_OF_LIVE option is checked. At present this does
+not exist, since SPL coes not support livetree.
                      ^^^^ does?
+The dev_read_...() fucctions have two implementations. When
                    ^^^^^^^^^ functions
+CONFIG_DM_DEV_READ_INLINE is enabled, these functions simply call the ofnode
+functions directly. This is useful when livetre is not enabled. The ofnode
                                         ^^^^^^^ livetree
+functions call ofnode_is_np(node) which will always return false if livetree
+is disabled, just falling back to flat tree code.
+
+This optimisation means that without livetree enabled, the dev_read_...() and
+ofnode interfaces do not noticeably add to code size.
+
+The CONFIG_DM_DEV_READ_INLINE optoin defaults to enabled when livetree is
                               ^^^^^^^ option
+disabled.
Thanks Simon for providing such great documentation.
Please find a few spelling corrections inline above.
Reviewed-by: Łukasz Majewski lukma@denx.de

On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
Set a deadline of 9 months for this work, rounded up to the next release.
Signed-off-by: Simon Glass sjg@chromium.org
Reviewed-by: Tom Rini trini@konsulko.com
And I've gone and made a calendar reminder to make v2018.01 have a scary build warning about this conversion too.

Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
Reviewed-by: Tom Rini trini@konsulko.com
And I've gone and made a calendar reminder to make v2018.01 have a scary build warning about this conversion too.
OK sounds good. I see that I didn't cc many maintainers on this one. Might be worth having a (less scary) message with the next release :-)
Regards, Simon

Applied to u-boot-dm, thanks!

On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will be applicable to SPL too?
If so, we are having SPL size issues with a few Allwinner families if we enable SPL_DM. Any suggestions?

Hi Jagan,
On 28 March 2018 at 02:04, Jagan Teki jagannadh.teki@gmail.com wrote:
I hope this will be applicable to SPL too?
If so, we are having SPL size issues with a few Allwinner families if we enable SPL_DM. Any suggestions?
I don't think we can require BLK for SPL / TPL, or even DM for that matter. We should use it when size permits.
Regards, Simon

On Fri, Mar 30, 2018 at 4:13 AM, Simon Glass sjg@google.com wrote:
I don't think we can require BLK for SPL / TPL, or even DM for that matter. We should use it when size permits.
But we can still maintain non-DM code in the driver with #ifdef, right? Indeed, the migration plan is to remove that.
Jagan.

Hi Jagan,
On 2 April 2018 at 12:57, Jagan Teki jagannadh.teki@gmail.com wrote:
But we can still maintain non-DM code in the driver with #ifdef, right? Indeed, the migration plan is to remove that.
I think for drivers that have to work with SPL and cannot use DM due to code size, we have to provide a way. But I think over time, this will fade out, as SoCs have more SRAM. Also with CONFIG_OF_PLATDATA the penalty is very small.
Regards, Simon

On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
I hope this will be applicable to SPL too?
If so, we are having SPL size issues with a few Allwinner families if we enable SPL_DM. Any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?

On 04/01/2018 03:19 PM, Tom Rini wrote:
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
To the point where CI20 with its SPL limit might actually make it into mainline, even?

Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most low hanging fruits already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size growth over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. That would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is using a second cross-compiler or even an additional cross-compiler for native builds, complicating the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
3) Try to use ILP32 for the AArch64 SPL build. This reduces the pointer size and sizeof(long) to be 32-bit and should help, though I haven't been able to successfully compile it yet (relocation types problems). Despite lacking mainline support for AArch64 ILP32 in Linux and glibc(?), GCC has supported it for quite a while already. Unknown saving effect.
4) Use runtime decompression. Most SoCs have larger or more SRAM than the 32KB, so we could leverage this. Siarhei knows more about this.
5) Use a TPL. Haven't looked at this in detail yet.
So 1) would be the easiest to pursue, but 2.5KB are not enough to offset the >10 KB toll the DM_SPL support actually takes.
Cheers, Andre.
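
The arithmetic behind this message can be sketched as follows. This is only an illustration using the approximate figures quoted above (32KB limit, ~31KB current SPL, ~2.5KB vector saving, and the ">10 KB" DM_SPL cost taken at its lower bound of exactly 10KB). The variable and function names are made up for the example, and the 448-byte offset in the padding example is a hypothetical value chosen to reproduce roughly 1.6KB of padding.

```python
# Rough SPL size budget for the Allwinner A64, using the approximate
# figures from the message above. All values are in bytes.
KB = 1024

BOOT_ROM_LIMIT = 32 * KB        # Boot ROM load limit
CURRENT_SPL = 31 * KB           # "~31KB at the moment"
VECTOR_SAVING = int(2.5 * KB)   # option 1: drop the vectors and their padding
DM_SPL_COST = 10 * KB           # lower bound only: the message says ">10 KB"


def headroom(spl_size, limit=BOOT_ROM_LIMIT):
    """Bytes left before the SPL exceeds the limit (negative means too big)."""
    return limit - spl_size


def padding_for_alignment(offset, alignment):
    """Bytes of padding needed to round offset up to a multiple of alignment."""
    return (-offset) % alignment


today = headroom(CURRENT_SPL)
with_dm = headroom(CURRENT_SPL - VECTOR_SAVING + DM_SPL_COST)

print(f"headroom today: {today} bytes")
print(f"headroom with option 1 and DM_SPL: {with_dm} bytes")
# Hypothetical: code ending 448 bytes past a 2KB boundary needs this much padding.
print(f"padding to next 2KB boundary: {padding_for_alignment(448, 2 * KB)} bytes")
```

Under these assumptions the result is negative by several KB even after applying option 1, which is the point the message makes.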

Hi Andre,
On 2 April 2018 at 09:43, André Przywara andre.przywara@arm.com wrote:
Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most low hanging fruits already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size grow over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. That would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is using a second cross-compiler or even an additional cross-compiler for native builds, complicating the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
FYI 32-bit tegra compiles SPL with ARMv4T and U-Boot proper with ARMv7. It should be fairly easy to do,
3) Try to use ILP32 for the AArch64 SPL build. This reduces the pointer size and sizeof(long) to be 32-bit and should help, though I haven't been able to successfully compile it yet (relocation types problems). Despite lacking mainline support for AArch64 ILP32 in Linux and glibc(?), GCC has supported it for quite a while already. Unknown saving effect.
4) Use runtime decompression. Most SoCs have larger or more SRAM than the 32KB, so we could leverage this. Siarhei knows more about this.
5) Use a TPL. Haven't looked at this in detail yet.
So 1) would be the easiest to pursue, but 2.5KB are not enough to offset the >10 KB toll the DM_SPL support actually takes.
Is this the cost on 64-bit?
I wonder if CONFIG_OF_PLATDATA might be an option?
Regards, Simon

On Mon, Apr 2, 2018 at 3:28 AM, Simon Glass sjg@google.com wrote:
Hi Andre,
On 2 April 2018 at 09:43, André Przywara andre.przywara@arm.com wrote:
Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most low hanging fruits already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size grow over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. That would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is using a second cross-compiler or even an additional cross-compiler for native builds, complicating the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
FYI 32-bit tegra compiles SPL with ARMv4T and U-Boot proper with ARMv7. It should be fairly easy to do,
ARMv4 and ARMv7 are both 32 bit though, as opposed to 32 and 64 bit in the case of Allwinner A64

Hi Peter,
On 2 April 2018 at 10:45, Peter Robinson pbrobinson@gmail.com wrote:
On Mon, Apr 2, 2018 at 3:28 AM, Simon Glass sjg@google.com wrote:
Hi Andre,
On 2 April 2018 at 09:43, André Przywara andre.przywara@arm.com wrote:
Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most low hanging fruits already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size grow over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. That would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is using a second cross-compiler or even an additional cross-compiler for native builds, complicating the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
FYI 32-bit tegra compiles SPL with ARMv4T and U-Boot proper with ARMv7. It should be fairly easy to do,
ARMv4 and ARMv7 are both 32 bit though, as opposed to 32 and 64 bit in the case of Allwinner A64
Yes, but that is just a matter of compiler or compiler flags. My point was we should be able to use different build for each without too much work.
Regards, Simon

On Mon, Apr 2, 2018 at 3:56 AM, Simon Glass sjg@chromium.org wrote:
Hi Peter,
On 2 April 2018 at 10:45, Peter Robinson pbrobinson@gmail.com wrote:
On Mon, Apr 2, 2018 at 3:28 AM, Simon Glass sjg@google.com wrote:
Hi Andre,
On 2 April 2018 at 09:43, André Przywara andre.przywara@arm.com wrote:
Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most low hanging fruits already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size grow over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. That would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is using a second cross-compiler or even an additional cross-compiler for native builds, complicating the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
FYI 32-bit tegra compiles SPL with ARMv4T and U-Boot proper with ARMv7. It should be fairly easy to do,
ARMv4 and ARMv7 are both 32 bit though, as opposed to 32 and 64 bit in the case of Allwinner A64
Yes, but that is just a matter of compiler or compiler flags. My point was we should be able to use different build for each without too much work.
It's a lot more work for the way most distros build u-boot, but TBH the sooner I don't need to the better ;-)

Hi,
On 2 April 2018 at 11:07, Peter Robinson pbrobinson@gmail.com wrote:
On Mon, Apr 2, 2018 at 3:56 AM, Simon Glass sjg@chromium.org wrote:
Hi Peter,
On 2 April 2018 at 10:45, Peter Robinson pbrobinson@gmail.com wrote:
On Mon, Apr 2, 2018 at 3:28 AM, Simon Glass sjg@google.com wrote:
Hi Andre,
On 2 April 2018 at 09:43, André Przywara andre.przywara@arm.com wrote:
Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most low hanging fruits already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size grow over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. That would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is using a second cross-compiler or even an additional cross-compiler for native builds, complicating the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
FYI 32-bit tegra compiles SPL with ARMv4T and U-Boot proper with ARMv7. It should be fairly easy to do,
ARMv4 and ARMv7 are both 32 bit though, as opposed to 32 and 64 bit in the case of Allwinner A64
Yes, but that is just a matter of compiler or compiler flags. My point was we should be able to use different build for each without too much work.
It's a lot more work for the way most distros build u-boot, but TBH the sooner I don't need to the better ;-)
I don't understand the last part of that sentence. But getting back to the original question, DM does add size, DT adds more. There is CONFIG_OF_PLATDATA which essentially removes the DT cost, but DM remains (perhaps 5KB at a guess on 64-bit). So we will have pressure to avoid using DM in SPL for some time to come, I think.
Regards, Simon

Hi,
On 02/04/18 03:30, Simon Glass wrote:
Hi Andre,
On 2 April 2018 at 09:43, André Przywara andre.przywara@arm.com wrote:
Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most low hanging fruits already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size grow over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. That would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is using a second cross-compiler or even an additional cross-compiler for native builds, complicating the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
FYI 32-bit tegra compiles SPL with ARMv4T and U-Boot proper with ARMv7. It should be fairly easy to do,
Yes, but this is merely different compiler *flags* to the same (cross) compiler binary. ARM32 and ARM64 are different architectures to GCC, so they require different compiler binaries with different prefixes. Last time I checked, this wasn't easy to integrate into the U-Boot build system. One hack could be a "switching script", which filters for, say, "-m32", and calls the respective binary. But we would still need to somehow set *two* CROSS_COMPILE prefixes. CROSS_COMPILE_SPL, maybe? And it would still require installing *two* cross compilers, and would spoil a completely native build by still requiring a cross compiler.
3) Try to use ILP32 for the AArch64 SPL build. This reduces the pointer size and sizeof(long) to be 32-bit and should help, though I haven't been able to successfully compile it yet (relocation types problems). Despite lacking mainline support for AArch64 ILP32 in Linux and glibc(?), GCC has supported it for quite a while already. Unknown saving effect.
4) Use runtime decompression. Most SoCs have larger or more SRAM than the 32KB, so we could leverage this. Siarhei knows more about this.
5) Use a TPL. Haven't looked at this in detail yet.
So 1) would be the easiest to pursue, but 2.5KB are not enough to offset the >10 KB toll the DM_SPL support actually takes.
Is this the cost on 64-bit?
Yes, this is AArch64, just enabling DM_SPL_MMC and DM_SPL.
I wonder if CONFIG_OF_PLATDATA might be an option?
Well, this would be a requirement, I guess, since adding any kind of DT to the mix makes it even worse.
Cheers, Andre
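
A switching script of the kind described above could look roughly like this. It is a sketch only: the `-m32` marker is the placeholder from the message (used here purely as a tag to filter on, not as an option passed to the compiler), and the two toolchain prefixes and the `pick_compiler` name are assumptions for illustration, not anything defined by the U-Boot build system.

```python
# Sketch of a compiler "switching script": look for a marker flag,
# strip it, and dispatch to the matching cross compiler.
# The marker and both prefixes are assumptions, not U-Boot conventions.
MARKER = "-m32"
PREFIX_32 = "arm-linux-gnueabihf-"
PREFIX_64 = "aarch64-linux-gnu-"


def pick_compiler(args):
    """Return the full command line to run for the given gcc arguments."""
    if MARKER in args:
        # 32-bit build requested: drop the marker and use the 32-bit toolchain.
        rest = [arg for arg in args if arg != MARKER]
        return [PREFIX_32 + "gcc"] + rest
    # Default: pass everything through to the 64-bit toolchain.
    return [PREFIX_64 + "gcc"] + list(args)


print(pick_compiler(["-m32", "-Os", "-c", "spl.c"]))
print(pick_compiler(["-Os", "-c", "board.c"]))
```

A real wrapper would go on to execute the returned command. It would not remove the need to have both toolchains installed, which is the drawback raised in the message.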

Hi Andre,
On 2 April 2018 at 19:00, André Przywara andre.przywara@arm.com wrote:
Hi,
On 02/04/18 03:30, Simon Glass wrote:
Hi Andre,
On 2 April 2018 at 09:43, André Przywara andre.przywara@arm.com wrote:
Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most low hanging fruits already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size grow over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. That would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is using a second cross-compiler or even an additional cross-compiler for native builds, complicating the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
FYI 32-bit tegra compiles SPL with ARMv4T and U-Boot proper with ARMv7. It should be fairly easy to do,
Yes, but this is merely different compiler *flags* to the same (cross) compiler binary. ARM32 and ARM64 are different architectures to GCC, so they require different compiler binaries with different prefixes. Last time I checked, this wasn't easy to integrate into the U-Boot build system. One hack could be a "switching script", which filters for, say, "-m32", and calls the respective binary. But we would still need to somehow set *two* CROSS_COMPILE prefixes. CROSS_COMPILE_SPL, maybe? And it would still require installing *two* cross compilers, and would spoil a completely native build by still requiring a cross compiler.
That seems like a good idea to me.
3) Try to use ILP32 for the AArch64 SPL build. This reduces the pointer size and sizeof(long) to be 32-bit and should help, though I haven't been able to successfully compile it yet (relocation types problems). Despite lacking mainline support for AArch64 ILP32 in Linux and glibc(?), GCC has supported it for quite a while already. Unknown saving effect.
4) Use runtime decompression. Most SoCs have larger or more SRAM than the 32KB, so we could leverage this. Siarhei knows more about this.
5) Use a TPL. Haven't looked at this in detail yet.
So 1) would be the easiest to pursue, but 2.5KB are not enough to offset the >10 KB toll the DM_SPL support actually takes.
Is this the cost on 64-bit?
Yes, this is AArch64, just enabling DM_SPL_MMC and DM_SPL.
OK I see, and presumably OF_CONTROL as well?
I wonder if CONFIG_OF_PLATDATA might be an option?
Well, this would be a requirement, I guess, since adding any kind of DT to the mix makes it even worse.
Well it still uses DT as the source for the config. It's just that it compiles it to C so we don't have to build in libfdt. It does have some painful side effects though - e.g. you need to adjust drivers to read the new C structure.
Cheers, Andre
Regards, Simon
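
The idea of compiling the DT to C, as described above, can be illustrated with a toy generator. This is not how U-Boot's tooling is implemented. The node contents, the `mmc_plat` struct name, and the `to_c_struct` helper are all hypothetical, and only integer properties are handled.

```python
# Toy illustration of turning device-tree-like data into a C struct at
# build time, so that no DT parser is needed at run time.
# The node and every name below are hypothetical.
node = {"reg_base": 0x01C0F000, "bus_width": 4, "max_frequency": 50000000}


def to_c_struct(name, props):
    """Render integer properties as a C struct definition plus one instance."""
    fields = "\n".join(f"\tunsigned int {key};" for key in props)
    values = "\n".join(f"\t.{key} = {value:#x}," for key, value in props.items())
    return (
        f"struct {name} {{\n{fields}\n}};\n\n"
        f"static const struct {name} {name}_data = {{\n{values}\n}};\n"
    )


generated = to_c_struct("mmc_plat", node)
print(generated)
```

A driver would then read its settings from the generated struct fields instead of querying the DT, which corresponds to the side effect mentioned above: drivers have to be adjusted to read the new C structure.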

On Wed, Apr 04, 2018 at 01:53:17AM +0800, Simon Glass wrote:
Hi Andre,
On 2 April 2018 at 19:00, André Przywara andre.przywara@arm.com wrote:
Hi,
On 02/04/18 03:30, Simon Glass wrote:
Hi Andre,
On 2 April 2018 at 09:43, André Przywara andre.przywara@arm.com wrote:
Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most low hanging fruits already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size grow over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. That would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is using a second cross-compiler or even an additional cross-compiler for native builds, complicating the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
FYI 32-bit tegra compiles SPL with ARMv4T and U-Boot proper with ARMv7. It should be fairly easy to do,
Yes, but this is merely different compiler *flags* to the same (cross) compiler binary. ARM32 and ARM64 are different architectures to GCC, so they require different compiler binaries with different prefixes. Last time I checked, this wasn't easy to integrate into the U-Boot build system. One hack could be a "switching script", which filters for, say, "-m32", and calls the respective binary. But we would still need to somehow set *two* CROSS_COMPILE prefixes. CROSS_COMPILE_SPL, maybe? And it would still require installing *two* cross compilers, and would spoil a completely native build by still requiring a cross compiler.
That seems like a good idea to me.
I've lamented before (and I think others have too) that it's really a shame that gcc treats arm32 and arm64 as totally distinct builds (and where clang is a win). But I don't think we can require people to have both an arm and an aarch64 compiler available in order to build U-Boot for some aarch64.
3) Try to use ILP32 for the AArch64 SPL build. This reduces the pointer size and sizeof(long) to be 32-bit and should help, though I haven't been able to successfully compile it yet (relocation types problems). Despite lacking mainline support for AArch64 ILP32 in Linux and glibc(?), GCC has supported it for quite a while already. Unknown saving effect.
4) Use runtime decompression. Most SoCs have larger or more SRAM than the 32KB, so we could leverage this. Siarhei knows more about this.
5) Use a TPL. Haven't looked at this in detail yet.
Here, my preference would be to look again at (4), then (3). I think for (5), a TPL would only need to do enough to get DDR available, so that SPL can run from there and not be subject to the tiny limits. But I have no idea how feasible that is here.

"Tom" == Tom Rini trini@konsulko.com writes:
Hi,
That seems like a good idea to me.
I've lamented before (and I think others have too) that it's really a shame that gcc treats arm32 and arm64 as totally distinct builds (and where clang is a win). But I don't think we can require people to have both an arm and an aarch64 compiler available in order to build U-Boot for some aarch64.
No, please not. It would make it very hard to handle U-Boot builds in Buildroot for these boards.

On Mon, Apr 2, 2018 at 7:13 AM, André Przywara andre.przywara@arm.com wrote:
Hi,
On 01/04/18 14:19, Tom Rini wrote:
On Tue, Mar 27, 2018 at 11:34:19PM +0530, Jagan Teki wrote:
On Mon, Sep 4, 2017 at 9:57 PM, sjg@google.com wrote:
Hi Tom,
On 7 August 2017 at 09:39, Tom Rini trini@konsulko.com wrote:
On Sat, Aug 05, 2017 at 03:45:53PM -0600, Simon Glass wrote:
The CONFIG_BLK conversion involves quite invasive changes in the U-Boot code, with #ifdefs and different code paths. We should try to move over to this soon so we can drop the old code.
I hope this will applicable to SPL too?
If so, we are having SPL size issues with few Allwinner families, if enable SPL_DM any suggestions?
How close, and have you looked at the u-boot-spl.map to see what you can maybe trim? Or areas to look at reducing in code complexity?
The Boot ROM limit for all Allwinner SoCs known so far is 32KB. The A64 SPL (AArch64) stands at ~31KB at the moment. Yes, we went over the map and picked most of the low-hanging fruit already. So far we discussed several mitigations, but mostly to cover the "natural" SPL code size growth over time:
1) The AArch64 exception vectors take 1KB, plus an unnecessary ~1.6KB of padding (for a 2KB architectural alignment). Given that the vectors are used only for debugging purposes, we could scrap them entirely or construct them on the fly in some other SRAM. This would free about 2.5KB, ideally. Lowest hanging fruit so far.
2) We can compile the SPL in AArch32 mode, which can use the Thumb2 encoding. This reduces the size significantly, to about 20KB. The disadvantage is needing a second cross-compiler, or even an additional cross-compiler for native builds, which complicates the build process. I maintain a branch for enabling FEL booting here [1], which provides two _defconfigs (one 32-bit for SPL, one 64-bit for U-Boot proper). There are no technical disadvantages in running the SPL in 32-bit, so this is mostly a build issue.
Maybe this can be a good option, and it has been verified on a board. As Simon pointed to Tegra as an example of building for two architectures, I think we can try this out. I made the change below in arch/arm/Makefile, but I am unable to export both the armv7 and armv8 compilers so that the build can pick one depending on whether it is building SPL or U-Boot proper. Any suggestions?
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -24,6 +24,8 @@ arch-$(CONFIG_ARM64) =-march=armv8-a
 # but otherwise we can use the value in CONFIG_SYS_ARM_ARCH
 ifeq ($(CONFIG_SPL_BUILD)$(CONFIG_TEGRA),yy)
 arch-y += -D__LINUX_ARM_ARCH__=4
+else ifeq ($(CONFIG_SPL_BUILD)$(CONFIG_MACH_SUN50I),yy)
+arch-y += -D__LINUX_ARM_ARCH__=7
 else
 arch-y += -D__LINUX_ARM_ARCH__=$(CONFIG_SYS_ARM_ARCH)
 endif
3) Try to use ILP32 for the AArch64 SPL build. This reduces the pointer size and sizeof(long) to 32 bits and should help, though I haven't been able to compile it successfully yet (relocation type problems). Despite the lack of mainline support for AArch64 ILP32 in Linux and glibc(?), GCC has supported it for quite a while already. Unknown saving effect.
4) Use runtime decompression. Most SoCs have larger or more SRAM than the 32KB, so we could leverage this. Siarhei knows more about this.
5) Use a TPL. Haven't looked at this in detail yet.
I think it's difficult to implement TPL here because we would need the same code in TPL as in SPL: CPU, clock, DRAM and MMC (for the boot mode). But if we had a way to return to the BootROM once TPL is loaded (like Rockchip does), we could skip the MMC code in TPL.
Jagan.