[U-Boot] [PATCH 00/24] sunxi: Allwinner A64: SPL support

Hi,
this series introduces SPL support for the Allwinner A64 SoC. In contrast to the previous RFC this one includes support for both AArch64 and AArch32 SPL builds. Still the FIT support is missing, which means the functionality is limited. Due to the missing ARM Trusted Firmware (ATF) in this firmware chain one loses Ethernet and SMP, among other minor things. I will send the FIT support later on top of this.
The first two patches are fixes that I sent out before; I am not sure whether they have landed somewhere already. Patches 3-8 prepare the SPL code to be compiled for 64-bit in general and AArch64 in particular. Patches 9-11 refactor the existing boot0 header functionality to be used by patch 12, which introduces the 64-bit switch in the first SPL instructions. Patches 13-18 then introduce the actual core of the SPL support: the DRAM initialization, courtesy of Jens. This piggybacks on the existing H3 DRAM code, deviating where needed. Patch 19 finally enables the 64-bit SPL support.
So now building the existing pine64_plus_defconfig will generate a sunxi-spl.bin, which can be prepended to the U-Boot proper image (not .bin) to boot from an SD card. Due to the missing ATF support this is of limited usability at the moment, though. Also FEL support requires more love (switching back to AArch32 before returning to FEL, without crashing, that is ;-), so this is disabled. On my setup this results in a 26KB SPL binary, which is close to the 28KB limit mksunxiboot imposes at the moment. Adding anything (like FIT support or DEBUG) will exceed this, and although I have patches to let mksunxiboot get close to 32KB, this is the ultimate frontier.
So patches 20-23 then teach the SPL how to detect a U-Boot image file of a different bitness and do the RMR switch from AArch32 to AArch64, if needed. This is used by the final patch 24, which creates another _defconfig to let the SPL compile for AArch32 using the Thumb2 encoding. This results in a binary of less than 17KB in my case, so it has plenty of room for extensions.
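As a rough illustration of the decision described above, the following sketch picks an entry method from the bitness of the SPL and the arch property of the payload. All names (demo_arch, demo_action, demo_pick_entry) are hypothetical and not taken from the patches. Treating a 64-bit SPL with a 32-bit payload as unsupported is an assumption of this sketch; the series itself only talks about the AArch32 to AArch64 direction.

```c
/* Hypothetical arch tags; the real series reads the arch property
 * from the U-Boot image instead. */
enum demo_arch { DEMO_ARCH_ARM, DEMO_ARCH_ARM64 };

enum demo_action {
	DEMO_JUMP_DIRECT,	/* same bitness: just branch to the payload */
	DEMO_RMR_TO_AARCH64,	/* 32-bit SPL, 64-bit payload: RMR switch */
	DEMO_UNSUPPORTED	/* assumption: not handled in this sketch */
};

/* Decide how the SPL should enter the next stage. */
static enum demo_action demo_pick_entry(enum demo_arch spl,
					enum demo_arch payload)
{
	if (spl == payload)
		return DEMO_JUMP_DIRECT;
	if (spl == DEMO_ARCH_ARM && payload == DEMO_ARCH_ARM64)
		return DEMO_RMR_TO_AARCH64;
	return DEMO_UNSUPPORTED;
}
```

With the AArch32 defconfig from patch 24, the interesting case is the middle one: the Thumb2 SPL finds an arm64 payload and has to switch execution state before entering it.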
I know this is nasty stuff, so I appreciate any comments.
Cheers, Andre.
Andre Przywara (21):
  drivers: SPI: sunxi SPL: fix warning
  sun6i: Restrict some register initialization to Allwinner A31 SoC
  armv8: prevent using THUMB
  armv8: add lowlevel_init.S
  SPL: tiny-printf: add "l" modifier
  move UL() macro from armv8/mmu.h into common.h
  SPL: make struct spl_image 64-bit safe
  armv8: add simple sdelay implementation
  armv8: move reset branch into boot hook
  ARM: boot0 hook: remove macro, include whole header file
  sunxi: introduce extra config option for boot0 header
  sunxi: A64: do an RMR switch if started in AArch32 mode
  sunxi: provide default DRAM config for sun50i in Kconfig
  sunxi: H3/A64: fix non-ODT setting
  sunxi: DRAM: fix H3 DRAM size display on aarch64
  sunxi: A64: enable SPL
  SPL: read and store arch property from U-Boot image
  Makefile: use "arm64" architecture for U-Boot image files
  ARM: SPL/FIT: differentiate between arm and arm64 arch properties
  sunxi: introduce RMR switch to enter payloads in 64-bit mode
  sunxi: A64: add 32-bit SPL support

Jens Kuske (3):
  sunxi: H3: add and rename some DRAM contoller registers
  sunxi: H3: add DRAM controller single bit delay support
  sunxi: A64: use H3 DRAM initialization code for A64

 Makefile                                        |   9 +-
 arch/arm/cpu/armv7/omap-common/boot-common.c    |   2 +-
 arch/arm/cpu/armv8/Makefile                     |   1 +
 arch/arm/cpu/armv8/cpu.c                        |  13 ++
 arch/arm/cpu/armv8/lowlevel_init.S              |  44 +++++
 arch/arm/cpu/armv8/start.S                      |   5 +-
 arch/arm/include/asm/arch-bcm235xx/boot0.h      |   8 +-
 arch/arm/include/asm/arch-bcm281xx/boot0.h      |   8 +-
 arch/arm/include/asm/arch-sunxi/boot0.h         |  34 +++-
 arch/arm/include/asm/arch-sunxi/clock_sun6i.h   |   1 +
 arch/arm/include/asm/arch-sunxi/dram.h          |   2 +-
 arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h |  51 +++---
 arch/arm/include/asm/armv8/mmu.h                |   8 -
 arch/arm/lib/Makefile                           |   2 +
 arch/arm/lib/spl.c                              |  15 ++
 arch/arm/lib/vectors.S                          |   1 -
 arch/arm/mach-sunxi/Makefile                    |   2 +
 arch/arm/mach-sunxi/board.c                     |   2 +-
 arch/arm/mach-sunxi/clock_sun6i.c               |   7 +-
 arch/arm/mach-sunxi/dram_sun8i_h3.c             | 215 +++++++++++++++++-------
 arch/arm/mach-sunxi/spl_switch.c                |  60 +++++++
 arch/arm/mach-tegra/spl.c                       |   2 +-
 board/sunxi/Kconfig                             |  32 +++-
 common/spl/spl.c                                |   9 +-
 common/spl/spl_fit.c                            |   8 +
 common/spl/spl_mmc.c                            |   2 +-
 configs/pine64_plus_defconfig                   |   6 +-
 configs/sun50i_spl32_defconfig                  |  11 ++
 drivers/mtd/spi/sunxi_spi_spl.c                 |   3 +-
 include/common.h                                |  10 +-
 include/configs/sunxi-common.h                  |   4 +-
 include/spl.h                                   |  19 ++-
 lib/tiny-printf.c                               |  43 +++--
 33 files changed, 488 insertions(+), 151 deletions(-)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S
 create mode 100644 arch/arm/mach-sunxi/spl_switch.c
 create mode 100644 configs/sun50i_spl32_defconfig

Somehow an int returning function without a return statement sneaked in, fix it. Also fix some whitespace damage on the way.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 drivers/mtd/spi/sunxi_spi_spl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mtd/spi/sunxi_spi_spl.c b/drivers/mtd/spi/sunxi_spi_spl.c
index 67c7edd..7502314 100644
--- a/drivers/mtd/spi/sunxi_spi_spl.c
+++ b/drivers/mtd/spi/sunxi_spi_spl.c
@@ -158,9 +158,10 @@ static void spi0_disable_clock(void)
 		     (1 << AHB_RESET_SPI0_SHIFT));
 }

-static int spi0_init(void)
+static void spi0_init(void)
 {
 	unsigned int pin_function = SUNXI_GPC_SPI0;
+
 	if (IS_ENABLED(CONFIG_MACH_SUN50I))
 		pin_function = SUN50I_GPC_SPI0;

On Sun, Nov 20, 2016 at 8:26 PM, Andre Przywara andre.przywara@arm.com wrote:
Somehow an int returning function without a return statement sneaked in, fix it. Also fix some whitespace damage on the way.
Signed-off-by: Andre Przywara andre.przywara@arm.com
drivers/mtd/spi/sunxi_spi_spl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/spi/sunxi_spi_spl.c b/drivers/mtd/spi/sunxi_spi_spl.c index 67c7edd..7502314 100644 --- a/drivers/mtd/spi/sunxi_spi_spl.c +++ b/drivers/mtd/spi/sunxi_spi_spl.c @@ -158,9 +158,10 @@ static void spi0_disable_clock(void) (1 << AHB_RESET_SPI0_SHIFT)); }
-static int spi0_init(void) +static void spi0_init(void) { unsigned int pin_function = SUNXI_GPC_SPI0;
Applied to u-boot-spi/master
thanks!

On 20/11/2016 15:56, Andre Przywara wrote:
Somehow an int returning function without a return statement sneaked in, fix it. Also fix some whitespace damage on the way.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Reviewed-by: Alexander Graf agraf@suse.de
Alex

These days many Allwinner SoCs use clock_sun6i.c, although out of them only the (original sun6i) A31 has a second MBUS clock register. Also setting up the PRCM PLL_CTLR1 register to provide the proper voltage seems to be an A31-only feature as well. So restrict the initialization to this SoC only to avoid writing bogus values to (undefined) registers in other chips.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Reviewed-by: Alexander Graf agraf@suse.de
Reviewed-by: Chen-Yu Tsai wens@csie.org
---
 arch/arm/mach-sunxi/clock_sun6i.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c
index ed8cd9b..382fa94 100644
--- a/arch/arm/mach-sunxi/clock_sun6i.c
+++ b/arch/arm/mach-sunxi/clock_sun6i.c
@@ -21,6 +21,8 @@ void clock_init_safe(void)
 {
 	struct sunxi_ccm_reg * const ccm =
 		(struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
+
+#ifdef CONFIG_MACH_SUN6I
 	struct sunxi_prcm_reg * const prcm =
 		(struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;

@@ -31,6 +33,7 @@ void clock_init_safe(void)
 		PRCM_PLL_CTRL_LDO_DIGITAL_EN | PRCM_PLL_CTRL_LDO_ANALOG_EN |
 		PRCM_PLL_CTRL_EXT_OSC_EN | PRCM_PLL_CTRL_LDO_OUT_L(1140));
 	clrbits_le32(&prcm->pll_ctrl1, PRCM_PLL_CTRL_LDO_KEY_MASK);
+#endif

 	clock_set_pll1(408000000);

@@ -41,7 +44,9 @@ void clock_init_safe(void)
 	writel(AHB1_ABP1_DIV_DEFAULT, &ccm->ahb1_apb1_div);

 	writel(MBUS_CLK_DEFAULT, &ccm->mbus0_clk_cfg);
+#ifdef CONFIG_MACH_SUN6I
 	writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg);
+#endif
 }
 #endif

On Sun, Nov 20, 2016 at 8:26 PM, Andre Przywara andre.przywara@arm.com wrote:
These days many Allwinner SoCs use clock_sun6i.c, although out of them only the (original sun6i) A31 has a second MBUS clock register. Also setting up the PRCM PLL_CTLR1 register to provide the proper voltage seems to be an A31-only feature as well. So restrict the initialization to this SoC only to avoid writing bogus values to (undefined) registers in other chips.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de Reviewed-by: Chen-Yu Tsai wens@csie.org
Reviewed-by: Jagan Teki jagan@openedev.com
thanks!

On Sun, 20 Nov 2016 14:56:56 +0000 Andre Przywara andre.przywara@arm.com wrote:
These days many Allwinner SoCs use clock_sun6i.c, although out of them only the (original sun6i) A31 has a second MBUS clock register. Also setting up the PRCM PLL_CTLR1 register to provide the proper voltage seems to be an A31-only feature as well. So restrict the initialization to this SoC only to avoid writing bogus values to (undefined) registers in other chips.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de Reviewed-by: Chen-Yu Tsai wens@csie.org
arch/arm/mach-sunxi/clock_sun6i.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index ed8cd9b..382fa94 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -21,6 +21,8 @@ void clock_init_safe(void) { struct sunxi_ccm_reg * const ccm = (struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
+#ifdef CONFIG_MACH_SUN6I struct sunxi_prcm_reg * const prcm = (struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;
@@ -31,6 +33,7 @@ void clock_init_safe(void) PRCM_PLL_CTRL_LDO_DIGITAL_EN | PRCM_PLL_CTRL_LDO_ANALOG_EN | PRCM_PLL_CTRL_EXT_OSC_EN | PRCM_PLL_CTRL_LDO_OUT_L(1140)); clrbits_le32(&prcm->pll_ctrl1, PRCM_PLL_CTRL_LDO_KEY_MASK); +#endif
PRCM is generally poorly documented, with its description sometimes entirely missing from the Allwinner manuals (while it exists in the hardware). But many SoC variants are sharing the same features and need the same code. I can confirm that this code chunk is needed on my A31s tablet (otherwise the system does not boot), so it was not entirely useless.
What about the other SoC variants? For example, I can see that the A23 manual documents this register in roughly the same way as the A31 (the PLLVDD voltage settings, etc.). But I don't have any A23 hardware to test. Basically, are we sure that we will not break A23 support with this commit?
If I understand it correctly, you primarily wanted to disable this code on A64. But disabling it everywhere other than A31 may be a bit too broad.
clock_set_pll1(408000000);
@@ -41,7 +44,9 @@ void clock_init_safe(void) writel(AHB1_ABP1_DIV_DEFAULT, &ccm->ahb1_apb1_div);
writel(MBUS_CLK_DEFAULT, &ccm->mbus0_clk_cfg); +#ifdef CONFIG_MACH_SUN6I writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg); +#endif
We can change this to:
	if (IS_ENABLED(CONFIG_MACH_SUN6I))
		writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg);
This saves one line of code and also looks a bit less ugly.
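For readers weighing the two styles, here is a minimal standalone sketch of how an IS_ENABLED-style macro can work. It is modelled on the preprocessor trick known from the Linux kernel's kconfig.h, but every DEMO_ name below is made up for illustration and is not a U-Boot symbol. The point is that the guarded statement is still parsed and type-checked by the compiler, and the dead branch is then dropped as constant-false code, whereas an #ifdef hides it from the compiler entirely.

```c
/* Hypothetical config symbols: one enabled, one deliberately undefined. */
#define DEMO_CONFIG_MACH_SUN6I 1
/* DEMO_CONFIG_MACH_SUN50I is intentionally not defined. */

/*
 * If the option is defined to 1, the token paste yields DEMO_PLACEHOLDER_1,
 * which expands to "0," and shifts a literal 1 into the second argument
 * slot. Otherwise the junk token stays in slot one and the 0 is selected.
 */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_SECOND_ARG(ignored, val, ...) val
#define DEMO_IS_DEFINED_3(arg1_or_junk) DEMO_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define DEMO_IS_DEFINED_2(val) DEMO_IS_DEFINED_3(DEMO_PLACEHOLDER_##val)
#define DEMO_IS_ENABLED(option) DEMO_IS_DEFINED_2(option)

/* Counts the register writes that would be issued in this configuration. */
static int demo_registers_written(void)
{
	int writes = 1;		/* stands in for the mbus0 writel() */

	/* Both branches are compiled; the constant-false one is discarded. */
	if (DEMO_IS_ENABLED(DEMO_CONFIG_MACH_SUN6I))
		writes++;	/* stands in for the mbus1 writel() */
	if (DEMO_IS_ENABLED(DEMO_CONFIG_MACH_SUN50I))
		writes += 100;	/* never taken: symbol is undefined */

	return writes;
}
```

One consequence of this approach is that everything referenced inside the if () body must still be declared in all configurations, which is why it cannot replace an #ifdef around a variable declaration.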

Hi,
On 24/11/16 03:01, Siarhei Siamashka wrote:
On Sun, 20 Nov 2016 14:56:56 +0000 Andre Przywara andre.przywara@arm.com wrote:
These days many Allwinner SoCs use clock_sun6i.c, although out of them only the (original sun6i) A31 has a second MBUS clock register. Also setting up the PRCM PLL_CTLR1 register to provide the proper voltage seems to be an A31-only feature as well. So restrict the initialization to this SoC only to avoid writing bogus values to (undefined) registers in other chips.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de Reviewed-by: Chen-Yu Tsai wens@csie.org
arch/arm/mach-sunxi/clock_sun6i.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index ed8cd9b..382fa94 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -21,6 +21,8 @@ void clock_init_safe(void) { struct sunxi_ccm_reg * const ccm = (struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
+#ifdef CONFIG_MACH_SUN6I struct sunxi_prcm_reg * const prcm = (struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;
@@ -31,6 +33,7 @@ void clock_init_safe(void) PRCM_PLL_CTRL_LDO_DIGITAL_EN | PRCM_PLL_CTRL_LDO_ANALOG_EN | PRCM_PLL_CTRL_EXT_OSC_EN | PRCM_PLL_CTRL_LDO_OUT_L(1140)); clrbits_le32(&prcm->pll_ctrl1, PRCM_PLL_CTRL_LDO_KEY_MASK); +#endif
PRCM is generally poorly documented, with its description sometimes entirely missing from the Allwinner manuals (while it exists in the hardware). But many SoC variants are sharing the same features and need the same code. I can confirm that this code chunk is needed on my A31s tablet (otherwise the system does not boot), so it was not entirely useless.
Sure, in fact I was hoping for people to holler if it breaks their board.
What about the other SoC variants? For example, I can see that the A23 manual documents this register in roughly the same way as the A31 (the PLLVDD voltage settings, etc.). But I don't have any A23 hardware to test. Basically, are we sure that we will not break A23 support with this commit?
If I understand it correctly, you primarily wanted to disable this code on A64. But disabling it everywhere other than A31 may be a bit too broad.
Well, my impression was that this code was added for the A31, and just called clock_sun6i.c because it made sense at the time. Later on people just re-used the _clock_ code because the clocks are compatible, but missed this one - which cares about a regulator, really. So if people can come up with a list of SoCs that need this, I am happy to add this to the #ifdef. I just had the impression that boards with AXPs or I2C regulators don't need this.
clock_set_pll1(408000000);
@@ -41,7 +44,9 @@ void clock_init_safe(void) writel(AHB1_ABP1_DIV_DEFAULT, &ccm->ahb1_apb1_div);
writel(MBUS_CLK_DEFAULT, &ccm->mbus0_clk_cfg); +#ifdef CONFIG_MACH_SUN6I writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg); +#endif
We can change this to:
	if (IS_ENABLED(CONFIG_MACH_SUN6I))
		writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg);
This saves one line of code and also looks a bit less ugly.
Is there some "official" rationale for using IS_ENABLED vs. #ifdef? As much as I dislike this massive usage of #ifdefs, at least it gives me a clear heads-up that this code may not be compiled in, which can more easily be missed with IS_ENABLED. But I don't have a strong opinion on this, so I am happy to change it.
Cheers, Andre.

On 24/11/16 10:18, Andre Przywara wrote:
Hi,
On 24/11/16 03:01, Siarhei Siamashka wrote:
On Sun, 20 Nov 2016 14:56:56 +0000 Andre Przywara andre.przywara@arm.com wrote:
These days many Allwinner SoCs use clock_sun6i.c, although out of them only the (original sun6i) A31 has a second MBUS clock register. Also setting up the PRCM PLL_CTLR1 register to provide the proper voltage seems to be an A31-only feature as well. So restrict the initialization to this SoC only to avoid writing bogus values to (undefined) registers in other chips.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de Reviewed-by: Chen-Yu Tsai wens@csie.org
arch/arm/mach-sunxi/clock_sun6i.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index ed8cd9b..382fa94 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -21,6 +21,8 @@ void clock_init_safe(void) { struct sunxi_ccm_reg * const ccm = (struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
+#ifdef CONFIG_MACH_SUN6I struct sunxi_prcm_reg * const prcm = (struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;
@@ -31,6 +33,7 @@ void clock_init_safe(void) PRCM_PLL_CTRL_LDO_DIGITAL_EN | PRCM_PLL_CTRL_LDO_ANALOG_EN | PRCM_PLL_CTRL_EXT_OSC_EN | PRCM_PLL_CTRL_LDO_OUT_L(1140)); clrbits_le32(&prcm->pll_ctrl1, PRCM_PLL_CTRL_LDO_KEY_MASK); +#endif
PRCM is generally poorly documented, with its description sometimes entirely missing from the Allwinner manuals (while it exists in the hardware). But many SoC variants are sharing the same features and need the same code. I can confirm that this code chunk is needed on my A31s tablet (otherwise the system does not boot), so it was not entirely useless.
Sure, in fact I was hoping for people to holler if it breaks their board.
What about the other SoC variants? For example, I can see that the A23 manual documents this register in roughly the same way as the A31 (the PLLVDD voltage settings, etc.). But I don't have any A23 hardware to test. Basically, are we sure that we will not break A23 support with this commit?
If I understand it correctly, you primarily wanted to disable this code on A64. But disabling it everywhere other than A31 may be a bit too broad.
Well, my impression was that this code was added for the A31, and just called clock_sun6i.c because it made sense at the time. Later on people just re-used the _clock_ code because the clocks are compatible, but missed this one - which cares about a regulator, really. So if people can come up with a list of SoCs that need this, I am happy to add this to the #ifdef. I just had the impression that boards with AXPs or I2C regulators don't need this.
I now realized that this PLL voltage is obviously generated by an internal regulator, based on the externally provided PLL-Vcc. So in fact this register still seems to be valid, even for newer SoCs. But at least for the H2, A64 and H5 I tested this on, the reset value (or the value set by BROM) is exactly 0x00070007, which is what we write here. I think U-Boot shouldn't care about writing those registers if that's the reset value anyway, especially if that happens with a widely used CONFIG symbol. So I will change that #ifdef to just spare the H3 and A64 for now, eventually extending this to more chips, possibly ending with the A31 being the only user. If any owner of one of those A23, A33, A80 and A83T systems could verify that it works without this register setup (so with this very patch applied), I'd be grateful.
The reset(?) value can be checked via FEL by:
$ ./sunxi-fel readl 0x01f01444
Cheers, Andre.
clock_set_pll1(408000000);
@@ -41,7 +44,9 @@ void clock_init_safe(void) writel(AHB1_ABP1_DIV_DEFAULT, &ccm->ahb1_apb1_div);
writel(MBUS_CLK_DEFAULT, &ccm->mbus0_clk_cfg); +#ifdef CONFIG_MACH_SUN6I writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg); +#endif
We can change this to:
	if (IS_ENABLED(CONFIG_MACH_SUN6I))
		writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg);
This saves one line of code and also looks a bit less ugly.
Is there some "official" rationale for using IS_ENABLED vs. #ifdef? As much as I dislike this massive usage of #ifdefs, at least it gives me a clear heads-up that this code may not be compiled in, which can more easily be missed with IS_ENABLED. But I don't have a strong opinion on this, so I am happy to change it.
Cheers, Andre.

The predominantly 32-bit ARM targets try to compile the SPL in Thumb mode to reduce code size. The 64-bit AArch64 instruction set does not have an alternative, concise encoding, so the Thumb build option should only be set for 32-bit targets. Likewise -marm machine options are only valid for ARMv7 targets.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/lib/Makefile          | 2 ++
 include/configs/sunxi-common.h | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index a812306..8dc6787 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -77,8 +77,10 @@ ifndef CONFIG_HAS_THUMB2

 # for C files, just apend -marm, which will override previous -mthumb*

+ifndef CONFIG_ARM64
 CFLAGS_cache.o := -marm
 CFLAGS_cache-cp15.o := -marm
+endif

 # For .S, drop -mthumb* and other thumb-related options.
 # CFLAGS_REMOVE_* would not have an effet, so AFLAGS_REMOVE_*
diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h
index 8363414..86b4104 100644
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -35,7 +35,7 @@
 /*
  * High Level Configuration Options
  */
-#ifdef CONFIG_SPL_BUILD
+#if defined(CONFIG_SPL_BUILD) && !defined(CONFIG_ARM64)
 #define CONFIG_SYS_THUMB_BUILD	/* Thumbs mode to save space in SPL */
 #endif

On 20/11/2016 15:56, Andre Przywara wrote:
The predominantly 32-bit ARM targets try to compile the SPL in Thumb mode to reduce code size. The 64-bit AArch64 instruction set does not have an alternative, concise encoding, so the Thumb build option should only be set for 32-bit targets. Likewise -marm machine options are only valid for ARMv7 targets.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Reviewed-by: Alexander Graf agraf@suse.de
Alex

For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/cpu/armv8/Makefile        |  1 +
 arch/arm/cpu/armv8/lowlevel_init.S | 44 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S

diff --git a/arch/arm/cpu/armv8/Makefile b/arch/arm/cpu/armv8/Makefile
index dea1465..799a752 100644
--- a/arch/arm/cpu/armv8/Makefile
+++ b/arch/arm/cpu/armv8/Makefile
@@ -25,3 +25,4 @@ obj-$(CONFIG_FSL_LAYERSCAPE) += fsl-layerscape/
 obj-$(CONFIG_S32V234) += s32v234/
 obj-$(CONFIG_ARCH_ZYNQMP) += zynqmp/
 obj-$(CONFIG_TARGET_HIKEY) += hisilicon/
+obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o
diff --git a/arch/arm/cpu/armv8/lowlevel_init.S b/arch/arm/cpu/armv8/lowlevel_init.S
new file mode 100644
index 0000000..189e35f
--- /dev/null
+++ b/arch/arm/cpu/armv8/lowlevel_init.S
@@ -0,0 +1,44 @@
+/*
+ * A lowlevel_init function that sets up the stack to call a C function to
+ * perform further init.
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <asm-offsets.h>
+#include <config.h>
+#include <linux/linkage.h>
+
+ENTRY(lowlevel_init)
+	/*
+	 * Setup a temporary stack. Global data is not available yet.
+	 */
+#if defined(CONFIG_SPL_BUILD) && defined(CONFIG_SPL_STACK)
+	ldr	w0, =CONFIG_SPL_STACK
+#else
+	ldr	w0, =CONFIG_SYS_INIT_SP_ADDR
+#endif
+	bic	sp, x0, #0xf	/* 16-byte alignment for ABI compliance */
+
+	/*
+	 * Save the old LR(passed in x29) and the current LR to stack
+	 */
+	stp	x29, x30, [sp, #-16]!
+
+	/*
+	 * Call the very early init function. This should do only the
+	 * absolute bare minimum to get started. It should not:
+	 *
+	 * - set up DRAM
+	 * - use global_data
+	 * - clear BSS
+	 * - try to start a console
+	 *
+	 * For boards with SPL this should be empty since SPL can do all of
+	 * this init in the SPL board_init_f() function which is called
+	 * immediately after this.
+	 */
+	bl	s_init
+	ldp	x29, x30, [sp]
+	ret
+ENDPROC(lowlevel_init)
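The address arithmetic behind the "bic sp, x0, #0xf" and "stp x29, x30, [sp, #-16]!" instructions can be sketched in C as follows. This is only an illustration: the function names are hypothetical, and the addresses used in the note below are made-up values, not a real CONFIG_SPL_STACK.

```c
#include <stdint.h>

/* Clear the low four bits, as "bic sp, x0, #0xf" does, so that the
 * stack pointer is 16-byte aligned as AArch64 requires. */
static uint64_t demo_align_sp(uint64_t addr)
{
	return addr & ~(uint64_t)0xf;
}

/* Pre-indexed store of a register pair: the stack pointer moves down
 * by 16 bytes (two 8-byte registers) before the values are written. */
static uint64_t demo_push_pair(uint64_t sp)
{
	return sp - 16;
}
```

For a deliberately misaligned example address of 0x1805f, the aligned stack top is 0x18050 and the pair of registers ends up at 0x18040, so the stack pointer stays 16-byte aligned across the call to s_init.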

On 20/11/2016 15:56, Andre Przywara wrote:
For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Isn't that what _main in ./arch/arm/lib/crt0_64.S is supposed to do? That should be used for the SPL flow too, no?
 arch/arm/cpu/armv8/Makefile        |  1 +
 arch/arm/cpu/armv8/lowlevel_init.S | 44 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S

diff --git a/arch/arm/cpu/armv8/Makefile b/arch/arm/cpu/armv8/Makefile
index dea1465..799a752 100644
--- a/arch/arm/cpu/armv8/Makefile
+++ b/arch/arm/cpu/armv8/Makefile
@@ -25,3 +25,4 @@ obj-$(CONFIG_FSL_LAYERSCAPE) += fsl-layerscape/
 obj-$(CONFIG_S32V234) += s32v234/
 obj-$(CONFIG_ARCH_ZYNQMP) += zynqmp/
 obj-$(CONFIG_TARGET_HIKEY) += hisilicon/
+obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o
diff --git a/arch/arm/cpu/armv8/lowlevel_init.S b/arch/arm/cpu/armv8/lowlevel_init.S
new file mode 100644
index 0000000..189e35f
--- /dev/null
+++ b/arch/arm/cpu/armv8/lowlevel_init.S
@@ -0,0 +1,44 @@
+/*
+ * A lowlevel_init function that sets up the stack to call a C function to
+ * perform further init.
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <asm-offsets.h>
+#include <config.h>
+#include <linux/linkage.h>
+
+ENTRY(lowlevel_init)
+	/*
+	 * Setup a temporary stack. Global data is not available yet.
+	 */
+#if defined(CONFIG_SPL_BUILD) && defined(CONFIG_SPL_STACK)
+	ldr	w0, =CONFIG_SPL_STACK
+#else
+	ldr	w0, =CONFIG_SYS_INIT_SP_ADDR
+#endif
+	bic	sp, x0, #0xf	/* 16-byte alignment for ABI compliance */
+
+	/*
+	 * Save the old LR(passed in x29) and the current LR to stack
+	 */
+	stp	x29, x30, [sp, #-16]!
+
+	/*
+	 * Call the very early init function. This should do only the
+	 * absolute bare minimum to get started. It should not:
+	 *
+	 * - set up DRAM
+	 * - use global_data
+	 * - clear BSS
+	 * - try to start a console
+	 *
+	 * For boards with SPL this should be empty since SPL can do all of
+	 * this init in the SPL board_init_f() function which is called
+	 * immediately after this.
So this comment says that s_init shouldn't even be used for SPL and instead we should use board_init_f which is what crt0 calls.
Alex

Hi Alex,
thanks for having a look!
On 21/11/16 15:34, Alexander Graf wrote:
On 20/11/2016 15:56, Andre Przywara wrote:
For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Isn't that what _main in ./arch/arm/lib/crt0_64.S is supposed to do? That should be used for the SPL flow too, no?
I saw that too, but apparently this one here is some kind of shortcut for the SPL. TBH I didn't dare to look too closely, as this seems to be a bit entangled.
....
 arch/arm/cpu/armv8/Makefile        |  1 +
 arch/arm/cpu/armv8/lowlevel_init.S | 44 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S

diff --git a/arch/arm/cpu/armv8/Makefile b/arch/arm/cpu/armv8/Makefile
index dea1465..799a752 100644
--- a/arch/arm/cpu/armv8/Makefile
+++ b/arch/arm/cpu/armv8/Makefile
@@ -25,3 +25,4 @@ obj-$(CONFIG_FSL_LAYERSCAPE) += fsl-layerscape/
 obj-$(CONFIG_S32V234) += s32v234/
 obj-$(CONFIG_ARCH_ZYNQMP) += zynqmp/
 obj-$(CONFIG_TARGET_HIKEY) += hisilicon/
+obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o
diff --git a/arch/arm/cpu/armv8/lowlevel_init.S b/arch/arm/cpu/armv8/lowlevel_init.S
new file mode 100644
index 0000000..189e35f
--- /dev/null
+++ b/arch/arm/cpu/armv8/lowlevel_init.S
@@ -0,0 +1,44 @@
+/*
+ * A lowlevel_init function that sets up the stack to call a C function to
+ * perform further init.
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <asm-offsets.h>
+#include <config.h>
+#include <linux/linkage.h>
+
+ENTRY(lowlevel_init)
+	/*
+	 * Setup a temporary stack. Global data is not available yet.
+	 */
+#if defined(CONFIG_SPL_BUILD) && defined(CONFIG_SPL_STACK)
+	ldr	w0, =CONFIG_SPL_STACK
+#else
+	ldr	w0, =CONFIG_SYS_INIT_SP_ADDR
+#endif
+	bic	sp, x0, #0xf	/* 16-byte alignment for ABI compliance */
+
+	/*
+	 * Save the old LR(passed in x29) and the current LR to stack
+	 */
+	stp	x29, x30, [sp, #-16]!
+
+	/*
+	 * Call the very early init function. This should do only the
+	 * absolute bare minimum to get started. It should not:
+	 *
+	 * - set up DRAM
+	 * - use global_data
+	 * - clear BSS
+	 * - try to start a console
+	 *
+	 * For boards with SPL this should be empty since SPL can do all of
+	 * this init in the SPL board_init_f() function which is called
+	 * immediately after this.
So this comment says that s_init shouldn't even be used for SPL and instead we should use board_init_f which is what crt0 calls.
Yes, that is a good point. So the sunxi port seemed to ignore this recommendation and do it in its own way (TM). Frankly I didn't want to fix this one too at this point, so I just went ahead with filling the 64-bit gaps. But I agree that this should be fixed eventually. The line above: +obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o hints already that something is not really right here, as SUNXI is the only user of this rather generic function. If I find some spare time (haha), I might take a look, otherwise: be my guest ;-)
Cheers, Andre.

On 21/11/2016 16:49, Andre Przywara wrote:
Hi Alex,
thanks for having a look!
On 21/11/16 15:34, Alexander Graf wrote:
On 20/11/2016 15:56, Andre Przywara wrote:
For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Isn't that what _main in ./arch/arm/lib/crt0_64.S is supposed to do? That should be used for the SPL flow too, no?
I saw that too, but apparently this one here is some kind of shortcut for the SPL. TBH I didn't dare to look too closely, as this seems to be a bit entangled.
....
 arch/arm/cpu/armv8/Makefile        |  1 +
 arch/arm/cpu/armv8/lowlevel_init.S | 44 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S

diff --git a/arch/arm/cpu/armv8/Makefile b/arch/arm/cpu/armv8/Makefile
index dea1465..799a752 100644
--- a/arch/arm/cpu/armv8/Makefile
+++ b/arch/arm/cpu/armv8/Makefile
@@ -25,3 +25,4 @@ obj-$(CONFIG_FSL_LAYERSCAPE) += fsl-layerscape/
 obj-$(CONFIG_S32V234) += s32v234/
 obj-$(CONFIG_ARCH_ZYNQMP) += zynqmp/
 obj-$(CONFIG_TARGET_HIKEY) += hisilicon/
+obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o
diff --git a/arch/arm/cpu/armv8/lowlevel_init.S b/arch/arm/cpu/armv8/lowlevel_init.S
new file mode 100644
index 0000000..189e35f
--- /dev/null
+++ b/arch/arm/cpu/armv8/lowlevel_init.S
@@ -0,0 +1,44 @@
+/*
+ * A lowlevel_init function that sets up the stack to call a C function to
+ * perform further init.
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <asm-offsets.h>
+#include <config.h>
+#include <linux/linkage.h>
+
+ENTRY(lowlevel_init)
+	/*
+	 * Setup a temporary stack. Global data is not available yet.
+	 */
+#if defined(CONFIG_SPL_BUILD) && defined(CONFIG_SPL_STACK)
+	ldr	w0, =CONFIG_SPL_STACK
+#else
+	ldr	w0, =CONFIG_SYS_INIT_SP_ADDR
+#endif
+	bic	sp, x0, #0xf	/* 16-byte alignment for ABI compliance */
+
+	/*
+	 * Save the old LR(passed in x29) and the current LR to stack
+	 */
+	stp	x29, x30, [sp, #-16]!
+
+	/*
+	 * Call the very early init function. This should do only the
+	 * absolute bare minimum to get started. It should not:
+	 *
+	 * - set up DRAM
+	 * - use global_data
+	 * - clear BSS
+	 * - try to start a console
+	 *
+	 * For boards with SPL this should be empty since SPL can do all of
+	 * this init in the SPL board_init_f() function which is called
+	 * immediately after this.
So this comment says that s_init shouldn't even be used for SPL and instead we should use board_init_f which is what crt0 calls.
Yes, that is a good point. So the sunxi port seemed to ignore this recommendation and do it in its own way (TM). Frankly I didn't want to fix this one too at this point, so I just went ahead with filling the 64-bit gaps. But I agree that this should be fixed eventually. The line above: +obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o hints already that something is not really right here, as SUNXI is the only user of this rather generic function. If I find some spare time (haha), I might take a look, otherwise: be my guest ;-)
IIRC in my SPL port (that I can't find anymore) I just called s_init from board_init_f.
Alex

tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 lib/tiny-printf.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/lib/tiny-printf.c b/lib/tiny-printf.c
index 30ac759..b01099d 100644
--- a/lib/tiny-printf.c
+++ b/lib/tiny-printf.c
@@ -38,8 +38,8 @@ static void out_dgt(struct printf_info *info, char dgt)
 	info->zs = 1;
 }
 
-static void div_out(struct printf_info *info, unsigned int *num,
-		    unsigned int div)
+static void div_out(struct printf_info *info, unsigned long *num,
+		    unsigned long div)
 {
 	unsigned char dgt = 0;
 
@@ -56,9 +56,9 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 {
 	char ch;
 	char *p;
-	unsigned int num;
+	unsigned long num;
 	char buf[12];
-	unsigned int div;
+	unsigned long div;
 
 	while ((ch = *(fmt++))) {
 		if (ch != '%') {
@@ -66,8 +66,12 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 		} else {
 			bool lz = false;
 			int width = 0;
+			bool islong = false;
 
 			ch = *(fmt++);
+			if (ch == '-')
+				ch = *(fmt++);
+
 			if (ch == '0') {
 				ch = *(fmt++);
 				lz = 1;
@@ -80,6 +84,11 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 					ch = *fmt++;
 				}
 			}
+			if (ch == 'l') {
+				ch = *(fmt++);
+				islong = true;
+			}
+
 			info->bf = buf;
 			p = info->bf;
 			info->zs = 0;
@@ -89,24 +98,38 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 				goto abort;
 			case 'u':
 			case 'd':
-				num = va_arg(va, unsigned int);
-				if (ch == 'd' && (int)num < 0) {
-					num = -(int)num;
+				div = 1000000000;
+				if (islong) {
+					num = va_arg(va, unsigned long);
+					if (sizeof(long) > 4)
+						div *= div * 10;
+				} else {
+					num = va_arg(va, unsigned int);
+				}
+
+				if (ch == 'd' && (long)num < 0) {
+					num = -(long)num;
 					out(info, '-');
 				}
 				if (!num) {
 					out_dgt(info, 0);
 				} else {
-					for (div = 1000000000; div; div /= 10)
+					for (; div; div /= 10)
 						div_out(info, &num, div);
 				}
 				break;
 			case 'x':
-				num = va_arg(va, unsigned int);
+				if (islong) {
+					num = va_arg(va, unsigned long);
+					div = 1UL << (sizeof(long) * 8 - 4);
+				} else {
+					num = va_arg(va, unsigned int);
+					div = 0x10000000;
+				}
 				if (!num) {
 					out_dgt(info, 0);
 				} else {
-					for (div = 0x10000000; div; div /= 0x10)
+					for (; div; div /= 0x10)
 						div_out(info, &num, div);
 				}
 				break;

On 20/11/2016 15:56, Andre Przywara wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
Signed-off-by: Andre Przywara andre.przywara@arm.com
lib/tiny-printf.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-)
diff --git a/lib/tiny-printf.c b/lib/tiny-printf.c
index 30ac759..b01099d 100644
--- a/lib/tiny-printf.c
+++ b/lib/tiny-printf.c
@@ -38,8 +38,8 @@ static void out_dgt(struct printf_info *info, char dgt)
 	info->zs = 1;
 }
 
-static void div_out(struct printf_info *info, unsigned int *num,
-		    unsigned int div)
+static void div_out(struct printf_info *info, unsigned long *num,
+		    unsigned long div)
 {
 	unsigned char dgt = 0;
 
@@ -56,9 +56,9 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 {
 	char ch;
 	char *p;
-	unsigned int num;
+	unsigned long num;
 	char buf[12];
-	unsigned int div;
+	unsigned long div;
 
 	while ((ch = *(fmt++))) {
 		if (ch != '%') {
@@ -66,8 +66,12 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 		} else {
 			bool lz = false;
 			int width = 0;
+			bool islong = false;
 
 			ch = *(fmt++);
+			if (ch == '-')
+				ch = *(fmt++);

What does this do? I don't see '-' mentioned in the patch description.

+
 			if (ch == '0') {
 				ch = *(fmt++);
 				lz = 1;
@@ -80,6 +84,11 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 					ch = *fmt++;
 				}
 			}
+			if (ch == 'l') {
+				ch = *(fmt++);
+				islong = true;
+			}
+
 			info->bf = buf;
 			p = info->bf;
 			info->zs = 0;
@@ -89,24 +98,38 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 				goto abort;
 			case 'u':
 			case 'd':
-				num = va_arg(va, unsigned int);
-				if (ch == 'd' && (int)num < 0) {
-					num = -(int)num;
+				div = 1000000000;
+				if (islong) {

Check here if sizeof(long) > 4, so that the whole branch gets optimized away on 32bit.

+					num = va_arg(va, unsigned long);
+					if (sizeof(long) > 4)
+						div *= div * 10;
+				} else {
+					num = va_arg(va, unsigned int);
+				}
+
+				if (ch == 'd' && (long)num < 0) {
+					num = -(long)num;

Num is a long now and before. So if you have a 32bit signed input, it will sign extend incorrectly here. You need an additional check

  if (islong)
    num = -(long)num;
  else
    num = -(int)num;

Let's hope the compiler on 32bit is smart enough to know that it can combine those two cases :).

 					out(info, '-');
 				}
 				if (!num) {
 					out_dgt(info, 0);
 				} else {
-					for (div = 1000000000; div; div /= 10)
+					for (; div; div /= 10)

Any particular reason for that change?

 						div_out(info, &num, div);
 				}
 				break;
 			case 'x':
-				num = va_arg(va, unsigned int);
+				if (islong) {

Same comment as above.

Alex

+					num = va_arg(va, unsigned long);
+					div = 1UL << (sizeof(long) * 8 - 4);
+				} else {
+					num = va_arg(va, unsigned int);
+					div = 0x10000000;
+				}
 				if (!num) {
 					out_dgt(info, 0);
 				} else {
-					for (div = 0x10000000; div; div /= 0x10)
+					for (; div; div /= 0x10)
 						div_out(info, &num, div);
 				}
 				break;
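
The sign-extension concern raised in this review can be reproduced outside of U-Boot. What follows is only a minimal sketch, not the tiny-printf code: the helper names (fetch_int_arg, is_negative_patch, is_negative_fixed, magnitude_fixed) are made up for illustration, and the narrowing casts rely on the wrap-around behaviour that GCC documents for out-of-range conversions to signed types.

```c
#include <stdbool.h>

/*
 * Mirrors the patch: a plain %d argument is read as unsigned int and then
 * stored in an unsigned long, which zero-extends it where long is 64-bit.
 */
static unsigned long fetch_int_arg(int value)
{
	return (unsigned int)value;
}

/* Sign test as posted: looks at the sign bit of the full long. */
static bool is_negative_patch(unsigned long num)
{
	return (long)num < 0;
}

/* Sign test with the additional islong distinction from the review. */
static bool is_negative_fixed(unsigned long num, bool islong)
{
	return islong ? (long)num < 0 : (int)num < 0;
}

/*
 * Magnitude to print after the '-'. Computed in unsigned arithmetic
 * (an assumption of this sketch) so the most negative value cannot overflow.
 */
static unsigned long magnitude_fixed(unsigned long num, bool islong)
{
	if (islong)
		return 0UL - num;
	return 0U - (unsigned int)num;
}
```

Where long is 64-bit, is_negative_patch(fetch_int_arg(-5)) is false, so the posted code would print 4294967291 for a "%d" of -5. With the islong distinction the sign is detected and the magnitude is 5 on both 32-bit and 64-bit builds.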

Hi Alex,
On 21/11/16 15:42, Alexander Graf wrote:
On 20/11/2016 15:56, Andre Przywara wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
Signed-off-by: Andre Przywara andre.przywara@arm.com
lib/tiny-printf.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-)
diff --git a/lib/tiny-printf.c b/lib/tiny-printf.c
index 30ac759..b01099d 100644
--- a/lib/tiny-printf.c
+++ b/lib/tiny-printf.c
@@ -38,8 +38,8 @@ static void out_dgt(struct printf_info *info, char dgt)
 	info->zs = 1;
 }
 
-static void div_out(struct printf_info *info, unsigned int *num,
-		    unsigned int div)
+static void div_out(struct printf_info *info, unsigned long *num,
+		    unsigned long div)
 {
 	unsigned char dgt = 0;
 
@@ -56,9 +56,9 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 {
 	char ch;
 	char *p;
-	unsigned int num;
+	unsigned long num;
 	char buf[12];
-	unsigned int div;
+	unsigned long div;
 
 	while ((ch = *(fmt++))) {
 		if (ch != '%') {
@@ -66,8 +66,12 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 		} else {
 			bool lz = false;
 			int width = 0;
+			bool islong = false;
 
 			ch = *(fmt++);
+			if (ch == '-')
+				ch = *(fmt++);

What does this do? I don't see '-' mentioned in the patch description.

Argh, apparently the comment in the commit message got lost during a patch reshuffle. Sorry, will re-add it.

We need it because some SPL printf uses '-', just ignoring it here seems fine for SPL purposes though.

+
 			if (ch == '0') {
 				ch = *(fmt++);
 				lz = 1;
@@ -80,6 +84,11 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 					ch = *fmt++;
 				}
 			}
+			if (ch == 'l') {
+				ch = *(fmt++);
+				islong = true;
+			}
+
 			info->bf = buf;
 			p = info->bf;
 			info->zs = 0;
@@ -89,24 +98,38 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 				goto abort;
 			case 'u':
 			case 'd':
-				num = va_arg(va, unsigned int);
-				if (ch == 'd' && (int)num < 0) {
-					num = -(int)num;
+				div = 1000000000;
+				if (islong) {

Check here if sizeof(long) > 4, so that the whole branch gets optimized away on 32bit.

Good idea.

+					num = va_arg(va, unsigned long);
+					if (sizeof(long) > 4)
+						div *= div * 10;
+				} else {
+					num = va_arg(va, unsigned int);
+				}
+
+				if (ch == 'd' && (long)num < 0) {
+					num = -(long)num;

Num is a long now and before. So if you have a 32bit signed input, it will sign extend incorrectly here. You need an additional check

  if (islong)
    num = -(long)num;
  else
    num = -(int)num;

Let's hope the compiler on 32bit is smart enough to know that it can combine those two cases :).

 					out(info, '-');
 				}
 				if (!num) {
 					out_dgt(info, 0);
 				} else {
-					for (div = 1000000000; div; div /= 10)
+					for (; div; div /= 10)

Any particular reason for that change?

This algorithm so far only cared for 32-bit values, so it set the start divider to 1E9. This is not sufficient for 64-bit longs in AA64. So I compute div above, depending on the actual size of long.

 						div_out(info, &num, div);
 				}
 				break;
 			case 'x':
-				num = va_arg(va, unsigned int);
+				if (islong) {

Same comment as above.

Thanks, I will take a look at the rest.

Cheers, Andre.

+					num = va_arg(va, unsigned long);
+					div = 1UL << (sizeof(long) * 8 - 4);
+				} else {
+					num = va_arg(va, unsigned int);
+					div = 0x10000000;
+				}
 				if (!num) {
 					out_dgt(info, 0);
 				} else {
-					for (div = 0x10000000; div; div /= 0x10)
+					for (; div; div /= 0x10)
 						div_out(info, &num, div);
 				}
 				break;
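
The start-divider reasoning in this reply can be sketched in isolation: 10^9 is the largest power of ten below 2^32, while a 64-bit long needs 10^19, which is what "div *= div * 10" produces from 10^9. The code below is a hedged stand-alone illustration, not the patch itself; dec_start_div and format_udec are hypothetical names, division is used instead of the repeated subtraction in div_out, and the caller is assumed to supply a buffer of at least 21 bytes.

```c
#include <stdbool.h>

/* Largest power of ten that the value being printed can reach. */
static unsigned long dec_start_div(bool islong)
{
	unsigned long div = 1000000000;		/* 10^9 covers any 32-bit value */

	if (islong && sizeof(long) > 4)
		div *= div * 10;		/* 10^9 * 10^10 = 10^19 */
	return div;
}

/* Writes num in decimal into buf and returns the number of digits. */
static int format_udec(char *buf, unsigned long num, bool islong)
{
	unsigned long div;
	int len = 0;

	if (!islong)
		num = (unsigned int)num;	/* like va_arg(va, unsigned int) */

	for (div = dec_start_div(islong); div; div /= 10) {
		unsigned long digit = num / div;

		num %= div;
		if (digit || len || div == 1)	/* suppress leading zeros */
			buf[len++] = (char)('0' + digit);
	}
	buf[len] = '\0';
	return len;
}
```

On a 32-bit build the sizeof(long) test is a compile-time constant and the multiplication is never executed, so both cases start at 10^9.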

On 21/11/2016 16:56, Andre Przywara wrote:
Hi Alex,
On 21/11/16 15:42, Alexander Graf wrote:
On 20/11/2016 15:56, Andre Przywara wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
Signed-off-by: Andre Przywara andre.przywara@arm.com
lib/tiny-printf.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-)
diff --git a/lib/tiny-printf.c b/lib/tiny-printf.c
index 30ac759..b01099d 100644
--- a/lib/tiny-printf.c
+++ b/lib/tiny-printf.c
@@ -38,8 +38,8 @@ static void out_dgt(struct printf_info *info, char dgt)
 	info->zs = 1;
 }
 
-static void div_out(struct printf_info *info, unsigned int *num,
-		    unsigned int div)
+static void div_out(struct printf_info *info, unsigned long *num,
+		    unsigned long div)
 {
 	unsigned char dgt = 0;
 
@@ -56,9 +56,9 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 {
 	char ch;
 	char *p;
-	unsigned int num;
+	unsigned long num;
 	char buf[12];
-	unsigned int div;
+	unsigned long div;
 
 	while ((ch = *(fmt++))) {
 		if (ch != '%') {
@@ -66,8 +66,12 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 		} else {
 			bool lz = false;
 			int width = 0;
+			bool islong = false;
 
 			ch = *(fmt++);
+			if (ch == '-')
+				ch = *(fmt++);

What does this do? I don't see '-' mentioned in the patch description.

Argh, apparently the comment in the commit message got lost during a patch reshuffle. Sorry, will re-add it.

We need it because some SPL printf uses '-', just ignoring it here seems fine for SPL purposes though.

+
 			if (ch == '0') {
 				ch = *(fmt++);
 				lz = 1;
@@ -80,6 +84,11 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 					ch = *fmt++;
 				}
 			}
+			if (ch == 'l') {
+				ch = *(fmt++);
+				islong = true;
+			}
+
 			info->bf = buf;
 			p = info->bf;
 			info->zs = 0;
@@ -89,24 +98,38 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 				goto abort;
 			case 'u':
 			case 'd':
-				num = va_arg(va, unsigned int);
-				if (ch == 'd' && (int)num < 0) {
-					num = -(int)num;
+				div = 1000000000;
+				if (islong) {

Check here if sizeof(long) > 4, so that the whole branch gets optimized away on 32bit.

Good idea.

+					num = va_arg(va, unsigned long);
+					if (sizeof(long) > 4)
+						div *= div * 10;
+				} else {
+					num = va_arg(va, unsigned int);
+				}
+
+				if (ch == 'd' && (long)num < 0) {
+					num = -(long)num;

Num is a long now and before. So if you have a 32bit signed input, it will sign extend incorrectly here. You need an additional check

  if (islong)
    num = -(long)num;
  else
    num = -(int)num;

Let's hope the compiler on 32bit is smart enough to know that it can combine those two cases :).

 					out(info, '-');
 				}
 				if (!num) {
 					out_dgt(info, 0);
 				} else {
-					for (div = 1000000000; div; div /= 10)
+					for (; div; div /= 10)

Any particular reason for that change?

This algorithm so far only cared for 32-bit values, so it set the start divider to 1E9. This is not sufficient for 64-bit longs in AA64. So I compute div above, depending on the actual size of long.
Ah, I missed the div *= up there. Sure, then it makes sense. Btw, have you checked that the compiler is smart enough to do constant propagation here? Multiplications can be very expensive.
Alex

On Sun, 20 Nov 2016 14:56:59 +0000 Andre Przywara andre.przywara@arm.com wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
I can't help but notice that the changes of this kind are in a way defeating the original purpose of tiny-printf. And it is surely not the first patch adding features to tiny-printf. I guess, in the end we may end up with a large and bloated printf implementation again :-)
A possible solution might be just a strict check when parsing format modifiers and abort with an error message (yeah, this will introduce some size increase, but hopefully the last one). This way we acknowledge the fact that tiny-printf is a reduced incomplete implementation, and that the callers need to take this into account.
As for the "l" modifier. How much does it add to the code size? IMHO this information should be mentioned in the commit message. Can the AArch64 crash dump code be modified to avoid using it? Or can we have the "l" modifier supported on 64-bit platforms only?
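
The "strict check" idea can be illustrated with a stand-alone format-string checker. This is a hedged sketch only: fmt_supported is a hypothetical helper, and the accepted set (an optional '-', a numeric width including a leading '0', an optional 'l', and the conversions u, d, x, s, c and %) is an assumption of the example rather than the exact feature list of tiny-printf. A real implementation would do the check inside the parser and print an error instead of returning a flag.

```c
#include <stdbool.h>

/* Returns false at the first conversion the reduced printf would not handle. */
static bool fmt_supported(const char *fmt)
{
	char ch;

	while ((ch = *fmt++)) {
		if (ch != '%')
			continue;

		ch = *fmt++;
		if (ch == '-')			/* parsed, but may be ignored */
			ch = *fmt++;
		while (ch >= '0' && ch <= '9')	/* zero flag and field width */
			ch = *fmt++;
		if (ch == 'l')			/* a single 'l' only, no "ll" */
			ch = *fmt++;

		switch (ch) {
		case 'u':
		case 'd':
		case 'x':
		case 's':
		case 'c':
		case '%':
			break;
		default:
			/* Unknown conversion, or the string ended after '%'. */
			return false;
		}
	}
	return true;
}
```

Used as a debug-build assertion on format strings, such a check would make callers aware of the reduced feature set without growing the parser much.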

Hi,
On 23 November 2016 at 20:19, Siarhei Siamashka siarhei.siamashka@gmail.com wrote:
On Sun, 20 Nov 2016 14:56:59 +0000 Andre Przywara andre.przywara@arm.com wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
I can't help but notice that the changes of this kind are in a way defeating the original purpose of tiny-printf. And it is surely not the first patch adding features to tiny-printf. I guess, in the end we may end up with a large and bloated printf implementation again :-)
A possible solution might be just a strict check when parsing format modifiers and abort with an error message (yeah, this will introduce some size increase, but hopefully the last one). This way we acknowledge the fact that tiny-printf is a reduced incomplete implementation, and that the callers need to take this into account.
As for the "l" modifier. How much does it add to the code size? IMHO this information should be mentioned in the commit message. Can the AArch64 crash dump code be modified to avoid using it? Or can we have the "l" modifier supported on 64-bit platforms only?
Signed-off-by: Andre Przywara andre.przywara@arm.com
lib/tiny-printf.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-)
I think I tested this patch as adding 36 bytes on Thumb2 so not too terrible. But I do agree with the sentiment.
Why is aarch64 using tiny-printf? Surely all those chips have heaps of space?!
Regards, Simon

On 27/11/16 17:02, Simon Glass wrote:
Hi,
On 23 November 2016 at 20:19, Siarhei Siamashka siarhei.siamashka@gmail.com wrote:
On Sun, 20 Nov 2016 14:56:59 +0000 Andre Przywara andre.przywara@arm.com wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
I can't help but notice that the changes of this kind are in a way defeating the original purpose of tiny-printf. And it is surely not the first patch adding features to tiny-printf. I guess, in the end we may end up with a large and bloated printf implementation again :-)
A possible solution might be just a strict check when parsing format modifiers and abort with an error message (yeah, this will introduce some size increase, but hopefully the last one). This way we acknowledge the fact that tiny-printf is a reduced incomplete implementation, and that the callers need to take this into account.
As for the "l" modifier. How much does it add to the code size? IMHO this information should be mentioned in the commit message. Can the AArch64 crash dump code be modified to avoid using it? Or can we have the "l" modifier supported on 64-bit platforms only?
Signed-off-by: Andre Przywara andre.przywara@arm.com
lib/tiny-printf.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-)
I think I tested this patch as adding 36 bytes on Thumb2 so not too terrible. But I do agree with the sentiment.
Thanks for checking that!
Why is aarch64 using tiny-printf? Surely all those chips have heaps of space?!
Ha, one would hope so, right? But in fact this is basically an existing 32-bit Allwinner chip with 64-bit cores - mostly because they can ;-). Replacing Cortex-A7 cores with A53s seems to be a common exercise.
But the point is that even the most capable chip needs to be booted somehow, and here the Allwinner boot ROM still loads only 32KB into some SRAM. This hasn't changed for years, so even the 64-bit chips suffer from the same SPL space limitations. And since AArch64 does not define a tight encoding variant like Thumb, we are even more limited in our code size.
Of course this only applies to the SPL, so once we have DRAM up and an MMC driver initialized, we indeed have quite some resources available.
Cheers, Andre.

On 28/11/16 00:22, André Przywara wrote:
On 27/11/16 17:02, Simon Glass wrote:
Hi,
On 23 November 2016 at 20:19, Siarhei Siamashka siarhei.siamashka@gmail.com wrote:
On Sun, 20 Nov 2016 14:56:59 +0000 Andre Przywara andre.przywara@arm.com wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
I can't help but notice that the changes of this kind are in a way defeating the original purpose of tiny-printf. And it is surely not the first patch adding features to tiny-printf. I guess, in the end we may end up with a large and bloated printf implementation again :-)
A possible solution might be just a strict check when parsing format modifiers and abort with an error message (yeah, this will introduce some size increase, but hopefully the last one). This way we acknowledge the fact that tiny-printf is a reduced incomplete implementation, and that the callers need to take this into account.
As for the "l" modifier. How much does it add to the code size? IMHO this information should be mentioned in the commit message. Can the AArch64 crash dump code be modified to avoid using it? Or can we have the "l" modifier supported on 64-bit platforms only?
Signed-off-by: Andre Przywara andre.przywara@arm.com
lib/tiny-printf.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-)
I think I tested this patch as adding 36 bytes on Thumb2 so not too terrible. But I do agree with the sentiment.
Simon, what is your compiler? With GCC 5.3.0 and GCC 6.2.0 I get exactly 6 and 4 bytes more of .text, respectively, which is not too bad for parsing (but ignoring) two new modifiers. It turns out that (at least these two versions of) GCC are quite clever here and optimize away almost everything. Looking closer, one can see that the if and else branches become identical if sizeof(long) == sizeof(int) == 4, so the compiler happily merges the code, removes the "if (islong)" check and in turn the whole long-handling code on 32-bit. This is the patch as sent, without any further hints in the code.
If anyone really wants to save code space, I suggest switching to a newer compiler:
GCC 5.3.0:

   text    data     bss     dec     hex filename

origin/master orangepi_plus_defconfig GCC 5.3.0
  18881     488     232   19601    4c91 spl/u-boot-spl
    758       0       0     758     2f6 spl/lib/tiny-printf.o

master+tiny_printf %l,%- orangepi_plus_defconfig GCC 5.3.0
  18887     488     232   19607    4c97 spl/u-boot-spl
    758       0       0     758     2f6 spl/lib/tiny-printf.o

GCC 6.2.0:

origin/master orangepi_plus_defconfig GCC 6.2.0
  16871     488     232   17591    44b7 spl/u-boot-spl
    698       0       0     698     2ba spl/lib/tiny-printf.o

master+tiny_printf %l,%- orangepi_plus_defconfig GCC 6.2.0
  16875     488     232   17595    44bb spl/u-boot-spl
    702       0       0     702     2be spl/lib/tiny-printf.o

On 64-bit (only GCC 6.2.0) this results in more code, as expected:

HEAD of patch set w/o tiny-printf %l, pine64_plus_defconfig
  25824     392     360   26576    67d0 spl/u-boot-spl
   1542       0       0    1542     606 spl/lib/tiny-printf.o

HEAD of patch set, pine64_plus_defconfig
  25972     392     360   26724    6864 spl/u-boot-spl
   1690       0       0    1690     69a spl/lib/tiny-printf.o
So this is 148 Bytes more in .text. I can trade three simple patches that cut off 80 Bytes in sunxi/armv8 to at least offset this a bit, though this isn't really a regression, as there was no SPL64 for sunxi before. So apart from Alex' bug fix I won't change the patch, if people can live with that.
Cheers, Andre.
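
The figures in this message can be cross-checked with a few lines of arithmetic. The helpers below are purely illustrative (size_dec and text_growth are made-up names); they only restate how the "dec" column and the quoted .text deltas follow from the size output.

```c
/* Sum shown in the "dec" column of the size output. */
static unsigned int size_dec(unsigned int text, unsigned int data,
			     unsigned int bss)
{
	return text + data + bss;
}

/* Growth of the .text column between two builds, in bytes. */
static int text_growth(unsigned int before, unsigned int after)
{
	return (int)after - (int)before;
}
```

For the spl/u-boot-spl rows, text_growth(18881, 18887) gives the 6 bytes for GCC 5.3.0, text_growth(16871, 16875) the 4 bytes for GCC 6.2.0, and text_growth(25824, 25972) the 148 bytes for the 64-bit build.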

Hi Andre,
On 28 November 2016 at 18:13, André Przywara andre.przywara@arm.com wrote:
On 28/11/16 00:22, André Przywara wrote:
On 27/11/16 17:02, Simon Glass wrote:
Hi,
On 23 November 2016 at 20:19, Siarhei Siamashka siarhei.siamashka@gmail.com wrote:
On Sun, 20 Nov 2016 14:56:59 +0000 Andre Przywara andre.przywara@arm.com wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
I can't help but notice that the changes of this kind are in a way defeating the original purpose of tiny-printf. And it is surely not the first patch adding features to tiny-printf. I guess, in the end we may end up with a large and bloated printf implementation again :-)
A possible solution might be just a strict check when parsing format modifiers and abort with an error message (yeah, this will introduce some size increase, but hopefully the last one). This way we acknowledge the fact that tiny-printf is a reduced incomplete implementation, and that the callers need to take this into account.
As for the "l" modifier. How much does it add to the code size? IMHO this information should be mentioned in the commit message. Can the AArch64 crash dump code be modified to avoid using it? Or can we have the "l" modifier supported on 64-bit platforms only?
Signed-off-by: Andre Przywara andre.przywara@arm.com
lib/tiny-printf.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-)
I think I tested this patch as adding 36 bytes on Thumb2 so not too terrible. But I do agree with the sentiment.
Simon, what is your compiler?
4.9 I suspect for that test. I build with various ones as I have been caught by breaking a slightly older compiler.
With GCC 5.3.0 and GCC 6.2.0 I get exactly 6 and 4 bytes more of .text, respectively, which is not too bad for parsing (but ignoring) two new modifiers. It turns out that (at least these two versions of) GCC are quite clever here and optimize away almost everything. Looking closer, one can see that the if and else branches become identical if sizeof(long) == sizeof(int) == 4, so the compiler happily merges the code, removes the "if (islong)" check and in turn the whole long-handling code on 32-bit. This is the patch as sent, without any further hints in the code.
If anyone really wants to save code space, I suggest switching to a newer compiler:
GCC 5.3.0:

   text    data     bss     dec     hex filename

origin/master orangepi_plus_defconfig GCC 5.3.0
  18881     488     232   19601    4c91 spl/u-boot-spl
    758       0       0     758     2f6 spl/lib/tiny-printf.o

master+tiny_printf %l,%- orangepi_plus_defconfig GCC 5.3.0
  18887     488     232   19607    4c97 spl/u-boot-spl
    758       0       0     758     2f6 spl/lib/tiny-printf.o

GCC 6.2.0:

origin/master orangepi_plus_defconfig GCC 6.2.0
  16871     488     232   17591    44b7 spl/u-boot-spl
    698       0       0     698     2ba spl/lib/tiny-printf.o

master+tiny_printf %l,%- orangepi_plus_defconfig GCC 6.2.0
  16875     488     232   17595    44bb spl/u-boot-spl
    702       0       0     702     2be spl/lib/tiny-printf.o

On 64-bit (only GCC 6.2.0) this results in more code, as expected:

HEAD of patch set w/o tiny-printf %l, pine64_plus_defconfig
  25824     392     360   26576    67d0 spl/u-boot-spl
   1542       0       0    1542     606 spl/lib/tiny-printf.o

HEAD of patch set, pine64_plus_defconfig
  25972     392     360   26724    6864 spl/u-boot-spl
   1690       0       0    1690     69a spl/lib/tiny-printf.o
So this is 148 Bytes more in .text. I can trade three simple patches that cut off 80 Bytes in sunxi/armv8 to at least offset this a bit, though this isn't really a regression, as there was no SPL64 for sunxi before. So apart from Alex' bug fix I won't change the patch, if people can live with that.
That seems fine to me. Also this useful info could go in a note in your patch.
Cheers, Andre.
Regards, Simon

On 24/11/16 03:19, Siarhei Siamashka wrote:
On Sun, 20 Nov 2016 14:56:59 +0000 Andre Przywara andre.przywara@arm.com wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
I can't help but notice that the changes of this kind are in a way defeating the original purpose of tiny-printf. And it is surely not the first patch adding features to tiny-printf. I guess, in the end we may end up with a large and bloated printf implementation again :-)
While I appreciate the fight against bloat, I am not sure severely hacked or crippled code is much better. We are not talking about KBs here, it's probably only a two-digit number of bytes. Frankly our existing tiny-printf implementation apparently did not fully live up to its promise of replacing printf with a smaller implementation. It's just that the missing code coverage has hidden this so far. So actually we would need to add this code increase here to the original size comparison.
In the end we can't really simplify the code beyond a certain point - otherwise return 0; would be an even smaller implementation.
But see below ...
A possible solution might be just a strict check when parsing format modifiers and abort with an error message (yeah, this will introduce some size increase, but hopefully the last one). This way we acknowledge the fact that tiny-printf is a reduced incomplete implementation, and that the callers need to take this into account.
On 64-bit we need "l" to differentiate between 32-bit and 64-bit variables. I believe the crash dump code is shared between SPL and U-Boot proper, and we probably want to keep it that way.
As for the "l" modifier. How much does it add to the code size? IMHO this information should be mentioned in the commit message.
Yeah, good point. I will add the numbers.
Can the AArch64 crash dump code be modified to avoid using it?
I really don't want to go there.
Or can we have the "l" modifier supported on 64-bit platforms only?
That sounds more like an option. On 32-bit "l" is pretty useless, and we don't need "ll", which I consider a reasonable limitation. We could just ignore "l", like we do with "-".
But on 64-bit that's the way to differentiate between standard integers and addresses (aka longs), and we need that there. I'd rather avoid #ifdefs inside the routine, so I'd try Alex' suggestion of adding " && sizeof(long) > 4" to let the compiler optimize this away. Or I refactor this code into a separate (ifdef'ed) function.
Let me check.
Cheers, Andre.
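
The "&& sizeof(long) > 4" approach mentioned above could look roughly like the following for the hex case. This is a hedged sketch under stated assumptions, not the code that was eventually merged: format_hex is a hypothetical helper, it uses division rather than div_out, and the caller is assumed to provide a buffer of at least 17 bytes.

```c
#include <stdbool.h>

/* Writes num in lower-case hex into buf and returns the number of digits. */
static int format_hex(char *buf, unsigned long num, bool islong)
{
	/*
	 * sizeof(long) is a compile-time constant: on 32-bit targets "wide"
	 * is always false, so the compiler can drop the long-only path
	 * without any #ifdef in the routine.
	 */
	bool wide = islong && sizeof(long) > 4;
	unsigned long div;
	int len = 0;

	if (wide) {
		div = 1UL << (sizeof(long) * 8 - 4);	/* top hex digit of a long */
	} else {
		num &= 0xffffffffUL;			/* treat as unsigned int */
		div = 0x10000000;
	}

	for (; div; div /= 0x10) {
		unsigned long digit = num / div;

		num %= div;
		if (digit || len || div == 1)		/* suppress leading zeros */
			buf[len++] = "0123456789abcdef"[digit];
	}
	buf[len] = '\0';
	return len;
}
```

With this shape, "l" is effectively ignored on 32-bit builds, which matches the idea of treating it like "-" there.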

The UL() macro is pretty useful in sharing constants between assembly and C files while still being able to specify a type for C. Move the macro from an armv8 specific header into a common header file to be able to use it by arm code (for instance) as well.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/include/asm/armv8/mmu.h | 8 -------- include/common.h | 10 +++++++++- 2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index aa0f3c4..e9b4cdb 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -8,14 +8,6 @@ #ifndef _ASM_ARMV8_MMU_H_ #define _ASM_ARMV8_MMU_H_
-#ifdef __ASSEMBLY__ -#define _AC(X, Y) X -#else -#define _AC(X, Y) (X##Y) -#endif - -#define UL(x) _AC(x, UL) - /***************************************************************/ /* * The following definitions are related each other, shoud be diff --git a/include/common.h b/include/common.h index a8d833b..5fcd5f5 100644 --- a/include/common.h +++ b/include/common.h @@ -15,6 +15,8 @@ typedef volatile unsigned long vu_long; typedef volatile unsigned short vu_short; typedef volatile unsigned char vu_char;
+#define _AC(X, Y) (X##Y) + #include <config.h> #include <errno.h> #include <asm-offsets.h> @@ -936,7 +938,11 @@ int cpu_disable(int nr); int cpu_release(int nr, int argc, char * const argv[]); #endif
-#endif /* __ASSEMBLY__ */ +#else /* __ASSEMBLY__ */ + +#define _AC(X, Y) X + +#endif /* __ASSEMBLY__ */
#ifdef CONFIG_PPC /* @@ -948,6 +954,8 @@ int cpu_release(int nr, int argc, char * const argv[]);
/* Put only stuff here that the assembler can digest */
+#define UL(x) _AC(x, UL) + #ifdef CONFIG_POST #define CONFIG_HAS_POST #ifndef CONFIG_POST_ALT_LIST
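
As a rough illustration of what the UL() macro does on the C side, here is a minimal sketch. The _AC()/UL() definitions mirror the ones in the patch; EXAMPLE_PAGE_SHIFT and EXAMPLE_PAGE_SIZE are hypothetical names used only for this example and do not appear in U-Boot.

```c
#include <assert.h>

/*
 * Same shape as the macros moved by the patch: when the file is
 * preprocessed for the assembler the suffix is dropped, in C it is
 * pasted onto the constant to give it a type.
 */
#ifdef __ASSEMBLY__
#define _AC(X, Y)	X
#else
#define _AC(X, Y)	(X##Y)
#endif

#define UL(x)		_AC(x, UL)

/* Hypothetical constants, for illustration only. */
#define EXAMPLE_PAGE_SHIFT	12
#define EXAMPLE_PAGE_SIZE	(UL(1) << EXAMPLE_PAGE_SHIFT)

/* Returns 1 if UL(1) really has type unsigned long in C (needs C11). */
static int ul_has_type_unsigned_long(void)
{
	return _Generic(UL(1), unsigned long: 1, default: 0);
}
```

In a .S file built with __ASSEMBLY__ defined, the same EXAMPLE_PAGE_SIZE would expand to (1 << 12), which the assembler can digest, whereas a literal 1UL would not assemble.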

On 20/11/2016 15:57, Andre Przywara wrote:
The UL() macro is pretty useful in sharing constants between assembly and C files while still being able to specify a type for C. Move the macro from an armv8 specific header into a common header file to be able to use it by arm code (for instance) as well.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Reviewed-by: Alexander Graf agraf@suse.de
Alex

Since entry_point and load_addr are addresses, they should be represented as longs to cover the whole address space and to avoid warning when compiling the SPL in 64-bit. Also adjust debug prints to add the 'l' specifier, where needed.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/cpu/armv7/omap-common/boot-common.c | 2 +- arch/arm/mach-tegra/spl.c | 2 +- common/spl/spl.c | 8 ++++---- common/spl/spl_mmc.c | 2 +- include/spl.h | 4 ++-- 5 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm/cpu/armv7/omap-common/boot-common.c b/arch/arm/cpu/armv7/omap-common/boot-common.c index 385310b..7ae3d80 100644 --- a/arch/arm/cpu/armv7/omap-common/boot-common.c +++ b/arch/arm/cpu/armv7/omap-common/boot-common.c @@ -228,7 +228,7 @@ void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image)
u32 boot_params = *((u32 *)OMAP_SRAM_SCRATCH_BOOT_PARAMS);
- debug("image entry point: 0x%X\n", spl_image->entry_point); + debug("image entry point: 0x%lX\n", spl_image->entry_point); /* Pass the saved boot_params from rom code */ image_entry((u32 *)boot_params); } diff --git a/arch/arm/mach-tegra/spl.c b/arch/arm/mach-tegra/spl.c index e0f9d5b..41c88cb 100644 --- a/arch/arm/mach-tegra/spl.c +++ b/arch/arm/mach-tegra/spl.c @@ -42,7 +42,7 @@ u32 spl_boot_device(void)
void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image) { - debug("image entry point: 0x%X\n", spl_image->entry_point); + debug("image entry point: 0x%lX\n", spl_image->entry_point);
start_cpu((u32)spl_image->entry_point); halt_avp(); diff --git a/common/spl/spl.c b/common/spl/spl.c index bdb165a..835eed6 100644 --- a/common/spl/spl.c +++ b/common/spl/spl.c @@ -115,7 +115,7 @@ int spl_parse_image_header(struct spl_image_info *spl_image, } spl_image->os = image_get_os(header); spl_image->name = image_get_name(header); - debug("spl: payload image: %.*s load addr: 0x%x size: %d\n", + debug("spl: payload image: %.*s load addr: 0x%lx size: %d\n", (int)sizeof(spl_image->name), spl_image->name, spl_image->load_addr, spl_image->size); } else { @@ -140,7 +140,7 @@ int spl_parse_image_header(struct spl_image_info *spl_image, spl_image->load_addr = CONFIG_SYS_LOAD_ADDR; spl_image->entry_point = CONFIG_SYS_LOAD_ADDR; spl_image->size = end - start; - debug("spl: payload zImage, load addr: 0x%x size: %d\n", + debug("spl: payload zImage, load addr: 0x%lx size: %d\n", spl_image->load_addr, spl_image->size); return 0; } @@ -164,9 +164,9 @@ __weak void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image) typedef void __noreturn (*image_entry_noargs_t)(void);
image_entry_noargs_t image_entry = - (image_entry_noargs_t)(unsigned long)spl_image->entry_point; + (image_entry_noargs_t)spl_image->entry_point;
- debug("image entry point: 0x%X\n", spl_image->entry_point); + debug("image entry point: 0x%lX\n", spl_image->entry_point); image_entry(); }
diff --git a/common/spl/spl_mmc.c b/common/spl/spl_mmc.c index c674e61..4d0af2d 100644 --- a/common/spl/spl_mmc.c +++ b/common/spl/spl_mmc.c @@ -36,7 +36,7 @@ static int mmc_load_legacy(struct spl_image_info *spl_image, struct mmc *mmc, /* Read the header too to avoid extra memcpy */ count = blk_dread(mmc_get_blk_desc(mmc), sector, image_size_sectors, (void *)(ulong)spl_image->load_addr); - debug("read %x sectors to %x\n", image_size_sectors, + debug("read %x sectors to %lx\n", image_size_sectors, spl_image->load_addr); if (count != image_size_sectors) return -EIO; diff --git a/include/spl.h b/include/spl.h index e080a82..2f8c052 100644 --- a/include/spl.h +++ b/include/spl.h @@ -23,8 +23,8 @@ struct spl_image_info { const char *name; u8 os; - u32 load_addr; - u32 entry_point; + ulong load_addr; + ulong entry_point; u32 size; u32 flags; };
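
To make the motivation concrete, the following sketch shows what goes wrong when an address above 4GB is kept in a 32-bit field, and how the 'l' length modifier matches the widened field. The struct and function names are hypothetical stand-ins, with uint32_t and unsigned long playing the roles of U-Boot's u32 and ulong; this is not the actual SPL code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the struct before and after the patch. */
struct image_info_old { uint32_t load_addr; };
struct image_info_new { unsigned long load_addr; };

/* Storing an address in the old 32-bit field silently drops the upper bits. */
static uint32_t store_as_u32(uint64_t addr)
{
	struct image_info_old img;

	img.load_addr = (uint32_t)addr;
	return img.load_addr;
}

/* With an unsigned long field the format string needs the 'l' modifier. */
static int format_load_addr(char *buf, size_t len,
			    const struct image_info_new *img)
{
	return snprintf(buf, len, "load addr: 0x%lx", img->load_addr);
}
```

Passing an unsigned long to a plain "%x" is what triggers the format warnings in a 64-bit build, which is why the debug() calls change together with the struct.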

On 20/11/2016 15:57, Andre Przywara wrote:
Since entry_point and load_addr are addresses, they should be represented as longs to cover the whole address space and to avoid warning when compiling the SPL in 64-bit. Also adjust debug prints to add the 'l' specifier, where needed.
Signed-off-by: Andre Przywara andre.przywara@arm.com
I'm surprised York didn't stumble over this yet :).
Reviewed-by: Alexander Graf agraf@suse.de
Alex

On 11/21/2016 07:48 AM, Alexander Graf wrote:
On 20/11/2016 15:57, Andre Przywara wrote:
Since entry_point and load_addr are addresses, they should be represented as longs to cover the whole address space and to avoid warning when compiling the SPL in 64-bit. Also adjust debug prints to add the 'l' specifier, where needed.
Signed-off-by: Andre Przywara andre.przywara@arm.com
I'm surprised York didn't stumble over this yet :).
I guess debug is not turned on by default for compile testing.
York

The sunxi DRAM setup code needs an sdelay() implementation, which wasn't defined for armv8 so far. Shamelessly copy the armv7 version and adjust it to work in AArch64.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c index e06c3cc..e82e9cf 100644 --- a/arch/arm/cpu/armv8/cpu.c +++ b/arch/arm/cpu/armv8/cpu.c @@ -16,6 +16,19 @@ #include <asm/system.h> #include <linux/compiler.h>
+/************************************************************ + * sdelay() - simple spin loop. Will be constant time as + * its generally used in bypass conditions only. This + * is necessary until timers are accessible. + * + * not inline to increase chances its in cache when called + *************************************************************/ +void sdelay(unsigned long loops) +{ + __asm__ volatile ("1:\n" "subs %0, %1, #1\n" + "b.ne 1b":"=r" (loops):"0"(loops)); +} + int cleanup_before_linux(void) { /*

On 20/11/2016 15:57, Andre Przywara wrote:
The sunxi DRAM setup code needs an sdelay() implementation, which wasn't defined for armv8 so far. Shamelessly copy the armv7 version and adjust it to work in AArch64.
Signed-off-by: Andre Przywara andre.przywara@arm.com
I don't think it hurts to write this in C - and I also doubt that inlining has any negative effect.
Something along the lines of
static inline void sdelay(...) { for (; loops; loops--) asm volatile(""); }
inside a header should do the trick as well and is much more readable.
Alex

On Mon, 21 Nov 2016 16:52:47 +0100 Alexander Graf agraf@suse.de wrote:
On 20/11/2016 15:57, Andre Przywara wrote:
The sunxi DRAM setup code needs an sdelay() implementation, which wasn't defined for armv8 so far. Shamelessly copy the armv7 version and adjust it to work in AArch64.
Signed-off-by: Andre Przywara andre.przywara@arm.com
I don't think it hurts to write this in C - and I also doubt that inlining has any negative effect.
Something along the lines of
static inline void sdelay(...) { for (; loops; loops--) asm volatile(""); }
inside a header should do the trick as well and is much more readable.
Unfortunately the performance of the generated C code is very unpredictable. Depending on the optimization settings, the compiler may place the counter variable in a register or keep it on the stack.
It would be much nicer to have more predictable timings for these delays. So I like the assembly version a lot better. Naturally, when it is implemented correctly.
Alex

On Sun, 20 Nov 2016 14:57:02 +0000 Andre Przywara andre.przywara@arm.com wrote:
The sunxi DRAM setup code needs an sdelay() implementation, which wasn't defined for armv8 so far. Shamelessly copy the armv7 version and adjust it to work in AArch64.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c index e06c3cc..e82e9cf 100644 --- a/arch/arm/cpu/armv8/cpu.c +++ b/arch/arm/cpu/armv8/cpu.c @@ -16,6 +16,19 @@ #include <asm/system.h> #include <linux/compiler.h>
+/************************************************************
+ * sdelay() - simple spin loop. Will be constant time as
+ * its generally used in bypass conditions only. This
+ * is necessary until timers are accessible.
+ *
+ * not inline to increase chances its in cache when called
+ *************************************************************/
+void sdelay(unsigned long loops)
+{
+ __asm__ volatile ("1:\n" "subs %0, %1, #1\n"
+ "b.ne 1b":"=r" (loops):"0"(loops));
This inline assembly needs "cc" in the clobber list. Also don't we want to just use a single register for the counter ("subs %0, %0, #1") rather than trying to construct something excessively complicated and possibly fragile?
The https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html page provides some information.
+}
int cleanup_before_linux(void) { /*

On 24/11/16 01:25, Siarhei Siamashka wrote:
Hi Siarhei,
On Sun, 20 Nov 2016 14:57:02 +0000 Andre Przywara andre.przywara@arm.com wrote:
The sunxi DRAM setup code needs an sdelay() implementation, which wasn't defined for armv8 so far. Shamelessly copy the armv7 version and adjust it to work in AArch64.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c index e06c3cc..e82e9cf 100644 --- a/arch/arm/cpu/armv8/cpu.c +++ b/arch/arm/cpu/armv8/cpu.c @@ -16,6 +16,19 @@ #include <asm/system.h> #include <linux/compiler.h>
+/************************************************************
+ * sdelay() - simple spin loop. Will be constant time as
+ * its generally used in bypass conditions only. This
+ * is necessary until timers are accessible.
+ *
+ * not inline to increase chances its in cache when called
+ *************************************************************/
+void sdelay(unsigned long loops)
+{
+ __asm__ volatile ("1:\n" "subs %0, %1, #1\n"
+ "b.ne 1b":"=r" (loops):"0"(loops));
This inline assembly needs "cc" in the clobber list. Also don't we want to just use a single register for the counter ("subs %0, %0, #1") rather than trying to construct something excessively complicated and possibly fragile?
Please don't shoot the messenger: this is the version copied from ARMv7. I noticed the redundant register as well, but didn't dare to touch it (assuming some higher wisdom behind it). And good catch on the cc clobber!
Cheers, Andre.
The https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html page provides some information.
+}
int cleanup_before_linux(void) { /*
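
For reference, the two review comments above (add "cc" to the clobber list, use a single read-write operand) could be combined roughly as follows. This is only a sketch, not the version that was eventually merged: sdelay_sketch is a hypothetical name, and the non-AArch64 branch is a plain C fallback in the spirit of Alex's suggestion so that the example also builds on other hosts.

```c
/*
 * Simple spin loop. Kept out of line, as in the original comment.
 * Note: as with the original, a count of 0 is not handled specially
 * in the assembly variant and would wrap around.
 */
__attribute__((noinline)) void sdelay_sketch(unsigned long loops)
{
#if defined(__aarch64__)
	__asm__ volatile("1:\n"
			 "subs %0, %0, #1\n"	/* decrement and set flags */
			 "b.ne 1b"		/* loop until zero */
			 : "+r" (loops)		/* one read-write register */
			 :
			 : "cc");		/* flags are modified */
#else
	/* Portable fallback: the empty asm keeps the loop from being removed. */
	for (; loops; loops--)
		__asm__ volatile("");
#endif
}
```

The "+r" constraint tells the compiler that the same register is both read and written, which replaces the "=r"/"0" pair of the original, and the "cc" clobber records that subs changes the condition flags.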

The boot0 hook we have so far is applied _after_ the initial branch to the "reset" entry point. An upcoming change requires even this branch to be changed, so we apply the hook macro at the earliest point, and have the branch in the hook file as well. This is no functional change at this point, just refactoring to simplify upcoming patches.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/cpu/armv8/start.S | 4 ++-- arch/arm/include/asm/arch-sunxi/boot0.h | 1 + 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S index 19c771d..1ccb191 100644 --- a/arch/arm/cpu/armv8/start.S +++ b/arch/arm/cpu/armv8/start.S @@ -19,8 +19,6 @@
.globl _start _start: - b reset - #ifdef CONFIG_ENABLE_ARM_SOC_BOOT0_HOOK /* * Various SoCs need something special and SoC-specific up front in @@ -29,6 +27,8 @@ _start: */ #include <asm/arch/boot0.h> ARM_SOC_BOOT0_HOOK +#else + b reset #endif
.align 3 diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h index ea5675e..6f28d63 100644 --- a/arch/arm/include/asm/arch-sunxi/boot0.h +++ b/arch/arm/include/asm/arch-sunxi/boot0.h @@ -9,6 +9,7 @@
/* reserve space for BOOT0 header information */ #define ARM_SOC_BOOT0_HOOK \ + b reset; \ .space 1532
#endif /* __BOOT0_H */

For prepending some board specific header area to U-Boot images we were so far including a header file with a macro definition containing the actual header specification. This works fine if there are just a few statements and if there is only one alternative. However adding more complex code quickly gets messy with this approach, so let's just drop that intermediate macro and let the #include actually insert the code directly. This converts the callers and the callees, but doesn't change anything at this point.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/cpu/armv8/start.S | 1 - arch/arm/include/asm/arch-bcm235xx/boot0.h | 8 +------- arch/arm/include/asm/arch-bcm281xx/boot0.h | 8 +------- arch/arm/include/asm/arch-sunxi/boot0.h | 8 +------- arch/arm/lib/vectors.S | 1 - 5 files changed, 3 insertions(+), 23 deletions(-)
diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S index 1ccb191..2d10746 100644 --- a/arch/arm/cpu/armv8/start.S +++ b/arch/arm/cpu/armv8/start.S @@ -26,7 +26,6 @@ _start: * use it here. */ #include <asm/arch/boot0.h> -ARM_SOC_BOOT0_HOOK #else b reset #endif diff --git a/arch/arm/include/asm/arch-bcm235xx/boot0.h b/arch/arm/include/asm/arch-bcm235xx/boot0.h index 7e72882..9ff90b8 100644 --- a/arch/arm/include/asm/arch-bcm235xx/boot0.h +++ b/arch/arm/include/asm/arch-bcm235xx/boot0.h @@ -4,12 +4,6 @@ * SPDX-License-Identifier: GPL-2.0+ */
-#ifndef __BOOT0_H -#define __BOOT0_H - /* BOOT0 header information */ -#define ARM_SOC_BOOT0_HOOK \ - .word 0xbabeface; \ + .word 0xbabeface; .word _end - _start - -#endif /* __BOOT0_H */ diff --git a/arch/arm/include/asm/arch-bcm281xx/boot0.h b/arch/arm/include/asm/arch-bcm281xx/boot0.h index 7e72882..9ff90b8 100644 --- a/arch/arm/include/asm/arch-bcm281xx/boot0.h +++ b/arch/arm/include/asm/arch-bcm281xx/boot0.h @@ -4,12 +4,6 @@ * SPDX-License-Identifier: GPL-2.0+ */
-#ifndef __BOOT0_H -#define __BOOT0_H - /* BOOT0 header information */ -#define ARM_SOC_BOOT0_HOOK \ - .word 0xbabeface; \ + .word 0xbabeface; .word _end - _start - -#endif /* __BOOT0_H */ diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h index 6f28d63..6a13db5 100644 --- a/arch/arm/include/asm/arch-sunxi/boot0.h +++ b/arch/arm/include/asm/arch-sunxi/boot0.h @@ -4,12 +4,6 @@ * SPDX-License-Identifier: GPL-2.0+ */
-#ifndef __BOOT0_H -#define __BOOT0_H - /* reserve space for BOOT0 header information */ -#define ARM_SOC_BOOT0_HOOK \ - b reset; \ + b reset .space 1532 - -#endif /* __BOOT0_H */ diff --git a/arch/arm/lib/vectors.S b/arch/arm/lib/vectors.S index 5cc132b..9fe7415 100644 --- a/arch/arm/lib/vectors.S +++ b/arch/arm/lib/vectors.S @@ -67,7 +67,6 @@ _start: * use it here. */ #include <asm/arch/boot0.h> -ARM_SOC_BOOT0_HOOK #endif
/*

The ENABLE_ARM_SOC_BOOT0_HOOK option is a generic option shared with other boards. To allow alternative code to be inserted, we create another, now function specific config symbol on top of it to simplify later additions. No functional change at this time.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- board/sunxi/Kconfig | 9 +++++++++ configs/pine64_plus_defconfig | 2 +- 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index e1d4ab1..0cd57a2 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -133,6 +133,15 @@ config MACH_SUN8I bool default y if MACH_SUN8I_A23 || MACH_SUN8I_A33 || MACH_SUN8I_H3 || MACH_SUN8I_A83T
+config RESERVE_ALLWINNER_BOOT0_HEADER + bool "reserve space for Allwinner boot0 header" + select ENABLE_ARM_SOC_BOOT0_HOOK + ---help--- + Prepend a 1536 byte (empty) header to the U-Boot image file, to be + filled with magic values post build. The Allwinner provided boot0 + blob relies on this information to load and execute U-Boot. + Only needed on 64-bit Allwinner boards so far when using boot0. + config DRAM_TYPE int "sunxi dram type" depends on MACH_SUN8I_A83T diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig index 6d0198f..ea53b96 100644 --- a/configs/pine64_plus_defconfig +++ b/configs/pine64_plus_defconfig @@ -1,5 +1,5 @@ CONFIG_ARM=y -CONFIG_ENABLE_ARM_SOC_BOOT0_HOOK=y +CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y CONFIG_ARCH_SUNXI=y CONFIG_MACH_SUN50I=y CONFIG_DRAM_CLK=672

Hi Andre,
On Sun, Nov 20, 2016 at 02:57:05PM +0000, Andre Przywara wrote:
The ENABLE_ARM_SOC_BOOT0_HOOK option is a generic option shared with other boards. To allow alternative code to be inserted, we create another, now function specific config symbol on top of it to simplify later additions. No functional change at this time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 9 +++++++++ configs/pine64_plus_defconfig | 2 +- 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index e1d4ab1..0cd57a2 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -133,6 +133,15 @@ config MACH_SUN8I bool default y if MACH_SUN8I_A23 || MACH_SUN8I_A33 || MACH_SUN8I_H3 || MACH_SUN8I_A83T
+config RESERVE_ALLWINNER_BOOT0_HEADER
+ bool "reserve space for Allwinner boot0 header"
+ select ENABLE_ARM_SOC_BOOT0_HOOK
+ ---help---
+ Prepend a 1536 byte (empty) header to the U-Boot image file, to be
+ filled with magic values post build. The Allwinner provided boot0
+ blob relies on this information to load and execute U-Boot.
+ Only needed on 64-bit Allwinner boards so far when using boot0.
Is there a reason you can think of to disable it?
If not, you should consider making this enabled by default, so that we don't enable it in all the defconfig for no particular reason.
Maxime

Hi Maxime,
thanks for having a look!
On 21/11/16 07:27, Maxime Ripard wrote:
Hi Andre,
On Sun, Nov 20, 2016 at 02:57:05PM +0000, Andre Przywara wrote:
The ENABLE_ARM_SOC_BOOT0_HOOK option is a generic option shared with other boards. To allow alternative code to be inserted, we create another, now function specific config symbol on top of it to simplify later additions. No functional change at this time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 9 +++++++++ configs/pine64_plus_defconfig | 2 +- 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index e1d4ab1..0cd57a2 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -133,6 +133,15 @@ config MACH_SUN8I bool default y if MACH_SUN8I_A23 || MACH_SUN8I_A33 || MACH_SUN8I_H3 || MACH_SUN8I_A83T
+config RESERVE_ALLWINNER_BOOT0_HEADER
+ bool "reserve space for Allwinner boot0 header"
+ select ENABLE_ARM_SOC_BOOT0_HOOK
+ ---help---
+ Prepend a 1536 byte (empty) header to the U-Boot image file, to be
+ filled with magic values post build. The Allwinner provided boot0
+ blob relies on this information to load and execute U-Boot.
+ Only needed on 64-bit Allwinner boards so far when using boot0.
Is there a reason you can think of to disable it?
We need it only for booting from boot0, so this series actually makes this whole thing obsolete. Since - apart from enlarging the U-Boot (proper) image by 1.5KB - it doesn't hurt, though, my idea was to keep it in as an option for some time until we are confident that boot0 is no longer needed.
If not, you should consider making this enabled by default, so that we don't enable it in all the defconfig for no particular reason.
I can change the logic, make the Kconfig entry "default y if ARM64", and any defconfig could then choose to say "# RESERVE_... is not set".
Does that make more sense to you?
What was your major concern about this? Having pointless options in various defconfigs?
Cheers, Andre.

On Mon, Nov 21, 2016 at 09:29:50AM +0000, Andre Przywara wrote:
Hi Maxime,
thanks for having a look!
On 21/11/16 07:27, Maxime Ripard wrote:
Hi Andre,
On Sun, Nov 20, 2016 at 02:57:05PM +0000, Andre Przywara wrote:
The ENABLE_ARM_SOC_BOOT0_HOOK option is a generic option shared with other boards. To allow alternative code to be inserted, we create another, now function specific config symbol on top of it to simplify later additions. No functional change at this time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 9 +++++++++ configs/pine64_plus_defconfig | 2 +- 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index e1d4ab1..0cd57a2 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -133,6 +133,15 @@ config MACH_SUN8I bool default y if MACH_SUN8I_A23 || MACH_SUN8I_A33 || MACH_SUN8I_H3 || MACH_SUN8I_A83T
+config RESERVE_ALLWINNER_BOOT0_HEADER
+ bool "reserve space for Allwinner boot0 header"
+ select ENABLE_ARM_SOC_BOOT0_HOOK
+ ---help---
+ Prepend a 1536 byte (empty) header to the U-Boot image file, to be
+ filled with magic values post build. The Allwinner provided boot0
+ blob relies on this information to load and execute U-Boot.
+ Only needed on 64-bit Allwinner boards so far when using boot0.
Is there a reason you can think of to disable it?
We need it only for booting from boot0, so this series actually makes this whole thing obsolete. Since - apart from enlarging the U-Boot (proper) image by 1.5KB - it doesn't hurt, though, my idea was to keep it in as an option for some time until we are confident that boot0 is no longer needed.
Then we don't need to enable it in the defconfig?
If not, you should consider making this enabled by default, so that we don't enable it in all the defconfig for no particular reason.
I can change the logic, make the Kconfig entry "default y if ARM64", and any defconfig could then choose to say "# RESERVE_... is not set".
Does that make more sense to you?
What was your major concern about this? Having pointless options in various defconfigs?
Yes, Hans was trying to avoid having too much duplication across defconfigs, at least for the common stuff, and I agree with him that we should keep the defconfigs as small as possible.
Maxime

The Allwinner A64 SoC starts execution in AArch32 mode, and both the boot ROM and Allwinner's boot0 keep running in this mode. So U-Boot gets entered in 32-bit, although we want it to run in AArch64.
By using a "magic" instruction, which happens to be an almost-NOP in AArch64 and a branch in AArch32, we differentiate between being entered in 64-bit or 32-bit mode. If in 64-bit mode, we proceed with the branch to reset, but in 32-bit mode we trigger an RMR write to bring the core into AArch64/EL3 and re-enter U-Boot at CONFIG_SYS_TEXT_BASE. This allows a 64-bit U-Boot to be both entered in 32 and 64-bit mode, so we can use the same start code for the SPL and the U-Boot proper.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/include/asm/arch-sunxi/boot0.h | 27 +++++++++++++++++++++++++++ board/sunxi/Kconfig | 5 +++++ 2 files changed, 32 insertions(+)
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h index 6a13db5..c31a2af 100644 --- a/arch/arm/include/asm/arch-sunxi/boot0.h +++ b/arch/arm/include/asm/arch-sunxi/boot0.h @@ -4,6 +4,33 @@ * SPDX-License-Identifier: GPL-2.0+ */
+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) /* reserve space for BOOT0 header information */ b reset .space 1532 +#elif defined(CONFIG_ARM_BOOT_HOOK_RMR) +/* switch into AArch64 if needed */ + tst x0, x0 // this is "b #0x84" in ARM + b reset + .space 0x7c + .word 0xe3a01617 // mov r1, #0x1700000 + .word 0xe38110a0 // orr r1, r1, #0xa0 + .word 0xe59f0020 // ldr r0, [pc, #32] + .word 0xe5810000 // str r0, [r1] + .word 0xf57ff04f // dsb sy + .word 0xf57ff06f // isb sy + .word 0xee1c0f50 // mrc 15, 0, r0, cr12, cr0, {2} + .word 0xe3800003 // orr r0, r0, #3 + .word 0xee0c0f50 // mcr 15, 0, r0, cr12, cr0, {2} + .word 0xf57ff06f // isb sy + .word 0xe320f003 // wfi + .word 0xeafffffd // b @wfi +#ifdef CONFIG_SPL_BUILD + .word CONFIG_SPL_TEXT_BASE +#else + .word CONFIG_SYS_TEXT_BASE +#endif +#else +/* normal execution */ + b reset +#endif diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index 0cd57a2..ba72e76 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -142,6 +142,11 @@ config RESERVE_ALLWINNER_BOOT0_HEADER blob relies on this information to load and execute U-Boot. Only needed on 64-bit Allwinner boards so far when using boot0.
+config ARM_BOOT_HOOK_RMR + bool + default y if ARM64 + select ENABLE_ARM_SOC_BOOT0_HOOK + config DRAM_TYPE int "sunxi dram type" depends on MACH_SUN8I_A83T
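
The "magic" instruction can be checked by hand. `tst x0, x0` is an alias of `ands xzr, x0, x0`, which assembles to the word 0xea00001f; read as an A32 instruction, that word is an unconditional branch. The sketch below decodes it and also reconstructs the address built by the mov/orr pair. The helper names are hypothetical, and the encodings are the standard A32 ones (branch: imm24 in bits [23:0], PC reads as instruction address + 8; data-processing immediate: 8-bit value rotated right by twice the 4-bit rotate field).

```c
#include <assert.h>
#include <stdint.h>

/* True if the word is an A32 "B" (branch without link) instruction. */
static int a32_is_branch(uint32_t insn)
{
	return (insn & 0x0f000000) == 0x0a000000;
}

/* Target of an A32 branch located at insn_addr. */
static uint32_t a32_branch_target(uint32_t insn, uint32_t insn_addr)
{
	uint32_t imm24 = insn & 0x00ffffff;
	int32_t offset = (int32_t)(imm24 << 2);	/* word offset to bytes */

	if (imm24 & 0x00800000)			/* sign-extend from 26 bits */
		offset -= (int32_t)1 << 26;

	return insn_addr + 8 + (uint32_t)offset;
}

/* Expand the 12-bit "modified immediate" of an A32 mov/orr. */
static uint32_t a32_expand_imm(uint32_t imm12)
{
	uint32_t rot = ((imm12 >> 8) & 0xf) * 2;
	uint32_t imm8 = imm12 & 0xff;

	return rot ? (imm8 >> rot) | (imm8 << (32 - rot)) : imm8;
}
```

With the header at offset 0, a32_branch_target(0xea00001f, 0) gives 0x84, which is exactly the two 4-byte instructions plus the `.space 0x7c` padding, so an AArch32 entry lands on the first `.word`. The immediates of 0xe3a01617 and 0xe38110a0 combine to 0x017000a0, the address the AArch32 stub writes the entry point to.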

On 20/11/2016 15:57, Andre Przywara wrote:
The Allwinner A64 SoC starts execution in AArch32 mode, and both the boot ROM and Allwinner's boot0 keep running in this mode. So U-Boot gets entered in 32-bit, although we want it to run in AArch64.
By using a "magic" instruction, which happens to be an almost-NOP in AArch64 and a branch in AArch32, we differentiate between being entered in 64-bit or 32-bit mode. If in 64-bit mode, we proceed with the branch to reset, but in 32-bit mode we trigger an RMR write to bring the core into AArch64/EL3 and re-enter U-Boot at CONFIG_SYS_TEXT_BASE. This allows a 64-bit U-Boot to be both entered in 32 and 64-bit mode, so we can use the same start code for the SPL and the U-Boot proper.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/boot0.h | 27 +++++++++++++++++++++++++++ board/sunxi/Kconfig | 5 +++++ 2 files changed, 32 insertions(+)
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h index 6a13db5..c31a2af 100644 --- a/arch/arm/include/asm/arch-sunxi/boot0.h +++ b/arch/arm/include/asm/arch-sunxi/boot0.h @@ -4,6 +4,33 @@
 * SPDX-License-Identifier: GPL-2.0+
 */
+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER)
 /* reserve space for BOOT0 header information */
 b reset
 .space 1532
+#elif defined(CONFIG_ARM_BOOT_HOOK_RMR)
+/* switch into AArch64 if needed */
+ tst x0, x0 // this is "b #0x84" in ARM
+ b reset
+ .space 0x7c
+ .word 0xe3a01617 // mov r1, #0x1700000
+ .word 0xe38110a0 // orr r1, r1, #0xa0
Is this address guaranteed to stay the same for newer chips? Maybe it'd be better to use a pc-relative load and put it in as .word like you do below for the text base address.
Alex
+ .word 0xe59f0020 // ldr r0, [pc, #32]
+ .word 0xe5810000 // str r0, [r1]
+ .word 0xf57ff04f // dsb sy
+ .word 0xf57ff06f // isb sy
+ .word 0xee1c0f50 // mrc 15, 0, r0, cr12, cr0, {2}
+ .word 0xe3800003 // orr r0, r0, #3
+ .word 0xee0c0f50 // mcr 15, 0, r0, cr12, cr0, {2}
+ .word 0xf57ff06f // isb sy
+ .word 0xe320f003 // wfi
+ .word 0xeafffffd // b @wfi
+#ifdef CONFIG_SPL_BUILD
+ .word CONFIG_SPL_TEXT_BASE
+#else
+ .word CONFIG_SYS_TEXT_BASE
+#endif
+#else
+/* normal execution */
+ b reset
+#endif diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index 0cd57a2..ba72e76 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -142,6 +142,11 @@ config RESERVE_ALLWINNER_BOOT0_HEADER blob relies on this information to load and execute U-Boot. Only needed on 64-bit Allwinner boards so far when using boot0.
+config ARM_BOOT_HOOK_RMR
- bool
- default y if ARM64
- select ENABLE_ARM_SOC_BOOT0_HOOK
config DRAM_TYPE int "sunxi dram type" depends on MACH_SUN8I_A83T

Hi,
On 21/11/16 16:34, Alexander Graf wrote:
On 20/11/2016 15:57, Andre Przywara wrote:
The Allwinner A64 SoC starts execution in AArch32 mode, and both the boot ROM and Allwinner's boot0 keep running in this mode. So U-Boot gets entered in 32-bit, although we want it to run in AArch64.
By using a "magic" instruction, which happens to be an almost-NOP in AArch64 and a branch in AArch32, we differentiate between being entered in 64-bit or 32-bit mode. If in 64-bit mode, we proceed with the branch to reset, but in 32-bit mode we trigger an RMR write to bring the core into AArch64/EL3 and re-enter U-Boot at CONFIG_SYS_TEXT_BASE (or CONFIG_SPL_TEXT_BASE for SPL builds). This allows a 64-bit U-Boot to be entered in either 32-bit or 64-bit mode, so we can use the same start code for the SPL and U-Boot proper.
Signed-off-by: Andre Przywara andre.przywara@arm.com
 arch/arm/include/asm/arch-sunxi/boot0.h | 27 +++++++++++++++++++++++++++
 board/sunxi/Kconfig                     |  5 +++++
 2 files changed, 32 insertions(+)

diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index 6a13db5..c31a2af 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -4,6 +4,33 @@
  * SPDX-License-Identifier: GPL-2.0+
  */
 
+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER)
 /* reserve space for BOOT0 header information */
 	b	reset
 	.space	1532
+#elif defined(CONFIG_ARM_BOOT_HOOK_RMR)
+/* switch into AArch64 if needed */
+	tst	x0, x0			// this is "b #0x84" in ARM
+	b	reset
+	.space	0x7c
+	.word	0xe3a01617		// mov	r1, #0x1700000
+	.word	0xe38110a0		// orr	r1, r1, #0xa0
Is this address guaranteed to stay the same for newer chips?
AW and stay the same? ;-)
Maybe it'd be better to use a pc-relative load and put it in as .word like you do below for the text base address.
Yes, good plan.
Cheers, Andre
Alex
+	.word	0xe59f0020		// ldr	r0, [pc, #32]
+	.word	0xe5810000		// str	r0, [r1]
+	.word	0xf57ff04f		// dsb	sy
+	.word	0xf57ff06f		// isb	sy
+	.word	0xee1c0f50		// mrc	15, 0, r0, cr12, cr0, {2}
+	.word	0xe3800003		// orr	r0, r0, #3
+	.word	0xee0c0f50		// mcr	15, 0, r0, cr12, cr0, {2}
+	.word	0xf57ff06f		// isb	sy
+	.word	0xe320f003		// wfi
+	.word	0xeafffffd		// b	@wfi
+#ifdef CONFIG_SPL_BUILD
+	.word	CONFIG_SPL_TEXT_BASE
+#else
+	.word	CONFIG_SYS_TEXT_BASE
+#endif
+#else
+/* normal execution */
+	b	reset
+#endif
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index 0cd57a2..ba72e76 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -142,6 +142,11 @@ config RESERVE_ALLWINNER_BOOT0_HEADER
 	blob relies on this information to load and execute U-Boot.
 	Only needed on 64-bit Allwinner boards so far when using boot0.
 
+config ARM_BOOT_HOOK_RMR
+	bool
+	default y if ARM64
+	select ENABLE_ARM_SOC_BOOT0_HOOK
+
 config DRAM_TYPE
 	int "sunxi dram type"
 	depends on MACH_SUN8I_A83T

To avoid enumerating the very same DRAM values in defconfig files for each and every Allwinner A64 board out there, let's put some sane default values in the Kconfig file. Boards with different needs can override them at any time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 board/sunxi/Kconfig           | 2 ++
 configs/pine64_plus_defconfig | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index ba72e76..d477925 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -159,6 +159,7 @@ config DRAM_CLK
 	default 792 if MACH_SUN9I
 	default 312 if MACH_SUN6I || MACH_SUN8I
 	default 360 if MACH_SUN4I || MACH_SUN5I || MACH_SUN7I
+	default 672 if MACH_SUN50I
 	---help---
 	Set the dram clock speed, valid range 240 - 480 (prior to sun9i),
 	must be a multiple of 24. For the sun9i (A80), the tested values
@@ -178,6 +179,7 @@ config DRAM_ZQ
 	default 123 if MACH_SUN4I || MACH_SUN5I || MACH_SUN6I || MACH_SUN8I
 	default 127 if MACH_SUN7I
 	default 4145117 if MACH_SUN9I
+	default 3881915 if MACH_SUN50I
 	---help---
 	Set the dram zq value.
 
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index ea53b96..ebc24b8 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -2,8 +2,6 @@ CONFIG_ARM=y
 CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_MACH_SUN50I=y
-CONFIG_DRAM_CLK=672
-CONFIG_DRAM_ZQ=3881915
 CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
 # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set
 CONFIG_CONSOLE_MUX=y

From: Jens Kuske jenskuske@gmail.com
The IOCR registers got renamed to BDLR to match the public documentation of similar controllers.
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 43 ++++++++++++++----------- arch/arm/mach-sunxi/dram_sun8i_h3.c | 34 +++++++++---------- 2 files changed, 41 insertions(+), 36 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index d0f2b8a..867fd12 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -81,7 +81,7 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */ - u8 res3[0x20]; /* 0x9c */ + u8 res3[0x20]; /* 0x9c */ u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */ @@ -106,20 +106,23 @@ struct sunxi_mctl_ctl_reg { u32 perfhpr[2]; /* 0x1c4 */ u32 perflpr[2]; /* 0x1cc */ u32 perfwr[2]; /* 0x1d4 */ - u8 res8[0x2c]; /* 0x1dc */ - u32 aciocr; /* 0x208 */ - u8 res9[0xf4]; /* 0x20c */ + u8 res8[0x24]; /* 0x1dc */ + u32 acmdlr; /* 0x200 AC master delay line register */ + u32 aclcdlr; /* 0x204 AC local calibrated delay line register */ + u32 aciocr; /* 0x208 AC I/O configuration register */ + u8 res9[0x4]; /* 0x20c */ + u32 acbdlr[31]; /* 0x210 AC bit delay line registers */ + u8 res10[0x74]; /* 0x28c */ struct { /* 0x300 DATX8 modules*/ - u32 mdlr; /* 0x00 */ - u32 lcdlr[3]; /* 0x04 */ - u32 iocr[11]; /* 0x10 IO configuration register */ - u32 bdlr6; /* 0x3c */ - u32 gtr; /* 0x40 */ - u32 gcr; /* 0x44 */ - u32 gsr[3]; /* 0x48 */ + u32 mdlr; /* 0x00 master delay line register */ + u32 lcdlr[3]; /* 0x04 local calibrated delay line registers */ + u32 bdlr[12]; /* 0x10 bit delay line registers */ + u32 gtr; /* 0x40 general timing register */ + u32 gcr; /* 0x44 general configuration register */ + u32 gsr[3]; /* 0x48 general status registers */ u8 res0[0x2c]; /* 0x54 */ - } datx[4]; - u8 res10[0x388]; /* 0x500 */ + } dx[4]; + u8 res11[0x388]; /* 0x500 */ u32 upd2; /* 0x888 */ };
@@ -174,12 +177,14 @@ struct sunxi_mctl_ctl_reg {
#define ZQCR_PWRDOWN (0x1 << 31) /* ZQ power down */
-#define DATX_IOCR_DQ(x) (x) /* DQ0-7 IOCR index */ -#define DATX_IOCR_DM (8) /* DM IOCR index */ -#define DATX_IOCR_DQS (9) /* DQS IOCR index */ -#define DATX_IOCR_DQSN (10) /* DQSN IOCR index */ +#define ACBDLR_WRITE_DELAY(x) ((x) << 8)
-#define DATX_IOCR_WRITE_DELAY(x) ((x) << 8) -#define DATX_IOCR_READ_DELAY(x) ((x) << 0) +#define DXBDLR_DQ(x) (x) /* DQ0-7 BDLR index */ +#define DXBDLR_DM (8) /* DM BDLR index */ +#define DXBDLR_DQS (9) /* DQS BDLR index */ +#define DXBDLR_DQSN (10) /* DQSN BDLR index */ + +#define DXBDLR_WRITE_DELAY(x) ((x) << 8) +#define DXBDLR_READ_DELAY(x) ((x) << 0)
#endif /* _SUNXI_DRAM_SUN8I_H3_H */ diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index b08b8e6..3dd6803 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -72,21 +72,21 @@ static void mctl_dq_delay(u32 read, u32 write) u32 val;
for (i = 0; i < 4; i++) { - val = DATX_IOCR_WRITE_DELAY((write >> (i * 4)) & 0xf) | - DATX_IOCR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2); + val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) | + DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
- for (j = DATX_IOCR_DQ(0); j <= DATX_IOCR_DM; j++) - writel(val, &mctl_ctl->datx[i].iocr[j]); + for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++) + writel(val, &mctl_ctl->dx[i].bdlr[j]); }
clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
for (i = 0; i < 4; i++) { - val = DATX_IOCR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) | - DATX_IOCR_READ_DELAY((read >> (16 + i * 4)) & 0xf); + val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) | + DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
- writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQS]); - writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQSN]); + writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]); + writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]); }
setbits_le32(&mctl_ctl->pgcr[0], 1 << 26); @@ -344,7 +344,7 @@ static int mctl_channel_init(struct dram_para *para)
/* set dramc odt */ for (i = 0; i < 4; i++) - clrsetbits_le32(&mctl_ctl->datx[i].gcr, (0x3 << 4) | + clrsetbits_le32(&mctl_ctl->dx[i].gcr, (0x3 << 4) | (0x1 << 1) | (0x3 << 2) | (0x3 << 12) | (0x3 << 14), IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x2); @@ -364,8 +364,8 @@ static int mctl_channel_init(struct dram_para *para)
/* set half DQ */ if (para->bus_width != 32) { - writel(0x0, &mctl_ctl->datx[2].gcr); - writel(0x0, &mctl_ctl->datx[3].gcr); + writel(0x0, &mctl_ctl->dx[2].gcr); + writel(0x0, &mctl_ctl->dx[3].gcr); }
/* data training configuration */ @@ -386,17 +386,17 @@ static int mctl_channel_init(struct dram_para *para) /* detect ranks and bus width */ if (readl(&mctl_ctl->pgsr[0]) & (0xfe << 20)) { /* only one rank */ - if (((readl(&mctl_ctl->datx[0].gsr[0]) >> 24) & 0x2) || - ((readl(&mctl_ctl->datx[1].gsr[0]) >> 24) & 0x2)) { + if (((readl(&mctl_ctl->dx[0].gsr[0]) >> 24) & 0x2) || + ((readl(&mctl_ctl->dx[1].gsr[0]) >> 24) & 0x2)) { clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, 0x1 << 24); para->dual_rank = 0; }
/* only half DQ width */ - if (((readl(&mctl_ctl->datx[2].gsr[0]) >> 24) & 0x1) || - ((readl(&mctl_ctl->datx[3].gsr[0]) >> 24) & 0x1)) { - writel(0x0, &mctl_ctl->datx[2].gcr); - writel(0x0, &mctl_ctl->datx[3].gcr); + if (((readl(&mctl_ctl->dx[2].gsr[0]) >> 24) & 0x1) || + ((readl(&mctl_ctl->dx[3].gsr[0]) >> 24) & 0x1)) { + writel(0x0, &mctl_ctl->dx[2].gcr); + writel(0x0, &mctl_ctl->dx[3].gcr); para->bus_width = 16; }

From: Jens Kuske jenskuske@gmail.com
Instead of setting the delay for whole bytes, allow setting it for each individual bit. Also add support for address/command lane delays.
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/mach-sunxi/dram_sun8i_h3.c | 54 ++++++++++++++++++------------------- 1 file changed, 27 insertions(+), 27 deletions(-)
diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 3dd6803..1647d76 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -16,12 +16,13 @@ #include <linux/kconfig.h>
struct dram_para { - u32 read_delays; - u32 write_delays; u16 page_size; u8 bus_width; u8 dual_rank; u8 row_bits; + const u8 dx_read_delays[4][11]; + const u8 dx_write_delays[4][11]; + const u8 ac_delays[31]; };
static inline int ns_to_t(int nanoseconds) @@ -64,34 +65,25 @@ static void mctl_phy_init(u32 val) mctl_await_completion(&mctl_ctl->pgsr[0], PGSR_INIT_DONE, 0x1); }
-static void mctl_dq_delay(u32 read, u32 write) +static void mctl_set_bit_delays(struct dram_para *para) { struct sunxi_mctl_ctl_reg * const mctl_ctl = (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; int i, j; - u32 val; - - for (i = 0; i < 4; i++) { - val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) | - DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2); - - for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++) - writel(val, &mctl_ctl->dx[i].bdlr[j]); - }
clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
- for (i = 0; i < 4; i++) { - val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) | - DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf); + for (i = 0; i < 4; i++) + for (j = 0; j < 11; j++) + writel(DXBDLR_WRITE_DELAY(para->dx_write_delays[i][j]) | + DXBDLR_READ_DELAY(para->dx_read_delays[i][j]), + &mctl_ctl->dx[i].bdlr[j]);
- writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]); - writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]); - } + for (i = 0; i < 31; i++) + writel(ACBDLR_WRITE_DELAY(para->ac_delays[i]), + &mctl_ctl->acbdlr[i]);
setbits_le32(&mctl_ctl->pgcr[0], 1 << 26); - - udelay(1); }
static void mctl_set_master_priority(void) @@ -372,11 +364,8 @@ static int mctl_channel_init(struct dram_para *para) clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, (para->dual_rank ? 0x3 : 0x1) << 24);
- - if (para->read_delays || para->write_delays) { - mctl_dq_delay(para->read_delays, para->write_delays); - udelay(50); - } + mctl_set_bit_delays(para); + udelay(50);
mctl_zq_calibration(para);
@@ -458,12 +447,23 @@ unsigned long sunxi_dram_init(void) (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
struct dram_para para = { - .read_delays = 0x00007979, /* dram_tpr12 */ - .write_delays = 0x6aaa0000, /* dram_tpr11 */ .dual_rank = 0, .bus_width = 32, .row_bits = 15, .page_size = 4096, + + .dx_read_delays = {{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, + { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }, + { 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, + { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }}, + .dx_write_delays = {{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 }, + { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 }, + { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 }, + { 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6 }}, + .ac_delays = { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0 }, };
mctl_sys_init(&para);

From: Jens Kuske jenskuske@gmail.com
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes. [Andre: fixed up typo, merged in fixes from Jens]
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 + arch/arm/include/asm/arch-sunxi/dram.h | 2 +- arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 10 +- arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/clock_sun6i.c | 2 +- arch/arm/mach-sunxi/dram_sun8i_h3.c | 139 +++++++++++++++++++----- 6 files changed, 123 insertions(+), 32 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h index be9fcfd..3f87672 100644 --- a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h +++ b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h @@ -322,6 +322,7 @@ struct sunxi_ccm_reg { #define CCM_DRAMCLK_CFG_DIV0_MASK (0xf << 8) #define CCM_DRAMCLK_CFG_SRC_PLL5 (0x0 << 20) #define CCM_DRAMCLK_CFG_SRC_PLL6x2 (0x1 << 20) +#define CCM_DRAMCLK_CFG_SRC_PLL11 (0x1 << 20) /* A64 only */ #define CCM_DRAMCLK_CFG_SRC_MASK (0x3 << 20) #define CCM_DRAMCLK_CFG_UPD (0x1 << 16) #define CCM_DRAMCLK_CFG_RST (0x1 << 31) diff --git a/arch/arm/include/asm/arch-sunxi/dram.h b/arch/arm/include/asm/arch-sunxi/dram.h index e0be744..53e6d47 100644 --- a/arch/arm/include/asm/arch-sunxi/dram.h +++ b/arch/arm/include/asm/arch-sunxi/dram.h @@ -24,7 +24,7 @@ #include <asm/arch/dram_sun8i_a33.h> #elif defined(CONFIG_MACH_SUN8I_A83T) #include <asm/arch/dram_sun8i_a83t.h> -#elif defined(CONFIG_MACH_SUN8I_H3) +#elif defined(CONFIG_MACH_SUN8I_H3) || defined(CONFIG_MACH_SUN50I) #include <asm/arch/dram_sun8i_h3.h> #elif defined(CONFIG_MACH_SUN9I) #include <asm/arch/dram_sun9i.h> diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index 867fd12..b0e5d93 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -15,7 +15,8 @@
struct sunxi_mctl_com_reg { u32 cr; /* 0x00 control register */ - u8 res0[0xc]; /* 0x04 */ + u8 res0[0x8]; /* 0x04 */ + u32 tmr; /* 0x0c (A64 only) */ u32 mcr[16][2]; /* 0x10 */ u32 bwcr; /* 0x90 bandwidth control register */ u32 maer; /* 0x94 master enable register */ @@ -32,7 +33,9 @@ struct sunxi_mctl_com_reg { u32 swoffr; /* 0xc4 */ u8 res2[0x8]; /* 0xc8 */ u32 cccr; /* 0xd0 */ - u8 res3[0x72c]; /* 0xd4 */ + u8 res3[0x54]; /* 0xd4 */ + u32 mdfs_bwlr[3]; /* 0x128 (A64 only) */ + u8 res4[0x6cc]; /* 0x134 */ u32 protect; /* 0x800 */ };
@@ -81,7 +84,8 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */ - u8 res3[0x20]; /* 0x9c */ + u8 res3[0x1c]; /* 0x9c */ + u32 vtfcr; /* 0xb8 (A64 only) */ u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */ diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile index e73114e..7daba11 100644 --- a/arch/arm/mach-sunxi/Makefile +++ b/arch/arm/mach-sunxi/Makefile @@ -50,4 +50,5 @@ obj-$(CONFIG_MACH_SUN8I_A33) += dram_sun8i_a33.o obj-$(CONFIG_MACH_SUN8I_A83T) += dram_sun8i_a83t.o obj-$(CONFIG_MACH_SUN8I_H3) += dram_sun8i_h3.o obj-$(CONFIG_MACH_SUN9I) += dram_sun9i.o +obj-$(CONFIG_MACH_SUN50I) += dram_sun8i_h3.o endif diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index 382fa94..4570060 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -218,7 +218,7 @@ done: } #endif
-#ifdef CONFIG_MACH_SUN8I_A33 +#if defined(CONFIG_MACH_SUN8I_A33) || defined(CONFIG_MACH_SUN50I) void clock_set_pll11(unsigned int clk, bool sigma_delta_enable) { struct sunxi_ccm_reg * const ccm = diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 1647d76..2dc2071 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -32,30 +32,6 @@ static inline int ns_to_t(int nanoseconds) return DIV_ROUND_UP(ctrl_freq * nanoseconds, 1000); }
-static u32 bin_to_mgray(int val) -{ - static const u8 lookup_table[32] = { - 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, - 0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09, - 0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d, - 0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11, - }; - - return lookup_table[clamp(val, 0, 31)]; -} - -static int mgray_to_bin(u32 val) -{ - static const u8 lookup_table[32] = { - 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, - 0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b, - 0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b, - 0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15, - }; - - return lookup_table[val & 0x1f]; -} - static void mctl_phy_init(u32 val) { struct sunxi_mctl_ctl_reg * const mctl_ctl = @@ -91,8 +67,9 @@ static void mctl_set_master_priority(void) struct sunxi_mctl_com_reg * const mctl_com = (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE;
+#if defined(CONFIG_MACH_SUN8I_H3) /* enable bandwidth limit windows and set windows size 1us */ - writel(0x00010190, &mctl_com->bwcr); + writel((1 << 16) | (400 << 0), &mctl_com->bwcr);
/* set cpu high priority */ writel(0x00000001, &mctl_com->mapr); @@ -121,6 +98,38 @@ static void mctl_set_master_priority(void) writel(0x04001800, &mctl_com->mcr[10][1]); writel(0x04000009, &mctl_com->mcr[11][0]); writel(0x00400120, &mctl_com->mcr[11][1]); +#elif defined(CONFIG_MACH_SUN50I) + /* enable bandwidth limit windows and set windows size 1us */ + writel(399, &mctl_com->tmr); + writel((1 << 16), &mctl_com->bwcr); + + writel(0x00a0000d, &mctl_com->mcr[0][0]); + writel(0x00500064, &mctl_com->mcr[0][1]); + writel(0x06000009, &mctl_com->mcr[1][0]); + writel(0x01000578, &mctl_com->mcr[1][1]); + writel(0x0200000d, &mctl_com->mcr[2][0]); + writel(0x00600100, &mctl_com->mcr[2][1]); + writel(0x01000009, &mctl_com->mcr[3][0]); + writel(0x00500064, &mctl_com->mcr[3][1]); + writel(0x07000009, &mctl_com->mcr[4][0]); + writel(0x01000640, &mctl_com->mcr[4][1]); + writel(0x01000009, &mctl_com->mcr[5][0]); + writel(0x00000080, &mctl_com->mcr[5][1]); + writel(0x01000009, &mctl_com->mcr[6][0]); + writel(0x00400080, &mctl_com->mcr[6][1]); + writel(0x0100000d, &mctl_com->mcr[7][0]); + writel(0x00400080, &mctl_com->mcr[7][1]); + writel(0x0100000d, &mctl_com->mcr[8][0]); + writel(0x00400080, &mctl_com->mcr[8][1]); + writel(0x04000009, &mctl_com->mcr[9][0]); + writel(0x00400100, &mctl_com->mcr[9][1]); + writel(0x20000209, &mctl_com->mcr[10][0]); + writel(0x08001800, &mctl_com->mcr[10][1]); + writel(0x05000009, &mctl_com->mcr[11][0]); + writel(0x00400090, &mctl_com->mcr[11][1]); + + writel(0x81000004, &mctl_com->mdfs_bwlr[2]); +#endif }
static void mctl_set_timing_params(struct dram_para *para) @@ -204,7 +213,32 @@ static void mctl_set_timing_params(struct dram_para *para) writel(RFSHTMG_TREFI(trefi) | RFSHTMG_TRFC(trfc), &mctl_ctl->rfshtmg); }
-static void mctl_zq_calibration(struct dram_para *para) +#ifdef CONFIG_MACH_SUN8I_H3 +static u32 bin_to_mgray(int val) +{ + static const u8 lookup_table[32] = { + 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, + 0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09, + 0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d, + 0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11, + }; + + return lookup_table[clamp(val, 0, 31)]; +} + +static int mgray_to_bin(u32 val) +{ + static const u8 lookup_table[32] = { + 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, + 0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b, + 0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b, + 0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15, + }; + + return lookup_table[val & 0x1f]; +} + +static void mctl_h3_zq_calibration_quirk(struct dram_para *para) { struct sunxi_mctl_ctl_reg * const mctl_ctl = (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; @@ -261,6 +295,7 @@ static void mctl_zq_calibration(struct dram_para *para) writel((zq_val[5] << 16) | zq_val[4], &mctl_ctl->zqdr[2]); } } +#endif
static void mctl_set_cr(struct dram_para *para) { @@ -286,16 +321,27 @@ static void mctl_sys_init(struct dram_para *para) clrbits_le32(&ccm->ahb_gate0, 1 << AHB_GATE_OFFSET_MCTL); clrbits_le32(&ccm->ahb_reset0_cfg, 1 << AHB_RESET_OFFSET_MCTL); clrbits_le32(&ccm->pll5_cfg, CCM_PLL5_CTRL_EN); +#ifdef CONFIG_MACH_SUN50I + clrbits_le32(&ccm->pll11_cfg, CCM_PLL11_CTRL_EN); +#endif udelay(10);
clrbits_le32(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_RST); udelay(1000);
+#ifdef CONFIG_MACH_SUN50I + clock_set_pll11(CONFIG_DRAM_CLK * 2 * 1000000, false); + clrsetbits_le32(&ccm->dram_clk_cfg, + CCM_DRAMCLK_CFG_DIV_MASK | CCM_DRAMCLK_CFG_SRC_MASK, + CCM_DRAMCLK_CFG_DIV(1) | CCM_DRAMCLK_CFG_SRC_PLL11 | + CCM_DRAMCLK_CFG_UPD); +#else clock_set_pll5(CONFIG_DRAM_CLK * 2 * 1000000, false); clrsetbits_le32(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_DIV_MASK | CCM_DRAMCLK_CFG_SRC_MASK, CCM_DRAMCLK_CFG_DIV(1) | CCM_DRAMCLK_CFG_SRC_PLL5 | CCM_DRAMCLK_CFG_UPD); +#endif mctl_await_completion(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_UPD, 0);
setbits_le32(&ccm->ahb_reset0_cfg, 1 << AHB_RESET_OFFSET_MCTL); @@ -347,12 +393,18 @@ static int mctl_channel_init(struct dram_para *para) /* set DQS auto gating PD mode */ setbits_le32(&mctl_ctl->pgcr[2], 0x3 << 6);
+#if defined(CONFIG_MACH_SUN8I_H3) /* dx ddr_clk & hdr_clk dynamic mode */ clrbits_le32(&mctl_ctl->pgcr[0], (0x3 << 14) | (0x3 << 12));
/* dphy & aphy phase select 270 degree */ clrsetbits_le32(&mctl_ctl->pgcr[2], (0x3 << 10) | (0x3 << 8), (0x1 << 10) | (0x2 << 8)); +#elif defined(CONFIG_MACH_SUN50I) + /* dphy & aphy phase select ? */ + clrsetbits_le32(&mctl_ctl->pgcr[2], (0x3 << 10) | (0x3 << 8), + (0x0 << 10) | (0x3 << 8)); +#endif
/* set half DQ */ if (para->bus_width != 32) { @@ -367,10 +419,17 @@ static int mctl_channel_init(struct dram_para *para) mctl_set_bit_delays(para); udelay(50);
- mctl_zq_calibration(para); +#ifdef CONFIG_MACH_SUN8I_H3 + mctl_h3_zq_calibration_quirk(para);
mctl_phy_init(PIR_PLLINIT | PIR_DCAL | PIR_PHYRST | PIR_DRAMRST | PIR_DRAMINIT | PIR_QSGATE); +#else + clrsetbits_le32(&mctl_ctl->zqcr, 0xffffff, CONFIG_DRAM_ZQ); + + mctl_phy_init(PIR_ZCAL | PIR_PLLINIT | PIR_DCAL | PIR_PHYRST | + PIR_DRAMRST | PIR_DRAMINIT | PIR_QSGATE); +#endif
/* detect ranks and bus width */ if (readl(&mctl_ctl->pgsr[0]) & (0xfe << 20)) { @@ -408,7 +467,11 @@ static int mctl_channel_init(struct dram_para *para) udelay(10);
/* set PGCR3, CKE polarity */ +#ifdef CONFIG_MACH_SUN50I + writel(0xc0aa0060, &mctl_ctl->pgcr[3]); +#else writel(0x00aa0060, &mctl_ctl->pgcr[3]); +#endif
/* power down zq calibration module for power save */ setbits_le32(&mctl_ctl->zqcr, ZQCR_PWRDOWN); @@ -452,6 +515,7 @@ unsigned long sunxi_dram_init(void) .row_bits = 15, .page_size = 4096,
+#if defined(CONFIG_MACH_SUN8I_H3) .dx_read_delays = {{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }, { 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, @@ -464,6 +528,20 @@ unsigned long sunxi_dram_init(void) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, +#elif defined(CONFIG_MACH_SUN50I) + .dx_read_delays = {{ 16, 16, 16, 16, 17, 16, 16, 17, 16, 1, 0 }, + { 17, 17, 17, 17, 17, 17, 17, 17, 17, 1, 0 }, + { 16, 17, 17, 16, 16, 16, 16, 16, 16, 0, 0 }, + { 17, 17, 17, 17, 17, 17, 17, 17, 17, 1, 0 }}, + .dx_write_delays = {{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 15, 15 }, + { 0, 0, 0, 0, 1, 1, 1, 1, 0, 10, 10 }, + { 1, 0, 1, 1, 1, 1, 1, 1, 0, 11, 11 }, + { 1, 0, 0, 1, 1, 1, 1, 1, 0, 12, 12 }}, + .ac_delays = { 5, 5, 13, 10, 2, 5, 3, 3, + 0, 3, 3, 3, 1, 0, 0, 0, + 3, 4, 0, 3, 4, 1, 4, 0, + 1, 1, 0, 1, 13, 5, 4 }, +#endif };
mctl_sys_init(&para); @@ -476,8 +554,15 @@ unsigned long sunxi_dram_init(void) writel(0x00000201, &mctl_ctl->odtmap); udelay(1);
+#ifdef CONFIG_MACH_SUN8I_H3 /* odt delay */ writel(0x0c000400, &mctl_ctl->odtcfg); +#endif + +#ifdef CONFIG_MACH_SUN50I + setbits_le32(&mctl_ctl->vtfcr, (1 << 9)); + clrbits_le32(&mctl_ctl->pgcr[2], (1 << 13)); +#endif
/* clear credit value */ setbits_le32(&mctl_com->cccr, 1 << 31);

According to Jens, disabling on-die termination should set bit 5, not bit 1, in the respective register. Fix this.
Reported-by: Jens Kuske jenskuske@gmail.com
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c
index 2dc2071..3d569fc 100644
--- a/arch/arm/mach-sunxi/dram_sun8i_h3.c
+++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c
@@ -385,7 +385,7 @@ static int mctl_channel_init(struct dram_para *para)
 		clrsetbits_le32(&mctl_ctl->dx[i].gcr, (0x3 << 4) |
 				(0x1 << 1) | (0x3 << 2) | (0x3 << 12) |
 				(0x3 << 14),
-				IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x2);
+				IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x20);
 
 	/* AC PDR should always ON */
 	setbits_le32(&mctl_ctl->aciocr, 0x1 << 1);

Fix the output of the DRAM size on AArch64 SPLs.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c
index 3d569fc..5ee8b3d 100644
--- a/arch/arm/mach-sunxi/dram_sun8i_h3.c
+++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c
@@ -571,6 +571,6 @@ unsigned long sunxi_dram_init(void)
 	mctl_auto_detect_dram_size(&para);
 	mctl_set_cr(&para);
 
-	return (1 << (para.row_bits + 3)) * para.page_size *
+	return (1UL << (para.row_bits + 3)) * para.page_size *
 						(para.dual_rank ? 2 : 1);
 }

On 20/11/2016 15:57, Andre Przywara wrote:
Fix the output of the DRAM size on AArch64 SPLs.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Reviewed-by: Alexander Graf agraf@suse.de
Alex

Now that the SPL is ready to be compiled in AArch64 and the DRAM init code is ready, enable SPL support for the A64 SoC and in the Pine64 defconfig. For now we keep the boot0 header in U-Boot proper, as this still allows using boot0 as an SPL replacement without hurting the SPL use case. We disable FEL support for now, as the code isn't ready yet.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/include/asm/arch-sunxi/boot0.h | 4 ++-- arch/arm/mach-sunxi/board.c | 2 +- board/sunxi/Kconfig | 2 ++ configs/pine64_plus_defconfig | 2 ++ include/configs/sunxi-common.h | 2 ++ 5 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h index c31a2af..173e042 100644 --- a/arch/arm/include/asm/arch-sunxi/boot0.h +++ b/arch/arm/include/asm/arch-sunxi/boot0.h @@ -4,11 +4,11 @@ * SPDX-License-Identifier: GPL-2.0+ */
-#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) +#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) && !defined(CONFIG_SPL_BUILD) /* reserve space for BOOT0 header information */ b reset .space 1532 -#elif defined(CONFIG_ARM_BOOT_HOOK_RMR) +#elif defined(CONFIG_ARM_BOOT_HOOK_RMR) && defined(CONFIG_SPL_BUILD) /* switch into AArch64 if needed */ tst x0, x0 // this is "b #0x84" in ARM b reset diff --git a/arch/arm/mach-sunxi/board.c b/arch/arm/mach-sunxi/board.c index 0f8ead9..80d4b57 100644 --- a/arch/arm/mach-sunxi/board.c +++ b/arch/arm/mach-sunxi/board.c @@ -133,7 +133,7 @@ static int gpio_init(void) return 0; }
-#ifdef CONFIG_SPL_BUILD +#if defined(CONFIG_SPL_BOARD_LOAD_IMAGE) && defined(CONFIG_SPL_BUILD) static int spl_board_load_image(struct spl_image_info *spl_image, struct spl_boot_device *bootdev) { diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index d477925..b5246df 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -125,6 +125,7 @@ config MACH_SUN50I bool "sun50i (Allwinner A64)" select ARM64 select SUNXI_GEN_SUN6I + select SUPPORT_SPL
endchoice
@@ -187,6 +188,7 @@ config DRAM_ODT_EN bool "sunxi dram odt enable" default n if !MACH_SUN8I_A23 default y if MACH_SUN8I_A23 + default y if MACH_SUN50I ---help--- Select this to enable dram odt (on die termination).
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig index ebc24b8..5286fee 100644 --- a/configs/pine64_plus_defconfig +++ b/configs/pine64_plus_defconfig @@ -2,9 +2,11 @@ CONFIG_ARM=y CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y CONFIG_ARCH_SUNXI=y CONFIG_MACH_SUN50I=y +CONFIG_DRAM_CLK=672 CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus" # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set CONFIG_CONSOLE_MUX=y +CONFIG_SPL=y # CONFIG_CMD_IMLS is not set # CONFIG_CMD_FLASH is not set # CONFIG_CMD_FPGA is not set diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h index 86b4104..f2cb174 100644 --- a/include/configs/sunxi-common.h +++ b/include/configs/sunxi-common.h @@ -182,7 +182,9 @@
#define CONFIG_SPL_FRAMEWORK
+#ifndef CONFIG_MACH_SUN50I #define CONFIG_SPL_BOARD_LOAD_IMAGE +#endif
#if defined(CONFIG_MACH_SUN9I) #define CONFIG_SPL_TEXT_BASE 0x10040 /* sram start+header */

On 20/11/2016 15:57, Andre Przywara wrote:
Now that the SPL is ready to be compiled in AArch64 and the DRAM init code is ready, enable SPL support for the A64 SoC and in the Pine64 defconfig. For now we keep the boot0 header in U-Boot proper, as this still allows using boot0 as an SPL replacement without hurting the SPL use case. We disable FEL support for now, as the code isn't ready yet.
Signed-off-by: Andre Przywara andre.przywara@arm.com
 arch/arm/include/asm/arch-sunxi/boot0.h | 4 ++--
 arch/arm/mach-sunxi/board.c             | 2 +-
 board/sunxi/Kconfig                     | 2 ++
 configs/pine64_plus_defconfig           | 2 ++
 include/configs/sunxi-common.h          | 2 ++
 5 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index c31a2af..173e042 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -4,11 +4,11 @@
  * SPDX-License-Identifier:	GPL-2.0+
  */
 
-#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER)
+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) && !defined(CONFIG_SPL_BUILD)
 /* reserve space for BOOT0 header information */
 	b	reset
 	.space	1532
-#elif defined(CONFIG_ARM_BOOT_HOOK_RMR)
+#elif defined(CONFIG_ARM_BOOT_HOOK_RMR) && defined(CONFIG_SPL_BUILD)
 /* switch into AArch64 if needed */
 	tst	x0, x0			// this is "b #0x84" in ARM
 	b	reset

Shouldn't the hunk above go into the patches that introduce the options?

diff --git a/arch/arm/mach-sunxi/board.c b/arch/arm/mach-sunxi/board.c
index 0f8ead9..80d4b57 100644
--- a/arch/arm/mach-sunxi/board.c
+++ b/arch/arm/mach-sunxi/board.c
@@ -133,7 +133,7 @@ static int gpio_init(void)
 	return 0;
 }
 
-#ifdef CONFIG_SPL_BUILD
+#if defined(CONFIG_SPL_BOARD_LOAD_IMAGE) && defined(CONFIG_SPL_BUILD)
 static int spl_board_load_image(struct spl_image_info *spl_image,
 				struct spl_boot_device *bootdev)
 {
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index d477925..b5246df 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -125,6 +125,7 @@ config MACH_SUN50I
 	bool "sun50i (Allwinner A64)"
 	select ARM64
 	select SUNXI_GEN_SUN6I
+	select SUPPORT_SPL
 
 endchoice
 
@@ -187,6 +188,7 @@ config DRAM_ODT_EN
 	bool "sunxi dram odt enable"
 	default n if !MACH_SUN8I_A23
 	default y if MACH_SUN8I_A23
+	default y if MACH_SUN50I
 	---help---
 	Select this to enable dram odt (on die termination).
 
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index ebc24b8..5286fee 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -2,9 +2,11 @@ CONFIG_ARM=y
 CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_MACH_SUN50I=y
+CONFIG_DRAM_CLK=672

Do you need this?

Alex

 CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
 # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set
 CONFIG_CONSOLE_MUX=y
+CONFIG_SPL=y
 # CONFIG_CMD_IMLS is not set
 # CONFIG_CMD_FLASH is not set
 # CONFIG_CMD_FPGA is not set
diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h
index 86b4104..f2cb174 100644
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -182,7 +182,9 @@
 
 #define CONFIG_SPL_FRAMEWORK
 
+#ifndef CONFIG_MACH_SUN50I
 #define CONFIG_SPL_BOARD_LOAD_IMAGE
+#endif
 
 #if defined(CONFIG_MACH_SUN9I)
 #define CONFIG_SPL_TEXT_BASE		0x10040		/* sram start+header */

Hi,
On 21/11/16 16:37, Alexander Graf wrote:
On 20/11/2016 15:57, Andre Przywara wrote:
Now that the SPL is ready to be compiled in AArch64 and the DRAM init code is ready, enable SPL support for the A64 SoC and in the Pine64 defconfig. For now we keep the boot0 header in the U-Boot proper, as this still allows using boot0 as an SPL replacement without hurting the SPL use case. We disable FEL support for now, as the code isn't ready yet.
Signed-off-by: Andre Przywara andre.przywara@arm.com
 arch/arm/include/asm/arch-sunxi/boot0.h | 4 ++--
 arch/arm/mach-sunxi/board.c             | 2 +-
 board/sunxi/Kconfig                     | 2 ++
 configs/pine64_plus_defconfig           | 2 ++
 include/configs/sunxi-common.h          | 2 ++
 5 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index c31a2af..173e042 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -4,11 +4,11 @@
  * SPDX-License-Identifier:	GPL-2.0+
  */
 
-#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER)
+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) && !defined(CONFIG_SPL_BUILD)
 /* reserve space for BOOT0 header information */
 	b	reset
 	.space	1532
-#elif defined(CONFIG_ARM_BOOT_HOOK_RMR)
+#elif defined(CONFIG_ARM_BOOT_HOOK_RMR) && defined(CONFIG_SPL_BUILD)
 /* switch into AArch64 if needed */
 	tst	x0, x0			// this is "b #0x84" in ARM
 	b	reset

Shouldn't the hunk above go into the patches that introduce the options?

Possibly.

diff --git a/arch/arm/mach-sunxi/board.c b/arch/arm/mach-sunxi/board.c
index 0f8ead9..80d4b57 100644
--- a/arch/arm/mach-sunxi/board.c
+++ b/arch/arm/mach-sunxi/board.c
@@ -133,7 +133,7 @@ static int gpio_init(void)
 	return 0;
 }
 
-#ifdef CONFIG_SPL_BUILD
+#if defined(CONFIG_SPL_BOARD_LOAD_IMAGE) && defined(CONFIG_SPL_BUILD)
 static int spl_board_load_image(struct spl_image_info *spl_image,
 				struct spl_boot_device *bootdev)
 {
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index d477925..b5246df 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -125,6 +125,7 @@ config MACH_SUN50I
 	bool "sun50i (Allwinner A64)"
 	select ARM64
 	select SUNXI_GEN_SUN6I
+	select SUPPORT_SPL
 
 endchoice
 
@@ -187,6 +188,7 @@ config DRAM_ODT_EN
 	bool "sunxi dram odt enable"
 	default n if !MACH_SUN8I_A23
 	default y if MACH_SUN8I_A23
+	default y if MACH_SUN50I
 	---help---
 	Select this to enable dram odt (on die termination).
 
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index ebc24b8..5286fee 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -2,9 +2,11 @@ CONFIG_ARM=y
 CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_MACH_SUN50I=y
+CONFIG_DRAM_CLK=672

Do you need this?

No, you are right. I think I had a lower default in sunxi/Kconfig before.

Cheers, Andre.

 CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
 # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set
 CONFIG_CONSOLE_MUX=y
+CONFIG_SPL=y
 # CONFIG_CMD_IMLS is not set
 # CONFIG_CMD_FLASH is not set
 # CONFIG_CMD_FPGA is not set
diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h
index 86b4104..f2cb174 100644
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -182,7 +182,9 @@
 
 #define CONFIG_SPL_FRAMEWORK
 
+#ifndef CONFIG_MACH_SUN50I
 #define CONFIG_SPL_BOARD_LOAD_IMAGE
+#endif
 
 #if defined(CONFIG_MACH_SUN9I)
 #define CONFIG_SPL_TEXT_BASE		0x10040		/* sram start+header */

Read the specified "arch" value from a legacy or FIT U-Boot image and store it in our SPL data structure. This allows loaders to take the target architecture into account for custom loading procedures. Having the complete string -> arch mapping for FIT based images in the SPL would be too big, so we leave it up to architectures (or boards) to override the weak function that does the actual translation, possibly covering only the required subset there. Document struct spl_image_info on the way.
Signed-off-by: Andre Przywara andre.przywara@arm.com
add a struct comment for spl_image_info
---
 common/spl/spl.c     |  1 +
 common/spl/spl_fit.c |  8 ++++++++
 include/spl.h        | 15 ++++++++++++++-
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/common/spl/spl.c b/common/spl/spl.c
index 835eed6..722c060 100644
--- a/common/spl/spl.c
+++ b/common/spl/spl.c
@@ -114,6 +114,7 @@ int spl_parse_image_header(struct spl_image_info *spl_image,
 					header_size;
 		}
 		spl_image->os = image_get_os(header);
+		spl_image->arch = image_get_arch(header);
 		spl_image->name = image_get_name(header);
 		debug("spl: payload image: %.*s load addr: 0x%lx size: %d\n",
 			(int)sizeof(spl_image->name), spl_image->name,
diff --git a/common/spl/spl_fit.c b/common/spl/spl_fit.c
index aae556f..a5d903b 100644
--- a/common/spl/spl_fit.c
+++ b/common/spl/spl_fit.c
@@ -123,6 +123,11 @@ static int get_aligned_image_size(struct spl_load_info *info, int data_size,
 	return (data_size + info->bl_len - 1) / info->bl_len;
 }
 
+__weak u8 spl_genimg_get_arch_id(const char *arch_str)
+{
+	return IH_ARCH_DEFAULT;
+}
+
 int spl_load_simple_fit(struct spl_image_info *spl_image,
 			struct spl_load_info *info, ulong sector, void *fit)
 {
@@ -136,6 +141,7 @@ int spl_load_simple_fit(struct spl_image_info *spl_image,
 	int base_offset, align_len = ARCH_DMA_MINALIGN - 1;
 	int src_sector;
 	void *dst, *src;
+	const char *arch_str;
 
 	/*
 	 * Figure out where the external images start. This is the base for the
@@ -184,10 +190,12 @@ int spl_load_simple_fit(struct spl_image_info *spl_image,
 	data_offset = fdt_getprop_u32(fit, node, "data-offset");
 	data_size = fdt_getprop_u32(fit, node, "data-size");
 	load = fdt_getprop_u32(fit, node, "load");
+	arch_str = fdt_getprop(fit, node, "arch", NULL);
 	debug("data_offset=%x, data_size=%x\n", data_offset, data_size);
 	spl_image->load_addr = load;
 	spl_image->entry_point = load;
 	spl_image->os = IH_OS_U_BOOT;
+	spl_image->arch = spl_genimg_get_arch_id(arch_str);
 
 	/*
 	 * Work out where to place the image. We read it so that the first
diff --git a/include/spl.h b/include/spl.h
index 2f8c052..c557a64 100644
--- a/include/spl.h
+++ b/include/spl.h
@@ -20,13 +20,26 @@
 #define MMCSD_MODE_FS		2
 #define MMCSD_MODE_EMMCBOOT	3
 
+/*
+ * Information about an U-Boot image file as described in include/image.h.
+ * Parsed by the SPL code from a legacy or FIT image file.
+ *
+ * @name: descriptive string (mkimage -n)
+ * @load_addr: address to load the image file to (mkimage -a)
+ * @entry_point: address of first instruction to execute (mkimage -e)
+ * @size: size of image in bytes
+ * @flags: optional, used only for SPL_COPY_PAYLOAD_ONLY so far
+ * @os: target operating system, one of IH_OS_* (mkimage -O)
+ * @arch: target architecture, one of IH_ARCH_* (mkimage -A)
+ */
 struct spl_image_info {
 	const char *name;
-	u8 os;
 	ulong load_addr;
 	ulong entry_point;
 	u32 size;
 	u32 flags;
+	u8 os;
+	u8 arch;
 };
 
 /*
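
For readers who want to see where the legacy-image "arch" value comes from, here is a minimal host-side sketch. It is not code from this series: the helper name sketch_legacy_arch() and the SKETCH_* macros are made up for illustration. It relies only on the legacy header layout from include/image.h (64 bytes, big-endian magic 0x27051956 at offset 0, the one-byte ih_arch field at offset 29), which is the byte that image_get_arch() hands back to the SPL.

```c
#include <stddef.h>
#include <stdint.h>

/* Legacy (uImage) header layout, mirroring include/image.h. */
#define SKETCH_IH_MAGIC		0x27051956u	/* stored big-endian */
#define SKETCH_HDR_SIZE		64u		/* sizeof(image_header_t) */
#define SKETCH_ARCH_OFFSET	29u		/* ih_arch follows ih_os */

/*
 * Hypothetical helper, not a U-Boot function: return the arch byte of a
 * legacy image header, or -1 if the buffer is too short or the magic
 * number does not match.
 */
static int sketch_legacy_arch(const uint8_t *hdr, size_t len)
{
	uint32_t magic;

	if (hdr == NULL || len < SKETCH_HDR_SIZE)
		return -1;

	/* Assemble the big-endian magic byte by byte, independent of host order. */
	magic = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
		((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
	if (magic != SKETCH_IH_MAGIC)
		return -1;

	return hdr[SKETCH_ARCH_OFFSET];
}
```

Feeding it the first 64 bytes of a u-boot.img built with "mkimage -A arm64" should yield the numeric IH_ARCH_* value that the SPL then stores in spl_image->arch.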

On 20 November 2016 at 07:57, Andre Przywara andre.przywara@arm.com wrote:
Read the specified "arch" value from a legacy or FIT U-Boot image and store it in our SPL data structure. This allows loaders to take the target architecture into account for custom loading procedures. Having the complete string -> arch mapping for FIT based images in the SPL would be too big, so we leave it up to architectures (or boards) to override the weak function that does the actual translation, possibly covering only the required subset there. Document struct spl_image_info on the way.
Signed-off-by: Andre Przywara andre.przywara@arm.com
add a struct comment for spl_image_info
 common/spl/spl.c     |  1 +
 common/spl/spl_fit.c |  8 ++++++++
 include/spl.h        | 15 ++++++++++++++-
 3 files changed, 23 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org

At the moment we use the arch/arm directory for arm64 boards as well, so the Makefile will pick up the "arm" name for the architecture to use for tagging binaries in U-Boot image files. Differentiate between the two by checking whether the CPU variable is set to "armv8", and use the arm64 architecture name when creating the image file if that matches.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Reviewed-by: Simon Glass sjg@chromium.org
---
 Makefile | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 96ddc59..d6ef646 100644
--- a/Makefile
+++ b/Makefile
@@ -921,13 +921,18 @@ quiet_cmd_cpp_cfg = CFG $@
 cmd_cpp_cfg = $(CPP) -Wp,-MD,$(depfile) $(cpp_flags) $(LDPPFLAGS) -ansi \
 	-DDO_DEPS_ONLY -D__ASSEMBLY__ -x assembler-with-cpp -P -dM -E -o $@ $<
 
+ifeq ($(CPU),armv8)
+IH_ARCH := arm64
+else
+IH_ARCH := $(ARCH)
+endif
 ifdef CONFIG_SPL_LOAD_FIT
-MKIMAGEFLAGS_u-boot.img = -f auto -A $(ARCH) -T firmware -C none -O u-boot \
+MKIMAGEFLAGS_u-boot.img = -f auto -A $(IH_ARCH) -T firmware -C none -O u-boot \
 	-a $(CONFIG_SYS_TEXT_BASE) -e $(CONFIG_SYS_UBOOT_START) \
 	-n "U-Boot $(UBOOTRELEASE) for $(BOARD) board" -E \
 	$(patsubst %,-b arch/$(ARCH)/dts/%.dtb,$(subst ",,$(CONFIG_OF_LIST)))
 else
-MKIMAGEFLAGS_u-boot.img = -A $(ARCH) -T firmware -C none -O u-boot \
+MKIMAGEFLAGS_u-boot.img = -A $(IH_ARCH) -T firmware -C none -O u-boot \
 	-a $(CONFIG_SYS_TEXT_BASE) -e $(CONFIG_SYS_UBOOT_START) \
 	-n "U-Boot $(UBOOTRELEASE) for $(BOARD) board"
 endif

Since the SPL FIT loader can now differentiate between different architectures, teach it how to tell arm and arm64 apart when a FIT image is used. We just support those two for now, as these are so far the only sensible alternatives.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/lib/spl.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/arm/lib/spl.c b/arch/arm/lib/spl.c
index e606d47..45d285c 100644
--- a/arch/arm/lib/spl.c
+++ b/arch/arm/lib/spl.c
@@ -63,3 +63,18 @@ void __noreturn jump_to_image_linux(struct spl_image_info *spl_image, void *arg)
 	image_entry(0, machid, arg);
 }
 #endif
+
+/* This overwrites the weak definition in spl_fit.c */
+u8 spl_genimg_get_arch_id(const char *arch_str)
+{
+	if (!arch_str)
+		return IH_ARCH_DEFAULT;
+
+	if (!strcmp(arch_str, "arm"))
+		return IH_ARCH_ARM;
+
+	if (!strcmp(arch_str, "arm64"))
+		return IH_ARCH_ARM64;
+
+	return IH_ARCH_DEFAULT;
+}

On 20 November 2016 at 07:57, Andre Przywara andre.przywara@arm.com wrote:
Since the SPL FIT loader can now differentiate between different architectures, teach it how to tell arm and arm64 apart when a FIT image is used. We just support those two for now, as these are so far the only sensible alternatives.
Signed-off-by: Andre Przywara andre.przywara@arm.com
 arch/arm/lib/spl.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
Reviewed-by: Simon Glass sjg@chromium.org

The ARMv8 capable Allwinner A64 SoC comes out of reset in AArch32 mode. To run AArch64 code, we have to trigger a warm reset via the RMR register, which proceeds with code execution at the address stored in the RVBAR register. If the bootable payload in the FIT image uses a different architecture than the one the SPL has been compiled for, enter it via this RMR switch mechanism: write the entry point address into the MMIO mapped, writable version of the RVBAR register, then trigger the warm reset via a system register write. If the payload architecture is the same as the SPL's, we use the normal branch as usual.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/mach-sunxi/Makefile     |  1 +
 arch/arm/mach-sunxi/spl_switch.c | 60 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)
 create mode 100644 arch/arm/mach-sunxi/spl_switch.c

diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
index 7daba11..128091e 100644
--- a/arch/arm/mach-sunxi/Makefile
+++ b/arch/arm/mach-sunxi/Makefile
@@ -51,4 +51,5 @@ obj-$(CONFIG_MACH_SUN8I_A83T)	+= dram_sun8i_a83t.o
 obj-$(CONFIG_MACH_SUN8I_H3)	+= dram_sun8i_h3.o
 obj-$(CONFIG_MACH_SUN9I)	+= dram_sun9i.o
 obj-$(CONFIG_MACH_SUN50I)	+= dram_sun8i_h3.o
+obj-$(CONFIG_MACH_SUN50I)	+= spl_switch.o
 endif
diff --git a/arch/arm/mach-sunxi/spl_switch.c b/arch/arm/mach-sunxi/spl_switch.c
new file mode 100644
index 0000000..20f21b1
--- /dev/null
+++ b/arch/arm/mach-sunxi/spl_switch.c
@@ -0,0 +1,60 @@
+/*
+ * (C) Copyright 2016 ARM Ltd.
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <common.h>
+#include <spl.h>
+
+#include <asm/io.h>
+#include <asm/barriers.h>
+
+static void __noreturn jump_to_image_native(struct spl_image_info *spl_image)
+{
+	typedef void __noreturn (*image_entry_noargs_t)(void);
+
+	image_entry_noargs_t image_entry =
+		(image_entry_noargs_t)spl_image->entry_point;
+
+	image_entry();
+}
+
+static void __noreturn reset_rmr_switch(void)
+{
+#ifdef CONFIG_ARM64
+	__asm__ volatile ( "mrs	 x0, RMR_EL3\n\t"
+			   "bic  x0, x0, #1\n\t"   /* Clear enter-in-64 bit */
+			   "orr  x0, x0, #2\n\t"   /* set reset request bit */
+			   "msr  RMR_EL3, x0\n\t"
+			   "isb  sy\n\t"
+			   "nop\n\t"
+			   "wfi\n\t"
+			   "b    .\n"
+			   ::: "x0");
+#else
+	__asm__ volatile ( "mrc  15, 0, r0, cr12, cr0, 2\n\t"
+			   "orr  r0, r0, #3\n\t"   /* request reset in 64 bit */
+			   "mcr  15, 0, r0, cr12, cr0, 2\n\t"
+			   "isb\n\t"
+			   "nop\n\t"
+			   "wfi\n\t"
+			   "b    .\n"
+			   ::: "r0");
+#endif
+	while (1);	/* to avoid a compiler warning about __noreturn */
+}
+
+void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image)
+{
+	if (spl_image->arch == IH_ARCH_DEFAULT) {
+		debug("entering by branch\n");
+		jump_to_image_native(spl_image);
+	} else {
+		debug("entering by RMR switch\n");
+		writel(spl_image->entry_point, 0x17000a0);
+		DSB;
+		ISB;
+		reset_rmr_switch();
+	}
+}
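
The decision and the register value used in the patch above can be modelled on a host machine. This is only a sketch under stated assumptions: the sketch_* names are hypothetical and not part of U-Boot, and the only hardware facts it uses are the two RMR bits that the inline assembly manipulates (bit 0 selects AArch64 after the warm reset, bit 1 requests the reset).

```c
#include <stdbool.h>
#include <stdint.h>

/* RMR bits as used by the inline assembly in the patch. */
#define SKETCH_RMR_AA64	(UINT32_C(1) << 0)	/* come up in AArch64 after reset */
#define SKETCH_RMR_RR	(UINT32_C(1) << 1)	/* request a warm reset */

/*
 * Hypothetical helper: compute the value to write back to RMR, given the
 * current register value and the execution state wanted after the reset.
 * All other bits are preserved, as in the read-modify-write sequences.
 */
static uint32_t sketch_rmr_request(uint32_t rmr, bool want_aarch64)
{
	rmr &= ~SKETCH_RMR_AA64;
	if (want_aarch64)
		rmr |= SKETCH_RMR_AA64;

	return rmr | SKETCH_RMR_RR;
}

/*
 * Hypothetical helper: the payload is entered by a plain branch when its
 * architecture matches the one the SPL was built for, and by the RMR
 * switch otherwise.
 */
static bool sketch_needs_rmr_switch(unsigned int spl_arch,
				    unsigned int image_arch)
{
	return spl_arch != image_arch;
}
```

With a current value of 0, an AArch32 SPL entering an AArch64 payload writes 3 (the "orr r0, r0, #3" path), while an AArch64 SPL entering an AArch32 payload clears bit 0 and sets bit 1.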

When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding. This should only be used for the SPL; the U-Boot proper should still use the existing 64-bit configuration. The SPL code can switch to AArch64 if needed, so a 32-bit SPL can be combined with a 64-bit U-Boot proper to eventually launch arm64 kernels.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 board/sunxi/Kconfig            | 14 ++++++++++++--
 configs/pine64_plus_defconfig  |  2 +-
 configs/sun50i_spl32_defconfig | 11 +++++++++++
 include/configs/sunxi-common.h |  2 +-
 4 files changed, 25 insertions(+), 4 deletions(-)
 create mode 100644 configs/sun50i_spl32_defconfig

diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index b5246df..bb6e7fa 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -43,6 +43,10 @@ config SUNXI_GEN_SUN6I
 	watchdog, etc.
 
 
+config MACH_SUN50I
+	bool
+	select SUNXI_GEN_SUN6I
+
 choice
 	prompt "Sunxi SoC Variant"
 	optional
@@ -121,10 +125,16 @@ config MACH_SUN9I
 	select SUNXI_GEN_SUN6I
 	select SUPPORT_SPL
 
-config MACH_SUN50I
+config MACH_SUN50I_64
 	bool "sun50i (Allwinner A64)"
+	select MACH_SUN50I
 	select ARM64
-	select SUNXI_GEN_SUN6I
+	select SUPPORT_SPL
+
+config MACH_SUN50I_32
+	bool "sun50i (Allwinner A64) SPL-32bit"
+	select MACH_SUN50I
+	select CPU_V7
 	select SUPPORT_SPL
 
 endchoice
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index 5286fee..a81c8b2 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -1,7 +1,7 @@
 CONFIG_ARM=y
 CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
-CONFIG_MACH_SUN50I=y
+CONFIG_MACH_SUN50I_64=y
 CONFIG_DRAM_CLK=672
 CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
 # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set
diff --git a/configs/sun50i_spl32_defconfig b/configs/sun50i_spl32_defconfig
new file mode 100644
index 0000000..12d102d
--- /dev/null
+++ b/configs/sun50i_spl32_defconfig
@@ -0,0 +1,11 @@
+CONFIG_ARM=y
+CONFIG_ARCH_SUNXI=y
+CONFIG_MACH_SUN50I_32=y
+CONFIG_DRAM_CLK=672
+CONFIG_SPL=y
+CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
+CONFIG_OF_LIST="sun50i-a64-pine64 sun50i-a64-pine64-plus"
+# CONFIG_CMD_IMLS is not set
+# CONFIG_CMD_FLASH is not set
+# CONFIG_CMD_FPGA is not set
+CONFIG_MMC_SUNXI_SLOT_EXTRA=2
diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h
index f2cb174..d64a2b0 100644
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -182,7 +182,7 @@
 
 #define CONFIG_SPL_FRAMEWORK
 
-#ifndef CONFIG_MACH_SUN50I
+#ifndef CONFIG_MACH_SUN50I_64
 #define CONFIG_SPL_BOARD_LOAD_IMAGE
 #endif