[U-Boot] [PATCH v4 00/26] sunxi: Allwinner A64: SPL support

Hi,
hopefully the final version of the SPL support series for the Allwinner A64 SoC. There are no real code changes this time: it is just rebased on top of recent master, adds some comments in patches 16/26 and 19/26 following Maxime's suggestions, and adds Acked-by:s and Reviewed-by:s. I left the final patch 26/26 in for the sake of completeness, but don't expect it to be merged. We need a clever solution to unify 32-bit and 64-bit board configurations, but that shouldn't hold back this series for now. Merging everything up to and including patch 21/26 (sunxi: A64: enable SPL) would be great; the other patches up to 25/26 can go in as well, I think.
-------------
As with the previous versions, this one includes support for both AArch64 and AArch32 SPL builds. FIT support is still missing, which means the functionality is limited. Due to the missing ARM Trusted Firmware (ATF) in this firmware chain we lose Ethernet and SMP, among other minor things.
A full 64-bit build can be written to an SD card as expected and will boot to the U-Boot proper prompt. However Linux will crash on boot, as PSCI is missing.
Building the 32-bit version of the SPL and combining this with an ATF build and the 64-bit U-Boot proper allows using FEL booting now:
  # sunxi-fel spl sunxi-spl.bin write 0x4a000000 u-boot-dtb.bin \
      write 0x44000 bl31.bin reset64 0x44000
This way of booting the board gives full functionality.
The first patch is a rather simple fix (with no changes since v2). Patches 2-8 prepare the SPL code to be compiled for 64-bit in general and AArch64 in particular. Patches 9-11 refactor the existing boot0 header functionality to be used by patch 12, which introduces the 64-bit switch in the first SPL instructions. Patches 13-20 then introduce the actual core of the SPL support: the DRAM initialization, courtesy of Jens. This piggybacks on the existing H3 DRAM code, deviating where needed. This has been reworked compared to v2: I added a patch from Philipp to replace the rather uninspired register writes in the MBUS priority setup function with some meaningful code, explaining the various bits. Also the actual A64 DRAM code is no longer #ifdef'ed into the H3 driver, but uses parameters to (static) functions. The compiler detects this and removes the dead code from the other variant, resulting in the same binary size for the H3.
Patch 21 finally enables the 64-bit SPL support. So now building the existing pine64_plus_defconfig will generate a sunxi-spl.bin, which can be prepended to the U-Boot proper image (not .bin) to boot from an SD card. Due to the missing ATF support this is of limited usability at the moment, though. Also FEL support requires more love (switching back to AArch32 before returning to FEL, without crashing, that is ;-), so this is disabled. On my setup this results in a 26KB SPL binary, which is close to the 28KB limit mksunxiboot imposes at the moment. Adding anything (like FIT support or DEBUG) will exceed this, and although I have patches to let mksunxiboot get close to 32KB, this is the ultimate frontier.
So patches 22-25 then teach the SPL how to detect a U-Boot image file of a different bitness and do the RMR switch from AArch32 to AArch64, if needed. This is used by the final patch 26, which creates another _defconfig to let the SPL compile for AArch32 using the Thumb2 encoding. This results in a binary of less than 17KB in my case, so it has plenty of room for extensions.
Cheers, Andre.
Changelog v3 .. v4:
- rebased on top of latest HEAD
- add various Reviewed-by: and Acked-by: tags
- add comments about register bit meanings in non-ODT-setting fix
- clarify meaning of delay values in single bit delay support patch
- remove stray semicolons from boot0.h header
Changelog v2 .. v3:
- add various Reviewed-by: and Acked-by: tags
- split tiny-printf fix to handle "-" separately
- add various comments and extend commit messages
- add assembly file to re-create the embedded RMR switch code
- add patch 14/26 to explain the MBUS priority setup
- move DRAM r/w delay values into #defines to simplify re-usability
- replace #ifdef'ed addition of A64 support to the H3 DRAM driver with an approach using static parameters
Changelog v1 .. v2:
- drop SPI build fix (already merged)
- confine A31 register init change to H3 and A64
- use IS_ENABLED() instead of #ifdef to guard MBUS2 clock init
- fix tiny-printf (proper sign extension for 32-bit integers)
- add "size" output in commit msg to document tiny-printf size impact
- fix sdelay(): use only one register, add "cc" clobber
- update RMR switch code to provide easy access to RVBAR register address
- drop redundant DRAM frequency setting from Pine64 defconfig
- minor changes as requested by reviewers
Andre Przywara (21):
  sun6i: Restrict some register initialization to Allwinner A31 SoC
  armv8: prevent using THUMB
  armv8: add lowlevel_init.S
  SPL: tiny-printf: add "l" modifier
  SPL: tiny-printf: ignore "-" modifier
  move UL() macro from armv8/mmu.h into common.h
  SPL: make struct spl_image 64-bit safe
  armv8: add simple sdelay implementation
  armv8: move reset branch into boot hook
  ARM: boot0 hook: remove macro, include whole header file
  sunxi: introduce extra config option for boot0 header
  sunxi: A64: do an RMR switch if started in AArch32 mode
  sunxi: provide default DRAM config for sun50i in Kconfig
  sunxi: H3/A64: fix non-ODT setting
  sunxi: DRAM: fix H3 DRAM size display on aarch64
  sunxi: A64: enable SPL
  SPL: read and store arch property from U-Boot image
  Makefile: use "arm64" architecture for U-Boot image files
  ARM: SPL/FIT: differentiate between arm and arm64 arch properties
  sunxi: introduce RMR switch to enter payloads in 64-bit mode
  sunxi: A64: add 32-bit SPL support
Jens Kuske (3):
  sunxi: H3: add and rename some DRAM contoller registers
  sunxi: H3: add DRAM controller single bit delay support
  sunxi: A64: use H3 DRAM initialization code for A64 as well
Philipp Tomsich (2):
  sunxi: H3: Rework MBUS priority setup
  sunxi: clocks: Use the correct pattern register for PLL11
 Makefile | 9 +-
 arch/arm/cpu/armv8/Makefile | 1 +
 arch/arm/cpu/armv8/cpu.c | 14 +
 arch/arm/cpu/armv8/lowlevel_init.S | 44 +++
 arch/arm/cpu/armv8/start.S | 5 +-
 arch/arm/include/asm/arch-bcm235xx/boot0.h | 8 +-
 arch/arm/include/asm/arch-bcm281xx/boot0.h | 8 +-
 arch/arm/include/asm/arch-sunxi/boot0.h | 37 ++-
 arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 +
 arch/arm/include/asm/arch-sunxi/cpu.h | 3 +
 arch/arm/include/asm/arch-sunxi/dram.h | 2 +-
 arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 53 ++--
 arch/arm/include/asm/armv8/mmu.h | 8 -
 arch/arm/lib/Makefile | 2 +
 arch/arm/lib/spl.c | 15 +
 arch/arm/lib/vectors.S | 1 -
 arch/arm/mach-omap2/boot-common.c | 2 +-
 arch/arm/mach-sunxi/Makefile | 2 +
 arch/arm/mach-sunxi/board.c | 2 +-
 arch/arm/mach-sunxi/clock_sun6i.c | 10 +-
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 400 +++++++++++++++++-------
 arch/arm/mach-sunxi/rmr_switch.S | 41 +++
 arch/arm/mach-sunxi/spl_switch.c | 81 +++++
 arch/arm/mach-tegra/spl.c | 2 +-
 board/sunxi/Kconfig | 41 ++-
 common/spl/spl.c | 9 +-
 common/spl/spl_fit.c | 8 +
 common/spl/spl_mmc.c | 2 +-
 configs/pine64_plus_defconfig | 7 +-
 configs/sun50i_spl32_defconfig | 10 +
 include/common.h | 13 +-
 include/configs/sunxi-common.h | 4 +-
 include/spl.h | 19 +-
 lib/tiny-printf.c | 50 ++-
 34 files changed, 713 insertions(+), 201 deletions(-)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S
 create mode 100644 arch/arm/mach-sunxi/rmr_switch.S
 create mode 100644 arch/arm/mach-sunxi/spl_switch.c
 create mode 100644 configs/sun50i_spl32_defconfig

These days many Allwinner SoCs use clock_sun6i.c, although out of them only the (original sun6i) A31 has a second MBUS clock register. Also the requirement for setting up the PRCM PLL_CTLR1 register to provide the proper voltage seems to be a property of older SoCs only as well.
Restrict the MBUS initialization to this SoC only to avoid writing bogus values to (undefined) registers in other chips. I can only verify that the PLL voltage setup is not needed for H3 and A64, so for now we only spare those two SoCs.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Reviewed-by: Simon Glass <sjg@chromium.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/mach-sunxi/clock_sun6i.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c
index ed8cd9b..80cfc0b 100644
--- a/arch/arm/mach-sunxi/clock_sun6i.c
+++ b/arch/arm/mach-sunxi/clock_sun6i.c
@@ -21,6 +21,8 @@ void clock_init_safe(void)
 {
 	struct sunxi_ccm_reg * const ccm =
 		(struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
+
+#if !defined(CONFIG_MACH_SUN8I_H3) && !defined(CONFIG_MACH_SUN50I)
 	struct sunxi_prcm_reg * const prcm =
 		(struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;

@@ -31,6 +33,7 @@ void clock_init_safe(void)
 		PRCM_PLL_CTRL_LDO_DIGITAL_EN | PRCM_PLL_CTRL_LDO_ANALOG_EN |
 		PRCM_PLL_CTRL_EXT_OSC_EN | PRCM_PLL_CTRL_LDO_OUT_L(1140));
 	clrbits_le32(&prcm->pll_ctrl1, PRCM_PLL_CTRL_LDO_KEY_MASK);
+#endif

 	clock_set_pll1(408000000);

@@ -41,7 +44,8 @@ void clock_init_safe(void)
 	writel(AHB1_ABP1_DIV_DEFAULT, &ccm->ahb1_apb1_div);

 	writel(MBUS_CLK_DEFAULT, &ccm->mbus0_clk_cfg);
-	writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg);
+	if (IS_ENABLED(CONFIG_MACH_SUN6I))
+		writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg);
 }
 #endif
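For illustration only (not part of the patch): a minimal, stand-alone sketch of why a constant condition in plain C can replace an #ifdef here. All SKETCH_*/sketch_* names, the register struct and the clock value are made up for this example; the real code uses IS_ENABLED(CONFIG_MACH_SUN6I) and writel().

```c
#include <stdint.h>

/* Hypothetical stand-ins for the Kconfig symbol and the clock default. */
#define SKETCH_MACH_SUN6I	0		/* 1 on an A31 build, 0 elsewhere */
#define SKETCH_MBUS_CLK_DEFAULT	0x81000001u	/* placeholder, not from the patch */

/* Hypothetical, heavily reduced register block. */
struct sketch_ccm {
	uint32_t mbus0_clk_cfg;
	uint32_t mbus1_clk_cfg;	/* only backed by hardware on the A31 */
};

/*
 * Both branches are parsed and type-checked on every build, but the
 * constant condition lets the compiler drop the second store when the
 * symbol is 0, so nothing is written to the register that does not exist.
 */
static void sketch_mbus_init(struct sketch_ccm *ccm)
{
	ccm->mbus0_clk_cfg = SKETCH_MBUS_CLK_DEFAULT;
	if (SKETCH_MACH_SUN6I)
		ccm->mbus1_clk_cfg = SKETCH_MBUS_CLK_DEFAULT;
}
```

Compared to an #ifdef, the guarded statement still gets compile coverage on all targets, which is presumably why reviewers asked for this form.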

The predominantly 32-bit ARM targets try to compile the SPL in Thumb mode to reduce code size. The 64-bit AArch64 instruction set does not have an alternative, concise encoding, so the Thumb build option should only be set for 32-bit targets. Likewise -marm machine options are only valid for ARMv7 targets.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Simon Glass <sjg@chromium.org>
Reviewed-by: Tom Rini <trini@konsulko.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/lib/Makefile | 2 ++
 include/configs/sunxi-common.h | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 0051f76..024139d 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -77,8 +77,10 @@ ifndef CONFIG_HAS_THUMB2

 # for C files, just apend -marm, which will override previous -mthumb*

+ifndef CONFIG_ARM64
 CFLAGS_cache.o := -marm
 CFLAGS_cache-cp15.o := -marm
+endif

 # For .S, drop -mthumb* and other thumb-related options.
 # CFLAGS_REMOVE_* would not have an effet, so AFLAGS_REMOVE_*
diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h
index b0bfc0d..e05c318 100644
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -35,7 +35,7 @@
 /*
  * High Level Configuration Options
  */
-#ifdef CONFIG_SPL_BUILD
+#if defined(CONFIG_SPL_BUILD) && !defined(CONFIG_ARM64)
 #define CONFIG_SYS_THUMB_BUILD	/* Thumbs mode to save space in SPL */
 #endif

For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
---
 arch/arm/cpu/armv8/Makefile | 1 +
 arch/arm/cpu/armv8/lowlevel_init.S | 44 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S
diff --git a/arch/arm/cpu/armv8/Makefile b/arch/arm/cpu/armv8/Makefile index 28ba786..e780afc 100644 --- a/arch/arm/cpu/armv8/Makefile +++ b/arch/arm/cpu/armv8/Makefile @@ -26,3 +26,4 @@ obj-$(CONFIG_S32V234) += s32v234/ obj-$(CONFIG_ARCH_ZYNQMP) += zynqmp/ obj-$(CONFIG_TARGET_HIKEY) += hisilicon/ obj-$(CONFIG_ARMV8_PSCI) += psci.o +obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o diff --git a/arch/arm/cpu/armv8/lowlevel_init.S b/arch/arm/cpu/armv8/lowlevel_init.S new file mode 100644 index 0000000..189e35f --- /dev/null +++ b/arch/arm/cpu/armv8/lowlevel_init.S @@ -0,0 +1,44 @@ +/* + * A lowlevel_init function that sets up the stack to call a C function to + * perform further init. + * + * SPDX-License-Identifier: GPL-2.0+ + */ + +#include <asm-offsets.h> +#include <config.h> +#include <linux/linkage.h> + +ENTRY(lowlevel_init) + /* + * Setup a temporary stack. Global data is not available yet. + */ +#if defined(CONFIG_SPL_BUILD) && defined(CONFIG_SPL_STACK) + ldr w0, =CONFIG_SPL_STACK +#else + ldr w0, =CONFIG_SYS_INIT_SP_ADDR +#endif + bic sp, x0, #0xf /* 16-byte alignment for ABI compliance */ + + /* + * Save the old LR(passed in x29) and the current LR to stack + */ + stp x29, x30, [sp, #-16]! + + /* + * Call the very early init function. This should do only the + * absolute bare minimum to get started. It should not: + * + * - set up DRAM + * - use global_data + * - clear BSS + * - try to start a console + * + * For boards with SPL this should be empty since SPL can do all of + * this init in the SPL board_init_f() function which is called + * immediately after this. + */ + bl s_init + ldp x29, x30, [sp] + ret +ENDPROC(lowlevel_init)
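For illustration only (not part of the patch): the "bic sp, x0, #0xf" above rounds the configured stack address down to a 16-byte boundary, as AArch64 requires for the stack pointer. A minimal C model of that step, with a made-up function name:

```c
#include <stdint.h>

/*
 * Clear the low four bits of addr, which is what "bic <dst>, <src>, #0xf"
 * does, so the result is the nearest 16-byte aligned address at or
 * below addr.
 */
static uint64_t sketch_align_stack(uint64_t addr)
{
	return addr & ~UINT64_C(0xf);
}
```

An already aligned value passes through unchanged, so a board that defines an aligned CONFIG_SPL_STACK should see no difference.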

tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly.
Using a relatively decent compiler (GCC 5.3.0) this does _not_ increase the code size of tiny-printf.o for 32-bit builds (where long and int are actually the same); it actually loses three (ARM Thumb2) instructions from the actual SPL (numbers for orangepi_plus_defconfig):
   text    data     bss     dec     hex filename
    758       0       0     758     2f6 spl/lib/tiny-printf.o   before
  18839     488     232   19559    4c67 spl/u-boot-spl          before
    758       0       0     758     2f6 spl/lib/tiny-printf.o   after
  18833     488     232   19553    4c61 spl/u-boot-spl          after
This adds a substantial amount of code to a 64-bit build, though (numbers taken after a later commit, which enables the ARM64 SPL build for sunxi):
   text    data     bss     dec     hex filename
   1542       0       0    1542     606 spl/lib/tiny-printf.o   before
  25830     392     360   26582    67d6 spl/u-boot-spl          before
   1758       0       0    1758     6de spl/lib/tiny-printf.o   after
  26040     392     360   26792    68a8 spl/u-boot-spl          after
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
---
 lib/tiny-printf.c | 47 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 36 insertions(+), 11 deletions(-)
diff --git a/lib/tiny-printf.c b/lib/tiny-printf.c index 30ac759..0b8512f 100644 --- a/lib/tiny-printf.c +++ b/lib/tiny-printf.c @@ -38,8 +38,8 @@ static void out_dgt(struct printf_info *info, char dgt) info->zs = 1; }
-static void div_out(struct printf_info *info, unsigned int *num, - unsigned int div) +static void div_out(struct printf_info *info, unsigned long *num, + unsigned long div) { unsigned char dgt = 0;
@@ -56,9 +56,9 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va) { char ch; char *p; - unsigned int num; + unsigned long num; char buf[12]; - unsigned int div; + unsigned long div;
while ((ch = *(fmt++))) { if (ch != '%') { @@ -66,6 +66,7 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va) } else { bool lz = false; int width = 0; + bool islong = false;
ch = *(fmt++); if (ch == '0') { @@ -80,6 +81,11 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va) ch = *fmt++; } } + if (ch == 'l') { + ch = *(fmt++); + islong = true; + } + info->bf = buf; p = info->bf; info->zs = 0; @@ -89,24 +95,43 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va) goto abort; case 'u': case 'd': - num = va_arg(va, unsigned int); - if (ch == 'd' && (int)num < 0) { - num = -(int)num; - out(info, '-'); + div = 1000000000; + if (islong) { + num = va_arg(va, unsigned long); + if (sizeof(long) > 4) + div *= div * 10; + } else { + num = va_arg(va, unsigned int); + } + + if (ch == 'd') { + if (islong && (long)num < 0) { + num = -(long)num; + out(info, '-'); + } else if (!islong && (int)num < 0) { + num = -(int)num; + out(info, '-'); + } } if (!num) { out_dgt(info, 0); } else { - for (div = 1000000000; div; div /= 10) + for (; div; div /= 10) div_out(info, &num, div); } break; case 'x': - num = va_arg(va, unsigned int); + if (islong) { + num = va_arg(va, unsigned long); + div = 1UL << (sizeof(long) * 8 - 4); + } else { + num = va_arg(va, unsigned int); + div = 0x10000000; + } if (!num) { out_dgt(info, 0); } else { - for (div = 0x10000000; div; div /= 0x10) + for (; div; div /= 0x10) div_out(info, &num, div); } break;
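For illustration only (not part of the patch): a stand-alone sketch of the divisor walk that the hunks above extend to longs. It starts from the largest digit position a value of the given width can have and divides the divisor down, suppressing leading zeros. The sketch_* names and the buffer-based interface are assumptions made for this example; tiny-printf itself emits characters through its printf_info callbacks and subtracts in a loop rather than dividing.

```c
#include <stddef.h>

/* Largest divisor to start from: top hex nibble or top decimal digit. */
static unsigned long sketch_top_divisor(int islong, unsigned int base)
{
	unsigned long div;

	if (base == 16)
		return islong ? 1UL << (sizeof(long) * 8 - 4) : 0x10000000UL;

	div = 1000000000UL;		/* 10^9 covers a 32-bit value */
	if (islong && sizeof(long) > 4)
		div *= div * 10;	/* 10^19 covers a 64-bit value */
	return div;
}

/*
 * Write num into out as a NUL-terminated string in base 10 or 16 and
 * return its length. Without islong, num is expected to fit in 32 bits.
 */
static size_t sketch_format(char *out, unsigned long num, int islong,
			    unsigned int base)
{
	unsigned long div;
	size_t n = 0;

	if (!num) {
		out[n++] = '0';
	} else {
		for (div = sketch_top_divisor(islong, base); div; div /= base) {
			unsigned int dgt = (unsigned int)(num / div);

			num %= div;
			if (dgt || n)	/* skip leading zeros only */
				out[n++] = (char)(dgt < 10 ? '0' + dgt
							   : 'a' + dgt - 10);
		}
	}
	out[n] = '\0';
	return n;
}
```

The only width-dependent part is the starting divisor, which matches the observation in the commit message that 32-bit builds do not grow.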

tiny-printf does not know about the "-" modifier, which left-aligns numbers within their field. This is used by some SPL code, but as it's purely cosmetic, we just ignore this modifier here to avoid changing correct printf strings.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
---
 lib/tiny-printf.c | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/lib/tiny-printf.c b/lib/tiny-printf.c
index 0b8512f..dfa8432 100644
--- a/lib/tiny-printf.c
+++ b/lib/tiny-printf.c
@@ -69,6 +69,9 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 			bool islong = false;

 			ch = *(fmt++);
+			if (ch == '-')
+				ch = *(fmt++);
+
 			if (ch == '0') {
 				ch = *(fmt++);
 				lz = 1;
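For illustration only (not part of the patch): a stand-alone sketch of the order in which a conversion specification is consumed once "-" and "l" are both handled: optional "-" (skipped), optional "0", width digits, optional "l", then the conversion character. The struct and function names are made up for this example, and fmt is assumed to point just past the '%'.

```c
/* Hypothetical result of parsing one conversion specification. */
struct sketch_spec {
	int lz;		/* leading-zero flag ("0") seen */
	int width;	/* field width, 0 if none given */
	int islong;	/* "l" modifier seen */
	char conv;	/* conversion character, e.g. 'x' or 'd' */
};

/* Parse the spec at fmt and return a pointer past the conversion char. */
static const char *sketch_parse_spec(const char *fmt, struct sketch_spec *s)
{
	char ch = *fmt++;

	s->lz = 0;
	s->width = 0;
	s->islong = 0;

	if (ch == '-')		/* alignment is cosmetic here: skip it */
		ch = *fmt++;
	if (ch == '0') {
		ch = *fmt++;
		s->lz = 1;
	}
	while (ch >= '0' && ch <= '9') {
		s->width = s->width * 10 + (ch - '0');
		ch = *fmt++;
	}
	if (ch == 'l') {
		ch = *fmt++;
		s->islong = 1;
	}
	s->conv = ch;
	return fmt;
}
```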

The UL() macro is pretty useful for sharing constants between assembly and C files while still being able to specify a type for C. Move the macro from an armv8 specific header into a common header file so that arm code (for instance) can use it as well.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/include/asm/armv8/mmu.h | 8 --------
 include/common.h | 13 ++++++++++++-
 2 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index aa0f3c4..e9b4cdb 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -8,14 +8,6 @@ #ifndef _ASM_ARMV8_MMU_H_ #define _ASM_ARMV8_MMU_H_
-#ifdef __ASSEMBLY__ -#define _AC(X, Y) X -#else -#define _AC(X, Y) (X##Y) -#endif - -#define UL(x) _AC(x, UL) - /***************************************************************/ /* * The following definitions are related each other, shoud be diff --git a/include/common.h b/include/common.h index a8d833b..ee0436b 100644 --- a/include/common.h +++ b/include/common.h @@ -15,6 +15,9 @@ typedef volatile unsigned long vu_long; typedef volatile unsigned short vu_short; typedef volatile unsigned char vu_char;
+/* Allow sharing constants with type modifiers between C and assembly. */ +#define _AC(X, Y) (X##Y) + #include <config.h> #include <errno.h> #include <asm-offsets.h> @@ -936,7 +939,12 @@ int cpu_disable(int nr); int cpu_release(int nr, int argc, char * const argv[]); #endif
-#endif /* __ASSEMBLY__ */ +#else /* __ASSEMBLY__ */ + +/* Drop a C type modifier (like in 3UL) for constants used in assembly. */ +#define _AC(X, Y) X + +#endif /* __ASSEMBLY__ */
#ifdef CONFIG_PPC /* @@ -948,6 +956,9 @@ int cpu_release(int nr, int argc, char * const argv[]);
/* Put only stuff here that the assembler can digest */
+/* Declare an unsigned long constant digestable both by C and an assembler. */ +#define UL(x) _AC(x, UL) + #ifdef CONFIG_POST #define CONFIG_HAS_POST #ifndef CONFIG_POST_ALT_LIST
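For illustration only (not part of the patch): a stand-alone sketch of the token-pasting trick behind _AC()/UL(). The SKETCH_* names and the base address are made up for this example; the idea is that a C file sees a typed constant such as (0x10000UL), while an assembly file that defines __ASSEMBLY__ sees the bare number.

```c
/* Hypothetical names; the patch itself calls these _AC() and UL(). */
#ifdef __ASSEMBLY__
#define SKETCH_AC(X, Y)	X		/* assembler: drop the type suffix */
#else
#define SKETCH_AC(X, Y)	(X##Y)		/* C: paste the suffix onto the number */
#endif
#define SKETCH_UL(x)	SKETCH_AC(x, UL)

/* One definition usable from both .S and .c files (address is made up). */
#define SKETCH_SRAM_BASE	SKETCH_UL(0x10000)

#ifndef __ASSEMBLY__
/* In C the constant has type unsigned long, so this arithmetic is long-wide. */
static unsigned long sketch_sram_end(unsigned long size)
{
	return SKETCH_SRAM_BASE + size;
}
#endif
```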

Since entry_point and load_addr are addresses, they should be represented as longs to cover the whole address space and to avoid warnings when compiling the SPL for 64-bit. Also adjust debug prints to add the 'l' specifier, where needed.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Simon Glass <sjg@chromium.org>
Reviewed-by: Tom Rini <trini@konsulko.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/mach-omap2/boot-common.c | 2 +-
 arch/arm/mach-tegra/spl.c | 2 +-
 common/spl/spl.c | 8 ++++----
 common/spl/spl_mmc.c | 2 +-
 include/spl.h | 4 ++--
 5 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm/mach-omap2/boot-common.c b/arch/arm/mach-omap2/boot-common.c index 385310b..7ae3d80 100644 --- a/arch/arm/mach-omap2/boot-common.c +++ b/arch/arm/mach-omap2/boot-common.c @@ -228,7 +228,7 @@ void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image)
u32 boot_params = *((u32 *)OMAP_SRAM_SCRATCH_BOOT_PARAMS);
- debug("image entry point: 0x%X\n", spl_image->entry_point); + debug("image entry point: 0x%lX\n", spl_image->entry_point); /* Pass the saved boot_params from rom code */ image_entry((u32 *)boot_params); } diff --git a/arch/arm/mach-tegra/spl.c b/arch/arm/mach-tegra/spl.c index e0f9d5b..41c88cb 100644 --- a/arch/arm/mach-tegra/spl.c +++ b/arch/arm/mach-tegra/spl.c @@ -42,7 +42,7 @@ u32 spl_boot_device(void)
void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image) { - debug("image entry point: 0x%X\n", spl_image->entry_point); + debug("image entry point: 0x%lX\n", spl_image->entry_point);
start_cpu((u32)spl_image->entry_point); halt_avp(); diff --git a/common/spl/spl.c b/common/spl/spl.c index f7df834..a76ea3a 100644 --- a/common/spl/spl.c +++ b/common/spl/spl.c @@ -115,7 +115,7 @@ int spl_parse_image_header(struct spl_image_info *spl_image, } spl_image->os = image_get_os(header); spl_image->name = image_get_name(header); - debug("spl: payload image: %.*s load addr: 0x%x size: %d\n", + debug("spl: payload image: %.*s load addr: 0x%lx size: %d\n", (int)sizeof(spl_image->name), spl_image->name, spl_image->load_addr, spl_image->size); } else { @@ -140,7 +140,7 @@ int spl_parse_image_header(struct spl_image_info *spl_image, spl_image->load_addr = CONFIG_SYS_LOAD_ADDR; spl_image->entry_point = CONFIG_SYS_LOAD_ADDR; spl_image->size = end - start; - debug("spl: payload zImage, load addr: 0x%x size: %d\n", + debug("spl: payload zImage, load addr: 0x%lx size: %d\n", spl_image->load_addr, spl_image->size); return 0; } @@ -164,9 +164,9 @@ __weak void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image) typedef void __noreturn (*image_entry_noargs_t)(void);
image_entry_noargs_t image_entry = - (image_entry_noargs_t)(unsigned long)spl_image->entry_point; + (image_entry_noargs_t)spl_image->entry_point;
- debug("image entry point: 0x%X\n", spl_image->entry_point); + debug("image entry point: 0x%lX\n", spl_image->entry_point); image_entry(); }
diff --git a/common/spl/spl_mmc.c b/common/spl/spl_mmc.c index 85e3de8..0cd355c 100644 --- a/common/spl/spl_mmc.c +++ b/common/spl/spl_mmc.c @@ -36,7 +36,7 @@ static int mmc_load_legacy(struct spl_image_info *spl_image, struct mmc *mmc, /* Read the header too to avoid extra memcpy */ count = blk_dread(mmc_get_blk_desc(mmc), sector, image_size_sectors, (void *)(ulong)spl_image->load_addr); - debug("read %x sectors to %x\n", image_size_sectors, + debug("read %x sectors to %lx\n", image_size_sectors, spl_image->load_addr); if (count != image_size_sectors) return -EIO; diff --git a/include/spl.h b/include/spl.h index 6e746b2..bde4437 100644 --- a/include/spl.h +++ b/include/spl.h @@ -23,8 +23,8 @@ struct spl_image_info { const char *name; u8 os; - u32 load_addr; - u32 entry_point; + ulong load_addr; + ulong entry_point; u32 size; u32 flags; };
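For illustration only (not part of the patch): a stand-alone sketch of what goes wrong when an address above 4 GB is kept in a 32-bit field. The struct and function names are made up; uint64_t stands in for ulong as it would be on a 64-bit build.

```c
#include <stdint.h>

/* Hypothetical reduced versions of the image info before and after. */
struct sketch_image_old { uint32_t entry_point; };
struct sketch_image_new { uint64_t entry_point; };

/* Returns 1 if addr survives being stored in the old 32-bit field. */
static int sketch_roundtrips_old(uint64_t addr)
{
	struct sketch_image_old img = { (uint32_t)addr };

	return img.entry_point == addr;
}

/* Returns 1 if addr survives being stored in the widened field. */
static int sketch_roundtrips_new(uint64_t addr)
{
	struct sketch_image_new img = { addr };

	return img.entry_point == addr;
}
```

Addresses below 4 GB, as used on the A64 today, round-trip either way, so the widening mostly removes casts and warnings rather than fixing an observed failure.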

The sunxi DRAM setup code needs an sdelay() implementation, which wasn't defined for armv8 so far. Shamelessly copy the armv7 version and adjust it to work in AArch64.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/cpu/armv8/cpu.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c
index 5dcb5e2..28a27f7 100644
--- a/arch/arm/cpu/armv8/cpu.c
+++ b/arch/arm/cpu/armv8/cpu.c
@@ -17,6 +17,20 @@
 #include <asm/secure.h>
 #include <linux/compiler.h>

+/*
+ * sdelay() - simple spin loop.
+ *
+ * Will delay execution by roughly (@loops * 2) cycles.
+ * This is necessary to be used before timers are accessible.
+ *
+ * A value of "0" will results in 2^64 loops.
+ */
+void sdelay(unsigned long loops)
+{
+	__asm__ volatile ("1:\n" "subs %0, %0, #1\n"
+			  "b.ne 1b" : "=r" (loops) : "0"(loops) : "cc");
+}
+
 int cleanup_before_linux(void)
 {
 	/*
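For illustration only (not part of the patch): a portable C model of the "subs; b.ne" loop, to show why the decrement-then-test order makes an argument of 0 wrap around instead of returning immediately. The function name is made up, and this counts iterations rather than delaying.

```c
#include <stdint.h>

/*
 * Model of the loop body: decrement first, then branch back while the
 * result is non-zero. Returns how many times the body ran. Only call
 * this with small non-zero values: 0 would wrap to 2^64 - 1 on the
 * first decrement and spin for 2^64 iterations, as the comment in the
 * patch warns.
 */
static uint64_t sketch_sdelay_iterations(uint64_t loops)
{
	uint64_t count = 0;

	do {
		loops--;
		count++;
	} while (loops != 0);

	return count;
}
```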

The boot0 hook we have so far is applied _after_ the initial branch to the "reset" entry point. An upcoming change requires even this branch to be changed, so we apply the hook macro at the earliest point, and have the branch in the hook file as well. This is no functional change at this point, just refactoring to simplify upcoming patches.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
---
 arch/arm/cpu/armv8/start.S | 4 ++--
 arch/arm/include/asm/arch-sunxi/boot0.h | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S
index 4f5f6d8..ee393d7 100644
--- a/arch/arm/cpu/armv8/start.S
+++ b/arch/arm/cpu/armv8/start.S
@@ -19,8 +19,6 @@

 .globl	_start
 _start:
-	b	reset
-
 #ifdef CONFIG_ENABLE_ARM_SOC_BOOT0_HOOK
 /*
  * Various SoCs need something special and SoC-specific up front in
@@ -29,6 +27,8 @@ _start:
  */
 #include <asm/arch/boot0.h>
 ARM_SOC_BOOT0_HOOK
+#else
+	b	reset
 #endif

 	.align 3
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index ea5675e..6f28d63 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -9,6 +9,7 @@

 /* reserve space for BOOT0 header information */
 #define ARM_SOC_BOOT0_HOOK	\
+	b	reset;		\
 	.space	1532

 #endif /* __BOOT0_H */

For prepending some board specific header area to U-Boot images we were so far including a header file with a macro definition containing the actual header specification. This works fine if there are just a few statements and if there is only one alternative. However adding more complex code quickly gets messy with this approach, so let's just drop that intermediate macro and let the #include actually insert the code directly. This converts the callers and the callees, but doesn't change anything at this point.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
Tested-by: Steve Rae <steve.rae@raedomain.com>
---
 arch/arm/cpu/armv8/start.S | 1 -
 arch/arm/include/asm/arch-bcm235xx/boot0.h | 8 +-------
 arch/arm/include/asm/arch-bcm281xx/boot0.h | 8 +-------
 arch/arm/include/asm/arch-sunxi/boot0.h | 8 +-------
 arch/arm/lib/vectors.S | 1 -
 5 files changed, 3 insertions(+), 23 deletions(-)
diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S index ee393d7..140609d 100644 --- a/arch/arm/cpu/armv8/start.S +++ b/arch/arm/cpu/armv8/start.S @@ -26,7 +26,6 @@ _start: * use it here. */ #include <asm/arch/boot0.h> -ARM_SOC_BOOT0_HOOK #else b reset #endif diff --git a/arch/arm/include/asm/arch-bcm235xx/boot0.h b/arch/arm/include/asm/arch-bcm235xx/boot0.h index 7e72882..a747bd3 100644 --- a/arch/arm/include/asm/arch-bcm235xx/boot0.h +++ b/arch/arm/include/asm/arch-bcm235xx/boot0.h @@ -4,12 +4,6 @@ * SPDX-License-Identifier: GPL-2.0+ */
-#ifndef __BOOT0_H -#define __BOOT0_H - /* BOOT0 header information */ -#define ARM_SOC_BOOT0_HOOK \ - .word 0xbabeface; \ + .word 0xbabeface .word _end - _start - -#endif /* __BOOT0_H */ diff --git a/arch/arm/include/asm/arch-bcm281xx/boot0.h b/arch/arm/include/asm/arch-bcm281xx/boot0.h index 7e72882..a747bd3 100644 --- a/arch/arm/include/asm/arch-bcm281xx/boot0.h +++ b/arch/arm/include/asm/arch-bcm281xx/boot0.h @@ -4,12 +4,6 @@ * SPDX-License-Identifier: GPL-2.0+ */
-#ifndef __BOOT0_H -#define __BOOT0_H - /* BOOT0 header information */ -#define ARM_SOC_BOOT0_HOOK \ - .word 0xbabeface; \ + .word 0xbabeface .word _end - _start - -#endif /* __BOOT0_H */ diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h index 6f28d63..6a13db5 100644 --- a/arch/arm/include/asm/arch-sunxi/boot0.h +++ b/arch/arm/include/asm/arch-sunxi/boot0.h @@ -4,12 +4,6 @@ * SPDX-License-Identifier: GPL-2.0+ */
-#ifndef __BOOT0_H -#define __BOOT0_H - /* reserve space for BOOT0 header information */ -#define ARM_SOC_BOOT0_HOOK \ - b reset; \ + b reset .space 1532 - -#endif /* __BOOT0_H */ diff --git a/arch/arm/lib/vectors.S b/arch/arm/lib/vectors.S index 5cc132b..9fe7415 100644 --- a/arch/arm/lib/vectors.S +++ b/arch/arm/lib/vectors.S @@ -67,7 +67,6 @@ _start: * use it here. */ #include <asm/arch/boot0.h> -ARM_SOC_BOOT0_HOOK #endif
/*

The ENABLE_ARM_SOC_BOOT0_HOOK option is a generic option shared with other boards. To allow alternative code to be inserted, we create another, now function specific config symbol on top of it to simplify later additions. No functional change at this time.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
---
 board/sunxi/Kconfig | 9 +++++++++
 configs/pine64_plus_defconfig | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index e1d4ab1..0cd57a2 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -133,6 +133,15 @@ config MACH_SUN8I bool default y if MACH_SUN8I_A23 || MACH_SUN8I_A33 || MACH_SUN8I_H3 || MACH_SUN8I_A83T
+config RESERVE_ALLWINNER_BOOT0_HEADER + bool "reserve space for Allwinner boot0 header" + select ENABLE_ARM_SOC_BOOT0_HOOK + ---help--- + Prepend a 1536 byte (empty) header to the U-Boot image file, to be + filled with magic values post build. The Allwinner provided boot0 + blob relies on this information to load and execute U-Boot. + Only needed on 64-bit Allwinner boards so far when using boot0. + config DRAM_TYPE int "sunxi dram type" depends on MACH_SUN8I_A83T diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig index 6d0198f..ea53b96 100644 --- a/configs/pine64_plus_defconfig +++ b/configs/pine64_plus_defconfig @@ -1,5 +1,5 @@ CONFIG_ARM=y -CONFIG_ENABLE_ARM_SOC_BOOT0_HOOK=y +CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y CONFIG_ARCH_SUNXI=y CONFIG_MACH_SUN50I=y CONFIG_DRAM_CLK=672

The Allwinner A64 SoC starts execution in AArch32 mode, and both the boot ROM and Allwinner's boot0 keep running in this mode. So U-Boot gets entered in 32-bit, although we want it to run in AArch64.
By using a "magic" instruction, which happens to be an almost-NOP in AArch64 and a branch in AArch32, we differentiate between being entered in 64-bit or 32-bit mode. If in 64-bit mode, we proceed with the branch to reset, but in 32-bit mode we trigger an RMR write to bring the core into AArch64/EL3 and re-enter U-Boot at CONFIG_SYS_TEXT_BASE. This allows a 64-bit U-Boot to be entered in either 32-bit or 64-bit mode, so we can use the same start code for the SPL and the U-Boot proper.
We use the existing custom header (boot0.h) functionality, but restrict the existing boot0 header reservation to the non-SPL build now. An SPL wouldn't need such a header anyway. This allows having both options defined and lets us use one for the SPL and the other for U-Boot proper.
Also add arch/arm/mach-sunxi/rmr_switch.S, which contains the original ARM assembly code and instructions on how to regenerate the encoded version.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/include/asm/arch-sunxi/boot0.h | 30 ++++++++++++++++++++++++
 arch/arm/mach-sunxi/rmr_switch.S | 41 +++++++++++++++++++++++++++++++++
 board/sunxi/Kconfig | 14 +++++++++++
 3 files changed, 85 insertions(+)
 create mode 100644 arch/arm/mach-sunxi/rmr_switch.S
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index 6a13db5..9c6d82d 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -4,6 +4,36 @@
  * SPDX-License-Identifier: GPL-2.0+
  */
 
+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) && !defined(CONFIG_SPL_BUILD)
 /* reserve space for BOOT0 header information */
 	b	reset
 	.space	1532
+#elif defined(CONFIG_ARM_BOOT_HOOK_RMR)
+/*
+ * Switch into AArch64 if needed.
+ * Refer to arch/arm/mach-sunxi/rmr_switch.S for the original source.
+ */
+	tst	x0, x0			// this is "b #0x84" in ARM
+	b	reset
+	.space	0x7c
+	.word	0xe59f1024	// ldr r1, [pc, #36] ; 0x017000a0
+	.word	0xe59f0024	// ldr r0, [pc, #36] ; CONFIG_*_TEXT_BASE
+	.word	0xe5810000	// str r0, [r1]
+	.word	0xf57ff04f	// dsb sy
+	.word	0xf57ff06f	// isb sy
+	.word	0xee1c0f50	// mrc 15, 0, r0, cr12, cr0, {2} ; RMR
+	.word	0xe3800003	// orr r0, r0, #3
+	.word	0xee0c0f50	// mcr 15, 0, r0, cr12, cr0, {2} ; RMR
+	.word	0xf57ff06f	// isb sy
+	.word	0xe320f003	// wfi
+	.word	0xeafffffd	// b @wfi
+	.word	0x017000a0	// writeable RVBAR mapping address
+#ifdef CONFIG_SPL_BUILD
+	.word	CONFIG_SPL_TEXT_BASE
+#else
+	.word	CONFIG_SYS_TEXT_BASE
+#endif
+#else
+/* normal execution */
+	b	reset
+#endif
diff --git a/arch/arm/mach-sunxi/rmr_switch.S b/arch/arm/mach-sunxi/rmr_switch.S
new file mode 100644
index 0000000..cefa930
--- /dev/null
+++ b/arch/arm/mach-sunxi/rmr_switch.S
@@ -0,0 +1,41 @@
+@
+@ ARMv8 RMR reset sequence on Allwinner SoCs.
+@
+@ All 64-bit capable Allwinner SoCs reset in AArch32 (and continue to
+@ execute the Boot ROM in this state), so we need to switch to AArch64
+@ at some point.
+@ Section G6.2.133 of the ARMv8 ARM describes the Reset Management Register
+@ (RMR), which triggers a warm-reset of a core and can request to switch
+@ into a different execution state (AArch32 or AArch64).
+@ The address at which execution starts after the reset is held in the
+@ RVBAR system register, which is architecturally read-only.
+@ Allwinner provides a writable alias of this register in MMIO space, so
+@ we can easily set the start address of AArch64 code.
+@ This code below switches to AArch64 and starts execution at the specified
+@ start address. It needs to be assembled by an ARM(32) assembler and
+@ the machine code must be inserted as verbatim .word statements into the
+@ beginning of the AArch64 U-Boot code.
+@ To get the encoded bytes, use:
+@ ${CROSS_COMPILE}gcc -c -o rmr_switch.o rmr_switch.S
+@ ${CROSS_COMPILE}objdump -d rmr_switch.o
+@
+@ The resulting words should be inserted into the U-Boot file at
+@ arch/arm/include/asm/arch-sunxi/boot0.h.
+@
+@ This file is not built by the U-Boot build system, but provided only as a
+@ reference and to be able to regenerate a (probably fixed) version of this
+@ code found in encoded form in boot0.h.
+
+.text
+
+	ldr	r1, =0x017000a0		@ MMIO mapped RVBAR[0] register
+	ldr	r0, =0x57aA7add		@ start address, to be replaced
+	str	r0, [r1]
+	dsb	sy
+	isb	sy
+	mrc	15, 0, r0, cr12, cr0, 2	@ read RMR register
+	orr	r0, r0, #3		@ request reset in AArch64
+	mcr	15, 0, r0, cr12, cr0, 2	@ write RMR register
+	isb	sy
+1:	wfi
+	b	1b
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index 0cd57a2..f020573 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -142,6 +142,20 @@ config RESERVE_ALLWINNER_BOOT0_HEADER
 	blob relies on this information to load and execute U-Boot.
 	Only needed on 64-bit Allwinner boards so far when using boot0.
 
+config ARM_BOOT_HOOK_RMR
+	bool
+	depends on ARM64
+	default y
+	select ENABLE_ARM_SOC_BOOT0_HOOK
+	---help---
+	  Insert some ARM32 code at the very beginning of the U-Boot binary
+	  which uses an RMR register write to bring the core into AArch64 mode.
+	  The very first instruction acts as a switch, since it's carefully
+	  chosen to be a NOP in one mode and a branch in the other, so the
+	  code would only be executed if not already in AArch64.
+	  This allows both the SPL and the U-Boot proper to be entered in
+	  either mode and switch to AArch64 if needed.
+
 config DRAM_TYPE
 	int "sunxi dram type"
 	depends on MACH_SUN8I_A83T
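To illustrate the "NOP in one mode, branch in the other" trick described in the help text, here is a minimal host-side sketch in C. It assumes the standard A64 encoding of "tst x0, x0" (0xea00001f, not spelled out in the patch) and decodes the same word as an A32 branch. The helper names are made up for this example and are not part of U-Boot.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed A64 encoding of "tst x0, x0" (alias of "ands xzr, x0, x0"). */
#define FIRST_INSN 0xea00001fu

/* Returns 1 if the word is an A32 "b" with condition AL (bits 31:24 = 0xea). */
int is_a32_branch_always(uint32_t insn)
{
	return (insn & 0xff000000u) == 0xea000000u;
}

/*
 * Target of an A32 "b" located at insn_addr: the PC reads as insn_addr + 8
 * and the signed 24-bit immediate counts 32-bit words.
 */
uint32_t a32_branch_target(uint32_t insn_addr, uint32_t insn)
{
	int32_t imm24 = (int32_t)(insn & 0x00ffffffu);

	assert(is_a32_branch_always(insn));
	if (imm24 & 0x00800000)
		imm24 -= 0x01000000;	/* sign-extend the immediate */
	return insn_addr + 8 + (uint32_t)(imm24 * 4);
}

/* Offset of the first .word: tst (4) + "b reset" (4) + .space 0x7c. */
uint32_t rmr_stub_offset(void)
{
	return 4 + 4 + 0x7c;
}
```

With the word placed at offset 0, a32_branch_target(0, FIRST_INSN) gives 0x84, which is also what rmr_stub_offset() returns. So a core entering in AArch32 jumps straight over the "b reset" and the padding into the RMR sequence, while a core already in AArch64 executes a harmless flag-setting instruction and falls through to "b reset".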

To avoid enumerating the very same DRAM values in defconfig files for each and every Allwinner A64 board out there, let's put some sane default values in the Kconfig file. Boards with different needs can override them at any time.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Simon Glass <sjg@chromium.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 board/sunxi/Kconfig           | 2 ++
 configs/pine64_plus_defconfig | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index f020573..c2eb85e 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -168,6 +168,7 @@ config DRAM_CLK
 	default 792 if MACH_SUN9I
 	default 312 if MACH_SUN6I || MACH_SUN8I
 	default 360 if MACH_SUN4I || MACH_SUN5I || MACH_SUN7I
+	default 672 if MACH_SUN50I
 	---help---
 	Set the dram clock speed, valid range 240 - 480 (prior to sun9i),
 	must be a multiple of 24. For the sun9i (A80), the tested values
@@ -187,6 +188,7 @@ config DRAM_ZQ
 	default 123 if MACH_SUN4I || MACH_SUN5I || MACH_SUN6I || MACH_SUN8I
 	default 127 if MACH_SUN7I
 	default 4145117 if MACH_SUN9I
+	default 3881915 if MACH_SUN50I
 	---help---
 	Set the dram zq value.
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index ea53b96..ebc24b8 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -2,8 +2,6 @@ CONFIG_ARM=y
 CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_MACH_SUN50I=y
-CONFIG_DRAM_CLK=672
-CONFIG_DRAM_ZQ=3881915
 CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
 # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set
 CONFIG_CONSOLE_MUX=y

From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
So far the MBUS priority setup was done by writing "magic" values taken from a DRAM controller register dump after a boot0 run. By peeking at the Linux (sic!) MBUS driver [1] from the Allwinner BSP kernel, we learned more about the actual meaning of those bits. Add macros and refactor the setup function to make the MBUS setup much more readable and meaningful. The actual values used now are a transformation of the values used before, which are assembled by the new code to result in the same register writes. So this rework does not change any settings, and the code size stays the same.
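The claim that the macro arguments reassemble into the old register values can be checked on a host machine. The following is a minimal sketch, not U-Boot code: mbus_pack() and struct mbus_cfg are hypothetical names, and only the bit layout is copied from mbus_configure_port() in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

enum { QOS_LOWEST = 0, QOS_LOW, QOS_HIGH, QOS_HIGHEST };

/* The two words written to mcr[port][0] and mcr[port][1]. */
struct mbus_cfg {
	uint32_t cfg0;
	uint32_t cfg1;
};

/* Pack the per-port settings using the bit layout from the patch. */
struct mbus_cfg mbus_pack(bool bwlimit, bool priority, uint8_t qos,
			  uint8_t waittime, uint8_t acs,
			  uint16_t bwl0, uint16_t bwl1, uint16_t bwl2)
{
	struct mbus_cfg c;

	c.cfg0 = (bwlimit ? (1u << 0) : 0) |
		 (priority ? (1u << 1) : 0) |
		 ((qos & 0x3u) << 2) |
		 ((waittime & 0xfu) << 4) |
		 ((uint32_t)acs << 8) |
		 ((uint32_t)bwl0 << 16);
	c.cfg1 = ((uint32_t)bwl2 << 16) | bwl1;
	return c;
}
```

For the CPU port, mbus_pack(true, false, QOS_HIGHEST, 0, 0, 512, 256, 128) yields 0x0200000d and 0x00800100, matching the removed writes to mcr[0]. One entry worth double-checking this way is DE: with bwl1 = 6120 as listed in the new table, the sketch yields cfg1 = 0x040017e8, whereas the removed write was 0x04001800, which corresponds to bwl1 = 6144.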
The respective source files in the BSP kernel had a proper GPL header, so lifting this code and information into U-Boot is legal.
[Andre: provide a convenience macro to fit definitions on one line]
[1] https://github.com/longsleep/linux-pine64/blob/lichee-dev-v3.10.65/drivers/b...
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 88 +++++++++++++++++++++++++++----------
 1 file changed, 64 insertions(+), 24 deletions(-)
diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c
index b08b8e6..8925446 100644
--- a/arch/arm/mach-sunxi/dram_sun8i_h3.c
+++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c
@@ -94,6 +94,58 @@ static void mctl_dq_delay(u32 read, u32 write)
 	udelay(1);
 }
 
+enum {
+	MBUS_PORT_CPU = 0,
+	MBUS_PORT_GPU = 1,
+	MBUS_PORT_UNUSED = 2,
+	MBUS_PORT_DMA = 3,
+	MBUS_PORT_VE = 4,
+	MBUS_PORT_CSI = 5,
+	MBUS_PORT_NAND = 6,
+	MBUS_PORT_SS = 7,
+	MBUS_PORT_TS = 8,
+	MBUS_PORT_DI = 9,
+	MBUS_PORT_DE = 10,
+	MBUS_PORT_DE_CFD = 11,
+};
+
+enum {
+	MBUS_QOS_LOWEST = 0,
+	MBUS_QOS_LOW,
+	MBUS_QOS_HIGH,
+	MBUS_QOS_HIGHEST
+};
+
+inline void mbus_configure_port(u8 port,
+				bool bwlimit,
+				bool priority,
+				u8 qos, /* MBUS_QOS_LOWEST .. MBUS_QOS_HIGHEST */
+				u8 waittime, /* 0 .. 0xf */
+				u8 acs, /* 0 .. 0xff */
+				u16 bwl0, /* 0 .. 0xffff, bandwidth limit in MB/s */
+				u16 bwl1,
+				u16 bwl2)
+{
+	struct sunxi_mctl_com_reg * const mctl_com =
+			(struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE;
+
+	const u32 cfg0 = ( (bwlimit ? (1 << 0) : 0)
+			 | (priority ? (1 << 1) : 0)
+			 | ((qos & 0x3) << 2)
+			 | ((waittime & 0xf) << 4)
+			 | ((acs & 0xff) << 8)
+			 | (bwl0 << 16) );
+	const u32 cfg1 = ((u32)bwl2 << 16) | (bwl1 & 0xffff);
+
+	debug("MBUS port %d cfg0 %08x cfg1 %08x\n", port, cfg0, cfg1);
+	writel(cfg0, &mctl_com->mcr[port][0]);
+	writel(cfg1, &mctl_com->mcr[port][1]);
+}
+
+#define MBUS_CONF(port, bwlimit, qos, acs, bwl0, bwl1, bwl2) \
+	mbus_configure_port(MBUS_PORT_ ## port, bwlimit, false, \
+			    MBUS_QOS_ ## qos, 0, acs, bwl0, bwl1, bwl2)
+
 static void mctl_set_master_priority(void)
 {
 	struct sunxi_mctl_com_reg * const mctl_com =
@@ -105,30 +157,18 @@ static void mctl_set_master_priority(void)
 	/* set cpu high priority */
 	writel(0x00000001, &mctl_com->mapr);
 
-	writel(0x0200000d, &mctl_com->mcr[0][0]);
-	writel(0x00800100, &mctl_com->mcr[0][1]);
-	writel(0x06000009, &mctl_com->mcr[1][0]);
-	writel(0x01000400, &mctl_com->mcr[1][1]);
-	writel(0x0200000d, &mctl_com->mcr[2][0]);
-	writel(0x00600100, &mctl_com->mcr[2][1]);
-	writel(0x0100000d, &mctl_com->mcr[3][0]);
-	writel(0x00200080, &mctl_com->mcr[3][1]);
-	writel(0x07000009, &mctl_com->mcr[4][0]);
-	writel(0x01000640, &mctl_com->mcr[4][1]);
-	writel(0x0100000d, &mctl_com->mcr[5][0]);
-	writel(0x00200080, &mctl_com->mcr[5][1]);
-	writel(0x01000009, &mctl_com->mcr[6][0]);
-	writel(0x00400080, &mctl_com->mcr[6][1]);
-	writel(0x0100000d, &mctl_com->mcr[7][0]);
-	writel(0x00400080, &mctl_com->mcr[7][1]);
-	writel(0x0100000d, &mctl_com->mcr[8][0]);
-	writel(0x00400080, &mctl_com->mcr[8][1]);
-	writel(0x04000009, &mctl_com->mcr[9][0]);
-	writel(0x00400100, &mctl_com->mcr[9][1]);
-	writel(0x2000030d, &mctl_com->mcr[10][0]);
-	writel(0x04001800, &mctl_com->mcr[10][1]);
-	writel(0x04000009, &mctl_com->mcr[11][0]);
-	writel(0x00400120, &mctl_com->mcr[11][1]);
+	MBUS_CONF(   CPU, true, HIGHEST, 0,  512,  256,  128);
+	MBUS_CONF(   GPU, true,    HIGH, 0, 1536, 1024,  256);
+	MBUS_CONF(UNUSED, true, HIGHEST, 0,  512,  256,   96);
+	MBUS_CONF(   DMA, true, HIGHEST, 0,  256,  128,   32);
+	MBUS_CONF(    VE, true,    HIGH, 0, 1792, 1600,  256);
+	MBUS_CONF(   CSI, true, HIGHEST, 0,  256,  128,   32);
+	MBUS_CONF(  NAND, true,    HIGH, 0,  256,  128,   64);
+	MBUS_CONF(    SS, true, HIGHEST, 0,  256,  128,   64);
+	MBUS_CONF(    TS, true, HIGHEST, 0,  256,  128,   64);
+	MBUS_CONF(    DI, true,    HIGH, 0, 1024,  256,   64);
+	MBUS_CONF(    DE, true, HIGHEST, 3, 8192, 6120, 1024);
+	MBUS_CONF(DE_CFD, true,    HIGH, 0, 1024,  288,   64);
 }
 
 static void mctl_set_timing_params(struct dram_para *para)

From: Jens Kuske <jenskuske@gmail.com>
The IOCR registers got renamed to BDLR to match the public documentation of similar controllers.
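Since the rename also reshuffles the reserved padding in struct sunxi_mctl_ctl_reg, a quick host-side layout check can catch offset slips. This is only a sketch: the struct below is a hypothetical excerpt that starts at register offset 0x200 and mirrors the new member list from this patch, with u32/u8 replaced by the stdint types.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical excerpt: the register block from offset 0x200 onwards. */
struct phy_regs_from_0x200 {
	uint32_t acmdlr;		/* 0x200 */
	uint32_t aclcdlr;		/* 0x204 */
	uint32_t aciocr;		/* 0x208 */
	uint8_t  res9[0x4];		/* 0x20c */
	uint32_t acbdlr[31];		/* 0x210 */
	uint8_t  res10[0x74];		/* 0x28c */
	struct {
		uint32_t mdlr;		/* 0x00 */
		uint32_t lcdlr[3];	/* 0x04 */
		uint32_t bdlr[12];	/* 0x10 */
		uint32_t gtr;		/* 0x40 */
		uint32_t gcr;		/* 0x44 */
		uint32_t gsr[3];	/* 0x48 */
		uint8_t  res0[0x2c];	/* 0x54 */
	} dx[4];			/* 0x300 */
	uint8_t  res11[0x388];		/* 0x500 */
	uint32_t upd2;			/* 0x888 */
};

/* Absolute register offset of a member of the excerpt. */
#define REG_OFF(member) (0x200 + offsetof(struct phy_regs_from_0x200, member))
/* Size of one DATX8 module block. */
#define DX_STRIDE sizeof(((struct phy_regs_from_0x200 *)0)->dx[0])

_Static_assert(REG_OFF(acbdlr) == 0x210, "acbdlr offset");
_Static_assert(REG_OFF(dx) == 0x300, "dx offset");
_Static_assert(DX_STRIDE == 0x80, "dx stride");
_Static_assert(REG_OFF(upd2) == 0x888, "upd2 offset");
```

If the padding sizes in the header are edited again later, a check along these lines fails at compile time instead of silently shifting every register that follows.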
Signed-off-by: Jens Kuske <jenskuske@gmail.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 43 ++++++++++++++-----------
 arch/arm/mach-sunxi/dram_sun8i_h3.c             | 34 +++++++++----------
 2 files changed, 41 insertions(+), 36 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index d0f2b8a..346538c 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -106,20 +106,23 @@ struct sunxi_mctl_ctl_reg { u32 perfhpr[2]; /* 0x1c4 */ u32 perflpr[2]; /* 0x1cc */ u32 perfwr[2]; /* 0x1d4 */ - u8 res8[0x2c]; /* 0x1dc */ - u32 aciocr; /* 0x208 */ - u8 res9[0xf4]; /* 0x20c */ + u8 res8[0x24]; /* 0x1dc */ + u32 acmdlr; /* 0x200 AC master delay line register */ + u32 aclcdlr; /* 0x204 AC local calibrated delay line register */ + u32 aciocr; /* 0x208 AC I/O configuration register */ + u8 res9[0x4]; /* 0x20c */ + u32 acbdlr[31]; /* 0x210 AC bit delay line registers */ + u8 res10[0x74]; /* 0x28c */ struct { /* 0x300 DATX8 modules*/ - u32 mdlr; /* 0x00 */ - u32 lcdlr[3]; /* 0x04 */ - u32 iocr[11]; /* 0x10 IO configuration register */ - u32 bdlr6; /* 0x3c */ - u32 gtr; /* 0x40 */ - u32 gcr; /* 0x44 */ - u32 gsr[3]; /* 0x48 */ + u32 mdlr; /* 0x00 master delay line register */ + u32 lcdlr[3]; /* 0x04 local calibrated delay line registers */ + u32 bdlr[12]; /* 0x10 bit delay line registers */ + u32 gtr; /* 0x40 general timing register */ + u32 gcr; /* 0x44 general configuration register */ + u32 gsr[3]; /* 0x48 general status registers */ u8 res0[0x2c]; /* 0x54 */ - } datx[4]; - u8 res10[0x388]; /* 0x500 */ + } dx[4]; + u8 res11[0x388]; /* 0x500 */ u32 upd2; /* 0x888 */ };
@@ -172,14 +175,16 @@ struct sunxi_mctl_ctl_reg {
#define PGSR_INIT_DONE (0x1 << 0) /* PHY init done */
-#define ZQCR_PWRDOWN (0x1 << 31) /* ZQ power down */ +#define ZQCR_PWRDOWN (1U << 31) /* ZQ power down */
-#define DATX_IOCR_DQ(x) (x) /* DQ0-7 IOCR index */ -#define DATX_IOCR_DM (8) /* DM IOCR index */ -#define DATX_IOCR_DQS (9) /* DQS IOCR index */ -#define DATX_IOCR_DQSN (10) /* DQSN IOCR index */ +#define ACBDLR_WRITE_DELAY(x) ((x) << 8)
-#define DATX_IOCR_WRITE_DELAY(x) ((x) << 8) -#define DATX_IOCR_READ_DELAY(x) ((x) << 0) +#define DXBDLR_DQ(x) (x) /* DQ0-7 BDLR index */ +#define DXBDLR_DM 8 /* DM BDLR index */ +#define DXBDLR_DQS 9 /* DQS BDLR index */ +#define DXBDLR_DQSN 10 /* DQSN BDLR index */ + +#define DXBDLR_WRITE_DELAY(x) ((x) << 8) +#define DXBDLR_READ_DELAY(x) ((x) << 0)
#endif /* _SUNXI_DRAM_SUN8I_H3_H */ diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 8925446..539268f 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -72,21 +72,21 @@ static void mctl_dq_delay(u32 read, u32 write) u32 val;
for (i = 0; i < 4; i++) { - val = DATX_IOCR_WRITE_DELAY((write >> (i * 4)) & 0xf) | - DATX_IOCR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2); + val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) | + DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
- for (j = DATX_IOCR_DQ(0); j <= DATX_IOCR_DM; j++) - writel(val, &mctl_ctl->datx[i].iocr[j]); + for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++) + writel(val, &mctl_ctl->dx[i].bdlr[j]); }
clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
for (i = 0; i < 4; i++) { - val = DATX_IOCR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) | - DATX_IOCR_READ_DELAY((read >> (16 + i * 4)) & 0xf); + val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) | + DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
- writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQS]); - writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQSN]); + writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]); + writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]); }
setbits_le32(&mctl_ctl->pgcr[0], 1 << 26); @@ -384,7 +384,7 @@ static int mctl_channel_init(struct dram_para *para)
/* set dramc odt */ for (i = 0; i < 4; i++) - clrsetbits_le32(&mctl_ctl->datx[i].gcr, (0x3 << 4) | + clrsetbits_le32(&mctl_ctl->dx[i].gcr, (0x3 << 4) | (0x1 << 1) | (0x3 << 2) | (0x3 << 12) | (0x3 << 14), IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x2); @@ -404,8 +404,8 @@ static int mctl_channel_init(struct dram_para *para)
/* set half DQ */ if (para->bus_width != 32) { - writel(0x0, &mctl_ctl->datx[2].gcr); - writel(0x0, &mctl_ctl->datx[3].gcr); + writel(0x0, &mctl_ctl->dx[2].gcr); + writel(0x0, &mctl_ctl->dx[3].gcr); }
/* data training configuration */ @@ -426,17 +426,17 @@ static int mctl_channel_init(struct dram_para *para) /* detect ranks and bus width */ if (readl(&mctl_ctl->pgsr[0]) & (0xfe << 20)) { /* only one rank */ - if (((readl(&mctl_ctl->datx[0].gsr[0]) >> 24) & 0x2) || - ((readl(&mctl_ctl->datx[1].gsr[0]) >> 24) & 0x2)) { + if (((readl(&mctl_ctl->dx[0].gsr[0]) >> 24) & 0x2) || + ((readl(&mctl_ctl->dx[1].gsr[0]) >> 24) & 0x2)) { clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, 0x1 << 24); para->dual_rank = 0; }
/* only half DQ width */ - if (((readl(&mctl_ctl->datx[2].gsr[0]) >> 24) & 0x1) || - ((readl(&mctl_ctl->datx[3].gsr[0]) >> 24) & 0x1)) { - writel(0x0, &mctl_ctl->datx[2].gcr); - writel(0x0, &mctl_ctl->datx[3].gcr); + if (((readl(&mctl_ctl->dx[2].gsr[0]) >> 24) & 0x1) || + ((readl(&mctl_ctl->dx[3].gsr[0]) >> 24) & 0x1)) { + writel(0x0, &mctl_ctl->dx[2].gcr); + writel(0x0, &mctl_ctl->dx[3].gcr); para->bus_width = 16; }

From: Jens Kuske <jenskuske@gmail.com>
So far the DRAM driver for the H3 SoC (and apparently boot0/libdram as well) only applied coarse delay line settings, with one delay value for all the data lines in each byte lane and one value for the control lines.
Instead of setting the delays for whole bytes only, allow setting them for each individual bit. Also add support for address/command lane delays.
For the purpose of this patch the rules for the existing coarse settings were just applied to the new scheme, so the actual register writes don't change for the H3. Other SoCs will make proper use of this feature later.
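To make the "rules for the existing coarse settings" concrete, here is a minimal sketch that expands the old packed parameters into per-line tables, following the nibble layout of the removed mctl_dq_delay(). The function name expand_coarse_delays() is made up for this example; the macro names follow the patch, with the lane and line counts written out as plain numbers.

```c
#include <stdint.h>

#define NR_OF_BYTE_LANES	4
#define LINES_PER_BYTE_LANE	11	/* DQ0-7, DM, DQS, DQSN */

/*
 * Expand the packed 32-bit read/write delay words into per-line tables.
 * Low 16 bits: one nibble per byte lane for DQ0-7 and DM (the read value
 * is doubled). High 16 bits: one nibble per byte lane for DQS and DQSN.
 */
void expand_coarse_delays(uint32_t read, uint32_t write,
			  uint8_t rd[NR_OF_BYTE_LANES][LINES_PER_BYTE_LANE],
			  uint8_t wr[NR_OF_BYTE_LANES][LINES_PER_BYTE_LANE])
{
	for (int i = 0; i < NR_OF_BYTE_LANES; i++) {
		for (int j = 0; j <= 8; j++) {
			rd[i][j] = ((read >> (i * 4)) & 0xf) * 2;
			wr[i][j] = (write >> (i * 4)) & 0xf;
		}
		for (int j = 9; j <= 10; j++) {
			rd[i][j] = (read >> (16 + i * 4)) & 0xf;
			wr[i][j] = (write >> (16 + i * 4)) & 0xf;
		}
	}
}
```

Feeding in the previous H3 values (read 0x00007979, write 0x6aaa0000) gives read delays of 18/14/18/14 on the data lines of the four lanes and write delays of 10/10/10/6 on DQS and DQSN, which is what the SUN8I_H3_DX_READ_DELAYS and SUN8I_H3_DX_WRITE_DELAYS tables in this patch spell out.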
With a stock GCC 5.3.0 this increases the dram_sun8i_h3.o code size from 2296 to 2344 Bytes.
[Andre: move delay parameters into macros to ease later sharing, use defines for numbers of delay registers, extend commit message]
Signed-off-by: Jens Kuske <jenskuske@gmail.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 77 ++++++++++++++++++++++++-------------
 1 file changed, 50 insertions(+), 27 deletions(-)
diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c
index 539268f..4396754 100644
--- a/arch/arm/mach-sunxi/dram_sun8i_h3.c
+++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c
@@ -15,13 +15,24 @@
 #include <asm/arch/dram.h>
 #include <linux/kconfig.h>
 
+/*
+ * The delay parameters below allow to allegedly specify delay times of some
+ * unknown unit for each individual bit trace in each of the four data bytes
+ * the 32-bit wide access consists of. Also three control signals can be
+ * adjusted individually.
+ */
+#define BITS_PER_BYTE		8
+#define NR_OF_BYTE_LANES	(32 / BITS_PER_BYTE)
+/* The eight data lines (DQn) plus DM, DQS and DQSN */
+#define LINES_PER_BYTE_LANE	(BITS_PER_BYTE + 3)
 struct dram_para {
-	u32 read_delays;
-	u32 write_delays;
 	u16 page_size;
 	u8 bus_width;
 	u8 dual_rank;
 	u8 row_bits;
+	const u8 dx_read_delays[NR_OF_BYTE_LANES][LINES_PER_BYTE_LANE];
+	const u8 dx_write_delays[NR_OF_BYTE_LANES][LINES_PER_BYTE_LANE];
+	const u8 ac_delays[31];
 };
 
 static inline int ns_to_t(int nanoseconds)
@@ -64,34 +75,25 @@ static void mctl_phy_init(u32 val)
 	mctl_await_completion(&mctl_ctl->pgsr[0], PGSR_INIT_DONE, 0x1);
 }
 
-static void mctl_dq_delay(u32 read, u32 write)
+static void mctl_set_bit_delays(struct dram_para *para)
 {
 	struct sunxi_mctl_ctl_reg * const mctl_ctl =
 			(struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
 	int i, j;
-	u32 val;
-
-	for (i = 0; i < 4; i++) {
-		val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
-		      DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
-
-		for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++)
-			writel(val, &mctl_ctl->dx[i].bdlr[j]);
-	}
 
 	clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
 
-	for (i = 0; i < 4; i++) {
-		val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
-		      DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
+	for (i = 0; i < NR_OF_BYTE_LANES; i++)
+		for (j = 0; j < LINES_PER_BYTE_LANE; j++)
+			writel(DXBDLR_WRITE_DELAY(para->dx_write_delays[i][j]) |
+			       DXBDLR_READ_DELAY(para->dx_read_delays[i][j]),
+			       &mctl_ctl->dx[i].bdlr[j]);
 
-		writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]);
-		writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]);
-	}
+	for (i = 0; i < 31; i++)
+		writel(ACBDLR_WRITE_DELAY(para->ac_delays[i]),
+		       &mctl_ctl->acbdlr[i]);
 
 	setbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
-
-	udelay(1);
 }
 
 enum {
@@ -412,11 +414,8 @@ static int mctl_channel_init(struct dram_para *para)
 	clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24,
 			(para->dual_rank ? 0x3 : 0x1) << 24);
 
-
-	if (para->read_delays || para->write_delays) {
-		mctl_dq_delay(para->read_delays, para->write_delays);
-		udelay(50);
-	}
+	mctl_set_bit_delays(para);
+	udelay(50);
 
 	mctl_zq_calibration(para);
 
@@ -490,6 +489,29 @@ static void mctl_auto_detect_dram_size(struct dram_para *para)
 			break;
 }
 
+/*
+ * The actual values used here are taken from Allwinner provided boot0
+ * binaries, though they are probably board specific, so would likely benefit
+ * from individual tuning for each board. Apparently a lot of boards copy from
+ * some Allwinner reference design, so we go with those generic values for now
+ * in the hope that they are reasonable for most (all?) boards.
+ */
+#define SUN8I_H3_DX_READ_DELAYS \
+	{{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, \
+	 { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }, \
+	 { 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, \
+	 { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }}
+#define SUN8I_H3_DX_WRITE_DELAYS \
+	{{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 }, \
+	 { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 }, \
+	 { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 }, \
+	 { 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6 }}
+#define SUN8I_H3_AC_DELAYS \
+	{ 0, 0, 0, 0, 0, 0, 0, 0, \
+	  0, 0, 0, 0, 0, 0, 0, 0, \
+	  0, 0, 0, 0, 0, 0, 0, 0, \
+	  0, 0, 0, 0, 0, 0, 0 }
+
 unsigned long sunxi_dram_init(void)
 {
 	struct sunxi_mctl_com_reg * const mctl_com =
@@ -498,12 +520,13 @@ unsigned long sunxi_dram_init(void)
 		(struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
 
 	struct dram_para para = {
-		.read_delays = 0x00007979,	/* dram_tpr12 */
-		.write_delays = 0x6aaa0000,	/* dram_tpr11 */
 		.dual_rank = 0,
 		.bus_width = 32,
 		.row_bits = 15,
 		.page_size = 4096,
+		.dx_read_delays = SUN8I_H3_DX_READ_DELAYS,
+		.dx_write_delays = SUN8I_H3_DX_WRITE_DELAYS,
+		.ac_delays = SUN8I_H3_AC_DELAYS,
 	};
 
 	mctl_sys_init(&para);

From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/mach-sunxi/clock_sun6i.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c
index 80cfc0b..8e39bbe 100644
--- a/arch/arm/mach-sunxi/clock_sun6i.c
+++ b/arch/arm/mach-sunxi/clock_sun6i.c
@@ -224,7 +224,7 @@ void clock_set_pll11(unsigned int clk, bool sigma_delta_enable)
 		(struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
 
 	if (sigma_delta_enable)
-		writel(CCM_PLL11_PATTERN, &ccm->pll5_pattern_cfg);
+		writel(CCM_PLL11_PATTERN, &ccm->pll11_pattern_cfg0);
 
 	writel(CCM_PLL11_CTRL_EN | CCM_PLL11_CTRL_UPD |
 	       (sigma_delta_enable ? CCM_PLL11_CTRL_SIGMA_DELTA_EN : 0) |

From: Jens Kuske <jenskuske@gmail.com>
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes. This refactoring does not change the code size for the existing H3 part.
[Andre: rework from #ifdefs to using socid parameters in static functions, minor fixes, merging in fixes from Jens]
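The socid approach can be sketched in isolation as follows. This is an illustration only, under the assumption that the compiler inlines the helper and folds the switch once socid is a compile-time constant. The helper names pgcr3_value() and pgcr3_for_this_build() and the BUILD_SOCID macro are made up; the SOCID_* constants and the two PGCR3 values are the ones used in this patch. Unlike the patch, the sketch falls back to the H3 when no CONFIG symbol is defined, so that it builds stand-alone.

```c
#include <stdint.h>

#define SOCID_A64	0x1689
#define SOCID_H3	0x1680

/* PGCR3 value written at the end of channel init, selected per SoC. */
static inline uint32_t pgcr3_value(uint16_t socid)
{
	switch (socid) {
	case SOCID_H3:
		return 0x00aa0060;
	case SOCID_A64:
		return 0xc0aa0060;
	}
	return 0;	/* unknown SoC: hypothetical fallback for this sketch */
}

/* Stand-in for the build-time selection done in sunxi_dram_init(). */
#if defined(CONFIG_MACH_SUN50I)
#define BUILD_SOCID	SOCID_A64
#else
#define BUILD_SOCID	SOCID_H3	/* sketch-only default */
#endif

uint32_t pgcr3_for_this_build(void)
{
	/*
	 * BUILD_SOCID is a constant, so an optimizing compiler can reduce
	 * the switch to a single value and drop the other branch.
	 */
	return pgcr3_value(BUILD_SOCID);
}
```

Comparing the disassembly of pgcr3_for_this_build() at -O2 for both settings is one way to confirm that only a single constant remains, which is the effect the commit message relies on for keeping the H3 code size unchanged.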
Signed-off-by: Jens Kuske <jenskuske@gmail.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
---
 arch/arm/include/asm/arch-sunxi/clock_sun6i.h   |   1 +
 arch/arm/include/asm/arch-sunxi/cpu.h           |   3 +
 arch/arm/include/asm/arch-sunxi/dram.h          |   2 +-
 arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h |  10 +-
 arch/arm/mach-sunxi/Makefile                    |   1 +
 arch/arm/mach-sunxi/clock_sun6i.c               |   2 +-
 arch/arm/mach-sunxi/dram_sun8i_h3.c             | 211 ++++++++++++++++++------
 7 files changed, 174 insertions(+), 56 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h index be9fcfd..3f87672 100644 --- a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h +++ b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h @@ -322,6 +322,7 @@ struct sunxi_ccm_reg { #define CCM_DRAMCLK_CFG_DIV0_MASK (0xf << 8) #define CCM_DRAMCLK_CFG_SRC_PLL5 (0x0 << 20) #define CCM_DRAMCLK_CFG_SRC_PLL6x2 (0x1 << 20) +#define CCM_DRAMCLK_CFG_SRC_PLL11 (0x1 << 20) /* A64 only */ #define CCM_DRAMCLK_CFG_SRC_MASK (0x3 << 20) #define CCM_DRAMCLK_CFG_UPD (0x1 << 16) #define CCM_DRAMCLK_CFG_RST (0x1 << 31) diff --git a/arch/arm/include/asm/arch-sunxi/cpu.h b/arch/arm/include/asm/arch-sunxi/cpu.h index 73583ed..6f96a97 100644 --- a/arch/arm/include/asm/arch-sunxi/cpu.h +++ b/arch/arm/include/asm/arch-sunxi/cpu.h @@ -13,4 +13,7 @@ #include <asm/arch/cpu_sun4i.h> #endif
+#define SOCID_A64 0x1689 +#define SOCID_H3 0x1680 + #endif /* _SUNXI_CPU_H */ diff --git a/arch/arm/include/asm/arch-sunxi/dram.h b/arch/arm/include/asm/arch-sunxi/dram.h index e0be744..53e6d47 100644 --- a/arch/arm/include/asm/arch-sunxi/dram.h +++ b/arch/arm/include/asm/arch-sunxi/dram.h @@ -24,7 +24,7 @@ #include <asm/arch/dram_sun8i_a33.h> #elif defined(CONFIG_MACH_SUN8I_A83T) #include <asm/arch/dram_sun8i_a83t.h> -#elif defined(CONFIG_MACH_SUN8I_H3) +#elif defined(CONFIG_MACH_SUN8I_H3) || defined(CONFIG_MACH_SUN50I) #include <asm/arch/dram_sun8i_h3.h> #elif defined(CONFIG_MACH_SUN9I) #include <asm/arch/dram_sun9i.h> diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index 346538c..25d07d9 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -15,7 +15,8 @@
struct sunxi_mctl_com_reg { u32 cr; /* 0x00 control register */ - u8 res0[0xc]; /* 0x04 */ + u8 res0[0x8]; /* 0x04 */ + u32 tmr; /* 0x0c (unused on H3) */ u32 mcr[16][2]; /* 0x10 */ u32 bwcr; /* 0x90 bandwidth control register */ u32 maer; /* 0x94 master enable register */ @@ -32,7 +33,9 @@ struct sunxi_mctl_com_reg { u32 swoffr; /* 0xc4 */ u8 res2[0x8]; /* 0xc8 */ u32 cccr; /* 0xd0 */ - u8 res3[0x72c]; /* 0xd4 */ + u8 res3[0x54]; /* 0xd4 */ + u32 mdfs_bwlr[3]; /* 0x128 (unused on H3) */ + u8 res4[0x6cc]; /* 0x134 */ u32 protect; /* 0x800 */ };
@@ -81,7 +84,8 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */ - u8 res3[0x20]; /* 0x9c */ + u8 res3[0x1c]; /* 0x9c */ + u32 vtfcr; /* 0xb8 (unused on H3) */ u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */ diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile index e73114e..7daba11 100644 --- a/arch/arm/mach-sunxi/Makefile +++ b/arch/arm/mach-sunxi/Makefile @@ -50,4 +50,5 @@ obj-$(CONFIG_MACH_SUN8I_A33) += dram_sun8i_a33.o obj-$(CONFIG_MACH_SUN8I_A83T) += dram_sun8i_a83t.o obj-$(CONFIG_MACH_SUN8I_H3) += dram_sun8i_h3.o obj-$(CONFIG_MACH_SUN9I) += dram_sun9i.o +obj-$(CONFIG_MACH_SUN50I) += dram_sun8i_h3.o endif diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index 8e39bbe..d123b3a 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -217,7 +217,7 @@ done: } #endif
-#ifdef CONFIG_MACH_SUN8I_A33 +#if defined(CONFIG_MACH_SUN8I_A33) || defined(CONFIG_MACH_SUN50I) void clock_set_pll11(unsigned int clk, bool sigma_delta_enable) { struct sunxi_ccm_reg * const ccm = diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 4396754..fe9cf9a 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -13,6 +13,7 @@ #include <asm/io.h> #include <asm/arch/clock.h> #include <asm/arch/dram.h> +#include <asm/arch/cpu.h> #include <linux/kconfig.h>
/* @@ -42,30 +43,6 @@ static inline int ns_to_t(int nanoseconds) return DIV_ROUND_UP(ctrl_freq * nanoseconds, 1000); }
-static u32 bin_to_mgray(int val) -{ - static const u8 lookup_table[32] = { - 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, - 0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09, - 0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d, - 0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11, - }; - - return lookup_table[clamp(val, 0, 31)]; -} - -static int mgray_to_bin(u32 val) -{ - static const u8 lookup_table[32] = { - 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, - 0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b, - 0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b, - 0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15, - }; - - return lookup_table[val & 0x1f]; -} - static void mctl_phy_init(u32 val) { struct sunxi_mctl_ctl_reg * const mctl_ctl = @@ -148,13 +125,13 @@ inline void mbus_configure_port(u8 port, mbus_configure_port(MBUS_PORT_ ## port, bwlimit, false, \ MBUS_QOS_ ## qos, 0, acs, bwl0, bwl1, bwl2)
-static void mctl_set_master_priority(void) +static void mctl_set_master_priority_h3(void) { struct sunxi_mctl_com_reg * const mctl_com = (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE;
/* enable bandwidth limit windows and set windows size 1us */ - writel(0x00010190, &mctl_com->bwcr); + writel((1 << 16) | (400 << 0), &mctl_com->bwcr);
/* set cpu high priority */ writel(0x00000001, &mctl_com->mapr); @@ -173,7 +150,46 @@ static void mctl_set_master_priority(void) MBUS_CONF(DE_CFD, true, HIGH, 0, 1024, 288, 64); }
-static void mctl_set_timing_params(struct dram_para *para) +static void mctl_set_master_priority_a64(void) +{ + struct sunxi_mctl_com_reg * const mctl_com = + (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE; + + /* enable bandwidth limit windows and set windows size 1us */ + writel(399, &mctl_com->tmr); + writel((1 << 16), &mctl_com->bwcr); + + /* Port 2 is reserved per Allwinner's linux-3.10 source, yet they + * initialise it */ + MBUS_CONF( CPU, true, HIGHEST, 0, 160, 100, 80); + MBUS_CONF( GPU, false, HIGH, 0, 1536, 1400, 256); + MBUS_CONF(UNUSED, true, HIGHEST, 0, 512, 256, 96); + MBUS_CONF( DMA, true, HIGH, 0, 256, 80, 100); + MBUS_CONF( VE, true, HIGH, 0, 1792, 1600, 256); + MBUS_CONF( CSI, true, HIGH, 0, 256, 128, 0); + MBUS_CONF( NAND, true, HIGH, 0, 256, 128, 64); + MBUS_CONF( SS, true, HIGHEST, 0, 256, 128, 64); + MBUS_CONF( TS, true, HIGHEST, 0, 256, 128, 64); + MBUS_CONF( DI, true, HIGH, 0, 1024, 256, 64); + MBUS_CONF( DE, true, HIGH, 2, 8192, 6144, 2048); + MBUS_CONF(DE_CFD, true, HIGH, 0, 1280, 144, 64); + + writel(0x81000004, &mctl_com->mdfs_bwlr[2]); +} + +static void mctl_set_master_priority(uint16_t socid) +{ + switch (socid) { + case SOCID_H3: + mctl_set_master_priority_h3(); + return; + case SOCID_A64: + mctl_set_master_priority_a64(); + return; + } +} + +static void mctl_set_timing_params(uint16_t socid, struct dram_para *para) { struct sunxi_mctl_ctl_reg * const mctl_ctl = (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; @@ -254,7 +270,31 @@ static void mctl_set_timing_params(struct dram_para *para) writel(RFSHTMG_TREFI(trefi) | RFSHTMG_TRFC(trfc), &mctl_ctl->rfshtmg); }
-static void mctl_zq_calibration(struct dram_para *para) +static u32 bin_to_mgray(int val) +{ + static const u8 lookup_table[32] = { + 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, + 0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09, + 0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d, + 0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11, + }; + + return lookup_table[clamp(val, 0, 31)]; +} + +static int mgray_to_bin(u32 val) +{ + static const u8 lookup_table[32] = { + 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, + 0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b, + 0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b, + 0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15, + }; + + return lookup_table[val & 0x1f]; +} + +static void mctl_h3_zq_calibration_quirk(struct dram_para *para) { struct sunxi_mctl_ctl_reg * const mctl_ctl = (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; @@ -324,7 +364,7 @@ static void mctl_set_cr(struct dram_para *para) MCTL_CR_ROW_BITS(para->row_bits), &mctl_com->cr); }
-static void mctl_sys_init(struct dram_para *para) +static void mctl_sys_init(uint16_t socid, struct dram_para *para) { struct sunxi_ccm_reg * const ccm = (struct sunxi_ccm_reg *)SUNXI_CCM_BASE; @@ -336,16 +376,30 @@ static void mctl_sys_init(struct dram_para *para) clrbits_le32(&ccm->ahb_gate0, 1 << AHB_GATE_OFFSET_MCTL); clrbits_le32(&ccm->ahb_reset0_cfg, 1 << AHB_RESET_OFFSET_MCTL); clrbits_le32(&ccm->pll5_cfg, CCM_PLL5_CTRL_EN); + if (socid == SOCID_A64) + clrbits_le32(&ccm->pll11_cfg, CCM_PLL11_CTRL_EN); udelay(10);
clrbits_le32(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_RST); udelay(1000);
- clock_set_pll5(CONFIG_DRAM_CLK * 2 * 1000000, false); - clrsetbits_le32(&ccm->dram_clk_cfg, - CCM_DRAMCLK_CFG_DIV_MASK | CCM_DRAMCLK_CFG_SRC_MASK, - CCM_DRAMCLK_CFG_DIV(1) | CCM_DRAMCLK_CFG_SRC_PLL5 | - CCM_DRAMCLK_CFG_UPD); + if (socid == SOCID_A64) { + clock_set_pll11(CONFIG_DRAM_CLK * 2 * 1000000, false); + clrsetbits_le32(&ccm->dram_clk_cfg, + CCM_DRAMCLK_CFG_DIV_MASK | + CCM_DRAMCLK_CFG_SRC_MASK, + CCM_DRAMCLK_CFG_DIV(1) | + CCM_DRAMCLK_CFG_SRC_PLL11 | + CCM_DRAMCLK_CFG_UPD); + } else if (socid == SOCID_H3) { + clock_set_pll5(CONFIG_DRAM_CLK * 2 * 1000000, false); + clrsetbits_le32(&ccm->dram_clk_cfg, + CCM_DRAMCLK_CFG_DIV_MASK | + CCM_DRAMCLK_CFG_SRC_MASK, + CCM_DRAMCLK_CFG_DIV(1) | + CCM_DRAMCLK_CFG_SRC_PLL5 | + CCM_DRAMCLK_CFG_UPD); + } mctl_await_completion(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_UPD, 0);
setbits_le32(&ccm->ahb_reset0_cfg, 1 << AHB_RESET_OFFSET_MCTL); @@ -360,7 +414,7 @@ static void mctl_sys_init(struct dram_para *para) udelay(500); }
-static int mctl_channel_init(struct dram_para *para) +static int mctl_channel_init(uint16_t socid, struct dram_para *para) { struct sunxi_mctl_com_reg * const mctl_com = (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE; @@ -370,8 +424,8 @@ static int mctl_channel_init(struct dram_para *para) unsigned int i;
mctl_set_cr(para); - mctl_set_timing_params(para); - mctl_set_master_priority(); + mctl_set_timing_params(socid, para); + mctl_set_master_priority(socid);
/* setting VTC, default disable all VT */ clrbits_le32(&mctl_ctl->pgcr[0], (1 << 30) | 0x3f); @@ -397,12 +451,18 @@ static int mctl_channel_init(struct dram_para *para) /* set DQS auto gating PD mode */ setbits_le32(&mctl_ctl->pgcr[2], 0x3 << 6);
- /* dx ddr_clk & hdr_clk dynamic mode */ - clrbits_le32(&mctl_ctl->pgcr[0], (0x3 << 14) | (0x3 << 12)); - - /* dphy & aphy phase select 270 degree */ - clrsetbits_le32(&mctl_ctl->pgcr[2], (0x3 << 10) | (0x3 << 8), - (0x1 << 10) | (0x2 << 8)); + if (socid == SOCID_H3) { + /* dx ddr_clk & hdr_clk dynamic mode */ + clrbits_le32(&mctl_ctl->pgcr[0], (0x3 << 14) | (0x3 << 12)); + + /* dphy & aphy phase select 270 degree */ + clrsetbits_le32(&mctl_ctl->pgcr[2], (0x3 << 10) | (0x3 << 8), + (0x1 << 10) | (0x2 << 8)); + } else if (socid == SOCID_A64) { + /* dphy & aphy phase select ? */ + clrsetbits_le32(&mctl_ctl->pgcr[2], (0x3 << 10) | (0x3 << 8), + (0x0 << 10) | (0x3 << 8)); + }
/* set half DQ */ if (para->bus_width != 32) { @@ -417,10 +477,17 @@ static int mctl_channel_init(struct dram_para *para) mctl_set_bit_delays(para); udelay(50);
- mctl_zq_calibration(para); + if (socid == SOCID_H3) { + mctl_h3_zq_calibration_quirk(para);
- mctl_phy_init(PIR_PLLINIT | PIR_DCAL | PIR_PHYRST | PIR_DRAMRST | - PIR_DRAMINIT | PIR_QSGATE); + mctl_phy_init(PIR_PLLINIT | PIR_DCAL | PIR_PHYRST | + PIR_DRAMRST | PIR_DRAMINIT | PIR_QSGATE); + } else if (socid == SOCID_A64) { + clrsetbits_le32(&mctl_ctl->zqcr, 0xffffff, CONFIG_DRAM_ZQ); + + mctl_phy_init(PIR_ZCAL | PIR_PLLINIT | PIR_DCAL | PIR_PHYRST | + PIR_DRAMRST | PIR_DRAMINIT | PIR_QSGATE); + }
/* detect ranks and bus width */ if (readl(&mctl_ctl->pgsr[0]) & (0xfe << 20)) { @@ -458,7 +525,10 @@ static int mctl_channel_init(struct dram_para *para) udelay(10);
/* set PGCR3, CKE polarity */ - writel(0x00aa0060, &mctl_ctl->pgcr[3]); + if (socid == SOCID_H3) + writel(0x00aa0060, &mctl_ctl->pgcr[3]); + else if (socid == SOCID_A64) + writel(0xc0aa0060, &mctl_ctl->pgcr[3]);
/* power down zq calibration module for power save */ setbits_le32(&mctl_ctl->zqcr, ZQCR_PWRDOWN); @@ -512,6 +582,22 @@ static void mctl_auto_detect_dram_size(struct dram_para *para) 0, 0, 0, 0, 0, 0, 0, 0, \ 0, 0, 0, 0, 0, 0, 0 }
+#define SUN50I_A64_DX_READ_DELAYS \ + {{ 16, 16, 16, 16, 17, 16, 16, 17, 16, 1, 0 }, \ + { 17, 17, 17, 17, 17, 17, 17, 17, 17, 1, 0 }, \ + { 16, 17, 17, 16, 16, 16, 16, 16, 16, 0, 0 }, \ + { 17, 17, 17, 17, 17, 17, 17, 17, 17, 1, 0 }} +#define SUN50I_A64_DX_WRITE_DELAYS \ + {{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 15, 15 }, \ + { 0, 0, 0, 0, 1, 1, 1, 1, 0, 10, 10 }, \ + { 1, 0, 1, 1, 1, 1, 1, 1, 0, 11, 11 }, \ + { 1, 0, 0, 1, 1, 1, 1, 1, 0, 12, 12 }} +#define SUN50I_A64_AC_DELAYS \ + { 5, 5, 13, 10, 2, 5, 3, 3, \ + 0, 3, 3, 3, 1, 0, 0, 0, \ + 3, 4, 0, 3, 4, 1, 4, 0, \ + 1, 1, 0, 1, 13, 5, 4 } + unsigned long sunxi_dram_init(void) { struct sunxi_mctl_com_reg * const mctl_com = @@ -524,13 +610,30 @@ unsigned long sunxi_dram_init(void) .bus_width = 32, .row_bits = 15, .page_size = 4096, + +#if defined(CONFIG_MACH_SUN8I_H3) .dx_read_delays = SUN8I_H3_DX_READ_DELAYS, .dx_write_delays = SUN8I_H3_DX_WRITE_DELAYS, .ac_delays = SUN8I_H3_AC_DELAYS, +#elif defined(CONFIG_MACH_SUN50I) + .dx_read_delays = SUN50I_A64_DX_READ_DELAYS, + .dx_write_delays = SUN50I_A64_DX_WRITE_DELAYS, + .ac_delays = SUN50I_A64_AC_DELAYS, +#endif }; - - mctl_sys_init(&para); - if (mctl_channel_init(&para)) +/* + * Let the compiler optimize alternatives away by passing this value into + * the static functions. This saves us #ifdefs, but still keeps the binary + * small. + */ +#if defined(CONFIG_MACH_SUN8I_H3) + uint16_t socid = SOCID_H3; +#elif defined(CONFIG_MACH_SUN50I) + uint16_t socid = SOCID_A64; +#endif + + mctl_sys_init(socid, &para); + if (mctl_channel_init(socid, &para)) return 0;
if (para.dual_rank) @@ -540,7 +643,13 @@ unsigned long sunxi_dram_init(void) udelay(1);
/* odt delay */ - writel(0x0c000400, &mctl_ctl->odtcfg); + if (socid == SOCID_H3) + writel(0x0c000400, &mctl_ctl->odtcfg); + + if (socid == SOCID_A64) { + setbits_le32(&mctl_ctl->vtfcr, 2 << 8); + clrbits_le32(&mctl_ctl->pgcr[2], (1 << 13)); + }
/* clear credit value */ setbits_le32(&mctl_com->cccr, 1 << 31);
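The comment in the hunk above relies on constant propagation: a static function that receives a compile-time constant socid can have its other branches removed by the compiler. A minimal sketch of that pattern outside U-Boot follows. The helper names pgcr3_value() and pgcr3_for_this_build(), the BUILD_FOR_A64 switch and the numeric SOCID values are assumptions made for illustration; only the two PGCR3 values are taken from the patch.

```c
#include <stdint.h>

/* Assumed IDs for illustration; the driver defines its own SOCID_* values. */
#define SOCID_H3	0x1680
#define SOCID_A64	0x1689

/*
 * Static helper taking the SoC ID as a parameter. When every caller passes
 * a compile-time constant, an optimizing compiler can inline the call and
 * drop the branch that does not apply.
 */
static uint32_t pgcr3_value(uint16_t socid)
{
	if (socid == SOCID_H3)
		return 0x00aa0060;	/* value written for the H3 in the patch */
	else if (socid == SOCID_A64)
		return 0xc0aa0060;	/* value written for the A64 in the patch */

	return 0;
}

/* BUILD_FOR_A64 is a hypothetical stand-in for CONFIG_MACH_SUN50I. */
uint32_t pgcr3_for_this_build(void)
{
#if defined(BUILD_FOR_A64)
	const uint16_t socid = SOCID_A64;
#else
	const uint16_t socid = SOCID_H3;
#endif

	return pgcr3_value(socid);
}
```

Without BUILD_FOR_A64 defined, pgcr3_for_this_build() takes the H3 path, and the A64 branch is dead code that the compiler is free to discard.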

According to Jens disabling the on-die-termination should set bit 5, not bit 1 in the respective register. Fix this.
Reported-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/mach-sunxi/dram_sun8i_h3.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index fe9cf9a..1311eda 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -414,6 +414,11 @@ static void mctl_sys_init(uint16_t socid, struct dram_para *para) udelay(500); }
+/* These are more guessed based on some Allwinner code. */ +#define DX_GCR_ODT_DYNAMIC (0x0 << 4) +#define DX_GCR_ODT_ALWAYS_ON (0x1 << 4) +#define DX_GCR_ODT_OFF (0x2 << 4) + static int mctl_channel_init(uint16_t socid, struct dram_para *para) { struct sunxi_mctl_com_reg * const mctl_com = @@ -443,7 +448,8 @@ static int mctl_channel_init(uint16_t socid, struct dram_para *para) clrsetbits_le32(&mctl_ctl->dx[i].gcr, (0x3 << 4) | (0x1 << 1) | (0x3 << 2) | (0x3 << 12) | (0x3 << 14), - IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x2); + IS_ENABLED(CONFIG_DRAM_ODT_EN) ? + DX_GCR_ODT_DYNAMIC : DX_GCR_ODT_OFF);
/* AC PDR should always ON */ setbits_le32(&mctl_ctl->aciocr, 0x1 << 1);
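To see why the old value missed the ODT field, the register update can be modelled on plain integers. This is only a sketch: clrsetbits32() is a value-only stand-in for clrsetbits_le32(), DX_GCR_ODT_MASK and the two wrapper functions are names invented here, and the field meanings are the guessed ones from the patch comment.

```c
#include <stdint.h>

/* Field values as introduced by the patch (guessed from Allwinner code). */
#define DX_GCR_ODT_DYNAMIC	(0x0u << 4)
#define DX_GCR_ODT_ALWAYS_ON	(0x1u << 4)
#define DX_GCR_ODT_OFF		(0x2u << 4)
/* Hypothetical name for the two-bit ODT field at bits [5:4]. */
#define DX_GCR_ODT_MASK		(0x3u << 4)

/* Value-only model of clrsetbits_le32(): clear bits first, then set bits. */
static uint32_t clrsetbits32(uint32_t val, uint32_t clear, uint32_t set)
{
	return (val & ~clear) | set;
}

/* Old behaviour: 0x2 sets bit 1, the ODT field stays at "dynamic". */
uint32_t gcr_odt_off_old(uint32_t gcr)
{
	return clrsetbits32(gcr, DX_GCR_ODT_MASK | (0x1u << 1), 0x2u);
}

/* Fixed behaviour: the value 2 lands in the ODT field, setting bit 5. */
uint32_t gcr_odt_off_new(uint32_t gcr)
{
	return clrsetbits32(gcr, DX_GCR_ODT_MASK | (0x1u << 1), DX_GCR_ODT_OFF);
}
```

Starting from a register value of 0, the old variant yields 0x02 and the fixed one yields 0x20, which matches the "bit 5, not bit 1" wording of the commit message.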

Fix the output of the DRAM size on AArch64 SPLs.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de Reviewed-by: Simon Glass sjg@chromium.org Acked-by: Maxime Ripard maxime.ripard@free-electrons.com --- arch/arm/mach-sunxi/dram_sun8i_h3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 1311eda..9f7cc7f 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -664,6 +664,6 @@ unsigned long sunxi_dram_init(void) mctl_auto_detect_dram_size(&para); mctl_set_cr(&para);
- return (1 << (para.row_bits + 3)) * para.page_size * + return (1UL << (para.row_bits + 3)) * para.page_size * (para.dual_rank ? 2 : 1); }
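The effect of the 1UL change is easiest to see with a 2 GiB configuration. The sketch below is not U-Boot code: both function names are made up, and dram_size_old_model() merely emulates what typically happens when the product is computed in 32-bit int and then widened to a 64-bit unsigned long (the real overflow is undefined behaviour, so this is a model of common compiler output, not a guarantee). It assumes the struct fields promote to int.

```c
#include <stdint.h>

/* Fixed expression: the shift already happens in a 64-bit type. */
uint64_t dram_size_fixed(unsigned int row_bits, unsigned int page_size,
			 int dual_rank)
{
	return (UINT64_C(1) << (row_bits + 3)) * page_size *
	       (dual_rank ? 2 : 1);
}

/*
 * Model of the old expression on AArch64: the product wraps at 32 bits and
 * is then sign-extended when converted to a 64-bit unsigned long.
 */
uint64_t dram_size_old_model(unsigned int row_bits, unsigned int page_size,
			     int dual_rank)
{
	uint32_t v = (UINT32_C(1) << (row_bits + 3)) * (uint32_t)page_size *
		     (dual_rank ? 2u : 1u);

	if (v & UINT32_C(0x80000000))	/* "negative" int: sign-extend */
		return UINT64_C(0xffffffff00000000) | v;

	return v;
}
```

With row_bits = 15, page_size = 4096 and two ranks, the fixed version returns 0x80000000 (2 GiB), while the model of the old code returns 0xffffffff80000000. Single-rank 1 GiB setups are unaffected, since the product still fits into a positive int.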

Now that the SPL is ready to be compiled in AArch64 and the DRAM init code is ready, enable SPL support for the A64 SoC and in the Pine64 defconfig. For now we keep the boot0 header in the U-Boot proper, as this still allows using boot0 as an SPL replacement without hurting the SPL use case. We disable FEL support for now by making its compilation conditional and disabling it for ARM64, as the code isn't ready yet.
Signed-off-by: Andre Przywara andre.przywara@arm.com Acked-by: Maxime Ripard maxime.ripard@free-electrons.com Reviewed-by: Simon Glass sjg@chromium.org --- arch/arm/mach-sunxi/board.c | 2 +- board/sunxi/Kconfig | 2 ++ configs/pine64_plus_defconfig | 1 + include/configs/sunxi-common.h | 2 ++ 4 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-sunxi/board.c b/arch/arm/mach-sunxi/board.c index aa11493..52be5b0 100644 --- a/arch/arm/mach-sunxi/board.c +++ b/arch/arm/mach-sunxi/board.c @@ -133,7 +133,7 @@ static int gpio_init(void) return 0; }
-#ifdef CONFIG_SPL_BUILD +#if defined(CONFIG_SPL_BOARD_LOAD_IMAGE) && defined(CONFIG_SPL_BUILD) static int spl_board_load_image(struct spl_image_info *spl_image, struct spl_boot_device *bootdev) { diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index c2eb85e..0001133 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -125,6 +125,7 @@ config MACH_SUN50I bool "sun50i (Allwinner A64)" select ARM64 select SUNXI_GEN_SUN6I + select SUPPORT_SPL
endchoice
@@ -196,6 +197,7 @@ config DRAM_ODT_EN bool "sunxi dram odt enable" default n if !MACH_SUN8I_A23 default y if MACH_SUN8I_A23 + default y if MACH_SUN50I ---help--- Select this to enable dram odt (on die termination).
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig index ebc24b8..2374170 100644 --- a/configs/pine64_plus_defconfig +++ b/configs/pine64_plus_defconfig @@ -5,6 +5,7 @@ CONFIG_MACH_SUN50I=y CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus" # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set CONFIG_CONSOLE_MUX=y +CONFIG_SPL=y # CONFIG_CMD_IMLS is not set # CONFIG_CMD_FLASH is not set # CONFIG_CMD_FPGA is not set diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h index e05c318..ab2d33f 100644 --- a/include/configs/sunxi-common.h +++ b/include/configs/sunxi-common.h @@ -183,7 +183,9 @@
#define CONFIG_SPL_FRAMEWORK
+#ifndef CONFIG_ARM64 /* AArch64 FEL support is not ready yet */ #define CONFIG_SPL_BOARD_LOAD_IMAGE +#endif
#if defined(CONFIG_MACH_SUN9I) #define CONFIG_SPL_TEXT_BASE 0x10040 /* sram start+header */

Read the specified "arch" value from a legacy or FIT U-Boot image and store it in our SPL data structure. This allows loaders to take the target architecture into account for custom loading procedures. Having the complete string -> arch mapping for FIT based images in the SPL would be too big, so we leave it up to architectures (or boards) to overwrite the weak function that does the actual translation, possibly covering only the required subset there. Document struct spl_image_info on the way.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Simon Glass sjg@chromium.org Reviewed-by: Tom Rini trini@konsulko.com --- common/spl/spl.c | 1 + common/spl/spl_fit.c | 8 ++++++++ include/spl.h | 15 ++++++++++++++- 3 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/common/spl/spl.c b/common/spl/spl.c index a76ea3a..ef195e0 100644 --- a/common/spl/spl.c +++ b/common/spl/spl.c @@ -114,6 +114,7 @@ int spl_parse_image_header(struct spl_image_info *spl_image, header_size; } spl_image->os = image_get_os(header); + spl_image->arch = image_get_arch(header); spl_image->name = image_get_name(header); debug("spl: payload image: %.*s load addr: 0x%lx size: %d\n", (int)sizeof(spl_image->name), spl_image->name, diff --git a/common/spl/spl_fit.c b/common/spl/spl_fit.c index aae556f..a5d903b 100644 --- a/common/spl/spl_fit.c +++ b/common/spl/spl_fit.c @@ -123,6 +123,11 @@ static int get_aligned_image_size(struct spl_load_info *info, int data_size, return (data_size + info->bl_len - 1) / info->bl_len; }
+__weak u8 spl_genimg_get_arch_id(const char *arch_str) +{ + return IH_ARCH_DEFAULT; +} + int spl_load_simple_fit(struct spl_image_info *spl_image, struct spl_load_info *info, ulong sector, void *fit) { @@ -136,6 +141,7 @@ int spl_load_simple_fit(struct spl_image_info *spl_image, int base_offset, align_len = ARCH_DMA_MINALIGN - 1; int src_sector; void *dst, *src; + const char *arch_str;
/* * Figure out where the external images start. This is the base for the @@ -184,10 +190,12 @@ int spl_load_simple_fit(struct spl_image_info *spl_image, data_offset = fdt_getprop_u32(fit, node, "data-offset"); data_size = fdt_getprop_u32(fit, node, "data-size"); load = fdt_getprop_u32(fit, node, "load"); + arch_str = fdt_getprop(fit, node, "arch", NULL); debug("data_offset=%x, data_size=%x\n", data_offset, data_size); spl_image->load_addr = load; spl_image->entry_point = load; spl_image->os = IH_OS_U_BOOT; + spl_image->arch = spl_genimg_get_arch_id(arch_str);
/* * Work out where to place the image. We read it so that the first diff --git a/include/spl.h b/include/spl.h index bde4437..8223f4b 100644 --- a/include/spl.h +++ b/include/spl.h @@ -20,13 +20,26 @@ #define MMCSD_MODE_FS 2 #define MMCSD_MODE_EMMCBOOT 3
+/* + * Information about an U-Boot image file as described in include/image.h. + * Parsed by the SPL code from a legacy or FIT image file. + * + * @name: descriptive string (mkimage -n) + * @load_addr: address to load the image file to (mkimage -a) + * @entry_point: address of first instruction to execute (mkimage -e) + * @size: size of image in bytes + * @flags: optional, used only for SPL_COPY_PAYLOAD_ONLY so far + * @os: target operating system, one of IH_OS_* (mkimage -O) + * @arch: target architecture, one of IH_ARCH_* (mkimage -A) + */ struct spl_image_info { const char *name; - u8 os; ulong load_addr; ulong entry_point; u32 size; u32 flags; + u8 os; + u8 arch; };
/*

At the moment we use the arch/arm directory for arm64 boards as well, so the Makefile will pick up the "arm" name for the architecture to use for tagging binaries in U-Boot image files. Differentiate between the two by looking at the CPU variable being defined to "armv8", and use the arm64 architecture name on creating the image file if that matches.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Simon Glass sjg@chromium.org Reviewed-by: Tom Rini trini@konsulko.com --- Makefile | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile index 0874964..bae1256 100644 --- a/Makefile +++ b/Makefile @@ -929,13 +929,18 @@ quiet_cmd_cpp_cfg = CFG $@ cmd_cpp_cfg = $(CPP) -Wp,-MD,$(depfile) $(cpp_flags) $(LDPPFLAGS) -ansi \ -DDO_DEPS_ONLY -D__ASSEMBLY__ -x assembler-with-cpp -P -dM -E -o $@ $<
+ifeq ($(CPU),armv8) +IH_ARCH := arm64 +else +IH_ARCH := $(ARCH) +endif ifdef CONFIG_SPL_LOAD_FIT -MKIMAGEFLAGS_u-boot.img = -f auto -A $(ARCH) -T firmware -C none -O u-boot \ +MKIMAGEFLAGS_u-boot.img = -f auto -A $(IH_ARCH) -T firmware -C none -O u-boot \ -a $(CONFIG_SYS_TEXT_BASE) -e $(CONFIG_SYS_UBOOT_START) \ -n "U-Boot $(UBOOTRELEASE) for $(BOARD) board" -E \ $(patsubst %,-b arch/$(ARCH)/dts/%.dtb,$(subst ",,$(CONFIG_OF_LIST))) else -MKIMAGEFLAGS_u-boot.img = -A $(ARCH) -T firmware -C none -O u-boot \ +MKIMAGEFLAGS_u-boot.img = -A $(IH_ARCH) -T firmware -C none -O u-boot \ -a $(CONFIG_SYS_TEXT_BASE) -e $(CONFIG_SYS_UBOOT_START) \ -n "U-Boot $(UBOOTRELEASE) for $(BOARD) board" endif

Since the SPL FIT loader can now differentiate between different architectures, teach it how to tell arm and arm64 apart when a FIT image is used. We just support those two for now, as these are so far the only sensible alternatives.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Simon Glass sjg@chromium.org Reviewed-by: Tom Rini trini@konsulko.com --- arch/arm/lib/spl.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/arch/arm/lib/spl.c b/arch/arm/lib/spl.c index e606d47..45d285c 100644 --- a/arch/arm/lib/spl.c +++ b/arch/arm/lib/spl.c @@ -63,3 +63,18 @@ void __noreturn jump_to_image_linux(struct spl_image_info *spl_image, void *arg) image_entry(0, machid, arg); } #endif + +/* This overwrites the weak definition in spl_fit.c */ +u8 spl_genimg_get_arch_id(const char *arch_str) +{ + if (!arch_str) + return IH_ARCH_DEFAULT; + + if (!strcmp(arch_str, "arm")) + return IH_ARCH_ARM; + + if (!strcmp(arch_str, "arm64")) + return IH_ARCH_ARM64; + + return IH_ARCH_DEFAULT; +}

The ARMv8 capable Allwinner A64 SoC comes out of reset in AArch32 mode. To run AArch64 code, we have to trigger a warm reset via the RMR register, which proceeds with code execution at the address stored in the RVBAR register. If the bootable payload in the FIT image is using a different architecture than the SPL has been compiled for, enter it via this RMR switch mechanism, by writing the entry point address into the MMIO mapped, writable version of the RVBAR register. Then the warm reset is triggered via a system register write. If the payload architecture is the same as the SPL, we use the normal branch as usual.
Signed-off-by: Andre Przywara andre.przywara@arm.com Acked-by: Maxime Ripard maxime.ripard@free-electrons.com --- arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/spl_switch.c | 81 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 82 insertions(+) create mode 100644 arch/arm/mach-sunxi/spl_switch.c
diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile index 7daba11..128091e 100644 --- a/arch/arm/mach-sunxi/Makefile +++ b/arch/arm/mach-sunxi/Makefile @@ -51,4 +51,5 @@ obj-$(CONFIG_MACH_SUN8I_A83T) += dram_sun8i_a83t.o obj-$(CONFIG_MACH_SUN8I_H3) += dram_sun8i_h3.o obj-$(CONFIG_MACH_SUN9I) += dram_sun9i.o obj-$(CONFIG_MACH_SUN50I) += dram_sun8i_h3.o +obj-$(CONFIG_MACH_SUN50I) += spl_switch.o endif diff --git a/arch/arm/mach-sunxi/spl_switch.c b/arch/arm/mach-sunxi/spl_switch.c new file mode 100644 index 0000000..855379e --- /dev/null +++ b/arch/arm/mach-sunxi/spl_switch.c @@ -0,0 +1,81 @@ +/* + * (C) Copyright 2016 ARM Ltd. + * + * SPDX-License-Identifier: GPL-2.0+ + */ + +#include <common.h> +#include <spl.h> + +#include <asm/io.h> +#include <asm/barriers.h> + +static void __noreturn jump_to_image_native(struct spl_image_info *spl_image) +{ + typedef void __noreturn (*image_entry_noargs_t)(void); + + image_entry_noargs_t image_entry = + (image_entry_noargs_t)spl_image->entry_point; + + image_entry(); +} + +/* + * Do a warm-reset via the RMR register to enter the processor in a different + * execution mode. This allows to switch from AArch32 to AArch64 and vice + * versa. Execution starts at the address hold in the RVBAR register, which + * needs to be set before. 
+ */ +static void __noreturn reset_rmr_switch(void) +{ +#ifdef CONFIG_ARM64 + __asm__ volatile ( "mrs x0, RMR_EL3\n\t" + "bic x0, x0, #1\n\t" /* Clear enter-in-64 bit */ + "orr x0, x0, #2\n\t" /* set reset request bit */ + "msr RMR_EL3, x0\n\t" + "isb sy\n\t" + "nop\n\t" + "wfi\n\t" + "b .\n" + ::: "x0"); +#else + __asm__ volatile ( "mrc 15, 0, r0, cr12, cr0, 2\n\t" + "orr r0, r0, #3\n\t" /* request reset in 64 bit */ + "mcr 15, 0, r0, cr12, cr0, 2\n\t" + "isb\n\t" + "nop\n\t" + "wfi\n\t" + "b .\n" + ::: "r0"); +#endif + while (1); /* to avoid a compiler warning about __noreturn */ +} + +void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image) +{ + if (spl_image->arch == IH_ARCH_DEFAULT) { + /* + * If the image to be executed is using the same architecture + * as we are currently running in, just branch to the target + * address. + */ + debug("entering by branch\n"); + jump_to_image_native(spl_image); + } else { + /* + * If the target architecture and the current one differ, use + * the RMR routine to change it. + */ + debug("entering by RMR switch\n"); + /* + * The start address at which execution continues after the + * RMR switch is held in the RVBAR system register, which is + * architecturally read-only. + * Allwinner provides a writeable alias in MMIO space for it. + */ + writel(spl_image->entry_point, 0x17000a0); + DSB; + ISB; + reset_rmr_switch(); + } +}
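The two assembly sequences above differ only in how they treat bit 0 of the RMR value, and the choice between branch and RMR switch boils down to comparing two architectures. A small C model of both, as a sketch only: the enum, the function names and the RMR_* macro names are invented here, while the bit positions (bit 0 selects AArch64, bit 1 requests the reset) follow the comments in the assembly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the comments in the assembly. */
#define RMR_AA64	(1u << 0)	/* come out of warm reset in AArch64 */
#define RMR_RR		(1u << 1)	/* request the warm reset */

enum sketch_arch { SKETCH_ARCH_ARM, SKETCH_ARCH_ARM64 };

/* A plain branch is enough if SPL and payload share one architecture. */
bool needs_rmr_switch(enum sketch_arch spl_arch, enum sketch_arch payload_arch)
{
	return spl_arch != payload_arch;
}

/* Compute the RMR value to write in order to enter the target state. */
uint32_t rmr_request(uint32_t rmr, enum sketch_arch target)
{
	if (target == SKETCH_ARCH_ARM64)
		rmr |= RMR_AA64;	/* AArch32 SPL: "orr r0, r0, #3" */
	else
		rmr &= ~RMR_AA64;	/* AArch64 SPL: "bic x0, x0, #1" */

	return rmr | RMR_RR;
}
```

In the patch itself the comparison is against IH_ARCH_DEFAULT, so the "current" architecture is fixed at build time rather than passed in as a parameter.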

When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding. This should only be used for the SPL, the U-Boot proper should still be using the existing 64-bit configuration. The SPL code can switch to AArch64 if needed, so a 32-bit SPL can be combined with a 64-bit U-Boot proper to eventually launch arm64 kernels.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- board/sunxi/Kconfig | 14 ++++++++++++-- configs/pine64_plus_defconfig | 2 +- configs/sun50i_spl32_defconfig | 10 ++++++++++ 3 files changed, 23 insertions(+), 3 deletions(-) create mode 100644 configs/sun50i_spl32_defconfig
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index 0001133..0d77c3a 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -43,6 +43,10 @@ config SUNXI_GEN_SUN6I watchdog, etc.
+config MACH_SUN50I + bool + select SUNXI_GEN_SUN6I + choice prompt "Sunxi SoC Variant" optional @@ -121,10 +125,16 @@ config MACH_SUN9I select SUNXI_GEN_SUN6I select SUPPORT_SPL
-config MACH_SUN50I +config MACH_SUN50I_64 bool "sun50i (Allwinner A64)" + select MACH_SUN50I select ARM64 - select SUNXI_GEN_SUN6I + select SUPPORT_SPL + +config MACH_SUN50I_32 + bool "sun50i (Allwinner A64) SPL-32bit" + select MACH_SUN50I + select CPU_V7 select SUPPORT_SPL
endchoice diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig index 2374170..a76f66a 100644 --- a/configs/pine64_plus_defconfig +++ b/configs/pine64_plus_defconfig @@ -1,7 +1,7 @@ CONFIG_ARM=y CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y CONFIG_ARCH_SUNXI=y -CONFIG_MACH_SUN50I=y +CONFIG_MACH_SUN50I_64=y CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus" # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set CONFIG_CONSOLE_MUX=y diff --git a/configs/sun50i_spl32_defconfig b/configs/sun50i_spl32_defconfig new file mode 100644 index 0000000..29c6a47 --- /dev/null +++ b/configs/sun50i_spl32_defconfig @@ -0,0 +1,10 @@ +CONFIG_ARM=y +CONFIG_ARCH_SUNXI=y +CONFIG_MACH_SUN50I_32=y +CONFIG_SPL=y +CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus" +CONFIG_OF_LIST="sun50i-a64-pine64 sun50i-a64-pine64-plus" +# CONFIG_CMD_IMLS is not set +# CONFIG_CMD_FLASH is not set +# CONFIG_CMD_FPGA is not set +CONFIG_MMC_SUNXI_SLOT_EXTRA=2

Hi Andre,
On 2 January 2017 at 04:48, Andre Przywara andre.przywara@arm.com wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding. This should only be used for the SPL, the U-Boot proper should still be using the existing 64-bit configuration. The SPL code can switch to AArch64 if needed, so a 32-bit SPL can be combined with a 64-bit U-Boot proper to eventually launch arm64 kernels.
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 14 ++++++++++++-- configs/pine64_plus_defconfig | 2 +- configs/sun50i_spl32_defconfig | 10 ++++++++++ 3 files changed, 23 insertions(+), 3 deletions(-) create mode 100644 configs/sun50i_spl32_defconfig
Following up on the previous discussion, I take it that one problem is that U-Boot does not support building SPL with one toolchain and U-Boot proper with another?
Re wanting to build SPL either as 32-bit or 64-bit, could this be a Kconfig option perhaps?
Anyway, if this is what we have for now, fine. But I'd like to see at least a TODO suggesting a better solution.
Regards, Simon

Hi,
On 13/01/17 02:19, Simon Glass wrote:
Hi Andre,
On 2 January 2017 at 04:48, Andre Przywara andre.przywara@arm.com wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding. This should only be used for the SPL, the U-Boot proper should still be using the existing 64-bit configuration. The SPL code can switch to AArch64 if needed, so a 32-bit SPL can be combined with a 64-bit U-Boot proper to eventually launch arm64 kernels.
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 14 ++++++++++++-- configs/pine64_plus_defconfig | 2 +- configs/sun50i_spl32_defconfig | 10 ++++++++++ 3 files changed, 23 insertions(+), 3 deletions(-) create mode 100644 configs/sun50i_spl32_defconfig
Following up on the previous discussion, I take it that one problem is that U-Boot does not support building SPL with one toolchain and U-Boot proper with another?
Yes. For the time being I have a script [1] that first configures and compiles everything with an ARM cross compiler, then repeats this with an aarch64 cross compiler. Then I combine the 32-bit SPL and the 64-bit U-Boot proper.
While we are at it: I didn't find a Makefile target to just compile the SPL (or just U-Boot proper), is that worth looking at? Or can we somehow influence Kconfig options on the command line to achieve that on that level?
Re wanting to build SPL either as 32-bit or 64-bit, could this be a Kconfig option perhaps?
Sounds like a direction worth investigating. At the moment we have two separate defconfig files, because CPU_V7 and ARM64 are actually mutually exclusive, which is really a pain and the main reason that part wasn't merged.
Anyway, if this is what we have for now, fine. But I'd like to see at least a TODO suggesting a better solution.
So yes, the series was now merged omitting the 32-bit SPL part, which is really optional. For the time being we seem to get away with only 64-bit, but I will definitely revisit this after having handled the higher priority patches in my queue.
Cheers, Andre.

On Fri, Jan 13, 2017 at 09:42:28AM +0000, Andre Przywara wrote:
Re wanting to build SPL either as 32-bit or 64-bit, could this be a Kconfig option perhaps?
Sounds like a direction worth investigating. At the moment we have two separate defconfig files, because CPU_V7 and ARM64 are actually mutually exclusive, which is really a pain and the main reason that part wasn't merged.
I really like that option.
Maxime

On 16/01/17 08:55, Maxime Ripard wrote:
On Fri, Jan 13, 2017 at 09:42:28AM +0000, Andre Przywara wrote:
Re wanting to build SPL either as 32-bit or 64-bit, could this be a Kconfig option perhaps?
Sounds like a direction worth investigating. At the moment we have two separate defconfig files, because CPU_V7 and ARM64 are actually mutually exclusive, which is really a pain and the main reason that part wasn't merged.
I really like that option.
So I gave this a try this weekend, I have something like:
+choice + prompt "32/64 bit build selection" + depends on MACH_SUN50I + +config SUNXI_64BIT_BUILD + bool "64-bit Aarch64 build" + select ARM64 + +config SUNXI_32BIT_BUILD + bool "32-bit ARM build" + select CPU_V7 + select PHYS_64BIT + +endchoice
I then set CONFIG_SUNXI_64BIT_BUILD=y in the defconfig. That seems to work, however switching to a 32-bit build requires either
a) a separate defconfig - which is what we didn't want
b) manually toggling this via menuconfig
c) sed-ing these two lines in .config
I was hoping that there was some simple command line way of toggling a Kconfig option, à la:
$ make pine64_plus_defconfig CONFIG_SUNXI_32BIT_BUILD=y
But that didn't work. mergeconfig.sh wasn't helpful either.
Any ideas on how we could easily switch between the two options?
Cheers, Andre.

On Mon, Jan 16, 2017 at 09:47:06AM +0000, André Przywara wrote:
On 16/01/17 08:55, Maxime Ripard wrote:
On Fri, Jan 13, 2017 at 09:42:28AM +0000, Andre Przywara wrote:
Re wanting to build SPL either as 32-bit or 64-bit, could this be a Kconfig option perhaps?
Sounds like a direction worth investigating. At the moment we have two separate defconfig files, because CPU_V7 and ARM64 are actually mutually exclusive, which is really a pain and the main reason that part wasn't merged.
I really like that option.
So I gave this a try this weekend, I have something like:
+choice
+ prompt "32/64 bit build selection"
+ depends on MACH_SUN50I
+
+config SUNXI_64BIT_BUILD
+ bool "64-bit Aarch64 build"
+ select ARM64
+
+config SUNXI_32BIT_BUILD
+ bool "32-bit ARM build"
+ select CPU_V7
+ select PHYS_64BIT
+
+endchoice
I then set CONFIG_SUNXI_64BIT_BUILD=y in the defconfig. That seems to work, however switching to a 32-bit build requires either
a) a separate defconfig - which is what we didn't want
b) manually toggling this via menuconfig
c) sed-ing these two lines in .config
I was hoping that there was some simple command line way of toggling a Kconfig option, à la:
$ make pine64_plus_defconfig CONFIG_SUNXI_32BIT_BUILD=y
But that didn't work. mergeconfig.sh wasn't helpful either.
Any ideas on how we could easily switch between the two options?
I know that Linux has a way to append defconfig fragments (originally for android options iirc). That could be worth a look.
Maxime

I recently ran into a problem with the UARTs on the A64. Many Bluetooth modules (like Ampak) use the UART. The data rate of EDR BT is 3Mb/s with about 2.1Mb/s throughput. To handle this most systems set the speed of the BT UART to 3Mb/s.
By default the Allwinner UART clock input is OSC24. When using OSC24 the maximum speed the UART can be set to is 1.5Mb/s. The clock input (apb2) can be changed over to PERIPH0x2 (1.2 GHz) via the device tree and 3Mb/s is then supported.
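The numbers above can be checked with a quick calculation. The sketch assumes a 16550-style UART with 16x oversampling, i.e. baud = clock / (16 * divisor); that formula and all function names are assumptions for illustration, not taken from this thread or from the A64 manual.

```c
#include <stdint.h>

/* Highest rate reachable from a given input clock (divisor of 1). */
uint32_t uart_max_baud(uint32_t clk_hz)
{
	return clk_hz / 16;
}

/* Nearest integer divisor for a requested baud rate, never below 1. */
uint32_t uart_divisor(uint32_t clk_hz, uint32_t baud)
{
	uint64_t div = ((uint64_t)clk_hz + 8ull * baud) / (16ull * baud);

	return div ? (uint32_t)div : 1;
}

/* Rate actually produced by a given divisor. */
uint32_t uart_actual_baud(uint32_t clk_hz, uint32_t divisor)
{
	return (uint32_t)(clk_hz / (16ull * divisor));
}
```

Under that assumption a 24 MHz input tops out at 24000000 / 16 = 1.5 Mb/s, while 1.2 GHz with a divisor of 25 gives exactly 3 Mb/s, consistent with the figures quoted above.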
But... there's a problem, UART0 (the console) is using the same master clock source. So when you change the clock input over to PERIPH0x2 the console stops working. There is no mechanism in Linux to handle this clock source change and adjust the dividers on active UARTs. So it would be best if this master clock was set very early in U-Boot and then the console is adjusted to use it.
Are there any downsides to making this change in u-boot?
On Mon, Jan 2, 2017 at 6:48 AM, Andre Przywara andre.przywara@arm.com wrote:
Hi,
hopefully the final version of the SPL support series for the Allwinner A64 SoC. Actually no real code changes this time, just rebased on top of recent master, adding some comments in patches 16/26 and 19/26 following Maxime's suggestions and adding Acked-by:s and Reviewed-by:s. I left the final patch 26/26 in for the sake of completeness, but don't expect it to be merged. We need a clever solution to unify 32-bit and 64-bit board configurations, but that shouldn't hold back this series for now. Merging everything until and including patch 21/26 (sunxi: A64: enable SPL) would be great, the other patches until 25/26 can go in as well, I think.
As with the previous versions, this one includes support for both AArch64 and AArch32 SPL builds. The FIT support is still missing, which means the functionality is limited. Due to the missing ARM Trusted Firmware (ATF) in this firmware chain we lose Ethernet and SMP, among other minor things. A full 64-bit build can be written to an SD card as expected and will boot to the U-Boot proper prompt. However Linux will crash on boot, as PSCI is missing. Building the 32-bit version of the SPL and combining this with an ATF build and the 64-bit U-Boot proper allows FEL booting now:
# sunxi-fel spl sunxi-spl.bin write 0x4a000000 u-boot-dtb.bin \
            write 0x44000 bl31.bin reset64 0x44000
This way of booting the board gives full functionality.
The first patch is a rather simple fix (with no changes to v2). Patches 2-8 prepare the SPL code to be compiled for 64-bit in general and AArch64 in particular. Patches 9-11 refactor the existing boot0 header functionality to be used by patch 12, which introduces the 64-bit switch in the first SPL instructions. Patches 13-20 then introduce the actual core of the SPL support: the DRAM initialization, courtesy of Jens. This piggy backs on the existing H3 DRAM code, deviating where needed. This has been reworked compared to v2: I added a patch from Philipp to replace the rather uninspired register writes in the MBUS priority setup function with some meaningful code, explaining the various bits. Also the actual A64 DRAM code is no longer #ifdef'ed into the H3 driver, but uses parameters to (static) functions. The compiler detects this and removes the dead code from the other variant, resulting in the same binary size for the H3.
Patch 21 finally enables the 64-bit SPL support. So now building the existing pine64_plus_defconfig will generate a sunxi-spl.bin, which can be prepended to the U-Boot proper image (not .bin) to boot from an SD card. Due to the missing ATF support this is of limited usability at the moment, though. Also FEL support requires more love - to switch back to AArch32 before returning to FEL (without crashing, that is ;-), so this is disabled. On my setup this results in a 26KB SPL binary, which is close to the 28K limit mksunxiboot imposes at the moment. Adding anything (like FIT support or DEBUG) will exceed this, and although I have patches to let mksunxiboot get close to 32KB, this is the ultimate frontier.
So patches 22-25 then teach the SPL how to detect a U-Boot image file of a different bitness and do the RMR switch from AArch32 to AArch64, if needed. This is used by the final patch 26, which creates another _defconfig to let the SPL compile for AArch32 using the Thumb2 encoding. This results in a binary of less than 17KB in my case, so has plenty of room for extensions.
Cheers, Andre.
Changelog v3 .. v4:
- rebased on top of latest HEAD
- add various Reviewed-by: and Acked-by: tags
- add comments about register bit meanings in non-ODT-setting fix
- clarify meaning of delay values in single bit delay support patch
- removing stray semicolons from boot0.h header
Changelog v2 .. v3:
- add various Reviewed-by: and Acked-by: tags
- split tiny-printf fix to handle "-" separately
- add various comments and extend commit messages
- add assembly file to re-create the embedded RMR switch code
- add patch 14/26 to explain the MBUS priority setup
- move DRAM r/w delay values into #defines to simplify re-usability
- replace #ifdef'ed addition of A64 support to the H3 DRAM driver with an approach using static parameters
Changelog v1 .. v2:
- drop SPI build fix (already merged)
- confine A31 register init change to H3 and A64
- use IS_ENABLED() instead of #ifdef to guard MBUS2 clock init
- fix tiny-printf (proper sign extension for 32-bit integers)
- add "size" output in commit msg to document tiny-printf size impact
- fix sdelay(): use only one register, add "cc" clobber
- update RMR switch code to provide easy access to RVBAR register address
- drop redundant DRAM frequency setting from Pine64 defconfig
- minor changes as requested by reviewers
Andre Przywara (21):
  sun6i: Restrict some register initialization to Allwinner A31 SoC
  armv8: prevent using THUMB
  armv8: add lowlevel_init.S
  SPL: tiny-printf: add "l" modifier
  SPL: tiny-printf: ignore "-" modifier
  move UL() macro from armv8/mmu.h into common.h
  SPL: make struct spl_image 64-bit safe
  armv8: add simple sdelay implementation
  armv8: move reset branch into boot hook
  ARM: boot0 hook: remove macro, include whole header file
  sunxi: introduce extra config option for boot0 header
  sunxi: A64: do an RMR switch if started in AArch32 mode
  sunxi: provide default DRAM config for sun50i in Kconfig
  sunxi: H3/A64: fix non-ODT setting
  sunxi: DRAM: fix H3 DRAM size display on aarch64
  sunxi: A64: enable SPL
  SPL: read and store arch property from U-Boot image
  Makefile: use "arm64" architecture for U-Boot image files
  ARM: SPL/FIT: differentiate between arm and arm64 arch properties
  sunxi: introduce RMR switch to enter payloads in 64-bit mode
  sunxi: A64: add 32-bit SPL support

Jens Kuske (3):
  sunxi: H3: add and rename some DRAM contoller registers
  sunxi: H3: add DRAM controller single bit delay support
  sunxi: A64: use H3 DRAM initialization code for A64 as well

Philipp Tomsich (2):
  sunxi: H3: Rework MBUS priority setup
  sunxi: clocks: Use the correct pattern register for PLL11
 Makefile | 9 +-
 arch/arm/cpu/armv8/Makefile | 1 +
 arch/arm/cpu/armv8/cpu.c | 14 +
 arch/arm/cpu/armv8/lowlevel_init.S | 44 +++
 arch/arm/cpu/armv8/start.S | 5 +-
 arch/arm/include/asm/arch-bcm235xx/boot0.h | 8 +-
 arch/arm/include/asm/arch-bcm281xx/boot0.h | 8 +-
 arch/arm/include/asm/arch-sunxi/boot0.h | 37 ++-
 arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 +
 arch/arm/include/asm/arch-sunxi/cpu.h | 3 +
 arch/arm/include/asm/arch-sunxi/dram.h | 2 +-
 arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 53 ++--
 arch/arm/include/asm/armv8/mmu.h | 8 -
 arch/arm/lib/Makefile | 2 +
 arch/arm/lib/spl.c | 15 +
 arch/arm/lib/vectors.S | 1 -
 arch/arm/mach-omap2/boot-common.c | 2 +-
 arch/arm/mach-sunxi/Makefile | 2 +
 arch/arm/mach-sunxi/board.c | 2 +-
 arch/arm/mach-sunxi/clock_sun6i.c | 10 +-
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 400 +++++++++++++++++-------
 arch/arm/mach-sunxi/rmr_switch.S | 41 +++
 arch/arm/mach-sunxi/spl_switch.c | 81 +++++
 arch/arm/mach-tegra/spl.c | 2 +-
 board/sunxi/Kconfig | 41 ++-
 common/spl/spl.c | 9 +-
 common/spl/spl_fit.c | 8 +
 common/spl/spl_mmc.c | 2 +-
 configs/pine64_plus_defconfig | 7 +-
 configs/sun50i_spl32_defconfig | 10 +
 include/common.h | 13 +-
 include/configs/sunxi-common.h | 4 +-
 include/spl.h | 19 +-
 lib/tiny-printf.c | 50 ++-
 34 files changed, 713 insertions(+), 201 deletions(-)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S
 create mode 100644 arch/arm/mach-sunxi/rmr_switch.S
 create mode 100644 arch/arm/mach-sunxi/spl_switch.c
 create mode 100644 configs/sun50i_spl32_defconfig
-- 2.8.2

On Tue, Jan 3, 2017 at 3:38 AM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
> I recently ran into a problem with the UARTs on the A64. Many Bluetooth modules (like Ampak) use the UART. The data rate of EDR BT is 3Mb/s with about 2.1Mb/s throughput. To handle this most systems set the speed of the BT UART to 3Mb/s.
> By default the Allwinner UART clock input is OSC24. When using OSC24 the maximum speed the UART can be set to is 1.5Mb/s. The clock input (apb2) can be changed over to PERIPH0x2 (1.2 GHz) via the device tree and 3Mb/s is then supported.
> But... there's a problem, UART0 (the console) is using the same master clock source. So when you change the clock input over to PERIPH0x2 the console stops working. There is no mechanism in Linux to handle this clock source change and adjust the dividers on active uarts. So it would be best if this master clock was set very early in u-boot and then the console is adjusted to use it.
> Are there any downsides to making this change in u-boot?
I don't understand: did you find this behaviour with these SPL changes or with general sunxi u-boot?
thanks!
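For reference, the 1.5Mb/s ceiling Jon describes is consistent with a 16550-style baud generator, where the baud rate is the input clock divided by 16 times an integer divisor. The sketch below is only an illustration under that assumption (the 8250_dw driver comes up later in the thread, so a 16x-oversampling UART seems a fair guess). The function names are made up for this example and are not U-Boot or Linux APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Highest baud rate reachable with a 16x-oversampling UART: divisor = 1. */
static uint32_t uart_max_baud(uint32_t clk_hz)
{
    return clk_hz / 16u;
}

/*
 * Integer divisor (rounded to nearest) for the requested baud rate.
 * Returns 0 when the input clock is too slow to reach that rate at all.
 */
static uint32_t uart_divisor(uint32_t clk_hz, uint32_t baud)
{
    assert(baud != 0);
    if (baud > uart_max_baud(clk_hz))
        return 0;
    return (clk_hz + 8u * baud) / (16u * baud);
}
```

Under these assumptions a 24 MHz input tops out at 1.5Mb/s, so a 3Mb/s request yields no valid divisor, while a 48 MHz input reaches 3Mb/s with a divisor of 1.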

On Tue, Jan 3, 2017 at 5:41 AM, Jagan Teki jagannadh.teki@gmail.com wrote:
> I don't understand: did you find this behaviour with these SPL changes or with general sunxi u-boot?
Previously the boot console uart0 was getting set up in the SPL code. I have not been closely following these changes, is that still true?
Changing the clock parent needs to be done before uart0 is initialized. Changing this parent should have no other impact on u-boot other than changing the clock divisor uart0 is using.
Once Linux is up, the Linux uart code will see the changed clock parent and allow higher baud rates to be set.
This clock parent also impacts the I2C clocks, but I don't believe I2C is enabled in A64 uboot.
> thanks!
> Jagan Teki
> Free Software Engineer | www.openedev.com
> U-Boot, Linux | Upstream Maintainer
> Hyderabad, India.

On Tue, Jan 3, 2017 at 2:52 PM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
I think this issue needs to be resolved. Andre, any comments?
Jagan.

On Wed, Jan 4, 2017 at 6:28 PM, Jagan Teki jagan@openedev.com wrote:
> I think this issue needs to be resolved. Andre, any comments?
This is a completely different issue unrelated to A64 SPL support.
What Jon is saying is that for the UART to go faster than 1.5 Mb/s, the APB2 clock has to be reparented to the peripheral PLL. When we do this is the question. This is a generic sunxi issue.
Now, I think doing this as soon as possible (with regards to the running system) would be best. Reparenting the clk on the fly would change the baud rate, and result in the uart glitching. Also, as Jon mentioned, the 8250_dw driver in the kernel doesn't support clk notifiers. And it won't work with earlycon anyway...
Regards ChenYu
> Jagan.
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot

On 04/01/17 11:25, Chen-Yu Tsai wrote:
> This is a completely different issue unrelated to A64 SPL support.
I agree that's a completely orthogonal issue. Someone needs to bake a patch (easy?) and post it. This doesn't depend in any way on this series, in fact it would affect many sunxi boards.
> What Jon is saying is that for the UART to go faster than 1.5 Mb/s, the APB2 clock has to be reparented to the peripheral PLL. When we do this is the question. This is a generic sunxi issue.
At first glance this approach sounds a bit hackish, since we use firmware to set up the clocks in a way that solves one particular issue.
On the other hand using PERIPH0(2x) as a base for APB2 seems like a completely proper setup, even somewhat recommended by Allwinner (after all the UARTs are based on this special clock for a reason).
> Now, I think doing this as soon as possible (with regards to the running system) would be best. Reparenting the clk on the fly would change the baud rate, and result in the uart glitching.
Can't we change it when observing the proper order: turning TX/RX off, programming new divisors, changing the clock source, turning TX/RX back on?
Nevertheless it seems worthwhile to give this rather simple U-Boot approach a go. But I would like to see some testing, since this will affect many sunxi boards.
> Also, as Jon mentioned, the 8250_dw driver in the kernel doesn't support clk notifiers. And it won't work with earlycon anyway...
As a long-term goal teaching the Linux driver to reparent APB2 seems like a good thing, though I expect some nastiness in this.
Cheers, Andre.
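The ordering Andre asks about can be sketched with a small software model. Everything here is hypothetical: the structures stand in for real hardware, there is no register access, the names are invented for this example, and every UART in the array is assumed to be active. It only illustrates the sequence (quiesce, reprogram divisors, switch the parent, resume) applied to all consumers of the shared clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one UART; not a real register layout. */
struct uart_model {
    bool     enabled;  /* TX/RX currently running */
    uint32_t baud;     /* baud rate the user asked for */
    uint32_t divisor;  /* 16x-oversampling divisor */
};

/* Hypothetical model of the shared parent clock (APB2 in the thread). */
struct clk_model {
    uint32_t rate_hz;
};

/* Nearest integer divisor; the clock must be fast enough for the baud rate. */
static uint32_t divisor_for(uint32_t clk_hz, uint32_t baud)
{
    assert(baud != 0 && baud <= clk_hz / 16u);
    return (clk_hz + 8u * baud) / (16u * baud);
}

static void reparent_shared_clock(struct clk_model *clk, uint32_t new_rate_hz,
                                  struct uart_model *uarts, unsigned int n)
{
    unsigned int i;

    for (i = 0; i < n; i++)           /* 1. turn TX/RX off */
        uarts[i].enabled = false;
    for (i = 0; i < n; i++)           /* 2. program the new divisors */
        uarts[i].divisor = divisor_for(new_rate_hz, uarts[i].baud);
    clk->rate_hz = new_rate_hz;       /* 3. change the clock source */
    for (i = 0; i < n; i++)           /* 4. turn TX/RX back on */
        uarts[i].enabled = true;
}
```

In this model a 115200 baud console keeps its rate across a switch from 24 MHz to 48 MHz because its divisor goes from 13 to 26. Whether real hardware tolerates the switch without glitches is exactly the open question in the thread.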

On Wed, Jan 4, 2017 at 8:36 AM, André Przywara andre.przywara@arm.com wrote:
> Can't we change it when observing the proper order: turning TX/RX off, programming new divisors, changing the clock source, turning TX/RX back on?
You would expect to be able to achieve this reparenting by simply changing the device tree in Linux. I tried that on the Allwinner 3.10 kernel and it actually works. But... it causes the console to quit working.
Do Linux clk_notifiers work soon enough to handle reparenting the console in the device tree? It is not clear to me that they do. Plus the 8250_dw driver doesn't support it.
Next I considered changing it in the u-boot device tree. But again, console is set up before u-boot loads that device tree. In Allwinner uboot console is set up in the SPL code before the device tree is loaded.
Is the PERIPH0(2x) clock always running? Maybe defaulting UARTs/I2C to OSC24 was done to save power?
After investigating this I now understand why Allwinner modified the standard Broadcom Bluetooth driver in AOSP. They fixed it to run with a 1.5Mb UART. Everyone else in AOSP uses it with a 3Mb/s UART. All of this started because we were unaware of the changes Allwinner had made to the Broadcom AOSP code and we couldn't get AOSP Bluetooth to work. Who knows why Allwinner didn't just adjust the UART to run at 3Mb/s. Probably two different programmer groups.

On 04/01/17 16:40, jonsmirl@gmail.com wrote:
> You would expect to be able to achieve this reparenting by simply changing the device tree in Linux.
Why would this be done in DT? The APB2 clock is capable of being driven by 32 kHz, 24 MHz or PERIPH0(2x), the DT should express this. The old Allwinner binding certainly did, the new one hides this in the driver. But as the hardware allows it, the DT shouldn't have a say in the parenting decision - apart from listing the alternatives. This is actually a driver or clock system decision.
> I tried that on the Allwinner 3.10
<connection lost>
> kernel and it actually works. But... it causes the console to quit working.
> Do Linux clk_notifiers work soon enough to handle reparenting the console in the device tree? It is not clear to me that they do. Plus the 8250_dw driver doesn't support it.
I believe Chen-Yu meant that clk_notifiers would allow notifying all clock users of some changed situation (like a different parent clock). As you know, _all_ UARTs plus I2C use this clock, so if *one* requires a higher base clock, *all* users have to be notified to change their clock divisors to match the new base frequency. This has nothing to do with device tree directly.
> Next I considered changing it in the u-boot device tree. But again, console is set up before u-boot loads that device tree. In Allwinner uboot console is set up in the SPL code before the device tree is loaded.
Traditionally for sunxi boards in upstream U-Boot we use the DT only sparingly. IIRC DT clock information is completely ignored and there is just some static setup - which is way easier and sufficient for our needs. The SPL doesn't use DT at all (mostly for space constraints).
So as I said, changing this in U-Boot looks like an easy patch: PLL6 is set up to 600 MHz already, so we just need to change the clock source for APB2 to that and adjust the dividers.
Do you have an idea what a good APB2 frequency would be? IIRC someone on IRC mentioned that the UARTs can't do much higher than 3 Mb/s anyway, so I guess 48 MHz or 96 MHz would be enough? Allwinner's I2C is limited to 400 kHz anyway, so there is no need for a higher clock here.
It would be great if someone could do experiments to get the highest usable baudrate.
> Is the PERIPH0(2x) clock always running?
It seems so, anyway the Linux clock system would take care of this now, as the console UART would be at least one user. Though I believe there are more users, so it's unlikely that it would get turned off anyway.
Cheers, Andre.
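To put numbers on the 48 MHz versus 96 MHz question, here is a minimal search for an exact APB2 divider. It rests on assumptions that are not stated in this thread: that APB2 is derived from its parent through a power-of-two pre-divider N (1, 2, 4 or 8) followed by a linear divider M (1 to 32), and that the PERIPH0(2x) parent runs at the 1.2 GHz Jon quotes. The struct and function names are invented for this example, so please check the divider ranges against the SoC manual before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed divider pair: rate = parent / (n * m). */
struct apb2_div {
    uint32_t n;  /* pre-divider, assumed to be 1, 2, 4 or 8 */
    uint32_t m;  /* linear divider, assumed to be 1..32 */
};

/* Find a divider pair that hits target_hz exactly; false if none exists. */
static bool apb2_find_exact(uint32_t parent_hz, uint32_t target_hz,
                            struct apb2_div *out)
{
    uint32_t n, m;

    assert(target_hz != 0 && out != NULL);
    for (n = 1; n <= 8; n *= 2) {
        for (m = 1; m <= 32; m++) {
            /* 64-bit product avoids overflow for large targets. */
            if ((uint64_t)target_hz * n * m == parent_hz) {
                out->n = n;
                out->m = m;
                return true;
            }
        }
    }
    return false;
}
```

Under these assumptions 48 MHz is reachable from 1.2 GHz with N=1 and M=25, while 96 MHz has no exact integer divider, so it would only be approximated.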

On Wed, Jan 4, 2017 at 12:29 PM, André Przywara andre.przywara@arm.com wrote:
> Do you have an idea what a good APB2 frequency would be? IIRC someone on IRC mentioned that the UARTs can't do much higher than 3 Mb/s anyway, so I guess 48 MHz or 96 MHz would be enough? Allwinner's I2C is limited to 400 kHz anyway, so there is no need for a higher clock here.
In the datasheet there is a note recommending not to change this 600 MHz (2x: 1.2 GHz).

On 04/01/17 19:00, jonsmirl@gmail.com wrote:
On Wed, Jan 4, 2017 at 12:29 PM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 16:40, jonsmirl@gmail.com wrote:
On Wed, Jan 4, 2017 at 8:36 AM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 11:25, Chen-Yu Tsai wrote:
On Wed, Jan 4, 2017 at 6:28 PM, Jagan Teki jagan@openedev.com wrote:
On Tue, Jan 3, 2017 at 2:52 PM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
> On Tue, Jan 3, 2017 at 5:41 AM, Jagan Teki jagannadh.teki@gmail.com wrote:
>> On Tue, Jan 3, 2017 at 3:38 AM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
>>> I recently ran into a problem with the UARTs on the A64. Many
>>> Bluetooth modules (like Ampak) use the UART. The data rate of EDR BT
>>> is 3Mb/s with about 2.1Mb/s throughput. To handle this most systems
>>> set the speed of the BT UART to 3Mb/s.
>>>
>>> By default the Allwinner UART clock input is OSC24. When using OSC24
>>> the maximum speed the UART can be set to is 1.5Mb/s. The clock input
>>> (apb2) can be changed over to PERIPH0x2 (1.2 GHz) via the device tree
>>> and 3Mb/s is then supported.
>>>
>>> But... there's a problem, UART0 (the console) is using the same master
>>> clock source. So when you change the clock input over to PERIPH0x2 the
>>> console stops working. There is no mechanism in Linux to handle this
>>> clock source change and adjust the dividers on active uarts. So it
>>> would be best if this master clock was set very early in u-boot and
>>> then the console is adjusted to use it.
>>>
>>> Are there any downsides to making this change in u-boot?
>>
>> I don't understand, did you find this behaviour with these SPL changes
>> or general sunxi u-boot?
I think this issue needs to be resolved. Andre, any comments?
This is a completely different issue unrelated to A64 SPL support.
I agree that's a completely orthogonal issue. Someone needs to bake a patch (easy?) and post it. This doesn't depend in any way on this series, in fact would affect many sunxi boards.
What Jon is saying is that for the UART to go faster than 1.5 Mb/s, The APB2 clock has to be reparented to the peripheral PLL. When do we do this is the question. This is a generic sunxi issue.
At first glance this approach sounds a bit hackish, since we use firmware to set up the clocks in a way that solves a particular issue.
On the other hand using PERIPH0(2x) as a base for APB2 seems like a completely proper setup, even somewhat recommended by Allwinner (after all the UARTs are based on this special clock for a reason).
Now, I think doing this as soon as possible (with regards to the running system) would be best. Reparenting the clk on the fly would change the baud rate, and result in the uart glitching.
Can't we change it when observing the proper order: turning TX/RX off, programming new divisors, changing the clock source, turning TX/RX back on?
You would expect to be able to achieve this reparenting by simply changing the device tree in Linux.
Why would this be done in DT? The APB2 clock is capable of being driven by 32KHz, 24 MHz or PERIPH0(2x), the DT should express this. The old Allwinner binding certainly did, the new hides this in the driver. But as the hardware allows it, the DT shouldn't have a say in the parenting decision - apart from listing the alternatives. This is actually a driver or clock system decision.
I tried that on the Allwinner 3.10
<connection lost>
kernel and it actually works. But... it causes the console to quit working.
Do Linux clk_notifiers work soon enough to handle reparenting the console in the device tree? It is not clear to me that they do. Plus the 8250_dw driver doesn't support it.
I believe Chen-Yu meant that clk_notifiers would allow notifying all clock users of some changed situation (like a different parent clock). As you know, _all_ UARTs plus I2C use this clock, so if *one* requires a higher base clock, *all* users have to be notified to change their clock divisors to match the new base frequency. This has nothing to do with device tree directly.
Next I considered changing it in the u-boot device tree. But again, console is set up before u-boot loads that device tree. In Allwinner uboot console is set up in the SPL code before the device tree is loaded.
Traditionally for sunxi boards in upstream U-Boot we use the DT only sparingly. IIRC DT clock information is completely ignored and there is just some static setup - which is way easier and sufficient for our needs. The SPL doesn't use DT at all (mostly for space constraints).
So as I said changing this in U-Boot looks like an easy patch, PLL6 is setup to 600 MHz already, so we just need to change the clock source for APB2 to that and adjust the dividers.
Do you have an idea what a good APB2 frequency would be? IIRC someone on IRC mentioned that the UARTs can't do much higher than 3 Mb/s anyway, so I guess 48 MHz or 96 MHz would be enough? Allwinner's I2C is limited to 400 kHz anyway, so there is no need for a higher clock here.
In the datasheet there is a note recommending not to change this 600Mhz (2x 1.2Ghz).
That refers to PERIPH0, I was asking for the APB2 frequency. Looking at the achievable baud rates and error rates for common speeds (115200) I came up with a divider of 5:
1200 MHz / 5 = 240 MHz APB2 frequency
240 MHz / 16 = 15 Mbps maximum baud rate

divider | baud rate | error
      5 | 3 Mbps    | 0.0%
     10 | 1.5 Mbps  | 0.0%
     16 | 921600    | 1.7%
    130 | 115200    | 0.16%

Reports seem to indicate that at least 12.5 Mbps is feasible with the UARTs, so this maximum of 15 Mbps seems like a good approach, because it gives 3 and 1.5 Mbps without errors.
Alternatives would be max. baud rate of 3 Mbps (still limiting) or the system maximum of 75 Mbps, but clocking APB2 at 1.2 GHz sounds a bit over the top for me.
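The divider table above can be reproduced with a few lines of C. This is only a minimal sketch, assuming a 16550-style UART that oversamples by 16 and supports integer divisors only; the function names are made up for illustration and are not taken from U-Boot or Linux.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Assumption: a 16550-style UART that oversamples by 16 and only supports
 * an integer divisor. Names are illustrative, not U-Boot or Linux API.
 */

/* Integer divisor closest to clk_hz / (16 * baud), at least 1. */
static unsigned long uart_divisor(unsigned long clk_hz, unsigned long baud)
{
	unsigned long div;

	assert(baud > 0);
	div = (clk_hz + 8UL * baud) / (16UL * baud);
	return div ? div : 1;
}

/* Baud rate the hardware actually produces for a given divisor. */
static double uart_actual_baud(unsigned long clk_hz, unsigned long div)
{
	return (double)clk_hz / (16.0 * (double)div);
}

/* Absolute deviation from the requested baud rate, in percent. */
static double uart_error_percent(unsigned long clk_hz, unsigned long baud)
{
	double actual = uart_actual_baud(clk_hz, uart_divisor(clk_hz, baud));
	double diff = actual - (double)baud;

	if (diff < 0)
		diff = -diff;
	return diff * 100.0 / (double)baud;
}

/* Print divisor and error for the baud rates discussed in the thread. */
static void print_baud_table(unsigned long clk_hz)
{
	static const unsigned long rates[] = { 3000000, 1500000, 921600, 115200 };
	size_t i;

	for (i = 0; i < sizeof(rates) / sizeof(rates[0]); i++)
		printf("%7lu baud: divisor %3lu, error %.2f%%\n",
		       rates[i], uart_divisor(clk_hz, rates[i]),
		       uart_error_percent(clk_hz, rates[i]));
}
```

Calling print_baud_table(240000000UL) from a main() gives the divisors 5, 10, 16 and 130. With the 24 MHz OSC24M clock, uart_divisor(24000000UL, 3000000UL) bottoms out at 1, which corresponds to 1.5 Mbaud, the limit Jon ran into.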
Jon, can you try this U-Boot patch (copy & pasted in, so probably won't apply cleanly):
--- a/arch/arm/mach-sunxi/clock_sun6i.c
+++ b/arch/arm/mach-sunxi/clock_sun6i.c
@@ -65,9 +65,9 @@ void clock_init_uart(void)
 		(struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
 
 	/* uart clock source is apb2 */
-	writel(APB2_CLK_SRC_OSC24M|
+	writel(APB2_CLK_SRC_PLL6|
 	       APB2_CLK_RATE_N_1|
-	       APB2_CLK_RATE_M(1),
+	       APB2_CLK_RATE_M(5),
 	       &ccm->apb2_div);
 
 	/* open the clock for uart */
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -42,7 +42,7 @@
 /* Serial & console */
 #define CONFIG_SYS_NS16550_SERIAL
 /* ns16550 reg in the low bits of cpu reg */
-#define CONFIG_SYS_NS16550_CLK		24000000
+#define CONFIG_SYS_NS16550_CLK		240000000
 #ifndef CONFIG_DM_SERIAL
 # define CONFIG_SYS_NS16550_REG_SIZE	-4
 # define CONFIG_SYS_NS16550_COM1		SUNXI_UART0_BASE
Cheers, Andre.
It would be great if someone could do experiments to get the highest usable baudrate.
Is the PERIPH0(2x) clock always running?
It seems so, anyway the Linux clock system would take care of this now, as the console UART would be at least one user. Though I believe there are more users, so it's unlikely that it would get turned off anyway.
Cheers, Andre.
Maybe defaulting UARTs/I2C to OSC24 was done to save power?
After investigating this I now understand why Allwinner modified the standard Broadcom Bluetooth driver in AOSP. They fixed it to run with a 1.5Mb UART. Everyone else in AOSP uses it with a 3Mb/s UART. All of this started because we were unaware of the changes Allwinner had made to the Broadcom AOSP code and we couldn't get AOSP Bluetooth to work. Who knows why Allwinner didn't just adjust the UART to run at 3Mb/s. Probably two different programmer groups.
Nevertheless it seems worthwhile to give this rather simple U-Boot approach a go. But I would like to see some testing, since this will affect many sunxi boards.
Also, as Jon mentioned, the 8250_dw driver in the kernel doesn't support clk notifiers. And it won't work with earlycon anyway...
As a long-term goal teaching the Linux driver to reparent APB2 seems like a good thing, though I expect some nastiness in this.
Cheers, Andre.
>
> Previously the boot console uart0 was getting setup in the SPL code. I
> have not been closely following these changes, is that still true?
>
> Changing the clock parent needs to be done before uart0 is
> initialized. Changing this parent should have no other impact on
> u-boot other than changing the clock divisor uart0 is using.
>
> Once Linux is up, the Linux uart code will see the changed clock
> parent and allow higher baud rates to be set.
>
> This clock parent also impacts the I2C clocks, but I don't believe I2C
> is enabled in A64 uboot.
Jagan. _______________________________________________ U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot

On Wed, Jan 4, 2017 at 5:36 PM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 19:00, jonsmirl@gmail.com wrote:
On Wed, Jan 4, 2017 at 12:29 PM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 16:40, jonsmirl@gmail.com wrote:
On Wed, Jan 4, 2017 at 8:36 AM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 11:25, Chen-Yu Tsai wrote:
On Wed, Jan 4, 2017 at 6:28 PM, Jagan Teki jagan@openedev.com wrote:
> On Tue, Jan 3, 2017 at 2:52 PM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
>> On Tue, Jan 3, 2017 at 5:41 AM, Jagan Teki jagannadh.teki@gmail.com wrote:
>>> On Tue, Jan 3, 2017 at 3:38 AM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
>>>> I recently ran into a problem with the UARTs on the A64. Many
>>>> Bluetooth modules (like Ampak) use the UART. The data rate of EDR BT
>>>> is 3Mb/s with about 2.1Mb/s throughput. To handle this most systems
>>>> set the speed of the BT UART to 3Mb/s.
>>>>
>>>> By default the Allwinner UART clock input is OSC24. When using OSC24
>>>> the maximum speed the UART can be set to is 1.5Mb/s. The clock input
>>>> (apb2) can be changed over to PERIPH0x2 (1.2 GHz) via the device tree
>>>> and 3Mb/s is then supported.
>>>>
>>>> But... there's a problem, UART0 (the console) is using the same master
>>>> clock source. So when you change the clock input over to PERIPH0x2 the
>>>> console stops working. There is no mechanism in Linux to handle this
>>>> clock source change and adjust the dividers on active uarts. So it
>>>> would be best if this master clock was set very early in u-boot and
>>>> then the console is adjusted to use it.
>>>>
>>>> Are there any downsides to making this change in u-boot?
>>>
>>> I don't understand, did you find this behaviour with these SPL changes
>>> or general sunxi u-boot?
>
> I think this issue needs to be resolved. Andre, any comments?
This is a completely different issue unrelated to A64 SPL support.
I agree that's a completely orthogonal issue. Someone needs to bake a patch (easy?) and post it. This doesn't depend in any way on this series, in fact would affect many sunxi boards.
What Jon is saying is that for the UART to go faster than 1.5 Mb/s, The APB2 clock has to be reparented to the peripheral PLL. When do we do this is the question. This is a generic sunxi issue.
At first glance this approach sounds a bit hackish, since we use firmware to set up the clocks in a way that solves a particular issue.
On the other hand using PERIPH0(2x) as a base for APB2 seems like a completely proper setup, even somewhat recommended by Allwinner (after all the UARTs are based on this special clock for a reason).
Now, I think doing this as soon as possible (with regards to the running system) would be best. Reparenting the clk on the fly would change the baud rate, and result in the uart glitching.
Can't we change it when observing the proper order: turning TX/RX off, programming new divisors, changing the clock source, turning TX/RX back on?
You would expect to be able to achieve this reparenting by simply changing the device tree in Linux.
Why would this be done in DT? The APB2 clock is capable of being driven by 32KHz, 24 MHz or PERIPH0(2x), the DT should express this. The old Allwinner binding certainly did, the new hides this in the driver. But as the hardware allows it, the DT shouldn't have a say in the parenting decision - apart from listing the alternatives. This is actually a driver or clock system decision.
I tried that on the Allwinner 3.10
<connection lost>
kernel and it actually works. But... it causes the console to quit working.
Do Linux clk_notifiers work soon enough to handle reparenting the console in the device tree? It is not clear to me that they do. Plus the 8250_dw driver doesn't support it.
I believe Chen-Yu meant that clk_notifiers would allow notifying all clock users of some changed situation (like a different parent clock). As you know, _all_ UARTs plus I2C use this clock, so if *one* requires a higher base clock, *all* users have to be notified to change their clock divisors to match the new base frequency. This has nothing to do with device tree directly.
Next I considered changing it in the u-boot device tree. But again, console is set up before u-boot loads that device tree. In Allwinner uboot console is set up in the SPL code before the device tree is loaded.
Traditionally for sunxi boards in upstream U-Boot we use the DT only sparingly. IIRC DT clock information is completely ignored and there is just some static setup - which is way easier and sufficient for our needs. The SPL doesn't use DT at all (mostly for space constraints).
So as I said changing this in U-Boot looks like an easy patch, PLL6 is setup to 600 MHz already, so we just need to change the clock source for APB2 to that and adjust the dividers.
Do you have an idea what a good APB2 frequency would be? IIRC someone on IRC mentioned that the UARTs can't do much higher than 3 Mb/s anyway, so I guess 48 MHz or 96 MHz would be enough? Allwinner's I2C is limited to 400 kHz anyway, so there is no need for a higher clock here.
In the datasheet there is a note recommending not to change this 600Mhz (2x 1.2Ghz).
That refers to PERIPH0, I was asking for the APB2 frequency. Looking at the achievable baud rates and error rates for common speeds (115200) I came up with a divider of 5:
1200 MHz / 5 = 240 MHz APB2 frequency
240 MHz / 16 = 15 Mbps maximum baud rate

divider | baud rate | error
      5 | 3 Mbps    | 0.0%
     10 | 1.5 Mbps  | 0.0%
     16 | 921600    | 1.7%
    130 | 115200    | 0.16%

Reports seem to indicate that at least 12.5 Mbps is feasible with the UARTs, so this maximum of 15 Mbps seems like a good approach, because it gives 3 and 1.5 Mbps without errors.
Alternatives would be max. baud rate of 3 Mbps (still limiting) or the system maximum of 75 Mbps, but clocking APB2 at 1.2 GHz sounds a bit over the top for me.
Jon, can you try this U-Boot patch (copy & pasted in, so probably won't apply cleanly):
Which U-Boot is this against? I am not using mainline U-Boot, we are on the Allwinner stuff.
--- a/arch/arm/mach-sunxi/clock_sun6i.c
+++ b/arch/arm/mach-sunxi/clock_sun6i.c
@@ -65,9 +65,9 @@ void clock_init_uart(void)
 		(struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
 
 	/* uart clock source is apb2 */
-	writel(APB2_CLK_SRC_OSC24M|
+	writel(APB2_CLK_SRC_PLL6|
 	       APB2_CLK_RATE_N_1|
-	       APB2_CLK_RATE_M(1),
+	       APB2_CLK_RATE_M(5),
 	       &ccm->apb2_div);
 
 	/* open the clock for uart */
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -42,7 +42,7 @@
 /* Serial & console */
 #define CONFIG_SYS_NS16550_SERIAL
 /* ns16550 reg in the low bits of cpu reg */
-#define CONFIG_SYS_NS16550_CLK		24000000
+#define CONFIG_SYS_NS16550_CLK		240000000
 #ifndef CONFIG_DM_SERIAL
 # define CONFIG_SYS_NS16550_REG_SIZE	-4
 # define CONFIG_SYS_NS16550_COM1		SUNXI_UART0_BASE
Cheers, Andre.
It would be great if someone could do experiments to get the highest usable baudrate.
Is the PERIPH0(2x) clock always running?
It seems so, anyway the Linux clock system would take care of this now, as the console UART would be at least one user. Though I believe there are more users, so it's unlikely that it would get turned off anyway.
Cheers, Andre.
Maybe defaulting UARTs/I2C to OSC24 was done to save power?
After investigating this I now understand why Allwinner modified the standard Broadcom Bluetooth driver in AOSP. They fixed it to run with a 1.5Mb UART. Everyone else in AOSP uses it with a 3Mb/s UART. All of this started because we were unaware of the changes Allwinner had made to the Broadcom AOSP code and we couldn't get AOSP Bluetooth to work. Who knows why Allwinner didn't just adjust the UART to run at 3Mb/s. Probably two different programmer groups.
Nevertheless it seems worthwhile to give this rather simple U-Boot approach a go. But I would like to see some testing, since this will affect many sunxi boards.
Also, as Jon mentioned, the 8250_dw driver in the kernel doesn't support clk notifiers. And it won't work with earlycon anyway...
As a long-term goal teaching the Linux driver to reparent APB2 seems like a good thing, though I expect some nastiness in this.
Cheers, Andre.
>
>>
>> Previously the boot console uart0 was getting setup in the SPL code. I
>> have not been closely following these changes, is that still true?
>>
>> Changing the clock parent needs to be done before uart0 is
>> initialized. Changing this parent should have no other impact on
>> u-boot other than changing the clock divisor uart0 is using.
>>
>> Once Linux is up, the Linux uart code will see the changed clock
>> parent and allow higher baud rates to be set.
>>
>> This clock parent also impacts the I2C clocks, but I don't believe I2C
>> is enabled in A64 uboot.
>
> Jagan.
> _______________________________________________
> U-Boot mailing list
> U-Boot@lists.denx.de
> http://lists.denx.de/mailman/listinfo/u-boot

On 04/01/17 22:59, jonsmirl@gmail.com wrote:
On Wed, Jan 4, 2017 at 5:36 PM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 19:00, jonsmirl@gmail.com wrote:
On Wed, Jan 4, 2017 at 12:29 PM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 16:40, jonsmirl@gmail.com wrote:
On Wed, Jan 4, 2017 at 8:36 AM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 11:25, Chen-Yu Tsai wrote:
> On Wed, Jan 4, 2017 at 6:28 PM, Jagan Teki jagan@openedev.com wrote:
>> On Tue, Jan 3, 2017 at 2:52 PM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
>>> On Tue, Jan 3, 2017 at 5:41 AM, Jagan Teki jagannadh.teki@gmail.com wrote:
>>>> On Tue, Jan 3, 2017 at 3:38 AM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
>>>>> I recently ran into a problem with the UARTs on the A64. Many
>>>>> Bluetooth modules (like Ampak) use the UART. The data rate of EDR BT
>>>>> is 3Mb/s with about 2.1Mb/s throughput. To handle this most systems
>>>>> set the speed of the BT UART to 3Mb/s.
>>>>>
>>>>> By default the Allwinner UART clock input is OSC24. When using OSC24
>>>>> the maximum speed the UART can be set to is 1.5Mb/s. The clock input
>>>>> (apb2) can be changed over to PERIPH0x2 (1.2 GHz) via the device tree
>>>>> and 3Mb/s is then supported.
>>>>>
>>>>> But... there's a problem, UART0 (the console) is using the same master
>>>>> clock source. So when you change the clock input over to PERIPH0x2 the
>>>>> console stops working. There is no mechanism in Linux to handle this
>>>>> clock source change and adjust the dividers on active uarts. So it
>>>>> would be best if this master clock was set very early in u-boot and
>>>>> then the console is adjusted to use it.
>>>>>
>>>>> Are there any downsides to making this change in u-boot?
>>>>
>>>> I don't understand, did you find this behaviour with these SPL changes
>>>> or general sunxi u-boot?
>>
>> I think this issue needs to be resolved. Andre, any comments?
>
> This is a completely different issue unrelated to A64 SPL support.
I agree that's a completely orthogonal issue. Someone needs to bake a patch (easy?) and post it. This doesn't depend in any way on this series, in fact would affect many sunxi boards.
> What Jon is saying is that for the UART to go faster than 1.5 Mb/s,
> The APB2 clock has to be reparented to the peripheral PLL. When do
> we do this is the question. This is a generic sunxi issue.
At first glance this approach sounds a bit hackish, since we use firmware to set up the clocks in a way that solves a particular issue.
On the other hand using PERIPH0(2x) as a base for APB2 seems like a completely proper setup, even somewhat recommended by Allwinner (after all the UARTs are based on this special clock for a reason).
> Now, I think doing this as soon as possible (with regards to the
> running system) would be best. Reparenting the clk on the fly
> would change the baud rate, and result in the uart glitching.
Can't we change it when observing the proper order: turning TX/RX off, programming new divisors, changing the clock source, turning TX/RX back on?
You would expect to be able to achieve this reparenting by simply changing the device tree in Linux.
Why would this be done in DT? The APB2 clock is capable of being driven by 32KHz, 24 MHz or PERIPH0(2x), the DT should express this. The old Allwinner binding certainly did, the new hides this in the driver. But as the hardware allows it, the DT shouldn't have a say in the parenting decision - apart from listing the alternatives. This is actually a driver or clock system decision.
I tried that on the Allwinner 3.10
<connection lost>
kernel and it actually works. But... it causes the console to quit working.
Do Linux clk_notifiers work soon enough to handle reparenting the console in the device tree? It is not clear to me that they do. Plus the 8250_dw driver doesn't support it.
I believe Chen-Yu meant that clk_notifiers would allow notifying all clock users of some changed situation (like a different parent clock). As you know, _all_ UARTs plus I2C use this clock, so if *one* requires a higher base clock, *all* users have to be notified to change their clock divisors to match the new base frequency. This has nothing to do with device tree directly.
Next I considered changing it in the u-boot device tree. But again, console is set up before u-boot loads that device tree. In Allwinner uboot console is set up in the SPL code before the device tree is loaded.
Traditionally for sunxi boards in upstream U-Boot we use the DT only sparingly. IIRC DT clock information is completely ignored and there is just some static setup - which is way easier and sufficient for our needs. The SPL doesn't use DT at all (mostly for space constraints).
So as I said changing this in U-Boot looks like an easy patch, PLL6 is setup to 600 MHz already, so we just need to change the clock source for APB2 to that and adjust the dividers.
Do you have an idea what a good APB2 frequency would be? IIRC someone on IRC mentioned that the UARTs can't do much higher than 3 Mb/s anyway, so I guess 48 MHz or 96 MHz would be enough? Allwinner's I2C is limited to 400 kHz anyway, so there is no need for a higher clock here.
In the datasheet there is a note recommending not to change this 600Mhz (2x 1.2Ghz).
That refers to PERIPH0, I was asking for the APB2 frequency. Looking at the achievable baud rates and error rates for common speeds (115200) I came up with a divider of 5:
1200 MHz / 5 = 240 MHz APB2 frequency
240 MHz / 16 = 15 Mbps maximum baud rate

divider | baud rate | error
      5 | 3 Mbps    | 0.0%
     10 | 1.5 Mbps  | 0.0%
     16 | 921600    | 1.7%
    130 | 115200    | 0.16%

Reports seem to indicate that at least 12.5 Mbps is feasible with the UARTs, so this maximum of 15 Mbps seems like a good approach, because it gives 3 and 1.5 Mbps without errors.
Alternatives would be max. baud rate of 3 Mbps (still limiting) or the system maximum of 75 Mbps, but clocking APB2 at 1.2 GHz sounds a bit over the top for me.
Jon, can you try this U-Boot patch (copy & pasted in, so probably won't apply cleanly):
Which U-Boot is this against? I am not using mainline U-Boot, we are on the Allwinner stuff.
Sorry, but you are on your own then. This thread is a reply on a mainline U-Boot patch series, so I assumed you were interested in that. My time is far too precious to waste it on that AW crap. Either you switch to mainline or you "port" this patch over to your tree.
Cheers, Andre.
--- a/arch/arm/mach-sunxi/clock_sun6i.c
+++ b/arch/arm/mach-sunxi/clock_sun6i.c
@@ -65,9 +65,9 @@ void clock_init_uart(void)
 		(struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
 
 	/* uart clock source is apb2 */
-	writel(APB2_CLK_SRC_OSC24M|
+	writel(APB2_CLK_SRC_PLL6|
 	       APB2_CLK_RATE_N_1|
-	       APB2_CLK_RATE_M(1),
+	       APB2_CLK_RATE_M(5),
 	       &ccm->apb2_div);
 
 	/* open the clock for uart */
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -42,7 +42,7 @@
 /* Serial & console */
 #define CONFIG_SYS_NS16550_SERIAL
 /* ns16550 reg in the low bits of cpu reg */
-#define CONFIG_SYS_NS16550_CLK		24000000
+#define CONFIG_SYS_NS16550_CLK		240000000
 #ifndef CONFIG_DM_SERIAL
 # define CONFIG_SYS_NS16550_REG_SIZE	-4
 # define CONFIG_SYS_NS16550_COM1		SUNXI_UART0_BASE
Cheers, Andre.
It would be great if someone could do experiments to get the highest usable baudrate.
Is the PERIPH0(2x) clock always running?
It seems so, anyway the Linux clock system would take care of this now, as the console UART would be at least one user. Though I believe there are more users, so it's unlikely that it would get turned off anyway.
Cheers, Andre.
Maybe defaulting UARTs/I2C to OSC24 was done to save power?
After investigating this I now understand why Allwinner modified the standard Broadcom Bluetooth driver in AOSP. They fixed it to run with a 1.5Mb UART. Everyone else in AOSP uses it with a 3Mb/s UART. All of this started because we were unaware of the changes Allwinner had made to the Broadcom AOSP code and we couldn't get AOSP Bluetooth to work. Who knows why Allwinner didn't just adjust the UART to run at 3Mb/s. Probably two different programmer groups.
Nevertheless it seems worthwhile to give this rather simple U-Boot approach a go. But I would like to see some testing, since this will affect many sunxi boards.
> Also, as Jon mentioned, the 8250_dw driver in the kernel doesn't
> support clk notifiers. And it won't work with earlycon anyway...
As a long-term goal teaching the Linux driver to reparent APB2 seems like a good thing, though I expect some nastiness in this.
Cheers, Andre.
>>
>>>
>>> Previously the boot console uart0 was getting setup in the SPL code. I
>>> have not been closely following these changes, is that still true?
>>>
>>> Changing the clock parent needs to be done before uart0 is
>>> initialized. Changing this parent should have no other impact on
>>> u-boot other than changing the clock divisor uart0 is using.
>>>
>>> Once Linux is up, the Linux uart code will see the changed clock
>>> parent and allow higher baud rates to be set.
>>>
>>> This clock parent also impacts the I2C clocks, but I don't believe I2C
>>> is enabled in A64 uboot.
>>
>> Jagan.
>> _______________________________________________
>> U-Boot mailing list
>> U-Boot@lists.denx.de
>> http://lists.denx.de/mailman/listinfo/u-boot

I'm trying to find out how to use a higher baud rate by modifying the UART clock.
I followed your U-Boot patch, but nothing changed.
After patching U-Boot, the UART clock is still 24 MHz. I checked using this command: "cat /sys/class/tty/ttyS1/uartclk".
Should I modify the dts file as well?
On Thursday, January 5, 2017 at 7:36:24 AM UTC+9, André Przywara wrote:
On 04/01/17 19:00, jons...@gmail.com wrote:
On Wed, Jan 4, 2017 at 12:29 PM, André Przywara <andre.p...@arm.com> wrote:
On 04/01/17 16:40, jons...@gmail.com wrote:
On Wed, Jan 4, 2017 at 8:36 AM, André Przywara <andre.p...@arm.com> wrote:
On 04/01/17 11:25, Chen-Yu Tsai wrote:
On Wed, Jan 4, 2017 at 6:28 PM, Jagan Teki <ja...@openedev.com> wrote:
> On Tue, Jan 3, 2017 at 2:52 PM, jons...@gmail.com <jons...@gmail.com> wrote:
>> On Tue, Jan 3, 2017 at 5:41 AM, Jagan Teki <jaganna...@gmail.com> wrote:
>>> On Tue, Jan 3, 2017 at 3:38 AM, jons...@gmail.com <jons...@gmail.com> wrote:
>>>> I recently ran into a problem with the UARTs on the A64. Many
>>>> Bluetooth modules (like Ampak) use the UART. The data rate of EDR BT
>>>> is 3Mb/s with about 2.1Mb/s throughput. To handle this most systems
>>>> set the speed of the BT UART to 3Mb/s.
>>>>
>>>> By default the Allwinner UART clock input is OSC24. When using OSC24
>>>> the maximum speed the UART can be set to is 1.5Mb/s. The clock input
>>>> (apb2) can be changed over to PERIPH0x2 (1.2 GHz) via the device tree
>>>> and 3Mb/s is then supported.
>>>>
>>>> But... there's a problem, UART0 (the console) is using the same master
>>>> clock source. So when you change the clock input over to PERIPH0x2 the
>>>> console stops working. There is no mechanism in Linux to handle this
>>>> clock source change and adjust the dividers on active uarts. So it
>>>> would be best if this master clock was set very early in u-boot and
>>>> then the console is adjusted to use it.
>>>>
>>>> Are there any downsides to making this change in u-boot?
>>>
>>> I don't understand, did you find this behaviour with these SPL changes
>>> or general sunxi u-boot?
>
> I think this issue needs to be resolved. Andre, any comments?
This is a completely different issue unrelated to A64 SPL support.
I agree that's a completely orthogonal issue. Someone needs to bake a patch (easy?) and post it. This doesn't depend in any way on this series, in fact would affect many sunxi boards.
What Jon is saying is that for the UART to go faster than 1.5 Mb/s, The APB2 clock has to be reparented to the peripheral PLL. When do we do this is the question. This is a generic sunxi issue.
At first glance this approach sounds a bit hackish, since we use firmware to set up the clocks in a way that solves a particular issue.
On the other hand using PERIPH0(2x) as a base for APB2 seems like a completely proper setup, even somewhat recommended by Allwinner (after all the UARTs are based on this special clock for a reason).
Now, I think doing this as soon as possible (with regards to the running system) would be best. Reparenting the clk on the fly would change the baud rate, and result in the uart glitching.
Can't we change it when observing the proper order: turning TX/RX off, programming new divisors, changing the clock source, turning TX/RX back on?
You would expect to be able to achieve this reparenting by simply changing the device tree in Linux.
Why would this be done in DT? The APB2 clock is capable of being driven by 32KHz, 24 MHz or PERIPH0(2x), the DT should express this. The old Allwinner binding certainly did, the new hides this in the driver. But as the hardware allows it, the DT shouldn't have a say in the parenting decision - apart from listing the alternatives. This is actually a driver or clock system decision.
I tried that on the Allwinner 3.10
<connection lost>
kernel and it actually works. But... it causes the console to quit working.
Do Linux clk_notifiers work soon enough to handle reparenting the console in the device tree? It is not clear to me that they do. Plus the 8250_dw driver doesn't support it.
I believe Chen-Yu meant that clk_notifiers would allow notifying all clock users of some changed situation (like a different parent clock). As you know, _all_ UARTs plus I2C use this clock, so if *one* requires a higher base clock, *all* users have to be notified to change their clock divisors to match the new base frequency. This has nothing to do with device tree directly.
Next I considered changing it in the u-boot device tree. But again, console is set up before u-boot loads that device tree. In Allwinner uboot console is set up in the SPL code before the device tree is loaded.
Traditionally for sunxi boards in upstream U-Boot we use the DT only sparingly. IIRC DT clock information is completely ignored and there is just some static setup - which is way easier and sufficient for our needs. The SPL doesn't use DT at all (mostly for space constraints).
So as I said, changing this in U-Boot looks like an easy patch: PLL6 is set up to 600 MHz already, so we just need to change the clock source for APB2 to that and adjust the dividers.
Do you have an idea what a good APB2 frequency would be? IIRC someone on IRC mentioned that the UARTs can't do much higher than 3 Mb/s anyway, so I guess 48 MHz or 96 MHz would be enough? Allwinner's I2C is limited to 400 kHz anyway, so there is no need for a higher clock here.
In the datasheet there is a note recommending not to change this 600 MHz (2x: 1.2 GHz).
That refers to PERIPH0, I was asking for the APB2 frequency. Looking at the achievable baud rates and error rates for common speeds (115200) I came up with a divider of 5:
1200 MHz / 5 = 240 MHz APB2 frequency
240 MHz / 16 = 15 Mbps maximum baud rate

divider | baud rate | error
      5 | 3 Mbps    | 0.0%
     10 | 1.5 Mbps  | 0.0%
     16 | 921600    | 1.7%
    130 | 115200    | 0.16%
Reports seem to indicate that at least 12.5 Mbps is feasible with the UARTs, so this maximum of 15 Mbps seems like a good approach, because it gives 3 and 1.5 Mbps without errors.
Alternatives would be max. baud rate of 3 Mbps (still limiting) or the system maximum of 75 Mbps, but clocking APB2 at 1.2 GHz sounds a bit over the top for me.
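The divider and error figures quoted for a 240 MHz APB2 clock can be re-derived with a few lines of C. This is a minimal sketch assuming the usual 16550-style 16x oversampling and round-to-nearest divisor selection; the function names are hypothetical and not taken from U-Boot or Linux.

```c
#include <assert.h>
#include <stdint.h>

/* Nearest integer divisor for a UART with 16x oversampling. */
static uint32_t uart_divisor(uint32_t clk_hz, uint32_t baud)
{
	assert(baud > 0);
	return (clk_hz + 8u * baud) / (16u * baud);
}

/* Baud rate the hardware actually produces for a given divisor. */
static double uart_actual_baud(uint32_t clk_hz, uint32_t divisor)
{
	assert(divisor > 0);
	return (double)clk_hz / (16.0 * (double)divisor);
}

/* Deviation of the achievable rate from the requested one, in percent. */
static double uart_error_pct(uint32_t clk_hz, uint32_t baud)
{
	double actual = uart_actual_baud(clk_hz, uart_divisor(clk_hz, baud));
	double diff = actual - (double)baud;

	if (diff < 0)
		diff = -diff;
	return diff / (double)baud * 100.0;
}
```

For example, `uart_divisor(240000000, 921600)` yields 16, which produces 937500 baud, roughly 1.7% above the requested rate; 3 Mbps and 1.5 Mbps divide evenly.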
Jon, can you try this U-Boot patch (copy & pasted in, so probably won't apply cleanly):
--- a/arch/arm/mach-sunxi/clock_sun6i.c
+++ b/arch/arm/mach-sunxi/clock_sun6i.c
@@ -65,9 +65,9 @@ void clock_init_uart(void)
 		(struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
 
 	/* uart clock source is apb2 */
-	writel(APB2_CLK_SRC_OSC24M|
+	writel(APB2_CLK_SRC_PLL6|
 	       APB2_CLK_RATE_N_1|
-	       APB2_CLK_RATE_M(1),
+	       APB2_CLK_RATE_M(5),
 	       &ccm->apb2_div);
 
 	/* open the clock for uart */
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -42,7 +42,7 @@
 /* Serial & console */
 #define CONFIG_SYS_NS16550_SERIAL
 /* ns16550 reg in the low bits of cpu reg */
-#define CONFIG_SYS_NS16550_CLK		24000000
+#define CONFIG_SYS_NS16550_CLK		240000000
 #ifndef CONFIG_DM_SERIAL
 # define CONFIG_SYS_NS16550_REG_SIZE	-4
 # define CONFIG_SYS_NS16550_COM1		SUNXI_UART0_BASE
Cheers, Andre.
It would be great if someone could do experiments to get the highest usable baudrate.
Is the PERIPH0(2x) clock always running?
It seems so, anyway the Linux clock system would take care of this now, as the console UART would be at least one user. Though I believe there are more users, so it's unlikely that it would get turned off anyway.
Cheers, Andre.
Maybe defaulting UARTs/I2C to OSC24 was done to save power?
After investigating this I now understand why Allwinner modified the standard Broadcom Bluetooth driver in AOSP. They fixed it to run with a 1.5Mb UART. Everyone else in AOSP uses it with a 3Mb/s UART. All of this started because we were unaware of the changes Allwinner had made to the Broadcom AOSP code and we couldn't get AOSP Bluetooth to work. Who knows why Allwinner didn't just adjust the UART to run at 3Mb/s. Probably two different programmer groups.
Nevertheless it seems worthwhile to give this rather simple U-Boot approach a go. But I would like to see some testing, since this will affect many sunxi boards.
Also, as Jon mentioned, the 8250_dw driver in the kernel doesn't support clk notifiers. And it won't work with earlycon anyway...
As a long-term goal teaching the Linux driver to reparent APB2 seems like a good thing, though I expect some nastiness in this.
Cheers, Andre.
>> Previously the boot console uart0 was getting setup in the SPL code. I
>> have not been closely following these changes, is that still true?
>>
>> Changing the clock parent needs to be done before uart0 is
>> initialized. Changing this parent should have no other impact on
>> u-boot other than changing the clock divisor uart0 is using.
>>
>> Once Linux is up, the Linux uart code will see the changed clock
>> parent and allow higher baud rates to be set.
>>
>> This clock parent also impacts the I2C clocks, but I don't believe I2C
>> is enabled in A64 uboot.
>
> Jagan.
> _______________________________________________
> U-Boot mailing list
> U-B...@lists.denx.de
> http://lists.denx.de/mailman/listinfo/u-boot

On Wed, Jan 4, 2017 at 12:29 PM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 16:40, jonsmirl@gmail.com wrote:
On Wed, Jan 4, 2017 at 8:36 AM, André Przywara andre.przywara@arm.com wrote:
On 04/01/17 11:25, Chen-Yu Tsai wrote:
On Wed, Jan 4, 2017 at 6:28 PM, Jagan Teki jagan@openedev.com wrote:
On Tue, Jan 3, 2017 at 2:52 PM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
On Tue, Jan 3, 2017 at 5:41 AM, Jagan Teki jagannadh.teki@gmail.com wrote:
> On Tue, Jan 3, 2017 at 3:38 AM, jonsmirl@gmail.com jonsmirl@gmail.com wrote:
>> I recently ran into a problem with the UARTs on the A64. Many
>> Bluetooth modules (like Ampak) use the UART. The data rate of EDR BT
>> is 3Mb/s with about 2.1Mb/s throughput. To handle this most systems
>> set the speed of the BT UART to 3Mb/s.
>>
>> By default the Allwinner UART clock input is OSC24. When using OSC24
>> the maximum speed the UART can be set to is 1.5Mb/s. The clock input
>> (apb2) can be changed over to PERIPH0x2 (1.2 GHz) via the device tree
>> and 3Mb/s is then supported.
>>
>> But... there's a problem, UART0 (the console) is using the same master
>> clock source. So when you change the clock input over to PERIPH0x2 the
>> console stops working. There is no mechanism in Linux to handle this
>> clock source change and adjust the dividers on active uarts. So it
>> would be best if this master clock was set very early in u-boot and
>> then the console is adjusted to use it.
>>
>> Are there any downsides to making this change in u-boot?
>
> I don't understand, did you find this behaviour with these SPL changes
> or general sunxi u-boot?
I think this issue needs to be resolved. Andre, any comments?
This is a completely different issue unrelated to A64 SPL support.
I agree that's a completely orthogonal issue. Someone needs to bake a patch (easy?) and post it. This doesn't depend in any way on this series, in fact would affect many sunxi boards.
What Jon is saying is that for the UART to go faster than 1.5 Mb/s, the APB2 clock has to be reparented to the peripheral PLL. When we do this is the question. This is a generic sunxi issue.
At first glance this approach sounds a bit hackish, since we use firmware to set up the clocks in a way that solves a particular issue.
On the other hand using PERIPH0(2x) as a base for APB2 seems like a completely proper setup, even somewhat recommended by Allwinner (after all the UARTs are based on this special clock for a reason).
Now, I think doing this as soon as possible (with regards to the running system) would be best. Reparenting the clk on the fly would change the baud rate, and result in the uart glitching.
Can't we change it when observing the proper order: turning TX/RX off, programming new divisors, changing the clock source, turning TX/RX back on?
You would expect to be able to achieve this reparenting by simply changing the device tree in Linux.
Why would this be done in DT? The APB2 clock is capable of being driven by 32KHz, 24 MHz or PERIPH0(2x), the DT should express this. The old Allwinner binding certainly did, the new hides this in the driver. But as the hardware allows it, the DT shouldn't have a say in the parenting decision - apart from listing the alternatives. This is actually a driver or clock system decision.
I tried that on the Allwinner 3.10 kernel and it actually works. But... it causes the console to quit working.
Do Linux clk_notifiers work soon enough to handle reparenting the console in the device tree? It is not clear to me that they do. Plus the 8250_dw driver doesn't support it.
I believe Chen-Yu meant that clk_notifiers would allow to notify all clock users of some changed situation (like a different parent clock). As you know, _all_ UARTs plus I2C use this clock, so if *one* requires a higher base clock, *all* users have to be notified to change their clock divisors to match the new base frequency. This has nothing to do with device tree directly.
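The notification idea can be illustrated with a small callback registry in C. This is a userspace sketch only: it does not use the Linux clk notifier API, and `struct shared_clk`, `clk_add_consumer()`, `clk_set_rate_and_notify()` and `uart_rate_changed()` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONSUMERS 8

/* Callback a clock consumer registers to learn about a new rate. */
typedef void (*rate_change_fn)(void *ctx, uint32_t new_rate_hz);

/* Hypothetical model of a clock shared by several peripherals. */
struct shared_clk {
	uint32_t       rate_hz;
	size_t         count;
	rate_change_fn fn[MAX_CONSUMERS];
	void          *ctx[MAX_CONSUMERS];
};

/* Register one consumer; returns -1 when the table is full. */
static int clk_add_consumer(struct shared_clk *clk, rate_change_fn fn,
			    void *ctx)
{
	if (clk->count >= MAX_CONSUMERS)
		return -1;
	clk->fn[clk->count] = fn;
	clk->ctx[clk->count] = ctx;
	clk->count++;
	return 0;
}

/* Change the rate and tell every registered consumer about it. */
static void clk_set_rate_and_notify(struct shared_clk *clk,
				    uint32_t new_rate_hz)
{
	size_t i;

	clk->rate_hz = new_rate_hz;
	for (i = 0; i < clk->count; i++)
		clk->fn[i](clk->ctx[i], new_rate_hz);
}

/* Example consumer: a UART that recomputes its 16x divisor. */
struct uart_consumer {
	uint32_t baud;
	uint32_t divisor;
};

static void uart_rate_changed(void *ctx, uint32_t new_rate_hz)
{
	struct uart_consumer *u = ctx;

	assert(u->baud > 0);
	u->divisor = (new_rate_hz + 8u * u->baud) / (16u * u->baud);
}
```

The point of the pattern is that the party changing the APB2 parent does not need to know who depends on it; each UART or I2C controller supplies its own recalculation.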
Next I considered changing it in the u-boot device tree. But again, console is set up before u-boot loads that device tree. In Allwinner uboot console is set up in the SPL code before the device tree is loaded.
Traditionally for sunxi boards in upstream U-Boot we use the DT only sparingly. IIRC DT clock information is completely ignored and there is just some static setup - which is way easier and sufficient for our needs. The SPL doesn't use DT at all (mostly for space constraints).
So as I said, changing this in U-Boot looks like an easy patch: PLL6 is set up to 600 MHz already, so we just need to change the clock source for APB2 to that and adjust the dividers.
Do you have an idea what a good APB2 frequency would be? IIRC someone on IRC mentioned that the UARTs can't do much higher than 3 Mb/s anyway, so I guess 48 MHz or 96 MHz would be enough? Allwinner's I2C is limited to 400 kHz anyway, so there is no need for a higher clock here.
It would be great if someone could do experiments to get the highest usable baudrate.
In the Allwinner UART driver they use a look up table for setting the UART divider. It has entries in it for when the 1.2Ghz x2 clock is used. This table limits the max baud to 4Mb/s.
Is the PERIPH0(2x) clock always running?
It seems so, anyway the Linux clock system would take care of this now, as the console UART would be at least one user. Though I believe there are more users, so it's unlikely that it would get turned off anyway.
Cheers, Andre.
Maybe defaulting UARTs/I2C to OSC24 was done to save power?
After investigating this I now understand why Allwinner modified the standard Broadcom Bluetooth driver in AOSP. They fixed it to run with a 1.5Mb UART. Everyone else in AOSP uses it with a 3Mb/s UART. All of this started because we were unaware of the changes Allwinner had made to the Broadcom AOSP code and we couldn't get AOSP Bluetooth to work. Who knows why Allwinner didn't just adjust the UART to run at 3Mb/s. Probably two different programmer groups.
Nevertheless it seems worthwhile to give this rather simple U-Boot approach a go. But I would like to see some testing, since this will affect many sunxi boards.
Also, as Jon mentioned, the 8250_dw driver in the kernel doesn't support clk notifiers. And it won't work with earlycon anyway...
As a long-term goal teaching the Linux driver to reparent APB2 seems like a good thing, though I expect some nastiness in this.
Cheers, Andre.
Previously the boot console uart0 was getting setup in the SPL code. I have not been closely following these changes, is that still true?
Changing the clock parent needs to be done before uart0 is initialized. Changing this parent should have no other impact on u-boot other than changing the clock divisor uart0 is using.
Once Linux is up, the Linux uart code will see the changed clock parent and allow higher baud rates to be set.
This clock parent also impacts the I2C clocks, but I don't believe I2C is enabled in A64 uboot.
Jagan.
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot

On Mon, Jan 2, 2017 at 12:48 PM, Andre Przywara andre.przywara@arm.com wrote:
Hi,
hopefully the final version of the SPL support series for the Allwinner A64 SoC. Actually no real code changes this time, just rebased on top of recent master, adding some comments in patches 16/26 and 19/26 following Maxime's suggestions and adding Acked-by:s and Reviewed-by:s. I left the final patch 26/26 in for the sake of completeness, but don't expect it to be merged. We need a clever solution to unify 32-bit and 64-bit board configurations, but that shouldn't hold back this series for now. Merging everything until and including patch 21/26 (sunxi: A64: enable SPL) would be great, the other patches until 25/26 can go in as well, I think.
Like the previous versions this one includes support for both AArch64 and AArch32 SPL builds. The FIT support is still missing, which means the functionality is limited. Due to the missing ARM Trusted Firmware (ATF) in this firmware chain we lose Ethernet and SMP, among other minor things. A full 64-bit build can be written to an SD card as expected and will boot to the U-Boot proper prompt. However Linux will crash on boot, as PSCI is missing. Building the 32-bit version of the SPL and combining this with an ATF build and the 64-bit U-Boot proper allows FEL booting now:
# sunxi-fel spl sunxi-spl.bin write 0x4a000000 u-boot-dtb.bin \
	write 0x44000 bl31.bin reset64 0x44000
This way of booting the board gives full functionality.
The first patch is a rather simple fix (with no changes to v2). Patches 2-8 prepare the SPL code to be compiled for 64-bit in general and AArch64 in particular. Patches 9-11 refactor the existing boot0 header functionality to be used by patch 12, which introduces the 64-bit switch in the first SPL instructions. Patches 13-20 then introduce the actual core of the SPL support: the DRAM initialization, courtesy of Jens. This piggybacks on the existing H3 DRAM code, deviating where needed. This has been reworked compared to v2: I added a patch from Philipp to replace the rather uninspired register writes in the MBUS priority setup function with some meaningful code, explaining the various bits. Also the actual A64 DRAM code is no longer #ifdef'ed into the H3 driver, but uses parameters to (static) functions. The compiler detects this and removes the dead code from the other variant, resulting in the same binary size for the H3.
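The "parameters to (static) functions" approach can be illustrated with a generic C sketch. Everything here is hypothetical: `struct dram_model`, the two `*_VARIANT_VALUE` constants and the `BUILD_FOR_A64` switch are placeholders for illustration and are not taken from the actual DRAM driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for DRAM controller state; not the real layout. */
struct dram_model {
	uint32_t common_setup;
	uint32_t variant_setup;
};

/* Illustrative values only, not taken from the driver. */
#define H3_VARIANT_VALUE  0x1u
#define A64_VARIANT_VALUE 0x2u

/* Hypothetical build-time switch; exactly one variant is built. */
#ifdef BUILD_FOR_A64
#define IS_A64 true
#else
#define IS_A64 false
#endif

/*
 * Shared code path. Because the function is static and its only caller
 * passes a compile-time constant, an optimizing compiler can propagate
 * the constant and drop the branch that is never taken.
 */
static void dram_setup(struct dram_model *d, bool is_a64)
{
	assert(d != NULL);
	d->common_setup = 1;
	if (is_a64)
		d->variant_setup = A64_VARIANT_VALUE;
	else
		d->variant_setup = H3_VARIANT_VALUE;
}

void dram_init(struct dram_model *d)
{
	dram_setup(d, IS_A64);
}
```

Compared with wrapping the variant code in `#ifdef`, both branches are always parsed and type-checked, while the optimized binary can still contain only the one that is used.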
Patch 21 finally enables the 64-bit SPL support. So now building the existing pine64_plus_defconfig will generate a sunxi-spl.bin, which can be prepended to the U-Boot proper image (not .bin) to boot from an SD card. Due to the missing ATF support this is of limited usability at the moment, though. Also FEL support requires more love - to switch back to AArch32 before returning to FEL (without crashing, that is ;-), so this is disabled. On my setup this results in a 26KB SPL binary, which is close to the 28KB limit mksunxiboot imposes at the moment. Adding anything (like FIT support or DEBUG) will exceed this, and although I have patches to let mksunxiboot get close to 32KB, this is the ultimate frontier.
So patches 22-25 then teach the SPL how to detect a U-Boot image file of a different bitness and do the RMR switch from AArch32 to AArch64, if needed. This is used by the final patch 26, which creates another _defconfig to let the SPL compile for AArch32 using the Thumb2 encoding. This results in a binary of less than 17KB in my case, so it has plenty of room for extensions.
Cheers, Andre.
Changelog v3 .. v4:
- rebased on top of latest HEAD
- add various Reviewed-by: and Acked-by: tags
- add comments about register bit meanings in non-ODT-setting fix
- clarify meaning of delay values in single bit delay support patch
- remove stray semicolons from boot0.h header
Changelog v2 .. v3:
- add various Reviewed-by: and Acked-by: tags
- split tiny-printf fix to handle "-" separately
- add various comments and extend commit messages
- add assembly file to re-create the embedded RMR switch code
- add patch 14/26 to explain the MBUS priority setup
- move DRAM r/w delay values into #defines to simplify re-usability
- replace #ifdef'ed addition of A64 support to the H3 DRAM driver with an approach using static parameters
Changelog v1 .. v2:
- drop SPI build fix (already merged)
- confine A31 register init change to H3 and A64
- use IS_ENABLED() instead of #ifdef to guard MBUS2 clock init
- fix tiny-printf (proper sign extension for 32-bit integers)
- add "size" output in commit msg to document tiny-printf size impact
- fix sdelay(): use only one register, add "cc" clobber
- update RMR switch code to provide easy access to RVBAR register address
- drop redundant DRAM frequency setting from Pine64 defconfig
- minor changes as requested by reviewers
Andre Przywara (21):
  sun6i: Restrict some register initialization to Allwinner A31 SoC
  armv8: prevent using THUMB
  armv8: add lowlevel_init.S
  SPL: tiny-printf: add "l" modifier
  SPL: tiny-printf: ignore "-" modifier
  move UL() macro from armv8/mmu.h into common.h
  SPL: make struct spl_image 64-bit safe
  armv8: add simple sdelay implementation
  armv8: move reset branch into boot hook
  ARM: boot0 hook: remove macro, include whole header file
  sunxi: introduce extra config option for boot0 header
  sunxi: A64: do an RMR switch if started in AArch32 mode
  sunxi: provide default DRAM config for sun50i in Kconfig
  sunxi: H3/A64: fix non-ODT setting
  sunxi: DRAM: fix H3 DRAM size display on aarch64
  sunxi: A64: enable SPL
  SPL: read and store arch property from U-Boot image
  Makefile: use "arm64" architecture for U-Boot image files
  ARM: SPL/FIT: differentiate between arm and arm64 arch properties
  sunxi: introduce RMR switch to enter payloads in 64-bit mode
  sunxi: A64: add 32-bit SPL support

Jens Kuske (3):
  sunxi: H3: add and rename some DRAM contoller registers
  sunxi: H3: add DRAM controller single bit delay support
  sunxi: A64: use H3 DRAM initialization code for A64 as well

Philipp Tomsich (2):
  sunxi: H3: Rework MBUS priority setup
  sunxi: clocks: Use the correct pattern register for PLL11
Except arm64, applied SPL support (till 21)
Applied to u-boot-sunxi/master
thanks!

On Wed, Jan 4, 2017 at 9:20 PM, Jagan Teki jagan@openedev.com wrote:
Except arm64, applied SPL support (till 21)
The remaining 4 arm64 patches need necessary changes wrt current tree, so updated patchwork status as "Changes Requested"
thanks!
participants (9)
- Andre Przywara
- André Przywara
- BongHo Lee
- Chen-Yu Tsai
- Jagan Teki
- Jagan Teki
- jonsmirl@gmail.com
- Maxime Ripard
- Simon Glass