[U-Boot] [PATCH v2 00/23] sunxi: Allwinner A64: SPL support

Hi,
this is the second spin of the SPL support series for the Allwinner A64 SoC. Thanks for the review comments, I hope I addressed all of them.
Like v1, this one includes support for both AArch64 and AArch32 SPL builds. FIT support is still missing, which means the functionality is limited. Due to the missing ARM Trusted Firmware (ATF) in this firmware chain we lose Ethernet and SMP, among other minor things.
A full 64-bit build can be written to an SD card as expected and will boot to the U-Boot proper prompt. However, Linux will crash on boot, as PSCI is missing.
Building the 32-bit version of the SPL and combining it with an ATF build and the 64-bit U-Boot proper now allows FEL booting:
  # sunxi-fel spl sunxi-spl.bin write 0x4a000000 u-boot-dtb.bin \
              write 0x44000 bl31.bin reset64 0x44000
The first patch is a fix, which has been slightly tweaked compared to v1 (see below).
Patches 2-7 prepare the SPL code to be compiled for 64-bit in general and AArch64 in particular.
Patches 8-10 refactor the existing boot0 header functionality to be used by patch 11, which introduces the 64-bit switch in the first SPL instructions.
Patches 12-16 then introduce the actual core of the SPL support: the DRAM initialization, courtesy of Jens. This piggybacks on the existing H3 DRAM code, deviating where needed.
Patch 18 finally enables the 64-bit SPL support. So now building the existing pine64_plus_defconfig will generate a sunxi-spl.bin, which can be prepended to the U-Boot proper image (not .bin) to boot from an SD card. Due to the missing ATF support this is of limited usability at the moment, though. Also, FEL support requires more love (switching back to AArch32 before returning to FEL, without crashing, that is ;-), so this is disabled.
On my setup this results in a 26KB SPL binary, which is close to the 28KB limit mksunxiboot imposes at the moment. Adding anything (like FIT support or DEBUG) will exceed this, and although I have patches to let mksunxiboot get close to 32KB, this is the ultimate frontier.
So patches 19-22 then teach the SPL how to detect a U-Boot image file of a different bitness and do the RMR switch from AArch32 to AArch64, if needed. This is used by the final patch 23, which creates another _defconfig to let the SPL compile for AArch32 using the Thumb2 encoding. This results in a binary of less than 17KB in my case, so it has plenty of room for extensions.
Cheers, Andre.
Changelog v1 .. v2:
- drop SPI build fix (already merged)
- confine A31 register init change to H3 and A64
- use IS_ENABLED() instead of #ifdef to guard MBUS2 clock init
- fix tiny-printf (proper sign extension for 32-bit integers)
- add "size" output in commit msg to document tiny-printf size impact
- fix sdelay(): use only one register, add "cc" clobber
- update RMR switch code to provide easy access to RVBAR register address
- drop redundant DRAM frequency setting from Pine64 defconfig
- minor changes as requested by reviewers
Andre Przywara (20):
  sun6i: Restrict some register initialization to Allwinner A31 SoC
  armv8: prevent using THUMB
  armv8: add lowlevel_init.S
  SPL: tiny-printf: add "l" modifier
  move UL() macro from armv8/mmu.h into common.h
  SPL: make struct spl_image 64-bit safe
  armv8: add simple sdelay implementation
  armv8: move reset branch into boot hook
  ARM: boot0 hook: remove macro, include whole header file
  sunxi: introduce extra config option for boot0 header
  sunxi: A64: do an RMR switch if started in AArch32 mode
  sunxi: provide default DRAM config for sun50i in Kconfig
  sunxi: H3/A64: fix non-ODT setting
  sunxi: DRAM: fix H3 DRAM size display on aarch64
  sunxi: A64: enable SPL
  SPL: read and store arch property from U-Boot image
  Makefile: use "arm64" architecture for U-Boot image files
  ARM: SPL/FIT: differentiate between arm and arm64 arch properties
  sunxi: introduce RMR switch to enter payloads in 64-bit mode
  sunxi: A64: add 32-bit SPL support
Jens Kuske (3):
  sunxi: H3: add and rename some DRAM contoller registers
  sunxi: H3: add DRAM controller single bit delay support
  sunxi: A64: use H3 DRAM initialization code for A64
 Makefile | 9 +-
 arch/arm/cpu/armv8/Makefile | 1 +
 arch/arm/cpu/armv8/cpu.c | 13 ++
 arch/arm/cpu/armv8/lowlevel_init.S | 44 +++++
 arch/arm/cpu/armv8/start.S | 5 +-
 arch/arm/include/asm/arch-bcm235xx/boot0.h | 8 +-
 arch/arm/include/asm/arch-bcm281xx/boot0.h | 8 +-
 arch/arm/include/asm/arch-sunxi/boot0.h | 34 +++-
 arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 +
 arch/arm/include/asm/arch-sunxi/dram.h | 2 +-
 arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 51 +++---
 arch/arm/include/asm/armv8/mmu.h | 8 -
 arch/arm/lib/Makefile | 2 +
 arch/arm/lib/spl.c | 15 ++
 arch/arm/lib/vectors.S | 1 -
 arch/arm/mach-omap2/boot-common.c | 2 +-
 arch/arm/mach-sunxi/Makefile | 2 +
 arch/arm/mach-sunxi/board.c | 2 +-
 arch/arm/mach-sunxi/clock_sun6i.c | 8 +-
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 215 +++++++++++++++++-------
 arch/arm/mach-sunxi/spl_switch.c | 60 +++++++
 arch/arm/mach-tegra/spl.c | 2 +-
 board/sunxi/Kconfig | 32 +++-
 common/spl/spl.c | 9 +-
 common/spl/spl_fit.c | 8 +
 common/spl/spl_mmc.c | 2 +-
 configs/pine64_plus_defconfig | 7 +-
 configs/sun50i_spl32_defconfig | 10 ++
 include/common.h | 10 +-
 include/configs/sunxi-common.h | 4 +-
 include/spl.h | 19 ++-
 lib/tiny-printf.c | 50 ++++--
 32 files changed, 491 insertions(+), 153 deletions(-)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S
 create mode 100644 arch/arm/mach-sunxi/spl_switch.c
 create mode 100644 configs/sun50i_spl32_defconfig

These days many Allwinner SoCs use clock_sun6i.c, although among them only the (original sun6i) A31 has a second MBUS clock register. Also the requirement for setting up the PRCM PLL_CTRL1 register to provide the proper voltage seems to be a property of older SoCs only as well.
Restrict the MBUS initialization to this SoC only to avoid writing bogus values to (undefined) registers in other chips. I can only verify that the PLL voltage setup is not needed for H3 and A64, so for now we only spare those two SoCs.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
---
 arch/arm/mach-sunxi/clock_sun6i.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c
index ed8cd9b..80cfc0b 100644
--- a/arch/arm/mach-sunxi/clock_sun6i.c
+++ b/arch/arm/mach-sunxi/clock_sun6i.c
@@ -21,6 +21,8 @@ void clock_init_safe(void)
 {
 	struct sunxi_ccm_reg * const ccm =
 		(struct sunxi_ccm_reg *)SUNXI_CCM_BASE;
+
+#if !defined(CONFIG_MACH_SUN8I_H3) && !defined(CONFIG_MACH_SUN50I)
 	struct sunxi_prcm_reg * const prcm =
 		(struct sunxi_prcm_reg *)SUNXI_PRCM_BASE;
 
@@ -31,6 +33,7 @@ void clock_init_safe(void)
 		PRCM_PLL_CTRL_LDO_DIGITAL_EN | PRCM_PLL_CTRL_LDO_ANALOG_EN |
 		PRCM_PLL_CTRL_EXT_OSC_EN | PRCM_PLL_CTRL_LDO_OUT_L(1140));
 	clrbits_le32(&prcm->pll_ctrl1, PRCM_PLL_CTRL_LDO_KEY_MASK);
+#endif
 
 	clock_set_pll1(408000000);
 
@@ -41,7 +44,8 @@ void clock_init_safe(void)
 	writel(AHB1_ABP1_DIV_DEFAULT, &ccm->ahb1_apb1_div);
 
 	writel(MBUS_CLK_DEFAULT, &ccm->mbus0_clk_cfg);
-	writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg);
+	if (IS_ENABLED(CONFIG_MACH_SUN6I))
+		writel(MBUS_CLK_DEFAULT, &ccm->mbus1_clk_cfg);
 }
 #endif
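For readers unfamiliar with the IS_ENABLED() idiom used in the last hunk, here is a minimal sketch of how such a macro can turn a config symbol into a plain C constant, so that the guarded code is still parsed and type-checked on every SoC. The macro bodies follow the widely used preprocessor trick (slightly simplified); the CONFIG_DEMO_* symbols, struct demo_regs, the register value and demo_clock_init() are made up for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Simplified IS_ENABLED(): expands to 1 if the option is defined as 1, else 0. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_1(x) IS_DEFINED_2(x)
#define IS_ENABLED(option) IS_DEFINED_1(option)

/* Hypothetical options: only the "A31-like" one is enabled in this build. */
#define CONFIG_DEMO_SUN6I 1
/* CONFIG_DEMO_SUN50I is intentionally left undefined. */

#define DEMO_MBUS_CLK_DEFAULT 0x81000001u	/* made-up register value */

struct demo_regs {
	uint32_t mbus0_clk_cfg;
	uint32_t mbus1_clk_cfg;
};

void demo_clock_init(struct demo_regs *regs)
{
	regs->mbus0_clk_cfg = DEMO_MBUS_CLK_DEFAULT;

	/* Both branches are compiled; the compiler drops the dead one. */
	if (IS_ENABLED(CONFIG_DEMO_SUN6I))
		regs->mbus1_clk_cfg = DEMO_MBUS_CLK_DEFAULT;
}
```

Compared to an #ifdef block, the write to the second register stays visible to the compiler in every configuration, which catches build breakage early.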

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
These days many Allwinner SoCs use clock_sun6i.c, although among them only the (original sun6i) A31 has a second MBUS clock register. Also the requirement for setting up the PRCM PLL_CTRL1 register to provide the proper voltage seems to be a property of older SoCs only as well.
Restrict the MBUS initialization to this SoC only to avoid writing bogus values to (undefined) registers in other chips. I can only verify that the PLL voltage setup is not needed for H3 and A64, so for now we only spare those two SoCs.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de Reviewed-by: Chen-Yu Tsai wens@csie.org
arch/arm/mach-sunxi/clock_sun6i.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org

On Mon, Dec 05, 2016 at 01:52:08AM +0000, Andre Przywara wrote:
These days many Allwinner SoCs use clock_sun6i.c, although among them only the (original sun6i) A31 has a second MBUS clock register. Also the requirement for setting up the PRCM PLL_CTRL1 register to provide the proper voltage seems to be a property of older SoCs only as well.
Restrict the MBUS initialization to this SoC only to avoid writing bogus values to (undefined) registers in other chips. I can only verify that the PLL voltage setup is not needed for H3 and A64, so for now we only spare those two SoCs.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de Reviewed-by: Chen-Yu Tsai wens@csie.org
Acked-by: Maxime Ripard maxime.ripard@free-electrons.com
Thanks! Maxime

The predominantly 32-bit ARM targets try to compile the SPL in Thumb mode to reduce code size. The 64-bit AArch64 instruction set has no alternative, more compact encoding, so the Thumb build option should only be set for 32-bit targets. Likewise, -marm machine options are only valid for ARMv7 targets.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/lib/Makefile | 2 ++
 include/configs/sunxi-common.h | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 0051f76..024139d 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -77,8 +77,10 @@ ifndef CONFIG_HAS_THUMB2
 
 # for C files, just apend -marm, which will override previous -mthumb*
 
+ifndef CONFIG_ARM64
 CFLAGS_cache.o := -marm
 CFLAGS_cache-cp15.o := -marm
+endif
 
 # For .S, drop -mthumb* and other thumb-related options.
 # CFLAGS_REMOVE_* would not have an effet, so AFLAGS_REMOVE_*
diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h
index b0bfc0d..e05c318 100644
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -35,7 +35,7 @@
 /*
  * High Level Configuration Options
  */
-#ifdef CONFIG_SPL_BUILD
+#if defined(CONFIG_SPL_BUILD) && !defined(CONFIG_ARM64)
 #define CONFIG_SYS_THUMB_BUILD	/* Thumbs mode to save space in SPL */
 #endif

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The predominantly 32-bit ARM targets try to compile the SPL in Thumb mode to reduce code size. The 64-bit AArch64 instruction set has no alternative, more compact encoding, so the Thumb build option should only be set for 32-bit targets. Likewise, -marm machine options are only valid for ARMv7 targets.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de
arch/arm/lib/Makefile | 2 ++ include/configs/sunxi-common.h | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org

On Mon, Dec 05, 2016 at 01:52:09AM +0000, Andre Przywara wrote:
The predominantly 32-bit ARM targets try to compile the SPL in Thumb mode to reduce code size. The 64-bit AArch64 instruction set has no alternative, more compact encoding, so the Thumb build option should only be set for 32-bit targets. Likewise, -marm machine options are only valid for ARMv7 targets.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de
Acked-by: Maxime Ripard maxime.ripard@free-electrons.com
Thanks, Maxime

On Mon, Dec 05, 2016 at 01:52:09AM +0000, Andre Przywara wrote:
The predominantly 32-bit ARM targets try to compile the SPL in Thumb mode to reduce code size. The 64-bit AArch64 instruction set has no alternative, more compact encoding, so the Thumb build option should only be set for 32-bit targets. Likewise, -marm machine options are only valid for ARMv7 targets.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de
Reviewed-by: Tom Rini trini@konsulko.com

For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/cpu/armv8/Makefile | 1 +
 arch/arm/cpu/armv8/lowlevel_init.S | 44 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)
 create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S
diff --git a/arch/arm/cpu/armv8/Makefile b/arch/arm/cpu/armv8/Makefile
index dea1465..799a752 100644
--- a/arch/arm/cpu/armv8/Makefile
+++ b/arch/arm/cpu/armv8/Makefile
@@ -25,3 +25,4 @@ obj-$(CONFIG_FSL_LAYERSCAPE) += fsl-layerscape/
 obj-$(CONFIG_S32V234) += s32v234/
 obj-$(CONFIG_ARCH_ZYNQMP) += zynqmp/
 obj-$(CONFIG_TARGET_HIKEY) += hisilicon/
+obj-$(CONFIG_ARCH_SUNXI) += lowlevel_init.o
diff --git a/arch/arm/cpu/armv8/lowlevel_init.S b/arch/arm/cpu/armv8/lowlevel_init.S
new file mode 100644
index 0000000..189e35f
--- /dev/null
+++ b/arch/arm/cpu/armv8/lowlevel_init.S
@@ -0,0 +1,44 @@
+/*
+ * A lowlevel_init function that sets up the stack to call a C function to
+ * perform further init.
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <asm-offsets.h>
+#include <config.h>
+#include <linux/linkage.h>
+
+ENTRY(lowlevel_init)
+	/*
+	 * Setup a temporary stack. Global data is not available yet.
+	 */
+#if defined(CONFIG_SPL_BUILD) && defined(CONFIG_SPL_STACK)
+	ldr	w0, =CONFIG_SPL_STACK
+#else
+	ldr	w0, =CONFIG_SYS_INIT_SP_ADDR
+#endif
+	bic	sp, x0, #0xf	/* 16-byte alignment for ABI compliance */
+
+	/*
+	 * Save the old LR(passed in x29) and the current LR to stack
+	 */
+	stp	x29, x30, [sp, #-16]!
+
+	/*
+	 * Call the very early init function. This should do only the
+	 * absolute bare minimum to get started. It should not:
+	 *
+	 * - set up DRAM
+	 * - use global_data
+	 * - clear BSS
+	 * - try to start a console
+	 *
+	 * For boards with SPL this should be empty since SPL can do all of
+	 * this init in the SPL board_init_f() function which is called
+	 * immediately after this.
+	 */
+	bl	s_init
+	ldp	x29, x30, [sp]
+	ret
+ENDPROC(lowlevel_init)
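The stack setup in the assembly above boils down to two small address computations, which can be sketched in C as follows. This is only an illustration of the arithmetic; the function names and the example addresses are assumptions and do not correspond to any real CONFIG_SPL_STACK value.

```c
#include <stdint.h>

/* Round an address down to a 16-byte boundary, like "bic sp, x0, #0xf". */
uint64_t align_down_16(uint64_t addr)
{
	return addr & ~(uint64_t)0xf;
}

/*
 * Stack pointer after pushing one register pair with pre-decrement,
 * like "stp x29, x30, [sp, #-16]!". Two 8-byte registers keep the
 * stack pointer 16-byte aligned.
 */
uint64_t push_pair(uint64_t sp)
{
	return sp - 16;
}
```

For example, an arbitrary start address of 0x2000f would be aligned down to 0x20000, and the saved pair would then sit at 0x1fff0.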

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/cpu/armv8/Makefile | 1 + arch/arm/cpu/armv8/lowlevel_init.S | 44 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+) create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S
Is this actually needed / used for anything?
Regards, Simon

On 05/12/16 06:26, Simon Glass wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/cpu/armv8/Makefile | 1 + arch/arm/cpu/armv8/lowlevel_init.S | 44 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+) create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S
Is this actually needed / used for anything?
All sunxi boards need to call s_init() in mach-sunxi/board.c. But I gave this a closer look: Indeed I believe we don't need lowlevel_init.S and the early call to s_init(). We can just follow the recommendation in lowlevel_init.S and move that code to board_init_f(). I have a small series that reworks this, but this would affect all sunxi boards so should be considered separately.
Are you OK if we go ahead with this solution here for now, as it creates the least churn?
Cheers, Andre.

Hi Andre,
On 16 December 2016 at 19:55, André Przywara andre.przywara@arm.com wrote:
On 05/12/16 06:26, Simon Glass wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/cpu/armv8/Makefile | 1 + arch/arm/cpu/armv8/lowlevel_init.S | 44 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+) create mode 100644 arch/arm/cpu/armv8/lowlevel_init.S
Is this actually needed / used for anything?
All sunxi boards need to call s_init() in mach-sunxi/board.c. But I gave this a closer look: Indeed I believe we don't need lowlevel_init.S and the early call to s_init(). We can just follow the recommendation in lowlevel_init.S and move that code to board_init_f(). I have a small series that reworks this, but this would affect all sunxi boards so should be considered separately.
Are you OK if we go ahead with this solution here for now, as it creates the least churn?
Sounds good to me.
Regards, Simon

On Mon, Dec 05, 2016 at 01:52:10AM +0000, Andre Przywara wrote:
For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
This is going to override lowlevel_init from arch/arm/cpu/armv8/start.S and is that really desired? Thanks!

On 05/12/16 21:56, Tom Rini wrote:
On Mon, Dec 05, 2016 at 01:52:10AM +0000, Andre Przywara wrote:
For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
This is going to override lowlevel_init from arch/arm/cpu/armv8/start.S and is that really desired? Thanks!
Not sure if it is desired, but it's needed. The weak function in start.S just initialises the GIC (if configured), that looks like a NOP for sunxi to me (we do this already much better in ARM Trusted Firmware). For 32-bit sunxi we call s_init() through lowlevel_init(), which is what I copy here for armv8. Now there is this comment which discourages doing so and Alex already complained about it as well, so I might take a look at how we would skip this step. I was a bit wary going there as this would mean to rework the 32-bit code as well, which affects a lot of boards, which I can barely test here.
Cheers, Andre.

On Tue, Dec 06, 2016 at 08:04:24AM +0000, André Przywara wrote:
On 05/12/16 21:56, Tom Rini wrote:
On Mon, Dec 05, 2016 at 01:52:10AM +0000, Andre Przywara wrote:
For boards that call s_init() when the SPL runs, we are expected to set up an early stack before calling this C function. Implement the proper AArch64 version of this based on the ARMv7 code. This allows sunxi boards to set up the basic peripherals even with a 64-bit SPL.
Signed-off-by: Andre Przywara andre.przywara@arm.com
This is going to override lowlevel_init from arch/arm/cpu/armv8/start.S and is that really desired? Thanks!
Not sure if it is desired, but it's needed. The weak function in start.S just initialises the GIC (if configured), that looks like a NOP for sunxi to me (we do this already much better in ARM Trusted Firmware). For 32-bit sunxi we call s_init() through lowlevel_init(), which is what I copy here for armv8. Now there is this comment which discourages doing so and Alex already complained about it as well, so I might take a look at how we would skip this step. I was a bit wary going there as this would mean to rework the 32-bit code as well, which affects a lot of boards, which I can barely test here.
The path flow here on 32bit, both with and without SPL is what makes this area a good bit of a pain to make changes in.

tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly. Also there are printfs using the '-' modifier, which we choose to ignore for simplicity.
Using a relatively decent compiler (GCC 5.3.0) this does _not_ increase the code size of tiny-printf.o for 32-bit builds (where long and int are actually the same); in fact it loses three (ARM Thumb2) instructions from the actual SPL (numbers for orangepi_plus_defconfig):
   text    data     bss     dec     hex filename
    758       0       0     758     2f6 spl/lib/tiny-printf.o   before
  18839     488     232   19559    4c67 spl/u-boot-spl          before
    758       0       0     758     2f6 spl/lib/tiny-printf.o   after
  18833     488     232   19553    4c61 spl/u-boot-spl          after
This adds a substantial amount of code to a 64-bit build, though (taken after a later commit, which enables the ARM64 SPL build for sunxi):
   text    data     bss     dec     hex filename
   1542       0       0    1542     606 spl/lib/tiny-printf.o   before
  25830     392     360   26582    67d6 spl/u-boot-spl          before
   1758       0       0    1758     6de spl/lib/tiny-printf.o   after
  26040     392     360   26792    68a8 spl/u-boot-spl          after
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 lib/tiny-printf.c | 50 +++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 39 insertions(+), 11 deletions(-)
diff --git a/lib/tiny-printf.c b/lib/tiny-printf.c
index 30ac759..dfa8432 100644
--- a/lib/tiny-printf.c
+++ b/lib/tiny-printf.c
@@ -38,8 +38,8 @@ static void out_dgt(struct printf_info *info, char dgt)
 	info->zs = 1;
 }
 
-static void div_out(struct printf_info *info, unsigned int *num,
-		    unsigned int div)
+static void div_out(struct printf_info *info, unsigned long *num,
+		    unsigned long div)
 {
 	unsigned char dgt = 0;
 
@@ -56,9 +56,9 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 {
 	char ch;
 	char *p;
-	unsigned int num;
+	unsigned long num;
 	char buf[12];
-	unsigned int div;
+	unsigned long div;
 
 	while ((ch = *(fmt++))) {
 		if (ch != '%') {
@@ -66,8 +66,12 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 		} else {
 			bool lz = false;
 			int width = 0;
+			bool islong = false;
 
 			ch = *(fmt++);
+			if (ch == '-')
+				ch = *(fmt++);
+
 			if (ch == '0') {
 				ch = *(fmt++);
 				lz = 1;
@@ -80,6 +84,11 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 					ch = *fmt++;
 				}
 			}
+			if (ch == 'l') {
+				ch = *(fmt++);
+				islong = true;
+			}
+
 			info->bf = buf;
 			p = info->bf;
 			info->zs = 0;
@@ -89,24 +98,43 @@ int _vprintf(struct printf_info *info, const char *fmt, va_list va)
 				goto abort;
 			case 'u':
 			case 'd':
-				num = va_arg(va, unsigned int);
-				if (ch == 'd' && (int)num < 0) {
-					num = -(int)num;
-					out(info, '-');
+				div = 1000000000;
+				if (islong) {
+					num = va_arg(va, unsigned long);
+					if (sizeof(long) > 4)
+						div *= div * 10;
+				} else {
+					num = va_arg(va, unsigned int);
+				}
+
+				if (ch == 'd') {
+					if (islong && (long)num < 0) {
+						num = -(long)num;
+						out(info, '-');
+					} else if (!islong && (int)num < 0) {
+						num = -(int)num;
+						out(info, '-');
+					}
 				}
 				if (!num) {
 					out_dgt(info, 0);
 				} else {
-					for (div = 1000000000; div; div /= 10)
+					for (; div; div /= 10)
 						div_out(info, &num, div);
 				}
 				break;
 			case 'x':
-				num = va_arg(va, unsigned int);
+				if (islong) {
+					num = va_arg(va, unsigned long);
+					div = 1UL << (sizeof(long) * 8 - 4);
+				} else {
+					num = va_arg(va, unsigned int);
+					div = 0x10000000;
+				}
 				if (!num) {
 					out_dgt(info, 0);
 				} else {
-					for (div = 0x10000000; div; div /= 0x10)
+					for (; div; div /= 0x10)
 						div_out(info, &num, div);
 				}
 				break;
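The core idea of the diff above is to start from the largest divisor that an unsigned long can need and walk down one digit at a time. A standalone sketch of that scheme, assuming a caller-supplied buffer instead of the printf_info machinery, could look like this. demo_format_ulong() and its buffer contract are hypothetical and not part of tiny-printf; sizing the buffer for 20 digits plus terminator is my own choice here, since a 64-bit value can need that many decimal digits.

```c
#include <stddef.h>

/*
 * Write the digits of num in base 10 or 16 into buf, suppressing leading
 * zeros. buf must hold at least 21 bytes. Returns the number of characters
 * written; buf is NUL-terminated.
 */
size_t demo_format_ulong(char *buf, unsigned long num, unsigned int base)
{
	unsigned long div;
	size_t len = 0;
	int started = 0;

	if (base == 16) {
		/* Divisor that selects the topmost hex digit of a long. */
		div = 1UL << (sizeof(long) * 8 - 4);
	} else {
		/* 10^9 is enough for 32 bits, 10^19 for 64 bits. */
		div = 1000000000UL;
		if (sizeof(long) > 4)
			div *= div * 10;
	}

	if (!num) {
		buf[len++] = '0';
	} else {
		for (; div; div /= base) {
			unsigned int dgt = (unsigned int)(num / div);

			num %= div;
			if (dgt || started) {
				buf[len++] = (char)(dgt < 10 ? '0' + dgt
							     : 'a' + dgt - 10);
				started = 1;
			}
		}
	}
	buf[len] = '\0';
	return len;
}
```

The sizeof(long) checks are compile-time constants, so on 32-bit builds the 64-bit path can be dropped entirely by the compiler, which is consistent with the unchanged 32-bit object size reported in the commit message.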

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly. Also there are printfs using the '-' modifier, which we choose to ignore for simplicity.
Using a relatively decent compiler (GCC 5.3.0) this does _not_ increase the code size of tiny-printf.o for 32-bit builds (where long and int are actually the same); in fact it loses three (ARM Thumb2) instructions from the actual SPL (numbers for orangepi_plus_defconfig):
   text    data     bss     dec     hex filename
    758       0       0     758     2f6 spl/lib/tiny-printf.o   before
  18839     488     232   19559    4c67 spl/u-boot-spl          before
    758       0       0     758     2f6 spl/lib/tiny-printf.o   after
  18833     488     232   19553    4c61 spl/u-boot-spl          after
This adds a substantial amount of code to a 64-bit build, though (taken after a later commit, which enables the ARM64 SPL build for sunxi):
   text    data     bss     dec     hex filename
   1542       0       0    1542     606 spl/lib/tiny-printf.o   before
  25830     392     360   26582    67d6 spl/u-boot-spl          before
   1758       0       0    1758     6de spl/lib/tiny-printf.o   after
  26040     392     360   26792    68a8 spl/u-boot-spl          after
Signed-off-by: Andre Przywara andre.przywara@arm.com
lib/tiny-printf.c | 50 +++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 39 insertions(+), 11 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org

On Mon, 5 Dec 2016 01:52:11 +0000 Andre Przywara andre.przywara@arm.com wrote:
tiny-printf does not know about the "l" modifier so far, which breaks the crash dump on AArch64, because it uses %lx to print the registers. Add an easy way of handling longs correctly. Also there are printfs using the '-' modifier, which we choose to ignore for simplicity.
If the '-' modifier is so useless in practice, then why don't we just remove it from the format string of the caller rather than making the printf implementation deviate from the expected behaviour?
From "man 3 printf":
" - The converted value is to be left adjusted on the field boundary. (The default is right justification.) The converted value is padded on the right with blanks, rather than on the left with blanks or zeros. A - overrides a 0 if both are given."
Either way, I think that this change needs to be done as a separate patch. Smuggling it as a part of this "l" modifier patch looks rather fishy ;-)
Using a relatively decent compiler (GCC 5.3.0) this does _not_ increase the code size of tiny-printf.o for 32-bit builds (where long and int are actually the same); in fact it loses three (ARM Thumb2) instructions from the actual SPL (numbers for orangepi_plus_defconfig):
   text    data     bss     dec     hex filename
    758       0       0     758     2f6 spl/lib/tiny-printf.o   before
  18839     488     232   19559    4c67 spl/u-boot-spl          before
    758       0       0     758     2f6 spl/lib/tiny-printf.o   after
  18833     488     232   19553    4c61 spl/u-boot-spl          after
Very cool :-)
This adds some substantial amount of code to a 64-bit build, though: (taken after a later commit, which enables the ARM64 SPL build for sunxi)
   text    data     bss     dec     hex filename
   1542       0       0    1542     606 spl/lib/tiny-printf.o   before
  25830     392     360   26582    67d6 spl/u-boot-spl          before
   1758       0       0    1758     6de spl/lib/tiny-printf.o   after
  26040     392     360   26792    68a8 spl/u-boot-spl          after
OK, I guess we have to live with this. One more win for Thumb2 vs. AArch64 though.
And thanks for adding these stats to the commit message.
Signed-off-by: Andre Przywara andre.przywara@arm.com
lib/tiny-printf.c | 50 +++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 39 insertions(+), 11 deletions(-)

The UL() macro is pretty useful for sharing constants between assembly and C files while still being able to specify a type for C. Move the macro from an armv8 specific header into a common header file so that arm code (for instance) can use it as well.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
---
 arch/arm/include/asm/armv8/mmu.h | 8 --------
 include/common.h | 10 +++++++++-
 2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h
index aa0f3c4..e9b4cdb 100644
--- a/arch/arm/include/asm/armv8/mmu.h
+++ b/arch/arm/include/asm/armv8/mmu.h
@@ -8,14 +8,6 @@
 #ifndef _ASM_ARMV8_MMU_H_
 #define _ASM_ARMV8_MMU_H_
 
-#ifdef __ASSEMBLY__
-#define _AC(X, Y)	X
-#else
-#define _AC(X, Y)	(X##Y)
-#endif
-
-#define UL(x)		_AC(x, UL)
-
 /***************************************************************/
 /*
  * The following definitions are related each other, shoud be
diff --git a/include/common.h b/include/common.h
index a8d833b..5fcd5f5 100644
--- a/include/common.h
+++ b/include/common.h
@@ -15,6 +15,8 @@ typedef volatile unsigned long vu_long;
 typedef volatile unsigned short vu_short;
 typedef volatile unsigned char vu_char;
 
+#define _AC(X, Y)	(X##Y)
+
 #include <config.h>
 #include <errno.h>
 #include <asm-offsets.h>
@@ -936,7 +938,11 @@ int cpu_disable(int nr);
 int cpu_release(int nr, int argc, char * const argv[]);
 #endif
 
-#endif /* __ASSEMBLY__ */
+#else	/* __ASSEMBLY__ */
+
+#define _AC(X, Y)	X
+
+#endif	/* __ASSEMBLY__ */
 
 #ifdef CONFIG_PPC
 /*
@@ -948,6 +954,8 @@ int cpu_release(int nr, int argc, char * const argv[]);
 
 /* Put only stuff here that the assembler can digest */
 
+#define UL(x)		_AC(x, UL)
+
 #ifdef CONFIG_POST
 #define CONFIG_HAS_POST
 #ifndef CONFIG_POST_ALT_LIST
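To illustrate what the UL() macro buys, here is a small sketch of a constant that can be shared between C and assembly sources. The _AC()/UL() definitions mirror the ones in the diff; the DEMO_* constants and demo_sram_end() are invented for this example. In a C translation unit __ASSEMBLY__ is not defined, so UL(x) pastes the UL suffix onto the literal; the assembler, which does not understand such suffixes, would see the bare number instead.

```c
#ifdef __ASSEMBLY__
#define _AC(X, Y)	X		/* assembler: plain literal */
#else
#define _AC(X, Y)	(X##Y)		/* C: literal with type suffix */
#endif
#define UL(x)		_AC(x, UL)

/* Hypothetical constants usable from both .c and .S files. */
#define DEMO_SRAM_BASE	UL(0x00010000)
#define DEMO_HIGH_BIT	(UL(1) << 31)	/* well defined, unlike (1 << 31) on int */

#ifndef __ASSEMBLY__
/* End address of a region starting at the (made-up) SRAM base. */
unsigned long demo_sram_end(unsigned long size)
{
	return DEMO_SRAM_BASE + size;
}
#endif
```

Because the constant carries an unsigned long type in C, shifts and additions on it are done at the full register width on 64-bit builds without extra casts at each use.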

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The UL() macro is pretty useful for sharing constants between assembly and C files while still being able to specify a type for C. Move the macro from an armv8 specific header into a common header file so that arm code (for instance) can use it as well.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de
arch/arm/include/asm/armv8/mmu.h | 8 -------- include/common.h | 10 +++++++++- 2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h
index aa0f3c4..e9b4cdb 100644
--- a/arch/arm/include/asm/armv8/mmu.h
+++ b/arch/arm/include/asm/armv8/mmu.h
@@ -8,14 +8,6 @@
 #ifndef _ASM_ARMV8_MMU_H_
 #define _ASM_ARMV8_MMU_H_
 
-#ifdef __ASSEMBLY__
-#define _AC(X, Y)	X
-#else
-#define _AC(X, Y)	(X##Y)
-#endif
-
-#define UL(x)		_AC(x, UL)
-
 /***************************************************************/
 /*
  * The following definitions are related each other, shoud be
diff --git a/include/common.h b/include/common.h
index a8d833b..5fcd5f5 100644
--- a/include/common.h
+++ b/include/common.h
@@ -15,6 +15,8 @@ typedef volatile unsigned long vu_long;
 typedef volatile unsigned short vu_short;
 typedef volatile unsigned char vu_char;
 
+#define _AC(X, Y)	(X##Y)
+
 #include <config.h>
 #include <errno.h>
 #include <asm-offsets.h>
@@ -936,7 +938,11 @@ int cpu_disable(int nr);
 int cpu_release(int nr, int argc, char * const argv[]);
 #endif
 
-#endif /* __ASSEMBLY__ */
+#else	/* __ASSEMBLY__ */
+
+#define _AC(X, Y)	X
Can you please comment what this macro is for?
+
+#endif	/* __ASSEMBLY__ */
 
 #ifdef CONFIG_PPC
 /*
@@ -948,6 +954,8 @@ int cpu_release(int nr, int argc, char * const argv[]);
 
 /* Put only stuff here that the assembler can digest */
 
+#define UL(x)		_AC(x, UL)
+
 #ifdef CONFIG_POST
 #define CONFIG_HAS_POST
 #ifndef CONFIG_POST_ALT_LIST
--
2.8.2
Regards, Simon
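
As a rough illustration of what the macro pair is for, here is a minimal sketch. The two macro definitions are the ones from the patch; the constant EXAMPLE_DRAM_BASE and the helper example_dram_addr() are made up for this example and are not part of U-Boot. In C the constant carries an unsigned long type, while a file preprocessed with __ASSEMBLY__ defined sees only the bare number, which the assembler can digest.

```c
#include <assert.h>

/* Same definitions as in the patch. */
#ifdef __ASSEMBLY__
#define _AC(X, Y)	X		/* assembler: plain number, no suffix */
#else
#define _AC(X, Y)	(X##Y)		/* C: paste the type suffix onto the number */
#endif
#define UL(x)		_AC(x, UL)

/* Hypothetical constant, for illustration only. */
#define EXAMPLE_DRAM_BASE	UL(0x40000000)

/* Returns the address 'offset' bytes above the example base. */
unsigned long example_dram_addr(unsigned long offset)
{
	/* Guard against wrapping around the end of the address space. */
	assert(offset <= ~0UL - EXAMPLE_DRAM_BASE);
	return EXAMPLE_DRAM_BASE + offset;
}
```

The same header line could then be included from a .S file, where EXAMPLE_DRAM_BASE would expand to plain 0x40000000.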

Since entry_point and load_addr are addresses, they should be represented as longs to cover the whole address space and to avoid warning when compiling the SPL in 64-bit. Also adjust debug prints to add the 'l' specifier, where needed.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Reviewed-by: Alexander Graf agraf@suse.de
---
 arch/arm/mach-omap2/boot-common.c | 2 +-
 arch/arm/mach-tegra/spl.c         | 2 +-
 common/spl/spl.c                  | 8 ++++----
 common/spl/spl_mmc.c              | 2 +-
 include/spl.h                     | 4 ++--
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/arm/mach-omap2/boot-common.c b/arch/arm/mach-omap2/boot-common.c
index 385310b..7ae3d80 100644
--- a/arch/arm/mach-omap2/boot-common.c
+++ b/arch/arm/mach-omap2/boot-common.c
@@ -228,7 +228,7 @@ void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image)

 	u32 boot_params = *((u32 *)OMAP_SRAM_SCRATCH_BOOT_PARAMS);

-	debug("image entry point: 0x%X\n", spl_image->entry_point);
+	debug("image entry point: 0x%lX\n", spl_image->entry_point);
 	/* Pass the saved boot_params from rom code */
 	image_entry((u32 *)boot_params);
 }
diff --git a/arch/arm/mach-tegra/spl.c b/arch/arm/mach-tegra/spl.c
index e0f9d5b..41c88cb 100644
--- a/arch/arm/mach-tegra/spl.c
+++ b/arch/arm/mach-tegra/spl.c
@@ -42,7 +42,7 @@ u32 spl_boot_device(void)

 void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image)
 {
-	debug("image entry point: 0x%X\n", spl_image->entry_point);
+	debug("image entry point: 0x%lX\n", spl_image->entry_point);

 	start_cpu((u32)spl_image->entry_point);
 	halt_avp();
diff --git a/common/spl/spl.c b/common/spl/spl.c
index 32b9f1e..cda2f8a 100644
--- a/common/spl/spl.c
+++ b/common/spl/spl.c
@@ -115,7 +115,7 @@ int spl_parse_image_header(struct spl_image_info *spl_image,
 		}
 		spl_image->os = image_get_os(header);
 		spl_image->name = image_get_name(header);
-		debug("spl: payload image: %.*s load addr: 0x%x size: %d\n",
+		debug("spl: payload image: %.*s load addr: 0x%lx size: %d\n",
 			(int)sizeof(spl_image->name), spl_image->name,
 			spl_image->load_addr, spl_image->size);
 	} else {
@@ -140,7 +140,7 @@ int spl_parse_image_header(struct spl_image_info *spl_image,
 			spl_image->load_addr = CONFIG_SYS_LOAD_ADDR;
 			spl_image->entry_point = CONFIG_SYS_LOAD_ADDR;
 			spl_image->size = end - start;
-			debug("spl: payload zImage, load addr: 0x%x size: %d\n",
+			debug("spl: payload zImage, load addr: 0x%lx size: %d\n",
 			      spl_image->load_addr, spl_image->size);
 			return 0;
 		}
@@ -164,9 +164,9 @@ __weak void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image)
 	typedef void __noreturn (*image_entry_noargs_t)(void);

 	image_entry_noargs_t image_entry =
-		(image_entry_noargs_t)(unsigned long)spl_image->entry_point;
+		(image_entry_noargs_t)spl_image->entry_point;

-	debug("image entry point: 0x%X\n", spl_image->entry_point);
+	debug("image entry point: 0x%lX\n", spl_image->entry_point);
 	image_entry();
 }

diff --git a/common/spl/spl_mmc.c b/common/spl/spl_mmc.c
index 58b061f..9575d48 100644
--- a/common/spl/spl_mmc.c
+++ b/common/spl/spl_mmc.c
@@ -36,7 +36,7 @@ static int mmc_load_legacy(struct spl_image_info *spl_image, struct mmc *mmc,
 	/* Read the header too to avoid extra memcpy */
 	count = blk_dread(mmc_get_blk_desc(mmc), sector, image_size_sectors,
 			  (void *)(ulong)spl_image->load_addr);
-	debug("read %x sectors to %x\n", image_size_sectors,
+	debug("read %x sectors to %lx\n", image_size_sectors,
 	      spl_image->load_addr);
 	if (count != image_size_sectors)
 		return -EIO;
diff --git a/include/spl.h b/include/spl.h
index c727eb7..feadb33 100644
--- a/include/spl.h
+++ b/include/spl.h
@@ -23,8 +23,8 @@ struct spl_image_info {
 	const char *name;
 	u8 os;
-	u32 load_addr;
-	u32 entry_point;
+	ulong load_addr;
+	ulong entry_point;
 	u32 size;
 	u32 flags;
 };
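
To illustrate why the field width matters, here is a small host-side sketch. The struct and function names are hypothetical stand-ins, not U-Boot code, and fixed-width types are used for the truncation part so that it behaves the same on any host. It shows a 64-bit address losing its upper bits in a 32-bit field, and the 'l' length modifier that an unsigned long argument needs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the old layout: an address squeezed into 32 bits. */
struct example_image_info_old {
	uint32_t entry_point;
};

/* Stores a 64-bit address in the old 32-bit field and returns what survives. */
uint32_t example_store_old(struct example_image_info_old *img, uint64_t addr)
{
	assert(img != NULL);
	img->entry_point = (uint32_t)addr;	/* upper 32 bits are dropped */
	return img->entry_point;
}

/* Formats an address held in an unsigned long, using the 'l' length modifier. */
int example_format_entry(char *buf, size_t len, unsigned long entry_point)
{
	return snprintf(buf, len, "image entry point: 0x%lX", entry_point);
}
```

With an unsigned long field there is nothing to truncate on a 64-bit build, and the format string and argument types agree, which is what silences the compiler warnings mentioned in the commit message.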

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
Since entry_point and load_addr are addresses, they should be represented as longs to cover the whole address space and to avoid warning when compiling the SPL in 64-bit. Also adjust debug prints to add the 'l' specifier, where needed.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de
arch/arm/mach-omap2/boot-common.c | 2 +- arch/arm/mach-tegra/spl.c | 2 +- common/spl/spl.c | 8 ++++---- common/spl/spl_mmc.c | 2 +- include/spl.h | 4 ++-- 5 files changed, 9 insertions(+), 9 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org

On Mon, Dec 05, 2016 at 01:52:13AM +0000, Andre Przywara wrote:
Since entry_point and load_addr are addresses, they should be represented as longs to cover the whole address space and to avoid warning when compiling the SPL in 64-bit. Also adjust debug prints to add the 'l' specifier, where needed.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de
Reviewed-by: Tom Rini trini@konsulko.com

On Mon, Dec 05, 2016 at 01:52:13AM +0000, Andre Przywara wrote:
Since entry_point and load_addr are addresses, they should be represented as longs to cover the whole address space and to avoid warning when compiling the SPL in 64-bit. Also adjust debug prints to add the 'l' specifier, where needed.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de
Acked-by: Maxime Ripard maxime.ripard@free-electrons.com
Thanks, Maxime

The sunxi DRAM setup code needs an sdelay() implementation, which wasn't defined for armv8 so far. Shamelessly copy the armv7 version and adjust it to work in AArch64.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c
index e06c3cc..0366ff4 100644
--- a/arch/arm/cpu/armv8/cpu.c
+++ b/arch/arm/cpu/armv8/cpu.c
@@ -16,6 +16,19 @@
 #include <asm/system.h>
 #include <linux/compiler.h>

+/************************************************************
+ * sdelay() - simple spin loop.  Will be constant time as
+ *  its generally used in bypass conditions only.  This
+ *  is necessary until timers are accessible.
+ *
+ *  not inline to increase chances its in cache when called
+ *************************************************************/
+void sdelay(unsigned long loops)
+{
+	__asm__ volatile ("1:\n" "subs %0, %0, #1\n"
+			  "b.ne 1b" : "=r" (loops) : "0"(loops) : "cc");
+}
+
 int cleanup_before_linux(void)
 {
 	/*

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The sunxi DRAM setup code needs an sdelay() implementation, which wasn't defined for armv8 so far. Shamelessly copy the armv7 version and adjust it to work in AArch64.
Signed-off-by: Andre Przywara andre.przywara@arm.com
 arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c
index e06c3cc..0366ff4 100644
--- a/arch/arm/cpu/armv8/cpu.c
+++ b/arch/arm/cpu/armv8/cpu.c
@@ -16,6 +16,19 @@
 #include <asm/system.h>
 #include <linux/compiler.h>

+/************************************************************

Can we drop the extra stars and use the normal function comment style?

+ * sdelay() - simple spin loop.  Will be constant time as
+ *  its generally used in bypass conditions only.  This
+ *  is necessary until timers are accessible.
+ *
+ *  not inline to increase chances its in cache when called

Should mention the meaning of the parameter and that it cannot be called with 0.

+ *************************************************************/
+void sdelay(unsigned long loops)
+{
+	__asm__ volatile ("1:\n" "subs %0, %0, #1\n"
+			  "b.ne 1b" : "=r" (loops) : "0"(loops) : "cc");
+}
+
 int cleanup_before_linux(void)
 {
 	/*
--
2.8.2
Regards, Simon
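
The point about a zero argument can be seen in a portable model of the loop. This is only a sketch: it mirrors the decrement-then-test order of "subs" followed by "b.ne", but uses an 8-bit counter so that the wraparound case finishes quickly, and both function names are invented for this example. The real sdelay() takes an unsigned long, so on AArch64 the same wrap would run through the full 64-bit range.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable model of "1: subs %0, %0, #1; b.ne 1b" with an 8-bit counter.
 * Returns the number of loop iterations executed.
 */
unsigned int example_spin_iterations(uint8_t loops)
{
	unsigned int iterations = 0;

	do {
		loops--;		/* subs: decrement first, 0 wraps to 255 */
		iterations++;
	} while (loops != 0);		/* b.ne: repeat until the counter is zero */

	return iterations;
}

/* Wrapper that documents the precondition raised in the review. */
unsigned int example_spin_checked(uint8_t loops)
{
	assert(loops != 0);	/* 0 would spin through the whole counter range */
	return example_spin_iterations(loops);
}
```

Because the decrement happens before the comparison, an argument of 0 does not mean "no delay" but the longest delay the counter width allows.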

The boot0 hook we have so far is applied _after_ the initial branch to the "reset" entry point. An upcoming change requires even this branch to be changed, so we apply the hook macro at the earliest point, and have the branch in the hook file as well. This is no functional change at this point, just refactoring to simplify upcoming patches.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/cpu/armv8/start.S              | 4 ++--
 arch/arm/include/asm/arch-sunxi/boot0.h | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S
index 4f5f6d8..ee393d7 100644
--- a/arch/arm/cpu/armv8/start.S
+++ b/arch/arm/cpu/armv8/start.S
@@ -19,8 +19,6 @@

 .globl _start
 _start:
-	b	reset
-
 #ifdef CONFIG_ENABLE_ARM_SOC_BOOT0_HOOK
 /*
  * Various SoCs need something special and SoC-specific up front in
@@ -29,6 +27,8 @@ _start:
  */
 #include <asm/arch/boot0.h>
 ARM_SOC_BOOT0_HOOK
+#else
+	b	reset
 #endif

 	.align 3
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index ea5675e..6f28d63 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -9,6 +9,7 @@

 /* reserve space for BOOT0 header information */
 #define ARM_SOC_BOOT0_HOOK \
+	b	reset; \
 	.space	1532

 #endif /* __BOOT0_H */

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The boot0 hook we have so far is applied _after_ the initial branch to the "reset" entry point. An upcoming change requires even this branch to be changed, so we apply the hook macro at the earliest point, and have the branch in the hook file as well. This is no functional change at this point, just refactoring to simplify upcoming patches.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/cpu/armv8/start.S | 4 ++-- arch/arm/include/asm/arch-sunxi/boot0.h | 1 + 2 files changed, 3 insertions(+), 2 deletions(-)
Will this not affect other boards which use ARM_SOC_BOOT0_HOOK?
Regards, Simon

Hi,
On 05/12/16 06:25, Simon Glass wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The boot0 hook we have so far is applied _after_ the initial branch to the "reset" entry point. An upcoming change requires even this branch to be changed, so we apply the hook macro at the earliest point, and have the branch in the hook file as well. This is no functional change at this point, just refactoring to simplify upcoming patches.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/cpu/armv8/start.S | 4 ++-- arch/arm/include/asm/arch-sunxi/boot0.h | 1 + 2 files changed, 3 insertions(+), 2 deletions(-)
Will this not affect other boards which use ARM_SOC_BOOT0_HOOK?
That's a valid question, but the answer is: no. Roughly the same mechanism is used by two Broadcom ARMv7 boards, but the usage is different there: they include the boot0.h header file only after the vectors (and not just after the initial branch-to-reset). So the behaviour already differs and is not compatible between armv7 and armv8 right now; this patch does not introduce a regression or a change there.
I agree it's a bit confusing to have the same header and Kconfig name with different behaviour, but I don't see a good solution to unify this. If you do, I am all ears.
Cheers, Andre.

On 5 December 2016 at 08:43, Andre Przywara andre.przywara@arm.com wrote:
Hi,
On 05/12/16 06:25, Simon Glass wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The boot0 hook we have so far is applied _after_ the initial branch to the "reset" entry point. An upcoming change requires even this branch to be changed, so we apply the hook macro at the earliest point, and have the branch in the hook file as well. This is no functional change at this point, just refactoring to simplify upcoming patches.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/cpu/armv8/start.S | 4 ++-- arch/arm/include/asm/arch-sunxi/boot0.h | 1 + 2 files changed, 3 insertions(+), 2 deletions(-)
Will this not affect other boards which use ARM_SOC_BOOT0_HOOK?
That's a valid question, but the answer is: no. Roughly the same mechanism is used by two Broadcom ARMv7 boards, but the usage is different there: they include the boot0.h header file only after the vectors (and not just after the initial branch-to-reset). So the behaviour already differs and is not compatible between armv7 and armv8 right now; this patch does not introduce a regression or a change there.
I agree it's a bit confusing to have the same header and Kconfig name with different behaviour, but I don't see a good solution to unify this. If you do, I am all ears.
Reviewed-by: Simon Glass sjg@chromium.org

For prepending some board specific header area to U-Boot images we were so far including a header file with a macro definition containing the actual header specification. This works fine if there are just a few statements and if there is only one alternative. However adding more complex code quickly gets messy with this approach, so let's just drop that intermediate macro and let the #include actually insert the code directly. This converts the callers and the callees, but doesn't change anything at this point.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/cpu/armv8/start.S                 | 1 -
 arch/arm/include/asm/arch-bcm235xx/boot0.h | 8 +-------
 arch/arm/include/asm/arch-bcm281xx/boot0.h | 8 +-------
 arch/arm/include/asm/arch-sunxi/boot0.h    | 8 +-------
 arch/arm/lib/vectors.S                     | 1 -
 5 files changed, 3 insertions(+), 23 deletions(-)

diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S
index ee393d7..140609d 100644
--- a/arch/arm/cpu/armv8/start.S
+++ b/arch/arm/cpu/armv8/start.S
@@ -26,7 +26,6 @@ _start:
  * use it here.
  */
 #include <asm/arch/boot0.h>
-ARM_SOC_BOOT0_HOOK
 #else
 	b	reset
 #endif
diff --git a/arch/arm/include/asm/arch-bcm235xx/boot0.h b/arch/arm/include/asm/arch-bcm235xx/boot0.h
index 7e72882..9ff90b8 100644
--- a/arch/arm/include/asm/arch-bcm235xx/boot0.h
+++ b/arch/arm/include/asm/arch-bcm235xx/boot0.h
@@ -4,12 +4,6 @@
  * SPDX-License-Identifier: GPL-2.0+
  */

-#ifndef __BOOT0_H
-#define __BOOT0_H
-
 /* BOOT0 header information */
-#define ARM_SOC_BOOT0_HOOK \
-	.word	0xbabeface; \
+	.word	0xbabeface;
 	.word	_end - _start
-
-#endif /* __BOOT0_H */
diff --git a/arch/arm/include/asm/arch-bcm281xx/boot0.h b/arch/arm/include/asm/arch-bcm281xx/boot0.h
index 7e72882..9ff90b8 100644
--- a/arch/arm/include/asm/arch-bcm281xx/boot0.h
+++ b/arch/arm/include/asm/arch-bcm281xx/boot0.h
@@ -4,12 +4,6 @@
  * SPDX-License-Identifier: GPL-2.0+
  */

-#ifndef __BOOT0_H
-#define __BOOT0_H
-
 /* BOOT0 header information */
-#define ARM_SOC_BOOT0_HOOK \
-	.word	0xbabeface; \
+	.word	0xbabeface;
 	.word	_end - _start
-
-#endif /* __BOOT0_H */
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index 6f28d63..6a13db5 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -4,12 +4,6 @@
  * SPDX-License-Identifier: GPL-2.0+
  */

-#ifndef __BOOT0_H
-#define __BOOT0_H
-
 /* reserve space for BOOT0 header information */
-#define ARM_SOC_BOOT0_HOOK \
-	b	reset; \
+	b	reset
 	.space	1532
-
-#endif /* __BOOT0_H */
diff --git a/arch/arm/lib/vectors.S b/arch/arm/lib/vectors.S
index 5cc132b..9fe7415 100644
--- a/arch/arm/lib/vectors.S
+++ b/arch/arm/lib/vectors.S
@@ -67,7 +67,6 @@ _start:
  * use it here.
  */
 #include <asm/arch/boot0.h>
-ARM_SOC_BOOT0_HOOK
 #endif

 /*

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
For prepending some board specific header area to U-Boot images we were so far including a header file with a macro definition containing the actual header specification. This works fine if there are just a few statements and if there is only one alternative. However adding more complex code quickly gets messy with this approach, so let's just drop that intermediate macro and let the #include actually insert the code directly. This converts the callers and the callees, but doesn't change anything at this point.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/cpu/armv8/start.S | 1 - arch/arm/include/asm/arch-bcm235xx/boot0.h | 8 +------- arch/arm/include/asm/arch-bcm281xx/boot0.h | 8 +------- arch/arm/include/asm/arch-sunxi/boot0.h | 8 +------- arch/arm/lib/vectors.S | 1 - 5 files changed, 3 insertions(+), 23 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org

Hi Andre (3rd attempt...)
On Sun, Dec 4, 2016 at 5:52 PM, Andre Przywara andre.przywara@arm.com wrote:
For prepending some board specific header area to U-Boot images we were so far including a header file with a macro definition containing the actual header specification. This works fine if there are just a few statements and if there is only one alternative. However adding more complex code quickly gets messy with this approach, so let's just drop that intermediate macro and let the #include actually insert the code directly. This converts the callers and the callees, but doesn't change anything at this point.
Signed-off-by: Andre Przywara andre.przywara@arm.com
 arch/arm/cpu/armv8/start.S                 | 1 -
 arch/arm/include/asm/arch-bcm235xx/boot0.h | 8 +-------
 arch/arm/include/asm/arch-bcm281xx/boot0.h | 8 +-------
 arch/arm/include/asm/arch-sunxi/boot0.h    | 8 +-------
 arch/arm/lib/vectors.S                     | 1 -
 5 files changed, 3 insertions(+), 23 deletions(-)

diff --git a/arch/arm/cpu/armv8/start.S b/arch/arm/cpu/armv8/start.S
index ee393d7..140609d 100644
--- a/arch/arm/cpu/armv8/start.S
+++ b/arch/arm/cpu/armv8/start.S
@@ -26,7 +26,6 @@ _start:
  * use it here.
  */
 #include <asm/arch/boot0.h>
-ARM_SOC_BOOT0_HOOK
 #else
 	b	reset
 #endif
diff --git a/arch/arm/include/asm/arch-bcm235xx/boot0.h b/arch/arm/include/asm/arch-bcm235xx/boot0.h
index 7e72882..9ff90b8 100644
--- a/arch/arm/include/asm/arch-bcm235xx/boot0.h
+++ b/arch/arm/include/asm/arch-bcm235xx/boot0.h
@@ -4,12 +4,6 @@
  * SPDX-License-Identifier: GPL-2.0+
  */

-#ifndef __BOOT0_H
-#define __BOOT0_H
-
 /* BOOT0 header information */
-#define ARM_SOC_BOOT0_HOOK \
-	.word	0xbabeface; \
+	.word	0xbabeface;

the trailing semi-colon is not necessary

 	.word	_end - _start
-
-#endif /* __BOOT0_H */
diff --git a/arch/arm/include/asm/arch-bcm281xx/boot0.h b/arch/arm/include/asm/arch-bcm281xx/boot0.h
index 7e72882..9ff90b8 100644
--- a/arch/arm/include/asm/arch-bcm281xx/boot0.h
+++ b/arch/arm/include/asm/arch-bcm281xx/boot0.h
@@ -4,12 +4,6 @@
  * SPDX-License-Identifier: GPL-2.0+
  */

-#ifndef __BOOT0_H
-#define __BOOT0_H
-
 /* BOOT0 header information */
-#define ARM_SOC_BOOT0_HOOK \
-	.word	0xbabeface; \
+	.word	0xbabeface;

the trailing semi-colon is not necessary

 	.word	_end - _start
-
-#endif /* __BOOT0_H */
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index 6f28d63..6a13db5 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -4,12 +4,6 @@
  * SPDX-License-Identifier: GPL-2.0+
  */

-#ifndef __BOOT0_H
-#define __BOOT0_H
-
 /* reserve space for BOOT0 header information */
-#define ARM_SOC_BOOT0_HOOK \
-	b	reset; \
+	b	reset
 	.space	1532
-
-#endif /* __BOOT0_H */
diff --git a/arch/arm/lib/vectors.S b/arch/arm/lib/vectors.S
index 5cc132b..9fe7415 100644
--- a/arch/arm/lib/vectors.S
+++ b/arch/arm/lib/vectors.S
@@ -67,7 +67,6 @@ _start:
  * use it here.
  */
 #include <asm/arch/boot0.h>
-ARM_SOC_BOOT0_HOOK
 #endif

 /*
--
2.8.2
Tested-by: Steve Rae steve.rae@raedomain.com
Thanks, Steve

The ENABLE_ARM_SOC_BOOT0_HOOK option is a generic option shared with other boards. To allow alternative code to be inserted, we create another, now function specific config symbol on top of it to simplify later additions. No functional change at this time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 board/sunxi/Kconfig           | 9 +++++++++
 configs/pine64_plus_defconfig | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index e1d4ab1..0cd57a2 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -133,6 +133,15 @@ config MACH_SUN8I
 	bool
 	default y if MACH_SUN8I_A23 || MACH_SUN8I_A33 || MACH_SUN8I_H3 || MACH_SUN8I_A83T

+config RESERVE_ALLWINNER_BOOT0_HEADER
+	bool "reserve space for Allwinner boot0 header"
+	select ENABLE_ARM_SOC_BOOT0_HOOK
+	---help---
+	  Prepend a 1536 byte (empty) header to the U-Boot image file, to be
+	  filled with magic values post build. The Allwinner provided boot0
+	  blob relies on this information to load and execute U-Boot.
+	  Only needed on 64-bit Allwinner boards so far when using boot0.
+
 config DRAM_TYPE
 	int "sunxi dram type"
 	depends on MACH_SUN8I_A83T
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index 6d0198f..ea53b96 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -1,5 +1,5 @@
 CONFIG_ARM=y
-CONFIG_ENABLE_ARM_SOC_BOOT0_HOOK=y
+CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_MACH_SUN50I=y
 CONFIG_DRAM_CLK=672

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The ENABLE_ARM_SOC_BOOT0_HOOK option is a generic option shared with other boards. To allow alternative code to be inserted, we create another, now function specific config symbol on top of it to simplify later additions. No functional change at this time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 9 +++++++++ configs/pine64_plus_defconfig | 2 +- 2 files changed, 10 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index e1d4ab1..0cd57a2 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -133,6 +133,15 @@ config MACH_SUN8I
 	bool
 	default y if MACH_SUN8I_A23 || MACH_SUN8I_A33 || MACH_SUN8I_H3 || MACH_SUN8I_A83T

+config RESERVE_ALLWINNER_BOOT0_HEADER

Would RESERVE_SUNXI_BOOT0_HEADER be better?

+	bool "reserve space for Allwinner boot0 header"
+	select ENABLE_ARM_SOC_BOOT0_HOOK
+	---help---
+	  Prepend a 1536 byte (empty) header to the U-Boot image file, to be
+	  filled with magic values post build. The Allwinner provided boot0
+	  blob relies on this information to load and execute U-Boot.
+	  Only needed on 64-bit Allwinner boards so far when using boot0.
+
 config DRAM_TYPE
 	int "sunxi dram type"
 	depends on MACH_SUN8I_A83T
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index 6d0198f..ea53b96 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -1,5 +1,5 @@
 CONFIG_ARM=y
-CONFIG_ENABLE_ARM_SOC_BOOT0_HOOK=y
+CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_MACH_SUN50I=y
 CONFIG_DRAM_CLK=672
--
2.8.2

Hi,
On 05/12/16 06:25, Simon Glass wrote:
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The ENABLE_ARM_SOC_BOOT0_HOOK option is a generic option shared with other boards. To allow alternative code to be inserted, we create another, now function specific config symbol on top of it to simplify later additions. No functional change at this time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 9 +++++++++ configs/pine64_plus_defconfig | 2 +- 2 files changed, 10 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index e1d4ab1..0cd57a2 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -133,6 +133,15 @@ config MACH_SUN8I bool default y if MACH_SUN8I_A23 || MACH_SUN8I_A33 || MACH_SUN8I_H3 || MACH_SUN8I_A83T
+config RESERVE_ALLWINNER_BOOT0_HEADER
Would RESERVE_SUNXI_BOOT0_HEADER be better?
Well, although originally an Allwinner invention, the "sunxi" term is mostly used to denote community driven work for Allwinner SoCs. This particular symbol here is for using "boot0", which is an Allwinner provided binary blob and which we actually want to get rid of (hence this series). So to stress that this is really an "Allwinner Tech Ltd." dependent option I chose the verbatim Allwinner string here.
Eventually I plan to kill this option once we have convinced ourselves that our SPL is stable and provides the same feature set as boot0, so we don't need to worry too much about this naming, I guess.
Cheers, Andre.
+	bool "reserve space for Allwinner boot0 header"
+	select ENABLE_ARM_SOC_BOOT0_HOOK
+	---help---
+	  Prepend a 1536 byte (empty) header to the U-Boot image file, to be
+	  filled with magic values post build. The Allwinner provided boot0
+	  blob relies on this information to load and execute U-Boot.
+	  Only needed on 64-bit Allwinner boards so far when using boot0.
+
 config DRAM_TYPE
 	int "sunxi dram type"
 	depends on MACH_SUN8I_A83T
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index 6d0198f..ea53b96 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -1,5 +1,5 @@
 CONFIG_ARM=y
-CONFIG_ENABLE_ARM_SOC_BOOT0_HOOK=y
+CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_MACH_SUN50I=y
 CONFIG_DRAM_CLK=672
--
2.8.2

On Mon, Dec 05, 2016 at 01:52:17AM +0000, Andre Przywara wrote:
The ENABLE_ARM_SOC_BOOT0_HOOK option is a generic option shared with other boards. To allow alternative code to be inserted, we create another, now function specific config symbol on top of it to simplify later additions. No functional change at this time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Acked-by: Maxime Ripard maxime.ripard@free-electrons.com
Thanks, Maxime

The Allwinner A64 SoC starts execution in AArch32 mode, and both the boot ROM and Allwinner's boot0 keep running in this mode. So U-Boot gets entered in 32-bit, although we want it to run in AArch64.
By using a "magic" instruction, which happens to be an almost-NOP in AArch64 and a branch in AArch32, we differentiate between being entered in 64-bit or 32-bit mode. If in 64-bit mode, we proceed with the branch to reset, but in 32-bit mode we trigger an RMR write to bring the core into AArch64/EL3 and re-enter U-Boot at CONFIG_SYS_TEXT_BASE (or CONFIG_SPL_TEXT_BASE for the SPL). This allows a 64-bit U-Boot to be entered in both 32-bit and 64-bit mode, so we can use the same start code for the SPL and U-Boot proper.
We use the existing custom header (boot0.h) functionality, but restrict the existing boot0 header reservation to the non-SPL build now. An SPL wouldn't need such a header anyway. This allows both options to be defined and lets us use one for the SPL and the other for U-Boot proper.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/include/asm/arch-sunxi/boot0.h | 27 +++++++++++++++++++++++++++
 board/sunxi/Kconfig                     |  5 +++++
 2 files changed, 32 insertions(+)

diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index 6a13db5..7799a03 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -4,6 +4,33 @@
  * SPDX-License-Identifier: GPL-2.0+
  */

+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) && !defined(CONFIG_SPL_BUILD)
 /* reserve space for BOOT0 header information */
 	b	reset
 	.space	1532
+#elif defined(CONFIG_ARM_BOOT_HOOK_RMR)
+/* switch into AArch64 if needed */
+	tst	x0, x0			// this is "b #0x84" in ARM
+	b	reset
+	.space	0x7c
+	.word	0xe59f1024	// ldr r1, [pc, #36] ; 0x170000a0
+	.word	0xe59f0024	// ldr r0, [pc, #36] ; CONFIG_*_TEXT_BASE
+	.word	0xe5810000	// str r0, [r1]
+	.word	0xf57ff04f	// dsb sy
+	.word	0xf57ff06f	// isb sy
+	.word	0xee1c0f50	// mrc 15, 0, r0, cr12, cr0, {2} ; RMR
+	.word	0xe3800003	// orr r0, r0, #3
+	.word	0xee0c0f50	// mcr 15, 0, r0, cr12, cr0, {2} ; RMR
+	.word	0xf57ff06f	// isb sy
+	.word	0xe320f003	// wfi
+	.word	0xeafffffd	// b @wfi
+	.word	0x017000a0	// writeable RVBAR mapping address
+#ifdef CONFIG_SPL_BUILD
+	.word	CONFIG_SPL_TEXT_BASE
+#else
+	.word	CONFIG_SYS_TEXT_BASE
+#endif
+#else
+/* normal execution */
+	b	reset
+#endif
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index 0cd57a2..ba72e76 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -142,6 +142,11 @@ config RESERVE_ALLWINNER_BOOT0_HEADER
 	  blob relies on this information to load and execute U-Boot.
 	  Only needed on 64-bit Allwinner boards so far when using boot0.

+config ARM_BOOT_HOOK_RMR
+	bool
+	default y if ARM64
+	select ENABLE_ARM_SOC_BOOT0_HOOK
+
 config DRAM_TYPE
 	int "sunxi dram type"
 	depends on MACH_SUN8I_A83T
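
The "magic" instruction can be checked by hand with a small host-side decoder. This is a sketch only: the function names are invented for this example, and it assumes that "tst x0, x0" assembles to the word 0xea00001f (ANDS XZR, X0, X0). Read as an AArch32 instruction, that word has the always condition and the branch opcode, and its 24-bit offset field lands exactly behind the two AArch64 instructions plus the 0x7c bytes of padding.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed AArch64 encoding of the first word of the hook, "tst x0, x0". */
#define EXAMPLE_MAGIC_INSN	0xea00001fU

/* Returns 1 if the word is an A32 "b" (branch without link) instruction. */
int example_is_a32_branch(uint32_t insn)
{
	return (insn & 0x0f000000U) == 0x0a000000U;
}

/* Returns the byte offset from an A32 branch instruction to its target. */
int32_t example_a32_branch_offset(uint32_t insn)
{
	int32_t imm24;

	assert(example_is_a32_branch(insn));
	imm24 = (int32_t)(insn & 0x00ffffffU);
	if (imm24 & 0x00800000)		/* sign-extend the 24-bit field */
		imm24 -= 0x01000000;
	return imm24 * 4 + 8;		/* PC reads as instruction address + 8 */
}
```

For the magic word this gives 0x84, matching the "b #0x84" comment in the patch (8 bytes of AArch64 code plus 0x7c bytes of .space). For 0xeafffffd it gives -4, which is the wfi instruction directly before the final branch.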

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The Allwinner A64 SoC starts execution in AArch32 mode, and both the boot ROM and Allwinner's boot0 keep running in this mode. So U-Boot gets entered in 32-bit, although we want it to run in AArch64.
By using a "magic" instruction, which happens to be an almost-NOP in AArch64 and a branch in AArch32, we differentiate between being entered in 64-bit or 32-bit mode. If in 64-bit mode, we proceed with the branch to reset, but in 32-bit mode we trigger an RMR write to bring the core into AArch64/EL3 and re-enter U-Boot at CONFIG_SYS_TEXT_BASE (or CONFIG_SPL_TEXT_BASE for the SPL). This allows a 64-bit U-Boot to be entered in both 32-bit and 64-bit mode, so we can use the same start code for the SPL and U-Boot proper.
We use the existing custom header (boot0.h) functionality, but restrict the existing boot0 header reservation to the non-SPL build now. An SPL wouldn't need such a header anyway. This allows both options to be defined and lets us use one for the SPL and the other for U-Boot proper.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/boot0.h | 27 +++++++++++++++++++++++++++ board/sunxi/Kconfig | 5 +++++ 2 files changed, 32 insertions(+)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h
index 6a13db5..7799a03 100644
--- a/arch/arm/include/asm/arch-sunxi/boot0.h
+++ b/arch/arm/include/asm/arch-sunxi/boot0.h
@@ -4,6 +4,33 @@
  * SPDX-License-Identifier: GPL-2.0+
  */

+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) && !defined(CONFIG_SPL_BUILD)
 /* reserve space for BOOT0 header information */
 	b	reset
 	.space	1532
+#elif defined(CONFIG_ARM_BOOT_HOOK_RMR)
+/* switch into AArch64 if needed */
+	tst	x0, x0			// this is "b #0x84" in ARM
+	b	reset
+	.space	0x7c
+	.word	0xe59f1024	// ldr r1, [pc, #36] ; 0x170000a0
+	.word	0xe59f0024	// ldr r0, [pc, #36] ; CONFIG_*_TEXT_BASE
+	.word	0xe5810000	// str r0, [r1]
+	.word	0xf57ff04f	// dsb sy
+	.word	0xf57ff06f	// isb sy
+	.word	0xee1c0f50	// mrc 15, 0, r0, cr12, cr0, {2} ; RMR
+	.word	0xe3800003	// orr r0, r0, #3
+	.word	0xee0c0f50	// mcr 15, 0, r0, cr12, cr0, {2} ; RMR
+	.word	0xf57ff06f	// isb sy
+	.word	0xe320f003	// wfi
+	.word	0xeafffffd	// b @wfi
+	.word	0x017000a0	// writeable RVBAR mapping address

How come you cannot use the assembler here?

+#ifdef CONFIG_SPL_BUILD
+	.word	CONFIG_SPL_TEXT_BASE
+#else
+	.word	CONFIG_SYS_TEXT_BASE
+#endif
+#else
+/* normal execution */
+	b	reset
+#endif
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index 0cd57a2..ba72e76 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -142,6 +142,11 @@ config RESERVE_ALLWINNER_BOOT0_HEADER
 	  blob relies on this information to load and execute U-Boot.
 	  Only needed on 64-bit Allwinner boards so far when using boot0.

+config ARM_BOOT_HOOK_RMR
+	bool
+	default y if ARM64
+	select ENABLE_ARM_SOC_BOOT0_HOOK

help?

+
 config DRAM_TYPE
 	int "sunxi dram type"
 	depends on MACH_SUN8I_A83T
--
2.8.2
Regards, Simon

Hi Simon,
thanks a lot for looking at this!
On 05/12/16 06:25, Simon Glass wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The Allwinner A64 SoC starts execution in AArch32 mode, and both the boot ROM and Allwinner's boot0 keep running in this mode. So U-Boot gets entered in 32-bit, although we want it to run in AArch64.
By using a "magic" instruction, which happens to be an almost-NOP in AArch64 and a branch in AArch32, we differentiate between being entered in 64-bit or 32-bit mode. If in 64-bit mode, we proceed with the branch to reset, but in 32-bit mode we trigger an RMR write to bring the core into AArch64/EL3 and re-enter U-Boot at CONFIG_SYS_TEXT_BASE. This allows a 64-bit U-Boot to be entered in both 32-bit and 64-bit mode, so we can use the same start code for the SPL and U-Boot proper.
We use the existing custom header (boot0.h) functionality, but now restrict the existing boot0 header reservation to the non-SPL build. An SPL wouldn't need such a header anyway. This allows both options to be defined, so we can use one for the SPL and the other for U-Boot proper.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/boot0.h | 27 +++++++++++++++++++++++++++ board/sunxi/Kconfig | 5 +++++ 2 files changed, 32 insertions(+)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h index 6a13db5..7799a03 100644 --- a/arch/arm/include/asm/arch-sunxi/boot0.h +++ b/arch/arm/include/asm/arch-sunxi/boot0.h @@ -4,6 +4,33 @@
- SPDX-License-Identifier: GPL-2.0+
*/
+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) && !defined(CONFIG_SPL_BUILD) /* reserve space for BOOT0 header information */ b reset .space 1532 +#elif defined(CONFIG_ARM_BOOT_HOOK_RMR) +/* switch into AArch64 if needed */
tst x0, x0 // this is "b #0x84" in ARM
b reset
.space 0x7c
.word 0xe59f1024 // ldr r1, [pc, #36] ; 0x170000a0
.word 0xe59f0024 // ldr r0, [pc, #36] ; CONFIG_*_TEXT_BASE
.word 0xe5810000 // str r0, [r1]
.word 0xf57ff04f // dsb sy
.word 0xf57ff06f // isb sy
.word 0xee1c0f50 // mrc 15, 0, r0, cr12, cr0, {2} ; RMR
.word 0xe3800003 // orr r0, r0, #3
.word 0xee0c0f50 // mcr 15, 0, r0, cr12, cr0, {2} ; RMR
.word 0xf57ff06f // isb sy
.word 0xe320f003 // wfi
.word 0xeafffffd // b @wfi
.word 0x017000a0 // writeable RVBAR mapping address
How come you cannot use the assembler here?
Because this is ARM code, whereas this file is included in an AArch64 build. In contrast to x86 the AArch64 toolchain does not support both bitnesses in one build, mostly because the two architectures are really different.
The actual reason for this exercise is that the Allwinner boot ROM enters the payload in AArch32 mode, but we want to compile and run the SPL in AArch64. So we need some small ARM(32) stub to enter AArch64.
Running the whole SPL in 32-bit is the other option which the later patches enable, but I didn't want to call some 32-bit ARM (cross-)compiler just for this handful of instructions in this case here.
+#ifdef CONFIG_SPL_BUILD
.word CONFIG_SPL_TEXT_BASE
+#else
.word CONFIG_SYS_TEXT_BASE
+#endif +#else +/* normal execution */
b reset
+#endif diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index 0cd57a2..ba72e76 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -142,6 +142,11 @@ config RESERVE_ALLWINNER_BOOT0_HEADER blob relies on this information to load and execute U-Boot. Only needed on 64-bit Allwinner boards so far when using boot0.
+config ARM_BOOT_HOOK_RMR
bool
default y if ARM64
select ENABLE_ARM_SOC_BOOT0_HOOK
help?
Good point, I can copy some parts of the commit message into here.
Cheers, Andre.
config DRAM_TYPE int "sunxi dram type" depends on MACH_SUN8I_A83T -- 2.8.2
Regards, Simon
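
As an aside for readers following the "magic" instruction argument above, here is a small host-side sketch (not part of the patch) that decodes a 32-bit word as an A32 `B` instruction and computes where it branches to. The word 0xea00001f is what I believe `tst x0, x0` assembles to in AArch64; that value and the helper names are assumptions for illustration, the patch only states that the instruction reads as "b #0x84" in ARM.

```c
#include <stdint.h>

/*
 * Returns 1 if the word decodes as an A32 "B" (branch without link):
 * bits 27:25 are 0b101, the link bit 24 is clear, and the condition
 * field is not the 0b1111 "unconditional" space (which would be BLX).
 */
static int a32_is_branch(uint32_t insn)
{
	return ((insn >> 25) & 0x7) == 0x5 &&
	       ((insn >> 24) & 0x1) == 0 &&
	       (insn >> 28) != 0xf;
}

/* Branch target relative to the address of the instruction itself. */
static int32_t a32_branch_offset(uint32_t insn)
{
	/* sign-extend the 24-bit immediate */
	int32_t imm24 = (int32_t)(insn & 0x00ffffff);

	if (imm24 & 0x00800000)
		imm24 -= 0x01000000;

	/* scale by the word size; the A32 PC reads as "address + 8" */
	return imm24 * 4 + 8;
}
```

With the first word at offset 0, an offset of 0x84 is exactly the two leading instructions (8 bytes) plus the `.space 0x7c`, so the branch would land on the first `.word` of the AArch32 stub. The same helper applied to 0xeafffffd gives -4, i.e. the `wfi` directly before it.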

On Mon, Dec 05, 2016 at 10:41:27AM +0000, Andre Przywara wrote:
Hi Simon,
thanks a lot for looking at this!
On 05/12/16 06:25, Simon Glass wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The Allwinner A64 SoC starts execution in AArch32 mode, and both the boot ROM and Allwinner's boot0 keep running in this mode. So U-Boot gets entered in 32-bit, although we want it to run in AArch64.
By using a "magic" instruction, which happens to be an almost-NOP in AArch64 and a branch in AArch32, we differentiate between being entered in 64-bit or 32-bit mode. If in 64-bit mode, we proceed with the branch to reset, but in 32-bit mode we trigger an RMR write to bring the core into AArch64/EL3 and re-enter U-Boot at CONFIG_SYS_TEXT_BASE. This allows a 64-bit U-Boot to be entered in both 32-bit and 64-bit mode, so we can use the same start code for the SPL and U-Boot proper.
We use the existing custom header (boot0.h) functionality, but now restrict the existing boot0 header reservation to the non-SPL build. An SPL wouldn't need such a header anyway. This allows both options to be defined, so we can use one for the SPL and the other for U-Boot proper.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/boot0.h | 27 +++++++++++++++++++++++++++ board/sunxi/Kconfig | 5 +++++ 2 files changed, 32 insertions(+)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/arch/arm/include/asm/arch-sunxi/boot0.h b/arch/arm/include/asm/arch-sunxi/boot0.h index 6a13db5..7799a03 100644 --- a/arch/arm/include/asm/arch-sunxi/boot0.h +++ b/arch/arm/include/asm/arch-sunxi/boot0.h @@ -4,6 +4,33 @@
- SPDX-License-Identifier: GPL-2.0+
*/
+#if defined(CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER) && !defined(CONFIG_SPL_BUILD) /* reserve space for BOOT0 header information */ b reset .space 1532 +#elif defined(CONFIG_ARM_BOOT_HOOK_RMR) +/* switch into AArch64 if needed */
tst x0, x0 // this is "b #0x84" in ARM
b reset
.space 0x7c
.word 0xe59f1024 // ldr r1, [pc, #36] ; 0x170000a0
.word 0xe59f0024 // ldr r0, [pc, #36] ; CONFIG_*_TEXT_BASE
.word 0xe5810000 // str r0, [r1]
.word 0xf57ff04f // dsb sy
.word 0xf57ff06f // isb sy
.word 0xee1c0f50 // mrc 15, 0, r0, cr12, cr0, {2} ; RMR
.word 0xe3800003 // orr r0, r0, #3
.word 0xee0c0f50 // mcr 15, 0, r0, cr12, cr0, {2} ; RMR
.word 0xf57ff06f // isb sy
.word 0xe320f003 // wfi
.word 0xeafffffd // b @wfi
.word 0x017000a0 // writeable RVBAR mapping address
How come you cannot use the assembler here?
Because this is ARM code, whereas this file is included in an AArch64 build. In contrast to x86 the AArch64 toolchain does not support both bitnesses in one build, mostly because the two architectures are really different.
The actual reason for this exercise is that the Allwinner boot ROM enters the payload in AArch32 mode, but we want to compile and run the SPL in AArch64. So we need some small ARM(32) stub to enter AArch64.
Running the whole SPL in 32-bit is the other option which the later patches enable, but I didn't want to call some 32-bit ARM (cross-)compiler just for this handful of instructions in this case here.
A comment stating that, and how to regenerate that part from an actual assembly source, would be great.
Maxime

To avoid enumerating the very same DRAM values in defconfig files for each and every Allwinner A64 board out there, let's put some sane default values in the Kconfig file. Boards with different needs can override them at any time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 board/sunxi/Kconfig           | 2 ++
 configs/pine64_plus_defconfig | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index ba72e76..d477925 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -159,6 +159,7 @@ config DRAM_CLK
 	default 792 if MACH_SUN9I
 	default 312 if MACH_SUN6I || MACH_SUN8I
 	default 360 if MACH_SUN4I || MACH_SUN5I || MACH_SUN7I
+	default 672 if MACH_SUN50I
 	---help---
 	Set the dram clock speed, valid range 240 - 480 (prior to sun9i),
 	must be a multiple of 24. For the sun9i (A80), the tested values
@@ -178,6 +179,7 @@ config DRAM_ZQ
 	default 123 if MACH_SUN4I || MACH_SUN5I || MACH_SUN6I || MACH_SUN8I
 	default 127 if MACH_SUN7I
 	default 4145117 if MACH_SUN9I
+	default 3881915 if MACH_SUN50I
 	---help---
 	Set the dram zq value.

diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index ea53b96..ebc24b8 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -2,8 +2,6 @@ CONFIG_ARM=y
 CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_MACH_SUN50I=y
-CONFIG_DRAM_CLK=672
-CONFIG_DRAM_ZQ=3881915
 CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
 # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set
 CONFIG_CONSOLE_MUX=y

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
To avoid enumerating the very same DRAM values in defconfig files for each and every Allwinner A64 board out there, let's put some sane default values in the Kconfig file. Boards with different needs can override them at any time.
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 2 ++ configs/pine64_plus_defconfig | 2 -- 2 files changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org

On Mon, Dec 05, 2016 at 01:52:19AM +0000, Andre Przywara wrote:
To avoid enumerating the very same DRAM values in defconfig files for each and every Allwinner A64 board out there, let's put some sane default values in the Kconfig file. Boards with different needs can override them at any time.
Did you check other boards to see what their values were before calling it a default?
Thanks, Maxime

Hi,
On 06/12/16 10:56, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:19AM +0000, Andre Przywara wrote:
To avoid enumerating the very same DRAM values in defconfig files for each and every Allwinner A64 board out there, let's put some sane default values in the Kconfig file. Boards with different needs can override them at any time.
Did you check other boards to see what their values were before calling it a default?
I sampled all the boards (two ;-) I have access to and got a 100% coverage ;-) As far as I know, this 672 MHz seems to be an Allwinner recommendation, though it needs to be worked out how stable this is under load. I have no idea if the ZQ value is similarly common.
If you have any other data, I am all ears and happy to use a different safe default if this is the intention. But we could use a "most common value" approach here to avoid pointless defconfig entries with the same value for lots of boards, so even if there is a board which can't do 672 MHz, for instance, we still put it in Kconfig and let that one board override it.
Cheers, Andre.

On Tue, Dec 06, 2016 at 11:21:26AM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 10:56, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:19AM +0000, Andre Przywara wrote:
To avoid enumerating the very same DRAM values in defconfig files for each and every Allwinner A64 board out there, let's put some sane default values in the Kconfig file. Boards with different needs can override them at any time.
Did you check other boards to see what their values were before calling it a default?
I sampled all the boards (two ;-) I have access to and got a 100% coverage ;-) As far as I know, this 672 MHz seems to be an Allwinner recommendation, though it needs to be worked out how stable this is under load. I have no idea if the ZQ value is similarly common.
If you have any other data, I am all ears and happy to use a different safe default if this is the intention. But we could use a "most common value" approach here to avoid pointless defconfig entries with the same value for lots of boards, so even if there is a board which can't do 672 MHz, for instance, we still put it in Kconfig and let that one board override it.
Fair enough.
Maxime

From: Jens Kuske jenskuske@gmail.com
The IOCR registers got renamed to BDLR to match the public documentation of similar controllers.
Signed-off-by: Jens Kuske jenskuske@gmail.com
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 43 ++++++++++++++-----------
 arch/arm/mach-sunxi/dram_sun8i_h3.c             | 34 +++++++++----------
 2 files changed, 41 insertions(+), 36 deletions(-)

diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h
index d0f2b8a..867fd12 100644
--- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h
+++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h
@@ -81,7 +81,7 @@ struct sunxi_mctl_ctl_reg {
 	u32 rfshtmg;		/* 0x90 refresh timing */
 	u32 rfshctl1;		/* 0x94 */
 	u32 pwrtmg;		/* 0x98 */
-	u8 res3[0x20];		/* 0x9c */
+	u8 res3[0x20];		/* 0x9c */
 	u32 dqsgmr;		/* 0xbc */
 	u32 dtcr;		/* 0xc0 */
 	u32 dtar[4];		/* 0xc4 */
@@ -106,20 +106,23 @@ struct sunxi_mctl_ctl_reg {
 	u32 perfhpr[2];		/* 0x1c4 */
 	u32 perflpr[2];		/* 0x1cc */
 	u32 perfwr[2];		/* 0x1d4 */
-	u8 res8[0x2c];		/* 0x1dc */
-	u32 aciocr;		/* 0x208 */
-	u8 res9[0xf4];		/* 0x20c */
+	u8 res8[0x24];		/* 0x1dc */
+	u32 acmdlr;		/* 0x200 AC master delay line register */
+	u32 aclcdlr;		/* 0x204 AC local calibrated delay line register */
+	u32 aciocr;		/* 0x208 AC I/O configuration register */
+	u8 res9[0x4];		/* 0x20c */
+	u32 acbdlr[31];		/* 0x210 AC bit delay line registers */
+	u8 res10[0x74];		/* 0x28c */
 	struct {		/* 0x300 DATX8 modules*/
-		u32 mdlr;		/* 0x00 */
-		u32 lcdlr[3];		/* 0x04 */
-		u32 iocr[11];		/* 0x10 IO configuration register */
-		u32 bdlr6;		/* 0x3c */
-		u32 gtr;		/* 0x40 */
-		u32 gcr;		/* 0x44 */
-		u32 gsr[3];		/* 0x48 */
+		u32 mdlr;		/* 0x00 master delay line register */
+		u32 lcdlr[3];		/* 0x04 local calibrated delay line registers */
+		u32 bdlr[12];		/* 0x10 bit delay line registers */
+		u32 gtr;		/* 0x40 general timing register */
+		u32 gcr;		/* 0x44 general configuration register */
+		u32 gsr[3];		/* 0x48 general status registers */
 		u8 res0[0x2c];		/* 0x54 */
-	} datx[4];
-	u8 res10[0x388];	/* 0x500 */
+	} dx[4];
+	u8 res11[0x388];	/* 0x500 */
 	u32 upd2;		/* 0x888 */
 };

@@ -174,12 +177,14 @@ struct sunxi_mctl_ctl_reg {

 #define ZQCR_PWRDOWN	(0x1 << 31)	/* ZQ power down */

-#define DATX_IOCR_DQ(x)	(x)		/* DQ0-7 IOCR index */
-#define DATX_IOCR_DM	(8)		/* DM IOCR index */
-#define DATX_IOCR_DQS	(9)		/* DQS IOCR index */
-#define DATX_IOCR_DQSN	(10)		/* DQSN IOCR index */
+#define ACBDLR_WRITE_DELAY(x)	((x) << 8)

-#define DATX_IOCR_WRITE_DELAY(x)	((x) << 8)
-#define DATX_IOCR_READ_DELAY(x)	((x) << 0)
+#define DXBDLR_DQ(x)	(x)		/* DQ0-7 BDLR index */
+#define DXBDLR_DM	(8)		/* DM BDLR index */
+#define DXBDLR_DQS	(9)		/* DQS BDLR index */
+#define DXBDLR_DQSN	(10)		/* DQSN BDLR index */
+
+#define DXBDLR_WRITE_DELAY(x)	((x) << 8)
+#define DXBDLR_READ_DELAY(x)	((x) << 0)

 #endif /* _SUNXI_DRAM_SUN8I_H3_H */
diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c
index b08b8e6..3dd6803 100644
--- a/arch/arm/mach-sunxi/dram_sun8i_h3.c
+++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c
@@ -72,21 +72,21 @@ static void mctl_dq_delay(u32 read, u32 write)
 	u32 val;

 	for (i = 0; i < 4; i++) {
-		val = DATX_IOCR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
-		      DATX_IOCR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
+		val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
+		      DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);

-		for (j = DATX_IOCR_DQ(0); j <= DATX_IOCR_DM; j++)
-			writel(val, &mctl_ctl->datx[i].iocr[j]);
+		for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++)
+			writel(val, &mctl_ctl->dx[i].bdlr[j]);
 	}

 	clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26);

 	for (i = 0; i < 4; i++) {
-		val = DATX_IOCR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
-		      DATX_IOCR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
+		val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
+		      DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);

-		writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQS]);
-		writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQSN]);
+		writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]);
+		writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]);
 	}

 	setbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
@@ -344,7 +344,7 @@ static int mctl_channel_init(struct dram_para *para)

 	/* set dramc odt */
 	for (i = 0; i < 4; i++)
-		clrsetbits_le32(&mctl_ctl->datx[i].gcr, (0x3 << 4) |
+		clrsetbits_le32(&mctl_ctl->dx[i].gcr, (0x3 << 4) |
 				(0x1 << 1) | (0x3 << 2) | (0x3 << 12) |
 				(0x3 << 14),
 				IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x2);
@@ -364,8 +364,8 @@ static int mctl_channel_init(struct dram_para *para)

 	/* set half DQ */
 	if (para->bus_width != 32) {
-		writel(0x0, &mctl_ctl->datx[2].gcr);
-		writel(0x0, &mctl_ctl->datx[3].gcr);
+		writel(0x0, &mctl_ctl->dx[2].gcr);
+		writel(0x0, &mctl_ctl->dx[3].gcr);
 	}

 	/* data training configuration */
@@ -386,17 +386,17 @@ static int mctl_channel_init(struct dram_para *para)
 	/* detect ranks and bus width */
 	if (readl(&mctl_ctl->pgsr[0]) & (0xfe << 20)) {
 		/* only one rank */
-		if (((readl(&mctl_ctl->datx[0].gsr[0]) >> 24) & 0x2) ||
-		    ((readl(&mctl_ctl->datx[1].gsr[0]) >> 24) & 0x2)) {
+		if (((readl(&mctl_ctl->dx[0].gsr[0]) >> 24) & 0x2) ||
+		    ((readl(&mctl_ctl->dx[1].gsr[0]) >> 24) & 0x2)) {
 			clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, 0x1 << 24);
 			para->dual_rank = 0;
 		}

 		/* only half DQ width */
-		if (((readl(&mctl_ctl->datx[2].gsr[0]) >> 24) & 0x1) ||
-		    ((readl(&mctl_ctl->datx[3].gsr[0]) >> 24) & 0x1)) {
-			writel(0x0, &mctl_ctl->datx[2].gcr);
-			writel(0x0, &mctl_ctl->datx[3].gcr);
+		if (((readl(&mctl_ctl->dx[2].gsr[0]) >> 24) & 0x1) ||
+		    ((readl(&mctl_ctl->dx[3].gsr[0]) >> 24) & 0x1)) {
+			writel(0x0, &mctl_ctl->dx[2].gcr);
+			writel(0x0, &mctl_ctl->dx[3].gcr);
 			para->bus_width = 16;
 		}
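
As a side note on the reworked register layout in this patch: the offset comments can be cross-checked on a host machine. The sketch below is not part of the patch. It copies only the tail of the struct from 0x1dc onwards, with a hypothetical `pad` member standing in for everything before `res8`, and lets the compiler confirm that the new reserved-area sizes put each register at the offset named in its comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;
typedef uint32_t u32;

/* Tail of struct sunxi_mctl_ctl_reg as laid out by the patch. */
struct mctl_ctl_tail {
	u8 pad[0x1dc];		/* hypothetical stand-in for 0x000..0x1db */
	u8 res8[0x24];		/* 0x1dc */
	u32 acmdlr;		/* 0x200 */
	u32 aclcdlr;		/* 0x204 */
	u32 aciocr;		/* 0x208 */
	u8 res9[0x4];		/* 0x20c */
	u32 acbdlr[31];		/* 0x210 */
	u8 res10[0x74];		/* 0x28c */
	struct {		/* 0x300, one 0x80-byte block per byte lane */
		u32 mdlr;	/* 0x00 */
		u32 lcdlr[3];	/* 0x04 */
		u32 bdlr[12];	/* 0x10 */
		u32 gtr;	/* 0x40 */
		u32 gcr;	/* 0x44 */
		u32 gsr[3];	/* 0x48 */
		u8 res0[0x2c];	/* 0x54 */
	} dx[4];
	u8 res11[0x388];	/* 0x500 */
	u32 upd2;		/* 0x888 */
};

/* Compile-time checks: a wrong reserved size fails the build. */
static_assert(offsetof(struct mctl_ctl_tail, acmdlr) == 0x200, "acmdlr");
static_assert(offsetof(struct mctl_ctl_tail, aciocr) == 0x208, "aciocr");
static_assert(offsetof(struct mctl_ctl_tail, acbdlr) == 0x210, "acbdlr");
static_assert(offsetof(struct mctl_ctl_tail, dx) == 0x300, "dx");
static_assert(sizeof(((struct mctl_ctl_tail *)0)->dx[0]) == 0x80, "dx size");
static_assert(offsetof(struct mctl_ctl_tail, upd2) == 0x888, "upd2");
```

Note that `aciocr` stays at 0x208 and `upd2` at 0x888, so code that only used the old members should see the same addresses.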

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
From: Jens Kuske jenskuske@gmail.com
The IOCR registers got renamed to BDLR to match the public documentation of similar controllers.
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 43 ++++++++++++++----------- arch/arm/mach-sunxi/dram_sun8i_h3.c | 34 +++++++++---------- 2 files changed, 41 insertions(+), 36 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org
Some suggestions below if you have the energy.
diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index d0f2b8a..867fd12 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -81,7 +81,7 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */
u8 res3[0x20]; /* 0x9c */
u8 res3[0x20]; /* 0x9c */ u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */
@@ -106,20 +106,23 @@ struct sunxi_mctl_ctl_reg { u32 perfhpr[2]; /* 0x1c4 */ u32 perflpr[2]; /* 0x1cc */ u32 perfwr[2]; /* 0x1d4 */
u8 res8[0x2c]; /* 0x1dc */
u32 aciocr; /* 0x208 */
u8 res9[0xf4]; /* 0x20c */
u8 res8[0x24]; /* 0x1dc */
u32 acmdlr; /* 0x200 AC master delay line register */
u32 aclcdlr; /* 0x204 AC local calibrated delay line register */
u32 aciocr; /* 0x208 AC I/O configuration register */
u8 res9[0x4]; /* 0x20c */
u32 acbdlr[31]; /* 0x210 AC bit delay line registers */
u8 res10[0x74]; /* 0x28c */ struct { /* 0x300 DATX8 modules*/
u32 mdlr; /* 0x00 */
u32 lcdlr[3]; /* 0x04 */
u32 iocr[11]; /* 0x10 IO configuration register */
u32 bdlr6; /* 0x3c */
u32 gtr; /* 0x40 */
u32 gcr; /* 0x44 */
u32 gsr[3]; /* 0x48 */
u32 mdlr; /* 0x00 master delay line register */
u32 lcdlr[3]; /* 0x04 local calibrated delay line registers */
u32 bdlr[12]; /* 0x10 bit delay line registers */
u32 gtr; /* 0x40 general timing register */
u32 gcr; /* 0x44 general configuration register */
u32 gsr[3]; /* 0x48 general status registers */ u8 res0[0x2c]; /* 0x54 */
} datx[4];
u8 res10[0x388]; /* 0x500 */
} dx[4];
u8 res11[0x388]; /* 0x500 */ u32 upd2; /* 0x888 */
};
@@ -174,12 +177,14 @@ struct sunxi_mctl_ctl_reg {
#define ZQCR_PWRDOWN (0x1 << 31) /* ZQ power down */
1U << 31
-#define DATX_IOCR_DQ(x) (x) /* DQ0-7 IOCR index */ -#define DATX_IOCR_DM (8) /* DM IOCR index */ -#define DATX_IOCR_DQS (9) /* DQS IOCR index */ -#define DATX_IOCR_DQSN (10) /* DQSN IOCR index */ +#define ACBDLR_WRITE_DELAY(x) ((x) << 8)
Better to have
#define ACBDLR_WRITE_DELAY_SHIFT 8 #define ACBDLR_WRITE_DELAY_MASK (0xff << ACBDLR_WRITE_DELAY_SHIFT)
and then use that in the code. Similarly with other accessors.
-#define DATX_IOCR_WRITE_DELAY(x) ((x) << 8) -#define DATX_IOCR_READ_DELAY(x) ((x) << 0) +#define DXBDLR_DQ(x) (x) /* DQ0-7 BDLR index */ +#define DXBDLR_DM (8) /* DM BDLR index */
Can we drop the unnecessary brackets around constants?
+#define DXBDLR_DQS (9) /* DQS BDLR index */ +#define DXBDLR_DQSN (10) /* DQSN BDLR index */
+#define DXBDLR_WRITE_DELAY(x) ((x) << 8) +#define DXBDLR_READ_DELAY(x) ((x) << 0)
#endif /* _SUNXI_DRAM_SUN8I_H3_H */ diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index b08b8e6..3dd6803 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -72,21 +72,21 @@ static void mctl_dq_delay(u32 read, u32 write) u32 val;
for (i = 0; i < 4; i++) {
val = DATX_IOCR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
DATX_IOCR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
for (j = DATX_IOCR_DQ(0); j <= DATX_IOCR_DM; j++)
writel(val, &mctl_ctl->datx[i].iocr[j]);
for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++)
writel(val, &mctl_ctl->dx[i].bdlr[j]); } clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26); for (i = 0; i < 4; i++) {
val = DATX_IOCR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
DATX_IOCR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQS]);
writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQSN]);
writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]);
writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]); } setbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
@@ -344,7 +344,7 @@ static int mctl_channel_init(struct dram_para *para)
/* set dramc odt */ for (i = 0; i < 4; i++)
clrsetbits_le32(&mctl_ctl->datx[i].gcr, (0x3 << 4) |
clrsetbits_le32(&mctl_ctl->dx[i].gcr, (0x3 << 4) | (0x1 << 1) | (0x3 << 2) | (0x3 << 12) | (0x3 << 14), IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x2);
@@ -364,8 +364,8 @@ static int mctl_channel_init(struct dram_para *para)
/* set half DQ */ if (para->bus_width != 32) {
writel(0x0, &mctl_ctl->datx[2].gcr);
writel(0x0, &mctl_ctl->datx[3].gcr);
writel(0x0, &mctl_ctl->dx[2].gcr);
writel(0x0, &mctl_ctl->dx[3].gcr); } /* data training configuration */
@@ -386,17 +386,17 @@ static int mctl_channel_init(struct dram_para *para) /* detect ranks and bus width */ if (readl(&mctl_ctl->pgsr[0]) & (0xfe << 20)) { /* only one rank */
if (((readl(&mctl_ctl->datx[0].gsr[0]) >> 24) & 0x2) ||
((readl(&mctl_ctl->datx[1].gsr[0]) >> 24) & 0x2)) {
if (((readl(&mctl_ctl->dx[0].gsr[0]) >> 24) & 0x2) ||
Looks like these fields should have #defines also.
((readl(&mctl_ctl->dx[1].gsr[0]) >> 24) & 0x2)) { clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, 0x1 << 24); para->dual_rank = 0; } /* only half DQ width */
if (((readl(&mctl_ctl->datx[2].gsr[0]) >> 24) & 0x1) ||
((readl(&mctl_ctl->datx[3].gsr[0]) >> 24) & 0x1)) {
writel(0x0, &mctl_ctl->datx[2].gcr);
writel(0x0, &mctl_ctl->datx[3].gcr);
if (((readl(&mctl_ctl->dx[2].gsr[0]) >> 24) & 0x1) ||
((readl(&mctl_ctl->dx[3].gsr[0]) >> 24) & 0x1)) {
writel(0x0, &mctl_ctl->dx[2].gcr);
writel(0x0, &mctl_ctl->dx[3].gcr); para->bus_width = 16; }
-- 2.8.2
Regards, Simon
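
For readers comparing the two accessor styles discussed in this review, here is a minimal sketch of the SHIFT/MASK variant together with the `1U << 31` form. It is not code from the patch. The 8-bit field width is taken from the 0xff in the suggestion above rather than from any datasheet, and the two helper functions are hypothetical names added for illustration.

```c
#include <stdint.h>

/* Assumed layout: an 8-bit write delay field at bits 15:8. */
#define ACBDLR_WRITE_DELAY_SHIFT	8
#define ACBDLR_WRITE_DELAY_MASK		(0xffU << ACBDLR_WRITE_DELAY_SHIFT)

/*
 * 1U keeps the shift in unsigned arithmetic. With a 32-bit int,
 * (0x1 << 31) shifts into the sign bit, which C leaves undefined.
 */
#define ZQCR_PWRDOWN			(1U << 31)

/* Replace only the write delay field, leaving the other bits alone. */
static uint32_t acbdlr_set_write_delay(uint32_t reg, uint32_t delay)
{
	reg &= ~ACBDLR_WRITE_DELAY_MASK;
	reg |= (delay << ACBDLR_WRITE_DELAY_SHIFT) & ACBDLR_WRITE_DELAY_MASK;
	return reg;
}

/* Extract the write delay field from a register value. */
static uint32_t acbdlr_get_write_delay(uint32_t reg)
{
	return (reg & ACBDLR_WRITE_DELAY_MASK) >> ACBDLR_WRITE_DELAY_SHIFT;
}
```

The mask form pays off mainly for read-modify-write updates of a field; for building a whole register value in one go, the function-like `((x) << 8)` macro in the patch does the same job with less text.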

On 05/12/16 06:26, Simon Glass wrote:
Hi Simon,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
From: Jens Kuske jenskuske@gmail.com
The IOCR registers got renamed to BDLR to match the public documentation of similar controllers.
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 43 ++++++++++++++----------- arch/arm/mach-sunxi/dram_sun8i_h3.c | 34 +++++++++---------- 2 files changed, 41 insertions(+), 36 deletions(-)
Reviewed-by: Simon Glass sjg@chromium.org
Some suggestions below if you have the energy.
diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index d0f2b8a..867fd12 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -81,7 +81,7 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */
u8 res3[0x20]; /* 0x9c */
u8 res3[0x20]; /* 0x9c */ u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */
@@ -106,20 +106,23 @@ struct sunxi_mctl_ctl_reg { u32 perfhpr[2]; /* 0x1c4 */ u32 perflpr[2]; /* 0x1cc */ u32 perfwr[2]; /* 0x1d4 */
u8 res8[0x2c]; /* 0x1dc */
u32 aciocr; /* 0x208 */
u8 res9[0xf4]; /* 0x20c */
u8 res8[0x24]; /* 0x1dc */
u32 acmdlr; /* 0x200 AC master delay line register */
u32 aclcdlr; /* 0x204 AC local calibrated delay line register */
u32 aciocr; /* 0x208 AC I/O configuration register */
u8 res9[0x4]; /* 0x20c */
u32 acbdlr[31]; /* 0x210 AC bit delay line registers */
u8 res10[0x74]; /* 0x28c */ struct { /* 0x300 DATX8 modules*/
u32 mdlr; /* 0x00 */
u32 lcdlr[3]; /* 0x04 */
u32 iocr[11]; /* 0x10 IO configuration register */
u32 bdlr6; /* 0x3c */
u32 gtr; /* 0x40 */
u32 gcr; /* 0x44 */
u32 gsr[3]; /* 0x48 */
u32 mdlr; /* 0x00 master delay line register */
u32 lcdlr[3]; /* 0x04 local calibrated delay line registers */
u32 bdlr[12]; /* 0x10 bit delay line registers */
u32 gtr; /* 0x40 general timing register */
u32 gcr; /* 0x44 general configuration register */
u32 gsr[3]; /* 0x48 general status registers */ u8 res0[0x2c]; /* 0x54 */
} datx[4];
u8 res10[0x388]; /* 0x500 */
} dx[4];
u8 res11[0x388]; /* 0x500 */ u32 upd2; /* 0x888 */
};
@@ -174,12 +177,14 @@ struct sunxi_mctl_ctl_reg {
#define ZQCR_PWRDOWN (0x1 << 31) /* ZQ power down */
1U << 31
-#define DATX_IOCR_DQ(x) (x) /* DQ0-7 IOCR index */ -#define DATX_IOCR_DM (8) /* DM IOCR index */ -#define DATX_IOCR_DQS (9) /* DQS IOCR index */ -#define DATX_IOCR_DQSN (10) /* DQSN IOCR index */ +#define ACBDLR_WRITE_DELAY(x) ((x) << 8)
Better to have
#define ACBDLR_WRITE_DELAY_SHIFT 8 #define ACBDLR_WRITE_DELAY_MASK (0xff << ACBDLR_WRITE_DELAY_SHIFT)
and then use that in the code. Similarly with other accessors.
I first agreed, but actually doing this makes it more hideous, IMHO. These defines are rather long now (and can't really be shorter), so they push lines past 80 characters and add more brackets. Also, the existing accessors in this file use the same approach, so I'd rather keep it this way for now.
-#define DATX_IOCR_WRITE_DELAY(x) ((x) << 8) -#define DATX_IOCR_READ_DELAY(x) ((x) << 0) +#define DXBDLR_DQ(x) (x) /* DQ0-7 BDLR index */ +#define DXBDLR_DM (8) /* DM BDLR index */
Can we drop the unnecessary brackets around constants?
Yup.
+#define DXBDLR_DQS (9) /* DQS BDLR index */ +#define DXBDLR_DQSN (10) /* DQSN BDLR index */
+#define DXBDLR_WRITE_DELAY(x) ((x) << 8) +#define DXBDLR_READ_DELAY(x) ((x) << 0)
#endif /* _SUNXI_DRAM_SUN8I_H3_H */ diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index b08b8e6..3dd6803 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -72,21 +72,21 @@ static void mctl_dq_delay(u32 read, u32 write) u32 val;
for (i = 0; i < 4; i++) {
val = DATX_IOCR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
DATX_IOCR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
for (j = DATX_IOCR_DQ(0); j <= DATX_IOCR_DM; j++)
writel(val, &mctl_ctl->datx[i].iocr[j]);
for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++)
writel(val, &mctl_ctl->dx[i].bdlr[j]); } clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26); for (i = 0; i < 4; i++) {
val = DATX_IOCR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
DATX_IOCR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQS]);
writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQSN]);
writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]);
writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]); } setbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
@@ -344,7 +344,7 @@ static int mctl_channel_init(struct dram_para *para)
/* set dramc odt */ for (i = 0; i < 4; i++)
clrsetbits_le32(&mctl_ctl->datx[i].gcr, (0x3 << 4) |
clrsetbits_le32(&mctl_ctl->dx[i].gcr, (0x3 << 4) | (0x1 << 1) | (0x3 << 2) | (0x3 << 12) | (0x3 << 14), IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x2);
@@ -364,8 +364,8 @@ static int mctl_channel_init(struct dram_para *para)
/* set half DQ */ if (para->bus_width != 32) {
writel(0x0, &mctl_ctl->datx[2].gcr);
writel(0x0, &mctl_ctl->datx[3].gcr);
writel(0x0, &mctl_ctl->dx[2].gcr);
writel(0x0, &mctl_ctl->dx[3].gcr); } /* data training configuration */
@@ -386,17 +386,17 @@ static int mctl_channel_init(struct dram_para *para) /* detect ranks and bus width */ if (readl(&mctl_ctl->pgsr[0]) & (0xfe << 20)) { /* only one rank */
if (((readl(&mctl_ctl->datx[0].gsr[0]) >> 24) & 0x2) ||
((readl(&mctl_ctl->datx[1].gsr[0]) >> 24) & 0x2)) {
if (((readl(&mctl_ctl->dx[0].gsr[0]) >> 24) & 0x2) ||
Looks like these fields should have #defines also.
They should, only I don't know their meaning.
Cheers, Andre.
((readl(&mctl_ctl->dx[1].gsr[0]) >> 24) & 0x2)) { clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, 0x1 << 24); para->dual_rank = 0; } /* only half DQ width */
if (((readl(&mctl_ctl->datx[2].gsr[0]) >> 24) & 0x1) ||
((readl(&mctl_ctl->datx[3].gsr[0]) >> 24) & 0x1)) {
writel(0x0, &mctl_ctl->datx[2].gcr);
writel(0x0, &mctl_ctl->datx[3].gcr);
if (((readl(&mctl_ctl->dx[2].gsr[0]) >> 24) & 0x1) ||
((readl(&mctl_ctl->dx[3].gsr[0]) >> 24) & 0x1)) {
writel(0x0, &mctl_ctl->dx[2].gcr);
writel(0x0, &mctl_ctl->dx[3].gcr); para->bus_width = 16; }
-- 2.8.2
Regards, Simon

On Mon, Dec 05, 2016 at 01:52:20AM +0000, Andre Przywara wrote:
From: Jens Kuske jenskuske@gmail.com
The IOCR registers got renamed to BDLR to match the public documentation of similar controllers.
It looks like there's a lot more to it.
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 43 ++++++++++++++----------- arch/arm/mach-sunxi/dram_sun8i_h3.c | 34 +++++++++---------- 2 files changed, 41 insertions(+), 36 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index d0f2b8a..867fd12 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -81,7 +81,7 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */
- u8 res3[0x20]; /* 0x9c */
- u8 res3[0x20]; /* 0x9c */
Spurious change?
Thanks, Maxime

From: Jens Kuske jenskuske@gmail.com
Instead of setting the delay for whole bytes, allow setting it for each individual bit. Also add support for address/command lane delays.
Signed-off-by: Jens Kuske jenskuske@gmail.com
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 54 ++++++++++++++++++-------------------
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c
index 3dd6803..1647d76 100644
--- a/arch/arm/mach-sunxi/dram_sun8i_h3.c
+++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c
@@ -16,12 +16,13 @@
 #include <linux/kconfig.h>

 struct dram_para {
-	u32 read_delays;
-	u32 write_delays;
 	u16 page_size;
 	u8 bus_width;
 	u8 dual_rank;
 	u8 row_bits;
+	const u8 dx_read_delays[4][11];
+	const u8 dx_write_delays[4][11];
+	const u8 ac_delays[31];
 };

 static inline int ns_to_t(int nanoseconds)
@@ -64,34 +65,25 @@ static void mctl_phy_init(u32 val)
 	mctl_await_completion(&mctl_ctl->pgsr[0], PGSR_INIT_DONE, 0x1);
 }

-static void mctl_dq_delay(u32 read, u32 write)
+static void mctl_set_bit_delays(struct dram_para *para)
 {
 	struct sunxi_mctl_ctl_reg * const mctl_ctl =
 			(struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
 	int i, j;
-	u32 val;
-
-	for (i = 0; i < 4; i++) {
-		val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
-		      DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
-
-		for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++)
-			writel(val, &mctl_ctl->dx[i].bdlr[j]);
-	}

 	clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26);

-	for (i = 0; i < 4; i++) {
-		val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
-		      DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
+	for (i = 0; i < 4; i++)
+		for (j = 0; j < 11; j++)
+			writel(DXBDLR_WRITE_DELAY(para->dx_write_delays[i][j]) |
+			       DXBDLR_READ_DELAY(para->dx_read_delays[i][j]),
+			       &mctl_ctl->dx[i].bdlr[j]);

-		writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]);
-		writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]);
-	}
+	for (i = 0; i < 31; i++)
+		writel(ACBDLR_WRITE_DELAY(para->ac_delays[i]),
+		       &mctl_ctl->acbdlr[i]);

 	setbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
-
-	udelay(1);
 }

 static void mctl_set_master_priority(void)
@@ -372,11 +364,8 @@ static int mctl_channel_init(struct dram_para *para)
 	clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24,
 			(para->dual_rank ? 0x3 : 0x1) << 24);

-
-	if (para->read_delays || para->write_delays) {
-		mctl_dq_delay(para->read_delays, para->write_delays);
-		udelay(50);
-	}
+	mctl_set_bit_delays(para);
+	udelay(50);

 	mctl_zq_calibration(para);

@@ -458,12 +447,23 @@ unsigned long sunxi_dram_init(void)
 			(struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;

 	struct dram_para para = {
-		.read_delays = 0x00007979,	/* dram_tpr12 */
-		.write_delays = 0x6aaa0000,	/* dram_tpr11 */
 		.dual_rank = 0,
 		.bus_width = 32,
 		.row_bits = 15,
 		.page_size = 4096,
+
+		.dx_read_delays = {{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 },
+				   { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 },
+				   { 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 },
+				   { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }},
+		.dx_write_delays = {{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
+				    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
+				    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
+				    { 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6 }},
+		.ac_delays = { 0, 0, 0, 0, 0, 0, 0, 0,
+			       0, 0, 0, 0, 0, 0, 0, 0,
+			       0, 0, 0, 0, 0, 0, 0, 0,
+			       0, 0, 0, 0, 0, 0, 0 },
 	};
mctl_sys_init(&para);
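For illustration only (this is not part of the patch): the new per-bit tables look like a direct expansion of the old packed dram_tpr11/dram_tpr12 nibbles. The sketch below repeats the arithmetic of the removed mctl_dq_delay() as standalone C. The names expand_packed_delays(), packed_values_match_patch() and the NR_* constants are made up here, and the index layout (0..8 for DQ0-DQ7 plus DM, 9 and 10 for DQS and DQSN) is an assumption read off the old loop bounds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the patch itself uses the bare numbers 4 and 11. */
#define NR_BYTE_LANES	4
#define NR_LANE_BITS	11
#define NR_DATA_BITS	9	/* assumed: DQ0..DQ7 plus DM, then DQS and DQSN */

/* Expand the packed nibbles the way the removed mctl_dq_delay() used them. */
static void expand_packed_delays(uint32_t read, uint32_t write,
				 uint8_t rd[NR_BYTE_LANES][NR_LANE_BITS],
				 uint8_t wr[NR_BYTE_LANES][NR_LANE_BITS])
{
	int i, j;

	for (i = 0; i < NR_BYTE_LANES; i++) {
		for (j = 0; j < NR_LANE_BITS; j++) {
			/* data bits use the low half-word, strobes the high one */
			int shift = (j < NR_DATA_BITS ? 0 : 16) + i * 4;
			uint8_t r;

			assert(shift < 32);
			r = (read >> shift) & 0xf;

			/* the old code doubled the read delay for data bits only */
			rd[i][j] = j < NR_DATA_BITS ? r * 2 : r;
			wr[i][j] = (write >> shift) & 0xf;
		}
	}
}

/* Returns 1 if the old H3 values expand to the tables added by the patch. */
static int packed_values_match_patch(void)
{
	static const uint8_t want_rd[NR_BYTE_LANES][NR_LANE_BITS] = {
		{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 },
		{ 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 },
		{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 },
		{ 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 },
	};
	static const uint8_t want_wr[NR_BYTE_LANES][NR_LANE_BITS] = {
		{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
		{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
		{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
		{ 0, 0, 0, 0, 0, 0, 0, 0, 0,  6,  6 },
	};
	uint8_t rd[NR_BYTE_LANES][NR_LANE_BITS];
	uint8_t wr[NR_BYTE_LANES][NR_LANE_BITS];

	/* old values: dram_tpr12 (read) and dram_tpr11 (write) */
	expand_packed_delays(0x00007979, 0x6aaa0000, rd, wr);

	return !memcmp(rd, want_rd, sizeof(rd)) &&
	       !memcmp(wr, want_wr, sizeof(wr));
}
```

If the assumed layout is right, the H3 behaviour should be unchanged by the patch, apart from the delays now always being written and the trailing udelay(1) being dropped.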

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
From: Jens Kuske jenskuske@gmail.com
Instead of setting the delay for whole bytes allow setting it for each individual bit. Also add support for address/command lane delays.
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/mach-sunxi/dram_sun8i_h3.c | 54 ++++++++++++++++++------------------- 1 file changed, 27 insertions(+), 27 deletions(-)
ACBDLR_WRITE_DELAY_SHIFT
diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 3dd6803..1647d76 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -16,12 +16,13 @@ #include <linux/kconfig.h>
struct dram_para {
- u32 read_delays;
- u32 write_delays;
u16 page_size; u8 bus_width; u8 dual_rank; u8 row_bits;
+ const u8 dx_read_delays[4][11];
Can we have #defines for 4 and 11?
+ const u8 dx_write_delays[4][11];
+ const u8 ac_delays[31];
};
static inline int ns_to_t(int nanoseconds) @@ -64,34 +65,25 @@ static void mctl_phy_init(u32 val) mctl_await_completion(&mctl_ctl->pgsr[0], PGSR_INIT_DONE, 0x1); }
-static void mctl_dq_delay(u32 read, u32 write) +static void mctl_set_bit_delays(struct dram_para *para) { struct sunxi_mctl_ctl_reg * const mctl_ctl = (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; int i, j;
- u32 val;
- for (i = 0; i < 4; i++) {
- val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
- DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
- for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++)
- writel(val, &mctl_ctl->dx[i].bdlr[j]);
- }
clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
- for (i = 0; i < 4; i++) {
- val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
- DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
+ for (i = 0; i < 4; i++)
+ for (j = 0; j < 11; j++)
+ writel(DXBDLR_WRITE_DELAY(para->dx_write_delays[i][j]) |
+ DXBDLR_READ_DELAY(para->dx_read_delays[i][j]),
+ &mctl_ctl->dx[i].bdlr[j]);
- writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]);
- writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]);
- }
+ for (i = 0; i < 31; i++)
+ writel(ACBDLR_WRITE_DELAY(para->ac_delays[i]),
+ &mctl_ctl->acbdlr[i]);
setbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
- udelay(1);
}
static void mctl_set_master_priority(void) @@ -372,11 +364,8 @@ static int mctl_channel_init(struct dram_para *para) clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, (para->dual_rank ? 0x3 : 0x1) << 24);
- if (para->read_delays || para->write_delays) {
- mctl_dq_delay(para->read_delays, para->write_delays);
- udelay(50);
- }
+ mctl_set_bit_delays(para);
+ udelay(50);
mctl_zq_calibration(para);
@@ -458,12 +447,23 @@ unsigned long sunxi_dram_init(void) (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
struct dram_para para = {
- .read_delays = 0x00007979, /* dram_tpr12 */
- .write_delays = 0x6aaa0000, /* dram_tpr11 */
.dual_rank = 0, .bus_width = 32, .row_bits = 15, .page_size = 4096,
+ .dx_read_delays = {{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 },
+ { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 },
+ { 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 },
+ { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }},
+ .dx_write_delays = {{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6 }},
+ .ac_delays = { 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0 },
};
mctl_sys_init(&para);
-- 2.8.2
I wonder if there is value in moving this to device tree with of-platdata?
Regards, Simon
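As a rough sketch of what the requested #defines could look like (the constant and struct names below are hypothetical; the thread does not settle on any), the table sizes can be named once and reused for both the struct and the loops:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; nothing in the thread fixes them. */
#define NR_OF_BYTE_LANES	4	/* 32-bit bus, one lane per byte */
#define LINES_PER_BYTE_LANE	11	/* bdlr[] entries written per lane */
#define NR_OF_AC_LINES		31	/* acbdlr[] entries written */

/* Same fields as the patched struct dram_para, sized by the constants. */
struct dram_para_sketch {
	uint16_t page_size;
	uint8_t bus_width;
	uint8_t dual_rank;
	uint8_t row_bits;
	uint8_t dx_read_delays[NR_OF_BYTE_LANES][LINES_PER_BYTE_LANE];
	uint8_t dx_write_delays[NR_OF_BYTE_LANES][LINES_PER_BYTE_LANE];
	uint8_t ac_delays[NR_OF_AC_LINES];
};

/* The table shape follows the constants, so a mismatch fails at build time. */
static_assert(sizeof(((struct dram_para_sketch *)0)->dx_read_delays) ==
	      NR_OF_BYTE_LANES * LINES_PER_BYTE_LANE,
	      "one read delay per line and lane");

/* Count the delay registers a table of this shape would program. */
static size_t nr_delay_registers(void)
{
	/* read and write delays share one bdlr register per line */
	return (size_t)NR_OF_BYTE_LANES * LINES_PER_BYTE_LANE + NR_OF_AC_LINES;
}
```

The loops in mctl_set_bit_delays() could then use the same constants as their bounds instead of the literals 4, 11 and 31.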

On Mon, Dec 5, 2016 at 2:26 PM, Simon Glass sjg@chromium.org wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
[...]
I wonder if there is value in moving this to device tree with of-platdata?
I think device tree support is unlikely to fit in SPL for sunxi. IIRC Andre already mentions the space constraints in his cover letter.
ChenYu

Hi,
On 05/12/16 07:58, Chen-Yu Tsai wrote:
On Mon, Dec 5, 2016 at 2:26 PM, Simon Glass sjg@chromium.org wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
[...]
I wonder if there is value in moving this to device tree with of-platdata?
While I kind of like the idea of using the DT for this, there are some issues:
1) There is no binding so far for representing the DRAM data. Given the lacking documentation for the DRAM controller it sounds very hard to come up with a good binding anyway. Also we can't push this through the Linux DT binding review, since this is of no interest to the kernel. And I'd rather avoid making up some dodgy binding just for this.
There is work underway to improve the DRAM init code and make it more robust and flexible. Ideally we can use some autodetection and calibration feature the controller offers to get rid of arbitrary magic numbers. But this is quite some work ahead and shouldn't block the much sought after A64 SPL support for now.
2) If there is need, we can detect the SoC easily by reading the ID register and differentiate at runtime. This is probably less code than pulling in DT bits, also more robust.
I think device tree support is unlikely to fit in SPL for sunxi. IIRC Andre already mentions the space constraints in his cover letter.
3) Yes, adding DT support for the SPL makes it rather big. I think it breaks the 28K limit that the mksunxiboot tool currently has. This can (and will) be fixed later, but just for this exercise I'd rather keep it small, especially as we would use it only for the DRAM code and not for the device drivers.
Actually I have a plan to make better use of DT, but not for the SPL. To a good degree the SPL code mimics the on-SoC boot ROM operation (accessing storage devices to load code), which has to work with every board already and thus does not need a board specific DT. I can elaborate on that if there is interest.
Cheers, Andre.
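To make point 2) concrete, here is a minimal sketch of differentiating at runtime rather than with #ifdefs. The H3 and A64 values are the ones the A64 patch in this series puts behind CONFIG_MACH_SUN8I_H3 and CONFIG_MACH_SUN50I; the enum, struct and function names are hypothetical, and how the ID register value maps onto the enum is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical SoC tags; decoding the ID register into them is not shown. */
enum soc_kind { SOC_H3, SOC_A64 };

struct soc_dram_quirks {
	uint32_t pgcr2_phase;	/* value for the dphy/aphy phase select field */
	int h3_zq_quirk;	/* non-zero: run the H3-only ZQ calibration */
};

/* One row per SoC, replacing the compile-time #if blocks. */
static const struct soc_dram_quirks quirks[] = {
	[SOC_H3]  = { (0x1 << 10) | (0x2 << 8), 1 },
	[SOC_A64] = { (0x0 << 10) | (0x3 << 8), 0 },
};

/* Look up the per-SoC settings; the caller passes the detected SoC. */
static const struct soc_dram_quirks *dram_quirks_for(enum soc_kind soc)
{
	assert((size_t)soc < sizeof(quirks) / sizeof(quirks[0]));
	return &quirks[soc];
}
```

A table like this costs a few words of data per SoC, which is the trade-off against pulling DT parsing into the SPL.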

Hi Andre,
[...]
I wonder if there is value in moving this to device tree with of-platdata?
While I kind of like the idea of using the DT for this, there are some issues:
- There is no binding so far for representing the DRAM data. Given the lacking documentation for the DRAM controller it sounds very hard to come up with a good binding anyway. Also we can't push this through the Linux DT binding review, since this is of no interest to the kernel. And I'd rather avoid making up some dodgy binding just for this.
There is work underway to improve the DRAM init code and make it more robust and flexible. Ideally we can use some autodetection and calibration feature the controller offers to get rid of arbitrary magic numbers. But this is quite some work ahead and shouldn't block the much sought after A64 SPL support for now.
- If there is need, we can detect the SoC easily by reading the ID register and differentiate at runtime. This is probably less code than pulling in DT bits, also more robust.
I think device tree support is unlikely to fit in SPL for sunxi. IIRC Andre already mentions the space constraints in his cover letter.
- Yes, adding DT support for the SPL makes it rather big. I think it breaks the 28K limit that the mksunxiboot tool currently has. This can (and will) be fixed later, but just for this exercise I'd rather keep it small, especially as we would use it only for the DRAM code and not for the device drivers.
Take a look at rk3288-firefly if you like. It has an ad-hoc device tree binding (no one has the energy to try to get this into Linux :-).
With of-platdata, DT support doesn't actually add any space (or at least very little). There is no libfdt and the only code is that needed to copy data from the of-platdata struct to the normal one.
That said, there has to be a benefit, and it's much more desirable to spend the time on this IMO:
Actually I have a plan to make better use of DT, but not for the SPL. To a good degree the SPL code mimics the on-SoC boot ROM operation (accessing storage devices to load code), which has to work with every board already and thus does not need a board specific DT. I can elaborate on that if there is interest.
Regards, Simon
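A minimal sketch of the "copy data from the of-platdata struct to the normal one" step described above, assuming the generated struct carries the delays as 32-bit cells. Both struct names and the field layout are hypothetical stand-ins; no binding for this controller exists, so the real generated struct would look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_AC_DELAYS	31

/* Hypothetical stand-in for a struct generated from the DT at build time. */
struct dram_platdata_sketch {
	uint32_t ac_delays[NR_AC_DELAYS];	/* DT cells are 32 bits wide */
};

/* Hypothetical stand-in for the driver's own parameter struct. */
struct dram_para_sketch {
	uint8_t ac_delays[NR_AC_DELAYS];
};

/* Copy the cells into the byte-wide table; returns -1 if one does not fit. */
static int copy_ac_delays(struct dram_para_sketch *para,
			  const struct dram_platdata_sketch *plat)
{
	size_t i;

	for (i = 0; i < NR_AC_DELAYS; i++) {
		if (plat->ac_delays[i] > UINT8_MAX)
			return -1;
		para->ac_delays[i] = (uint8_t)plat->ac_delays[i];
	}

	return 0;
}

/* Small demo: copy a table and return the third entry, or -1 on failure. */
static int platdata_copy_demo(void)
{
	struct dram_platdata_sketch plat = { .ac_delays = { 1, 2, 3 } };
	struct dram_para_sketch para;

	if (copy_ac_delays(&para, &plat))
		return -1;

	return para.ac_delays[2];
}
```

No libfdt call appears anywhere in this path, which is the point being made about code size.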

On 07/12/16 03:48, Simon Glass wrote:
Hi Andre,
[...]
Take a look at rk3288-firefly if you like. It has an ad-hoc device tree binding (no one has the energy to try to get this into Linux :-).
I found some lpddr2 binding in Linux, I guess we can use these as a template. But ....
With of-platdata, DT support doesn't actually add any space (or at least very little). There is no libfdt and the only code is that needed to copy data from the of-platdata struct to the normal one.
That said, there has to be a benefit, and it's much more desirable to spend the time on this IMO:
I think there is some benefit, but as you hinted it takes more time. My understanding is that these parameters are actually board specific, although a) nobody cared so much before and just went with the same Allwinner provided values for every board and b) many vendors copy the DRAM trace layout and thus share the same values here.
So:
1) We would need to work out what parameters we actually need.
2) Also which are a SoC property, which are board specific and which are DRAM chip dependent. For instance we often see the same chips used on different boards, also similar layouts across boards. Technically the amount of DRAM also matters, as that means different chips in possibly different configurations (Pine64 1GB with 2x16 bits vs. the 2GB version with 4x8 bits).
3) We need to learn how much we can actually auto-detect. The DRAM controller seems to have some facilities for this, which may be worth exploring.
There are some patches to rework and improve the DRAM setup, so I guess we need to revisit this anyway. I tried to merge them in here, but gave up because I think they need more love. So for the sake of getting good-enough support for the Pine64 board now, I'd rather go with these fixed values and postpone this discussion.
Cheers, Andre.

On Mon, Dec 05, 2016 at 01:52:21AM +0000, Andre Przywara wrote:
From: Jens Kuske jenskuske@gmail.com
Instead of setting the delay for whole bytes allow setting it for each individual bit. Also add support for address/command lane delays.
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/mach-sunxi/dram_sun8i_h3.c | 54 ++++++++++++++++++------------------- 1 file changed, 27 insertions(+), 27 deletions(-)
diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 3dd6803..1647d76 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -16,12 +16,13 @@ #include <linux/kconfig.h>
struct dram_para {
- u32 read_delays;
- u32 write_delays;
u16 page_size; u8 bus_width; u8 dual_rank; u8 row_bits;
+ const u8 dx_read_delays[4][11];
+ const u8 dx_write_delays[4][11];
+ const u8 ac_delays[31];
};
Some documentation on what is the expected format and what it corresponds to would be welcome.
static inline int ns_to_t(int nanoseconds) @@ -64,34 +65,25 @@ static void mctl_phy_init(u32 val) mctl_await_completion(&mctl_ctl->pgsr[0], PGSR_INIT_DONE, 0x1); }
-static void mctl_dq_delay(u32 read, u32 write) +static void mctl_set_bit_delays(struct dram_para *para) { struct sunxi_mctl_ctl_reg * const mctl_ctl = (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; int i, j;
- u32 val;
- for (i = 0; i < 4; i++) {
- val = DXBDLR_WRITE_DELAY((write >> (i * 4)) & 0xf) |
- DXBDLR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2);
- for (j = DXBDLR_DQ(0); j <= DXBDLR_DM; j++)
- writel(val, &mctl_ctl->dx[i].bdlr[j]);
- }
clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
- for (i = 0; i < 4; i++) {
- val = DXBDLR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) |
- DXBDLR_READ_DELAY((read >> (16 + i * 4)) & 0xf);
+ for (i = 0; i < 4; i++)
+ for (j = 0; j < 11; j++)
+ writel(DXBDLR_WRITE_DELAY(para->dx_write_delays[i][j]) |
+ DXBDLR_READ_DELAY(para->dx_read_delays[i][j]),
+ &mctl_ctl->dx[i].bdlr[j]);
- writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQS]);
- writel(val, &mctl_ctl->dx[i].bdlr[DXBDLR_DQSN]);
- }
+ for (i = 0; i < 31; i++)
+ writel(ACBDLR_WRITE_DELAY(para->ac_delays[i]),
+ &mctl_ctl->acbdlr[i]);
setbits_le32(&mctl_ctl->pgcr[0], 1 << 26);
- udelay(1);
}
static void mctl_set_master_priority(void) @@ -372,11 +364,8 @@ static int mctl_channel_init(struct dram_para *para) clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, (para->dual_rank ? 0x3 : 0x1) << 24);
- if (para->read_delays || para->write_delays) {
- mctl_dq_delay(para->read_delays, para->write_delays);
- udelay(50);
- }
+ mctl_set_bit_delays(para);
+ udelay(50);
mctl_zq_calibration(para);
@@ -458,12 +447,23 @@ unsigned long sunxi_dram_init(void) (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
struct dram_para para = {
- .read_delays = 0x00007979, /* dram_tpr12 */
- .write_delays = 0x6aaa0000, /* dram_tpr11 */
.dual_rank = 0, .bus_width = 32, .row_bits = 15, .page_size = 4096,
+ .dx_read_delays = {{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 },
+ { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 },
+ { 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 },
+ { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }},
+ .dx_write_delays = {{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10 },
+ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 6 }},
+ .ac_delays = { 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0 },
You're mixing tabs and spaces for the indentation, and the tab before that bracket looks useless.
Thanks, Maxime

From: Jens Kuske jenskuske@gmail.com
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes. [Andre: fixed up typo, merged in fixes from Jens]
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 + arch/arm/include/asm/arch-sunxi/dram.h | 2 +- arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 10 +- arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/clock_sun6i.c | 2 +- arch/arm/mach-sunxi/dram_sun8i_h3.c | 139 +++++++++++++++++++----- 6 files changed, 123 insertions(+), 32 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h index be9fcfd..3f87672 100644 --- a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h +++ b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h @@ -322,6 +322,7 @@ struct sunxi_ccm_reg { #define CCM_DRAMCLK_CFG_DIV0_MASK (0xf << 8) #define CCM_DRAMCLK_CFG_SRC_PLL5 (0x0 << 20) #define CCM_DRAMCLK_CFG_SRC_PLL6x2 (0x1 << 20) +#define CCM_DRAMCLK_CFG_SRC_PLL11 (0x1 << 20) /* A64 only */ #define CCM_DRAMCLK_CFG_SRC_MASK (0x3 << 20) #define CCM_DRAMCLK_CFG_UPD (0x1 << 16) #define CCM_DRAMCLK_CFG_RST (0x1 << 31) diff --git a/arch/arm/include/asm/arch-sunxi/dram.h b/arch/arm/include/asm/arch-sunxi/dram.h index e0be744..53e6d47 100644 --- a/arch/arm/include/asm/arch-sunxi/dram.h +++ b/arch/arm/include/asm/arch-sunxi/dram.h @@ -24,7 +24,7 @@ #include <asm/arch/dram_sun8i_a33.h> #elif defined(CONFIG_MACH_SUN8I_A83T) #include <asm/arch/dram_sun8i_a83t.h> -#elif defined(CONFIG_MACH_SUN8I_H3) +#elif defined(CONFIG_MACH_SUN8I_H3) || defined(CONFIG_MACH_SUN50I) #include <asm/arch/dram_sun8i_h3.h> #elif defined(CONFIG_MACH_SUN9I) #include <asm/arch/dram_sun9i.h> diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index 867fd12..b0e5d93 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -15,7 +15,8 @@
struct sunxi_mctl_com_reg { u32 cr; /* 0x00 control register */ - u8 res0[0xc]; /* 0x04 */ + u8 res0[0x8]; /* 0x04 */ + u32 tmr; /* 0x0c (A64 only) */ u32 mcr[16][2]; /* 0x10 */ u32 bwcr; /* 0x90 bandwidth control register */ u32 maer; /* 0x94 master enable register */ @@ -32,7 +33,9 @@ struct sunxi_mctl_com_reg { u32 swoffr; /* 0xc4 */ u8 res2[0x8]; /* 0xc8 */ u32 cccr; /* 0xd0 */ - u8 res3[0x72c]; /* 0xd4 */ + u8 res3[0x54]; /* 0xd4 */ + u32 mdfs_bwlr[3]; /* 0x128 (A64 only) */ + u8 res4[0x6cc]; /* 0x134 */ u32 protect; /* 0x800 */ };
@@ -81,7 +84,8 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */ - u8 res3[0x20]; /* 0x9c */ + u8 res3[0x1c]; /* 0x9c */ + u32 vtfcr; /* 0xb8 (A64 only) */ u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */ diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile index e73114e..7daba11 100644 --- a/arch/arm/mach-sunxi/Makefile +++ b/arch/arm/mach-sunxi/Makefile @@ -50,4 +50,5 @@ obj-$(CONFIG_MACH_SUN8I_A33) += dram_sun8i_a33.o obj-$(CONFIG_MACH_SUN8I_A83T) += dram_sun8i_a83t.o obj-$(CONFIG_MACH_SUN8I_H3) += dram_sun8i_h3.o obj-$(CONFIG_MACH_SUN9I) += dram_sun9i.o +obj-$(CONFIG_MACH_SUN50I) += dram_sun8i_h3.o endif diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index 80cfc0b..99f515d 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -217,7 +217,7 @@ done: } #endif
-#ifdef CONFIG_MACH_SUN8I_A33 +#if defined(CONFIG_MACH_SUN8I_A33) || defined(CONFIG_MACH_SUN50I) void clock_set_pll11(unsigned int clk, bool sigma_delta_enable) { struct sunxi_ccm_reg * const ccm = diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 1647d76..2dc2071 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -32,30 +32,6 @@ static inline int ns_to_t(int nanoseconds) return DIV_ROUND_UP(ctrl_freq * nanoseconds, 1000); }
-static u32 bin_to_mgray(int val) -{ - static const u8 lookup_table[32] = { - 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, - 0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09, - 0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d, - 0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11, - }; - - return lookup_table[clamp(val, 0, 31)]; -} - -static int mgray_to_bin(u32 val) -{ - static const u8 lookup_table[32] = { - 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, - 0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b, - 0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b, - 0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15, - }; - - return lookup_table[val & 0x1f]; -} - static void mctl_phy_init(u32 val) { struct sunxi_mctl_ctl_reg * const mctl_ctl = @@ -91,8 +67,9 @@ static void mctl_set_master_priority(void) struct sunxi_mctl_com_reg * const mctl_com = (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE;
+#if defined(CONFIG_MACH_SUN8I_H3) /* enable bandwidth limit windows and set windows size 1us */ - writel(0x00010190, &mctl_com->bwcr); + writel((1 << 16) | (400 << 0), &mctl_com->bwcr);
/* set cpu high priority */ writel(0x00000001, &mctl_com->mapr); @@ -121,6 +98,38 @@ static void mctl_set_master_priority(void) writel(0x04001800, &mctl_com->mcr[10][1]); writel(0x04000009, &mctl_com->mcr[11][0]); writel(0x00400120, &mctl_com->mcr[11][1]); +#elif defined(CONFIG_MACH_SUN50I) + /* enable bandwidth limit windows and set windows size 1us */ + writel(399, &mctl_com->tmr); + writel((1 << 16), &mctl_com->bwcr); + + writel(0x00a0000d, &mctl_com->mcr[0][0]); + writel(0x00500064, &mctl_com->mcr[0][1]); + writel(0x06000009, &mctl_com->mcr[1][0]); + writel(0x01000578, &mctl_com->mcr[1][1]); + writel(0x0200000d, &mctl_com->mcr[2][0]); + writel(0x00600100, &mctl_com->mcr[2][1]); + writel(0x01000009, &mctl_com->mcr[3][0]); + writel(0x00500064, &mctl_com->mcr[3][1]); + writel(0x07000009, &mctl_com->mcr[4][0]); + writel(0x01000640, &mctl_com->mcr[4][1]); + writel(0x01000009, &mctl_com->mcr[5][0]); + writel(0x00000080, &mctl_com->mcr[5][1]); + writel(0x01000009, &mctl_com->mcr[6][0]); + writel(0x00400080, &mctl_com->mcr[6][1]); + writel(0x0100000d, &mctl_com->mcr[7][0]); + writel(0x00400080, &mctl_com->mcr[7][1]); + writel(0x0100000d, &mctl_com->mcr[8][0]); + writel(0x00400080, &mctl_com->mcr[8][1]); + writel(0x04000009, &mctl_com->mcr[9][0]); + writel(0x00400100, &mctl_com->mcr[9][1]); + writel(0x20000209, &mctl_com->mcr[10][0]); + writel(0x08001800, &mctl_com->mcr[10][1]); + writel(0x05000009, &mctl_com->mcr[11][0]); + writel(0x00400090, &mctl_com->mcr[11][1]); + + writel(0x81000004, &mctl_com->mdfs_bwlr[2]); +#endif }
static void mctl_set_timing_params(struct dram_para *para) @@ -204,7 +213,32 @@ static void mctl_set_timing_params(struct dram_para *para) writel(RFSHTMG_TREFI(trefi) | RFSHTMG_TRFC(trfc), &mctl_ctl->rfshtmg); }
-static void mctl_zq_calibration(struct dram_para *para) +#ifdef CONFIG_MACH_SUN8I_H3 +static u32 bin_to_mgray(int val) +{ + static const u8 lookup_table[32] = { + 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, + 0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09, + 0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d, + 0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11, + }; + + return lookup_table[clamp(val, 0, 31)]; +} + +static int mgray_to_bin(u32 val) +{ + static const u8 lookup_table[32] = { + 0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05, + 0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b, + 0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b, + 0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15, + }; + + return lookup_table[val & 0x1f]; +} + +static void mctl_h3_zq_calibration_quirk(struct dram_para *para) { struct sunxi_mctl_ctl_reg * const mctl_ctl = (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; @@ -261,6 +295,7 @@ static void mctl_zq_calibration(struct dram_para *para) writel((zq_val[5] << 16) | zq_val[4], &mctl_ctl->zqdr[2]); } } +#endif
static void mctl_set_cr(struct dram_para *para) { @@ -286,16 +321,27 @@ static void mctl_sys_init(struct dram_para *para) clrbits_le32(&ccm->ahb_gate0, 1 << AHB_GATE_OFFSET_MCTL); clrbits_le32(&ccm->ahb_reset0_cfg, 1 << AHB_RESET_OFFSET_MCTL); clrbits_le32(&ccm->pll5_cfg, CCM_PLL5_CTRL_EN); +#ifdef CONFIG_MACH_SUN50I + clrbits_le32(&ccm->pll11_cfg, CCM_PLL11_CTRL_EN); +#endif udelay(10);
clrbits_le32(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_RST); udelay(1000);
+#ifdef CONFIG_MACH_SUN50I + clock_set_pll11(CONFIG_DRAM_CLK * 2 * 1000000, false); + clrsetbits_le32(&ccm->dram_clk_cfg, + CCM_DRAMCLK_CFG_DIV_MASK | CCM_DRAMCLK_CFG_SRC_MASK, + CCM_DRAMCLK_CFG_DIV(1) | CCM_DRAMCLK_CFG_SRC_PLL11 | + CCM_DRAMCLK_CFG_UPD); +#else clock_set_pll5(CONFIG_DRAM_CLK * 2 * 1000000, false); clrsetbits_le32(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_DIV_MASK | CCM_DRAMCLK_CFG_SRC_MASK, CCM_DRAMCLK_CFG_DIV(1) | CCM_DRAMCLK_CFG_SRC_PLL5 | CCM_DRAMCLK_CFG_UPD); +#endif mctl_await_completion(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_UPD, 0);
setbits_le32(&ccm->ahb_reset0_cfg, 1 << AHB_RESET_OFFSET_MCTL); @@ -347,12 +393,18 @@ static int mctl_channel_init(struct dram_para *para) /* set DQS auto gating PD mode */ setbits_le32(&mctl_ctl->pgcr[2], 0x3 << 6);
+#if defined(CONFIG_MACH_SUN8I_H3) /* dx ddr_clk & hdr_clk dynamic mode */ clrbits_le32(&mctl_ctl->pgcr[0], (0x3 << 14) | (0x3 << 12));
/* dphy & aphy phase select 270 degree */ clrsetbits_le32(&mctl_ctl->pgcr[2], (0x3 << 10) | (0x3 << 8), (0x1 << 10) | (0x2 << 8)); +#elif defined(CONFIG_MACH_SUN50I) + /* dphy & aphy phase select ? */ + clrsetbits_le32(&mctl_ctl->pgcr[2], (0x3 << 10) | (0x3 << 8), + (0x0 << 10) | (0x3 << 8)); +#endif
/* set half DQ */ if (para->bus_width != 32) { @@ -367,10 +419,17 @@ static int mctl_channel_init(struct dram_para *para) mctl_set_bit_delays(para); udelay(50);
- mctl_zq_calibration(para); +#ifdef CONFIG_MACH_SUN8I_H3 + mctl_h3_zq_calibration_quirk(para);
mctl_phy_init(PIR_PLLINIT | PIR_DCAL | PIR_PHYRST | PIR_DRAMRST | PIR_DRAMINIT | PIR_QSGATE); +#else + clrsetbits_le32(&mctl_ctl->zqcr, 0xffffff, CONFIG_DRAM_ZQ); + + mctl_phy_init(PIR_ZCAL | PIR_PLLINIT | PIR_DCAL | PIR_PHYRST | + PIR_DRAMRST | PIR_DRAMINIT | PIR_QSGATE); +#endif
/* detect ranks and bus width */ if (readl(&mctl_ctl->pgsr[0]) & (0xfe << 20)) { @@ -408,7 +467,11 @@ static int mctl_channel_init(struct dram_para *para) udelay(10);
/* set PGCR3, CKE polarity */ +#ifdef CONFIG_MACH_SUN50I + writel(0xc0aa0060, &mctl_ctl->pgcr[3]); +#else writel(0x00aa0060, &mctl_ctl->pgcr[3]); +#endif
/* power down zq calibration module for power save */ setbits_le32(&mctl_ctl->zqcr, ZQCR_PWRDOWN); @@ -452,6 +515,7 @@ unsigned long sunxi_dram_init(void) .row_bits = 15, .page_size = 4096,
+#if defined(CONFIG_MACH_SUN8I_H3) .dx_read_delays = {{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }, { 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, @@ -464,6 +528,20 @@ unsigned long sunxi_dram_init(void) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, +#elif defined(CONFIG_MACH_SUN50I) + .dx_read_delays = {{ 16, 16, 16, 16, 17, 16, 16, 17, 16, 1, 0 }, + { 17, 17, 17, 17, 17, 17, 17, 17, 17, 1, 0 }, + { 16, 17, 17, 16, 16, 16, 16, 16, 16, 0, 0 }, + { 17, 17, 17, 17, 17, 17, 17, 17, 17, 1, 0 }}, + .dx_write_delays = {{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 15, 15 }, + { 0, 0, 0, 0, 1, 1, 1, 1, 0, 10, 10 }, + { 1, 0, 1, 1, 1, 1, 1, 1, 0, 11, 11 }, + { 1, 0, 0, 1, 1, 1, 1, 1, 0, 12, 12 }}, + .ac_delays = { 5, 5, 13, 10, 2, 5, 3, 3, + 0, 3, 3, 3, 1, 0, 0, 0, + 3, 4, 0, 3, 4, 1, 4, 0, + 1, 1, 0, 1, 13, 5, 4 }, +#endif };
mctl_sys_init(&para); @@ -476,8 +554,15 @@ unsigned long sunxi_dram_init(void) writel(0x00000201, &mctl_ctl->odtmap); udelay(1);
+#ifdef CONFIG_MACH_SUN8I_H3 /* odt delay */ writel(0x0c000400, &mctl_ctl->odtcfg); +#endif + +#ifdef CONFIG_MACH_SUN50I + setbits_le32(&mctl_ctl->vtfcr, (1 << 9)); + clrbits_le32(&mctl_ctl->pgcr[2], (1 << 13)); +#endif
/* clear credit value */ setbits_le32(&mctl_com->cccr, 1 << 31);

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
From: Jens Kuske jenskuske@gmail.com
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes.
Yes but it makes the code a mess. Can you avoid putting #ifdefs everywhere?
Since it is a static function I wonder if you can instead use a function parameter which defines the chip to support, and the compiler will eliminate the unused code?
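A minimal sketch of the idea suggested here, not code from the patch: the enum, its constants, and the helper names are hypothetical, and only the two PGCR3 values are taken from the diff. The chip is passed as an argument to a static function, and a single preprocessor check selects the constant at the call site, so an optimising compiler can usually fold the comparison and drop the unused branch.

```c
#include <assert.h>	/* only needed for the checks in the note below */
#include <stdint.h>

/* Hypothetical chip identifiers; these names are not from the patch. */
enum chip_type {
	CHIP_SUN8I_H3,
	CHIP_SUN50I,
};

/*
 * PGCR3 value per chip, copied from the two branches of the #ifdef in
 * mctl_channel_init(). The function is static and its caller passes a
 * constant, so the compiler is free to keep only the branch in use.
 */
static uint32_t pgcr3_value(enum chip_type type)
{
	if (type == CHIP_SUN50I)
		return 0xc0aa0060;

	return 0x00aa0060;
}

/* The one remaining preprocessor decision: which constant to pass. */
#if defined(CONFIG_MACH_SUN50I)
#define BUILD_CHIP CHIP_SUN50I
#else
#define BUILD_CHIP CHIP_SUN8I_H3
#endif

/* Stand-in for the call site in the DRAM init path. */
uint32_t build_pgcr3_value(void)
{
	return pgcr3_value(BUILD_CHIP);
}
```

Both variants can be checked in a single build, for example with assert(pgcr3_value(CHIP_SUN50I) == 0xc0aa0060), which the #ifdef version does not allow.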
See below...
[Andre: fixed up typo, merged in fixes from Jens]
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 + arch/arm/include/asm/arch-sunxi/dram.h | 2 +- arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 10 +- arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/clock_sun6i.c | 2 +- arch/arm/mach-sunxi/dram_sun8i_h3.c | 139 +++++++++++++++++++----- 6 files changed, 123 insertions(+), 32 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h index be9fcfd..3f87672 100644 --- a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h +++ b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h @@ -322,6 +322,7 @@ struct sunxi_ccm_reg { #define CCM_DRAMCLK_CFG_DIV0_MASK (0xf << 8) #define CCM_DRAMCLK_CFG_SRC_PLL5 (0x0 << 20) #define CCM_DRAMCLK_CFG_SRC_PLL6x2 (0x1 << 20) +#define CCM_DRAMCLK_CFG_SRC_PLL11 (0x1 << 20) /* A64 only */ #define CCM_DRAMCLK_CFG_SRC_MASK (0x3 << 20) #define CCM_DRAMCLK_CFG_UPD (0x1 << 16) #define CCM_DRAMCLK_CFG_RST (0x1 << 31) diff --git a/arch/arm/include/asm/arch-sunxi/dram.h b/arch/arm/include/asm/arch-sunxi/dram.h index e0be744..53e6d47 100644 --- a/arch/arm/include/asm/arch-sunxi/dram.h +++ b/arch/arm/include/asm/arch-sunxi/dram.h @@ -24,7 +24,7 @@ #include <asm/arch/dram_sun8i_a33.h> #elif defined(CONFIG_MACH_SUN8I_A83T) #include <asm/arch/dram_sun8i_a83t.h> -#elif defined(CONFIG_MACH_SUN8I_H3) +#elif defined(CONFIG_MACH_SUN8I_H3) || defined(CONFIG_MACH_SUN50I) #include <asm/arch/dram_sun8i_h3.h> #elif defined(CONFIG_MACH_SUN9I) #include <asm/arch/dram_sun9i.h> diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index 867fd12..b0e5d93 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -15,7 +15,8 @@
struct sunxi_mctl_com_reg { u32 cr; /* 0x00 control register */
-	u8 res0[0xc]; /* 0x04 */
+	u8 res0[0x8]; /* 0x04 */
+	u32 tmr; /* 0x0c (A64 only) */
u32 mcr[16][2]; /* 0x10 */ u32 bwcr; /* 0x90 bandwidth control register */ u32 maer; /* 0x94 master enable register */
@@ -32,7 +33,9 @@ struct sunxi_mctl_com_reg { u32 swoffr; /* 0xc4 */ u8 res2[0x8]; /* 0xc8 */ u32 cccr; /* 0xd0 */
-	u8 res3[0x72c]; /* 0xd4 */
+	u8 res3[0x54]; /* 0xd4 */
+	u32 mdfs_bwlr[3]; /* 0x128 (A64 only) */
+	u8 res4[0x6cc]; /* 0x134 */
u32 protect; /* 0x800 */
};
@@ -81,7 +84,8 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */
-	u8 res3[0x20]; /* 0x9c */
+	u8 res3[0x1c]; /* 0x9c */
+	u32 vtfcr; /* 0xb8 (A64 only) */
u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */
diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile index e73114e..7daba11 100644 --- a/arch/arm/mach-sunxi/Makefile +++ b/arch/arm/mach-sunxi/Makefile @@ -50,4 +50,5 @@ obj-$(CONFIG_MACH_SUN8I_A33) += dram_sun8i_a33.o obj-$(CONFIG_MACH_SUN8I_A83T) += dram_sun8i_a83t.o obj-$(CONFIG_MACH_SUN8I_H3) += dram_sun8i_h3.o obj-$(CONFIG_MACH_SUN9I) += dram_sun9i.o +obj-$(CONFIG_MACH_SUN50I) += dram_sun8i_h3.o endif diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index 80cfc0b..99f515d 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -217,7 +217,7 @@ done: } #endif
-#ifdef CONFIG_MACH_SUN8I_A33 +#if defined(CONFIG_MACH_SUN8I_A33) || defined(CONFIG_MACH_SUN50I) void clock_set_pll11(unsigned int clk, bool sigma_delta_enable) { struct sunxi_ccm_reg * const ccm = diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 1647d76..2dc2071 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -32,30 +32,6 @@ static inline int ns_to_t(int nanoseconds) return DIV_ROUND_UP(ctrl_freq * nanoseconds, 1000); }
-static u32 bin_to_mgray(int val)
-{
-	static const u8 lookup_table[32] = {
-		0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
-		0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09,
-		0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d,
-		0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11,
-	};
-
-	return lookup_table[clamp(val, 0, 31)];
-}
-
-static int mgray_to_bin(u32 val)
-{
-	static const u8 lookup_table[32] = {
-		0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
-		0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b,
-		0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b,
-		0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15,
-	};
-
-	return lookup_table[val & 0x1f];
-}
static void mctl_phy_init(u32 val) { struct sunxi_mctl_ctl_reg * const mctl_ctl = @@ -91,8 +67,9 @@ static void mctl_set_master_priority(void) struct sunxi_mctl_com_reg * const mctl_com = (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE;
+#if defined(CONFIG_MACH_SUN8I_H3) /* enable bandwidth limit windows and set windows size 1us */
-	writel(0x00010190, &mctl_com->bwcr);
+	writel((1 << 16) | (400 << 0), &mctl_com->bwcr);
/* set cpu high priority */ writel(0x00000001, &mctl_com->mapr);
@@ -121,6 +98,38 @@ static void mctl_set_master_priority(void) writel(0x04001800, &mctl_com->mcr[10][1]); writel(0x04000009, &mctl_com->mcr[11][0]); writel(0x00400120, &mctl_com->mcr[11][1]); +#elif defined(CONFIG_MACH_SUN50I)
/* enable bandwidth limit windows and set windows size 1us */
writel(399, &mctl_com->tmr);
writel((1 << 16), &mctl_com->bwcr);
writel(0x00a0000d, &mctl_com->mcr[0][0]);
writel(0x00500064, &mctl_com->mcr[0][1]);
writel(0x06000009, &mctl_com->mcr[1][0]);
writel(0x01000578, &mctl_com->mcr[1][1]);
writel(0x0200000d, &mctl_com->mcr[2][0]);
writel(0x00600100, &mctl_com->mcr[2][1]);
writel(0x01000009, &mctl_com->mcr[3][0]);
writel(0x00500064, &mctl_com->mcr[3][1]);
writel(0x07000009, &mctl_com->mcr[4][0]);
writel(0x01000640, &mctl_com->mcr[4][1]);
writel(0x01000009, &mctl_com->mcr[5][0]);
writel(0x00000080, &mctl_com->mcr[5][1]);
writel(0x01000009, &mctl_com->mcr[6][0]);
writel(0x00400080, &mctl_com->mcr[6][1]);
writel(0x0100000d, &mctl_com->mcr[7][0]);
writel(0x00400080, &mctl_com->mcr[7][1]);
writel(0x0100000d, &mctl_com->mcr[8][0]);
writel(0x00400080, &mctl_com->mcr[8][1]);
writel(0x04000009, &mctl_com->mcr[9][0]);
writel(0x00400100, &mctl_com->mcr[9][1]);
writel(0x20000209, &mctl_com->mcr[10][0]);
writel(0x08001800, &mctl_com->mcr[10][1]);
writel(0x05000009, &mctl_com->mcr[11][0]);
writel(0x00400090, &mctl_com->mcr[11][1]);
writel(0x81000004, &mctl_com->mdfs_bwlr[2]);
+#endif }
static void mctl_set_timing_params(struct dram_para *para) @@ -204,7 +213,32 @@ static void mctl_set_timing_params(struct dram_para *para) writel(RFSHTMG_TREFI(trefi) | RFSHTMG_TRFC(trfc), &mctl_ctl->rfshtmg); }
-static void mctl_zq_calibration(struct dram_para *para)
+#ifdef CONFIG_MACH_SUN8I_H3
+static u32 bin_to_mgray(int val)
+{
+	static const u8 lookup_table[32] = {
+		0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
+		0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09,
+		0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d,
+		0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11,
+	};
+
+	return lookup_table[clamp(val, 0, 31)];
+}
+
+static int mgray_to_bin(u32 val)
+{
+	static const u8 lookup_table[32] = {
+		0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
+		0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b,
+		0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b,
+		0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15,
+	};
+
+	return lookup_table[val & 0x1f];
+}
+
+static void mctl_h3_zq_calibration_quirk(struct dram_para *para) { struct sunxi_mctl_ctl_reg * const mctl_ctl = (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; @@ -261,6 +295,7 @@ static void mctl_zq_calibration(struct dram_para *para) writel((zq_val[5] << 16) | zq_val[4], &mctl_ctl->zqdr[2]); } } +#endif
static void mctl_set_cr(struct dram_para *para)
Can this be:
static void mctl_set_cr(struct dram_para *para, enum chip_type type)
{ @@ -286,16 +321,27 @@ static void mctl_sys_init(struct dram_para *para) clrbits_le32(&ccm->ahb_gate0, 1 << AHB_GATE_OFFSET_MCTL); clrbits_le32(&ccm->ahb_reset0_cfg, 1 << AHB_RESET_OFFSET_MCTL); clrbits_le32(&ccm->pll5_cfg, CCM_PLL5_CTRL_EN); +#ifdef CONFIG_MACH_SUN50I
clrbits_le32(&ccm->pll11_cfg, CCM_PLL11_CTRL_EN);
+#endif udelay(10);
clrbits_le32(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_RST); udelay(1000);
+#ifdef CONFIG_MACH_SUN50I
if (type == SUN50I)
clock_set_pll11(CONFIG_DRAM_CLK * 2 * 1000000, false);
clrsetbits_le32(&ccm->dram_clk_cfg,
CCM_DRAMCLK_CFG_DIV_MASK | CCM_DRAMCLK_CFG_SRC_MASK,
CCM_DRAMCLK_CFG_DIV(1) | CCM_DRAMCLK_CFG_SRC_PLL11 |
CCM_DRAMCLK_CFG_UPD);
+#else clock_set_pll5(CONFIG_DRAM_CLK * 2 * 1000000, false); clrsetbits_le32(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_DIV_MASK | CCM_DRAMCLK_CFG_SRC_MASK, CCM_DRAMCLK_CFG_DIV(1) | CCM_DRAMCLK_CFG_SRC_PLL5 | CCM_DRAMCLK_CFG_UPD); +#endif mctl_await_completion(&ccm->dram_clk_cfg, CCM_DRAMCLK_CFG_UPD, 0);
setbits_le32(&ccm->ahb_reset0_cfg, 1 << AHB_RESET_OFFSET_MCTL);
@@ -347,12 +393,18 @@ static int mctl_channel_init(struct dram_para *para) /* set DQS auto gating PD mode */ setbits_le32(&mctl_ctl->pgcr[2], 0x3 << 6);
+#if defined(CONFIG_MACH_SUN8I_H3) /* dx ddr_clk & hdr_clk dynamic mode */ clrbits_le32(&mctl_ctl->pgcr[0], (0x3 << 14) | (0x3 << 12));
/* dphy & aphy phase select 270 degree */ clrsetbits_le32(&mctl_ctl->pgcr[2], (0x3 << 10) | (0x3 << 8), (0x1 << 10) | (0x2 << 8));
+#elif defined(CONFIG_MACH_SUN50I)
/* dphy & aphy phase select ? */
clrsetbits_le32(&mctl_ctl->pgcr[2], (0x3 << 10) | (0x3 << 8),
(0x0 << 10) | (0x3 << 8));
+#endif
/* set half DQ */ if (para->bus_width != 32) {
@@ -367,10 +419,17 @@ static int mctl_channel_init(struct dram_para *para) mctl_set_bit_delays(para); udelay(50);
-	mctl_zq_calibration(para);
+#ifdef CONFIG_MACH_SUN8I_H3
mctl_h3_zq_calibration_quirk(para); mctl_phy_init(PIR_PLLINIT | PIR_DCAL | PIR_PHYRST | PIR_DRAMRST | PIR_DRAMINIT | PIR_QSGATE);
+#else
clrsetbits_le32(&mctl_ctl->zqcr, 0xffffff, CONFIG_DRAM_ZQ);
mctl_phy_init(PIR_ZCAL | PIR_PLLINIT | PIR_DCAL | PIR_PHYRST |
PIR_DRAMRST | PIR_DRAMINIT | PIR_QSGATE);
+#endif
/* detect ranks and bus width */ if (readl(&mctl_ctl->pgsr[0]) & (0xfe << 20)) {
@@ -408,7 +467,11 @@ static int mctl_channel_init(struct dram_para *para) udelay(10);
/* set PGCR3, CKE polarity */
+#ifdef CONFIG_MACH_SUN50I
writel(0xc0aa0060, &mctl_ctl->pgcr[3]);
+#else writel(0x00aa0060, &mctl_ctl->pgcr[3]); +#endif
/* power down zq calibration module for power save */ setbits_le32(&mctl_ctl->zqcr, ZQCR_PWRDOWN);
@@ -452,6 +515,7 @@ unsigned long sunxi_dram_init(void) .row_bits = 15, .page_size = 4096,
+#if defined(CONFIG_MACH_SUN8I_H3) .dx_read_delays = {{ 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, { 14, 14, 14, 14, 14, 14, 14, 14, 14, 0, 0 }, { 18, 18, 18, 18, 18, 18, 18, 18, 18, 0, 0 }, @@ -464,6 +528,20 @@ unsigned long sunxi_dram_init(void) 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, +#elif defined(CONFIG_MACH_SUN50I)
.dx_read_delays = {{ 16, 16, 16, 16, 17, 16, 16, 17, 16, 1, 0 },
{ 17, 17, 17, 17, 17, 17, 17, 17, 17, 1, 0 },
{ 16, 17, 17, 16, 16, 16, 16, 16, 16, 0, 0 },
{ 17, 17, 17, 17, 17, 17, 17, 17, 17, 1, 0 }},
.dx_write_delays = {{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 15, 15 },
{ 0, 0, 0, 0, 1, 1, 1, 1, 0, 10, 10 },
{ 1, 0, 1, 1, 1, 1, 1, 1, 0, 11, 11 },
{ 1, 0, 0, 1, 1, 1, 1, 1, 0, 12, 12 }},
.ac_delays = { 5, 5, 13, 10, 2, 5, 3, 3,
0, 3, 3, 3, 1, 0, 0, 0,
3, 4, 0, 3, 4, 1, 4, 0,
1, 1, 0, 1, 13, 5, 4 },
+#endif };
mctl_sys_init(&para);
@@ -476,8 +554,15 @@ unsigned long sunxi_dram_init(void) writel(0x00000201, &mctl_ctl->odtmap); udelay(1);
+#ifdef CONFIG_MACH_SUN8I_H3 /* odt delay */ writel(0x0c000400, &mctl_ctl->odtcfg); +#endif
+#ifdef CONFIG_MACH_SUN50I
setbits_le32(&mctl_ctl->vtfcr, (1 << 9));
clrbits_le32(&mctl_ctl->pgcr[2], (1 << 13));
+#endif
/* clear credit value */ setbits_le32(&mctl_com->cccr, 1 << 31);
-- 2.8.2
Then:
+#ifdef CONFIG_MACH_SUN50I
mctl_set_cr(params, SUN50I)
#elif defined(CONFIG_MACH_SUN8I_H3)
mctl_set_cr(params, SUN8I_H3)
...
Regards, Simon

Hi Simon,
On 05/12/16 06:26, Simon Glass wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
From: Jens Kuske jenskuske@gmail.com
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes.
Yes but it makes the code a mess. Can you avoid putting #ifdefs everywhere?
Sure ...
Since it is a static function I wonder if you can instead use a function parameter which defines the chip to support, and the compiler will eliminate the unused code?
I like that idea. Actually I went ahead and converted the file to this approach and it looks much nicer, especially since I added support for a third DRAM controller on top. It also opens the door to potentially serving multiple variants in one binary (one day ...)
I took the liberty of using the Allwinner SoCID (a 16-bit value, mostly unique to a given SoC) as the variant identifier. That doesn't make any difference for this passing-to-static-functions case, but it avoids inventing identifiers and can easily be reused with auto-detection.
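As a rough sketch of what keying on the SoC ID could look like (this is not the converted file): the SOCID_* names and the helper are hypothetical, the ID values are the ones commonly reported for these chips and should be treated as assumptions, and the clock-source bits mirror the CCM_DRAMCLK_CFG_SRC_* defines from the patch.

```c
#include <assert.h>	/* only needed for the checks in the note below */
#include <stdint.h>

/* SoC IDs as commonly reported for these chips (assumed values). */
#define SOCID_H3	0x1680
#define SOCID_A64	0x1689

/* Same bit values as CCM_DRAMCLK_CFG_SRC_PLL5 / _PLL11 in the patch. */
#define DRAMCLK_SRC_PLL5	(0x0u << 20)
#define DRAMCLK_SRC_PLL11	(0x1u << 20)

/*
 * Select the DRAM clock source from the SoC ID. With a constant
 * argument the compiler can reduce this to a single value; with a
 * value read at run time the same function would serve auto-detection.
 */
static uint32_t dram_clk_src_bits(uint16_t socid)
{
	switch (socid) {
	case SOCID_A64:
		return DRAMCLK_SRC_PLL11;
	case SOCID_H3:
	default:
		return DRAMCLK_SRC_PLL5;
	}
}
```

For instance, assert(dram_clk_src_bits(SOCID_A64) == DRAMCLK_SRC_PLL11) holds regardless of which board the build targets.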
Cheers, Andre.

Hi Andre,
On 16 December 2016 at 10:30, Andre Przywara andre.przywara@arm.com wrote:
Hi Simon,
On 05/12/16 06:26, Simon Glass wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
From: Jens Kuske jenskuske@gmail.com
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes.
Yes but it makes the code a mess. Can you avoid putting #ifdefs everywhere?
Sure ...
Since it is a static function I wonder if you can instead use a function parameter which defines the chip to support, and the compiler will eliminate the unused code?
I like that idea. Actually I went ahead and converted the file to this approach and it looks much nicer, especially since I added support for a third DRAM controller on top. It also opens the door to potentially serving multiple variants in one binary (one day ...)
I took the liberty of using the Allwinner SoCID (a 16-bit value, mostly unique to a given SoC) as the variant identifier. That doesn't make any difference for this passing-to-static-functions case, but it avoids inventing identifiers and can easily be reused with auto-detection.
OK great I'm pleased it worked out.
Regards, Simon

On Mon, Dec 05, 2016 at 01:52:22AM +0000, Andre Przywara wrote:
From: Jens Kuske jenskuske@gmail.com
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes. [Andre: fixed up typo, merged in fixes from Jens]
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 + arch/arm/include/asm/arch-sunxi/dram.h | 2 +- arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 10 +- arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/clock_sun6i.c | 2 +- arch/arm/mach-sunxi/dram_sun8i_h3.c | 139 +++++++++++++++++++----- 6 files changed, 123 insertions(+), 32 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h index be9fcfd..3f87672 100644 --- a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h +++ b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h @@ -322,6 +322,7 @@ struct sunxi_ccm_reg { #define CCM_DRAMCLK_CFG_DIV0_MASK (0xf << 8) #define CCM_DRAMCLK_CFG_SRC_PLL5 (0x0 << 20) #define CCM_DRAMCLK_CFG_SRC_PLL6x2 (0x1 << 20) +#define CCM_DRAMCLK_CFG_SRC_PLL11 (0x1 << 20) /* A64 only */ #define CCM_DRAMCLK_CFG_SRC_MASK (0x3 << 20) #define CCM_DRAMCLK_CFG_UPD (0x1 << 16) #define CCM_DRAMCLK_CFG_RST (0x1 << 31) diff --git a/arch/arm/include/asm/arch-sunxi/dram.h b/arch/arm/include/asm/arch-sunxi/dram.h index e0be744..53e6d47 100644 --- a/arch/arm/include/asm/arch-sunxi/dram.h +++ b/arch/arm/include/asm/arch-sunxi/dram.h @@ -24,7 +24,7 @@ #include <asm/arch/dram_sun8i_a33.h> #elif defined(CONFIG_MACH_SUN8I_A83T) #include <asm/arch/dram_sun8i_a83t.h> -#elif defined(CONFIG_MACH_SUN8I_H3) +#elif defined(CONFIG_MACH_SUN8I_H3) || defined(CONFIG_MACH_SUN50I) #include <asm/arch/dram_sun8i_h3.h> #elif defined(CONFIG_MACH_SUN9I) #include <asm/arch/dram_sun9i.h> diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index 867fd12..b0e5d93 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -15,7 +15,8 @@
struct sunxi_mctl_com_reg { u32 cr; /* 0x00 control register */
- u8 res0[0xc]; /* 0x04 */
+ u8 res0[0x8]; /* 0x04 */
+ u32 tmr; /* 0x0c (A64 only) */
#ifdef?
u32 mcr[16][2]; /* 0x10 */ u32 bwcr; /* 0x90 bandwidth control register */ u32 maer; /* 0x94 master enable register */ @@ -32,7 +33,9 @@ struct sunxi_mctl_com_reg { u32 swoffr; /* 0xc4 */ u8 res2[0x8]; /* 0xc8 */ u32 cccr; /* 0xd0 */
- u8 res3[0x72c]; /* 0xd4 */
+ u8 res3[0x54]; /* 0xd4 */
+ u32 mdfs_bwlr[3]; /* 0x128 (A64 only) */
+ u8 res4[0x6cc]; /* 0x134 */
Ditto.
u32 protect; /* 0x800 */ };
@@ -81,7 +84,8 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */
- u8 res3[0x20]; /* 0x9c */
+ u8 res3[0x1c]; /* 0x9c */
+ u32 vtfcr; /* 0xb8 (A64 only) */
Ditto
u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */ diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile index e73114e..7daba11 100644 --- a/arch/arm/mach-sunxi/Makefile +++ b/arch/arm/mach-sunxi/Makefile @@ -50,4 +50,5 @@ obj-$(CONFIG_MACH_SUN8I_A33) += dram_sun8i_a33.o obj-$(CONFIG_MACH_SUN8I_A83T) += dram_sun8i_a83t.o obj-$(CONFIG_MACH_SUN8I_H3) += dram_sun8i_h3.o obj-$(CONFIG_MACH_SUN9I) += dram_sun9i.o +obj-$(CONFIG_MACH_SUN50I) += dram_sun8i_h3.o endif diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index 80cfc0b..99f515d 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -217,7 +217,7 @@ done: } #endif
-#ifdef CONFIG_MACH_SUN8I_A33 +#if defined(CONFIG_MACH_SUN8I_A33) || defined(CONFIG_MACH_SUN50I) void clock_set_pll11(unsigned int clk, bool sigma_delta_enable) { struct sunxi_ccm_reg * const ccm = diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 1647d76..2dc2071 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -32,30 +32,6 @@ static inline int ns_to_t(int nanoseconds) return DIV_ROUND_UP(ctrl_freq * nanoseconds, 1000); }
-static u32 bin_to_mgray(int val) -{
- static const u8 lookup_table[32] = {
0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09,
0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d,
0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11,
- };
- return lookup_table[clamp(val, 0, 31)];
-}
-static int mgray_to_bin(u32 val) -{
- static const u8 lookup_table[32] = {
0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b,
0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b,
0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15,
- };
- return lookup_table[val & 0x1f];
-}
static void mctl_phy_init(u32 val) { struct sunxi_mctl_ctl_reg * const mctl_ctl = @@ -91,8 +67,9 @@ static void mctl_set_master_priority(void) struct sunxi_mctl_com_reg * const mctl_com = (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE;
+#if defined(CONFIG_MACH_SUN8I_H3) /* enable bandwidth limit windows and set windows size 1us */
- writel(0x00010190, &mctl_com->bwcr);
+ writel((1 << 16) | (400 << 0), &mctl_com->bwcr);
/* set cpu high priority */ writel(0x00000001, &mctl_com->mapr);
@@ -121,6 +98,38 @@ static void mctl_set_master_priority(void) writel(0x04001800, &mctl_com->mcr[10][1]); writel(0x04000009, &mctl_com->mcr[11][0]); writel(0x00400120, &mctl_com->mcr[11][1]); +#elif defined(CONFIG_MACH_SUN50I)
+ /* enable bandwidth limit windows and set windows size 1us */
+ writel(399, &mctl_com->tmr);
+ writel((1 << 16), &mctl_com->bwcr);
+ writel(0x00a0000d, &mctl_com->mcr[0][0]);
+ writel(0x00500064, &mctl_com->mcr[0][1]);
+ writel(0x06000009, &mctl_com->mcr[1][0]);
+ writel(0x01000578, &mctl_com->mcr[1][1]);
+ writel(0x0200000d, &mctl_com->mcr[2][0]);
+ writel(0x00600100, &mctl_com->mcr[2][1]);
+ writel(0x01000009, &mctl_com->mcr[3][0]);
+ writel(0x00500064, &mctl_com->mcr[3][1]);
+ writel(0x07000009, &mctl_com->mcr[4][0]);
+ writel(0x01000640, &mctl_com->mcr[4][1]);
+ writel(0x01000009, &mctl_com->mcr[5][0]);
+ writel(0x00000080, &mctl_com->mcr[5][1]);
+ writel(0x01000009, &mctl_com->mcr[6][0]);
+ writel(0x00400080, &mctl_com->mcr[6][1]);
+ writel(0x0100000d, &mctl_com->mcr[7][0]);
+ writel(0x00400080, &mctl_com->mcr[7][1]);
+ writel(0x0100000d, &mctl_com->mcr[8][0]);
+ writel(0x00400080, &mctl_com->mcr[8][1]);
+ writel(0x04000009, &mctl_com->mcr[9][0]);
+ writel(0x00400100, &mctl_com->mcr[9][1]);
+ writel(0x20000209, &mctl_com->mcr[10][0]);
+ writel(0x08001800, &mctl_com->mcr[10][1]);
+ writel(0x05000009, &mctl_com->mcr[11][0]);
+ writel(0x00400090, &mctl_com->mcr[11][1]);
+ writel(0x81000004, &mctl_com->mdfs_bwlr[2]);
Where is this pulled from? Having some defines would be great.
Maxime

Hi,
On 06/12/16 11:20, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:22AM +0000, Andre Przywara wrote:
From: Jens Kuske jenskuske@gmail.com
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes. [Andre: fixed up typo, merged in fixes from Jens]
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 + arch/arm/include/asm/arch-sunxi/dram.h | 2 +- arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 10 +- arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/clock_sun6i.c | 2 +- arch/arm/mach-sunxi/dram_sun8i_h3.c | 139 +++++++++++++++++++----- 6 files changed, 123 insertions(+), 32 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h index be9fcfd..3f87672 100644 --- a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h +++ b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h @@ -322,6 +322,7 @@ struct sunxi_ccm_reg { #define CCM_DRAMCLK_CFG_DIV0_MASK (0xf << 8) #define CCM_DRAMCLK_CFG_SRC_PLL5 (0x0 << 20) #define CCM_DRAMCLK_CFG_SRC_PLL6x2 (0x1 << 20) +#define CCM_DRAMCLK_CFG_SRC_PLL11 (0x1 << 20) /* A64 only */ #define CCM_DRAMCLK_CFG_SRC_MASK (0x3 << 20) #define CCM_DRAMCLK_CFG_UPD (0x1 << 16) #define CCM_DRAMCLK_CFG_RST (0x1 << 31) diff --git a/arch/arm/include/asm/arch-sunxi/dram.h b/arch/arm/include/asm/arch-sunxi/dram.h index e0be744..53e6d47 100644 --- a/arch/arm/include/asm/arch-sunxi/dram.h +++ b/arch/arm/include/asm/arch-sunxi/dram.h @@ -24,7 +24,7 @@ #include <asm/arch/dram_sun8i_a33.h> #elif defined(CONFIG_MACH_SUN8I_A83T) #include <asm/arch/dram_sun8i_a83t.h> -#elif defined(CONFIG_MACH_SUN8I_H3) +#elif defined(CONFIG_MACH_SUN8I_H3) || defined(CONFIG_MACH_SUN50I) #include <asm/arch/dram_sun8i_h3.h> #elif defined(CONFIG_MACH_SUN9I) #include <asm/arch/dram_sun9i.h> diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index 867fd12..b0e5d93 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -15,7 +15,8 @@
struct sunxi_mctl_com_reg { u32 cr; /* 0x00 control register */
- u8 res0[0xc]; /* 0x04 */
+ u8 res0[0x8]; /* 0x04 */
+ u32 tmr; /* 0x0c (A64 only) */
#ifdef?
What would that change aside from making it hard to read? This is a structure definition, so it doesn't generate any code on its own. And we only access that field from the A64, so it keeps its reserved nature for the H3.
u32 mcr[16][2]; /* 0x10 */ u32 bwcr; /* 0x90 bandwidth control register */ u32 maer; /* 0x94 master enable register */ @@ -32,7 +33,9 @@ struct sunxi_mctl_com_reg { u32 swoffr; /* 0xc4 */ u8 res2[0x8]; /* 0xc8 */ u32 cccr; /* 0xd0 */
- u8 res3[0x72c]; /* 0xd4 */
+ u8 res3[0x54]; /* 0xd4 */
+ u32 mdfs_bwlr[3]; /* 0x128 (A64 only) */
+ u8 res4[0x6cc]; /* 0x134 */
Ditto.
u32 protect; /* 0x800 */ };
@@ -81,7 +84,8 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */
- u8 res3[0x20]; /* 0x9c */
+ u8 res3[0x1c]; /* 0x9c */
+ u32 vtfcr; /* 0xb8 (A64 only) */
Ditto
u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */ diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile index e73114e..7daba11 100644 --- a/arch/arm/mach-sunxi/Makefile +++ b/arch/arm/mach-sunxi/Makefile @@ -50,4 +50,5 @@ obj-$(CONFIG_MACH_SUN8I_A33) += dram_sun8i_a33.o obj-$(CONFIG_MACH_SUN8I_A83T) += dram_sun8i_a83t.o obj-$(CONFIG_MACH_SUN8I_H3) += dram_sun8i_h3.o obj-$(CONFIG_MACH_SUN9I) += dram_sun9i.o +obj-$(CONFIG_MACH_SUN50I) += dram_sun8i_h3.o endif diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index 80cfc0b..99f515d 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -217,7 +217,7 @@ done: } #endif
-#ifdef CONFIG_MACH_SUN8I_A33 +#if defined(CONFIG_MACH_SUN8I_A33) || defined(CONFIG_MACH_SUN50I) void clock_set_pll11(unsigned int clk, bool sigma_delta_enable) { struct sunxi_ccm_reg * const ccm = diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 1647d76..2dc2071 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -32,30 +32,6 @@ static inline int ns_to_t(int nanoseconds) return DIV_ROUND_UP(ctrl_freq * nanoseconds, 1000); }
-static u32 bin_to_mgray(int val) -{
- static const u8 lookup_table[32] = {
0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09,
0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d,
0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11,
- };
- return lookup_table[clamp(val, 0, 31)];
-}
-static int mgray_to_bin(u32 val) -{
- static const u8 lookup_table[32] = {
0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b,
0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b,
0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15,
- };
- return lookup_table[val & 0x1f];
-}
static void mctl_phy_init(u32 val) { struct sunxi_mctl_ctl_reg * const mctl_ctl = @@ -91,8 +67,9 @@ static void mctl_set_master_priority(void) struct sunxi_mctl_com_reg * const mctl_com = (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE;
+#if defined(CONFIG_MACH_SUN8I_H3) /* enable bandwidth limit windows and set windows size 1us */
- writel(0x00010190, &mctl_com->bwcr);
+ writel((1 << 16) | (400 << 0), &mctl_com->bwcr);
/* set cpu high priority */ writel(0x00000001, &mctl_com->mapr);
@@ -121,6 +98,38 @@ static void mctl_set_master_priority(void) writel(0x04001800, &mctl_com->mcr[10][1]); writel(0x04000009, &mctl_com->mcr[11][0]); writel(0x00400120, &mctl_com->mcr[11][1]); +#elif defined(CONFIG_MACH_SUN50I)
+ /* enable bandwidth limit windows and set windows size 1us */
+ writel(399, &mctl_com->tmr);
+ writel((1 << 16), &mctl_com->bwcr);
+ writel(0x00a0000d, &mctl_com->mcr[0][0]);
+ writel(0x00500064, &mctl_com->mcr[0][1]);
+ writel(0x06000009, &mctl_com->mcr[1][0]);
+ writel(0x01000578, &mctl_com->mcr[1][1]);
+ writel(0x0200000d, &mctl_com->mcr[2][0]);
+ writel(0x00600100, &mctl_com->mcr[2][1]);
+ writel(0x01000009, &mctl_com->mcr[3][0]);
+ writel(0x00500064, &mctl_com->mcr[3][1]);
+ writel(0x07000009, &mctl_com->mcr[4][0]);
+ writel(0x01000640, &mctl_com->mcr[4][1]);
+ writel(0x01000009, &mctl_com->mcr[5][0]);
+ writel(0x00000080, &mctl_com->mcr[5][1]);
+ writel(0x01000009, &mctl_com->mcr[6][0]);
+ writel(0x00400080, &mctl_com->mcr[6][1]);
+ writel(0x0100000d, &mctl_com->mcr[7][0]);
+ writel(0x00400080, &mctl_com->mcr[7][1]);
+ writel(0x0100000d, &mctl_com->mcr[8][0]);
+ writel(0x00400080, &mctl_com->mcr[8][1]);
+ writel(0x04000009, &mctl_com->mcr[9][0]);
+ writel(0x00400100, &mctl_com->mcr[9][1]);
+ writel(0x20000209, &mctl_com->mcr[10][0]);
+ writel(0x08001800, &mctl_com->mcr[10][1]);
+ writel(0x05000009, &mctl_com->mcr[11][0]);
+ writel(0x00400090, &mctl_com->mcr[11][1]);
+ writel(0x81000004, &mctl_com->mdfs_bwlr[2]);
Where is this pulled from? Having some defines would be great.
AFAIK this is from register dumps after boot0 has done its job. I am afraid this is as far as we get at the moment. IIRC even the disassembly of boot0 shows that these registers are just initialised with those magic values and are not computed in any way. So the original source code _might_ have names for them, but I guess we will never know.
Cheers, Andre.
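The point that naming a reserved word does not move anything else can be checked mechanically. The following is a minimal sketch, assuming only the first few fields of struct sunxi_mctl_com_reg as quoted above; the struct names com_reg_old and com_reg_new are made up for the illustration and do not exist in U-Boot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;
typedef uint8_t u8;

/* Head of the register block before the patch: 0x04..0x0f is all reserved. */
struct com_reg_old {
	u32 cr;			/* 0x00 */
	u8 res0[0xc];		/* 0x04 */
	u32 mcr[16][2];		/* 0x10 */
};

/* Same region after the patch: the last reserved word gets a name. */
struct com_reg_new {
	u32 cr;			/* 0x00 */
	u8 res0[0x8];		/* 0x04 */
	u32 tmr;		/* 0x0c (A64 only) */
	u32 mcr[16][2];		/* 0x10 */
};

/* Compile-time checks: the build fails if any offset drifts. */
static_assert(offsetof(struct com_reg_old, mcr) == 0x10, "old mcr offset");
static_assert(offsetof(struct com_reg_new, tmr) == 0x0c, "tmr offset");
static_assert(offsetof(struct com_reg_new, mcr) == 0x10, "new mcr offset");
static_assert(sizeof(struct com_reg_old) == sizeof(struct com_reg_new),
	      "layout size unchanged");
```

The same arithmetic holds for the second hunk: 0xd4 + 0x54 = 0x128, three 32-bit words end at 0x134, and 0x134 + 0x6cc = 0x800, so protect stays where it was.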

On Tue, Dec 06, 2016 at 02:15:17PM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 11:20, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:22AM +0000, Andre Przywara wrote:
From: Jens Kuske jenskuske@gmail.com
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes. [Andre: fixed up typo, merged in fixes from Jens]
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 + arch/arm/include/asm/arch-sunxi/dram.h | 2 +- arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 10 +- arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/clock_sun6i.c | 2 +- arch/arm/mach-sunxi/dram_sun8i_h3.c | 139 +++++++++++++++++++----- 6 files changed, 123 insertions(+), 32 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h index be9fcfd..3f87672 100644 --- a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h +++ b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h @@ -322,6 +322,7 @@ struct sunxi_ccm_reg { #define CCM_DRAMCLK_CFG_DIV0_MASK (0xf << 8) #define CCM_DRAMCLK_CFG_SRC_PLL5 (0x0 << 20) #define CCM_DRAMCLK_CFG_SRC_PLL6x2 (0x1 << 20) +#define CCM_DRAMCLK_CFG_SRC_PLL11 (0x1 << 20) /* A64 only */ #define CCM_DRAMCLK_CFG_SRC_MASK (0x3 << 20) #define CCM_DRAMCLK_CFG_UPD (0x1 << 16) #define CCM_DRAMCLK_CFG_RST (0x1 << 31) diff --git a/arch/arm/include/asm/arch-sunxi/dram.h b/arch/arm/include/asm/arch-sunxi/dram.h index e0be744..53e6d47 100644 --- a/arch/arm/include/asm/arch-sunxi/dram.h +++ b/arch/arm/include/asm/arch-sunxi/dram.h @@ -24,7 +24,7 @@ #include <asm/arch/dram_sun8i_a33.h> #elif defined(CONFIG_MACH_SUN8I_A83T) #include <asm/arch/dram_sun8i_a83t.h> -#elif defined(CONFIG_MACH_SUN8I_H3) +#elif defined(CONFIG_MACH_SUN8I_H3) || defined(CONFIG_MACH_SUN50I) #include <asm/arch/dram_sun8i_h3.h> #elif defined(CONFIG_MACH_SUN9I) #include <asm/arch/dram_sun9i.h> diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index 867fd12..b0e5d93 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -15,7 +15,8 @@
struct sunxi_mctl_com_reg { u32 cr; /* 0x00 control register */
- u8 res0[0xc]; /* 0x04 */
+ u8 res0[0x8]; /* 0x04 */
+ u32 tmr; /* 0x0c (A64 only) */
#ifdef?
What would that change aside from making it hard to read? This is a structure definition, so it doesn't generate any code on its own. And we only access that field from the A64, so it keeps its reserved nature for the H3.
Yes, but at least we can catch improper writes to the register on !A64 platforms at compile time. We already use that construct in the U-Boot code, and you're already using it everywhere in this code, so it wouldn't really make it harder to read than it already is.
u32 mcr[16][2]; /* 0x10 */ u32 bwcr; /* 0x90 bandwidth control register */ u32 maer; /* 0x94 master enable register */ @@ -32,7 +33,9 @@ struct sunxi_mctl_com_reg { u32 swoffr; /* 0xc4 */ u8 res2[0x8]; /* 0xc8 */ u32 cccr; /* 0xd0 */
- u8 res3[0x72c]; /* 0xd4 */
+ u8 res3[0x54]; /* 0xd4 */
+ u32 mdfs_bwlr[3]; /* 0x128 (A64 only) */
+ u8 res4[0x6cc]; /* 0x134 */
Ditto.
u32 protect; /* 0x800 */ };
@@ -81,7 +84,8 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */
- u8 res3[0x20]; /* 0x9c */
+ u8 res3[0x1c]; /* 0x9c */
+ u32 vtfcr; /* 0xb8 (A64 only) */
Ditto
u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */ diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile index e73114e..7daba11 100644 --- a/arch/arm/mach-sunxi/Makefile +++ b/arch/arm/mach-sunxi/Makefile @@ -50,4 +50,5 @@ obj-$(CONFIG_MACH_SUN8I_A33) += dram_sun8i_a33.o obj-$(CONFIG_MACH_SUN8I_A83T) += dram_sun8i_a83t.o obj-$(CONFIG_MACH_SUN8I_H3) += dram_sun8i_h3.o obj-$(CONFIG_MACH_SUN9I) += dram_sun9i.o +obj-$(CONFIG_MACH_SUN50I) += dram_sun8i_h3.o endif diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index 80cfc0b..99f515d 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -217,7 +217,7 @@ done: } #endif
-#ifdef CONFIG_MACH_SUN8I_A33 +#if defined(CONFIG_MACH_SUN8I_A33) || defined(CONFIG_MACH_SUN50I) void clock_set_pll11(unsigned int clk, bool sigma_delta_enable) { struct sunxi_ccm_reg * const ccm = diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 1647d76..2dc2071 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -32,30 +32,6 @@ static inline int ns_to_t(int nanoseconds) return DIV_ROUND_UP(ctrl_freq * nanoseconds, 1000); }
-static u32 bin_to_mgray(int val) -{
- static const u8 lookup_table[32] = {
0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09,
0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d,
0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11,
- };
- return lookup_table[clamp(val, 0, 31)];
-}
-static int mgray_to_bin(u32 val) -{
- static const u8 lookup_table[32] = {
0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b,
0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b,
0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15,
- };
- return lookup_table[val & 0x1f];
-}
static void mctl_phy_init(u32 val) { struct sunxi_mctl_ctl_reg * const mctl_ctl = @@ -91,8 +67,9 @@ static void mctl_set_master_priority(void) struct sunxi_mctl_com_reg * const mctl_com = (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE;
+#if defined(CONFIG_MACH_SUN8I_H3) /* enable bandwidth limit windows and set windows size 1us */
- writel(0x00010190, &mctl_com->bwcr);
+ writel((1 << 16) | (400 << 0), &mctl_com->bwcr);
/* set cpu high priority */ writel(0x00000001, &mctl_com->mapr);
@@ -121,6 +98,38 @@ static void mctl_set_master_priority(void) writel(0x04001800, &mctl_com->mcr[10][1]); writel(0x04000009, &mctl_com->mcr[11][0]); writel(0x00400120, &mctl_com->mcr[11][1]); +#elif defined(CONFIG_MACH_SUN50I)
+ /* enable bandwidth limit windows and set windows size 1us */
+ writel(399, &mctl_com->tmr);
+ writel((1 << 16), &mctl_com->bwcr);
+ writel(0x00a0000d, &mctl_com->mcr[0][0]);
+ writel(0x00500064, &mctl_com->mcr[0][1]);
+ writel(0x06000009, &mctl_com->mcr[1][0]);
+ writel(0x01000578, &mctl_com->mcr[1][1]);
+ writel(0x0200000d, &mctl_com->mcr[2][0]);
+ writel(0x00600100, &mctl_com->mcr[2][1]);
+ writel(0x01000009, &mctl_com->mcr[3][0]);
+ writel(0x00500064, &mctl_com->mcr[3][1]);
+ writel(0x07000009, &mctl_com->mcr[4][0]);
+ writel(0x01000640, &mctl_com->mcr[4][1]);
+ writel(0x01000009, &mctl_com->mcr[5][0]);
+ writel(0x00000080, &mctl_com->mcr[5][1]);
+ writel(0x01000009, &mctl_com->mcr[6][0]);
+ writel(0x00400080, &mctl_com->mcr[6][1]);
+ writel(0x0100000d, &mctl_com->mcr[7][0]);
+ writel(0x00400080, &mctl_com->mcr[7][1]);
+ writel(0x0100000d, &mctl_com->mcr[8][0]);
+ writel(0x00400080, &mctl_com->mcr[8][1]);
+ writel(0x04000009, &mctl_com->mcr[9][0]);
+ writel(0x00400100, &mctl_com->mcr[9][1]);
+ writel(0x20000209, &mctl_com->mcr[10][0]);
+ writel(0x08001800, &mctl_com->mcr[10][1]);
+ writel(0x05000009, &mctl_com->mcr[11][0]);
+ writel(0x00400090, &mctl_com->mcr[11][1]);
+ writel(0x81000004, &mctl_com->mdfs_bwlr[2]);
Where is this pulled from? Having some defines would be great.
AFAIK this is from register dumps after boot0 has done its job. I am afraid this is as far as we get at the moment. IIRC even the disassembly of boot0 shows that these registers are just initialised with those magic values and are not computed in any way. So the original source code _might_ have names for them, but I guess we will never know.
Can you please make that a comment?
Thanks, Maxime
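The compile-time argument for the #ifdef variant can be illustrated as follows. This is only a sketch of the idea, not the code that was eventually merged: DEMO_MACH_SUN50I stands in for CONFIG_MACH_SUN50I, and struct demo_com_reg and demo_set_window() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;
typedef uint8_t u8;

/* Hypothetical switch standing in for CONFIG_MACH_SUN50I.
 * Remove this line to model an H3 build. */
#define DEMO_MACH_SUN50I

struct demo_com_reg {
	u32 cr;			/* 0x00 */
#ifdef DEMO_MACH_SUN50I
	u8 res0[0x8];		/* 0x04 */
	u32 tmr;		/* 0x0c, only named on the A64 */
#else
	u8 res0[0xc];		/* 0x04, stays anonymous elsewhere */
#endif
	u32 mcr[16][2];		/* 0x10 */
};

/* Either branch must leave the following registers where they were. */
static_assert(offsetof(struct demo_com_reg, mcr) == 0x10,
	      "mcr must stay at 0x10");

#ifdef DEMO_MACH_SUN50I
/* Without DEMO_MACH_SUN50I the member "tmr" does not exist, so an
 * unguarded copy of this function would fail to compile. */
static void demo_set_window(struct demo_com_reg *com)
{
	com->tmr = 399;		/* value taken from the quoted A64 code */
}
#endif
```

The trade-off discussed in the thread is visible here: the layout is identical in both branches, so the #ifdef buys nothing at run time, but a stray access to tmr in H3-only code becomes a build error instead of a silent write to a reserved register.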

Hi,
On 12/12/16 12:29, Maxime Ripard wrote:
On Tue, Dec 06, 2016 at 02:15:17PM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 11:20, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:22AM +0000, Andre Przywara wrote:
From: Jens Kuske jenskuske@gmail.com
The A64 DRAM controller is very similar to the H3 one, so the code can be reused with some small changes. [Andre: fixed up typo, merged in fixes from Jens]
Signed-off-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/include/asm/arch-sunxi/clock_sun6i.h | 1 + arch/arm/include/asm/arch-sunxi/dram.h | 2 +- arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h | 10 +- arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/clock_sun6i.c | 2 +- arch/arm/mach-sunxi/dram_sun8i_h3.c | 139 +++++++++++++++++++----- 6 files changed, 123 insertions(+), 32 deletions(-)
diff --git a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h index be9fcfd..3f87672 100644 --- a/arch/arm/include/asm/arch-sunxi/clock_sun6i.h +++ b/arch/arm/include/asm/arch-sunxi/clock_sun6i.h @@ -322,6 +322,7 @@ struct sunxi_ccm_reg { #define CCM_DRAMCLK_CFG_DIV0_MASK (0xf << 8) #define CCM_DRAMCLK_CFG_SRC_PLL5 (0x0 << 20) #define CCM_DRAMCLK_CFG_SRC_PLL6x2 (0x1 << 20) +#define CCM_DRAMCLK_CFG_SRC_PLL11 (0x1 << 20) /* A64 only */ #define CCM_DRAMCLK_CFG_SRC_MASK (0x3 << 20) #define CCM_DRAMCLK_CFG_UPD (0x1 << 16) #define CCM_DRAMCLK_CFG_RST (0x1 << 31) diff --git a/arch/arm/include/asm/arch-sunxi/dram.h b/arch/arm/include/asm/arch-sunxi/dram.h index e0be744..53e6d47 100644 --- a/arch/arm/include/asm/arch-sunxi/dram.h +++ b/arch/arm/include/asm/arch-sunxi/dram.h @@ -24,7 +24,7 @@ #include <asm/arch/dram_sun8i_a33.h> #elif defined(CONFIG_MACH_SUN8I_A83T) #include <asm/arch/dram_sun8i_a83t.h> -#elif defined(CONFIG_MACH_SUN8I_H3) +#elif defined(CONFIG_MACH_SUN8I_H3) || defined(CONFIG_MACH_SUN50I) #include <asm/arch/dram_sun8i_h3.h> #elif defined(CONFIG_MACH_SUN9I) #include <asm/arch/dram_sun9i.h> diff --git a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h index 867fd12..b0e5d93 100644 --- a/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h +++ b/arch/arm/include/asm/arch-sunxi/dram_sun8i_h3.h @@ -15,7 +15,8 @@
struct sunxi_mctl_com_reg { u32 cr; /* 0x00 control register */
- u8 res0[0xc]; /* 0x04 */
+ u8 res0[0x8]; /* 0x04 */
+ u32 tmr; /* 0x0c (A64 only) */
#ifdef?
What would that change aside from making it hard to read? This is a structure definition, so it doesn't generate any code on its own. And we only access that field from the A64, so it keeps its reserved nature for the H3.
Yes, but at least we can catch improper writes to the register on !A64 platforms at compile time. We already use that construct in the U-Boot code, and you're already using it everywhere in this code, so it wouldn't really make it harder to read than it already is.
Still not convinced, but I will give it a try.
u32 mcr[16][2]; /* 0x10 */ u32 bwcr; /* 0x90 bandwidth control register */ u32 maer; /* 0x94 master enable register */ @@ -32,7 +33,9 @@ struct sunxi_mctl_com_reg { u32 swoffr; /* 0xc4 */ u8 res2[0x8]; /* 0xc8 */ u32 cccr; /* 0xd0 */
- u8 res3[0x72c]; /* 0xd4 */
+ u8 res3[0x54]; /* 0xd4 */
+ u32 mdfs_bwlr[3]; /* 0x128 (A64 only) */
+ u8 res4[0x6cc]; /* 0x134 */
Ditto.
u32 protect; /* 0x800 */ };
@@ -81,7 +84,8 @@ struct sunxi_mctl_ctl_reg { u32 rfshtmg; /* 0x90 refresh timing */ u32 rfshctl1; /* 0x94 */ u32 pwrtmg; /* 0x98 */
- u8 res3[0x20]; /* 0x9c */
+ u8 res3[0x1c]; /* 0x9c */
+ u32 vtfcr; /* 0xb8 (A64 only) */
Ditto
u32 dqsgmr; /* 0xbc */ u32 dtcr; /* 0xc0 */ u32 dtar[4]; /* 0xc4 */ diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile index e73114e..7daba11 100644 --- a/arch/arm/mach-sunxi/Makefile +++ b/arch/arm/mach-sunxi/Makefile @@ -50,4 +50,5 @@ obj-$(CONFIG_MACH_SUN8I_A33) += dram_sun8i_a33.o obj-$(CONFIG_MACH_SUN8I_A83T) += dram_sun8i_a83t.o obj-$(CONFIG_MACH_SUN8I_H3) += dram_sun8i_h3.o obj-$(CONFIG_MACH_SUN9I) += dram_sun9i.o +obj-$(CONFIG_MACH_SUN50I) += dram_sun8i_h3.o endif diff --git a/arch/arm/mach-sunxi/clock_sun6i.c b/arch/arm/mach-sunxi/clock_sun6i.c index 80cfc0b..99f515d 100644 --- a/arch/arm/mach-sunxi/clock_sun6i.c +++ b/arch/arm/mach-sunxi/clock_sun6i.c @@ -217,7 +217,7 @@ done: } #endif
-#ifdef CONFIG_MACH_SUN8I_A33 +#if defined(CONFIG_MACH_SUN8I_A33) || defined(CONFIG_MACH_SUN50I) void clock_set_pll11(unsigned int clk, bool sigma_delta_enable) { struct sunxi_ccm_reg * const ccm = diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c index 1647d76..2dc2071 100644 --- a/arch/arm/mach-sunxi/dram_sun8i_h3.c +++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c @@ -32,30 +32,6 @@ static inline int ns_to_t(int nanoseconds) return DIV_ROUND_UP(ctrl_freq * nanoseconds, 1000); }
-static u32 bin_to_mgray(int val) -{
- static const u8 lookup_table[32] = {
0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
0x0c, 0x0d, 0x0e, 0x0f, 0x0a, 0x0b, 0x08, 0x09,
0x18, 0x19, 0x1a, 0x1b, 0x1e, 0x1f, 0x1c, 0x1d,
0x14, 0x15, 0x16, 0x17, 0x12, 0x13, 0x10, 0x11,
- };
- return lookup_table[clamp(val, 0, 31)];
-}
-static int mgray_to_bin(u32 val) -{
- static const u8 lookup_table[32] = {
0x00, 0x01, 0x02, 0x03, 0x06, 0x07, 0x04, 0x05,
0x0e, 0x0f, 0x0c, 0x0d, 0x08, 0x09, 0x0a, 0x0b,
0x1e, 0x1f, 0x1c, 0x1d, 0x18, 0x19, 0x1a, 0x1b,
0x10, 0x11, 0x12, 0x13, 0x16, 0x17, 0x14, 0x15,
- };
- return lookup_table[val & 0x1f];
-}
static void mctl_phy_init(u32 val) { struct sunxi_mctl_ctl_reg * const mctl_ctl = @@ -91,8 +67,9 @@ static void mctl_set_master_priority(void) struct sunxi_mctl_com_reg * const mctl_com = (struct sunxi_mctl_com_reg *)SUNXI_DRAM_COM_BASE;
+#if defined(CONFIG_MACH_SUN8I_H3) /* enable bandwidth limit windows and set windows size 1us */
- writel(0x00010190, &mctl_com->bwcr);
+ writel((1 << 16) | (400 << 0), &mctl_com->bwcr);
/* set cpu high priority */ writel(0x00000001, &mctl_com->mapr);
@@ -121,6 +98,38 @@ static void mctl_set_master_priority(void) writel(0x04001800, &mctl_com->mcr[10][1]); writel(0x04000009, &mctl_com->mcr[11][0]); writel(0x00400120, &mctl_com->mcr[11][1]); +#elif defined(CONFIG_MACH_SUN50I)
+ /* enable bandwidth limit windows and set windows size 1us */
+ writel(399, &mctl_com->tmr);
+ writel((1 << 16), &mctl_com->bwcr);
+ writel(0x00a0000d, &mctl_com->mcr[0][0]);
+ writel(0x00500064, &mctl_com->mcr[0][1]);
+ writel(0x06000009, &mctl_com->mcr[1][0]);
+ writel(0x01000578, &mctl_com->mcr[1][1]);
+ writel(0x0200000d, &mctl_com->mcr[2][0]);
+ writel(0x00600100, &mctl_com->mcr[2][1]);
+ writel(0x01000009, &mctl_com->mcr[3][0]);
+ writel(0x00500064, &mctl_com->mcr[3][1]);
+ writel(0x07000009, &mctl_com->mcr[4][0]);
+ writel(0x01000640, &mctl_com->mcr[4][1]);
+ writel(0x01000009, &mctl_com->mcr[5][0]);
+ writel(0x00000080, &mctl_com->mcr[5][1]);
+ writel(0x01000009, &mctl_com->mcr[6][0]);
+ writel(0x00400080, &mctl_com->mcr[6][1]);
+ writel(0x0100000d, &mctl_com->mcr[7][0]);
+ writel(0x00400080, &mctl_com->mcr[7][1]);
+ writel(0x0100000d, &mctl_com->mcr[8][0]);
+ writel(0x00400080, &mctl_com->mcr[8][1]);
+ writel(0x04000009, &mctl_com->mcr[9][0]);
+ writel(0x00400100, &mctl_com->mcr[9][1]);
+ writel(0x20000209, &mctl_com->mcr[10][0]);
+ writel(0x08001800, &mctl_com->mcr[10][1]);
+ writel(0x05000009, &mctl_com->mcr[11][0]);
+ writel(0x00400090, &mctl_com->mcr[11][1]);
+ writel(0x81000004, &mctl_com->mdfs_bwlr[2]);
Where is this pulled from? Having some defines would be great.
AFAIK this is from register dumps after boot0 has done its job. I am afraid this is as far as we get at the moment. IIRC even the disassembly of boot0 shows that these registers are just initialised with those magic values and are not computed in any way. So the original source code _might_ have names for them, but I guess we will never know.
Can you please make that a comment?
Even better, there is a patch from Philipp which actually dissects this into meaningful fields.
So I looked into the H3 BSP and found the respective code there as well, so I now have a patch which turns the existing H3 mctl_com->mcr setup code into something like: MBUS_CONF(CPU, true, HIGHEST, 0, 160, 100, 80); .... (with explanations for the numerical parameters as well).
Then this A64 patch will just copy that scheme, which will hopefully make everyone happy.
Cheers, Andre.
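For readers wondering how MBUS_CONF(CPU, true, HIGHEST, 0, 160, 100, 80) could relate to the magic numbers, here is a rough sketch. The field positions are an assumption, chosen because they reproduce the quoted A64 values for mcr[0][0] and mcr[0][1] (0x00a0000d and 0x00500064); the actual macro in the follow-up patch may be laid out differently. All demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed QoS encoding; only HIGHEST = 3 is backed by the values
 * quoted in this thread. */
enum demo_mbus_qos { DEMO_QOS_HIGHEST = 3 };

/* First word of a port's mcr pair.
 * Assumed layout: bit 0 = bandwidth limit enable, bits 2..3 = QoS,
 * bits 8.. = "acs" parameter, bits 16.. = first bandwidth limit. */
static uint32_t demo_mbus_cfg0(bool bwlimit, unsigned int qos,
			       unsigned int acs, unsigned int bwl0)
{
	return (bwlimit ? 1u : 0u) |
	       ((uint32_t)qos << 2) |
	       ((uint32_t)acs << 8) |
	       ((uint32_t)bwl0 << 16);
}

/* Second word: assumed to hold the two remaining limits. */
static uint32_t demo_mbus_cfg1(unsigned int bwl1, unsigned int bwl2)
{
	return (uint32_t)bwl1 | ((uint32_t)bwl2 << 16);
}
```

With these assumptions, demo_mbus_cfg0(true, DEMO_QOS_HIGHEST, 0, 160) yields 0x00a0000d and demo_mbus_cfg1(100, 80) yields 0x00500064, which matches the first pair of writes in the A64 block.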

According to Jens, disabling the on-die termination should set bit 5, not bit 1, in the respective register. Fix this.

Reported-by: Jens Kuske jenskuske@gmail.com
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c
index 2dc2071..3d569fc 100644
--- a/arch/arm/mach-sunxi/dram_sun8i_h3.c
+++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c
@@ -385,7 +385,7 @@ static int mctl_channel_init(struct dram_para *para)
 		clrsetbits_le32(&mctl_ctl->dx[i].gcr, (0x3 << 4) |
 				(0x1 << 1) | (0x3 << 2) | (0x3 << 12) |
 				(0x3 << 14),
-				IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x2);
+				IS_ENABLED(CONFIG_DRAM_ODT_EN) ? 0x0 : 0x20);
 
 	/* AC PDR should always ON */
 	setbits_le32(&mctl_ctl->aciocr, 0x1 << 1);

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
According to Jens, disabling the on-die termination should set bit 5, not bit 1, in the respective register. Fix this.
Reported-by: Jens Kuske jenskuske@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/mach-sunxi/dram_sun8i_h3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
If you know the field name, a #define would be good.
Reviewed-by: Simon Glass sjg@chromium.org
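A named constant along the lines suggested here might look like the sketch below. The name DEMO_DXGCR_ODT_OFF is invented, since the thread does not give the real field name, and demo_clrsetbits() is a plain-memory model of what clrsetbits_le32() does to the register value, not the U-Boot helper itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field name; bit 5 is what the fix above sets. */
#define DEMO_DXGCR_ODT_OFF	(0x1u << 5)

/* Mask copied from the hunk in mctl_channel_init(). */
#define DEMO_DXGCR_MASK \
	((0x3u << 4) | (0x1u << 1) | (0x3u << 2) | (0x3u << 12) | (0x3u << 14))

/* Model of a clear-then-set register update on a plain value. */
static uint32_t demo_clrsetbits(uint32_t reg, uint32_t clear, uint32_t set)
{
	return (reg & ~clear) | set;
}

/* New register value depending on whether ODT is enabled. */
static uint32_t demo_gcr_update(uint32_t gcr, int odt_enabled)
{
	return demo_clrsetbits(gcr, DEMO_DXGCR_MASK,
			       odt_enabled ? 0x0 : DEMO_DXGCR_ODT_OFF);
}
```

Written this way, the difference between the old value 0x2 (bit 1) and the new value 0x20 (bit 5) is visible in the shift count rather than hidden in a hex literal.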

Fix the output of the DRAM size on AArch64 SPLs.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Reviewed-by: Alexander Graf agraf@suse.de
---
 arch/arm/mach-sunxi/dram_sun8i_h3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-sunxi/dram_sun8i_h3.c b/arch/arm/mach-sunxi/dram_sun8i_h3.c
index 3d569fc..5ee8b3d 100644
--- a/arch/arm/mach-sunxi/dram_sun8i_h3.c
+++ b/arch/arm/mach-sunxi/dram_sun8i_h3.c
@@ -571,6 +571,6 @@ unsigned long sunxi_dram_init(void)
 	mctl_auto_detect_dram_size(&para);
 	mctl_set_cr(&para);
 
-	return (1 << (para.row_bits + 3)) * para.page_size *
+	return (1UL << (para.row_bits + 3)) * para.page_size *
 			(para.dual_rank ? 2 : 1);
 }

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
Fix the output of the DRAM size on AArch64 SPLs.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Alexander Graf agraf@suse.de
arch/arm/mach-sunxi/dram_sun8i_h3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org
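Why the width of the literal matters can be shown with a small model. This is a sketch under assumptions: struct demo_dram_para is a reduced stand-in for struct dram_para with guessed field types, and the geometry used in the check is an arbitrary example, not a Pine64 configuration. The original expression starts from a signed int, where overflow is undefined behaviour, so the sketch deliberately uses unsigned 32-bit arithmetic to show the truncation in a well-defined way.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced stand-in for struct dram_para; field types are assumptions. */
struct demo_dram_para {
	uint16_t page_size;
	uint8_t row_bits;
	uint8_t dual_rank;
};

/* Every intermediate result is limited to 32 bits, as with a
 * 32-bit-wide constant on the left of the shift. */
static uint32_t demo_size_32(const struct demo_dram_para *p)
{
	return ((uint32_t)1 << (p->row_bits + 3)) * p->page_size *
	       (p->dual_rank ? 2 : 1);
}

/* The whole product is computed in 64 bits, which is what 1UL gives
 * on an LP64 target such as AArch64. */
static uint64_t demo_size_64(const struct demo_dram_para *p)
{
	return ((uint64_t)1 << (p->row_bits + 3)) * p->page_size *
	       (p->dual_rank ? 2 : 1);
}
```

For a hypothetical geometry of 16 row bits, 4096-byte pages and two ranks, the 32-bit version wraps around to 0 while the 64-bit version returns 4 GiB; for smaller geometries both agree.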

Now that the SPL is ready to be compiled in AArch64 and the DRAM init code is ready, enable SPL support for the A64 SoC and in the Pine64 defconfig. For now we keep the boot0 header in U-Boot proper, as this still allows using boot0 as an SPL replacement without hurting the SPL use case. We disable FEL support for now, as the code isn't ready yet.

Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/mach-sunxi/board.c    | 2 +-
 board/sunxi/Kconfig            | 2 ++
 configs/pine64_plus_defconfig  | 1 +
 include/configs/sunxi-common.h | 2 ++
 4 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-sunxi/board.c b/arch/arm/mach-sunxi/board.c
index 0f8ead9..80d4b57 100644
--- a/arch/arm/mach-sunxi/board.c
+++ b/arch/arm/mach-sunxi/board.c
@@ -133,7 +133,7 @@ static int gpio_init(void)
 	return 0;
 }
 
-#ifdef CONFIG_SPL_BUILD
+#if defined(CONFIG_SPL_BOARD_LOAD_IMAGE) && defined(CONFIG_SPL_BUILD)
 static int spl_board_load_image(struct spl_image_info *spl_image,
 				struct spl_boot_device *bootdev)
 {
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index d477925..b5246df 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -125,6 +125,7 @@ config MACH_SUN50I
 	bool "sun50i (Allwinner A64)"
 	select ARM64
 	select SUNXI_GEN_SUN6I
+	select SUPPORT_SPL
 
 endchoice
 
@@ -187,6 +188,7 @@ config DRAM_ODT_EN
 	bool "sunxi dram odt enable"
 	default n if !MACH_SUN8I_A23
 	default y if MACH_SUN8I_A23
+	default y if MACH_SUN50I
 	---help---
 	Select this to enable dram odt (on die termination).
 
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index ebc24b8..2374170 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -5,6 +5,7 @@ CONFIG_MACH_SUN50I=y
 CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
 # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set
 CONFIG_CONSOLE_MUX=y
+CONFIG_SPL=y
 # CONFIG_CMD_IMLS is not set
 # CONFIG_CMD_FLASH is not set
 # CONFIG_CMD_FPGA is not set
diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h
index e05c318..5279e51 100644
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -183,7 +183,9 @@
 
 #define CONFIG_SPL_FRAMEWORK
 
+#ifndef CONFIG_MACH_SUN50I
 #define CONFIG_SPL_BOARD_LOAD_IMAGE
+#endif
 
 #if defined(CONFIG_MACH_SUN9I)
 #define CONFIG_SPL_TEXT_BASE		0x10040		/* sram start+header */

On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
Now that the SPL is ready to be compiled for AArch64 and the DRAM init code is ready, enable SPL support for the A64 SoC and in the Pine64 defconfig. For now we keep the boot0 header in the U-Boot proper, as this still allows boot0 to be used as an SPL replacement without hurting the SPL use case. We disable FEL support for now, as the code isn't ready yet.
Where is this done? Is it because you don't enable it?
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/mach-sunxi/board.c | 2 +- board/sunxi/Kconfig | 2 ++ configs/pine64_plus_defconfig | 1 + include/configs/sunxi-common.h | 2 ++ 4 files changed, 6 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/arch/arm/mach-sunxi/board.c b/arch/arm/mach-sunxi/board.c index 0f8ead9..80d4b57 100644 --- a/arch/arm/mach-sunxi/board.c +++ b/arch/arm/mach-sunxi/board.c @@ -133,7 +133,7 @@ static int gpio_init(void) return 0; }
-#ifdef CONFIG_SPL_BUILD +#if defined(CONFIG_SPL_BOARD_LOAD_IMAGE) && defined(CONFIG_SPL_BUILD) static int spl_board_load_image(struct spl_image_info *spl_image, struct spl_boot_device *bootdev) { diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index d477925..b5246df 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -125,6 +125,7 @@ config MACH_SUN50I bool "sun50i (Allwinner A64)" select ARM64 select SUNXI_GEN_SUN6I
select SUPPORT_SPL
endchoice
@@ -187,6 +188,7 @@ config DRAM_ODT_EN bool "sunxi dram odt enable" default n if !MACH_SUN8I_A23 default y if MACH_SUN8I_A23
default y if MACH_SUN50I ---help--- Select this to enable dram odt (on die termination).
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig index ebc24b8..2374170 100644 --- a/configs/pine64_plus_defconfig +++ b/configs/pine64_plus_defconfig @@ -5,6 +5,7 @@ CONFIG_MACH_SUN50I=y CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus" # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set CONFIG_CONSOLE_MUX=y +CONFIG_SPL=y # CONFIG_CMD_IMLS is not set # CONFIG_CMD_FLASH is not set # CONFIG_CMD_FPGA is not set diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h index e05c318..5279e51 100644 --- a/include/configs/sunxi-common.h +++ b/include/configs/sunxi-common.h @@ -183,7 +183,9 @@
#define CONFIG_SPL_FRAMEWORK
+#ifndef CONFIG_MACH_SUN50I #define CONFIG_SPL_BOARD_LOAD_IMAGE +#endif
#if defined(CONFIG_MACH_SUN9I)
#define CONFIG_SPL_TEXT_BASE 0x10040 /* sram start+header */

Hi,
On 05/12/16 06:26, Simon Glass wrote:
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
Now that the SPL is ready to be compiled for AArch64 and the DRAM init code is ready, enable SPL support for the A64 SoC and in the Pine64 defconfig. For now we keep the boot0 header in the U-Boot proper, as this still allows boot0 to be used as an SPL replacement without hurting the SPL use case. We disable FEL support for now, as the code isn't ready yet.
Where is this done? Is it because you don't enable it?
It is done at two places below ....
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/mach-sunxi/board.c | 2 +- board/sunxi/Kconfig | 2 ++ configs/pine64_plus_defconfig | 1 + include/configs/sunxi-common.h | 2 ++ 4 files changed, 6 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/arch/arm/mach-sunxi/board.c b/arch/arm/mach-sunxi/board.c index 0f8ead9..80d4b57 100644 --- a/arch/arm/mach-sunxi/board.c +++ b/arch/arm/mach-sunxi/board.c @@ -133,7 +133,7 @@ static int gpio_init(void) return 0; }
-#ifdef CONFIG_SPL_BUILD +#if defined(CONFIG_SPL_BOARD_LOAD_IMAGE) && defined(CONFIG_SPL_BUILD)
Here we disable FEL support if the board or build does not support it.
...
static int spl_board_load_image(struct spl_image_info *spl_image, struct spl_boot_device *bootdev) { diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index d477925..b5246df 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -125,6 +125,7 @@ config MACH_SUN50I bool "sun50i (Allwinner A64)" select ARM64 select SUNXI_GEN_SUN6I
select SUPPORT_SPL
endchoice
@@ -187,6 +188,7 @@ config DRAM_ODT_EN bool "sunxi dram odt enable" default n if !MACH_SUN8I_A23 default y if MACH_SUN8I_A23
default y if MACH_SUN50I ---help--- Select this to enable dram odt (on die termination).
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig index ebc24b8..2374170 100644 --- a/configs/pine64_plus_defconfig +++ b/configs/pine64_plus_defconfig @@ -5,6 +5,7 @@ CONFIG_MACH_SUN50I=y CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus" # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set CONFIG_CONSOLE_MUX=y +CONFIG_SPL=y # CONFIG_CMD_IMLS is not set # CONFIG_CMD_FLASH is not set # CONFIG_CMD_FPGA is not set diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h index e05c318..5279e51 100644 --- a/include/configs/sunxi-common.h +++ b/include/configs/sunxi-common.h @@ -183,7 +183,9 @@
#define CONFIG_SPL_FRAMEWORK
+#ifndef CONFIG_MACH_SUN50I #define CONFIG_SPL_BOARD_LOAD_IMAGE +#endif
... and this one makes sure that it is not enabled for a 64-bit build. Actually let me change this into ARM64, which is more generic and describes the reason better.
Also I guess you'd like to see comments here as well ...
Cheers, Andre.
#if defined(CONFIG_MACH_SUN9I)
#define CONFIG_SPL_TEXT_BASE 0x10040 /* sram start+header */

On 16 December 2016 at 10:40, Andre Przywara andre.przywara@arm.com wrote:
Hi,
On 05/12/16 06:26, Simon Glass wrote:
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
Now that the SPL is ready to be compiled for AArch64 and the DRAM init code is ready, enable SPL support for the A64 SoC and in the Pine64 defconfig. For now we keep the boot0 header in the U-Boot proper, as this still allows boot0 to be used as an SPL replacement without hurting the SPL use case. We disable FEL support for now, as the code isn't ready yet.
Where is this done? Is it because you don't enable it?
It is done at two places below ....
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/mach-sunxi/board.c | 2 +- board/sunxi/Kconfig | 2 ++ configs/pine64_plus_defconfig | 1 + include/configs/sunxi-common.h | 2 ++ 4 files changed, 6 insertions(+), 1 deletion(-)
Reviewed-by: Simon Glass sjg@chromium.org
diff --git a/arch/arm/mach-sunxi/board.c b/arch/arm/mach-sunxi/board.c index 0f8ead9..80d4b57 100644 --- a/arch/arm/mach-sunxi/board.c +++ b/arch/arm/mach-sunxi/board.c @@ -133,7 +133,7 @@ static int gpio_init(void) return 0; }
-#ifdef CONFIG_SPL_BUILD +#if defined(CONFIG_SPL_BOARD_LOAD_IMAGE) && defined(CONFIG_SPL_BUILD)
Here we disable FEL support if the board or build does not support it.
...
static int spl_board_load_image(struct spl_image_info *spl_image, struct spl_boot_device *bootdev) { diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig index d477925..b5246df 100644 --- a/board/sunxi/Kconfig +++ b/board/sunxi/Kconfig @@ -125,6 +125,7 @@ config MACH_SUN50I bool "sun50i (Allwinner A64)" select ARM64 select SUNXI_GEN_SUN6I
select SUPPORT_SPL
endchoice
@@ -187,6 +188,7 @@ config DRAM_ODT_EN bool "sunxi dram odt enable" default n if !MACH_SUN8I_A23 default y if MACH_SUN8I_A23
default y if MACH_SUN50I ---help--- Select this to enable dram odt (on die termination).
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig index ebc24b8..2374170 100644 --- a/configs/pine64_plus_defconfig +++ b/configs/pine64_plus_defconfig @@ -5,6 +5,7 @@ CONFIG_MACH_SUN50I=y CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus" # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set CONFIG_CONSOLE_MUX=y +CONFIG_SPL=y # CONFIG_CMD_IMLS is not set # CONFIG_CMD_FLASH is not set # CONFIG_CMD_FPGA is not set diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h index e05c318..5279e51 100644 --- a/include/configs/sunxi-common.h +++ b/include/configs/sunxi-common.h @@ -183,7 +183,9 @@
#define CONFIG_SPL_FRAMEWORK
+#ifndef CONFIG_MACH_SUN50I #define CONFIG_SPL_BOARD_LOAD_IMAGE +#endif
... and this one makes sure that it is not enabled for a 64-bit build. Actually let me change this into ARM64, which is more generic and describes the reason better.
Also I guess you'd like to see comments here as well ...
Well you could, but if it is temporary perhaps it isn't that important.
Regards, Simon
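
The two guards discussed in this exchange can be sketched in standalone form. This is only an illustration, assuming the change to `CONFIG_ARM64` that Andre proposes: the `CONFIG_*` symbols are defined by hand here (in U-Boot they come from Kconfig and the build system), and `FEL_LOADER_BUILT` and `fel_loader_built()` are hypothetical names that stand in for "`spl_board_load_image()` got compiled".

```c
/* Pretend this is a 64-bit SPL build. */
#define CONFIG_SPL_BUILD
#define CONFIG_ARM64

/* Guard 1 (sunxi-common.h in the patch, here with the proposed ARM64 test):
 * only 32-bit builds ask for the board-specific image loader. */
#ifndef CONFIG_ARM64
#define CONFIG_SPL_BOARD_LOAD_IMAGE
#endif

/* Guard 2 (board.c in the patch): the loader is compiled only when
 * both symbols are defined. */
#if defined(CONFIG_SPL_BOARD_LOAD_IMAGE) && defined(CONFIG_SPL_BUILD)
#define FEL_LOADER_BUILT 1
#else
#define FEL_LOADER_BUILT 0
#endif

/* Hypothetical helper, for illustration only. */
static int fel_loader_built(void)
{
	return FEL_LOADER_BUILT;
}
```

Removing the `CONFIG_ARM64` definition at the top flips the result, which is the 32-bit SPL case.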

Read the specified "arch" value from a legacy or FIT U-Boot image and store it in our SPL data structure. This allows loaders to take the target architecture into account for custom loading procedures. Having the complete string -> arch mapping for FIT based images in the SPL would be too big, so we leave it up to architectures (or boards) to overwrite the weak function that does the actual translation, possibly covering only the required subset there. Document struct spl_image_info on the way.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Simon Glass sjg@chromium.org --- common/spl/spl.c | 1 + common/spl/spl_fit.c | 8 ++++++++ include/spl.h | 15 ++++++++++++++- 3 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/common/spl/spl.c b/common/spl/spl.c
index cda2f8a..b457052 100644
--- a/common/spl/spl.c
+++ b/common/spl/spl.c
@@ -114,6 +114,7 @@ int spl_parse_image_header(struct spl_image_info *spl_image,
 					header_size;
 		}
 		spl_image->os = image_get_os(header);
+		spl_image->arch = image_get_arch(header);
 		spl_image->name = image_get_name(header);
 		debug("spl: payload image: %.*s load addr: 0x%lx size: %d\n",
 			(int)sizeof(spl_image->name), spl_image->name,
diff --git a/common/spl/spl_fit.c b/common/spl/spl_fit.c
index aae556f..a5d903b 100644
--- a/common/spl/spl_fit.c
+++ b/common/spl/spl_fit.c
@@ -123,6 +123,11 @@ static int get_aligned_image_size(struct spl_load_info *info, int data_size,
 	return (data_size + info->bl_len - 1) / info->bl_len;
 }
 
+__weak u8 spl_genimg_get_arch_id(const char *arch_str)
+{
+	return IH_ARCH_DEFAULT;
+}
+
 int spl_load_simple_fit(struct spl_image_info *spl_image,
 			struct spl_load_info *info, ulong sector, void *fit)
 {
@@ -136,6 +141,7 @@ int spl_load_simple_fit(struct spl_image_info *spl_image,
 	int base_offset, align_len = ARCH_DMA_MINALIGN - 1;
 	int src_sector;
 	void *dst, *src;
+	const char *arch_str;
 
 	/*
 	 * Figure out where the external images start. This is the base for the
@@ -184,10 +190,12 @@ int spl_load_simple_fit(struct spl_image_info *spl_image,
 	data_offset = fdt_getprop_u32(fit, node, "data-offset");
 	data_size = fdt_getprop_u32(fit, node, "data-size");
 	load = fdt_getprop_u32(fit, node, "load");
+	arch_str = fdt_getprop(fit, node, "arch", NULL);
 	debug("data_offset=%x, data_size=%x\n", data_offset, data_size);
 	spl_image->load_addr = load;
 	spl_image->entry_point = load;
 	spl_image->os = IH_OS_U_BOOT;
+	spl_image->arch = spl_genimg_get_arch_id(arch_str);
 
 	/*
 	 * Work out where to place the image. We read it so that the first
diff --git a/include/spl.h b/include/spl.h
index feadb33..87129df 100644
--- a/include/spl.h
+++ b/include/spl.h
@@ -20,13 +20,26 @@
 #define MMCSD_MODE_FS		2
 #define MMCSD_MODE_EMMCBOOT	3
 
+/*
+ * Information about an U-Boot image file as described in include/image.h.
+ * Parsed by the SPL code from a legacy or FIT image file.
+ *
+ * @name: descriptive string (mkimage -n)
+ * @load_addr: address to load the image file to (mkimage -a)
+ * @entry_point: address of first instruction to execute (mkimage -e)
+ * @size: size of image in bytes
+ * @flags: optional, used only for SPL_COPY_PAYLOAD_ONLY so far
+ * @os: target operating system, one of IH_OS_* (mkimage -O)
+ * @arch: target architecture, one of IH_ARCH_* (mkimage -A)
+ */
 struct spl_image_info {
 	const char *name;
-	u8 os;
 	ulong load_addr;
 	ulong entry_point;
 	u32 size;
 	u32 flags;
+	u8 os;
+	u8 arch;
 };
 
 /*

On Mon, Dec 05, 2016 at 01:52:26AM +0000, Andre Przywara wrote:
Read the specified "arch" value from a legacy or FIT U-Boot image and store it in our SPL data structure. This allows loaders to take the target architecture into account for custom loading procedures. Having the complete string -> arch mapping for FIT based images in the SPL would be too big, so we leave it up to architectures (or boards) to overwrite the weak function that does the actual translation, possibly covering only the required subset there. Document struct spl_image_info on the way.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Simon Glass sjg@chromium.org
Reviewed-by: Tom Rini trini@konsulko.com
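
The weak-default pattern this patch relies on can be sketched outside of U-Boot as follows. This is an illustration under stated assumptions: it uses the GCC/Clang `__attribute__((weak))` spelling in place of U-Boot's `__weak` macro, the `IH_ARCH_*` numbers are stand-ins for the definitions in include/image.h, `IH_ARCH_DEFAULT` is assumed to be a 32-bit ARM build, and `image_arch_from_fit()` is a hypothetical caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in values for this sketch; the real IDs live in include/image.h. */
#define IH_ARCH_ARM      2
#define IH_ARCH_ARM64   22
#define IH_ARCH_DEFAULT IH_ARCH_ARM	/* assumption: a 32-bit ARM build */

/* Generic fallback: knows no strings, so every image counts as native.
 * A strong definition in another file replaces it at link time. */
__attribute__((weak)) uint8_t spl_genimg_get_arch_id(const char *arch_str)
{
	(void)arch_str;
	return IH_ARCH_DEFAULT;
}

/* Hypothetical caller mirroring what the FIT loader does with the "arch"
 * property, which may be absent (NULL). */
static uint8_t image_arch_from_fit(const char *arch_prop)
{
	return spl_genimg_get_arch_id(arch_prop);
}
```

As long as no architecture or board supplies a strong definition, every image is reported with the default architecture, so behaviour stays unchanged for platforms that do not care.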

At the moment we use the arch/arm directory for arm64 boards as well, so the Makefile will pick up the "arm" name for the architecture to use for tagging binaries in U-Boot image files. Differentiate between the two by looking at the CPU variable being defined to "armv8", and use the arm64 architecture name on creating the image file if that matches.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Simon Glass sjg@chromium.org --- Makefile | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile
index 96ddc59..d6ef646 100644
--- a/Makefile
+++ b/Makefile
@@ -921,13 +921,18 @@ quiet_cmd_cpp_cfg = CFG     $@
 cmd_cpp_cfg = $(CPP) -Wp,-MD,$(depfile) $(cpp_flags) $(LDPPFLAGS) -ansi \
 	-DDO_DEPS_ONLY -D__ASSEMBLY__ -x assembler-with-cpp -P -dM -E -o $@ $<
 
+ifeq ($(CPU),armv8)
+IH_ARCH := arm64
+else
+IH_ARCH := $(ARCH)
+endif
 ifdef CONFIG_SPL_LOAD_FIT
-MKIMAGEFLAGS_u-boot.img = -f auto -A $(ARCH) -T firmware -C none -O u-boot \
+MKIMAGEFLAGS_u-boot.img = -f auto -A $(IH_ARCH) -T firmware -C none -O u-boot \
 	-a $(CONFIG_SYS_TEXT_BASE) -e $(CONFIG_SYS_UBOOT_START) \
 	-n "U-Boot $(UBOOTRELEASE) for $(BOARD) board" -E \
 	$(patsubst %,-b arch/$(ARCH)/dts/%.dtb,$(subst ",,$(CONFIG_OF_LIST)))
 else
-MKIMAGEFLAGS_u-boot.img = -A $(ARCH) -T firmware -C none -O u-boot \
+MKIMAGEFLAGS_u-boot.img = -A $(IH_ARCH) -T firmware -C none -O u-boot \
 	-a $(CONFIG_SYS_TEXT_BASE) -e $(CONFIG_SYS_UBOOT_START) \
 	-n "U-Boot $(UBOOTRELEASE) for $(BOARD) board"
 endif

On Mon, Dec 05, 2016 at 01:52:27AM +0000, Andre Przywara wrote:
At the moment we use the arch/arm directory for arm64 boards as well, so the Makefile will pick up the "arm" name for the architecture to use for tagging binaries in U-Boot image files. Differentiate between the two by looking at the CPU variable being defined to "armv8", and use the arm64 architecture name on creating the image file if that matches.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Simon Glass sjg@chromium.org
Reviewed-by: Tom Rini trini@konsulko.com

Since the SPL FIT loader can now differentiate between different architectures, teach it how to tell arm and arm64 apart when a FIT image is used. We just support those two for now, as these are so far the only sensible alternatives.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Simon Glass sjg@chromium.org --- arch/arm/lib/spl.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/arch/arm/lib/spl.c b/arch/arm/lib/spl.c
index e606d47..45d285c 100644
--- a/arch/arm/lib/spl.c
+++ b/arch/arm/lib/spl.c
@@ -63,3 +63,18 @@ void __noreturn jump_to_image_linux(struct spl_image_info *spl_image, void *arg)
 	image_entry(0, machid, arg);
 }
 #endif
+
+/* This overwrites the weak definition in spl_fit.c */
+u8 spl_genimg_get_arch_id(const char *arch_str)
+{
+	if (!arch_str)
+		return IH_ARCH_DEFAULT;
+
+	if (!strcmp(arch_str, "arm"))
+		return IH_ARCH_ARM;
+
+	if (!strcmp(arch_str, "arm64"))
+		return IH_ARCH_ARM64;
+
+	return IH_ARCH_DEFAULT;
+}

On Mon, Dec 05, 2016 at 01:52:28AM +0000, Andre Przywara wrote:
Since the SPL FIT loader can now differentiate between different architectures, teach it how to tell arm and arm64 apart when a FIT image is used. We just support those two for now, as these are so far the only sensible alternatives.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Simon Glass sjg@chromium.org
Reviewed-by: Tom Rini trini@konsulko.com

The ARMv8-capable Allwinner A64 SoC comes out of reset in AArch32 mode. To run AArch64 code, we have to trigger a warm reset via the RMR register, which proceeds with code execution at the address stored in the RVBAR register. If the bootable payload in the FIT image uses a different architecture than the SPL has been compiled for, enter it via this RMR switch mechanism, by writing the entry point address into the MMIO-mapped, writable version of the RVBAR register. Then the warm reset is triggered via a system register write. If the payload architecture is the same as the SPL's, we use the normal branch as usual.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/spl_switch.c | 60 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+) create mode 100644 arch/arm/mach-sunxi/spl_switch.c
diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
index 7daba11..128091e 100644
--- a/arch/arm/mach-sunxi/Makefile
+++ b/arch/arm/mach-sunxi/Makefile
@@ -51,4 +51,5 @@ obj-$(CONFIG_MACH_SUN8I_A83T)	+= dram_sun8i_a83t.o
 obj-$(CONFIG_MACH_SUN8I_H3)	+= dram_sun8i_h3.o
 obj-$(CONFIG_MACH_SUN9I)	+= dram_sun9i.o
 obj-$(CONFIG_MACH_SUN50I)	+= dram_sun8i_h3.o
+obj-$(CONFIG_MACH_SUN50I)	+= spl_switch.o
 endif
diff --git a/arch/arm/mach-sunxi/spl_switch.c b/arch/arm/mach-sunxi/spl_switch.c
new file mode 100644
index 0000000..20f21b1
--- /dev/null
+++ b/arch/arm/mach-sunxi/spl_switch.c
@@ -0,0 +1,60 @@
+/*
+ * (C) Copyright 2016 ARM Ltd.
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <common.h>
+#include <spl.h>
+
+#include <asm/io.h>
+#include <asm/barriers.h>
+
+static void __noreturn jump_to_image_native(struct spl_image_info *spl_image)
+{
+	typedef void __noreturn (*image_entry_noargs_t)(void);
+
+	image_entry_noargs_t image_entry =
+				(image_entry_noargs_t)spl_image->entry_point;
+
+	image_entry();
+}
+
+static void __noreturn reset_rmr_switch(void)
+{
+#ifdef CONFIG_ARM64
+	__asm__ volatile ( "mrs x0, RMR_EL3\n\t"
+			   "bic x0, x0, #1\n\t"	/* Clear enter-in-64 bit */
+			   "orr x0, x0, #2\n\t"	/* set reset request bit */
+			   "msr RMR_EL3, x0\n\t"
+			   "isb sy\n\t"
+			   "nop\n\t"
+			   "wfi\n\t"
+			   "b .\n"
+			   ::: "x0");
+#else
+	__asm__ volatile ( "mrc 15, 0, r0, cr12, cr0, 2\n\t"
+			   "orr r0, r0, #3\n\t"	/* request reset in 64 bit */
+			   "mcr 15, 0, r0, cr12, cr0, 2\n\t"
+			   "isb\n\t"
+			   "nop\n\t"
+			   "wfi\n\t"
+			   "b .\n"
+			   ::: "r0");
+#endif
+	while (1);	/* to avoid a compiler warning about __noreturn */
+}
+
+void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image)
+{
+	if (spl_image->arch == IH_ARCH_DEFAULT) {
+		debug("entering by branch\n");
+		jump_to_image_native(spl_image);
+	} else {
+		debug("entering by RMR switch\n");
+		writel(spl_image->entry_point, 0x17000a0);
+		DSB;
+		ISB;
+		reset_rmr_switch();
+	}
+}
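
The two inline-assembly variants in this patch differ only in which RMR bits they set. The register arithmetic can be sketched in plain C without touching any system register. This assumes the bit meanings given in the patch comments (bit 0 selects AArch64 for the next reset, bit 1 requests the warm reset); the macro and function names are hypothetical.

```c
#include <stdint.h>

#define RMR_AA64 (1u << 0)	/* next warm reset enters AArch64 if set */
#define RMR_RR   (1u << 1)	/* request a warm reset */

/* From a 64-bit SPL: clear AA64 to come up in AArch32, set RR
 * (the "bic #1; orr #2" sequence). */
static uint32_t rmr_switch_to_aarch32(uint32_t rmr)
{
	return (rmr & ~RMR_AA64) | RMR_RR;
}

/* From a 32-bit SPL: set both AA64 and RR (the "orr #3" sequence). */
static uint32_t rmr_switch_to_aarch64(uint32_t rmr)
{
	return rmr | RMR_AA64 | RMR_RR;
}
```

In both directions the core then restarts at the address held in RVBAR, which is why the entry point has to be written there before the reset is requested.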

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
The ARMv8-capable Allwinner A64 SoC comes out of reset in AArch32 mode. To run AArch64 code, we have to trigger a warm reset via the RMR register, which proceeds with code execution at the address stored in the RVBAR register. If the bootable payload in the FIT image uses a different architecture than the SPL has been compiled for, enter it via this RMR switch mechanism, by writing the entry point address into the MMIO-mapped, writable version of the RVBAR register. Then the warm reset is triggered via a system register write. If the payload architecture is the same as the SPL's, we use the normal branch as usual.
Signed-off-by: Andre Przywara andre.przywara@arm.com
arch/arm/mach-sunxi/Makefile | 1 + arch/arm/mach-sunxi/spl_switch.c | 60 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+) create mode 100644 arch/arm/mach-sunxi/spl_switch.c
diff --git a/arch/arm/mach-sunxi/Makefile b/arch/arm/mach-sunxi/Makefile
index 7daba11..128091e 100644
--- a/arch/arm/mach-sunxi/Makefile
+++ b/arch/arm/mach-sunxi/Makefile
@@ -51,4 +51,5 @@ obj-$(CONFIG_MACH_SUN8I_A83T)	+= dram_sun8i_a83t.o
 obj-$(CONFIG_MACH_SUN8I_H3)	+= dram_sun8i_h3.o
 obj-$(CONFIG_MACH_SUN9I)	+= dram_sun9i.o
 obj-$(CONFIG_MACH_SUN50I)	+= dram_sun8i_h3.o
+obj-$(CONFIG_MACH_SUN50I)	+= spl_switch.o
 endif
diff --git a/arch/arm/mach-sunxi/spl_switch.c b/arch/arm/mach-sunxi/spl_switch.c
new file mode 100644
index 0000000..20f21b1
--- /dev/null
+++ b/arch/arm/mach-sunxi/spl_switch.c
@@ -0,0 +1,60 @@
+/*
+ * (C) Copyright 2016 ARM Ltd.
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ */
+
+#include <common.h>
+#include <spl.h>
+
+#include <asm/io.h>
+#include <asm/barriers.h>
+
+static void __noreturn jump_to_image_native(struct spl_image_info *spl_image)
+{
+	typedef void __noreturn (*image_entry_noargs_t)(void);
+
+	image_entry_noargs_t image_entry =
+				(image_entry_noargs_t)spl_image->entry_point;
+
+	image_entry();
+}
+
+static void __noreturn reset_rmr_switch(void)
+{
+#ifdef CONFIG_ARM64
+	__asm__ volatile ( "mrs x0, RMR_EL3\n\t"
+			   "bic x0, x0, #1\n\t"	/* Clear enter-in-64 bit */
+			   "orr x0, x0, #2\n\t"	/* set reset request bit */
+			   "msr RMR_EL3, x0\n\t"
+			   "isb sy\n\t"
+			   "nop\n\t"
+			   "wfi\n\t"
+			   "b .\n"
+			   ::: "x0");
+#else
+	__asm__ volatile ( "mrc 15, 0, r0, cr12, cr0, 2\n\t"
+			   "orr r0, r0, #3\n\t"	/* request reset in 64 bit */
+			   "mcr 15, 0, r0, cr12, cr0, 2\n\t"
+			   "isb\n\t"
+			   "nop\n\t"
+			   "wfi\n\t"
+			   "b .\n"
+			   ::: "r0");
+#endif
+	while (1);	/* to avoid a compiler warning about __noreturn */
+}
+
+void __noreturn jump_to_image_no_args(struct spl_image_info *spl_image)
+{
+	if (spl_image->arch == IH_ARCH_DEFAULT) {
+		debug("entering by branch\n");
+		jump_to_image_native(spl_image);
+	} else {
+		debug("entering by RMR switch\n");
+		writel(spl_image->entry_point, 0x17000a0);
+		DSB;
+		ISB;
+		reset_rmr_switch();
+	}
I think this could use some comments or a pointer to a README to explain what is going on.
+}
Regards, Simon
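
As a rough picture of the kind of commentary asked for here, the decision made in `jump_to_image_no_args()` can be modelled in a hardware-free sketch. This is not the patch code: `choose_entry()` and the enum are hypothetical, the `IH_ARCH_*` numbers are stand-ins for include/image.h, and no MMIO or system register is accessed.

```c
#include <stdint.h>

#define IH_ARCH_ARM     2	/* stand-in values for this sketch */
#define IH_ARCH_ARM64  22

enum entry_method { ENTER_BY_BRANCH, ENTER_BY_RMR_SWITCH };

/* Hypothetical helper: spl_arch is what the SPL was built for
 * (IH_ARCH_DEFAULT in the patch), payload_arch comes from the image. */
static enum entry_method choose_entry(uint8_t spl_arch, uint8_t payload_arch)
{
	/* Same execution state: an ordinary function-pointer call works. */
	if (payload_arch == spl_arch)
		return ENTER_BY_BRANCH;

	/*
	 * Different execution state: the patch writes the entry point to
	 * the MMIO alias of RVBAR (0x17000a0 on the A64), issues DSB/ISB,
	 * then requests a warm reset through RMR so the core restarts at
	 * that address in the other state.
	 */
	return ENTER_BY_RMR_SWITCH;
}
```

A 32-bit SPL loading a 64-bit U-Boot proper therefore takes the RMR path, while a 64-bit SPL loading the same payload simply branches.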

When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding. This should only be used for the SPL, the U-Boot proper should still be using the existing 64-bit configuration. The SPL code can switch to AArch64 if needed, so a 32-bit SPL can be combined with a 64-bit U-Boot proper to eventually launch arm64 kernels.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- board/sunxi/Kconfig | 14 ++++++++++++-- configs/pine64_plus_defconfig | 2 +- configs/sun50i_spl32_defconfig | 10 ++++++++++ include/configs/sunxi-common.h | 2 +- 4 files changed, 24 insertions(+), 4 deletions(-) create mode 100644 configs/sun50i_spl32_defconfig
diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
index b5246df..bb6e7fa 100644
--- a/board/sunxi/Kconfig
+++ b/board/sunxi/Kconfig
@@ -43,6 +43,10 @@ config SUNXI_GEN_SUN6I
 	watchdog, etc.
 
+config MACH_SUN50I
+	bool
+	select SUNXI_GEN_SUN6I
+
 choice
 	prompt "Sunxi SoC Variant"
 	optional
@@ -121,10 +125,16 @@ config MACH_SUN9I
 	select SUNXI_GEN_SUN6I
 	select SUPPORT_SPL
 
-config MACH_SUN50I
+config MACH_SUN50I_64
 	bool "sun50i (Allwinner A64)"
+	select MACH_SUN50I
 	select ARM64
-	select SUNXI_GEN_SUN6I
+	select SUPPORT_SPL
+
+config MACH_SUN50I_32
+	bool "sun50i (Allwinner A64) SPL-32bit"
+	select MACH_SUN50I
+	select CPU_V7
 	select SUPPORT_SPL
 
 endchoice
diff --git a/configs/pine64_plus_defconfig b/configs/pine64_plus_defconfig
index 2374170..a76f66a 100644
--- a/configs/pine64_plus_defconfig
+++ b/configs/pine64_plus_defconfig
@@ -1,7 +1,7 @@
 CONFIG_ARM=y
 CONFIG_RESERVE_ALLWINNER_BOOT0_HEADER=y
 CONFIG_ARCH_SUNXI=y
-CONFIG_MACH_SUN50I=y
+CONFIG_MACH_SUN50I_64=y
 CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
 # CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set
 CONFIG_CONSOLE_MUX=y
diff --git a/configs/sun50i_spl32_defconfig b/configs/sun50i_spl32_defconfig
new file mode 100644
index 0000000..29c6a47
--- /dev/null
+++ b/configs/sun50i_spl32_defconfig
@@ -0,0 +1,10 @@
+CONFIG_ARM=y
+CONFIG_ARCH_SUNXI=y
+CONFIG_MACH_SUN50I_32=y
+CONFIG_SPL=y
+CONFIG_DEFAULT_DEVICE_TREE="sun50i-a64-pine64-plus"
+CONFIG_OF_LIST="sun50i-a64-pine64 sun50i-a64-pine64-plus"
+# CONFIG_CMD_IMLS is not set
+# CONFIG_CMD_FLASH is not set
+# CONFIG_CMD_FPGA is not set
+CONFIG_MMC_SUNXI_SLOT_EXTRA=2
diff --git a/include/configs/sunxi-common.h b/include/configs/sunxi-common.h
index 5279e51..4113591 100644
--- a/include/configs/sunxi-common.h
+++ b/include/configs/sunxi-common.h
@@ -183,7 +183,7 @@
 
 #define CONFIG_SPL_FRAMEWORK
 
-#ifndef CONFIG_MACH_SUN50I
+#ifndef CONFIG_MACH_SUN50I_64
 #define CONFIG_SPL_BOARD_LOAD_IMAGE
 #endif

Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding. This should only be used for the SPL, the U-Boot proper should still be using the existing 64-bit configuration. The SPL code can switch to AArch64 if needed, so a 32-bit SPL can be combined with a 64-bit U-Boot proper to eventually launch arm64 kernels.
So if I understand correctly, you want SPL to be 32-bit and U-Boot proper to be 64-bit? And you are adding a new board config for that?
Instead, can you do something similar to tegra, which uses ARMv4t for SPL and ARMv7 for U-Boot proper?
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 14 ++++++++++++-- configs/pine64_plus_defconfig | 2 +- configs/sun50i_spl32_defconfig | 10 ++++++++++ include/configs/sunxi-common.h | 2 +- 4 files changed, 24 insertions(+), 4 deletions(-) create mode 100644 configs/sun50i_spl32_defconfig
Regards, Simon

On 05/12/16 06:26, Simon Glass wrote:
Hi Andre,
On 4 December 2016 at 18:52, Andre Przywara andre.przywara@arm.com wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding. This should only be used for the SPL, the U-Boot proper should still be using the existing 64-bit configuration. The SPL code can switch to AArch64 if needed, so a 32-bit SPL can be combined with a 64-bit U-Boot proper to eventually launch arm64 kernels.
So if I understand correctly, you want SPL to be 32-bit and U-Boot proper to be 64-bit?
Yes, that is _one_ possible option, mostly driven by size constraints and due to AArch64 code being much bigger than Thumb2.
And you are adding a new board config for that?
Yes, for now this one (separate) defconfig aims to cover all A64 boards. This is not ideal (as Maxime pointed out already), so if you know a nice way of using a single defconfig for one board and configuring it once with CPU_V7 and then again with ARM64 set, I am all ears.
Also I haven't found a make target to just build the SPL (possibly another one for just the U-Boot proper). Maybe this would help things?
Instead, can you do something similar to tegra, which uses ARMv4t for SPL and ARMv7 for U-Boot proper?
You will need two different (cross-)compilers, so just setting some compiler options will not help. Besides, compiling the SPL as 32-bit is only an option; the SPL also works as a pure 64-bit binary. And people expressed the wish of having the option of using both ways - at least for the time being.
Cheers, Andre.
Signed-off-by: Andre Przywara andre.przywara@arm.com
board/sunxi/Kconfig | 14 ++++++++++++-- configs/pine64_plus_defconfig | 2 +- configs/sun50i_spl32_defconfig | 10 ++++++++++ include/configs/sunxi-common.h | 2 +- 4 files changed, 24 insertions(+), 4 deletions(-) create mode 100644 configs/sun50i_spl32_defconfig
Regards, Simon

On Sat, Dec 17, 2016 at 02:44:46PM +0000, André Przywara wrote:
Instead, can you do something similar to tegra, which uses ARMv4t for SPL and ARMv7 for U-Boot proper?
You will need two different (cross-)compilers, so just setting some compiler options will not help. Besides, compiling the SPL as 32-bit is only an option; the SPL also works as a pure 64-bit binary. And people expressed the wish of having the option of using both ways - at least for the time being.
Honestly, I have the feeling that this is the only thing that holds this series back. Maybe you can send it without it first, we merge it, and then we work on the 32-bit SPL. That way, we delay nothing but the very last patch.
Thanks! Maxime

Salut Maxime,
On 19/12/16 08:20, Maxime Ripard wrote:
On Sat, Dec 17, 2016 at 02:44:46PM +0000, André Przywara wrote:
Instead, can you do something similar to tegra, which uses ARMv4t for SPL and ARMv7 for U-Boot proper?
You will need two different (cross-)compilers, so just setting some compiler options will not help. Besides, compiling the SPL as 32-bit is only an option; the SPL also works as a pure 64-bit binary. And people expressed the wish of having the option of using both ways - at least for the time being.
Honestly, I have the feeling that this is the only thing that holds this series back. Maybe you can send it without it first, we merge it, and then we work on the 32-bit SPL. That way, we delay nothing but the very last patch.
Feel free to do so already. I deliberately made the series this way, so you can merge anything up to and including "[PATCH v3 21/26] sunxi: A64: enable SPL" and get the 64-bit SPL functionality. If you like, you can also take the others but the last patch as well.
Actually I would be very glad to get that two-digit patch count off my back, as this makes handling much easier. Also I have the H5 support ready on top of that ...
Thanks!
Cheers, Andre.

On Mon, Dec 05, 2016 at 01:52:30AM +0000, Andre Przywara wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding.
"mostly generic". Where do you draw the line? How do you deal with a board that would use a different UART? a different MMC? different memory configuration.?
This looks like it's not generic at all, it's just a configuration for the Pine64.
Maxime

Hi,
On 06/12/16 11:28, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:30AM +0000, Andre Przywara wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding.
"mostly generic". Where do you draw the line? How do you deal with a board that would use a different UART? a different MMC? different memory configuration.?
My impression was that it's rather pointless to provide another set of 32-bit SPL defconfigs for each board again, especially given that for the SPL's needs the boards so far seem to be very similar. For the loading part we will probably go with what the BROM already started: load more data from one of the BROM boot sources, which is fixed in the SoC and can't be really changed by a board vendor anyway. Which really leaves the DRAM setup and the UART. I can't predict the future, but so far those A64 boards look fairly similar in this respect. So I just avoid having another SPL defconfig for the BananaPi M64, for instance. I just added MMC_SUNXI_SLOT_EXTRA because this doesn't hurt on the Pine64, so less churn here.
So if you know of any board which breaks this assumption, I am happy to hear about it and see if it can be integrated.
Actually I had the idea of eventually going the other way around: Providing one U-Boot proper A64 defconfig and let the DT (provided by the SPL) sort out the differences. Then we might be able to live with separate SPL defconfigs. But that's another patchset and probably quite some work.
This looks like it's not generic at all, it's just a configuration for the Pine64.
And the BananaPi M64. And the upcoming Olimex board (trusting their latest published schematics). If in need, we can always provide separate defconfigs for "odd" boards.
So this is the best solution I came up with, if you have a better one: I am all ears. Would it make sense to rename this file to not claim universal coverage?
Cheers, Andre.

On Tue, Dec 06, 2016 at 12:22:59PM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 11:28, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:30AM +0000, Andre Przywara wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding.
"mostly generic". Where do you draw the line? How do you deal with a board that would use a different UART? a different MMC? different memory configuration.?
My impression was that it's rather pointless to provide another set of 32-bit SPL defconfigs for each board again, especially given that for the SPL's needs the boards so far seem to be very similar. For the loading part we will probably go with what the BROM already started: load more data from one of the BROM boot sources, which is fixed in the SoC and can't be really changed by a board vendor anyway. Which really leaves the DRAM setup and the UART.
So you plan on enabling all BROM boot sources as well (NAND, SPI) ?
I can't predict the future, but so far those A64 boards look fairly similar in this respect. So I just avoid having another SPL defconfig for the BananaPi M64, for instance. I just added MMC_SUNXI_SLOT_EXTRA because this doesn't hurt on the Pine64, so less churn here.
So if you know of any board which breaks this assumption, I am happy to hear about it and see if it can be integrated.
I know at least of one board that uses the UART3 on A33, instead of UART0. The trend is very clear on the A64 and the previous SoCs, but we also had some variations, so we need to take that into account. Which brings me back to my original question, where do you draw the line ? :)
And that would prevent us from doing any kind of DRAM settings enhancements in the future, every one using the best common denominator, which seems inefficient.
Actually I had the idea of eventually going the other way around: Providing one U-Boot proper A64 defconfig and let the DT (provided by the SPL) sort out the differences. Then we might be able to live with separate SPL defconfigs. But that's another patchset and probably quite some work.
That would work for MMC and UART (in u-boot, not in the SPL), but not for the RAM setup.
Maxime

Hi,
On 12/12/16 15:13, Maxime Ripard wrote:
On Tue, Dec 06, 2016 at 12:22:59PM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 11:28, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:30AM +0000, Andre Przywara wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding.
"mostly generic". Where do you draw the line? How do you deal with a board that would use a different UART? a different MMC? different memory configuration.?
My impression was that it's rather pointless to provide another set of 32-bit SPL defconfigs for each board again, especially given that for the SPL's needs the boards so far seem to be very similar. For the loading part we will probably go with what the BROM already started: load more data from one of the BROM boot sources, which is fixed in the SoC and can't be really changed by a board vendor anyway. Which really leaves the DRAM setup and the UART.
So you plan on enabling all BROM boot sources as well (NAND, SPI) ?
In fact SPI works already (with little to no changes). And I don't care about NAND, really ;-) Is anyone aware of an A64 board using this?
I can't predict the future, but so far those A64 boards look fairly similar in this respect. So I just avoid having another SPL defconfig for the BananaPi M64, for instance. I just added MMC_SUNXI_SLOT_EXTRA because this doesn't hurt on the Pine64, so less churn here.
So if you know of any board which breaks this assumption, I am happy to hear about it and see if it can be integrated.
I know at least of one board that uses the UART3 on A33, instead of UART0. The trend is very clear on the A64 and the previous SoCs, but we also had some variations, so we need to take that into account. Which brings me back to my original question, where do you draw the line ? :)
I don't know, and to make this clear: I see the point in having separate configs for the SPL, but due to the 32-bit/64-bit split we probably need _two_ sets of defconfigs, which gets pretty messy very quickly. Especially given that they are very similar.
So how do we avoid this? Can we somehow share a defconfig between armv8 and armv7? At the moment "CONFIG_CPU_V7" and "CONFIG_ARM64" conflict in the same file.
And that would prevent us from doing any kind of DRAM settings enhancements in the future, every one using the best common denominator, which seems inefficient.
Yeah, I think this is a much better point. I have spent a lot of time in DRAM land in the past few days, and I think we _have_ different DRAMs already:
1) The Pine64 uses DDR3-1333 DRAM chips @ 672 MHz.
2) The SoPine and the Pinebook use LPDDR3 DRAM, which requires a slightly different setup; also the stock frequency in that one boot0.bin floating around is much lower (524 MHz, IIRC).
3) The BananaPi M64 and Theobroma's board use SKhynix DDR3 DRAM chips which are from a DDR3-1600 bin. I think Philipp reported having it running happily at 800 MHz with some sane JEDEC-based timing, so it's worth enabling this there.
Which brings us to the following complications: a) We have similar, but still different DRAM _controllers_ (H3, A64, H5). Those have a slightly different register set, though it seems to be mostly missing/added features, so nothing really conflicting. b) We have different DRAM _chips_ being used on top of possibly any of those controllers.
So this brings us from "one type of DRAM chip on top of one DRAM controller" today for the H3 driver to "multiple DRAM chips on top of slightly different controllers", which is a totally different story. I was looking into significantly reworking the DRAM driver to address that, but was hoping to defer this to a later stage.
While the DRAM controller part can probably be solved by #ifdef'ing or static parameters, I wonder if we should really explore how to address different DRAM chips properly:
I) We create a DT binding loosely based on the "jedec,lpddr2-timings" one in the kernel and use of-platdata as Simon suggested.
II) We create some fixed tables of standard (JEDEC) timings in the driver and let one of those tables be selected at runtime based on some parameters, for instance in the SPL header. This could be as easy as "DDR3-1333" vs. "LPDDR3-1066". This would allow us to adapt an existing SPL to multiple chips/boards without needing to rebuild it, possibly even opening the door to auto-detection. For instance I found the LPDDR3 boot0 bailing out on the Pine64, so we might at least detect the difference there.
III) Something in between.
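The table-selection idea in option II can be illustrated with a rough sketch. This is plain Python for illustration only, not U-Boot code: the selector values, the table layout and the function names are hypothetical, and the only numbers used are the data rates implied by the two speed bin names mentioned in the text.

```python
# Hypothetical sketch: pick a DRAM profile from a one-byte selector,
# as might be stored in the SPL header. The selector values are made up.
from typing import NamedTuple


class DramProfile(NamedTuple):
    dram_type: str       # "DDR3" or "LPDDR3"
    data_rate_mts: int   # mega-transfers per second, taken from the bin name


# Assumed mapping; a real table would also carry the JEDEC timings.
PROFILES = {
    0: DramProfile("DDR3", 1333),
    1: DramProfile("LPDDR3", 1066),
}


def select_profile(selector: int) -> DramProfile:
    """Return the profile for a selector byte, rejecting unknown values."""
    try:
        return PROFILES[selector]
    except KeyError:
        raise ValueError(f"unknown DRAM selector {selector}") from None


def bin_name(profile: DramProfile) -> str:
    """Rebuild the human-readable speed bin name, e.g. 'DDR3-1333'."""
    return f"{profile.dram_type}-{profile.data_rate_mts}"
```

Rejecting unknown selectors instead of falling back to a default is a deliberate choice in this sketch: silently running DRAM with the wrong profile would be harder to debug than an early failure.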
It's just a pity that this could hold off upstreaming the Pine64 SPL support.
Cheers, Andre.
Actually I had the idea of eventually going the other way around: Providing one U-Boot proper A64 defconfig and let the DT (provided by the SPL) sort out the differences. Then we might be able to live with separate SPL defconfigs. But that's another patchset and probably quite some work.
That would work for MMC and UART (in u-boot, not in the SPL), but not for the RAM setup.

On Tue, Dec 13, 2016 at 12:04 AM, Andre Przywara andre.przywara@arm.com wrote:
Hi,
On 12/12/16 15:13, Maxime Ripard wrote:
On Tue, Dec 06, 2016 at 12:22:59PM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 11:28, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:30AM +0000, Andre Przywara wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding.
"mostly generic". Where do you draw the line? How do you deal with a board that would use a different UART? a different MMC? different memory configuration.?
My impression was that it's rather pointless to provide another set of 32-bit SPL defconfigs for each board again, especially given that for the SPL's needs the boards so far seem to be very similar. For the loading part we will probably go with what the BROM already started: load more data from one of the BROM boot sources, which is fixed in the SoC and can't be really changed by a board vendor anyway. Which really leaves the DRAM setup and the UART.
So you plan on enabling all BROM boot sources as well (NAND, SPI) ?
In fact SPI works already (with little to no changes). And I don't care about NAND, really ;-) Is anyone aware of an A64 board using this?
I can't predict the future, but so far those A64 boards look fairly similar in this respect. So I just avoid having another SPL defconfig for the BananaPi M64, for instance. I just added MMC_SUNXI_SLOT_EXTRA because this doesn't hurt on the Pine64, so less churn here.
So if you know of any board which breaks this assumption, I am happy to hear about it and see if it can be integrated.
I know at least of one board that uses the UART3 on A33, instead of UART0. The trend is very clear on the A64 and the previous SoCs, but we also had some variations, so we need to take that into account. Which brings me back to my original question, where do you draw the line ? :)
I don't know, and to make this clear: I see the point in having separate configs for the SPL, but due to the 32-bit/64-bit split we probably need _two_ sets of defconfigs, which gets pretty messy very quickly. Especially given that they are very similar.
So how do we avoid this? Can we somehow share a defconfig between armv8 and armv7? At the moment "CONFIG_CPU_V7" and "CONFIG_ARM64" conflict in the same file.
And that would prevent us from doing any kind of DRAM settings enhancements in the future, every one using the best common denominator, which seems inefficient.
Yeah, I think this is a much better point. I was a lot in DRAM land in the past few days, and I think we _have_ different DRAMs already:
1) The Pine64 uses DDR3-1333 DRAM chips @ 672 MHz
2) The SoPine and the Pinebook use LPDDR3 DRAM, which requires a slightly different setup, also the stock frequency in that one boot0.bin floating around is much lower (524 MHz, IIRC).
3) The BananaPi M64 and Theobroma's board use SKhynix DDR3 DRAM chips which are from a DDR3-1600 bin. I think Philipp reported to have it running at 800 MHz with some sane JEDEC based timing happily, so it's worth to enable this there.
Which brings us to the following complications: a) We have similar, but still different DRAM _controllers_ (H3, A64, H5). Those have a slightly different register set, though it seems to be mostly missing/added features, so nothing really conflicting. b) We have different DRAM _chips_ being used on top of possibly any of those controllers.
So this brings us from "one type of DRAM chip on top of one DRAM controller" today for the H3 driver to "multiple DRAM chips on top of slightly different controllers", which is a totally different story. I was looking into significantly reworking the DRAM driver to address that, but was hoping to defer this to a later stage.
While the DRAM controller part can probably be solved by #ifdef'ing or static parameters, I wonder if we should really explore how to address different DRAM chips properly:
I) We create a DT binding loosely based on the "jedec,lpddr2-timings" one in the kernel and use of-platdata as Simon suggested.
II) We create some fixed tables of standard (JEDEC) timings in the driver and let one of those tables be selected at runtime based on some parameters, for instance in the SPL header. This could be as easy as "DDR3-1333" vs. "LPDDR3-1066". This would allow us to adapt an existing SPL to multiple chips/boards without needing to rebuild it, possibly even opening the door to auto-detection. For instance I found the LPDDR3 boot0 bailing out on the Pine64, so we might at least detect the difference there.
Bailing out should be expected. LPDDR3 runs at a lower voltage, so the standard DDR3 chip w/ LPDDR3 boot0 would not get sufficient power to work. Also, it seems LPDDR3 uses a different protocol. I'm not an expert on this though. See
https://blogs.synopsys.com/committedtomemory/2014/01/10/when-is-lpddr3-not-l... https://blogs.synopsys.com/committedtomemory/files/2014/01/DDR3-DDR3L-LPDDR3...
As I see it, there would need to be 2 settings:
a) RAM type. We already have a Kconfig option to support LPDDR2 for the A83T.
b) DRAM speed bin. This would probably select one of the standard JEDEC timings for the selected DRAM type.
Note that Allwinner's boot0 supposedly just has all the DRAM parameters in its header, including options to use standard or custom timings.
Regards ChenYu
III) Something in between.
It's just a pity that this could hold off upstreaming the Pine64 SPL support.
Cheers, Andre.
Actually I had the idea of eventually going the other way around: Providing one U-Boot proper A64 defconfig and let the DT (provided by the SPL) sort out the differences. Then we might be able to live with separate SPL defconfigs. But that's another patchset and probably quite some work.
That would work for MMC and UART (in u-boot, not in the SPL), but not for the RAM setup.

Hi,
On 12/12/16 16:18, Chen-Yu Tsai wrote:
On Tue, Dec 13, 2016 at 12:04 AM, Andre Przywara andre.przywara@arm.com wrote:
Hi,
On 12/12/16 15:13, Maxime Ripard wrote:
On Tue, Dec 06, 2016 at 12:22:59PM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 11:28, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:30AM +0000, Andre Przywara wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding.
"mostly generic". Where do you draw the line? How do you deal with a board that would use a different UART? a different MMC? different memory configuration.?
My impression was that it's rather pointless to provide another set of 32-bit SPL defconfigs for each board again, especially given that for the SPL's needs the boards so far seem to be very similar. For the loading part we will probably go with what the BROM already started: load more data from one of the BROM boot sources, which is fixed in the SoC and can't be really changed by a board vendor anyway. Which really leaves the DRAM setup and the UART.
So you plan on enabling all BROM boot sources as well (NAND, SPI) ?
In fact SPI works already (with little to no changes). And I don't care about NAND, really ;-) Is anyone aware of an A64 board using this?
I can't predict the future, but so far those A64 boards look fairly similar in this respect. So I just avoid having another SPL defconfig for the BananaPi M64, for instance. I just added MMC_SUNXI_SLOT_EXTRA because this doesn't hurt on the Pine64, so less churn here.
So if you know of any board which breaks this assumption, I am happy to hear about it and see if it can be integrated.
I know at least of one board that uses the UART3 on A33, instead of UART0. The trend is very clear on the A64 and the previous SoCs, but we also had some variations, so we need to take that into account. Which brings me back to my original question, where do you draw the line ? :)
I don't know, and to make this clear: I see the point in having separate configs for the SPL, but due to the 32-bit/64-bit split we probably need _two_ sets of defconfigs, which gets pretty messy very quickly. Especially given that they are very similar.
So how do we avoid this? Can we somehow share a defconfig between armv8 and armv7? At the moment "CONFIG_CPU_V7" and "CONFIG_ARM64" conflict in the same file.
And that would prevent us from doing any kind of DRAM settings enhancements in the future, every one using the best common denominator, which seems inefficient.
Yeah, I think this is a much better point. I was a lot in DRAM land in the past few days, and I think we _have_ different DRAMs already:
1) The Pine64 uses DDR3-1333 DRAM chips @ 672 MHz
2) The SoPine and the Pinebook use LPDDR3 DRAM, which requires a slightly different setup, also the stock frequency in that one boot0.bin floating around is much lower (524 MHz, IIRC).
3) The BananaPi M64 and Theobroma's board use SKhynix DDR3 DRAM chips which are from a DDR3-1600 bin. I think Philipp reported to have it running at 800 MHz with some sane JEDEC based timing happily, so it's worth to enable this there.
Which brings us to the following complications: a) We have similar, but still different DRAM _controllers_ (H3, A64, H5). Those have a slightly different register set, though it seems to be mostly missing/added features, so nothing really conflicting. b) We have different DRAM _chips_ being used on top of possibly any of those controllers.
So this brings us from "one type of DRAM chip on top of one DRAM controller" today for the H3 driver to "multiple DRAM chips on top of slightly different controllers", which is a totally different story. I was looking into significantly reworking the DRAM driver to address that, but was hoping to defer this to a later stage.
While the DRAM controller part can probably be solved by #ifdef'ing or static parameters, I wonder if we should really explore how to address different DRAM chips properly:
I) We create a DT binding loosely based on the "jedec,lpddr2-timings" one in the kernel and use of-platdata as Simon suggested.
II) We create some fixed tables of standard (JEDEC) timings in the driver and let one of those tables be selected at runtime based on some parameters, for instance in the SPL header. This could be as easy as "DDR3-1333" vs. "LPDDR3-1066". This would allow us to adapt an existing SPL to multiple chips/boards without needing to rebuild it, possibly even opening the door to auto-detection. For instance I found the LPDDR3 boot0 bailing out on the Pine64, so we might at least detect the difference there.
Bailing out should be expected. LPDDR3 runs at a lower voltage, so the standard DDR3 chip w/ LPDDR3 boot0 would not get sufficient power to work.
I don't think this is the reason. The AXP803 has a pin to externally select a reset voltage for DCDC5, which drives the DRAM chips. So a board vendor would connect this pin appropriately to match the _soldered_ DRAM chips.
Also, it seems LPDDR3 uses a different protocol.
Yes, that's what I found also, and I hope that this is sufficient to reliably tell the two apart. Somehow the algorithm found out that something was wrong and quit. I was wondering if this could be used in a reliable way.
I'm not an expert on this though. See
https://blogs.synopsys.com/committedtomemory/2014/01/10/when-is-lpddr3-not-l... https://blogs.synopsys.com/committedtomemory/files/2014/01/DDR3-DDR3L-LPDDR3...
Thanks for those links, I will add these to my constantly piling up heap of "how DRAM works" documents that populate my desktop lately, leading to the OOM killer going 'round yesterday on my machine ;-)
As I see it, there would need to be 2 settings:
a) RAM type. We already have a Kconfig option to support LPDDR2 for the A83T.
Yes, though it's just DDR3 vs LPDDR3 in our case.
b) DRAM speed bin. This would probably select one of the standard JEDEC timings for the selected DRAM type.
Yes.
Note that Allwinner's boot0 supposedly just has all the DRAM parameters in its header, including options to use standard or custom timings.
By staring at the boot0 disassembly lately I believe the existing boot0's ignore all those numerous parameters unless a bit in this tpr13 parameter is set (which is clear on the A64 one). Instead they just derive the timing parameters from the given frequency, though in a slightly non-standard (wrong?) way.
So yes: I hope that just "memory type" and "frequency/JEDEC speed bin" should be sufficient to get the right DRAM setup, which could then happily live as two Kconfig parameters.
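To make the "derive the timings from the frequency" idea concrete, here is a minimal Python sketch of the usual arithmetic: a timing specified in nanoseconds is rounded up to whole clock cycles. It is an illustration only and does not claim to reproduce what boot0 or any U-Boot driver does; the function names are hypothetical and the nanosecond values in the usage lines are placeholders, not values from a JEDEC table.

```python
import math


def clock_period_ps(clock_mhz: float) -> float:
    """Length of one DRAM clock cycle in picoseconds."""
    return 1_000_000 / clock_mhz


def ns_to_cycles(ns: float, clock_mhz: float) -> int:
    """Round a timing given in nanoseconds up to whole clock cycles."""
    # ns * MHz / 1000 is the exact (fractional) cycle count; a timing
    # constraint is a minimum, so it must be rounded up, never down.
    return math.ceil(ns * clock_mhz / 1000)


# Placeholder timing values, for illustration only:
cycles_at_672 = ns_to_cycles(15, 672)   # 10.08 cycles, rounded up to 11
cycles_at_800 = ns_to_cycles(10, 800)   # exactly 8 cycles
```

With such a helper, a "memory type" plus "frequency" pair would be enough to turn a table of nanosecond values into register settings, which is the property the two Kconfig parameters rely on.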
Cheers, Andre.
III) Something in between.
It's just a pity that this could hold off upstreaming the Pine64 SPL support.
Cheers, Andre.
Actually I had the idea of eventually going the other way around: Providing one U-Boot proper A64 defconfig and let the DT (provided by the SPL) sort out the differences. Then we might be able to live with separate SPL defconfigs. But that's another patchset and probably quite some work.
That would work for MMC and UART (in u-boot, not in the SPL), but not for the RAM setup.

On Mon, Dec 12, 2016 at 04:04:23PM +0000, Andre Przywara wrote:
Hi,
On 12/12/16 15:13, Maxime Ripard wrote:
On Tue, Dec 06, 2016 at 12:22:59PM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 11:28, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:30AM +0000, Andre Przywara wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding.
"mostly generic". Where do you draw the line? How do you deal with a board that would use a different UART? a different MMC? different memory configuration.?
My impression was that it's rather pointless to provide another set of 32-bit SPL defconfigs for each board again, especially given that for the SPL's needs the boards so far seem to be very similar. For the loading part we will probably go with what the BROM already started: load more data from one of the BROM boot sources, which is fixed in the SoC and can't be really changed by a board vendor anyway. Which really leaves the DRAM setup and the UART.
So you plan on enabling all BROM boot sources as well (NAND, SPI) ?
In fact SPI works already (with little to no changes).
It's the little changes that I'm interested in to be honest :)
And I don't care about NAND, really ;-) Is anyone aware of an A64 board using this?
Well, it's one of the possible boot sources, so we have to consider it and not ignore it entirely, hoping that no one will ever use it.
SPI booting was not used for 5-6 years, until someone started that trend and now we have a significant number of boards implementing it.
I can't predict the future, but so far those A64 boards look fairly similar in this respect. So I just avoid having another SPL defconfig for the BananaPi M64, for instance. I just added MMC_SUNXI_SLOT_EXTRA because this doesn't hurt on the Pine64, so less churn here.
So if you know of any board which breaks this assumption, I am happy to hear about it and see if it can be integrated.
I know at least of one board that uses the UART3 on A33, instead of UART0. The trend is very clear on the A64 and the previous SoCs, but we also had some variations, so we need to take that into account. Which brings me back to my original question, where do you draw the line ? :)
I don't know, and to make this clear: I see the point in having separate configs for the SPL, but due to the 32-bit/64-bit split we probably need _two_ sets of defconfigs, which gets pretty messy very quickly. Especially given that they are very similar.
So how do we avoid this? Can we somehow share a defconfig between armv8 and armv7? At the moment "CONFIG_CPU_V7" and "CONFIG_ARM64" conflict in the same file.
I guess the easiest and most robust solution to do this would be to just generate it from the defconfig. It really feels from your patches that it's just a matter of sed -i 's/CONFIG_MACH_SUN50I/CONFIG_MACH_SUN50I_32/' on the "real" defconfig.
And we don't have to worry about keeping it in sync, or the board specific init that might need to be done.
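As a rough illustration of generating the 32-bit defconfig from the "real" one, the substitution could look like the Python sketch below. The function name is hypothetical, and the sketch assumes the symbol swap is the only difference needed, which the thread suggests but does not confirm. Unlike a plain textual replace, it matches the symbol as a whole word, so other symbols that merely share the prefix are left alone.

```python
import re


def make_32bit_defconfig(defconfig_text: str) -> str:
    """Return the defconfig text with the SoC symbol swapped for its 32-bit twin."""
    # \b after the symbol stops the match when an underscore or another
    # word character follows, so longer symbols with this prefix survive.
    return re.sub(r"\bCONFIG_MACH_SUN50I\b",
                  "CONFIG_MACH_SUN50I_32",
                  defconfig_text)


# CONFIG_MACH_SUN50I_FOO is a made-up symbol used only to show the guard.
sample = "CONFIG_ARM=y\nCONFIG_MACH_SUN50I=y\nCONFIG_MACH_SUN50I_FOO=y\n"
converted = make_32bit_defconfig(sample)
```

Running the function on its own output changes nothing, so a build rule could apply it unconditionally.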
Maxime

Hi,
On 16/12/16 14:52, Maxime Ripard wrote:
On Mon, Dec 12, 2016 at 04:04:23PM +0000, Andre Przywara wrote:
Hi,
On 12/12/16 15:13, Maxime Ripard wrote:
On Tue, Dec 06, 2016 at 12:22:59PM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 11:28, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:30AM +0000, Andre Przywara wrote:
When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't use the more compact Thumb2 encoding, which only exists for AArch32 code. This makes the SPL rather big, up to a point where any code additions or even a different compiler may easily exceed the 32KB limit that the Allwinner BROM imposes. Introduce a separate, mostly generic sun50i-a64 configuration, which defines the CPU_V7 symbol and thus will create a 32-bit binary using the memory-saving Thumb2 encoding.
"mostly generic". Where do you draw the line? How do you deal with a board that would use a different UART? a different MMC? different memory configuration.?
My impression was that it's rather pointless to provide another set of 32-bit SPL defconfigs for each board again, especially given that for the SPL's needs the boards so far seem to be very similar. For the loading part we will probably go with what the BROM already started: load more data from one of the BROM boot sources, which is fixed in the SoC and can't be really changed by a board vendor anyway. Which really leaves the DRAM setup and the UART.
So you plan on enabling all BROM boot sources as well (NAND, SPI) ?
In fact SPI works already (with little to no changes).
It's the little changes that I'm interested in to be honest :)
Well, thinking about it again I think normal SPI boot (legacy U-Boot image as the payload) requires no changes: just enable it in the defconfig. The patches I was talking about were about enabling FIT support on top of it.
The reason why it works is due to Siarhei's SPL SPI code and due to the fact it is kind of "supported" on these boards, which is not true for NAND on most boards, AFAIK. But if we get support for that, it would just work the same way, due to the boot source detection. So I think handling multiple boot sources within one SPL binary is a general sunxi SPL feature already implemented today. I definitely use the same thing for FEL, SPI and SD and eMMC.
And I don't care about NAND, really ;-) Is anyone aware of an A64 board using this?
Well, it's one of the possible boot sources, so we have to consider it and not ignore it entirely, hoping that no one will ever use it.
But there isn't any support for it so far, is there? Even if we wanted to compile an SPL just for NAND.
SPI booting was not used for 5-6 years, until someone started that trend and now we have a significant number of boards implementing it.
I can't predict the future, but so far those A64 boards look fairly similar in this respect. So I just avoid having another SPL defconfig for the BananaPi M64, for instance. I just added MMC_SUNXI_SLOT_EXTRA because this doesn't hurt on the Pine64, so less churn here.
So if you know of any board which breaks this assumption, I am happy to hear about it and see if it can be integrated.
I know at least of one board that uses the UART3 on A33, instead of UART0. The trend is very clear on the A64 and the previous SoCs, but we also had some variations, so we need to take that into account. Which brings me back to my original question, where do you draw the line ? :)
I don't know, and to make this clear: I see the point in having separate configs for the SPL, but due to the 32-bit/64-bit split we probably need _two_ sets of defconfigs, which gets pretty messy very quickly. Especially given that they are very similar.
So how do we avoid this? Can we somehow share a defconfig between armv8 and armv7? At the moment "CONFIG_CPU_V7" and "CONFIG_ARM64" conflict in the same file.
I guess the easiest and most robust solution to do this would be to just generate it from the defconfig. It really feels from your patches that it's just a matter of sed -i 's/CONFIG_MACH_SUN50I/CONFIG_MACH_SUN50I_32/' on the "real" defconfig.
Yes, the difference between the defconfigs is really minimal. They look more different at the moment because we don't need Ethernet and USB in the SPL, for instance, and the SPI support is SPL only atm. But yeah, we could unify them, no question.
BUT: What exactly do you mean by "just generate it from the defconfig"? Some Makefile hack? Like detecting 32 vs 64 with the help of the ARCH environment variable or the ${CROSS_COMPILE}gcc -dumpmachine output?
And we don't have to worry about keeping it in sync, or the board specific init that might need to be done.
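For the -dumpmachine idea, the detection itself is a small string check on the target triplet that the compiler prints. The Python sketch below is only an illustration of that check; the function name is hypothetical and nothing like it is part of the series.

```python
def target_is_64bit(dumpmachine_output: str) -> bool:
    """Guess the SPL word size from a GCC target triplet."""
    # The architecture is the first dash-separated field of the triplet,
    # e.g. "aarch64" in "aarch64-linux-gnu".
    arch = dumpmachine_output.strip().split("-", 1)[0]
    return arch in ("aarch64", "aarch64_be")
```

The triplet could be obtained with subprocess.check_output([cross_compile + "gcc", "-dumpmachine"], text=True), where cross_compile is whatever prefix the build uses.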
Agreed.
Cheers, Andre.

On Fri, Dec 16, 2016 at 03:39:06PM +0000, Andre Przywara wrote:
On 16/12/16 14:52, Maxime Ripard wrote:
On Mon, Dec 12, 2016 at 04:04:23PM +0000, Andre Przywara wrote:
Hi,
On 12/12/16 15:13, Maxime Ripard wrote:
On Tue, Dec 06, 2016 at 12:22:59PM +0000, Andre Przywara wrote:
Hi,
On 06/12/16 11:28, Maxime Ripard wrote:
On Mon, Dec 05, 2016 at 01:52:30AM +0000, Andre Przywara wrote:
> When compiling the SPL for the Allwinner A64 in AArch64 mode, we can't
> use the more compact Thumb2 encoding, which only exists for AArch32
> code. This makes the SPL rather big, up to a point where any code
> additions or even a different compiler may easily exceed the 32KB limit
> that the Allwinner BROM imposes.
> Introduce a separate, mostly generic sun50i-a64 configuration, which
> defines the CPU_V7 symbol and thus will create a 32-bit binary using
> the memory-saving Thumb2 encoding.
"mostly generic". Where do you draw the line? How do you deal with a board that would use a different UART? A different MMC? A different memory configuration?
My impression was that it's rather pointless to provide another set of 32-bit SPL defconfigs for each board again, especially given that for the SPL's needs the boards so far seem to be very similar. For the loading part we will probably go with what the BROM already started: load more data from one of the BROM boot sources, which is fixed in the SoC and can't be really changed by a board vendor anyway. Which really leaves the DRAM setup and the UART.
So you plan on enabling all BROM boot sources as well (NAND, SPI) ?
In fact SPI works already (with little to no changes).
It's the little changes that I'm interested in to be honest :)
Well, thinking about it again I think normal SPI boot (legacy U-Boot image as the payload) requires no changes: just enable it in the defconfig. The patches I was talking about were about enabling FIT support on top of it.
The reason why it works is due to Siarhei's SPL SPI code and due to the fact it is kind of "supported" on these boards, which is not true for NAND on most boards, AFAIK. But if we get support for that, it would just work the same way, due to the boot source detection. So I think handling multiple boot sources within one SPL binary is a general sunxi SPL feature already implemented today. I definitely use the same thing for FEL, SPI and SD and eMMC.
At what offset for the SPL? For the U-Boot binary? With or without redundancy for U-Boot? For the environment? With an environment in the first place?
All those things are user configurable, and will also depend on some board features (starting with the size of the EEPROM embedded on that board). We should treat them as user configurable, and not just as "meh, don't care, works for my setup".
And I don't care about NAND, really ;-) Is anyone aware of an A64 board using this?
Well, it's one of the possible boot sources, so we have to consider it and not ignore it entirely hoping that no one will use it, ever.
But there isn't any support for it so far, is there? Even if we wanted to compile an SPL just for NAND.
The thing is once you started telling people to use something, removing that something somewhere down the road is *not* nice. So I'd really like to have it taken into account.
SPI booting was not used for 5-6 years, until someone started that trend and now we have a significant number of boards implementing it.
I can't predict the future, but so far those A64 boards look fairly similar in this respect. So I just avoid having another SPL defconfig for the BananaPi M64, for instance. I just added MMC_SUNXI_SLOT_EXTRA because this doesn't hurt on the Pine64, so less churn here.
So if you know of any board which breaks this assumption, I am happy to hear about it and see if it can be integrated.
I know of at least one board that uses UART3 on the A33, instead of UART0. The trend is very clear on the A64 and the previous SoCs, but we also had some variations, so we need to take that into account. Which brings me back to my original question, where do you draw the line? :)
I don't know, and to make this clear: I see the point in having separate configs for the SPL, but due to the 32-bit/64-bit split we probably need _two_ sets of defconfigs, which gets pretty messy very quickly. Especially given that they are very similar.
So how do we avoid this? Can we somehow share a defconfig between armv8 and armv7? At the moment "CONFIG_CPU_V7" and "CONFIG_ARM64" conflict in the same file.
I guess the easiest and most robust solution to do this would be to just generate it from the defconfig. It really feels from your patches that it's just a matter of sed -i 's/CONFIG_MACH_SUN50I/CONFIG_MACH_SUN50I_32/' on the "real" defconfig.
Yes, the difference between the defconfigs is really minimal. They look more different at the moment because we don't need Ethernet and USB in the SPL, for instance, and the SPI support is SPL only atm. But yeah, we could unify them, no question.
BUT: What exactly do you mean by "just generate it from the defconfig"? Some Makefile hack? Like detecting 32 vs 64 with the help of the ARCH environment variable or ${CROSS_COMPILE}gcc -dumpmachine output?
Plugging into a makefile target seems reasonable.
In particular, I guess that would imply:
- Generating an spl/include/generated/autoconf.h
- Since that file is included through include/linux/kconfig.h, create / copy that one over
- Change UBOOTINCLUDE to have a different one for the SPL that would automatically pick the right kconfig.h
I guess some of that can also be eased through the include order in the C flags.
Once that is done, you can easily mangle your configuration in the Makefile rule only for the SPL.
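As a rough sketch of that mangling step, assuming the only difference is the machine symbol: the helper below derives a 32-bit SPL configuration from the regular defconfig. The function name gen_spl32_defconfig and the file arguments are hypothetical, CONFIG_MACH_SUN50I_32 is the symbol name used earlier in this thread, and how the result would be wired into the SPL Makefile rule is left open.

```shell
# Hypothetical helper: write a 32-bit SPL variant of a sun50i defconfig.
#   $1 = existing (64-bit) defconfig, $2 = output file
gen_spl32_defconfig() {
    src=$1
    dst=$2
    # Anchor on the start of line and the '=' so that a symbol which is
    # already CONFIG_MACH_SUN50I_32 is not rewritten a second time.
    sed 's/^CONFIG_MACH_SUN50I=/CONFIG_MACH_SUN50I_32=/' "$src" > "$dst"
}
```

Writing to a separate output file instead of using sed -i keeps the "real" defconfig untouched, so there is nothing to keep in sync by hand.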
Maxime
participants (8): Andre Przywara, André Przywara, Chen-Yu Tsai, Maxime Ripard, Siarhei Siamashka, Simon Glass, Steve Rae, Tom Rini