
Hi Daniel,
On 12.08.2018 00:52, Daniel Schwierzeck wrote:
On 09.08.2018 16:22, Stefan Roese wrote:
diff --git a/arch/mips/mach-mt7620/cpu.c b/arch/mips/mach-mt7620/cpu.c
new file mode 100644
index 0000000000..0b22956499
--- /dev/null
+++ b/arch/mips/mach-mt7620/cpu.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2018 Stefan Roese <sr@denx.de>
+ */
+
+#include <common.h>
+#include <dm.h>
+#include <ram.h>
+#include <asm/io.h>
+#include <linux/io.h>
+#include <linux/sizes.h>
+#include "mt76xx.h"
+
+#define STR_LEN	6
+
+#ifdef CONFIG_BOOT_ROM
+int mach_cpu_init(void)
+{
+	void (*ptr)(void);
+
+	/*
+	 * DDR calibration routine needs to be called very early. This
+	 * function also configures the clock to run at full speed.
+	 */
+	ptr = (void *)CKSEG0ADDR(ddr_calibrate);
+	(*ptr)();
What is the purpose of forcing the function symbol to KSEG0?
It's copied from the original MediaTek code. I just tested it without forcing the execution into KSEG0, and the DDR calibration is then extremely slow, taking a few minutes to complete.
I have to admit that I am not 100% sure whether the caches are configured correctly / optimally for this SoC. Perhaps you (or someone else) have some improvements here.
I guess the BootROM does some XiP magic with the SPI flash controller and runs in the uncached KSEG1 segment. That would explain why the pre-relocation code runs so slowly. But this shift from KSEG1 to KSEG0 could be done generically in start.S after the cache initialization is complete. I will have a look at it.
That would be great. Please let me know if you need some help with testing etc.
Could you try the branch mips_optimize_cache_init from git://git.denx.de/u-boot-mips.git? You have to change your text base from 0xbc000000 to 0x9c000000 to execute all code prior to relocation from the cached KSEG0 segment. The code should now run fast without that function pointer magic.
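To illustrate why only the text base has to change: both addresses alias the same physical flash location, and only the segment (and with it the cacheability) differs. The following is a minimal host-side sketch, not code from the patch series. The segment bases and the physical mask are the architectural MIPS32 values; the helper names (kseg_to_phys, phys_to_kseg0, in_kseg0, in_kseg1) are made up for this example and are not U-Boot API.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIPS32 unmapped kernel segments (architectural values). */
#define KSEG0_BASE	0x80000000u	/* unmapped, cached */
#define KSEG1_BASE	0xa0000000u	/* unmapped, uncached */
#define KSEG_PHYS_MASK	0x1fffffffu	/* low 29 bits = physical address */
#define KSEG_SEG_MASK	0xe0000000u	/* top 3 bits select the segment */

/* Hypothetical helper: strip the segment bits to get the physical address. */
static inline uint32_t kseg_to_phys(uint32_t vaddr)
{
	return vaddr & KSEG_PHYS_MASK;
}

/* Hypothetical helper: same idea as CKSEG0ADDR(), the cached alias. */
static inline uint32_t phys_to_kseg0(uint32_t paddr)
{
	return (paddr & KSEG_PHYS_MASK) | KSEG0_BASE;
}

/* Hypothetical helpers: which segment does a virtual address live in? */
static inline bool in_kseg0(uint32_t vaddr)
{
	return (vaddr & KSEG_SEG_MASK) == KSEG0_BASE;
}

static inline bool in_kseg1(uint32_t vaddr)
{
	return (vaddr & KSEG_SEG_MASK) == KSEG1_BASE;
}
```

With these helpers, kseg_to_phys() yields 0x1c000000 for both 0xbc000000 and 0x9c000000, in_kseg1() is true for the old text base and in_kseg0() for the new one. So the same code is fetched from the same flash location, just through the cached window.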
For now the patch series is only tested in QEMU and needs some more cleanup. I'll try to do some testing on real hardware in the next few days.
I'm back from a short trip and tested these 2 patches in my current internal branch. It works great, thanks! The speedup is impressive.
How would you like to proceed? Should I resend my MT7688 SoC port with the function pointer magic removed? Or will you pull this version first and your cache improvement later? I can always remove the ddr_calibrate() pointer magic (and change the TEXT_BASE) at a later time.
Thanks, Stefan