[U-Boot] [RFC PATCH 0/3] spl: Add D-cache support

This series tries to add D-cache support in spl in order to reduce boot time either in 2stage boot or Falcon Boot.
Lokesh Vutla (3):
  arch: arm: omap: Declare size of ddr very early
  spl: reorder the assignment of board info to global data
  spl: Add support for enabling dcache

 arch/arm/include/asm/cache.h        |  1 +
 arch/arm/lib/cache-cp15.c           | 46 +++++++++++++++++++++++++++++--------
 arch/arm/mach-omap2/am33xx/board.c  |  4 ++++
 arch/arm/mach-omap2/hwinit-common.c |  1 +
 arch/arm/mach-omap2/omap-cache.c    | 15 ++++++++++++
 common/spl/spl.c                    | 42 ++++++++++++++++++++++++++++++++-
 6 files changed, 98 insertions(+), 11 deletions(-)

Declare the size of ddr very early in spl, so that this can be used to enable cache.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
---
 arch/arm/mach-omap2/am33xx/board.c  | 4 ++++
 arch/arm/mach-omap2/hwinit-common.c | 1 +
 2 files changed, 5 insertions(+)

diff --git a/arch/arm/mach-omap2/am33xx/board.c b/arch/arm/mach-omap2/am33xx/board.c
index 5ebeac0..7f445ae 100644
--- a/arch/arm/mach-omap2/am33xx/board.c
+++ b/arch/arm/mach-omap2/am33xx/board.c
@@ -303,6 +303,10 @@ void board_init_f(ulong dummy)
 	early_system_init();
 	board_early_init_f();
 	sdram_init();
+	/* dram_init must store complete ramsize in gd->ram_size */
+	gd->ram_size = get_ram_size(
+			(void *)CONFIG_SYS_SDRAM_BASE,
+			CONFIG_MAX_RAM_BANK_SIZE);
 }
 #endif
 
diff --git a/arch/arm/mach-omap2/hwinit-common.c b/arch/arm/mach-omap2/hwinit-common.c
index f317293..cac3274 100644
--- a/arch/arm/mach-omap2/hwinit-common.c
+++ b/arch/arm/mach-omap2/hwinit-common.c
@@ -171,6 +171,7 @@ void board_init_f(ulong dummy)
 #endif
 	/* For regular u-boot sdram_init() is called from dram_init() */
 	sdram_init();
+	gd->ram_size = omap_sdram_size();
 }
 #endif

On Mon, Nov 28, 2016 at 03:04:43PM +0530, Lokesh Vutla wrote:
> Declare the size of ddr very early in spl, so that this can be used to enable cache.
>
> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>

Reviewed-by: Tom Rini <trini@konsulko.com>

Move the assignment of board info to global data a bit early which is safe, so that ram details can be used to enable caches.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
---
 common/spl/spl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/common/spl/spl.c b/common/spl/spl.c
index bdb165a..990b700 100644
--- a/common/spl/spl.c
+++ b/common/spl/spl.c
@@ -394,6 +394,7 @@ void board_init_r(gd_t *dummy1, ulong dummy2)
 	int i;
 
 	debug(">>spl:board_init_r()\n");
+	gd->bd = &bdata;
 
 #if defined(CONFIG_SYS_SPL_MALLOC_START)
 	mem_malloc_init(CONFIG_SYS_SPL_MALLOC_START,
@@ -461,7 +462,6 @@ void board_init_r(gd_t *dummy1, ulong dummy2)
  */
 void preloader_console_init(void)
 {
-	gd->bd = &bdata;
 	gd->baudrate = CONFIG_BAUDRATE;
 
 	serial_init();		/* serial communications setup */

On Mon, Nov 28, 2016 at 03:04:44PM +0530, Lokesh Vutla wrote:
> Move the assignment of board info to global data a bit early which is safe, so that ram details can be used to enable caches.
>
> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>

Reviewed-by: Tom Rini <trini@konsulko.com>

On 28 November 2016 at 02:34, Lokesh Vutla <lokeshvutla@ti.com> wrote:
> Move the assignment of board info to global data a bit early which is safe, so that ram details can be used to enable caches.
>
> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
>
>  common/spl/spl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Simon Glass <sjg@chromium.org>

Add support for enabling d-cache in SPL. The sequence in SPL tries to replicate the sequence done in U-Boot except that MMU entries were added for SRAM.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
---
 arch/arm/include/asm/cache.h     |  1 +
 arch/arm/lib/cache-cp15.c        | 46 +++++++++++++++++++++++++++++++---------
 arch/arm/mach-omap2/omap-cache.c | 15 +++++++++++++
 common/spl/spl.c                 | 40 ++++++++++++++++++++++++++++++++++
 4 files changed, 92 insertions(+), 10 deletions(-)

diff --git a/arch/arm/include/asm/cache.h b/arch/arm/include/asm/cache.h
index 5400cbe..20f6aca 100644
--- a/arch/arm/include/asm/cache.h
+++ b/arch/arm/include/asm/cache.h
@@ -39,6 +39,7 @@ void arm_init_before_mmu(void);
 void arm_init_domains(void);
 void cpu_cache_initialization(void);
 void dram_bank_mmu_setup(int bank);
+void sram_bank_mmu_setup(phys_addr_t start, phys_addr_t size);
 
 #endif
 
diff --git a/arch/arm/lib/cache-cp15.c b/arch/arm/lib/cache-cp15.c
index e9bbcf5..76f95d6 100644
--- a/arch/arm/lib/cache-cp15.c
+++ b/arch/arm/lib/cache-cp15.c
@@ -94,16 +94,8 @@ void mmu_set_region_dcache_behaviour(phys_addr_t start, size_t size,
 	mmu_page_table_flush(startpt, stoppt);
 }
 
-__weak void dram_bank_mmu_setup(int bank)
+static void set_section_caches(int i)
 {
-	bd_t *bd = gd->bd;
-	int i;
-
-	debug("%s: bank: %d\n", __func__, bank);
-	for (i = bd->bi_dram[bank].start >> MMU_SECTION_SHIFT;
-	     i < (bd->bi_dram[bank].start >> MMU_SECTION_SHIFT) +
-		 (bd->bi_dram[bank].size >> MMU_SECTION_SHIFT);
-	     i++) {
 #if defined(CONFIG_SYS_ARM_CACHE_WRITETHROUGH)
 		set_section_dcache(i, DCACHE_WRITETHROUGH);
 #elif defined(CONFIG_SYS_ARM_CACHE_WRITEALLOC)
@@ -111,9 +103,33 @@ __weak void dram_bank_mmu_setup(int bank)
 #else
 		set_section_dcache(i, DCACHE_WRITEBACK);
 #endif
-	}
 }
 
+__weak void dram_bank_mmu_setup(int bank)
+{
+	bd_t *bd = gd->bd;
+	int i;
+
+	debug("%s: bank: %d\n", __func__, bank);
+	for (i = bd->bi_dram[bank].start >> MMU_SECTION_SHIFT;
+	     i < (bd->bi_dram[bank].start >> MMU_SECTION_SHIFT) +
+		 (bd->bi_dram[bank].size >> MMU_SECTION_SHIFT); i++)
+		set_section_caches(i);
+}
+
+#if defined(CONFIG_SPL_BUILD) && (defined(CONFIG_SPL_MAX_SIZE) || \
+	defined(CONFIG_SPL_MAX_FOOTPRINT))
+__weak void sram_bank_mmu_setup(phys_addr_t start, phys_addr_t size)
+{
+	int i;
+
+	for (i = start >> MMU_SECTION_SHIFT;
+	     i < (start >> MMU_SECTION_SHIFT) + (size >> MMU_SECTION_SHIFT);
+	     i++)
+		set_section_caches(i);
+}
+#endif
+
 /* to activate the MMU we need to set up virtual memory: use 1M areas */
 static inline void mmu_setup(void)
 {
@@ -129,6 +145,16 @@ static inline void mmu_setup(void)
 		dram_bank_mmu_setup(i);
 	}
 
+#if defined(CONFIG_SPL_BUILD)
+#if defined(CONFIG_SPL_MAX_SIZE)
+	sram_bank_mmu_setup(CONFIG_SPL_TEXT_BASE,
+			    ALIGN(CONFIG_SPL_MAX_SIZE, MMU_SECTION_SIZE));
+#elif defined(CONFIG_SPL_MAX_FOOTPRINT)
+	sram_bank_mmu_setup(CONFIG_SPL_TEXT_BASE,
+			    ALIGN(CONFIG_SPL_MAX_FOOTPRINT, MMU_SECTION_SIZE));
+#endif
+#endif
+
 #ifdef CONFIG_ARMV7_LPAE
 	/* Set up 4 PTE entries pointing to our 4 1GB page tables */
 	for (i = 0; i < 4; i++) {
diff --git a/arch/arm/mach-omap2/omap-cache.c b/arch/arm/mach-omap2/omap-cache.c
index b37163a..6019e0c 100644
--- a/arch/arm/mach-omap2/omap-cache.c
+++ b/arch/arm/mach-omap2/omap-cache.c
@@ -62,6 +62,21 @@ void dram_bank_mmu_setup(int bank)
 		set_section_dcache(i, ARMV7_DCACHE_POLICY);
 }
 
+#ifdef CONFIG_SPL_BUILD
+void sram_bank_mmu_setup(phys_addr_t start, phys_addr_t size)
+{
+	int i;
+	phys_addr_t end;
+
+	start = start >> MMU_SECTION_SHIFT;
+	size = size >> MMU_SECTION_SHIFT;
+	end = start + size;
+
+	for (i = start; i <= end; i++)
+		set_section_dcache(i, ARMV7_DCACHE_POLICY);
+}
+#endif
+
 void arm_init_domains(void)
 {
 	u32 reg;
diff --git a/common/spl/spl.c b/common/spl/spl.c
index 990b700..cdd2917 100644
--- a/common/spl/spl.c
+++ b/common/spl/spl.c
@@ -381,6 +381,34 @@ static int spl_load_image(struct spl_image_info *spl_image, u32 boot_device)
 	return -ENODEV;
 }
 
+#if !(defined(CONFIG_SYS_ICACHE_OFF) && defined(CONFIG_SYS_DCACHE_OFF)) && \
+	defined(CONFIG_ARM)
+static int reserve_mmu(void)
+{
+	phys_addr_t ram_top = 0;
+	/* reserve TLB table */
+	gd->arch.tlb_size = PGTABLE_SIZE;
+
+#ifdef CONFIG_SYS_SDRAM_BASE
+	ram_top = CONFIG_SYS_SDRAM_BASE;
+#endif
+	ram_top += get_effective_memsize();
+	gd->arch.tlb_addr = ram_top - gd->arch.tlb_size;
+	debug("TLB table from %08lx to %08lx\n", gd->arch.tlb_addr,
+	      gd->arch.tlb_addr + gd->arch.tlb_size);
+	return 0;
+}
+
+__weak void dram_init_banksize(void)
+{
+#if defined(CONFIG_NR_DRAM_BANKS) && defined(CONFIG_SYS_SDRAM_BASE)
+	gd->bd->bi_dram[0].start = CONFIG_SYS_SDRAM_BASE;
+	gd->bd->bi_dram[0].size = get_effective_memsize();
+#endif
+}
+
+#endif
+
 void board_init_r(gd_t *dummy1, ulong dummy2)
 {
 	u32 spl_boot_list[] = {
@@ -396,6 +424,13 @@ void board_init_r(gd_t *dummy1, ulong dummy2)
 	debug(">>spl:board_init_r()\n");
 	gd->bd = &bdata;
 
+#if !(defined(CONFIG_SYS_ICACHE_OFF) && defined(CONFIG_SYS_DCACHE_OFF)) && \
+	defined(CONFIG_ARM)
+	dram_init_banksize();
+	reserve_mmu();
+	enable_caches();
+#endif
+
 #if defined(CONFIG_SYS_SPL_MALLOC_START)
 	mem_malloc_init(CONFIG_SYS_SPL_MALLOC_START,
 			CONFIG_SYS_SPL_MALLOC_SIZE);
@@ -432,6 +467,11 @@ void board_init_r(gd_t *dummy1, ulong dummy2)
 		hang();
 	}
 
+#if !(defined(CONFIG_SYS_ICACHE_OFF) && defined(CONFIG_SYS_DCACHE_OFF)) && \
+	defined(CONFIG_ARM)
+	cleanup_before_linux();
+#endif
+
 	switch (spl_image.os) {
 	case IH_OS_U_BOOT:
 		debug("Jumping to U-Boot\n");
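
A side illustration of the section arithmetic in the patch above: the generic sram_bank_mmu_setup() loops with an exclusive bound (i < first + count), while the OMAP override loops with an inclusive one (i <= end). The standalone sketch below is not part of the patch. It computes the last section index each loop form would reach, next to the last section a byte range actually touches. It assumes 1 MiB sections (a shift of 20, matching the "use 1M areas" comment); the helper names and the addresses mentioned afterwards are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 1 MiB MMU sections, as in the "use 1M areas" comment. */
#define SECTION_SHIFT 20u
#define SECTION_SIZE  (1u << SECTION_SHIFT)

/* Last index visited by: for (i = first; i < first + count; i++) */
uint32_t last_section_exclusive(uint32_t start, uint32_t size)
{
	assert(size >= SECTION_SIZE);	/* caller aligns size up to a section */
	return (start >> SECTION_SHIFT) + (size >> SECTION_SHIFT) - 1u;
}

/* Last index visited by: for (i = first; i <= first + count; i++) */
uint32_t last_section_inclusive(uint32_t start, uint32_t size)
{
	return (start >> SECTION_SHIFT) + (size >> SECTION_SHIFT);
}

/* Last section actually touched by the byte range [start, start + size) */
uint32_t last_section_needed(uint32_t start, uint32_t size)
{
	assert(size > 0u);
	return (start + size - 1u) >> SECTION_SHIFT;
}
```

For a hypothetical unaligned start of 0x402F0400 with a 1 MiB size, the exclusive form stops at section 0x402 while the range itself reaches into section 0x403, which the inclusive form does cover. For a section-aligned start such as 0x40300000, the exclusive form is exact and the inclusive form maps one extra section.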

On Mon, Nov 28, 2016 at 03:04:45PM +0530, Lokesh Vutla wrote:
> Add support for enabling d-cache in SPL. The sequence in SPL tries to replicate the sequence done in U-Boot except that MMU entries were added for SRAM.
>
> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>

Reviewed-by: Tom Rini <trini@konsulko.com>

On 28 November 2016 at 02:34, Lokesh Vutla <lokeshvutla@ti.com> wrote:
> Add support for enabling d-cache in SPL. The sequence in SPL tries to replicate the sequence done in U-Boot except that MMU entries were added for SRAM.
>
> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
>
>  arch/arm/include/asm/cache.h     |  1 +
>  arch/arm/lib/cache-cp15.c        | 46 +++++++++++++++++++++++++++++++---------
>  arch/arm/mach-omap2/omap-cache.c | 15 +++++++++++++
>  common/spl/spl.c                 | 40 ++++++++++++++++++++++++++++++++++
>  4 files changed, 92 insertions(+), 10 deletions(-)

Reviewed-by: Simon Glass <sjg@chromium.org>

On Mon, Nov 28, 2016 at 03:04:42PM +0530, Lokesh Vutla wrote:
> This series tries to add D-cache support in spl in order to reduce boot time either in 2stage boot or Falcon Boot.

I assume you've measured and confirmed that there is a speed increase? I ask since I'd tried this ages ago but..

> Lokesh Vutla (3):
>   arch: arm: omap: Declare size of ddr very early
>   spl: reorder the assignment of board info to global data

... I didn't have changes like this, which is perhaps why it ended up not working right. Thanks!

On Monday 28 November 2016 10:10 PM, Tom Rini wrote:
> On Mon, Nov 28, 2016 at 03:04:42PM +0530, Lokesh Vutla wrote:
>> This series tries to add D-cache support in spl in order to reduce boot time either in 2stage boot or Falcon Boot.
>
> I assume you've measured and confirmed that there is a speed increase? I ask since I'd tried this ages ago but..

Yes. I have verified it on all TI platforms. On DRA7-evm with MMCSD boot:
  without this series SPL took 607 ms to complete
  with this series SPL took 318 ms to complete.

>> Lokesh Vutla (3):
>>   arch: arm: omap: Declare size of ddr very early
>>   spl: reorder the assignment of board info to global data
>
> ... I didn't have changes like this, which is perhaps why it ended up not working right. Thanks!

I am mainly worried about platforms other than TI (I just want to be sure that this series did not break other platforms).

Thanks and regards,
Lokesh
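
For reference, the two DRA7-evm figures quoted above can be turned into a saving and a ratio as follows. This is a small standalone sketch, not something from the series; the helper names are made up for illustration, and it uses integer math rounded to the nearest unit.

```c
#include <assert.h>

/* Milliseconds saved between two measurements. */
unsigned int saved_ms(unsigned int before_ms, unsigned int after_ms)
{
	assert(before_ms >= after_ms);
	return before_ms - after_ms;
}

/* Reduction in tenths of a percent (476 means 47.6%), rounded to nearest. */
unsigned int reduction_permille(unsigned int before_ms, unsigned int after_ms)
{
	assert(before_ms > 0u && before_ms >= after_ms);
	return ((before_ms - after_ms) * 1000u + before_ms / 2u) / before_ms;
}

/* Speedup in hundredths (191 means 1.91x), rounded to nearest. */
unsigned int speedup_x100(unsigned int before_ms, unsigned int after_ms)
{
	assert(after_ms > 0u);
	return (before_ms * 100u + after_ms / 2u) / after_ms;
}
```

With 607 ms before and 318 ms after, this gives 289 ms saved, a reduction of about 47.6%, or roughly a 1.91x speedup for the SPL stage on that board.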

Hi Tom,

On Monday 28 November 2016 03:04 PM, Lokesh Vutla wrote:
> This series tries to add D-cache support in spl in order to reduce boot time either in 2stage boot or Falcon Boot.

I hope there are no further comments on this series. Do you want me to repost this series, or is it good to go?

Thanks and regards,
Lokesh

> Lokesh Vutla (3):
>   arch: arm: omap: Declare size of ddr very early
>   spl: reorder the assignment of board info to global data
>   spl: Add support for enabling dcache
>
>  arch/arm/include/asm/cache.h        |  1 +
>  arch/arm/lib/cache-cp15.c           | 46 +++++++++++++++++++++++++++++--------
>  arch/arm/mach-omap2/am33xx/board.c  |  4 ++++
>  arch/arm/mach-omap2/hwinit-common.c |  1 +
>  arch/arm/mach-omap2/omap-cache.c    | 15 ++++++++++++
>  common/spl/spl.c                    | 42 ++++++++++++++++++++++++++++++++-
>  6 files changed, 98 insertions(+), 11 deletions(-)

On Mon, Dec 12, 2016 at 03:22:50PM +0530, Lokesh Vutla wrote:
> Hi Tom,
>
> On Monday 28 November 2016 03:04 PM, Lokesh Vutla wrote:
>> This series tries to add D-cache support in spl in order to reduce boot time either in 2stage boot or Falcon Boot.
>
> I hope there are no further comments on this series. Do you want me to repost this series, or is it good to go?

I've thought about this a bit, and yes, it was posted before the end of the merge window, but given the wide impact of some of these changes and the general time taken off around now, I think this is more safely pushed in right after the next release to give it the most time to shake out. I will however find some time to boot test this at least on RPi3 (32bit), allwinner and imx6 beforehand. Thanks!

On Mon, Dec 12, 2016 at 03:22:50PM +0530, Lokesh Vutla wrote:
> Hi Tom,
>
> On Monday 28 November 2016 03:04 PM, Lokesh Vutla wrote:
>> This series tries to add D-cache support in spl in order to reduce boot time either in 2stage boot or Falcon Boot.
>
> I hope there are no further comments on this series. Do you want me to repost this series, or is it good to go?

This breaks building on a few platforms:
https://travis-ci.org/trini/u-boot/jobs/183441104
https://travis-ci.org/trini/u-boot/jobs/183441159
https://travis-ci.org/trini/u-boot/jobs/183441171

And breaks booting on omap4_panda:
20:02:33 U-Boot SPL 2017.01-rc1-00068-g8ae0906 (Dec 12 2016 - 19:11:20)
20:02:33 OMAP4460-GP ES1.1
20:02:33 Trying to boot from MMC1SPL: Please implement spl_start_uboot() for your board
20:02:33 SPL: Direct Linux boot not active!
20:02:33 reading u-boot.img
20:02:33 reading u-boot.img
[ hangs here ]

On Tuesday 13 December 2016 05:44 PM, Tom Rini wrote:
> On Mon, Dec 12, 2016 at 03:22:50PM +0530, Lokesh Vutla wrote:
>> Hi Tom,
>>
>> On Monday 28 November 2016 03:04 PM, Lokesh Vutla wrote:
>>> This series tries to add D-cache support in spl in order to reduce boot time either in 2stage boot or Falcon Boot.
>>
>> I hope there are no further comments on this series. Do you want me to repost this series, or is it good to go?
>
> This breaks building on a few platforms:

I was kind of expecting that :)

> https://travis-ci.org/trini/u-boot/jobs/183441104
> https://travis-ci.org/trini/u-boot/jobs/183441159

These errors are mostly saying that the SPL image is too big.

Looks like I did not care about armv8 platforms. I'll take a look at this.

> And breaks booting on omap4_panda:
> 20:02:33 U-Boot SPL 2017.01-rc1-00068-g8ae0906 (Dec 12 2016 - 19:11:20)
> 20:02:33 OMAP4460-GP ES1.1
> 20:02:33 Trying to boot from MMC1SPL: Please implement spl_start_uboot() for your board
> 20:02:33 SPL: Direct Linux boot not active!
> 20:02:33 reading u-boot.img
> 20:02:33 reading u-boot.img
> [ hangs here ]

Hmm... I'll dig more into this.

Thanks and regards,
Lokesh

On Tue, Dec 13, 2016 at 08:48:55PM +0530, Lokesh Vutla wrote:
> On Tuesday 13 December 2016 05:44 PM, Tom Rini wrote:
>> On Mon, Dec 12, 2016 at 03:22:50PM +0530, Lokesh Vutla wrote:
>>> Hi Tom,
>>>
>>> On Monday 28 November 2016 03:04 PM, Lokesh Vutla wrote:
>>>> This series tries to add D-cache support in spl in order to reduce boot time either in 2stage boot or Falcon Boot.
>>>
>>> I hope there are no further comments on this series. Do you want me to repost this series, or is it good to go?
>>
>> This breaks building on a few platforms:
>
> I was kind of expecting that :)
>
>> https://travis-ci.org/trini/u-boot/jobs/183441104
>> https://travis-ci.org/trini/u-boot/jobs/183441159
>
> These errors are mostly saying that the SPL image is too big.

Right. But they need to build still.

> Looks like I did not care about armv8 platforms. I'll take a look at this.
>
>> And breaks booting on omap4_panda:
>> 20:02:33 U-Boot SPL 2017.01-rc1-00068-g8ae0906 (Dec 12 2016 - 19:11:20)
>> 20:02:33 OMAP4460-GP ES1.1
>> 20:02:33 Trying to boot from MMC1SPL: Please implement spl_start_uboot() for your board
>> 20:02:33 SPL: Direct Linux boot not active!
>> 20:02:33 reading u-boot.img
>> 20:02:33 reading u-boot.img
>> [ hangs here ]
>
> Hmm... I'll dig more into this.

Thanks!