[U-Boot] Enabling ARM DCache (and MMU setup) in U-Boot

Hi all, I would like to enable DCache in U-Boot, because peripheral register R/W and SDRAM R/W are extremely slow on my platform, so booting the Linux image takes an unacceptably long time. But apparently for ARM, MMU setup is needed first. I did not find an example for any of the ARM platforms included in U-Boot.
Has anybody succeeded in doing this, and where can I find some C/ASM examples of how to set up the MMU and enable DCache for ARM?
Best regards, Drasko

On 21:56 Mon 30 Mar , Drasko DRASKOVIC wrote:
Hi all, I would like to enable DCache in U-Boot, because peripheral register R/W and SDRAM R/W are extremely slow on my platform, so booting the Linux image takes an unacceptably long time. But apparently for ARM, MMU setup is needed first. I did not find an example for any of the ARM platforms included in U-Boot.
Has anybody succeeded in doing this, and where can I find some C/ASM examples of how to set up the MMU and enable DCache for ARM?
Before booting Linux you must disable the cache, which will be re-enabled by Linux.
Could you give us more details about your SoC, U-Boot version and Linux version?
Best Regards, J.

On Mon, Mar 30, 2009 at 10:31 PM, Jean-Christophe PLAGNIOL-VILLARD wrote:
Before booting Linux you must disable the cache, which will be re-enabled by Linux.
OK. Is that done in bootm.c? I can see these lines:

    /*
     * We have reached the point of no return: we are going to
     * overwrite all exception vector code, so we cannot easily
     * recover from any failures any more...
     */
    iflag = disable_interrupts();

    #ifdef CONFIG_AMIGAONEG3SE
    /*
     * We've possible left the caches enabled during
     * bios emulation, so turn them off again
     */
    icache_disable();
    invalidate_l1_instruction_cache();
    flush_data_cache();
    dcache_disable();
    #endif

Looks like only interrupts are disabled, and caches only in the case of AMIGAONE (whatever that might be). There are no other calls to cache-disable functions. I already use ICache with cmd_bootm.c as shown (thus with no call to icache_disable() here), and it works. The only problem is that copying the image from Flash is slow, so I want to speed it up by enabling DCache.
Could you give us more details about your SoC, U-Boot version and Linux version?
u-boot-1.1.6, linux version is linux-2.6.25.10, although that is not important because I have no problem with this but with slow access to Flash and SDRAM, as I said before. Core is ARM926.
As I understand it, these things have to be set up in order to enable DCache:
1. Page tables
2. The Translation Lookaside Buffer (TLB)
3. Domains and access permissions
4. Caches and write buffer
5. The CP15:c1 control register
6. The Fast Context Switch Extension
Now, that seems like a lot of work, and reading the manual is not extremely helpful, so I was wondering if somebody has already done a similar thing in U-Boot for an ARM9 platform, so I could reuse some work or examine examples to figure out how this is done.
Best regards, Drasko
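
For the page-table and permission items in the list above, a minimal sketch of a flat (VA == PA) first-level table may help. It is not taken from U-Boot: all names here (l1_section, l1_build_flat, the L1_* macros) are hypothetical, and the bit positions are those of the ARMv5 first-level "section" descriptor as used by the ARM926EJ-S. It only computes the 4096 table entries in C; writing the table base and domain register through CP15 is left out.

```c
#include <stdint.h>

/* Hypothetical names. Bit positions follow the ARMv5 first-level
 * "section" descriptor format (1 MB per entry). */
#define L1_ENTRIES       4096u                    /* 4096 x 1 MB = 4 GB */
#define L1_TYPE_SECTION  0x2u                     /* bits [1:0] = 0b10 */
#define L1_BUFFERABLE    (1u << 2)                /* B bit */
#define L1_CACHEABLE     (1u << 3)                /* C bit */
#define L1_BIT4          (1u << 4)                /* should be 1 on ARMv5 */
#define L1_DOMAIN(d)     (((uint32_t)(d) & 0xFu) << 5)
#define L1_AP_RW         (0x3u << 10)             /* AP = 0b11: read/write */

/* Build one section descriptor that maps megabyte number `mb` onto itself. */
static uint32_t l1_section(uint32_t mb, unsigned domain, int cached)
{
    uint32_t d = (mb << 20) | L1_AP_RW | L1_DOMAIN(domain)
               | L1_BIT4 | L1_TYPE_SECTION;

    if (cached)
        d |= L1_CACHEABLE | L1_BUFFERABLE;
    return d;
}

/* Fill a 4096-entry table with a flat mapping. Only the megabytes in
 * [ram_first_mb, ram_first_mb + ram_mb_count) are marked cacheable and
 * bufferable; everything else stays uncached. */
static void l1_build_flat(uint32_t *table, uint32_t ram_first_mb,
                          uint32_t ram_mb_count, unsigned domain)
{
    uint32_t mb;

    for (mb = 0; mb < L1_ENTRIES; mb++) {
        int in_ram = mb >= ram_first_mb &&
                     (mb - ram_first_mb) < ram_mb_count;

        table[mb] = l1_section(mb, domain, in_ram);
    }
}
```

The RAM range is a parameter because the SDRAM base of this custom chip is not given in the thread. Leaving the peripheral region uncached is deliberate in this sketch: device registers normally must not be cached, so DCache would be expected to help the SDRAM side of the copy loop rather than the register reads themselves.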

On 14:20 Tue 31 Mar , Drasko DRASKOVIC wrote:
On Mon, Mar 30, 2009 at 10:31 PM, Jean-Christophe PLAGNIOL-VILLARD wrote:
Before booting Linux you must disable the cache, which will be re-enabled by Linux.
OK. Is that done in bootm.c? I can see these lines:

    /*
     * We have reached the point of no return: we are going to
     * overwrite all exception vector code, so we cannot easily
     * recover from any failures any more...
     */
    iflag = disable_interrupts();

    #ifdef CONFIG_AMIGAONEG3SE
    /*
     * We've possible left the caches enabled during
     * bios emulation, so turn them off again
     */
    icache_disable();
    invalidate_l1_instruction_cache();
    flush_data_cache();
    dcache_disable();
    #endif

AMIGAONE is a PPC board.
Please take a look at the file lib_arm/bootm.c, and especially at the SoC implementation of cleanup_before_linux().
Could you give us more details about your SoC, U-Boot version and Linux version?
u-boot-1.1.6, linux version is linux-2.6.25.10, although that is not important because I have no problem with this but with slow access to Flash and SDRAM, as I said before. Core is ARM926.
U-Boot 1.1.6 is quite old (more than 2 years old); please try the current version. Is your SoC in mainline? If you can tell us which one it is, and whether it is in mainline U-Boot or Linux, we could take a look at the code.
As I understand it, these things have to be set up in order to enable DCache:
1. Page tables
2. The Translation Lookaside Buffer (TLB)
3. Domains and access permissions
4. Caches and write buffer
5. The CP15:c1 control register
6. The Fast Context Switch Extension
Now, that seems like a lot of work, and reading the manual is not extremely helpful, so I was wondering if somebody has already done a similar thing in U-Boot for an ARM9 platform, so I could reuse some work or examine examples to figure out how this is done.
In Linux, yes; in U-Boot some architectures do it, but not ARM9. IMHO your boot problem is more in Linux than in U-Boot, but until we can take a look at the code it will be hard to know.
Best Regards, J.

On Tue, Mar 31, 2009 at 2:21 PM, Jean-Christophe PLAGNIOL-VILLARD wrote:
U-Boot 1.1.6 is quite old (more than 2 years old); please try the current version.
I know, but that's the one we use... For now, everything works fine.
Is your SoC in mainline? If you can tell us which one it is, and whether it is in mainline U-Boot or Linux, we could take a look at the code.
Nope, actually it is a custom chip based on ARM9. Most of the other things are proprietary.
In Linux, yes; in U-Boot some architectures do it, but not ARM9.
In Linux for sure, but we have MMU setup in Linux, and it is beyond my knowledge. I am concentrating on an as-simple-as-can-be DCache switch-on, to speed up copying the Linux image PRIOR to kernel boot.
IMHO your boot problem is more in linux than in U-Boot
To be clear: I am experiencing a long delay in reading peripheral regs. I am reading some registers in a loop, and each access takes a long time. After that I also noticed that assigning a value to any variable takes a long time, which makes me suspect that SDRAM access is very slow. For example, calling get_timer(0) takes over 1000 cycles!
So, quite independently of Linux, I want to speed up Flash and SDRAM R/W, as well as R/W of peripheral regs in the loop, by introducing DCache. Then I saw that for ARM9, to enable DCache one must also set up and enable the MMU, and it becomes a mess, because I cannot find any examples in U-Boot and it seems pretty complex to me.
To underline, my intention of enabling DCache in U-Boot has nothing to do with Linux, because I will switch off the caches anyway before boot. I just want to use DCache during read and write operations prior to calling kernel boot.
But until we can take a look at the code it will be hard to know.
Which code would help? I do not have anything yet regarding the setup. I do not know where to start. For the time being, I am examining the CP15 coprocessor manipulations in cpu/arm926, which enable/disable the cache (even the function that you mentioned, cleanup_before_linux(), is defined here). In cpu/arm920t/cpu.c I found these lines, for example:

    #ifdef USE_920T_MMU
    /* It makes no sense to use the dcache if the MMU is not enabled */
    void dcache_enable (void)
    {
        ulong reg;

        reg = read_p15_c1 ();
        cp_delay ();
        write_p15_c1 (reg | C1_DC);
    }

I tried calling the function dcache_enable() when I enter my driver, which does extensive R/W on peripheral regs (and on SDRAM), but of course it does not work. Even the comment says so :). In the ARM manual I found:

    CP15 C bit  M bit  Behavior
    0           0      DCache disabled. All data accesses are to the external memory.
    1           0      DCache enabled, MMU disabled. The C bit is overridden by the M bit setting, which means that the DCache is effectively disabled. All data accesses are noncachable, nonbufferable, with no protection checks. All addresses are flat mapped, that is VA = MVA = PA.
    1           1      DCache enabled, MMU enabled. All data accesses are cachable or noncachable depending on the page descriptor C bit and B bit (see Table 4-4), and protection checks are performed. All addresses are remapped from VA to PA, depending on the MMU page table entry, that is the VA is translated to an MVA, and the MVA is remapped to a PA.

So I started digging, hoping that somebody had already implemented this in U-Boot, so I could copy/paste the code into my CPU setup or, better yet, call a set of C functions (similar to dcache_enable()) to set up the MMU, and in the end just call dcache_enable() to do the magic of enabling DCache.
Best regards, Drasko
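
The C bit / M bit table quoted above can be restated as a small predicate. This is only a sketch: the macro and function names are hypothetical, with the M bit taken as bit 0 and the C bit as bit 2 of the CP15 c1 control register. It works on a plain integer copy of the register value, so it says nothing about how the value is read or written on the target.

```c
#include <stdint.h>

/* Hypothetical names for two CP15 c1 control register bits. */
#define CP15_C1_M  (1u << 0)   /* MMU enable */
#define CP15_C1_C  (1u << 2)   /* DCache enable */

/* Per the quoted table, the DCache only takes effect when both the
 * C bit and the M bit are set; C alone is overridden by M = 0. */
static int dcache_effective(uint32_t c1)
{
    const uint32_t both = CP15_C1_M | CP15_C1_C;

    return (c1 & both) == both;
}

/* Control register value with MMU and DCache both requested,
 * leaving every other bit of `c1` unchanged. */
static uint32_t c1_with_mmu_and_dcache(uint32_t c1)
{
    return c1 | CP15_C1_M | CP15_C1_C;
}
```

This is why calling dcache_enable() alone, which only ORs in the C bit, has no visible effect while the MMU is off.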

On 15:36 Tue 31 Mar , Drasko DRASKOVIC wrote:
On Tue, Mar 31, 2009 at 2:21 PM, Jean-Christophe PLAGNIOL-VILLARD wrote:
U-Boot 1.1.6 is quite old (more than 2 years old); please try the current version.
I know, but that's the one we use... For now, everything works fine.
Honestly, we will not work on such old code, so please really consider rebasing it against mainline.
Is your SoC in mainline? If you can tell us which one it is, and whether it is in mainline U-Boot or Linux, we could take a look at the code.
Nope, actually it is a custom chip based on ARM9. Most of the other things are proprietary.
In Linux, yes; in U-Boot some architectures do it, but not ARM9.
In Linux for sure, but we have MMU setup in Linux, and it is beyond my knowledge. I am concentrating on an as-simple-as-can-be DCache switch-on, to speed up copying the Linux image PRIOR to kernel boot.
First, you might consider running in XIP, to avoid the copy as well.
IMHO your boot problem is more in linux than in U-Boot
To be clear: I am experiencing a long delay in reading peripheral regs. I am reading some registers in a loop, and each access takes a long time. After that I also noticed that assigning a value to any variable takes a long time, which makes me suspect that SDRAM access is very slow. For example, calling get_timer(0) takes over 1000 cycles!
So, quite independently of Linux, I want to speed up Flash and SDRAM R/W, as well as R/W of peripheral regs in the loop, by introducing DCache. Then I saw that for ARM9, to enable DCache one must also set up and enable the MMU, and it becomes a mess, because I cannot find any examples in U-Boot and it seems pretty complex to me.
To underline, my intention of enabling DCache in U-Boot has nothing to do with Linux, because I will switch off the caches anyway before boot. I just want to use DCache during read and write operations prior to calling kernel boot.
Now it's clearer what you are trying to do.
Yes, on ARM, if you want to use the dcache you will have to set up the MMU first.
You can also have overhead in the memory copy because it is coded in C and not optimized in asm.
But until we can take a look at the code it will be hard to know.
Which code would help? I do not have anything yet regarding the setup. I do not know where to start. For the time being, I am examining the CP15 coprocessor manipulations in cpu/arm926, which enable/disable the cache (even the function that you mentioned, cleanup_before_linux(), is defined here). In cpu/arm920t/cpu.c I found these lines, for example:
As mentioned previously, if you want to use the dcache you must first correctly set up the MMU and the TLB.
So I started digging, hoping that somebody had already implemented this in U-Boot, so I could copy/paste the code into my CPU setup or, better yet, call a set of C functions (similar to dcache_enable()) to set up the MMU, and in the end just call dcache_enable() to do the magic of enabling DCache.
If you want to use the dcache you must first correctly set up the MMU and the TLB.
You may want to take a look at arm1176/start.S.
Best Regards, J.

On Tue, Mar 31, 2009 at 4:09 PM, Jean-Christophe PLAGNIOL-VILLARD wrote:
Honestly, we will not work on such old code, so please really consider rebasing it against mainline.
I downloaded the latest version of U-Boot and will do all my work regarding DCache and MMU setup there.
First, you might consider running in XIP, to avoid the copy as well.
I would adore to do this. Unfortunately the design does not allow it :'(. The Flash is serially connected, so I have to drag the whole god damn image byte by byte into SDRAM before the CRC check and boot (I had to change cmd_bootm.c for this to work). This is the loop I mentioned before, and I would like to speed up this long process by introducing DCache.
now it's more clear about what you try to do
I am sorry for not being clear before, but even I myself do not quite understand where to start...
Yes, on ARM, if you want to use the dcache you will have to set up the MMU first.
That's what I was afraid of.
You may want to take a look at arm1176/start.S.
Thanks for the hint! I will examine this file (now that things are beginning to be clearer to me, maybe I will understand something).
Thank you very much for your time and help.
BR, Drasko

On Tue, Mar 31, 2009 at 4:09 PM, Jean-Christophe PLAGNIOL-VILLARD wrote:
If you want to use the dcache you must first correctly set up the MMU and the TLB.
I set up:
1) a pagetable in SDRAM (one master pagetable, mapping all 4096 pages of the address space from virtual to the same physical addresses)
2) I use client domain 3
3) all permissions set to RW
4) the pagetable attached (info about the master table address put into the CP15 reg)
When I try to switch the MMU on after all of this (in the CP15 reg), U-Boot hangs.
I created one command to do all this init and, at the end, write into the CP15 reg to switch on the MMU. Is this approach OK? For some reason, U-Boot hangs there, right after switching the MMU on.
I am sure that MMU works, because I can boot Linux on the same board, and Linux works.
With JTAG I can see that the pagetable is written to SDRAM at the given address, so I suspect the permissions, but each pagetable entry has the domain and access permissions I described (domain 3 client, RW).
Do you have any idea what can be wrong and where to look?
Best regards, Drasko
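
The thread does not show what actually caused the hang, so the following is only a sketch of a few conditions the ARMv5 MMU requires for a setup like the one described (flat mapping, 1 MB sections, a single client domain). The function and macro names are hypothetical. It checks values that could be read back over JTAG: the table base, the domain access control register, and the descriptor covering a given address, for example the address of the code that turns the MMU on.

```c
#include <stdint.h>

/* Hypothetical result flags; 0 means no problem was detected. */
#define CHK_TTB_UNALIGNED   (1u << 0)  /* table base not 16 KB aligned */
#define CHK_DOMAIN_FAULTS   (1u << 1)  /* DACR field is not client/manager */
#define CHK_NOT_SECTION     (1u << 2)  /* descriptor bits [1:0] != 0b10 */
#define CHK_WRONG_DOMAIN    (1u << 3)  /* descriptor domain != expected */
#define CHK_NOT_FLAT        (1u << 4)  /* section base != VA megabyte */

/* ttb:    value intended for the translation table base register
 * dacr:   value intended for the domain access control register
 * desc:   first-level descriptor covering virtual address `va`
 * domain: domain number the setup is meant to use (0..15) */
static uint32_t mmu_setup_check(uint32_t ttb, uint32_t dacr,
                                uint32_t desc, uint32_t va,
                                unsigned domain)
{
    uint32_t problems = 0;
    uint32_t access = (dacr >> (2u * domain)) & 0x3u;

    if (ttb & 0x3FFFu)                  /* 4096 entries x 4 bytes = 16 KB */
        problems |= CHK_TTB_UNALIGNED;
    if (access != 0x1u && access != 0x3u)   /* 01 client, 11 manager */
        problems |= CHK_DOMAIN_FAULTS;
    if ((desc & 0x3u) != 0x2u)
        problems |= CHK_NOT_SECTION;
    if (((desc >> 5) & 0xFu) != domain)
        problems |= CHK_WRONG_DOMAIN;
    if ((desc >> 20) != (va >> 20))
        problems |= CHK_NOT_FLAT;
    return problems;
}
```

For domain 3 as client, the relevant DACR field is bits [7:6] with value 01, that is 0x40. Running the check with `va` set to the program counter at the point where the M bit is written would confirm that the instruction fetched right after the switch is still mapped to the same physical address.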

On Tue, Mar 31, 2009 at 2:21 PM, Jean-Christophe PLAGNIOL-VILLARD wrote:
But until we can take a look at the code it will be hard to know.
And I just found:

    ./examples/test_burst_lib.S: * void mmu_init(void);
    ./examples/test_burst_lib.S: .global mmu_init
    ./examples/test_burst_lib.S:mmu_init:
    ./examples/test_burst.c: mmu_init();

I think I need something like this (I have not inspected the code deeply), but for ARM (the examples here are for PPC).
Similar to this example, I would first enable the MMU (a call to mmu_init()), then the caches (calls to dcache_enable() and icache_enable()), then do some reads and writes to memory. Voila.
I hope that all R/W to memory will then be faster thanks to DCache.
If anybody knows where I can find a similar example (or how I could modify this one for ARM), help will be more than welcome.
Best regards, Drasko
Participants (2): Drasko DRASKOVIC, Jean-Christophe PLAGNIOL-VILLARD