[U-Boot] [PATCH v4] ARM926ejs: Add routines to invalidate D-Cache

After DMA operation, we need to maintain D-Cache coherency. So that the DCache must be invalidated (hence CPU will fetch data written by DMA controller from RAM).
Tested on AT91SAM9261EK with Peripheral DMA controller.
Signed-off-by: Hong Xu hong.xu@atmel.com Tested-by: Elen Song elen.song@atmel.com CC: Albert Aribaud albert.u.boot@aribaud.net CC: Aneesh V aneesh@ti.com CC: Marek Vasut marek.vasut@gmail.com CC: Reinhard Meyer u-boot@emk-elektronik.de CC: Heiko Schocher hs@denx.de --- V2: Per Albert's suggestion, add invalidate_dcache_range
V3: invalidate_dcache_range emits warning when detecting unaligned buffer
invalidate_dcache_range won't clean any adjacent cache line when detecting unaligned buffer and only round up/down the buffer address
v4: invalidate_dcache_range will emit clearer warning message
Per Albert's suggestion, if not alighed to cache line size, round up start address, round down stop addres
Per Marek Vasut's suggestion, use __func__ stated in C99
arch/arm/lib/cache.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 58 insertions(+), 0 deletions(-)
diff --git a/arch/arm/lib/cache.c b/arch/arm/lib/cache.c index 92b61a2..4c8e160 100644 --- a/arch/arm/lib/cache.c +++ b/arch/arm/lib/cache.c @@ -53,3 +53,61 @@ void __flush_dcache_all(void) } void flush_dcache_all(void) __attribute__((weak, alias("__flush_dcache_all"))); + +/* + * The buffer range to be invalidated is [start, stop) + */ +void __invalidate_dcache_range(unsigned long start, unsigned long stop) +{ + int cache_line_len; + unsigned long mva; + +#ifdef CONFIG_ARM926EJS +#ifdef CONFIG_SYS_CACHELINE_SIZE + cache_line_len = CONFIG_SYS_CACHELINE_SIZE; +#else + /* + * ARM926EJ-S Technical Reference Manual, Chap 2.3.1 Table 2-9 + * only b'10, aka. 32 bytes cache line len is valid + */ + cache_line_len = 32; +#endif + mva = start; + if ((mva & (cache_line_len - 1)) != 0) { + printf("WARNING: %s - start address 0x%08x not aligned to" + "cache line size(%d bytes)\n", __func__, start, + cache_line_len); + /* Round up starting address */ + mva = (mva | (cache_line_len - 1)) + 1; + } + if ((stop & (cache_line_len - 1)) != 0) { + printf("WARNING: %s - stop address 0x%08x not aligned to" + "cache line size(%d bytes)\n", __func__, stop, + cache_line_len); + /* Round down ending address */ + stop &= ~(cache_line_len - 1); + } + + while (mva < stop) { + asm("mcr p15, 0, %0, c7, c6, 1" : : "r"(mva)); + mva += cache_line_len; + } + + /* Drain the WB */ + asm("mcr p15, 0, %0, c7, c10, 4" : : "r" (0)); +#endif + + return; +} +void invalidate_dcache_range(unsigned long start, unsigned long stop) + __attribute__((weak, alias("__invalidate_dcache_range"))); + +void __invalidate_dcache_all(void) +{ +#ifdef CONFIG_ARM926EJS + asm("mcr p15, 0, %0, c7, c6, 0" : : "r" (0)); +#endif + return; +} +void invalidate_dcache_all(void) + __attribute__((weak, alias("__invalidate_dcache_all")));

On Wednesday, August 10, 2011 04:49:25 AM Hong Xu wrote:
After DMA operation, we need to maintain D-Cache coherency. So that the DCache must be invalidated (hence CPU will fetch data written by DMA controller from RAM).
Tested on AT91SAM9261EK with Peripheral DMA controller.
Hi Hong,
one more thing, not that I want to disappoint you.
Try to take a look at arch/arm/cpu/armv7/cache_v7.c
Maybe we should do the same for arm926ejs -- have arch/arm/cpu/arm926ejs/cache.c -- containing arm926ejs specific cache management functions. That way, arch/arm/lib/cache.c won't become mess.
What do you think ?
Signed-off-by: Hong Xu hong.xu@atmel.com Tested-by: Elen Song elen.song@atmel.com CC: Albert Aribaud albert.u.boot@aribaud.net CC: Aneesh V aneesh@ti.com CC: Marek Vasut marek.vasut@gmail.com CC: Reinhard Meyer u-boot@emk-elektronik.de CC: Heiko Schocher hs@denx.de
V2: Per Albert's suggestion, add invalidate_dcache_range
V3: invalidate_dcache_range emits warning when detecting unaligned buffer
invalidate_dcache_range won't clean any adjacent cache line when detecting unaligned buffer and only round up/down the buffer address
v4: invalidate_dcache_range will emit clearer warning message
Per Albert's suggestion, if not alighed to cache line size, round up start address, round down stop addres
Per Marek Vasut's suggestion, use __func__ stated in C99
arch/arm/lib/cache.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 58 insertions(+), 0 deletions(-)
diff --git a/arch/arm/lib/cache.c b/arch/arm/lib/cache.c index 92b61a2..4c8e160 100644 --- a/arch/arm/lib/cache.c +++ b/arch/arm/lib/cache.c @@ -53,3 +53,61 @@ void __flush_dcache_all(void) } void flush_dcache_all(void) __attribute__((weak, alias("__flush_dcache_all")));
+/*
- The buffer range to be invalidated is [start, stop)
- */
+void __invalidate_dcache_range(unsigned long start, unsigned long stop) +{
- int cache_line_len;
- unsigned long mva;
+#ifdef CONFIG_ARM926EJS +#ifdef CONFIG_SYS_CACHELINE_SIZE
- cache_line_len = CONFIG_SYS_CACHELINE_SIZE;
+#else
- /*
* ARM926EJ-S Technical Reference Manual, Chap 2.3.1 Table 2-9
* only b'10, aka. 32 bytes cache line len is valid
*/
- cache_line_len = 32;
+#endif
- mva = start;
- if ((mva & (cache_line_len - 1)) != 0) {
printf("WARNING: %s - start address 0x%08x not aligned to"
"cache line size(%d bytes)\n", __func__, start,
cache_line_len);
/* Round up starting address */
mva = (mva | (cache_line_len - 1)) + 1;
- }
- if ((stop & (cache_line_len - 1)) != 0) {
printf("WARNING: %s - stop address 0x%08x not aligned to"
"cache line size(%d bytes)\n", __func__, stop,
cache_line_len);
/* Round down ending address */
stop &= ~(cache_line_len - 1);
- }
- while (mva < stop) {
asm("mcr p15, 0, %0, c7, c6, 1" : : "r"(mva));
mva += cache_line_len;
- }
- /* Drain the WB */
- asm("mcr p15, 0, %0, c7, c10, 4" : : "r" (0));
+#endif
- return;
+} +void invalidate_dcache_range(unsigned long start, unsigned long stop)
- __attribute__((weak, alias("__invalidate_dcache_range")));
+void __invalidate_dcache_all(void) +{ +#ifdef CONFIG_ARM926EJS
- asm("mcr p15, 0, %0, c7, c6, 0" : : "r" (0));
+#endif
- return;
+} +void invalidate_dcache_all(void)
- __attribute__((weak, alias("__invalidate_dcache_all")));

Hello Hong,
Marek Vasut wrote:
On Wednesday, August 10, 2011 04:49:25 AM Hong Xu wrote:
After DMA operation, we need to maintain D-Cache coherency. So that the DCache must be invalidated (hence CPU will fetch data written by DMA controller from RAM).
Tested on AT91SAM9261EK with Peripheral DMA controller.
Hi Hong,
one more thing, not that I want to disappoint you.
Try to take a look at arch/arm/cpu/armv7/cache_v7.c
Maybe we should do the same for arm926ejs -- have arch/arm/cpu/arm926ejs/cache.c -- containing arm926ejs specific cache management functions. That way, arch/arm/lib/cache.c won't become mess.
What do you think ?
Full Ack, this should be moved like on armv7 to arch/arm/cpu/arm926ejs/cache.c
Thanks!
bye, Heiko

Hi Marek Vasut,
On 08/10/2011 01:52 PM, Marek Vasut wrote:
On Wednesday, August 10, 2011 04:49:25 AM Hong Xu wrote:
After DMA operation, we need to maintain D-Cache coherency. So that the DCache must be invalidated (hence CPU will fetch data written by DMA controller from RAM).
Tested on AT91SAM9261EK with Peripheral DMA controller.
Hi Hong,
one more thing, not that I want to disappoint you.
Not at all ;-)
To raise such discussion is not bad actually.
Try to take a look at arch/arm/cpu/armv7/cache_v7.c
Maybe we should do the same for arm926ejs -- have arch/arm/cpu/arm926ejs/cache.c -- containing arm926ejs specific cache management functions. That way, arch/arm/lib/cache.c won't become mess.
What do you think ?
Basically I'm not quite sure about the design of ARM cache part. I noticed the work for armv7, but I've thought it as a special case because armv7 cache part looks more complicated than ARM926.
There are some ARM926 specific code in arch/arm/lib/cache.c; So I also put the stuff there. ;-) I think Albert Aribaud or the original contributor of cache part shall have clearer view.So, I'll keep neutral to hear more ideas.
BR, Eric
Signed-off-by: Hong Xuhong.xu@atmel.com Tested-by: Elen Songelen.song@atmel.com CC: Albert Aribaudalbert.u.boot@aribaud.net CC: Aneesh Vaneesh@ti.com CC: Marek Vasutmarek.vasut@gmail.com CC: Reinhard Meyeru-boot@emk-elektronik.de CC: Heiko Schocherhs@denx.de
V2: Per Albert's suggestion, add invalidate_dcache_range
V3: invalidate_dcache_range emits warning when detecting unaligned buffer
invalidate_dcache_range won't clean any adjacent cache line when detecting unaligned buffer and only round up/down the buffer address
v4: invalidate_dcache_range will emit clearer warning message
Per Albert's suggestion, if not alighed to cache line size, round up start address, round down stop addres
Per Marek Vasut's suggestion, use __func__ stated in C99
arch/arm/lib/cache.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 58 insertions(+), 0 deletions(-)
[...]

Hi Hong Xu,
Le 10/08/2011 08:17, Hong Xu a écrit :
There are some ARM926 specific code in arch/arm/lib/cache.c; So I also put the stuff there. ;-) I think Albert Aribaud or the original contributor of cache part shall have clearer view.So, I'll keep neutral to hear more ideas.
Basically, cache operations are CP15 commands which are defined for each ARM architecture, not for each ISA, so Marek is right about the best place for this being in arm926ejs. Actually, I think arch/arm/lib/cache.c should only contain the weak defaults (i.e., no real cache action) and each architecture should provide overrides to the defaults. Currently this is almost the case, with (apart from arm926ejs which you're already touching) only one arm1136 specific implementation to move.
So please move the arm926ejs specific implementations to arch/arm/cpu/arm926ejs/cache.c as suggested by Marek, keeping only the weak default in arch/arm/lib/cache.c.
Amicalement,

Hi Albert,
On 08/10/2011 02:36 PM, Albert ARIBAUD wrote:
Hi Hong Xu,
Le 10/08/2011 08:17, Hong Xu a écrit :
There are some ARM926 specific code in arch/arm/lib/cache.c; So I also put the stuff there. ;-) I think Albert Aribaud or the original contributor of cache part shall have clearer view.So, I'll keep neutral to hear more ideas.
Basically, cache operations are CP15 commands which are defined for each ARM architecture, not for each ISA, so Marek is right about the best place for this being in arm926ejs. Actually, I think arch/arm/lib/cache.c should only contain the weak defaults (i.e., no real cache action) and each architecture should provide overrides to the defaults. Currently this is almost the case, with (apart from arm926ejs which you're already touching) only one arm1136 specific implementation to move.
So please move the arm926ejs specific implementations to arch/arm/cpu/arm926ejs/cache.c as suggested by Marek, keeping only the weak default in arch/arm/lib/cache.c.
Ok, I'll manage a basic version for review.
Thanks
BR, Eric
Amicalement,

On Wednesday, August 10, 2011 08:41:58 AM Hong Xu wrote:
Hi Albert,
On 08/10/2011 02:36 PM, Albert ARIBAUD wrote:
Hi Hong Xu,
Le 10/08/2011 08:17, Hong Xu a écrit :
There are some ARM926 specific code in arch/arm/lib/cache.c; So I also put the stuff there. ;-) I think Albert Aribaud or the original contributor of cache part shall have clearer view.So, I'll keep neutral to hear more ideas.
Basically, cache operations are CP15 commands which are defined for each ARM architecture, not for each ISA, so Marek is right about the best place for this being in arm926ejs. Actually, I think arch/arm/lib/cache.c should only contain the weak defaults (i.e., no real cache action) and each architecture should provide overrides to the defaults. Currently this is almost the case, with (apart from arm926ejs which you're already touching) only one arm1136 specific implementation to move.
So please move the arm926ejs specific implementations to arch/arm/cpu/arm926ejs/cache.c as suggested by Marek, keeping only the weak default in arch/arm/lib/cache.c.
Ok, I'll manage a basic version for review.
Can you also make a separate patch for the arm1136 possibly (to move the stuff from lib/cache.c)?
Anyway, thanks a lot for your endurance.
Thanks
BR, Eric
Amicalement,

Hi Albert, Hong,
On Wednesday 10 August 2011 12:06 PM, Albert ARIBAUD wrote:
Hi Hong Xu,
Le 10/08/2011 08:17, Hong Xu a écrit :
There are some ARM926 specific code in arch/arm/lib/cache.c; So I also put the stuff there. ;-) I think Albert Aribaud or the original contributor of cache part shall have clearer view.So, I'll keep neutral to hear more ideas.
Basically, cache operations are CP15 commands which are defined for each ARM architecture, not for each ISA, so Marek is right about the best
I am not sure if this is the case. I just quickly had a look at the ARMv5 and ARMv6 manuals. They are defining the CP15 instructions for cache operations(section 5.6.2 in ARMv5 manual and section B6.6.5 in the ARMv6 manual. And on first look, the CP15 operations look very similar
So, I feel that we can benefit many boards if Hong's operations are kept armv5/armv6 generic(you may have to carefully look at all operations and make sure they are valid for both armv5/v6).
My suggestion would be to create a new file in arch/arm /lib/cache_v5_v6.c and include this file in the build conditionally based on a config flag like CONFIG_SYS_ARM_CACHE_V5_V6_SUPPORT or something like that. Any platform that wants to use these operations can then just enable this flag.
BTW, ARMv7 is not really backward compatible in the way the operations are done, because it provides the capability to find the levels of caches and repeat the operations at all levels where as up to armv6 the CP15 operations were only for Level 1.
I have not looked at the compatibility of ARMv4 operations.
best regards, Aneesh

On Thursday, August 11, 2011 09:02:20 AM Aneesh V wrote:
Hi Albert, Hong,
On Wednesday 10 August 2011 12:06 PM, Albert ARIBAUD wrote:
Hi Hong Xu,
Le 10/08/2011 08:17, Hong Xu a écrit :
There are some ARM926 specific code in arch/arm/lib/cache.c; So I also put the stuff there. ;-) I think Albert Aribaud or the original contributor of cache part shall have clearer view.So, I'll keep neutral to hear more ideas.
Basically, cache operations are CP15 commands which are defined for each ARM architecture, not for each ISA, so Marek is right about the best
I am not sure if this is the case. I just quickly had a look at the ARMv5 and ARMv6 manuals. They are defining the CP15 instructions for cache operations(section 5.6.2 in ARMv5 manual and section B6.6.5 in the ARMv6 manual. And on first look, the CP15 operations look very similar
So, I feel that we can benefit many boards if Hong's operations are kept armv5/armv6 generic(you may have to carefully look at all operations and make sure they are valid for both armv5/v6).
My suggestion would be to create a new file in arch/arm /lib/cache_v5_v6.c and include this file in the build conditionally based on a config flag like CONFIG_SYS_ARM_CACHE_V5_V6_SUPPORT or something like that. Any platform that wants to use these operations can then just enable this flag.
BTW, ARMv7 is not really backward compatible in the way the operations are done, because it provides the capability to find the levels of caches and repeat the operations at all levels where as up to armv6 the CP15 operations were only for Level 1.
That's actually not a bad idea, but we need to be definitelly 100% sure it'll work for all these different v5 and v6 cores !
I have not looked at the compatibility of ARMv4 operations.
best regards, Aneesh

On 11/08/2011 09:30, Marek Vasut wrote:
That's actually not a bad idea, but we need to be definitelly 100% sure it'll work for all these different v5 and v6 cores !
That's what I'm afraid of: not all v5 or v6 cores may have the same set of cache functions. I'll have a look at various ARMv5 based cores to get a feel of the risk.
Amicalement,

On Friday, August 12, 2011 11:57:08 AM Albert ARIBAUD wrote:
On 11/08/2011 09:30, Marek Vasut wrote:
That's actually not a bad idea, but we need to be definitelly 100% sure it'll work for all these different v5 and v6 cores !
That's what I'm afraid of: not all v5 or v6 cores may have the same set of cache functions. I'll have a look at various ARMv5 based cores to get a feel of the risk.
Don't forget that but the usual cores produced by ARM, there are modified cores like Xscale core etc ... that's what sucks most
Amicalement,
participants (5)
-
Albert ARIBAUD
-
Aneesh V
-
Heiko Schocher
-
Hong Xu
-
Marek Vasut