[U-Boot] [v3] command/cache: Add flush command

When we copy code/data to the main memory, we may need to flush the cache if required by architecture. It uses the existing function flush_cache. Syntax is
flush <addr> <size>
The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.
Signed-off-by: York Sun yorksun@freescale.com --- Change since v2: rename flush_cache to flush
README | 5 ++++- common/cmd_cache.c | 26 ++++++++++++++++++++++++++ 2 files changed, 30 insertions(+), 1 deletion(-)
diff --git a/README b/README index dc6b0b3..b2f80d6 100644 --- a/README +++ b/README @@ -809,6 +809,7 @@ The following options need to be configured: CONFIG_CMD_BSP * Board specific commands CONFIG_CMD_BOOTD bootd CONFIG_CMD_CACHE * icache, dcache + CONFIG_CMD_CACHE_FLUSH * flush cache by the address and range CONFIG_CMD_CONSOLE coninfo CONFIG_CMD_CRC32 * crc32 CONFIG_CMD_DATE * support for RTC, date/time... @@ -912,7 +913,9 @@ The following options need to be configured: 8260 (where accesses to the IMMR region must be uncached), and it cannot be disabled on all other systems where we (mis-) use the data cache to hold an - initial stack and some data. + initial stack and some data. The CONFIG_CMD_CACHE_FLUSH + macro enables flushing cache by the address and range to + maintain coherency if required by architecture.
XXX - this list needs to get updated! diff --git a/common/cmd_cache.c b/common/cmd_cache.c index 5512f92..c228e0c 100644 --- a/common/cmd_cache.c +++ b/common/cmd_cache.c @@ -120,3 +120,29 @@ U_BOOT_CMD( "[on, off, flush]\n" " - enable, disable, or flush data (writethrough) cache" ); + +#ifdef CONFIG_CMD_CACHE_FLUSH +int do_flush_cache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) +{ + ulong addr, size; + + switch (argc) { + case 3: + addr = simple_strtoul(argv[1], NULL, 16); + size = simple_strtoul(argv[2], NULL, 16); + flush_cache(addr, size); + break; + default: + return cmd_usage(cmdtp); + } + return 0; + +} + +U_BOOT_CMD( + flush, 3, 0, do_flush_cache, + "flush cache for a range", + "<addr> <size>\n" + " - flush cache for specificed range" +); +#endif

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 04/05/2013 04:50 PM, York Sun wrote:
When we copy code/data to the main memory, we may need to flush the cache if required by architecture. It uses the existing function flush_cache. Syntax is
flush <addr> <size>
The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.
Signed-off-by: York Sun yorksun@freescale.com --- Change since v2: rename flush_cache to flush
Wait, why? flush_cache was clear as to what we were doing, flush less so. At least that's my first reaction.
- -- Tom

On 04/05/2013 02:00 PM, Tom Rini wrote:
On 04/05/2013 04:50 PM, York Sun wrote:
When we copy code/data to the main memory, we may need to flush the cache if required by architecture. It uses the existing function flush_cache. Syntax is
flush <addr> <size>
The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.
Signed-off-by: York Sun yorksun@freescale.com --- Change since v2: rename flush_cache to flush
Wait, why? flush_cache was clear as to what we were doing, flush less so. At least that's my first reaction.
Scott suggested the underscore is not recommended for the command name. Shall I change it to cacheflush or flushcache?
York

Dear York Sun,
In message 1365195056-20188-1-git-send-email-yorksun@freescale.com you wrote:
When we copy code/data to the main memory, we may need to flush the cache if required by architecture. It uses the existing function flush_cache. Syntax is
flush <addr> <size>
Plain "flush" is way too generic a name. I think we should make it clear from the command invocation that we are dealing with caches here.
Actually I think we should not even use a new command for this - we already have the "dcahe" and "icache" commands for this purpose.
What do you think about implementiung this as a subcommand to "dcache"? Something like:
So far:
dcache on dcache off
adding new:
dcache flush => flush all dcache flush <addr> <size> => flush range
I think this makes more sense. Comments?
Best regards,
Wolfgang Denk

On 04/05/2013 03:09 PM, Wolfgang Denk wrote:
Dear York Sun,
In message 1365195056-20188-1-git-send-email-yorksun@freescale.com you wrote:
When we copy code/data to the main memory, we may need to flush the cache if required by architecture. It uses the existing function flush_cache. Syntax is
flush <addr> <size>
Plain "flush" is way too generic a name. I think we should make it clear from the command invocation that we are dealing with caches here.
Actually I think we should not even use a new command for this - we already have the "dcahe" and "icache" commands for this purpose.
What do you think about implementiung this as a subcommand to "dcache"? Something like:
So far:
dcache on dcache off
adding new:
dcache flush => flush all dcache flush <addr> <size> => flush range
I think this makes more sense. Comments?
It would if the command only deals with dcache. This command flushes dcache _and_ invalidates icache.
If "flush_cache" is acceptable, we can use v2. If not, please suggest one. My candidates are "flushcache", "cacheflush".
York

Dear York Sun,
In message 515F5812.8030008@freescale.com you wrote:
adding new:
dcache flush => flush all dcache flush <addr> <size> => flush range
I think this makes more sense. Comments?
It would if the command only deals with dcache. This command flushes dcache _and_ invalidates icache.
Then the name "flush" is even more a bad choice.
If "flush_cache" is acceptable, we can use v2. If not, please suggest one. My candidates are "flushcache", "cacheflush".
Can we not split this into:
dcache flush icache invalidate
? This would make clear what's happening.
Best regards,
Wolfgang Denk

On Apr 6, 2013, at 12:01 AM, Wolfgang Denk wrote:
Dear York Sun,
In message 515F5812.8030008@freescale.com you wrote:
adding new:
dcache flush => flush all dcache flush <addr> <size> => flush range
I think this makes more sense. Comments?
It would if the command only deals with dcache. This command flushes dcache _and_ invalidates icache.
Then the name "flush" is even more a bad choice.
If "flush_cache" is acceptable, we can use v2. If not, please suggest one. My candidates are "flushcache", "cacheflush".
Can we not split this into:
dcache flush icache invalidate
? This would make clear what's happening.
The idea is to reuse existing code with minimum addition. For the applications concerned, these two steps are both needed. Splitting them doesn't make things easier. If I have to use existing command, I'd rather to put these two steps under icache invalide <addr> <size>.
York

Dear sun york-R58495,
In message C707E9F4D8007146BF8DC1424B113AC70B3A435B@039-SN2MPN1-012.039d.mgd.msft.net you wrote:
Can we not split this into:
dcache flush icache invalidate
? This would make clear what's happening.
The idea is to reuse existing code with minimum addition. For the applicati ons concerned, these two steps are both needed. Splitting them doesn't make things easier.
Reusing code is a Good Thing, but not when it comes at the cost of obfucating what the code actually does.
If I have to use existing command, I'd rather to put these two steps under icache invalide <addr> <size>.
No, this is not acceptable. The "icache" command deals with the IC only, it must not meddle with the data cache (like flushing it).
Best regards,
Wolfgang Denk

On Apr 7, 2013, at 1:29 AM, Wolfgang Denk wrote:
Dear sun york-R58495,
In message C707E9F4D8007146BF8DC1424B113AC70B3A435B@039-SN2MPN1-012.039d.mgd.msft.net you wrote:
Can we not split this into:
dcache flush icache invalidate
? This would make clear what's happening.
The idea is to reuse existing code with minimum addition. For the applicati ons concerned, these two steps are both needed. Splitting them doesn't make things easier.
Reusing code is a Good Thing, but not when it comes at the cost of obfucating what the code actually does.
If I have to use existing command, I'd rather to put these two steps under icache invalide <addr> <size>.
No, this is not acceptable. The "icache" command deals with the IC only, it must not meddle with the data cache (like flushing it).
I think it is best to keep this patch as it and stick with the original flush_cache name. It uses the existing function flush_cache() which is available for most (if not all) architectures. Splitting the dcache and icache not only adds more code, but architecture-dependent code.
York

Dear sun york-R58495,
In message C707E9F4D8007146BF8DC1424B113AC70B3A6A83@039-SN2MPN1-012.039d.mgd.msft.net you wrote:
I think it is best to keep this patch as it and stick with the original flush_cache name. It uses the existing function flush_cache() which is available for most (if not all) architectures. Splitting the dcache and icache not only adds more code, but architecture-dependent code.
As mentioned before: reusing existing code is fine, but we already have commands for cache handling, and adding arbitrary new ones to implement combinations of functions makes does not scale. Assume tomorrow someone needs to add more ganular handling for example regarding L2 caches, etc. Would you suggest to add another set of new commands, then? This makes no sense.
Please implement IC related operations as subcommands to the "icache" command, and DC releated ones as subcommands to "dcache".
[In the example of L2 cache above, it would be for example sufficient to add a "-L2" option to the "icache" / "dcache" commands.]
Best regards,
Wolfgang Denk

On 04/08/2013 01:35:13 PM, Wolfgang Denk wrote:
Dear sun york-R58495,
In message C707E9F4D8007146BF8DC1424B113AC70B3A6A83@039-SN2MPN1-012.039d.mgd.msft.net you wrote:
I think it is best to keep this patch as it and stick with the original flush_cache name. It uses the existing function flush_cache() which is available for most (if not all)
architectures.
Splitting the dcache and icache not only adds more code, but architecture-dependent code.
As mentioned before: reusing existing code is fine, but we already have commands for cache handling, and adding arbitrary new ones to implement combinations of functions makes does not scale. Assume tomorrow someone needs to add more ganular handling for example regarding L2 caches, etc. Would you suggest to add another set of new commands, then? This makes no sense.
Maybe "cache" should be the toplevel command, with "icache" and "dcache" refactored to be subcommands? Of course, then you're making an incompatible interface change. How much is consistency worth?
Please implement IC related operations as subcommands to the "icache" command, and DC releated ones as subcommands to "dcache".
The whole point of the patch is to expose the existing flush_cache() functionality, which is not split into icache/dcache. From the user's perspective, it's a command to flush the specified region out of *all* caches. It's an implementation detail that some hardware or architectures accomplish this using separate dcache and icache instructions. If you make the interface be "icache/dcache", how would you handle hardware where the flushing mechanism (or even the cache itself) is not split?
[In the example of L2 cache above, it would be for example sufficient to add a "-L2" option to the "icache" / "dcache" commands.]
Would it? On our chips L2 cache is (more or less) unified. There's no separate icache/dcache flush.
-Scott

Dear Scott,
In message 1365447916.28843.7@snotra you wrote:
Maybe "cache" should be the toplevel command, with "icache" and "dcache" refactored to be subcommands? Of course, then you're making an incompatible interface change. How much is consistency worth?
I think backward compatibility is mandatory here. We cannot break existing user scripts.
I'm also not convinced that merging this into one command would be a better design. I think the current split is a pretty good represen- tation of what the hardware looks like and how i works. Let's keep it.
The whole point of the patch is to expose the existing flush_cache() functionality, which is not split into icache/dcache. From the user's
I understand this. But while the combination of IC and DC related operations into one function may be convenient for internal use, it is not a good idea when exposed externally.
perspective, it's a command to flush the specified region out of *all* caches. It's an implementation detail that some hardware or
I understand what you mean, but actually we do not flush the IC, we invalidate it, which is something different. I don't want to start a discussion here if flush_cache() is actually a bad function name (when discussing the functionality, it is), not do I want to suggest to change that name.
But when we publish such interfaces to the end user, it is important to be precise in our terminology.
architectures accomplish this using separate dcache and icache instructions. If you make the interface be "icache/dcache", how would you handle hardware where the flushing mechanism (or even the cache itself) is not split?
If IC and DC are the same thing, the same function can be used to implement the operations. "icache" and "dcache" would then run the same code, i. e. be aliases.
[In the example of L2 cache above, it would be for example sufficient to add a "-L2" option to the "icache" / "dcache" commands.]
Would it? On our chips L2 cache is (more or less) unified. There's no separate icache/dcache flush.
See above. In such a case "icache" and "dcache" can just call the same underlying code.
Best regards,
Wolfgang Denk

On 04/08/2013 02:18:20 PM, Wolfgang Denk wrote:
Dear Scott,
In message 1365447916.28843.7@snotra you wrote:
Maybe "cache" should be the toplevel command, with "icache" and "dcache" refactored to be subcommands? Of course, then you're
making
an incompatible interface change. How much is consistency worth?
I think backward compatibility is mandatory here. We cannot break existing user scripts.
Sure. But if the main reason for the icache/dcache split is compatibility, I don't think that should constrict the form of new commands.
I'm also not convinced that merging this into one command would be a better design. I think the current split is a pretty good represen- tation of what the hardware looks like and how i works. Let's keep it.
The whole point of the patch is to expose the existing flush_cache() functionality, which is not split into icache/dcache. From the
user's
I understand this. But while the combination of IC and DC related operations into one function may be convenient for internal use, it is not a good idea when exposed externally.
perspective, it's a command to flush the specified region out of
*all*
caches. It's an implementation detail that some hardware or
I understand what you mean, but actually we do not flush the IC, we invalidate it, which is something different. I don't want to start a discussion here if flush_cache() is actually a bad function name (when discussing the functionality, it is), not do I want to suggest to change that name.
But when we publish such interfaces to the end user, it is important to be precise in our terminology.
I think the terminology is just fine. It's not just invalidating the icache (flushing and invalidating are the same thing for cache lines which are not modified -- or are incapable of being modified). It's flushing the region out of *all* caches.
What actual use case is there for only wanting to flush one or the other?
architectures accomplish this using separate dcache and icache instructions. If you make the interface be "icache/dcache", how
would
you handle hardware where the flushing mechanism (or even the cache itself) is not split?
If IC and DC are the same thing, the same function can be used to implement the operations. "icache" and "dcache" would then run the same code, i. e. be aliases.
[In the example of L2 cache above, it would be for example
sufficient
to add a "-L2" option to the "icache" / "dcache" commands.]
Would it? On our chips L2 cache is (more or less) unified.
There's no
separate icache/dcache flush.
See above. In such a case "icache" and "dcache" can just call the same underlying code.
And then we end up having to do the flush twice, if the user follows the "first flush dcache, then flush icache" instructions. That's not an ideal interface.
-Scott

Dear Scott,
In message 1365449512.28843.10@snotra you wrote:
Maybe "cache" should be the toplevel command, with "icache" and "dcache" refactored to be subcommands? Of course, then you're making an incompatible interface change. How much is consistency worth?
I think backward compatibility is mandatory here. We cannot break existing user scripts.
Sure. But if the main reason for the icache/dcache split is compatibility, I don't think that should constrict the form of new commands.
Backward compatibility is just the argument for not changing the existing command interfaces. I tried to explain why I do not want to see the flush_cache() combination of functions exposed to end users.
I think the terminology is just fine. It's not just invalidating the icache (flushing and invalidating are the same thing for cache lines which are not modified -- or are incapable of being modified). It's flushing the region out of *all* caches.
There is a well-defined difference between flushing and invalidating a cache, and I don't intend to go into hair-splitting mode here just to follow your path which would lead me to where I d not want to go.
See above. In such a case "icache" and "dcache" can just call the same underlying code.
And then we end up having to do the flush twice, if the user follows the "first flush dcache, then flush icache" instructions. That's not an ideal interface.
OK, this is indeed a good argument. But is this really the case? When looking at the current code of flash_cache() for PPC, it seems actually easy to split:
28 void flush_cache(ulong start_addr, ulong size) 29 { 30 #ifndef CONFIG_5xx 31 ulong addr, start, end; 32 33 start = start_addr & ~(CONFIG_SYS_CACHELINE_SIZE - 1); 34 end = start_addr + size - 1; 35 36 for (addr = start; (addr <= end) && (addr >= start); 37 addr += CONFIG_SYS_CACHELINE_SIZE) { 38 asm volatile("dcbst 0,%0" : : "r" (addr) : "memory"); 39 WATCHDOG_RESET(); 40 } 41 /* wait for all dcbst to complete on bus */ 42 asm volatile("sync" : : : "memory"); 43 44 for (addr = start; (addr <= end) && (addr >= start); 45 addr += CONFIG_SYS_CACHELINE_SIZE) { 46 asm volatile("icbi 0,%0" : : "r" (addr) : "memory"); 47 WATCHDOG_RESET(); 48 } 49 asm volatile("sync" : : : "memory"); 50 /* flush prefetch queue */ 51 asm volatile("isync" : : : "memory"); 52 #endif 53 }
Lines 36..42 would go into the "dcache" command, while lines 44..51 would go into the "icache" command.
So far this code appears to work fine. What am I missing?
Best regards,
Wolfgang Denk

On 04/09/2013 12:45:31 PM, Wolfgang Denk wrote:
Dear Scott,
In message 1365449512.28843.10@snotra you wrote:
I think the terminology is just fine. It's not just invalidating
the
icache (flushing and invalidating are the same thing for cache lines which are not modified -- or are incapable of being modified). It's flushing the region out of *all* caches.
There is a well-defined difference between flushing and invalidating a cache, and I don't intend to go into hair-splitting mode here just to follow your path which would lead me to where I d not want to go.
If that means you have some other reason for objecting that you *do* want to go into, could you elaborate?
I do agree that flush is not a good name for what we do on the data side for PPC, given that we use dcbst and not dcbf.
See above. In such a case "icache" and "dcache" can just call the same underlying code.
And then we end up having to do the flush twice, if the user follows the "first flush dcache, then flush icache" instructions. That's
not
an ideal interface.
OK, this is indeed a good argument. But is this really the case? When looking at the current code of flash_cache() for PPC, it seems actually easy to split:
28 void flush_cache(ulong start_addr, ulong size) 29 { 30 #ifndef CONFIG_5xx 31 ulong addr, start, end; 32 33 start = start_addr & ~(CONFIG_SYS_CACHELINE_SIZE - 1); 34 end = start_addr + size - 1; 35 36 for (addr = start; (addr <= end) && (addr >= start); 37 addr += CONFIG_SYS_CACHELINE_SIZE) { 38 asm volatile("dcbst 0,%0" : : "r" (addr) : "memory"); 39 WATCHDOG_RESET(); 40 } 41 /* wait for all dcbst to complete on bus */ 42 asm volatile("sync" : : : "memory"); 43 44 for (addr = start; (addr <= end) && (addr >= start); 45 addr += CONFIG_SYS_CACHELINE_SIZE) { 46 asm volatile("icbi 0,%0" : : "r" (addr) : "memory"); 47 WATCHDOG_RESET(); 48 } 49 asm volatile("sync" : : : "memory"); 50 /* flush prefetch queue */ 51 asm volatile("isync" : : : "memory"); 52 #endif 53 }
would go into the "icache" command. Lines 36..42 would go into the "dcache" command, while lines 44..51
So far this code appears to work fine. What am I missing?
This is just one implementation of flush_cache(). Here are all of them:
arch/sparc/lib/cache.c:void flush_cache(ulong start_addr, ulong size) arch/openrisc/cpu/cache.c:void flush_cache(unsigned long addr, unsigned long size) arch/powerpc/lib/cache.c:void flush_cache(ulong start_addr, ulong size) arch/nds32/lib/cache.c:void flush_cache(unsigned long addr, unsigned long size) arch/m68k/lib/cache.c:void flush_cache(ulong start_addr, ulong size) arch/sh/cpu/sh4/cpu.c:void flush_cache (unsigned long addr, unsigned long size) arch/sh/cpu/sh2/cpu.c:void flush_cache(unsigned long addr, unsigned long size) arch/sh/cpu/sh3/cpu.c:void flush_cache(unsigned long addr, unsigned long size) arch/microblaze/cpu/cache.c:void flush_cache (ulong addr, ulong size) arch/arm/cpu/arm1136/cpu.c:void flush_cache(unsigned long start, unsigned long size) arch/arm/cpu/arm1136/cpu.c:void flush_cache(unsigned long start, unsigned long size) arch/arm/cpu/arm926ejs/cache.c:void flush_cache(unsigned long start, unsigned long size) arch/arm/cpu/arm926ejs/cache.c:void flush_cache(unsigned long start, unsigned long size) arch/blackfin/lib/cache.c:void flush_cache(unsigned long addr, unsigned long size) arch/mips/cpu/mips64/cpu.c:void flush_cache(ulong start_addr, ulong size) arch/mips/cpu/mips32/cpu.c:void flush_cache(ulong start_addr, ulong size) arch/mips/cpu/xburst/cpu.c:void flush_cache(ulong start_addr, ulong size)
This is what we're trying to expose as a command. If you want it split into icache and dcache, it needs to be rewritten for every architecture (or else we'd need to limit the command to only running on certain architectures), and not all of them are as easily splittable. What is the actual need for splitting it? The semantics we're trying to expose as a command are the same as this function is meant to implement, regardless of the choice of name. Those semantics are "make the data we just wrote to this range visible to instruction fetches".
-Scott

Hi, experts: What is the abbreviation of DWMMC / DWMCI? I could not google any doc introducing them! Are they the new types of EMMC device?
Best wishes,

Dear TigerLiu@viatech.com.cn,
In message FE7ADED5C2218B4786C09CD97DC4C49F8128FC@exchbj02.viatech.com.bj you wrote:
What is the abbreviation of DWMMC / DWMCI?
These are already abbreviations; I don't think you can shorten them even more without losing too much of information ;-)
I could not google any doc introducing them! Are they the new types of EMMC device?
No. "DW" here just means "DesignWare" and is the name of a Synopsys trademark / controller name; see for example http://www.synopsys.com/dw/ipdir.php?ds=dwc_mobile_storage
Best regards,
Wolfgang Denk

Hi, Denk: Thanks for your reply! Some Samsung engineers were updating DWMMC related drivers. So, Samsung SOC adopted this DWMCI IP?
Best wishes,
-----邮件原件----- 发件人: Wolfgang Denk [mailto:wd@denx.de] 发送时间: 2013年4月10日 19:58 收件人: Tiger Liu 抄送: u-boot@lists.denx.de 主题: Re: [U-Boot] DWMMC / DWMCI question
Dear TigerLiu@viatech.com.cn,
In message FE7ADED5C2218B4786C09CD97DC4C49F8128FC@exchbj02.viatech.com.bj you wrote:
What is the abbreviation of DWMMC / DWMCI?
These are already abbreviations; I don't think you can shorten them even more without losing too much of information ;-)
I could not google any doc introducing them! Are they the new types of EMMC device?
No. "DW" here just means "DesignWare" and is the name of a Synopsys trademark / controller name; see for example http://www.synopsys.com/dw/ipdir.php?ds=dwc_mobile_storage
Best regards,
Wolfgang Denk

Dear Scott,
In message 1365555490.31043.20@snotra you wrote:
If that means you have some other reason for objecting that you *do* want to go into, could you elaborate?
I explained that before: we already have commands to operate with the caches, and we should rather use the existingones and extend these as needed instead of adding arbitrary new ones.
This is just one implementation of flush_cache(). Here are all of them:
Thanks for the list. But that was not my question. I asked (ok, in the other message):
| ... Can you please point me to | the code (ideally in mainline U-Boot) which would cause problems with | the suggested separation of invalidating the IC and flushing the DC | into subcommands of the "icache" resp. "dcache" commands?
Yes, I did have a look at all these implementations, and I think none of them will cause any issues. Actually quite a number of them are actually just combiing calls like that:
"arch/openrisc/cpu/cache.c":
54 void flush_cache(unsigned long addr, unsigned long size) 55 { 56 flush_dcache_range(addr, addr + size); 57 invalidate_icache_range(addr, addr + size); 58 }
I do not claim to have perfect understanding of all arhcitectures and implementations; I may easily miss something. So could you please tell me _exactly_ where you see issues?
This is what we're trying to expose as a command. If you want it split into icache and dcache, it needs to be rewritten for every architecture
Actually it appears to be already split quite naturally for all currently supported architectures (at least to the extend these implement this functinality at all). flush_cache() is just a convenience, and if you think twice not even a very lucky one. The openrisc example above shows this pretty clearly.
So far, I cannot see a real technical problem in doing what I proposed, i. e. running flush_dcache_range() as "dcache flush" sub-command, and invalidate_icache_range() as "icache invalidate" sub-command. I see no need to define a new command, and I see no technical problems with the implementation of the new subcommands.
If I'm missing something, and you are aware of any real problems, then please point these out.
(or else we'd need to limit the command to only running on certain architectures), and not all of them are as easily splittable. What is
Where exactly do you see problems? Do you have any unpublished code for a new architecture sitting somewhere, or what? The current code in mainline U-Boot should be straightforward to handle.
the actual need for splitting it? The semantics we're trying to expose as a command are the same as this function is meant to implement,
You defend this stupid function flush_cache() as if it was of major achievement of the development of software in the last decades? I'm sorry, but it ain't so. It's just a pretty stupid thing without any real value. I can imagine to see it gone without sheding a tear.
regardless of the choice of name. Those semantics are "make the data we just wrote to this range visible to instruction fetches".
...and the implementation usually requires the two steps flush_dcache_range() and invalidate_icache_range(), which translated straightforward into "dcache flush" resp. "icache invalidate" commands.
Best regards,
Wolfgang Denk

On 04/10/2013 06:54:25 AM, Wolfgang Denk wrote:
Dear Scott,
In message 1365555490.31043.20@snotra you wrote:
If that means you have some other reason for objecting that you *do* want to go into, could you elaborate?
I explained that before: we already have commands to operate with the caches, and we should rather use the existingones and extend these as needed instead of adding arbitrary new ones.
The existing ones have semantics that are mismatched to what we want to expose.
This is just one implementation of flush_cache(). Here are all of
them:
Thanks for the list. But that was not my question. I asked (ok, in the other message):
It was my answer to your question. If you don't like my answer, fine.
| ... Can you please point me to | the code (ideally in mainline U-Boot) which would cause problems with | the suggested separation of invalidating the IC and flushing the DC | into subcommands of the "icache" resp. "dcache" commands?
Yes, I did have a look at all these implementations, and I think none of them will cause any issues.
Needing to touch them at all is a big issue. We don't know the details of all these architectures, and cannot test them. If there were a good *reason* for it we could engage the maintainers of those architectures, but I've yet to hear a reason that justifies the hassle. We'd probably be better off just keeping the patch in our internal tree, which is the tree that the user who ran into this problem is using.
Actually quite a number of them are actually just combiing calls like that:
Many have the instruction/data sequence inside the loop so we'd need to pick it apart (higher risk of introducing a bug, so more need for testing that we cannot do). Blackfin is weird -- if we did a simple split at the C-code level it looks like we'd have two dummy loops executing.
This is what we're trying to expose as a command. If you want it
split
into icache and dcache, it needs to be rewritten for every
architecture
Actually it appears to be already split quite naturally for all currently supported architectures (at least to the extend these implement this functinality at all). flush_cache() is just a convenience, and if you think twice not even a very lucky one. The openrisc example above shows this pretty clearly.
The openrisc example does not show any great difficulty implementing flush_cache().
the actual need for splitting it? The semantics we're trying to
expose
as a command are the same as this function is meant to implement,
You defend this stupid function flush_cache() as if it was of major achievement of the development of software in the last decades?
"Major achievement"? No, I just said it was a useful (but poorly named) abstraction, which it is. What do we gain by replacing every caller of flush_cache() with two function calls instead? When would we ever call one but not the other, and if there is such a case, what harm would come out of doing both anyway?
-Scott

Dear Scott,
In message 1365622923.8381.10@snotra you wrote:
I explained that before: we already have commands to operate with the caches, and we should rather use the existingones and extend these as needed instead of adding arbitrary new ones.
The existing ones have semantics that are mismatched to what we want to expose.
Agreed. So we can either change the API to match your wishes, or we can ask you to adapt to the existing API. The first is less effort, the second is conceptionally cleaner.
I prefer the second one.
Many have the instruction/data sequence inside the loop so we'd need to pick it apart (higher risk of introducing a bug, so more need for
Come one. This is actually all pretty trivial code. I don't buy this argument.
testing that we cannot do). Blackfin is weird -- if we did a simple split at the C-code level it looks like we'd have two dummy loops executing.
Huh? flush_cache() does not include any loop for BF.
Actually it appears to be already split quite naturally for all currently supported architectures (at least to the extend these implement this functinality at all). flush_cache() is just a convenience, and if you think twice not even a very lucky one. The openrisc example above shows this pretty clearly.
The openrisc example does not show any great difficulty implementing flush_cache().
Correct, and neither does any other of the existing architectures. It's a plain stupid laborious task; it does not involve any kind of critical operations or other forms of rocket science.
Best regards,
Wolfgang Denk

On 04/10/2013 04:04:43 PM, Wolfgang Denk wrote:
Dear Scott,
In message 1365622923.8381.10@snotra you wrote:
testing that we cannot do). Blackfin is weird -- if we did a simple split at the C-code level it looks like we'd have two dummy loops executing.
Huh? flush_cache() does not include any loop for BF.
Yes it does. It's in assembly code (do_flush macro in cache.S). Or at least, it looks sort of like a loop -- I'm not familiar with blackfin assembly code.
-Scott

Dear Scott Wood,
In message 1365628243.8381.20@snotra you wrote:
testing that we cannot do). Blackfin is weird -- if we did a simple split at the C-code level it looks like we'd have two dummy loops executing.
Huh? flush_cache() does not include any loop for BF.
Yes it does. It's in assembly code (do_flush macro in cache.S). Or at least, it looks sort of like a loop -- I'm not familiar with blackfin assembly code.
do_flush is not used anywhere within flush_cache(); we are talking about "arch/blackfin/lib/cache.c" here, don't we?
What the actual implementation of blackfin_icache_dcache_flush_range() resp. blackfin_icache_flush_range() resp. blackfin_dcache_flush_range() might look like is of no real interest for this discussion here, right? We just need to call these functions, there is no need to touch these in any way in the context of what we are discussing here.
Do you have any real technical arguments?
Best regards,
Wolfgang Denk

On 04/10/2013 05:50:56 PM, Wolfgang Denk wrote:
Dear Scott Wood,
In message 1365628243.8381.20@snotra you wrote:
testing that we cannot do). Blackfin is weird -- if we did a
simple
split at the C-code level it looks like we'd have two dummy
loops
executing.
Huh? flush_cache() does not include any loop for BF.
Yes it does. It's in assembly code (do_flush macro in cache.S).
Or at
least, it looks sort of like a loop -- I'm not familiar with
blackfin
assembly code.
do_flush is not used anywhere within flush_cache(); we are talking about "arch/blackfin/lib/cache.c" here, don't we?
Sigh. blackfin's flush_cache() calls blackfin_icache_dcache_flush_range(), which is in cache.S.
blackfin_icache_dcache_flush_range() calls do_flush.
What the actual implementation of blackfin_icache_dcache_flush_range() resp. blackfin_icache_flush_range() resp. blackfin_dcache_flush_range() might look like is of no real interest for this discussion here, right? We just need to call these functions, there is no need to touch these in any way in the context of what we are discussing here.
Really, you'd be OK with leaving dead code around? Nothing else would call blackfin_icache_dcache_flush_range() if we remove the combined flushing in cache.c. Even if we remove blackfin_icache_dcache_flush_range(), that leaves mechanics in do_flush that are no longer needed (the ability to do two different cache ops inside the loop).
Do you have any real technical arguments?
I'm done arguing and have already recommended we just keep this patch in our local tree. We do not have unlimited time to spend messing around with other arch code to produce a result that wouldn't be better in any meaningful way. If someone else wants to hack up flush_cache(), they're welcome to.
-Scott

Dear Scott,
In message 1365634846.8381.24@snotra you wrote:
Do you have any real technical arguments?
I'm done arguing and have already recommended we just keep this patch in our local tree. We do not have unlimited time to spend messing around with other arch code to produce a result that wouldn't be better in any meaningful way. If someone else wants to hack up flush_cache(), they're welcome to.
Let me try to summarize the situation:
- You want to add a user command that implements some cache operations for a range of addresses. The needed operations are 1) flushing the data cache; and 2) invalidating the instruction cache.
- U-Boot currently provides a common, architecture-independent C API that implements this functionality:
void flush_cache(ulong start_addr, ulong size)
All architectures provide such a function, but implementation differs: some provide functions like
void flush_dcache_range(unsigned long addr, unsigned long end) void invalidate_icache_range(unsigned long addr, unsigned long end)
while others (like SPARC) always flush the whole cache, and even other implement only parts like data cache flushing (like sh4) or just an empty dummy function (like sh2).
- The cheapest way (in terms of efforts) to provide the requested command line API would be to add a new command that maps to the existing flush_cache() function.
- However, we already have a command line API that deals with cache operations: the commands "icache" and "dcache".
- The existing command line API could be extended like this:
dcache flush addr size => flush_dcache_range(addr, addr+size) icache invalidate addr size => invalidate_icache_range(addr, addr+size)
[Additional functionality, like flushing/invalidating the whole caches if arguments are omitted, are straightforward to add.]
- This implementation requires more effort, since all architectures need to be touched (and then tested), and some of the architecture specific code needs a closer look. Also. this approach is less convenient to the user who now needs to run two commands instead of one.
---------------------------------------------------------------------
Do you agree to this summary, or am I missing important points or is anything wrong?
Cache handling has always been a much neglected area in U-Boot. There has never been an attempt to provide a common, generic API (neither in C nor on the command line) for it. Instead, cache support has always been added only where needed, i. e. when some cache related bug popped up. You can see the result of this by just comparing the implemen- tations of flush_cache() for the different architectures: basically evera architecture invented her approach and interfaces. This is an area where the code is not exactly in perfect shape - any work to unify this wuld be more than welcome.
When we are now discussing to provide a command line API to more cache related functions we should not make the same mistakes again. That means that we should not just take existing code and use it just because it is existing (and thus convenient to use), but we should spend a thought or two to come up with a simple yet powerful (and backward compatible) command line API that can be used across all currently known architectures.
This is why I resist to the cheap and easy way of just adding somthing like a "flush_cache" command.
Best regards,
Wolfgang Denk

On 04/07/2013 03:29:31 AM, Wolfgang Denk wrote:
Dear sun york-R58495,
In message C707E9F4D8007146BF8DC1424B113AC70B3A435B@039-SN2MPN1-012.039d.mgd.msft.net you wrote:
Can we not split this into:
dcache flush icache invalidate
? This would make clear what's happening.
The idea is to reuse existing code with minimum addition. For the
applicati
ons concerned, these two steps are both needed. Splitting them
doesn't make
things easier.
Reusing code is a Good Thing, but not when it comes at the cost of obfucating what the code actually does.
If I have to use existing command, I'd rather to put these two
steps under
icache invalide <addr> <size>.
No, this is not acceptable. The "icache" command deals with the IC only, it must not meddle with the data cache (like flushing it).
I thought you said it was OK to flush more than the user asked for, if the implementation does not have separate icache/dcache flushes? Why is it fundamentally different if it's a hardware limitation, or a limitation of the software layer whose functionality is being exposed?
-Scott

Dear Scott Wood,
In message 1365450620.28843.12@snotra you wrote:
I thought you said it was OK to flush more than the user asked for, if the implementation does not have separate icache/dcache flushes? Why is it fundamentally different if it's a hardware limitation, or a limitation of the software layer whose functionality is being exposed?
I don't get what you are trying to prove. Can you please point me to the code (ideally in mainline U-Boot) which would cause problems with the suggested separation of invalidating the IC and flushing the DC into subcommands of the "icache" resp. "dcache" commands?
Best regards,
Wolfgang Denk
participants (6)
-
Scott Wood
-
sun york-R58495
-
TigerLiu@viatech.com.cn
-
Tom Rini
-
Wolfgang Denk
-
York Sun