[U-Boot] [RFC] command/cache: Add flush_cache command

When we need the copied code/data in the main memory, we can flush the cache now. It uses the existing function flush_cache. Syntax is
flush_cache <addr> <size>
The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.
Signed-off-by: York Sun yorksun@freescale.com --- common/cmd_cache.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/common/cmd_cache.c b/common/cmd_cache.c index 5512f92..93b7337 100644 --- a/common/cmd_cache.c +++ b/common/cmd_cache.c @@ -94,6 +94,29 @@ int do_dcache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) return 0; }
+void __weak flush_cache(ulong addr, ulong size) +{ + puts("No arch specific flush_cache available!\n"); + /* please define arch specific flush_cache */ +} + +int do_flush_cache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) +{ + ulong addr, size; + + switch (argc) { + case 3: + addr = simple_strtoul(argv[1], NULL, 16); + size = simple_strtoul(argv[2], NULL, 16); + flush_cache(addr, size); + break; + default: + return cmd_usage(cmdtp); + } + return 0; + +} + static int parse_argv(const char *s) { if (strcmp(s, "flush") == 0) @@ -120,3 +143,10 @@ U_BOOT_CMD( "[on, off, flush]\n" " - enable, disable, or flush data (writethrough) cache" ); + +U_BOOT_CMD( + flush_cache, 3, 0, do_flush_cache, + "flush cache for a range", + "<addr> <size>\n" + " - flush cache for specificed range" +);

Hi York,
On Tue, 19 Mar 2013 13:29:52 -0700, York Sun yorksun@freescale.com wrote:
When we need the copied code/data in the main memory, we can flush the cache now. It uses the existing function flush_cache. Syntax is
flush_cache <addr> <size>
The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.
Signed-off-by: York Sun yorksun@freescale.com
common/cmd_cache.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/common/cmd_cache.c b/common/cmd_cache.c index 5512f92..93b7337 100644 --- a/common/cmd_cache.c +++ b/common/cmd_cache.c @@ -94,6 +94,29 @@ int do_dcache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) return 0; }
+void __weak flush_cache(ulong addr, ulong size) +{
- puts("No arch specific flush_cache available!\n");
- /* please define arch specific flush_cache */
+}
+int do_flush_cache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) +{
- ulong addr, size;
- switch (argc) {
- case 3:
addr = simple_strtoul(argv[1], NULL, 16);
size = simple_strtoul(argv[2], NULL, 16);
flush_cache(addr, size);
break;
- default:
return cmd_usage(cmdtp);
- }
- return 0;
+}
static int parse_argv(const char *s) { if (strcmp(s, "flush") == 0) @@ -120,3 +143,10 @@ U_BOOT_CMD( "[on, off, flush]\n" " - enable, disable, or flush data (writethrough) cache" );
+U_BOOT_CMD(
- flush_cache, 3, 0, do_flush_cache,
- "flush cache for a range",
- "<addr> <size>\n"
- " - flush cache for specificed range"
+);
What's the point of this command exactly? I can see the point of range flushes (and invalidates) in the code for drivers that use DMA, but as a shell command, I fail to see the interest of it.
Amicalement,

On 03/19/2013 03:01 PM, Albert ARIBAUD wrote:
What's the point of this command exactly? I can see the point of range flushes (and invalidates) in the code for drivers that use DMA, but as a shell command, I fail to see the interest of it.
I am actually expecting this question. We have a situation that an application is copied by u-boot to its destination in memory. The code needs to be "seen" in the main memory. Without flushing cache, it is only visible to cores.
York

On 03/19/2013 05:07:33 PM, York Sun wrote:
On 03/19/2013 03:01 PM, Albert ARIBAUD wrote:
What's the point of this command exactly? I can see the point of
range
flushes (and invalidates) in the code for drivers that use DMA, but
as a
shell command, I fail to see the interest of it.
I am actually expecting this question. We have a situation that an application is copied by u-boot to its destination in memory. The code needs to be "seen" in the main memory. Without flushing cache, it is only visible to cores.
It's the same purpose as the cache flushing that happens in bootm, except for code loading that happens outside bootm.
-Scott

Hi Scott,
On Tue, 19 Mar 2013 18:32:39 -0500, Scott Wood scottwood@freescale.com wrote:
On 03/19/2013 05:07:33 PM, York Sun wrote:
On 03/19/2013 03:01 PM, Albert ARIBAUD wrote:
What's the point of this command exactly? I can see the point of
range
flushes (and invalidates) in the code for drivers that use DMA, but
as a
shell command, I fail to see the interest of it.
I am actually expecting this question. We have a situation that an application is copied by u-boot to its destination in memory. The code needs to be "seen" in the main memory. Without flushing cache, it is only visible to cores.
It's the same purpose as the cache flushing that happens in bootm, except for code loading that happens outside bootm.
-Scott
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through bootm and the like which already properly flush cache.
Also, AFAIK U-Boot on multiple cores runs on a single core (possibly started from a smaller adjunct core) and will load and execute its payload on that same single core; the payload might enable and run additional cores if it so decides, but I don't know that U-Boot would start another main core. Still, I may have missed something.
Anyway:
Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not flushing cache? If no existing target needs this, then this patch is useless. If there exists such a target and issue, then the right fix is not a shell command, it is a programmatic flush before the other core is enabled, so that it always sees correct RAM.
Maybe this is some code that might come in handy for some future target not in U-Boot yes which will have the first core start a payload on another core? Then the previous argument applies (the fix should not be a shell command, it should be in source code), plus, the patch is dead code until and unless said target is also added in a single series.
Amicalement,

Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through bootm and the like which already properly flush cache.
Full agrement.
Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not flushing cache? If no existing target needs this, then this patch is useless. If there exists such a target and issue, then the right fix is not a shell command, it is a programmatic flush before the other core is enabled, so that it always sees correct RAM.
Agreed again. As is, the patch was only adding dead code, as there are no users of the feature.
<nitpick> Also, it was added unconditionally which is a strict no-no as it just adds code-bloat to everyone, without benefit. </nitpick>
Maybe this is some code that might come in handy for some future target not in U-Boot yes which will have the first core start a payload on another core? Then the previous argument applies (the fix should not be a shell command, it should be in source code), plus, the patch is dead code until and unless said target is also added in a single series.
Agreed! Thanks.
Best regards,
Wolfgang Denk

On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through
bootm and
the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
Full agrement.
Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not
flushing
cache? If no existing target needs this, then this patch is
useless. If
there exists such a target and issue, then the right fix is not a
shell
command, it is a programmatic flush before the other core is
enabled,
so that it always sees correct RAM.
Agreed again. As is, the patch was only adding dead code, as there are no users of the feature.
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...
<nitpick> Also, it was added unconditionally which is a strict no-no as it just adds code-bloat to everyone, without benefit. </nitpick>
Only for boards which select CONFIG_CMD_CACHE... not sure how fine-grained it makes sense to make it.
-Scott

Hi Scott,
On Wed, 20 Mar 2013 11:43:15 -0500, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through bootm and the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
Hence my "and the like".
Full agrement.
Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not
flushing
cache? If no existing target needs this, then this patch is
useless. If
there exists such a target and issue, then the right fix is not a
shell
command, it is a programmatic flush before the other core is
enabled,
so that it always sees correct RAM.
Agreed again. As is, the patch was only adding dead code, as there are no users of the feature.
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...
But as I stated, you can, and should, include at least one board that will need this command. So far, apparently, no board needs it.
<nitpick> Also, it was added unconditionally which is a strict no-no as it just adds code-bloat to everyone, without benefit. </nitpick>
Only for boards which select CONFIG_CMD_CACHE... not sure how fine-grained it makes sense to make it.
As a command, and useful at most to a few boards, it should have its own option so that only the board(s) that will need it will have the command list entry and associated code.
-Scott
Amicalement,

On 03/20/2013 12:38:13 PM, Albert ARIBAUD wrote:
Hi Scott,
On Wed, 20 Mar 2013 11:43:15 -0500, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it
should be
done, since precisely payload control transfer happens through bootm and the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the
"go"
command.
Hence my "and the like".
But a cache flush does not happen with "go", and cannot (unless you flush the entirety of all relevant caches, which is not always easy and requires more platform-specific code) because it only knows the entry point, not the start/end of the image.
Agreed again. As is, the patch was only adding dead code, as
there
are no users of the feature.
It's a user command! How can it be dead code? I don't know of a
way
to include a human user in a patchset...
But as I stated, you can, and should, include at least one board that will need this command. So far, apparently, no board needs it.
Where does this appearance come from? It's relevant on any PPC target where icache and dcache are not mutually coherent, which is most of them. The odds of getting bitten by the lack of flushing may differ among different targets, but it exists
What specifically would you be looking to be added? Again, it's a user command. The user is a human, not a piece of code. Should we rip out every command that isn't referred to by a board's default environment or some other in-tree reference? I could drop a lot of useful NAND commands that way...
-Scott

On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through
bootm and
the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.
And in this case you aren't better served by say bootelf ?
Assuming that no, it's not, we have a real example where we need this. So please make sure to update the README with the new command CONFIG_CMD_CACHE sets.
Full agrement.
Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not
flushing
cache? If no existing target needs this, then this patch is
useless. If
there exists such a target and issue, then the right fix is not
a shell
command, it is a programmatic flush before the other core is
enabled,
so that it always sees correct RAM.
Agreed again. As is, the patch was only adding dead code, as there are no users of the feature.
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...
Can you hightlight what exactly causes the world today to go off and fail? Is the hello_world example app sufficient in this case or do we need something much larger?

On 03/20/2013 02:15:19 PM, Tom Rini wrote:
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it
should be
done, since precisely payload control transfer happens through
bootm and
the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.
It's also an issue with using the "cpu <n> release" command.
And in this case you aren't better served by say bootelf ?
That wouldn't handle the "cpu release" case. In any case, "go" exists and is currently the recommended way to launch a standalone application in doc/README.standalone.
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...
Can you hightlight what exactly causes the world today to go off and fail? Is the hello_world example app sufficient in this case or do we need something much larger?
A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run on all CPUs). They are seeing cache problems running on a T4240 if they don't have this flush. This flush is architecturally required between modifying/loading code and running it.
-Scott

On Wed, Mar 20, 2013 at 02:36:05PM -0500, Scott Wood wrote:
On 03/20/2013 02:15:19 PM, Tom Rini wrote:
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it
should be
done, since precisely payload control transfer happens through
bootm and
the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.
It's also an issue with using the "cpu <n> release" command.
Hadn't seen that command before, where is it?
And in this case you aren't better served by say bootelf ?
That wouldn't handle the "cpu release" case. In any case, "go" exists and is currently the recommended way to launch a standalone application in doc/README.standalone.
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...
Can you hightlight what exactly causes the world today to go off and fail? Is the hello_world example app sufficient in this case or do we need something much larger?
A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run on all CPUs). They are seeing cache problems running on a T4240 if they don't have this flush. This flush is architecturally required between modifying/loading code and running it.
OK, so this does sound like a real need / use for it, and if we added the granularity of CONFIG_CMD_CACHE_FLUSH or similar, it would be reasonable to turn it on to a large number of boards for a small space savings (so lets not). My next concern is that this needs build testing (and some inspection) on say ARM where we have a weak flush_cache already. But perhaps the right answer is to say it doesn't make sense to add CONFIG_CMD_CACHE on an architecture which doesn't already provide flush_cache, so drop the weak one from this patch.

On 03/20/2013 02:59:19 PM, Tom Rini wrote:
On Wed, Mar 20, 2013 at 02:36:05PM -0500, Scott Wood wrote:
On 03/20/2013 02:15:19 PM, Tom Rini wrote:
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it
should be
done, since precisely payload control transfer happens through
bootm and
the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is
why
bootm does a cache flush, we need some way to flush in this case.
It's also an issue with using the "cpu <n> release" command.
Hadn't seen that command before, where is it?
common/cmd_mp.c
And in this case you aren't better served by say bootelf ?
That wouldn't handle the "cpu release" case. In any case, "go" exists and is currently the recommended way to launch a standalone application in doc/README.standalone.
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...
Can you hightlight what exactly causes the world today to go off
and
fail? Is the hello_world example app sufficient in this case or
do we
need something much larger?
A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run on all CPUs). They are seeing cache problems running on a T4240 if they don't have this flush. This flush is architecturally required between modifying/loading code and running it.
OK, so this does sound like a real need / use for it, and if we added the granularity of CONFIG_CMD_CACHE_FLUSH or similar, it would be reasonable to turn it on to a large number of boards for a small space savings (so lets not). My next concern is that this needs build testing (and some inspection) on say ARM where we have a weak flush_cache already. But perhaps the right answer is to say it doesn't make sense to add CONFIG_CMD_CACHE on an architecture which doesn't already provide flush_cache, so drop the weak one from this patch.
flush_cache() is already called from generic code, so we should be ok just dropping the weak function.
-Scott

Dear Scott Wood,
In message 1363815061.25034.18@snotra you wrote:
Hadn't seen that command before, where is it?
common/cmd_mp.c
This depends on CONFIG_MP, which is undocumented. Can you please add / have added documentation for this config option? Thanks.
Best regards,
Wolfgang Denk

Dear Tom Rini,
In message 20130320195919.GR25919@bill-the-cat you wrote:
OK, so this does sound like a real need / use for it, and if we added the granularity of CONFIG_CMD_CACHE_FLUSH or similar, it would be reasonable to turn it on to a large number of boards for a small space savings (so lets not). My next concern is that this needs build testing
Not really. Only a tiny fraction of users will ever run any standalone applications, so please let's save the memory footprint for the overwhelming majority of users who do not need that.
Best regards,
Wolfgang Denk

On Thu, Mar 21, 2013 at 06:39:31AM +0100, Wolfgang Denk wrote:
Dear Tom Rini,
In message 20130320195919.GR25919@bill-the-cat you wrote:
OK, so this does sound like a real need / use for it, and if we added the granularity of CONFIG_CMD_CACHE_FLUSH or similar, it would be reasonable to turn it on to a large number of boards for a small space savings (so lets not). My next concern is that this needs build testing
Not really. Only a tiny fraction of users will ever run any standalone applications, so please let's save the memory footprint for the overwhelming majority of users who do not need that.
Well, can we run into this problem on ARM (v7 or v8) systems as well?

Dear Tom,
In message 20130321122923.GB26945@bill-the-cat you wrote:
Not really. Only a tiny fraction of users will ever run any standalone applications, so please let's save the memory footprint for the overwhelming majority of users who do not need that.
Well, can we run into this problem on ARM (v7 or v8) systems as well?
Probably.
But I wonder what the exact usage szenario is that will trigger the problem. If I understand correctly, this can only happen when you perform a (manual) memory copy (either between different locations in RAM, or from parallel NOR flash to RAM) of the code you are going to run.
As far as I understand all other ways to load any such code (over the network or from storage devices) already make sure to run flush_cache() after any such load operation.
Scott: is my understanding correct that you only need this because you are performing such memory copy ops manually? From where / to where are you copying, and why?
Thinking about alternatives:
- eventually we should discourage the use of "go"; it may be conveniend when you know what you are doing, but if it's casuing such problems we might be better off recommending to use proper IH_TYPE_STANDALONE legacy images in combination with the bootm command instead.
- Also, instead of adding a new command, this could probably be scripted; I guess this should be roughly equivalent?
setenv flush_cache 'dc off;ic off;dc on;ic on'
??
Best regards,
Wolfgang Denk

On 03/21/2013 08:37:32 AM, Wolfgang Denk wrote:
Dear Tom,
In message 20130321122923.GB26945@bill-the-cat you wrote:
Not really. Only a tiny fraction of users will ever run any
standalone
applications, so please let's save the memory footprint for the overwhelming majority of users who do not need that.
Well, can we run into this problem on ARM (v7 or v8) systems as
well?
Probably.
But I wonder what the exact usage szenario is that will trigger the problem. If I understand correctly, this can only happen when you perform a (manual) memory copy (either between different locations in RAM, or from parallel NOR flash to RAM) of the code you are going to run.
Yes, the person who is seeing this problem is copying code from flash to RAM.
As far as I understand all other ways to load any such code (over the network or from storage devices) already make sure to run flush_cache() after any such load operation.
A lot of places seem to have it, but it seems not everywhere. I'm not aware of such a flush for NAND reads.
We could instead try to make sure everything, including commands like "cp" and "mm", flush cache when they're done. This approach is more user friendly, though it has a higher risk of missing some code path.
Scott: is my understanding correct that you only need this because you are performing such memory copy ops manually? From where / to where are you copying, and why?
As above it's from flash (I assume NOR) to RAM. The "why" is to be able to run the code from RAM. :-P
Thinking about alternatives:
- eventually we should discourage the use of "go"; it may be conveniend when you know what you are doing, but if it's casuing such problems we might be better off recommending to use proper IH_TYPE_STANDALONE legacy images in combination with the bootm command instead.
That or bootelf, sure. I think there still should be some way to do it manually, though. bootm/bootelf probably wouldn't work for this use case because it involves releasing other cores after the image is copied/flushed, but before u-boot gives up control on the boot core.
Also, instead of adding a new command, this could probably be scripted; I guess this should be roughly equivalent?
setenv flush_cache 'dc off;ic off;dc on;ic on'
This assumes that we support those cache operations, that they affect all relevant caches (on 85xx it only flushes L1, but at least on newer chips L2 is relevant as well), and that there are no errata or architectural limitations to running with caches off.
-Scott

Dear Scott,
In message 1363890157.31522.14@snotra you wrote:
As above it's from flash (I assume NOR) to RAM. The "why" is to be =20 able to run the code from RAM. :-P
* Why don't you run it form flash?
* Why do you insist on using the "go" command (instead of "bootm" with a IH_TYPE_STANDALONE image?
- eventually we should discourage the use of "go"; it may be conveniend when you know what you are doing, but if it's casuing such problems we might be better off recommending to use proper IH_TYPE_STANDALONE legacy images in combination with the bootm command instead.
That or bootelf, sure. I think there still should be some way to do it manually, though. bootm/bootelf probably wouldn't work for this use case because it involves releasing other cores after the image is copied/flushed, but before u-boot gives up control on the boot core.
The "releasing other cores" would then be a sub-command in the bootm sequence?
Also, instead of adding a new command, this could probably be scripted; I guess this should be roughly equivalent?
setenv flush_cache 'dc off;ic off;dc on;ic on'
This assumes that we support those cache operations, that they affect all relevant caches (on 85xx it only flushes L1, but at least on newer chips L2 is relevant as well), and that there are no errata or architectural limitations to running with caches off.
Indeed.
Best regards,
Wolfgang Denk

On 03/21/2013 02:25:10 PM, Wolfgang Denk wrote:
Dear Scott,
In message 1363890157.31522.14@snotra you wrote:
As above it's from flash (I assume NOR) to RAM. The "why" is to be
=20
able to run the code from RAM. :-P
- Why don't you run it form flash?
Presumably because it's slow. As indicated elsewhere in the thread, these are performance tests.
- Why do you insist on using the "go" command (instead of "bootm" with a IH_TYPE_STANDALONE image?
Presumably because of the need to release other cores first.
- eventually we should discourage the use of "go"; it may be conveniend when you know what you are doing, but if it's casuing such problems we might be better off recommending to use proper IH_TYPE_STANDALONE legacy images in combination with the bootm command instead.
That or bootelf, sure. I think there still should be some way to
do it
manually, though. bootm/bootelf probably wouldn't work for this use case because it involves releasing other cores after the image is copied/flushed, but before u-boot gives up control on the boot core.
The "releasing other cores" would then be a sub-command in the bootm sequence?
Perhaps it could be, or the application could be altered to release secondary cores through the spin table. I don't think that excuses a situation where some ways of putting a blob of bytes into RAM flush the cache (to the extent the architecture requires it for the blob of bytes to be executable) and others don't, and there's no way to do it manually. Would you remove the "go" command entirely? I think that would be a mistake.
-Scott

Hi Scott,
Perhaps it could be, or the application could be altered to release secondary cores through the spin table. I don't think that excuses a situation where some ways of putting a blob of bytes into RAM flush the cache (to the extent the architecture requires it for the blob of bytes to be executable) and others don't, and there's no way to do it manually.
AFAIU there is.
Would you remove the "go" command entirely? I think that would be a mistake.
I do not see why you are talking about removing the "go" command. In the 'worst' scenario (from an effort perspective), it would have to be do a flush and possibly cache disable before branching to the payload; in the 'best' scenario, it needs not be modified at all.
-Scott
Amicalement,

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 03/22/2013 02:30 AM, Albert ARIBAUD wrote:
Hi Scott,
Perhaps it could be, or the application could be altered to release secondary cores through the spin table. I don't think that excuses a situation where some ways of putting a blob of bytes into RAM flush the cache (to the extent the architecture requires it for the blob of bytes to be executable) and others don't, and there's no way to do it manually.
AFAIU there is.
Would you remove the "go" command entirely? I think that would be a mistake.
I do not see why you are talking about removing the "go" command. In the 'worst' scenario (from an effort perspective), it would have to be do a flush and possibly cache disable before branching to the payload; in the 'best' scenario, it needs not be modified at all.
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).
- -- Tom

Dear Tom,
In message 514C4BE8.10508@ti.com you wrote:
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).
I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.
Best regards,
Wolfgang Denk

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 03/22/2013 10:03 AM, Wolfgang Denk wrote:
Dear Tom,
In message 514C4BE8.10508@ti.com you wrote:
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).
I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.
But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.
Now, if we want to change things and say that no, you can't just run totally raw binaries reliably with "go" but instead need to throw some form of header on top of them, how portable, really, is mkimage? We've just made that a required part of the work-flow for anyone doing development that's not producing ELF or something else already boot*'able. That might be a rather large pool I suspect.
Scott, part of the problem here is that we have multiple cores, yes? Say core0 is the one that read things in from NOR to DDR, core1 is the one that will be running things. How about we make flush_cache depend on CONFIG_MP || CONFIG_CMD_CACHE_FLUSH ? It's a likely required often thing for CONFIG_MP systems and anyone else that needs it can opt-in.
- -- Tom

Hi Tom,
On Fri, 22 Mar 2013 10:29:04 -0400, Tom Rini trini@ti.com wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 03/22/2013 10:03 AM, Wolfgang Denk wrote:
Dear Tom,
In message 514C4BE8.10508@ti.com you wrote:
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).
I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.
But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.
I am not sure I understand how exactly go is broken, or how the bootm proposal from Wolfgang circles back to a brokennness of 'go'. I am not even sure of the exact scenario. How many cores are we dealing with here, and what does each core do in sequence?
Scott, part of the problem here is that we have multiple cores, yes? Say core0 is the one that read things in from NOR to DDR, core1 is the one that will be running things.
That's one possible scenario but I would like more detail before we look for a solution.
Amicalement,

On 03/22/2013 09:29:04 AM, Tom Rini wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 03/22/2013 10:03 AM, Wolfgang Denk wrote:
Dear Tom,
In message 514C4BE8.10508@ti.com you wrote:
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).
I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.
But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.
Now, if we want to change things and say that no, you can't just run totally raw binaries reliably with "go" but instead need to throw some form of header on top of them, how portable, really, is mkimage? We've just made that a required part of the work-flow for anyone doing development that's not producing ELF or something else already boot*'able. That might be a rather large pool I suspect.
Scott, part of the problem here is that we have multiple cores, yes? Say core0 is the one that read things in from NOR to DDR, core1 is the one that will be running things. How about we make flush_cache depend on CONFIG_MP || CONFIG_CMD_CACHE_FLUSH ? It's a likely required often thing for CONFIG_MP systems and anyone else that needs it can opt-in.
Multiple CPUs may make it more likely to see problems on some hardware, but architecturally on PPC the flush is required even with a single CPU. Icache fetches won't snoop the dcache.
Can we have it depend on a new config symbol, so it's not bloating anyone's U-Boot who doesn't want it, and deal with improvements to recommended standalone app workflow as a separate issue? We're not talking about a huge amount of code here, just exposing functionality that U-Boot already has internally.
-Scott

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 03/22/2013 12:48 PM, Scott Wood wrote:
On 03/22/2013 09:29:04 AM, Tom Rini wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 03/22/2013 10:03 AM, Wolfgang Denk wrote:
Dear Tom,
In message 514C4BE8.10508@ti.com you wrote:
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).
I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.
But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.
Now, if we want to change things and say that no, you can't just run totally raw binaries reliably with "go" but instead need to throw some form of header on top of them, how portable, really, is mkimage? We've just made that a required part of the work-flow for anyone doing development that's not producing ELF or something else already boot*'able. That might be a rather large pool I suspect.
Scott, part of the problem here is that we have multiple cores, yes? Say core0 is the one that read things in from NOR to DDR, core1 is the one that will be running things. How about we make flush_cache depend on CONFIG_MP || CONFIG_CMD_CACHE_FLUSH ? It's a likely required often thing for CONFIG_MP systems and anyone else that needs it can opt-in.
Multiple CPUs may make it more likely to see problems on some hardware, but architecturally on PPC the flush is required even with a single CPU. Icache fetches won't snoop the dcache.
Can we have it depend on a new config symbol, so it's not bloating anyone's U-Boot who doesn't want it, and deal with improvements to recommended standalone app workflow as a separate issue? We're not talking about a huge amount of code here, just exposing functionality that U-Boot already has internally.
Works for me.
- -- Tom

Dear Tom,
In message 514C6AB0.7090808@ti.com you wrote:
I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.
But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.
I'm not sosure about this It once was, yes, when IH_TYPE_STANDALONE did not exist yet. Later this changed.
Of course "go" has it's right to exist. For example, I use it regularly when I want to crash a system by jumping to a non-existent or unaligned address ;-)
But today, "bootm" for a IH_TYPE_STANDALONE image ppears to be a better way to start a standalone app.
Now, if we want to change things and say that no, you can't just run totally raw binaries reliably with "go" but instead need to throw some form of header on top of them, how portable, really, is mkimage?
I think it would be wrong here tobe so absolute - this or that, and nothing inbetween. The thing is, for well over a decase "go" has been working fine, and problems like this here hever never been much of an issue. And being able just to jump anywhere _is_ a necessary function, IMO.
How portable is mkimage? Well, pretty much, it seems - probably more than your options are to build a bare metal standalone image to run on the target hardware. Why exactly are you asking this?
We've just made that a required part of the work-flow for anyone doing development that's not producing ELF or something else already boot*'able. That might be a rather large pool I suspect.
How many of the total number of U-Boot users are actively using standalone apps?
And how many of these are actually experiencing the problems we're discussing here?
I'd be willing to bet a few beer that it's only a _tiny_ fraction.
Of course we should provide working solutions for each and every use case, but here I fear we're designing for a lot of overkill.
Best regards,
Wolfgang Denk

Hi Scott,
On Wed, 20 Mar 2013 14:36:05 -0500, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 02:15:19 PM, Tom Rini wrote:
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it
should be
done, since precisely payload control transfer happens through
bootm and
the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.
It's also an issue with using the "cpu <n> release" command.
And in this case you aren't better served by say bootelf ?
That wouldn't handle the "cpu release" case. In any case, "go" exists and is currently the recommended way to launch a standalone application in doc/README.standalone.
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...
Can you hightlight what exactly causes the world today to go off and fail? Is the hello_world example app sufficient in this case or do we need something much larger?
A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run on all CPUs). They are seeing cache problems running on a T4240 if they don't have this flush. This flush is architecturally required between modifying/loading code and running it.
Still, why make it a shell command? Since this user needs a flush with "go" and "cpu release", then we should add a programmatic global cache flush in the "go" and "cpu release" commands.
-Scott
Amicalement,

On 03/20/2013 05:11:57 PM, Albert ARIBAUD wrote:
Hi Scott,
On Wed, 20 Mar 2013 14:36:05 -0500, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 02:15:19 PM, Tom Rini wrote:
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it
should be
done, since precisely payload control transfer happens
through
bootm and
the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is
why
bootm does a cache flush, we need some way to flush in this case.
It's also an issue with using the "cpu <n> release" command.
And in this case you aren't better served by say bootelf ?
That wouldn't handle the "cpu release" case. In any case, "go"
exists
and is currently the recommended way to launch a standalone
application
in doc/README.standalone.
It's a user command! How can it be dead code? I don't know of
a
way to include a human user in a patchset...
Can you hightlight what exactly causes the world today to go off
and
fail? Is the hello_world example app sufficient in this case or
do we
need something much larger?
A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run
on
all CPUs). They are seeing cache problems running on a T4240 if
they
don't have this flush. This flush is architecturally required
between
modifying/loading code and running it.
Still, why make it a shell command? Since this user needs a flush with "go" and "cpu release", then we should add a programmatic global cache flush in the "go" and "cpu release" commands.
Why add any new commands? They could all be subcommands of bootm! :-)
Really, instead of adding one command, you want to modify *two* commands to do the same thing separately, which involves changing the syntax of both commands to accept memory range information?
-Scott

On Mar 20, 2013, at 6:35 PM, Scott Wood scottwood@freescale.com wrote:
Really, instead of adding one command, you want to modify *two* commands to do the same thing separately, which involves changing the syntax of both commands to accept memory range information?
What is the purpose of limiting the memory range to be flushed? Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?
If you drive the operation as a "walk the cache" process rather than a "iterate over all SDRAM physical address space to cherry pick within a range" it wouldn't seem that slow. I mean, there's only so much cache memory.
Maybe I'm not well versed enough on the various architectures to understand how a cache walk is really done. So while the question really falls to someone who does know, I wanted to make sure it was at least asked.
-Mike

On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:
On Mar 20, 2013, at 6:35 PM, Scott Wood scottwood@freescale.com wrote:
Really, instead of adding one command, you want to modify *two*
commands to do the same thing separately, which involves changing the syntax of both commands to accept memory range information?
What is the purpose of limiting the memory range to be flushed? Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?
Because it would take a while to flush all of RAM?
If you drive the operation as a "walk the cache" process rather than a "iterate over all SDRAM physical address space to cherry pick within a range" it wouldn't seem that slow. I mean, there's only so much cache memory.
There's no way to "walk the cache" on many chips, including ours. There are generally ways to flush the entire thing, but they are (more) hardware-specific than using the architected method of flushing a memory range, and sometimes these methods have errata associated with them.
-Scott

On Mar 20, 2013, at 7:48 PM, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:
What is the purpose of limiting the memory range to be flushed? Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?
Because it would take a while to flush all of RAM?
"Flushing all of RAM" is what trips me up. Fundamentally, that puts the cart in front of the horse. The goal isn't to flush all of RAM but rather to flush all of cache. Iterating over the small thing rather than the large would seem reasonably efficient.
But as you say, if there are architectures where that can't be done and you must pass GBs of physical address space (rather than KB of cache space) through some process then range limiting it does make sense.
I'm not trying to argue a point. I just want all angles to be considered.
Thanks for the enlightenment!
-Mike

On 03/20/2013 07:27:29 PM, Michael Cashwell wrote:
On Mar 20, 2013, at 7:48 PM, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:
What is the purpose of limiting the memory range to be flushed? Is
there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?
Because it would take a while to flush all of RAM?
"Flushing all of RAM" is what trips me up. Fundamentally, that puts the cart in front of the horse. The goal isn't to flush all of RAM but rather to flush all of cache.
Right, I was just responding to your question of, "What is the purpose of limiting the memory range to be flushed?"
Iterating over the small thing rather than the large would seem reasonably efficient.
But as you say, if there are architectures where that can't be done and you must pass GBs of physical address space (rather than KB of cache space) through some process then range limiting it does make sense.
Right. The range specified is a minimum to be flushed -- if a particular architecture finds it easier/quicker to flush everything instead, that's fine.
-Scott

On Thursday 21 March 2013 06:01 AM, Scott Wood wrote:
On 03/20/2013 07:27:29 PM, Michael Cashwell wrote:
On Mar 20, 2013, at 7:48 PM, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:
What is the purpose of limiting the memory range to be flushed? Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?
Because it would take a while to flush all of RAM?
"Flushing all of RAM" is what trips me up. Fundamentally, that puts the cart in front of the horse. The goal isn't to flush all of RAM but rather to flush all of cache.
Right, I was just responding to your question of, "What is the purpose of limiting the memory range to be flushed?"
Iterating over the small thing rather than the large would seem reasonably efficient.
But as you say, if there are architectures where that can't be done and you must pass GBs of physical address space (rather than KB of cache space) through some process then range limiting it does make sense.
Right. The range specified is a minimum to be flushed -- if a particular architecture finds it easier/quicker to flush everything instead, that's fine.
So in your case, how do you find out the addresses of buffers to be flushed from command ? Just thinking how this can be used generically ?
Regards, Sricharan

On 03/21/2013 12:02:23 AM, Sricharan R wrote:
On Thursday 21 March 2013 06:01 AM, Scott Wood wrote:
On 03/20/2013 07:27:29 PM, Michael Cashwell wrote:
On Mar 20, 2013, at 7:48 PM, Scott Wood scottwood@freescale.com
wrote:
On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:
What is the purpose of limiting the memory range to be flushed?
Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?
Because it would take a while to flush all of RAM?
"Flushing all of RAM" is what trips me up. Fundamentally, that
puts the cart in front of the horse. The goal isn't to flush all of RAM but rather to flush all of cache.
Right, I was just responding to your question of, "What is the
purpose of limiting the memory range to be flushed?"
Iterating over the small thing rather than the large would seem
reasonably efficient.
But as you say, if there are architectures where that can't be
done and you must pass GBs of physical address space (rather than KB of cache space) through some process then range limiting it does make sense.
Right. The range specified is a minimum to be flushed -- if a
particular architecture finds it easier/quicker to flush everything instead, that's fine.
So in your case, how do you find out the addresses of buffers to be flushed from command ? Just thinking how this can be used generically ?
I'm not sure what you're asking. How does the user know what address range to pass to the command? It's whatever covers the program they want to run.
-Scott

Hi Scott,
On Wed, 20 Mar 2013 17:35:51 -0500, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 05:11:57 PM, Albert ARIBAUD wrote:
Hi Scott,
On Wed, 20 Mar 2013 14:36:05 -0500, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 02:15:19 PM, Tom Rini wrote:
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote: > > I do understand what it does, but I still don't get why it
should be
> done, since precisely payload control transfer happens
through
bootm and > the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is
why
bootm does a cache flush, we need some way to flush in this case.
It's also an issue with using the "cpu <n> release" command.
And in this case you aren't better served by say bootelf ?
That wouldn't handle the "cpu release" case. In any case, "go"
exists
and is currently the recommended way to launch a standalone
application
in doc/README.standalone.
It's a user command! How can it be dead code? I don't know of
a
way to include a human user in a patchset...
Can you hightlight what exactly causes the world today to go off
and
fail? Is the hello_world example app sufficient in this case or
do we
need something much larger?
A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run
on
all CPUs). They are seeing cache problems running on a T4240 if
they
don't have this flush. This flush is architecturally required
between
modifying/loading code and running it.
Still, why make it a shell command? Since this user needs a flush with "go" and "cpu release", then we should add a programmatic global cache flush in the "go" and "cpu release" commands.
Why add any new commands? They could all be subcommands of bootm! :-)
I did not say "subcommand", I said "programmatic" -- precisely, I don't like the idea that a specific command or command form is needed to avoid a situation that can be avoided automatically.
Really, instead of adding one command, you want to modify *two* commands to do the same thing separately, which involves changing the syntax of both commands to accept memory range information?
There is no need to change any syntax.
-Scott
Amicalement,

On 03/21/2013 12:58:37 PM, Albert ARIBAUD wrote:
Hi Scott,
On Wed, 20 Mar 2013 17:35:51 -0500, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 05:11:57 PM, Albert ARIBAUD wrote:
Hi Scott,
On Wed, 20 Mar 2013 14:36:05 -0500, Scott Wood scottwood@freescale.com wrote:
On 03/20/2013 02:15:19 PM, Tom Rini wrote:
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote: >Dear Albert, > >In message 20130320145927.2031b913@lilith you wrote: >> >> I do understand what it does, but I still don't get why
it
should be
>> done, since precisely payload control transfer happens
through
>bootm and >> the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps
use the
"go" command.
So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence,
which is
why
bootm does a cache flush, we need some way to flush in this
case.
It's also an issue with using the "cpu <n> release" command.
And in this case you aren't better served by say bootelf ?
That wouldn't handle the "cpu release" case. In any case, "go"
exists
and is currently the recommended way to launch a standalone
application
in doc/README.standalone.
It's a user command! How can it be dead code? I don't
know of
a
way to include a human user in a patchset...
Can you hightlight what exactly causes the world today to go
off
and
fail? Is the hello_world example app sufficient in this case
or
do we
need something much larger?
A user inside Freescale is running standalone performance test
apps,
using both "go" and "cpu <n> release" (since the test needs to
run
on
all CPUs). They are seeing cache problems running on a T4240 if
they
don't have this flush. This flush is architecturally required
between
modifying/loading code and running it.
Still, why make it a shell command? Since this user needs a flush
with
"go" and "cpu release", then we should add a programmatic global
cache
flush in the "go" and "cpu release" commands.
Why add any new commands? They could all be subcommands of bootm!
:-)
I did not say "subcommand", I said "programmatic" -- precisely, I don't like the idea that a specific command or command form is needed to avoid a situation that can be avoided automatically.
bootm subcommands are programmatic in normal use cases.
Really, instead of adding one command, you want to modify *two* commands to do the same thing separately, which involves changing
the
syntax of both commands to accept memory range information?
There is no need to change any syntax.
Then how would we know what range to flush?
-Scott

Dear Scott Wood,
In message 1363889275.31522.12@snotra you wrote:
There is no need to change any syntax.
Then how would we know what range to flush?
Use a IH_TYPE_STANDALONE U-Boot image, and bootm?
Best regards,
Wolfgang Denk

On 03/20/2013 12:15 PM, Tom Rini wrote:
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:
Dear Albert,
In message 20130320145927.2031b913@lilith you wrote:
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through
bootm and
the like which already properly flush cache.
It doesn't always happen through bootm. Standalone apps use the "go" command.
So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.
And in this case you aren't better served by say bootelf ?
We have a user's case to release secondary cores to run user's application which is in main memory. The secondary cores don't necessarily share the cache with the core u-boot is running, depends on hardware implementation. Without flushing the cache, those cores don't have the correct code to fetch.
York

Dear York Sun,
In message 1363724992-9803-1-git-send-email-yorksun@freescale.com you wrote:
When we need the copied code/data in the main memory, we can flush the cache now. It uses the existing function flush_cache. Syntax is
flush_cache <addr> <size>
The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.
Why would it be necessary to ever run this manually?
I feel you are just papering over bugs / shortcomings elsewhere in the code. Should we not rather fix the places where the cache flushes are missing in the code?
Best regards,
Wolfgang Denk

On 03/20/2013 09:51:03 AM, Wolfgang Denk wrote:
Dear York Sun,
In message 1363724992-9803-1-git-send-email-yorksun@freescale.com you wrote:
When we need the copied code/data in the main memory, we can flush
the
cache now. It uses the existing function flush_cache. Syntax is
flush_cache <addr> <size>
The addr and size are given in hexadecimal. Like memory command,
there is
no sanity check for the parameters.
Why would it be necessary to ever run this manually?
I feel you are just papering over bugs / shortcomings elsewhere in the code. Should we not rather fix the places where the cache flushes are missing in the code?
Possibly -- the alternative would be to have every command that modifies memory flush that region of cache afterward (cp, tftp, etc).
You could say the same thing about the existing cache command, though.
-Scott

I apologize for being so late with this question.
York Sun <yorksun <at> freescale.com> writes:
When we need the copied code/data in the main memory, we can flush the cache now. It uses the existing function flush_cache. Syntax is
flush_cache <addr> <size>
The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.
Are there symptoms for a specific system failure you can provide?
My problem is that when a stand alone application, which is copied from NOR flash to DDR, is started with the "go" command, it sometimes experiences a program check (illegal instruction) after a block of 32 zero bytes "appears" in memory.
I'm using a U-Boot for a custom Freescale P1022-based board, currently based on the old 2010.12 U-Boot as patched by Freescale in their SDK_V1_0_20110429_ltib.iso. Unfortunately, upgrading to a more recent version of U-Boot is not possible at this time, no more recent version is available from Freescale and I don't have the resources to verify all their patches apply correctly to a release directly from DENX.
I've used Codewarrior to observe the block of zeros in memory once the program check has happened, and I've verified that before the stand alone application begins execution, there are no zeros. Figuring it is likely that I have an error in my stand alone application code that corrupts the stack or writes based on an un-initialized pointer, I've tried using the CW watchpoint to catch where the zeros are written, but enabling the watchpoint seems to avoid the problem.
Based on later discussion on this thread, I've tried adding "flush_dcache(); invalidate_icache();" to cmd_boot.c:do_go(), just before control is passed to the stand alone app. Based on my ad hoc testing with this change, I don't get the program check exception.
I believe this result helps make the case that caching behavior is at the root of my problem, but since I was not able to isolate the actual cause of that problem, I can't be sure I've really got the solution.
(By the way, I would not leave the "flush_dcache(); invalidate_icache();" in do_go(), I merely found that for demonstrating a possible solution, this change easier than switching to a stand alone app that starts with bootm, or similar)
Any help or comments are very welcome.
Thanks, Jim
Signed-off-by: York Sun <yorksun <at> freescale.com>
common/cmd_cache.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/common/cmd_cache.c b/common/cmd_cache.c index 5512f92..93b7337 100644 --- a/common/cmd_cache.c +++ b/common/cmd_cache.c @@ -94,6 +94,29 @@ int do_dcache(cmd_tbl_t *cmdtp, int flag, int argc, char *
const argv[])
return 0; }
+void __weak flush_cache(ulong addr, ulong size) +{
- puts("No arch specific flush_cache available!\n");
- /* please define arch specific flush_cache */
+}
+int do_flush_cache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) +{
- ulong addr, size;
- switch (argc) {
- case 3:
addr = simple_strtoul(argv[1], NULL, 16);
size = simple_strtoul(argv[2], NULL, 16);
flush_cache(addr, size);
break;
- default:
return cmd_usage(cmdtp);
- }
- return 0;
+}
static int parse_argv(const char *s) { if (strcmp(s, "flush") == 0) @@ -120,3 +143,10 @@ U_BOOT_CMD( "[on, off, flush]\n" " - enable, disable, or flush data (writethrough) cache" );
+U_BOOT_CMD(
- flush_cache, 3, 0, do_flush_cache,
- "flush cache for a range",
- "<addr> <size>\n"
- " - flush cache for specificed range"
+);

On 04/03/2013 09:02:15 AM, Jim Chargin wrote:
I apologize for being so late with this question.
York Sun <yorksun <at> freescale.com> writes:
When we need the copied code/data in the main memory, we can flush
the
cache now. It uses the existing function flush_cache. Syntax is
flush_cache <addr> <size>
The addr and size are given in hexadecimal. Like memory command,
there is
no sanity check for the parameters.
Are there symptoms for a specific system failure you can provide?
Generally you'd see code behaving in a way that is different than the instructions in memory (as seen by a load instruction) would suggest. Often it's obvious, such as an illegal instruction trap on on opcode that is valid, but it could be arbitrary depending on what the old contents of memory were (especially if there had been valid (but different) code there before).
My problem is that when a stand alone application, which is copied from NOR flash to DDR, is started with the "go" command, it sometimes experiences a program check (illegal instruction) after a block of 32 zero bytes "appears" in memory.
Maybe the old contents of memory contained a dcbz? Or somehow confused a legimate dcbz to target the wrong address.
I'm using a U-Boot for a custom Freescale P1022-based board, currently based on the old 2010.12 U-Boot as patched by Freescale in their SDK_V1_0_20110429_ltib.iso. Unfortunately, upgrading to a more recent version of U-Boot is not possible at this time, no more recent version is available from Freescale and I don't have the resources to verify all their patches apply correctly to a release directly from DENX.
You can find the latest Freescale U-Boot at: http://git.freescale.com/git/cgit.cgi/ppc/sdk/u-boot.git/ git://git.freescale.com/ppc/sdk/u-boot.git
...though there's a good chance that you would not need anything from the SDK if you run the latest upstream U-Boot. You'd still need to apply your own patches for the custom board, of course.
(By the way, I would not leave the "flush_dcache(); invalidate_icache();" in do_go(), I merely found that for demonstrating a possible solution, this change easier than switching to a stand alone app that starts with bootm, or similar)
It seems reasonably likely that this is the cause of your problem; in any case, the flush is architecturally required.
-Scott
participants (8)
-
Albert ARIBAUD
-
Jim Chargin
-
Michael Cashwell
-
Scott Wood
-
Sricharan R
-
Tom Rini
-
Wolfgang Denk
-
York Sun