[U-Boot] [RFC] command/cache: Add flush_cache command

newer
[U-Boot] [PATCH] crc32: Correct...

older
[U-Boot] [PATCH v4] palmtreo680:...

York Sun

19 Mar 2013 19 Mar '13

9:29 p.m.

When we need the copied code/data in the main memory, we can flush the cache now. It uses the existing function flush_cache. Syntax is

flush_cache <addr> <size>

The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.

Signed-off-by: York Sun yorksun@freescale.com --- common/cmd_cache.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)

diff --git a/common/cmd_cache.c b/common/cmd_cache.c index 5512f92..93b7337 100644 --- a/common/cmd_cache.c +++ b/common/cmd_cache.c @@ -94,6 +94,29 @@ int do_dcache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) return 0; }

+void __weak flush_cache(ulong addr, ulong size) +{ + puts("No arch specific flush_cache available!\n"); + /* please define arch specific flush_cache */ +} + +int do_flush_cache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) +{ + ulong addr, size; + + switch (argc) { + case 3: + addr = simple_strtoul(argv[1], NULL, 16); + size = simple_strtoul(argv[2], NULL, 16); + flush_cache(addr, size); + break; + default: + return cmd_usage(cmdtp); + } + return 0; + +} + static int parse_argv(const char *s) { if (strcmp(s, "flush") == 0) @@ -120,3 +143,10 @@ U_BOOT_CMD( "[on, off, flush]\n" " - enable, disable, or flush data (writethrough) cache" ); + +U_BOOT_CMD( + flush_cache, 3, 0, do_flush_cache, + "flush cache for a range", + "<addr> <size>\n" + " - flush cache for specificed range" +);

-- 1.7.9.5

Show replies by date

Albert ARIBAUD

19 Mar 19 Mar

11:01 p.m.

Hi York,

On Tue, 19 Mar 2013 13:29:52 -0700, York Sun yorksun@freescale.com wrote:

...

When we need the copied code/data in the main memory, we can flush the cache now. It uses the existing function flush_cache. Syntax is

flush_cache <addr> <size>

The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.

Signed-off-by: York Sun yorksun@freescale.com

common/cmd_cache.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)

diff --git a/common/cmd_cache.c b/common/cmd_cache.c index 5512f92..93b7337 100644 --- a/common/cmd_cache.c +++ b/common/cmd_cache.c @@ -94,6 +94,29 @@ int do_dcache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) return 0; }

+void __weak flush_cache(ulong addr, ulong size) +{

puts("No arch specific flush_cache available!\n");

/* please define arch specific flush_cache */

+}

+int do_flush_cache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) +{
ulong addr, size;

switch (argc) {

case 3:
addr = simple_strtoul(argv[1], NULL, 16);
size = simple_strtoul(argv[2], NULL, 16);
flush_cache(addr, size);
break;
default:
return cmd_usage(cmdtp);
}

return 0;
+}

static int parse_argv(const char *s) { if (strcmp(s, "flush") == 0) @@ -120,3 +143,10 @@ U_BOOT_CMD( "[on, off, flush]\n" " - enable, disable, or flush data (writethrough) cache" );

+U_BOOT_CMD(

flush_cache, 3, 0, do_flush_cache,

"flush cache for a range",

"<addr> <size>\n"

" - flush cache for specificed range"

+);

What's the point of this command exactly? I can see the point of range flushes (and invalidates) in the code for drivers that use DMA, but as a shell command, I fail to see the interest of it.

Amicalement,

-- Albert.

York Sun

11:07 p.m.

On 03/19/2013 03:01 PM, Albert ARIBAUD wrote:

...

What's the point of this command exactly? I can see the point of range flushes (and invalidates) in the code for drivers that use DMA, but as a shell command, I fail to see the interest of it.

I am actually expecting this question. We have a situation that an application is copied by u-boot to its destination in memory. The code needs to be "seen" in the main memory. Without flushing cache, it is only visible to cores.

York

Scott Wood

20 Mar 20 Mar

12:32 a.m.

On 03/19/2013 05:07:33 PM, York Sun wrote:

...

On 03/19/2013 03:01 PM, Albert ARIBAUD wrote:

...
What's the point of this command exactly? I can see the point of

range

...
flushes (and invalidates) in the code for drivers that use DMA, but

as a

...
shell command, I fail to see the interest of it.

I am actually expecting this question. We have a situation that an application is copied by u-boot to its destination in memory. The code needs to be "seen" in the main memory. Without flushing cache, it is only visible to cores.

It's the same purpose as the cache flushing that happens in bootm, except for code loading that happens outside bootm.

-Scott

Albert ARIBAUD

2:59 p.m.

Hi Scott,

On Tue, 19 Mar 2013 18:32:39 -0500, Scott Wood scottwood@freescale.com wrote:

...

On 03/19/2013 05:07:33 PM, York Sun wrote:

...
On 03/19/2013 03:01 PM, Albert ARIBAUD wrote:

...
What's the point of this command exactly? I can see the point of

range

...
flushes (and invalidates) in the code for drivers that use DMA, but

as a

...
shell command, I fail to see the interest of it.

I am actually expecting this question. We have a situation that an application is copied by u-boot to its destination in memory. The code needs to be "seen" in the main memory. Without flushing cache, it is only visible to cores.

It's the same purpose as the cache flushing that happens in bootm, except for code loading that happens outside bootm.

-Scott

I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through bootm and the like which already properly flush cache.

Also, AFAIK U-Boot on multiple cores runs on a single core (possibly started from a smaller adjunct core) and will load and execute its payload on that same single core; the payload might enable and run additional cores if it so decides, but I don't know that U-Boot would start another main core. Still, I may have missed something.

Anyway:

Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not flushing cache? If no existing target needs this, then this patch is useless. If there exists such a target and issue, then the right fix is not a shell command, it is a programmatic flush before the other core is enabled, so that it always sees correct RAM.

Maybe this is some code that might come in handy for some future target not in U-Boot yes which will have the first core start a payload on another core? Then the previous argument applies (the fix should not be a shell command, it should be in source code), plus, the patch is dead code until and unless said target is also added in a single series.

Amicalement,

-- Albert.

Wolfgang Denk

3:58 p.m.

Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...

I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through bootm and the like which already properly flush cache.

Full agrement.

...

Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not flushing cache? If no existing target needs this, then this patch is useless. If there exists such a target and issue, then the right fix is not a shell command, it is a programmatic flush before the other core is enabled, so that it always sees correct RAM.

Agreed again. As is, the patch was only adding dead code, as there are no users of the feature.

<nitpick> Also, it was added unconditionally which is a strict no-no as it just adds code-bloat to everyone, without benefit. </nitpick>

...

Maybe this is some code that might come in handy for some future target not in U-Boot yes which will have the first core start a payload on another core? Then the previous argument applies (the fix should not be a shell command, it should be in source code), plus, the patch is dead code until and unless said target is also added in a single series.

Agreed! Thanks.

Best regards,

Wolfgang Denk

Scott Wood

5:43 p.m.

On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...

Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through

bootm and

...
the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

...

Full agrement.

...
Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not

flushing

...
cache? If no existing target needs this, then this patch is

useless. If

...
there exists such a target and issue, then the right fix is not a

shell

...
command, it is a programmatic flush before the other core is

enabled,

...
so that it always sees correct RAM.

Agreed again. As is, the patch was only adding dead code, as there are no users of the feature.

It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...

...

<nitpick> Also, it was added unconditionally which is a strict no-no as it just adds code-bloat to everyone, without benefit. </nitpick>

Only for boards which select CONFIG_CMD_CACHE... not sure how fine-grained it makes sense to make it.

-Scott

Albert ARIBAUD

6:38 p.m.

Hi Scott,

On Wed, 20 Mar 2013 11:43:15 -0500, Scott Wood scottwood@freescale.com wrote:

...

On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through bootm and the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

Hence my "and the like".

...

...
Full agrement.

...
Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not

flushing

...
cache? If no existing target needs this, then this patch is

useless. If

...
there exists such a target and issue, then the right fix is not a

shell

...
command, it is a programmatic flush before the other core is

enabled,

...
so that it always sees correct RAM.

Agreed again. As is, the patch was only adding dead code, as there are no users of the feature.

It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...

But as I stated, you can, and should, include at least one board that will need this command. So far, apparently, no board needs it.

...

...
<nitpick> Also, it was added unconditionally which is a strict no-no as it just adds code-bloat to everyone, without benefit. </nitpick>

Only for boards which select CONFIG_CMD_CACHE... not sure how fine-grained it makes sense to make it.

As a command, and useful at most to a few boards, it should have its own option so that only the board(s) that will need it will have the command list entry and associated code.

...

-Scott

Amicalement,

-- Albert.

Scott Wood

7:16 p.m.

On 03/20/2013 12:38:13 PM, Albert ARIBAUD wrote:

...

Hi Scott,

On Wed, 20 Mar 2013 11:43:15 -0500, Scott Wood scottwood@freescale.com wrote:

...
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it

should be

...
...
...
done, since precisely payload control transfer happens through bootm and the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the

"go"

...
command.

Hence my "and the like".

But a cache flush does not happen with "go", and cannot (unless you flush the entirety of all relevant caches, which is not always easy and requires more platform-specific code) because it only knows the entry point, not the start/end of the image.

...

...
...
Agreed again. As is, the patch was only adding dead code, as

there

...
...
are no users of the feature.

It's a user command! How can it be dead code? I don't know of a

way

...
to include a human user in a patchset...

But as I stated, you can, and should, include at least one board that will need this command. So far, apparently, no board needs it.

Where does this appearance come from? It's relevant on any PPC target where icache and dcache are not mutually coherent, which is most of them. The odds of getting bitten by the lack of flushing may differ among different targets, but it exists

What specifically would you be looking to be added? Again, it's a user command. The user is a human, not a piece of code. Should we rip out every command that isn't referred to by a board's default environment or some other in-tree reference? I could drop a lot of useful NAND commands that way...

-Scott

Tom Rini

8:15 p.m.

On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:

...

On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through

bootm and

...
the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.

And in this case you aren't better served by say bootelf ?

Assuming that no, it's not, we have a real example where we need this. So please make sure to update the README with the new command CONFIG_CMD_CACHE sets.

...

...
Full agrement.

...
Is there an ARM multi-core target in U-Boot where U-Boot runs on one core but its payload shall be started on another, "un-booted", core, and which experiences issues due to the first core not

flushing

...
cache? If no existing target needs this, then this patch is

useless. If

...
there exists such a target and issue, then the right fix is not

a shell

...
command, it is a programmatic flush before the other core is

enabled,

...
so that it always sees correct RAM.

Agreed again. As is, the patch was only adding dead code, as there are no users of the feature.

It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...

Can you hightlight what exactly causes the world today to go off and fail? Is the hello_world example app sufficient in this case or do we need something much larger?

-- Tom

Scott Wood

8:36 p.m.

On 03/20/2013 02:15:19 PM, Tom Rini wrote:

...

On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:

...
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it

should be

...
...
...
done, since precisely payload control transfer happens through

bootm and

...
the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.

It's also an issue with using the "cpu <n> release" command.

...

And in this case you aren't better served by say bootelf ?

That wouldn't handle the "cpu release" case. In any case, "go" exists and is currently the recommended way to launch a standalone application in doc/README.standalone.

...

...
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...

Can you hightlight what exactly causes the world today to go off and fail? Is the hello_world example app sufficient in this case or do we need something much larger?

A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run on all CPUs). They are seeing cache problems running on a T4240 if they don't have this flush. This flush is architecturally required between modifying/loading code and running it.

-Scott

Tom Rini

8:59 p.m.

On Wed, Mar 20, 2013 at 02:36:05PM -0500, Scott Wood wrote:

...

On 03/20/2013 02:15:19 PM, Tom Rini wrote:

...
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:

...
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it

should be

...
...
...
done, since precisely payload control transfer happens through

bootm and

...
the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.

It's also an issue with using the "cpu <n> release" command.

Hadn't seen that command before, where is it?

...

...
And in this case you aren't better served by say bootelf ?

That wouldn't handle the "cpu release" case. In any case, "go" exists and is currently the recommended way to launch a standalone application in doc/README.standalone.

...
...
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...

Can you hightlight what exactly causes the world today to go off and fail? Is the hello_world example app sufficient in this case or do we need something much larger?

A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run on all CPUs). They are seeing cache problems running on a T4240 if they don't have this flush. This flush is architecturally required between modifying/loading code and running it.

OK, so this does sound like a real need / use for it, and if we added the granularity of CONFIG_CMD_CACHE_FLUSH or similar, it would be reasonable to turn it on to a large number of boards for a small space savings (so lets not). My next concern is that this needs build testing (and some inspection) on say ARM where we have a weak flush_cache already. But perhaps the right answer is to say it doesn't make sense to add CONFIG_CMD_CACHE on an architecture which doesn't already provide flush_cache, so drop the weak one from this patch.

-- Tom

Scott Wood

10:31 p.m.

On 03/20/2013 02:59:19 PM, Tom Rini wrote:

...

On Wed, Mar 20, 2013 at 02:36:05PM -0500, Scott Wood wrote:

...
On 03/20/2013 02:15:19 PM, Tom Rini wrote:

...
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:

...
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it

should be

...
...
...
done, since precisely payload control transfer happens through

bootm and

...
the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is

why

...
...
bootm does a cache flush, we need some way to flush in this case.

It's also an issue with using the "cpu <n> release" command.

Hadn't seen that command before, where is it?

common/cmd_mp.c

...

...
...
And in this case you aren't better served by say bootelf ?

That wouldn't handle the "cpu release" case. In any case, "go" exists and is currently the recommended way to launch a standalone application in doc/README.standalone.

...
...
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...

Can you hightlight what exactly causes the world today to go off

and

...
...
fail? Is the hello_world example app sufficient in this case or

do we

...
...
need something much larger?

A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run on all CPUs). They are seeing cache problems running on a T4240 if they don't have this flush. This flush is architecturally required between modifying/loading code and running it.

OK, so this does sound like a real need / use for it, and if we added the granularity of CONFIG_CMD_CACHE_FLUSH or similar, it would be reasonable to turn it on to a large number of boards for a small space savings (so lets not). My next concern is that this needs build testing (and some inspection) on say ARM where we have a weak flush_cache already. But perhaps the right answer is to say it doesn't make sense to add CONFIG_CMD_CACHE on an architecture which doesn't already provide flush_cache, so drop the weak one from this patch.

flush_cache() is already called from generic code, so we should be ok just dropping the weak function.

-Scott

Wolfgang Denk

21 Mar 21 Mar

6:42 a.m.

Dear Scott Wood,

In message 1363815061.25034.18@snotra you wrote:

...

...
Hadn't seen that command before, where is it?

common/cmd_mp.c

This depends on CONFIG_MP, which is undocumented. Can you please add / have added documentation for this config option? Thanks.

Best regards,

Wolfgang Denk

-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de Good morning. This is the telephone company. Due to repairs, we're giving you advance notice that your service will be cut off indefi- nitely at ten o'clock. That's two minutes from now.

Wolfgang Denk

6:39 a.m.

Dear Tom Rini,

In message 20130320195919.GR25919@bill-the-cat you wrote:

...

OK, so this does sound like a real need / use for it, and if we added the granularity of CONFIG_CMD_CACHE_FLUSH or similar, it would be reasonable to turn it on to a large number of boards for a small space savings (so lets not). My next concern is that this needs build testing

Not really. Only a tiny fraction of users will ever run any standalone applications, so please let's save the memory footprint for the overwhelming majority of users who do not need that.

Best regards,

Wolfgang Denk

Tom Rini

1:29 p.m.

On Thu, Mar 21, 2013 at 06:39:31AM +0100, Wolfgang Denk wrote:

...

Dear Tom Rini,

In message 20130320195919.GR25919@bill-the-cat you wrote:

...
OK, so this does sound like a real need / use for it, and if we added the granularity of CONFIG_CMD_CACHE_FLUSH or similar, it would be reasonable to turn it on to a large number of boards for a small space savings (so lets not). My next concern is that this needs build testing

Not really. Only a tiny fraction of users will ever run any standalone applications, so please let's save the memory footprint for the overwhelming majority of users who do not need that.

Well, can we run into this problem on ARM (v7 or v8) systems as well?

-- Tom

Wolfgang Denk

2:37 p.m.

Dear Tom,

In message 20130321122923.GB26945@bill-the-cat you wrote:

...

...
Not really. Only a tiny fraction of users will ever run any standalone applications, so please let's save the memory footprint for the overwhelming majority of users who do not need that.

Well, can we run into this problem on ARM (v7 or v8) systems as well?

Probably.

But I wonder what the exact usage szenario is that will trigger the problem. If I understand correctly, this can only happen when you perform a (manual) memory copy (either between different locations in RAM, or from parallel NOR flash to RAM) of the code you are going to run.

As far as I understand all other ways to load any such code (over the network or from storage devices) already make sure to run flush_cache() after any such load operation.

Scott: is my understanding correct that you only need this because you are performing such memory copy ops manually? From where / to where are you copying, and why?

Thinking about alternatives:

- eventually we should discourage the use of "go"; it may be conveniend when you know what you are doing, but if it's casuing such problems we might be better off recommending to use proper IH_TYPE_STANDALONE legacy images in combination with the bootm command instead.

- Also, instead of adding a new command, this could probably be scripted; I guess this should be roughly equivalent?

setenv flush_cache 'dc off;ic off;dc on;ic on'

Best regards,

Wolfgang Denk

Scott Wood

7:22 p.m.

On 03/21/2013 08:37:32 AM, Wolfgang Denk wrote:

...

Dear Tom,

In message 20130321122923.GB26945@bill-the-cat you wrote:

...
...
Not really. Only a tiny fraction of users will ever run any

standalone

...
...
applications, so please let's save the memory footprint for the overwhelming majority of users who do not need that.

Well, can we run into this problem on ARM (v7 or v8) systems as

well?

Probably.

But I wonder what the exact usage szenario is that will trigger the problem. If I understand correctly, this can only happen when you perform a (manual) memory copy (either between different locations in RAM, or from parallel NOR flash to RAM) of the code you are going to run.

Yes, the person who is seeing this problem is copying code from flash to RAM.

...

As far as I understand all other ways to load any such code (over the network or from storage devices) already make sure to run flush_cache() after any such load operation.

A lot of places seem to have it, but it seems not everywhere. I'm not aware of such a flush for NAND reads.

We could instead try to make sure everything, including commands like "cp" and "mm", flush cache when they're done. This approach is more user friendly, though it has a higher risk of missing some code path.

...

Scott: is my understanding correct that you only need this because you are performing such memory copy ops manually? From where / to where are you copying, and why?

As above it's from flash (I assume NOR) to RAM. The "why" is to be able to run the code from RAM. :-P

...

Thinking about alternatives:

eventually we should discourage the use of "go"; it may be conveniend when you know what you are doing, but if it's casuing such problems we might be better off recommending to use proper IH_TYPE_STANDALONE legacy images in combination with the bootm command instead.

That or bootelf, sure. I think there still should be some way to do it manually, though. bootm/bootelf probably wouldn't work for this use case because it involves releasing other cores after the image is copied/flushed, but before u-boot gives up control on the boot core.

...

Also, instead of adding a new command, this could probably be scripted; I guess this should be roughly equivalent?

setenv flush_cache 'dc off;ic off;dc on;ic on'

This assumes that we support those cache operations, that they affect all relevant caches (on 85xx it only flushes L1, but at least on newer chips L2 is relevant as well), and that there are no errata or architectural limitations to running with caches off.

-Scott

Wolfgang Denk

8:25 p.m.

Dear Scott,

In message 1363890157.31522.14@snotra you wrote:

...

As above it's from flash (I assume NOR) to RAM. The "why" is to be =20 able to run the code from RAM. :-P

* Why don't you run it form flash?

* Why do you insist on using the "go" command (instead of "bootm" with a IH_TYPE_STANDALONE image?

...

...

eventually we should discourage the use of "go"; it may be conveniend when you know what you are doing, but if it's casuing such problems we might be better off recommending to use proper IH_TYPE_STANDALONE legacy images in combination with the bootm command instead.

That or bootelf, sure. I think there still should be some way to do it manually, though. bootm/bootelf probably wouldn't work for this use case because it involves releasing other cores after the image is copied/flushed, but before u-boot gives up control on the boot core.

The "releasing other cores" would then be a sub-command in the bootm sequence?

...

...

Also, instead of adding a new command, this could probably be scripted; I guess this should be roughly equivalent?

setenv flush_cache 'dc off;ic off;dc on;ic on'

This assumes that we support those cache operations, that they affect all relevant caches (on 85xx it only flushes L1, but at least on newer chips L2 is relevant as well), and that there are no errata or architectural limitations to running with caches off.

Indeed.

Best regards,

Wolfgang Denk

Scott Wood

9:34 p.m.

On 03/21/2013 02:25:10 PM, Wolfgang Denk wrote:

...

Dear Scott,

In message 1363890157.31522.14@snotra you wrote:

...
As above it's from flash (I assume NOR) to RAM. The "why" is to be

=20

...
able to run the code from RAM. :-P

Why don't you run it form flash?

Presumably because it's slow. As indicated elsewhere in the thread, these are performance tests.

...

Why do you insist on using the "go" command (instead of "bootm" with a IH_TYPE_STANDALONE image?

Presumably because of the need to release other cores first.

...

...
...

eventually we should discourage the use of "go"; it may be conveniend when you know what you are doing, but if it's casuing such problems we might be better off recommending to use proper IH_TYPE_STANDALONE legacy images in combination with the bootm command instead.

That or bootelf, sure. I think there still should be some way to

do it

...
manually, though. bootm/bootelf probably wouldn't work for this use case because it involves releasing other cores after the image is copied/flushed, but before u-boot gives up control on the boot core.

The "releasing other cores" would then be a sub-command in the bootm sequence?

Perhaps it could be, or the application could be altered to release secondary cores through the spin table. I don't think that excuses a situation where some ways of putting a blob of bytes into RAM flush the cache (to the extent the architecture requires it for the blob of bytes to be executable) and others don't, and there's no way to do it manually. Would you remove the "go" command entirely? I think that would be a mistake.

-Scott

Albert ARIBAUD

22 Mar 22 Mar

7:30 a.m.

Hi Scott,

...

Perhaps it could be, or the application could be altered to release secondary cores through the spin table. I don't think that excuses a situation where some ways of putting a blob of bytes into RAM flush the cache (to the extent the architecture requires it for the blob of bytes to be executable) and others don't, and there's no way to do it manually.

AFAIU there is.

...

Would you remove the "go" command entirely? I think that would be a mistake.

I do not see why you are talking about removing the "go" command. In the 'worst' scenario (from an effort perspective), it would have to be do a flush and possibly cache disable before branching to the payload; in the 'best' scenario, it needs not be modified at all.

...

-Scott

Amicalement,

-- Albert.

Tom Rini

1:17 p.m.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 03/22/2013 02:30 AM, Albert ARIBAUD wrote:

...

Hi Scott,

...
Perhaps it could be, or the application could be altered to release secondary cores through the spin table. I don't think that excuses a situation where some ways of putting a blob of bytes into RAM flush the cache (to the extent the architecture requires it for the blob of bytes to be executable) and others don't, and there's no way to do it manually.

AFAIU there is.

...
Would you remove the "go" command entirely? I think that would be a mistake.

I do not see why you are talking about removing the "go" command. In the 'worst' scenario (from an effort perspective), it would have to be do a flush and possibly cache disable before branching to the payload; in the 'best' scenario, it needs not be modified at all.

It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).

- -- Tom

...PGP SIGNATURE...

-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQIcBAEBAgAGBQJRTEvnAAoJENk4IS6UOR1WHZEP/3YU+LOnwSP9TRarsg2gtH2I jfgb31qqQ1RZjF0rYawxfJdYyhJPYVkyHIlsLHBUq7onB2Y5VGwO0PkbDoImroBt Fy8V/th1oW9NzmvrcDe9fUiisGQzvH4glgtBhKDp+a1XGuXkyLUE+C6lkJzzZ5kp XzSqHKSER3yOC1nUhZMyYAgy0u5sFQreH+smTz7Tt+3kOtSMj73dbSm5w9dfifMS Med82eVUWJGS+1ZjAe0NNGHZHFmIHd+o7hQkrkpzH8H+uOBOZQkgD9xD5fiat2Nc 2rMNfuvcvHeLAgwNcSOEQwIcr/EHLKb3OyXQop7qw4XLvxTtQ23Dw/HLvkRFYWMg A82SUd9GVrBrVd9WtTpNxbDGZ1xttKOHP0zB9BhazpFP9rLCVpWvIPWMExooAboe dYWBNmosiHcL0J2dfE3ep5tz+/FWkbHYPUCZ/MOs9avd7PEzT8IGp0OU7mpPYXRQ 2RuLhvc8SIW7PlKV8NL69/FrliYxNsxML3+eLMl0IjlW4lGJdSbL0HaIw7Lsftb5 8dEVrfmE2ftzLs2LbjoC12PHdGuYNsQoM/Kvwbvq8s8Bl8wzhXzcSOCf1YKLrXRG 7l9oBYX5oEiBdYAfEq7De1bODyravoJ6R1ledaz5CrRuQptChDhl4z3aZaKU0Mos WK5aeCrPTJYSOVSNCdUJ =TtQR -----END PGP SIGNATURE-----

Wolfgang Denk

3:03 p.m.

Dear Tom,

In message 514C4BE8.10508@ti.com you wrote:

...

It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).

I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.

Best regards,

Wolfgang Denk

-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de A positive attitude may not solve all your problems, but it will annoy enough people to make it worth the effort. - Herm Albright

Tom Rini

3:29 p.m.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 03/22/2013 10:03 AM, Wolfgang Denk wrote:

...

Dear Tom,

In message 514C4BE8.10508@ti.com you wrote:

...
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).

I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.

But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.

Now, if we want to change things and say that no, you can't just run totally raw binaries reliably with "go" but instead need to throw some form of header on top of them, how portable, really, is mkimage? We've just made that a required part of the work-flow for anyone doing development that's not producing ELF or something else already boot*'able. That might be a rather large pool I suspect.

Scott, part of the problem here is that we have multiple cores, yes? Say core0 is the one that read things in from NOR to DDR, core1 is the one that will be running things. How about we make flush_cache depend on CONFIG_MP || CONFIG_CMD_CACHE_FLUSH ? It's a likely required often thing for CONFIG_MP systems and anyone else that needs it can opt-in.

- -- Tom

...PGP SIGNATURE...

-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQIcBAEBAgAGBQJRTGqwAAoJENk4IS6UOR1WaLsP/0exKAgPt0pOskKewNCNP2C5 RlaZ55mG+ueWtYCXDhosNTD8H5AgPmOzA+zblUPOzUpX4+EASoaAoq4LNNnaKGJf a9lPS+rsnSoefrM4qCzcDx6Iq8AIY6Vaj9I9VaaDtCDgT9wMILNpkHobalujC7hl PoCW0niIJZl05mRI6YaMP32zrtHxnZAJHRGPIX1/f7PMAg7WwoJXZc14snMaaFGT Bi2JXFswT07Vjttmk14qnaYt1zApHY2vmTn+kAnncokl9vu7Go5pcwOmrnNCONcY qeHkYSed+y+q/5fczxFYMncxbeBNC1l3RpkOX5j05Sa9DEylyNxPlQAfFZ7klt2s Fy+5C4s08q/vNFsL/b/Ic7xl80bfTWIu0BAY3OFKBJl7OnKyIdEtri7LCYEYrkZI 0UvDu6TZVc37WG5CcrqM0DcRr7YCZ6Ak6N3qMYwcWKKarW9T9dhC/pyQ9m591Abr NTjsRyRdDYhFNs3a/25SNazvqDlI1cF/BwUGUsLdZJ9tdG9mS5mLNU7okP+KtP7K cfJGb0Ln+AB1TPZRXFuEM4gEJIYhoDbOmJenjEKWDxWbV2cLAqX/Lk3dBTO+RnMS XHYBS2cSG46gkgXf3rqvhd4md93HoYRD6h2MBzxLyKhyjOicHk5zfeFMYmYrZCEO /uyGAvA+hkIyJXmVzNbb =Hn61 -----END PGP SIGNATURE-----

Albert ARIBAUD

4:57 p.m.

Hi Tom,

On Fri, 22 Mar 2013 10:29:04 -0400, Tom Rini trini@ti.com wrote:

...

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 03/22/2013 10:03 AM, Wolfgang Denk wrote:

...
Dear Tom,

In message 514C4BE8.10508@ti.com you wrote:

...
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).

I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.

But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.

I am not sure I understand how exactly go is broken, or how the bootm proposal from Wolfgang circles back to a brokennness of 'go'. I am not even sure of the exact scenario. How many cores are we dealing with here, and what does each core do in sequence?

...

Scott, part of the problem here is that we have multiple cores, yes? Say core0 is the one that read things in from NOR to DDR, core1 is the one that will be running things.

That's one possible scenario but I would like more detail before we look for a solution.

Amicalement,

-- Albert.

Scott Wood

5:48 p.m.

On 03/22/2013 09:29:04 AM, Tom Rini wrote:

...

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 03/22/2013 10:03 AM, Wolfgang Denk wrote:

...
Dear Tom,

In message 514C4BE8.10508@ti.com you wrote:

...
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).

I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.

But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.

Now, if we want to change things and say that no, you can't just run totally raw binaries reliably with "go" but instead need to throw some form of header on top of them, how portable, really, is mkimage? We've just made that a required part of the work-flow for anyone doing development that's not producing ELF or something else already boot*'able. That might be a rather large pool I suspect.

Scott, part of the problem here is that we have multiple cores, yes? Say core0 is the one that read things in from NOR to DDR, core1 is the one that will be running things. How about we make flush_cache depend on CONFIG_MP || CONFIG_CMD_CACHE_FLUSH ? It's a likely required often thing for CONFIG_MP systems and anyone else that needs it can opt-in.

Multiple CPUs may make it more likely to see problems on some hardware, but architecturally on PPC the flush is required even with a single CPU. Icache fetches won't snoop the dcache.

Can we have it depend on a new config symbol, so it's not bloating anyone's U-Boot who doesn't want it, and deal with improvements to recommended standalone app workflow as a separate issue? We're not talking about a huge amount of code here, just exposing functionality that U-Boot already has internally.

-Scott

Tom Rini

6:19 p.m.

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 03/22/2013 12:48 PM, Scott Wood wrote:

...

On 03/22/2013 09:29:04 AM, Tom Rini wrote:

...
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 03/22/2013 10:03 AM, Wolfgang Denk wrote:

...
Dear Tom,

In message 514C4BE8.10508@ti.com you wrote:

...
It seems like we're going around and around with one point not being addressed. When using 'go', how do you know the size to flush? And since Scott is talking about performance testing apps, the cache should not be disabled (unless we expect all standalone apps to enable the cache, in which case we need to provide something in the jump table to make that easy and document this change).

I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.

But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.

Now, if we want to change things and say that no, you can't just run totally raw binaries reliably with "go" but instead need to throw some form of header on top of them, how portable, really, is mkimage? We've just made that a required part of the work-flow for anyone doing development that's not producing ELF or something else already boot*'able. That might be a rather large pool I suspect.

Scott, part of the problem here is that we have multiple cores, yes? Say core0 is the one that read things in from NOR to DDR, core1 is the one that will be running things. How about we make flush_cache depend on CONFIG_MP || CONFIG_CMD_CACHE_FLUSH ? It's a likely required often thing for CONFIG_MP systems and anyone else that needs it can opt-in.

Multiple CPUs may make it more likely to see problems on some hardware, but architecturally on PPC the flush is required even with a single CPU. Icache fetches won't snoop the dcache.

Can we have it depend on a new config symbol, so it's not bloating anyone's U-Boot who doesn't want it, and deal with improvements to recommended standalone app workflow as a separate issue? We're not talking about a huge amount of code here, just exposing functionality that U-Boot already has internally.

Works for me.

- -- Tom

...PGP SIGNATURE...

-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQIcBAEBAgAGBQJRTJK8AAoJENk4IS6UOR1W27sP/0pS2QGy0qzfgKrRyuIDbgRG harO37ddiKNy5NTKFP8kXEFz1mIFgYk8Lu6SE30hu5kDje4t4bWK9S2F1NPcU8tr r5zM+ekR8y3q29P4Upk98YPMEkgV6L0FmnyQOkcQi3Kdg6Lx+S5GHikUQ6uvPJlh EBir2XQEZWxVWrL5KTdJV3TZ34NfLPh/gkXSpQTlvfFgGhsbskypt4DOgxz5Vl9d khOGMk9nX8o3iMaO/gf9F1d5wIoCAt4rzrBaPc/lu/KCMfC6JVmsJP21BCEOVols NgykmYxm+E8M5/vItCBzWrbGgkmeUfHbnhxGZXYHFPgPkPlz4s8CJZgKblwk4RkN BVEgfAnJblgPut5vV4Mz+x1D9Z0EQ12ZC3yTJlO2BssfbBK+oVr3r6e2S4HLp08o VV0wEF5E6fdc2M0gUUJFIMsX/vxjz7fB2y7dYKZmYk08IZvmL5Ddr31E0t/o6IvG Kp4lymSEEe20IdNuUxp0km4evild4NrcefCNsB/41F5poIdoJ9XudIK9aSFVHAah IB0nNiszLSgO4fUz3i7XUo8cW/jUzlS2vZe4RBXo8BqYR9GLo9wAhICFibIFVP4i JK7WA01Vjaqax6zn66L2ngdd/moA/jKalIMchxJehy2v4/Manvh1tno4OQJQ5BKJ cfCkfeurwOY6BgCef7mz =OaP1 -----END PGP SIGNATURE-----

Wolfgang Denk

9:39 p.m.

Dear Tom,

In message 514C6AB0.7090808@ti.com you wrote:

...

...
I also wonder about this. To me it appears much easier to use a IH_TYPE_STANDALONE image, which 1) provides the needed size information and 2) can be used with bootm, so the required additional steps (flush caches, release CPU) can be handled in bootm subcommands.

But that then circles us back to Scott's other point of "go" is broken then and it is the recommended way to start standalone applications.

I'm not sosure about this It once was, yes, when IH_TYPE_STANDALONE did not exist yet. Later this changed.

Of course "go" has it's right to exist. For example, I use it regularly when I want to crash a system by jumping to a non-existent or unaligned address ;-)

But today, "bootm" for a IH_TYPE_STANDALONE image ppears to be a better way to start a standalone app.

...

Now, if we want to change things and say that no, you can't just run totally raw binaries reliably with "go" but instead need to throw some form of header on top of them, how portable, really, is mkimage?

I think it would be wrong here tobe so absolute - this or that, and nothing inbetween. The thing is, for well over a decase "go" has been working fine, and problems like this here hever never been much of an issue. And being able just to jump anywhere _is_ a necessary function, IMO.

How portable is mkimage? Well, pretty much, it seems - probably more than your options are to build a bare metal standalone image to run on the target hardware. Why exactly are you asking this?

...

We've just made that a required part of the work-flow for anyone doing development that's not producing ELF or something else already boot*'able. That might be a rather large pool I suspect.

How many of the total number of U-Boot users are actively using standalone apps?

And how many of these are actually experiencing the problems we're discussing here?

I'd be willing to bet a few beer that it's only a _tiny_ fraction.

Of course we should provide working solutions for each and every use case, but here I fear we're designing for a lot of overkill.

Best regards,

Wolfgang Denk

-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de IMPORTANT NOTICE TO PURCHASERS: The Entire Physical Universe, Inclu- ding This Product, May One Day Collapse Back into an Infinitesimally Small Space. Should Another Universe Subsequently Re-emerge, the Existence of This Product in That Universe Cannot Be Guaranteed.

Albert ARIBAUD

20 Mar 20 Mar

11:11 p.m.

Hi Scott,

On Wed, 20 Mar 2013 14:36:05 -0500, Scott Wood scottwood@freescale.com wrote:

...

On 03/20/2013 02:15:19 PM, Tom Rini wrote:

...
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:

...
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it

should be

...
...
...
done, since precisely payload control transfer happens through

bootm and

...
the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.

It's also an issue with using the "cpu <n> release" command.

...
And in this case you aren't better served by say bootelf ?

That wouldn't handle the "cpu release" case. In any case, "go" exists and is currently the recommended way to launch a standalone application in doc/README.standalone.

...
...
It's a user command! How can it be dead code? I don't know of a way to include a human user in a patchset...

Can you hightlight what exactly causes the world today to go off and fail? Is the hello_world example app sufficient in this case or do we need something much larger?

A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run on all CPUs). They are seeing cache problems running on a T4240 if they don't have this flush. This flush is architecturally required between modifying/loading code and running it.

Still, why make it a shell command? Since this user needs a flush with "go" and "cpu release", then we should add a programmatic global cache flush in the "go" and "cpu release" commands.

...

-Scott

Amicalement,

-- Albert.

Scott Wood

11:35 p.m.

On 03/20/2013 05:11:57 PM, Albert ARIBAUD wrote:

...

Hi Scott,

On Wed, 20 Mar 2013 14:36:05 -0500, Scott Wood scottwood@freescale.com wrote:

...
On 03/20/2013 02:15:19 PM, Tom Rini wrote:

...
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:

...
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it

should be

...
...
...
done, since precisely payload control transfer happens

through

...
...
...
...
bootm and

...
the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is

why

...
...
bootm does a cache flush, we need some way to flush in this case.

It's also an issue with using the "cpu <n> release" command.

...
And in this case you aren't better served by say bootelf ?

That wouldn't handle the "cpu release" case. In any case, "go"

exists

...
and is currently the recommended way to launch a standalone

application

...
in doc/README.standalone.

...
...
It's a user command! How can it be dead code? I don't know of

a

...
...
...
way to include a human user in a patchset...

Can you hightlight what exactly causes the world today to go off

and

...
...
fail? Is the hello_world example app sufficient in this case or

do we

...
...
need something much larger?

A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run

on

...
all CPUs). They are seeing cache problems running on a T4240 if

they

...
don't have this flush. This flush is architecturally required

between

...
modifying/loading code and running it.

Still, why make it a shell command? Since this user needs a flush with "go" and "cpu release", then we should add a programmatic global cache flush in the "go" and "cpu release" commands.

Why add any new commands? They could all be subcommands of bootm! :-)

Really, instead of adding one command, you want to modify *two* commands to do the same thing separately, which involves changing the syntax of both commands to accept memory range information?

-Scott

Michael Cashwell

21 Mar 21 Mar

12:33 a.m.

On Mar 20, 2013, at 6:35 PM, Scott Wood scottwood@freescale.com wrote:

...

Really, instead of adding one command, you want to modify *two* commands to do the same thing separately, which involves changing the syntax of both commands to accept memory range information?

What is the purpose of limiting the memory range to be flushed? Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?

If you drive the operation as a "walk the cache" process rather than a "iterate over all SDRAM physical address space to cherry pick within a range" it wouldn't seem that slow. I mean, there's only so much cache memory.

Maybe I'm not well versed enough on the various architectures to understand how a cache walk is really done. So while the question really falls to someone who does know, I wanted to make sure it was at least asked.

-Mike

Scott Wood

12:48 a.m.

On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:

...

On Mar 20, 2013, at 6:35 PM, Scott Wood scottwood@freescale.com wrote:

...
Really, instead of adding one command, you want to modify *two*

commands to do the same thing separately, which involves changing the syntax of both commands to accept memory range information?

What is the purpose of limiting the memory range to be flushed? Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?

Because it would take a while to flush all of RAM?

...

If you drive the operation as a "walk the cache" process rather than a "iterate over all SDRAM physical address space to cherry pick within a range" it wouldn't seem that slow. I mean, there's only so much cache memory.

There's no way to "walk the cache" on many chips, including ours. There are generally ways to flush the entire thing, but they are (more) hardware-specific than using the architected method of flushing a memory range, and sometimes these methods have errata associated with them.

-Scott

Michael Cashwell

1:27 a.m.

On Mar 20, 2013, at 7:48 PM, Scott Wood scottwood@freescale.com wrote:

...

On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:

...
What is the purpose of limiting the memory range to be flushed? Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?

Because it would take a while to flush all of RAM?

"Flushing all of RAM" is what trips me up. Fundamentally, that puts the cart in front of the horse. The goal isn't to flush all of RAM but rather to flush all of cache. Iterating over the small thing rather than the large would seem reasonably efficient.

But as you say, if there are architectures where that can't be done and you must pass GBs of physical address space (rather than KB of cache space) through some process then range limiting it does make sense.

I'm not trying to argue a point. I just want all angles to be considered.

Thanks for the enlightenment!

-Mike

Scott Wood

1:31 a.m.

On 03/20/2013 07:27:29 PM, Michael Cashwell wrote:

...

On Mar 20, 2013, at 7:48 PM, Scott Wood scottwood@freescale.com wrote:

...
On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:

...
What is the purpose of limiting the memory range to be flushed? Is

there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?

...
Because it would take a while to flush all of RAM?

"Flushing all of RAM" is what trips me up. Fundamentally, that puts the cart in front of the horse. The goal isn't to flush all of RAM but rather to flush all of cache.

Right, I was just responding to your question of, "What is the purpose of limiting the memory range to be flushed?"

...

Iterating over the small thing rather than the large would seem reasonably efficient.

But as you say, if there are architectures where that can't be done and you must pass GBs of physical address space (rather than KB of cache space) through some process then range limiting it does make sense.

Right. The range specified is a minimum to be flushed -- if a particular architecture finds it easier/quicker to flush everything instead, that's fine.

-Scott

Sricharan R

6:02 a.m.

On Thursday 21 March 2013 06:01 AM, Scott Wood wrote:

...

On 03/20/2013 07:27:29 PM, Michael Cashwell wrote:

...
On Mar 20, 2013, at 7:48 PM, Scott Wood scottwood@freescale.com wrote:

...
On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:

...
What is the purpose of limiting the memory range to be flushed? Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?

Because it would take a while to flush all of RAM?

"Flushing all of RAM" is what trips me up. Fundamentally, that puts the cart in front of the horse. The goal isn't to flush all of RAM but rather to flush all of cache.

Right, I was just responding to your question of, "What is the purpose of limiting the memory range to be flushed?"

...
Iterating over the small thing rather than the large would seem reasonably efficient.

But as you say, if there are architectures where that can't be done and you must pass GBs of physical address space (rather than KB of cache space) through some process then range limiting it does make sense.

Right. The range specified is a minimum to be flushed -- if a particular architecture finds it easier/quicker to flush everything instead, that's fine.

So in your case, how do you find out the addresses of buffers to be flushed from command ? Just thinking how this can be used generically ?

Regards, Sricharan

Scott Wood

7:51 p.m.

On 03/21/2013 12:02:23 AM, Sricharan R wrote:

...

On Thursday 21 March 2013 06:01 AM, Scott Wood wrote:

...
On 03/20/2013 07:27:29 PM, Michael Cashwell wrote:

...
On Mar 20, 2013, at 7:48 PM, Scott Wood scottwood@freescale.com

wrote:

...
...
...
On 03/20/2013 06:33:41 PM, Michael Cashwell wrote:

...
What is the purpose of limiting the memory range to be flushed?

Is there a reason one might want to NOT flush certain data sitting in a dirty cache line out to memory before doing a go or boot command?

...
...
...
Because it would take a while to flush all of RAM?

"Flushing all of RAM" is what trips me up. Fundamentally, that

puts the cart in front of the horse. The goal isn't to flush all of RAM but rather to flush all of cache.

...
Right, I was just responding to your question of, "What is the

purpose of limiting the memory range to be flushed?"

...
...
Iterating over the small thing rather than the large would seem

reasonably efficient.

...
...
But as you say, if there are architectures where that can't be

done and you must pass GBs of physical address space (rather than KB of cache space) through some process then range limiting it does make sense.

...
Right. The range specified is a minimum to be flushed -- if a

particular architecture finds it easier/quicker to flush everything instead, that's fine.

...
So in your case, how do you find out the addresses of buffers to be flushed from command ? Just thinking how this can be used generically ?

I'm not sure what you're asking. How does the user know what address range to pass to the command? It's whatever covers the program they want to run.

-Scott

Albert ARIBAUD

6:58 p.m.

Hi Scott,

On Wed, 20 Mar 2013 17:35:51 -0500, Scott Wood scottwood@freescale.com wrote:

...

On 03/20/2013 05:11:57 PM, Albert ARIBAUD wrote:

...
Hi Scott,

On Wed, 20 Mar 2013 14:36:05 -0500, Scott Wood scottwood@freescale.com wrote:

...
On 03/20/2013 02:15:19 PM, Tom Rini wrote:

...
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:

...
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote: > > I do understand what it does, but I still don't get why it

should be

...
...
> done, since precisely payload control transfer happens

through

...
...
...
...
bootm and > the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is

why

...
...
bootm does a cache flush, we need some way to flush in this case.

It's also an issue with using the "cpu <n> release" command.

...
And in this case you aren't better served by say bootelf ?

That wouldn't handle the "cpu release" case. In any case, "go"

exists

...
and is currently the recommended way to launch a standalone

application

...
in doc/README.standalone.

...
...
It's a user command! How can it be dead code? I don't know of

a

...
...
...
way to include a human user in a patchset...

Can you hightlight what exactly causes the world today to go off

and

...
...
fail? Is the hello_world example app sufficient in this case or

do we

...
...
need something much larger?

A user inside Freescale is running standalone performance test apps, using both "go" and "cpu <n> release" (since the test needs to run

on

...
all CPUs). They are seeing cache problems running on a T4240 if

they

...
don't have this flush. This flush is architecturally required

between

...
modifying/loading code and running it.

Still, why make it a shell command? Since this user needs a flush with "go" and "cpu release", then we should add a programmatic global cache flush in the "go" and "cpu release" commands.

Why add any new commands? They could all be subcommands of bootm! :-)

I did not say "subcommand", I said "programmatic" -- precisely, I don't like the idea that a specific command or command form is needed to avoid a situation that can be avoided automatically.

...

Really, instead of adding one command, you want to modify *two* commands to do the same thing separately, which involves changing the syntax of both commands to accept memory range information?

There is no need to change any syntax.

...

-Scott

Amicalement,

-- Albert.

Scott Wood

7:07 p.m.

On 03/21/2013 12:58:37 PM, Albert ARIBAUD wrote:

...

Hi Scott,

On Wed, 20 Mar 2013 17:35:51 -0500, Scott Wood scottwood@freescale.com wrote:

...
On 03/20/2013 05:11:57 PM, Albert ARIBAUD wrote:

...
Hi Scott,

On Wed, 20 Mar 2013 14:36:05 -0500, Scott Wood scottwood@freescale.com wrote:

...
On 03/20/2013 02:15:19 PM, Tom Rini wrote:

...
On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:

...
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote: >Dear Albert, > >In message 20130320145927.2031b913@lilith you wrote: >> >> I do understand what it does, but I still don't get why

it

...
...
...
...
should be

...
>> done, since precisely payload control transfer happens

through

...
...
...
>bootm and >> the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps

use the

...
...
...
...
...
"go" command.

So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence,

which is

...
...
why

...
...
bootm does a cache flush, we need some way to flush in this

case.

...
...
...
It's also an issue with using the "cpu <n> release" command.

...
And in this case you aren't better served by say bootelf ?

That wouldn't handle the "cpu release" case. In any case, "go"

exists

...
and is currently the recommended way to launch a standalone

application

...
in doc/README.standalone.

...
...
It's a user command! How can it be dead code? I don't

know of

...
...
a

...
...
...
way to include a human user in a patchset...

Can you hightlight what exactly causes the world today to go

off

...
...
and

...
...
fail? Is the hello_world example app sufficient in this case

or

...
...
do we

...
...
need something much larger?

A user inside Freescale is running standalone performance test

apps,

...
...
...
using both "go" and "cpu <n> release" (since the test needs to

run

...
...
on

...
all CPUs). They are seeing cache problems running on a T4240 if

they

...
don't have this flush. This flush is architecturally required

between

...
modifying/loading code and running it.

Still, why make it a shell command? Since this user needs a flush

with

...
...
"go" and "cpu release", then we should add a programmatic global

cache

...
...
flush in the "go" and "cpu release" commands.

Why add any new commands? They could all be subcommands of bootm!

:-)

I did not say "subcommand", I said "programmatic" -- precisely, I don't like the idea that a specific command or command form is needed to avoid a situation that can be avoided automatically.

bootm subcommands are programmatic in normal use cases.

...

...
Really, instead of adding one command, you want to modify *two* commands to do the same thing separately, which involves changing

the

...
syntax of both commands to accept memory range information?

There is no need to change any syntax.

Then how would we know what range to flush?

-Scott

Wolfgang Denk

8:21 p.m.

Dear Scott Wood,

In message 1363889275.31522.12@snotra you wrote:

...

...
There is no need to change any syntax.

Then how would we know what range to flush?

Use a IH_TYPE_STANDALONE U-Boot image, and bootm?

Best regards,

Wolfgang Denk

-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de IMPORTANT NOTICE TO PURCHASERS: The Entire Physical Universe, Inclu- ding This Product, May One Day Collapse Back into an Infinitesimally Small Space. Should Another Universe Subsequently Re-emerge, the Existence of This Product in That Universe Cannot Be Guaranteed.

York Sun

20 Mar 20 Mar

8:40 p.m.

On 03/20/2013 12:15 PM, Tom Rini wrote:

...

On Wed, Mar 20, 2013 at 11:43:15AM -0500, Scott Wood wrote:

...
On 03/20/2013 09:58:36 AM, Wolfgang Denk wrote:

...
Dear Albert,

In message 20130320145927.2031b913@lilith you wrote:

...
I do understand what it does, but I still don't get why it should be done, since precisely payload control transfer happens through

bootm and

...
the like which already properly flush cache.

It doesn't always happen through bootm. Standalone apps use the "go" command.

So, to try and be a bit more verbose about this, for U-Boot applications which use 'go', we still need to ensure cache coherence, which is why bootm does a cache flush, we need some way to flush in this case.

And in this case you aren't better served by say bootelf ?

We have a user's case to release secondary cores to run user's application which is in main memory. The secondary cores don't necessarily share the cache with the core u-boot is running, depends on hardware implementation. Without flushing the cache, those cores don't have the correct code to fetch.

York

Wolfgang Denk

3:51 p.m.

Dear York Sun,

In message 1363724992-9803-1-git-send-email-yorksun@freescale.com you wrote:

...

When we need the copied code/data in the main memory, we can flush the cache now. It uses the existing function flush_cache. Syntax is

flush_cache <addr> <size>

The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.

Why would it be necessary to ever run this manually?

I feel you are just papering over bugs / shortcomings elsewhere in the code. Should we not rather fix the places where the cache flushes are missing in the code?

Best regards,

Wolfgang Denk

Scott Wood

5:44 p.m.

On 03/20/2013 09:51:03 AM, Wolfgang Denk wrote:

...

Dear York Sun,

In message 1363724992-9803-1-git-send-email-yorksun@freescale.com you wrote:

...
When we need the copied code/data in the main memory, we can flush

the

...
cache now. It uses the existing function flush_cache. Syntax is

flush_cache <addr> <size>

The addr and size are given in hexadecimal. Like memory command,

there is

...
no sanity check for the parameters.

Why would it be necessary to ever run this manually?

I feel you are just papering over bugs / shortcomings elsewhere in the code. Should we not rather fix the places where the cache flushes are missing in the code?

Possibly -- the alternative would be to have every command that modifies memory flush that region of cache afterward (cp, tftp, etc).

You could say the same thing about the existing cache command, though.

-Scott

Jim Chargin

3 Apr 3 Apr

4:02 p.m.

New subject: [U-Boot] [RFC] command/cache: Add flush_cache command

I apologize for being so late with this question.

York Sun <yorksun <at> freescale.com> writes:

...

When we need the copied code/data in the main memory, we can flush the cache now. It uses the existing function flush_cache. Syntax is

flush_cache <addr> <size>

The addr and size are given in hexadecimal. Like memory command, there is no sanity check for the parameters.

Are there symptoms for a specific system failure you can provide?

My problem is that when a stand alone application, which is copied from NOR flash to DDR, is started with the "go" command, it sometimes experiences a program check (illegal instruction) after a block of 32 zero bytes "appears" in memory.

I'm using a U-Boot for a custom Freescale P1022-based board, currently based on the old 2010.12 U-Boot as patched by Freescale in their SDK_V1_0_20110429_ltib.iso. Unfortunately, upgrading to a more recent version of U-Boot is not possible at this time, no more recent version is available from Freescale and I don't have the resources to verify all their patches apply correctly to a release directly from DENX.

I've used Codewarrior to observe the block of zeros in memory once the program check has happened, and I've verified that before the stand alone application begins execution, there are no zeros. Figuring it is likely that I have an error in my stand alone application code that corrupts the stack or writes based on an un-initialized pointer, I've tried using the CW watchpoint to catch where the zeros are written, but enabling the watchpoint seems to avoid the problem.

Based on later discussion on this thread, I've tried adding "flush_dcache(); invalidate_icache();" to cmd_boot.c:do_go(), just before control is passed to the stand alone app. Based on my ad hoc testing with this change, I don't get the program check exception.

I believe this result helps make the case that caching behavior is at the root of my problem, but since I was not able to isolate the actual cause of that problem, I can't be sure I've really got the solution.

(By the way, I would not leave the "flush_dcache(); invalidate_icache();" in do_go(), I merely found that for demonstrating a possible solution, this change easier than switching to a stand alone app that starts with bootm, or similar)

Any help or comments are very welcome.

Thanks, Jim

...

Signed-off-by: York Sun <yorksun <at> freescale.com>

common/cmd_cache.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)

diff --git a/common/cmd_cache.c b/common/cmd_cache.c index 5512f92..93b7337 100644 --- a/common/cmd_cache.c +++ b/common/cmd_cache.c @@ -94,6 +94,29 @@ int do_dcache(cmd_tbl_t *cmdtp, int flag, int argc, char *

const argv[])

...

return 0; }

+void __weak flush_cache(ulong addr, ulong size) +{

puts("No arch specific flush_cache available!\n");

/* please define arch specific flush_cache */

+}

+int do_flush_cache(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[]) +{
ulong addr, size;

switch (argc) {

case 3:
addr = simple_strtoul(argv[1], NULL, 16);
size = simple_strtoul(argv[2], NULL, 16);
flush_cache(addr, size);
break;
default:
return cmd_usage(cmdtp);
}

return 0;
+}

static int parse_argv(const char *s) { if (strcmp(s, "flush") == 0) @@ -120,3 +143,10 @@ U_BOOT_CMD( "[on, off, flush]\n" " - enable, disable, or flush data (writethrough) cache" );

+U_BOOT_CMD(

flush_cache, 3, 0, do_flush_cache,

"flush cache for a range",

"<addr> <size>\n"

" - flush cache for specificed range"

+);

Scott Wood

18 Apr 18 Apr

7:09 p.m.

On 04/03/2013 09:02:15 AM, Jim Chargin wrote:

...

I apologize for being so late with this question.

York Sun <yorksun <at> freescale.com> writes:

...
When we need the copied code/data in the main memory, we can flush

the

...
cache now. It uses the existing function flush_cache. Syntax is

flush_cache <addr> <size>

The addr and size are given in hexadecimal. Like memory command,

there is

...
no sanity check for the parameters.

Are there symptoms for a specific system failure you can provide?

Generally you'd see code behaving in a way that is different than the instructions in memory (as seen by a load instruction) would suggest. Often it's obvious, such as an illegal instruction trap on on opcode that is valid, but it could be arbitrary depending on what the old contents of memory were (especially if there had been valid (but different) code there before).

...

My problem is that when a stand alone application, which is copied from NOR flash to DDR, is started with the "go" command, it sometimes experiences a program check (illegal instruction) after a block of 32 zero bytes "appears" in memory.

Maybe the old contents of memory contained a dcbz? Or somehow confused a legimate dcbz to target the wrong address.

...

I'm using a U-Boot for a custom Freescale P1022-based board, currently based on the old 2010.12 U-Boot as patched by Freescale in their SDK_V1_0_20110429_ltib.iso. Unfortunately, upgrading to a more recent version of U-Boot is not possible at this time, no more recent version is available from Freescale and I don't have the resources to verify all their patches apply correctly to a release directly from DENX.

You can find the latest Freescale U-Boot at: http://git.freescale.com/git/cgit.cgi/ppc/sdk/u-boot.git/ git://git.freescale.com/ppc/sdk/u-boot.git

...though there's a good chance that you would not need anything from the SDK if you run the latest upstream U-Boot. You'd still need to apply your own patches for the custom board, of course.

...

(By the way, I would not leave the "flush_dcache(); invalidate_icache();" in do_go(), I merely found that for demonstrating a possible solution, this change easier than switching to a stand alone app that starts with bootm, or similar)

It seems reasonably likely that this is the cause of your problem; in any case, the flush is architecturally required.

-Scott

4410

Age (days ago)

4440

Last active (days ago)

List overview

Download

43 comments

8 participants

tags (0)

participants (8)

Albert ARIBAUD
Jim Chargin
Michael Cashwell
Scott Wood
Sricharan R
Tom Rini
Wolfgang Denk
York Sun