
Dear Alexander Stein,
In message 201105030848.17576.alexander.stein@systec-electronic.com you wrote:
This specific version was selected due to relocation problems on ARM. But I expect the dcache doesn't have that big influence on the named code part as the environment is already in RAM.
Your expectation is most likely completely wrong. Reading from / writing to uncached RAM is painfully slow compared to a system with caches turned on. And if you - as I speculate - need to checksum a huge amount of data, this will delay things without need.
Are you also still using the old environment code in your port, or is it the new, hash table based one? When using the old code, there are additional penalties for using a needlessly big environment, as each call to setenv() will recalculate the checksum.
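To put a rough number on that penalty, here is a minimal sketch. It is not the actual U-Boot code: crc32_bitwise() is a plain textbook CRC-32 and bytes_checksummed() is a hypothetical helper. It assumes, as described above, that the checksum runs over the whole configured environment area on every setenv() call, not just over the bytes actually in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Textbook bitwise CRC-32 (reflected polynomial 0xEDB88320).
 * Illustration only; not taken from the U-Boot sources. */
uint32_t crc32_bitwise(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift one bit out; XOR in the polynomial if that bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical cost model: total bytes run through the CRC if every
 * setenv() call re-checksums the full environment area (ASSUMPTION). */
uint64_t bytes_checksummed(size_t env_size, unsigned int setenv_calls)
{
    return (uint64_t)env_size * setenv_calls;
}
```

Under that assumption, ten setenv() calls on a 131067 byte environment push 1310670 bytes through the CRC loop, while an environment sized to the 2098 bytes actually used would need only 20980. With the data cache off, every one of those reads goes out to uncached RAM.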
Let me speculate: (I) you have a _huge_ environment allocated for your board, probably 100 KiB or more;
Environment size: 2098/131067 bytes
So, no.
So, yes! You cannot even read your own numbers correctly.
131067 bytes is just under 128 KiB (131072 bytes), which _is_ > 100 KiB.
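For anyone who wants to check the arithmetic, a small sketch follows. The helper names are made up for illustration; the two input numbers are the ones from the "Environment size" line quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

#define KIB 1024u

/* True if the environment area is larger than the given number of KiB. */
bool env_exceeds_kib(size_t env_size, size_t kib)
{
    return env_size > kib * KIB;
}

/* Share of the environment area actually in use, in tenths of a percent
 * (integer division, rounded down). */
unsigned int env_used_permille(size_t used, size_t env_size)
{
    return (unsigned int)((used * 1000u) / env_size);
}
```

With used = 2098 and env_size = 131067, env_exceeds_kib(131067, 100) is true (100 KiB is 102400 bytes), and env_used_permille(2098, 131067) gives 16, i.e. about 1.6% of the area holds data.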
(II) you are loading it from a slow storage device, probably NAND flash;
The environment is stored in NOR-Flash. So, no.
Especially on NOR flash there is no reason to use an environment size of 128 KiB when you only use 2 KiB of it.
(III) you are running on a narrow system bus (16 bit) with non-optimal RAM timings;
It is using a 32-Bit RAM-Bus. So, no.
And your NOR flash?
And your memory timings?
(IV) you do all this with caches turned off;
dcaches should be off, while icaches are on. So yes and no.
Turning the DC off makes things awfully slow. See the comments of commits c3330e9, 95c6f6d and 7e4a9e6: for plain RAM bound operations like copying/uncompressing an image from RAM to RAM, switching on the DC can accelerate the system by a factor of up to >15.
(V) you measure some numbers but you don't understand what they mean.
These numbers show me that this part of the code increases the start time by a considerable amount.
You don't even understand that you have > 100 KiB of environment size which gets checksummed without need.
The workaround resulted in a faster startup without notable side effects. I'm aware this is not a fix for the problem. So yes and no.
I bet some beer that at least 3 of these speculations hit the point.
You better not want to bet here :-)
It appears I was right on counts (I), (IV) and (V) ;-) And (III) is not clear yet.
Fact is, the code that you claim takes 100 (or 500) ms to run has no potential for such a long run time unless your system is seriously misconfigured. I guess it runs at least 100 times faster on all systems I have access to.
Best regards,
Wolfgang Denk