
Dear Alexander Stein,
In message 201105050906.35834.alexander.stein@systec-electronic.com you wrote:
Are you also still using the old environment code in your port, or is it the new, hash table based one? When using the old code, there are additional penalties for using a needlessly big environment, as each call to setenv() will recalculate the checksum.
I was digging into this problem for a short time. And yes, the CRC checksum calculation takes about 25 ms each run. setenv is called once each for stdin, stdout and stderr, which sums up to ~75 ms. So you're right, this is the old environment code. Here a dcache will speed up the execution, of course.
Even more so would reducing the environment size to some reasonable value. Currently you are using some 128 KiB, so say you set the environment size to 8 KiB. This would be 1/16 of your current size, which means the ~75 ms would shrink to less than 5 ms. You are wasting 70 ms (only here - there are other places which will add to this figure) just because of this inappropriate configuration.
But our standard startup just starts U-Boot and copies the Linux kernel into RAM and starts it. There is not much use of dcache during copy here.
You are wrong. There is a huge difference between performing a copy operation in single write cycles to uncached RAM versus writing to a cached area where the cache flushes will operate in burst mode. Also, the U-Boot code will run faster, too, so copying and decompression is much faster.
You repeat the same mistake again: you make assumptions about what may or may not be fast or slow on your system without actually measuring it. Donald Knuth is right again: "Premature optimization is the root of all evil."
It is using a 32-Bit RAM-Bus. So, no.
And your NOR flash?
It is connected 16-bit wide, which is all that most devices support, but it is set up to use page read mode.
Well, many systems use two 16 bit chips in parallel to give a 32 bit bus.
DC off makes things awfully slow. See comments of commits c3330e9, 95c6f6d and 7e4a9e6 - for plain RAM bound operations like copying/uncompressing an image from RAM to RAM, switching on the DC can accelerate the system by a factor of up to >15.
Yes, from RAM to RAM, dcache will help a lot. But we neither copy from RAM to RAM nor do we uncompress anything.
There is still a huge difference in memory bandwidth between using plain single write cycles versus burst mode accesses.
Don't speculate. Measure yourself!
Best regards,
Wolfgang Denk