Re: [U-Boot] [PATCH v5 09/11] armv8: ls2080a: Drop early MMU for SPL build

7 Mar 2017


On 03/06/2017 10:36 PM, Marek Vasut wrote:
...
On 03/07/2017 05:31 AM, york sun wrote:
...
On 03/06/2017 07:59 PM, Marek Vasut wrote:
...
On 03/06/2017 06:02 PM, York Sun wrote:
...
Early MMU improves performance especially on emulators. However, the
early MMU is left enabled after the first stage of SPL boot. Instead
of flushing D-cache and dealing with re-enabling MMU for the second
stage U-Boot, disabling it for SPL build simplifies the process. The
performance penalty is unnoticeable on the real hardware. As of now,
SPL boot is not supported by existing emulators. So this should have
no impact on emulators.
Signed-off-by: York Sun <york.sun@nxp.com>
This looks stupid. Why don't you just keep it enabled between SPL and
U-Boot ? That'd be much more logical and sensible ...
Marek,
We are probably the only ones using early MMU to speed up execution on
emulators. I only noticed recently that the cache/MMU wasn't enabled in
SPL for most ARMv8 Layerscape SoCs, except LS2080A (which is where we
started). During recent debugging, I learned that the break-before-make
process should be strictly followed when changing the MMU.
I don't think I understand this part.
Marek,
Let me explain in detail.
When we started to port U-Boot to the very first ARMv8 Layerscape SoC,
we used a hardware emulator (we still use emulators today) which runs
very slowly, at about 800 kHz. It takes a long time to see the first
message on the console. So we enabled the MMU and cache very early to
dramatically improve the boot speed. We are talking about booting
U-Boot in minutes vs. 20~30 minutes. It improves productivity a lot.
Since we are still using emulators for new SoCs, I think we should keep
this feature.
That being said, the early MMU table is small and simple, and resides in
internal SRAM. A new, complete table has to be created after U-Boot
relocates to DDR. Switching MMU tables requires flushing the cache and
disabling and re-enabling the MMU at some point to avoid a race
condition. I only recently learned that this break-before-make rule
should be strictly enforced.
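
The table switch described above can be sketched as a small host-side
model. This is only an illustration of the ordering, not U-Boot code:
struct mmu_model, the model_* helpers and switch_table() are all
hypothetical names, and the TLB invalidation step is an assumption
added here (the mail only mentions the cache flush and the MMU
disable/re-enable).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the relevant CPU state; not a U-Boot structure. */
struct mmu_model {
	bool mmu_on;        /* stands in for the MMU enable bit */
	bool dcache_dirty;  /* true while dirty cache lines may exist */
	bool tlb_stale;     /* true until the TLB has been invalidated */
	const void *table;  /* stands in for the translation table base */
};

static void model_flush_dcache(struct mmu_model *m)
{
	m->dcache_dirty = false;
}

static void model_mmu_disable(struct mmu_model *m)
{
	/* Dirty lines must reach memory before the MMU/cache goes off. */
	assert(!m->dcache_dirty);
	m->mmu_on = false;
}

static void model_set_table(struct mmu_model *m, const void *table)
{
	/* The "break" comes first: never swap tables under a live MMU. */
	assert(!m->mmu_on);
	m->table = table;
	m->tlb_stale = true;
}

static void model_invalidate_tlb(struct mmu_model *m)
{
	m->tlb_stale = false;
}

static void model_mmu_enable(struct mmu_model *m)
{
	/* Assumption: old translations are dropped before re-enabling. */
	assert(!m->tlb_stale);
	m->mmu_on = true;
}

/* Move from the early table to the final one in a safe order. */
void switch_table(struct mmu_model *m, const void *new_table)
{
	model_flush_dcache(m);
	model_mmu_disable(m);
	model_set_table(m, new_table);
	model_invalidate_tlb(m);
	model_mmu_enable(m);
}
```

Calling switch_table() on a model that starts with the MMU on and a
dirty cache ends with the MMU on again and the new table installed.
Reordering the calls inside switch_table() trips one of the asserts,
which is the point of the rule.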
For SPL boot, if the MMU/cache is enabled in the SPL build, the normal
boot part still goes through creating an early MMU table, the same steps
as when booting from NOR flash. We would have to deal with changing the
early MMU table with the same setup. This is troublesome and unnecessary.
...
...
The normal boot part surely creates MMU tables. If the SPL also creates
an MMU table and enables the cache, then we need to flush the cache and
disable the MMU before the normal boot part begins to run. Instead of
dealing with all these steps just for LS2080A, I figure it would be a lot
easier to drop it as I proposed in this patch.
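
As a rough sketch of what "dropping it for SPL" means at the source
level: U-Boot's build defines CONFIG_SPL_BUILD while compiling SPL, so
the early MMU setup can be compiled out of that stage. The function and
struct names below are hypothetical and this is not the actual diff
from the patch.

```c
#include <stdbool.h>

/* Hypothetical boot state, used only to make the effect observable. */
struct boot_state {
	bool early_mmu_on;
};

/* Stand-in for the real early MMU/cache setup. */
static inline void early_mmu_setup_sketch(struct boot_state *s)
{
	s->early_mmu_on = true;
}

/* Hypothetical early CPU init shared by the SPL and normal builds. */
void arch_cpu_init_sketch(struct boot_state *s)
{
	s->early_mmu_on = false;
#ifndef CONFIG_SPL_BUILD
	/* Normal boot only: SPL runs from SRAM with MMU/cache off. */
	early_mmu_setup_sketch(s);
#endif
}
```

Built without CONFIG_SPL_BUILD, the early MMU comes up as before. Built
with -DCONFIG_SPL_BUILD, SPL hands over with the MMU and cache still
off, so nothing has to be flushed or disabled before the normal boot
part starts.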
But then this is a step back and makes the whole platform slower, right?
Yes, but not noticeably on real hardware. Most of the time we spend in
SPL goes to initializing DDR, which can take seconds with ECC enabled.
Dropping the MMU/cache does slow things down a little, but not a lot.
For Layerscape SoCs, the SPL image is loaded into internal SRAM, which
runs pretty fast.
Besides, LS2080A is the only SoC with MMU/cache enabled for SPL. We have
many other SoCs in the same Layerscape family that don't enable MMU/cache
for SPL, and we didn't notice a performance impact. It is actually
cleaner to make all of them consistent.
York