
On Mon, 5 Sep 2016 09:23:00 +0100 Andre Przywara andre.przywara@arm.com wrote:
Hi,
On 05/09/16 05:12, Siarhei Siamashka wrote:
On Mon, 5 Sep 2016 01:32:38 +0100 Andre Przywara andre.przywara@arm.com wrote:
This commit moved the SPL stack into SRAM C, which worked when the SPL set the AHB1 clock down to 100 MHz to cope with the flaky SRAM C access from the CPU. However, when booting with boot0 (and thus not using the SPL at all), we still run with a 200 MHz AHB1, so any access to SRAM C is prone to fail. Since this commit affects _not_ only the SPL code but also the U-Boot proper, we fail when booting with boot0.
Yes, it unfortunately affected both the SPL and the U-Boot proper because currently both CONFIG_SPL_STACK and CONFIG_SYS_INIT_SP_ADDR defines affect the SPL stack location and in practice this only works in a predictable way if they are set to the same value. I have sent a patch to address this problem (but the fix may be unsafe for v2016.09 because many ARM platforms are affected):
https://patchwork.ozlabs.org/patch/665608/
After this problem is resolved, the CONFIG_SYS_INIT_SP_ADDR define can be decoupled from CONFIG_SPL_STACK and configured to even use the DRAM instead of trashing some part of the scarce SRAM space (which may already be occupied by the OpenRISC firmware and/or the ATF at the time when the U-Boot proper is starting).
As the introduction of tiny-printf reduced the size of the SPL, we can afford to have the SPL stack in SRAM A1.
We still need to check how much space is really available. The FIT support is rather heavyweight and we may want to enable some other features too.
Yes, I had to learn this yesterday ;-) So 64-bit SPL works for me now with Jens' DRAM support patches (yeah!), but enabling FIT support makes mksunxiboot barf about the file being too big. The actual SPL code is about 31K, so maybe I can talk mksunxiboot into relaxing its alignment requirements a bit (from 8K down to 512) and also increase the available SRAM size - it says 0x7600 for sun4i, is this still true for newer SoCs/BROMs?
We have had this information in the linux-sunxi wiki for a long time (at least for the SoC variants that I have and could experiment with) and it is available here:
https://linux-sunxi.org/BROM#U-Boot_SPL_limitations
All the new SoCs have a 32K size limit for the SPL code that the BROM can load. Older A10/A20 SoCs artificially limit it to 24K, probably trying to forcefully encourage the users to have an 8K stack in the remaining part of the SRAM A1.
On A64, we have 32K of SRAM A1. Then we have 108K of SRAM C, which is a continuation of SRAM A1 in the address space, thus making it look like a nice single 140K chunk. Then we also have 64K of SRAM A2, which is supposed to be used by the OpenRISC core and is the only memory area that has reasonable performance when used by OpenRISC:
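To make the size arithmetic concrete, here is a rough sketch in Python. The 0x7600 value, the 24K and 32K limits, the "about 31K" SPL size and the 8K/512 alignment values all come from this thread. The helper names padded_size() and fits() are made up for illustration, and this is not a description of how mksunxiboot actually performs its checks.

```python
# Illustrative only: padded_size() and fits() are hypothetical helpers,
# not functions taken from mksunxiboot.

KIB = 1024

def padded_size(size, align):
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align

def fits(size, align, limit):
    """Return True if the padded image does not exceed the limit."""
    return padded_size(size, align) <= limit

spl_size = 31 * KIB  # "about 31K" of SPL code, as reported in this thread

# Try both alignments against the three limits mentioned in the thread.
for align in (8 * KIB, 512):
    for limit in (0x7600, 24 * KIB, 32 * KIB):
        print(f"align={align:5d} limit={limit:#06x} "
              f"padded={padded_size(spl_size, align):5d} "
              f"fits={fits(spl_size, align, limit)}")
```

Under these assumptions an image of exactly 31K pads to 32K with 8K alignment and stays at 31K with 512-byte alignment. It exceeds 0x7600 either way, so relaxing the alignment alone would not be enough without also raising the limit.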
https://linux-sunxi.org/AR100#Memory_Map
The idea was to let the BROM load up to 32K of the SPL code to the SRAM A1 (like it normally does) and then have 8K of stack a bit higher in the address space in SRAM C. But it turned out that the SRAM C is a bit quirky and suffers from data corruption problems if we reclock AHB1 too early.
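The intended layout can be sketched with a few lines of Python. The sizes (32K SRAM A1, 108K SRAM C, 32K SPL limit, 8K stack) are the ones quoted in this thread. Offsets are relative to the start of SRAM A1, and placing the stack immediately after the SPL area is only an assumption made for this example ("a bit higher" could mean a different offset).

```python
KIB = 1024

SRAM_A1_SIZE = 32 * KIB   # from this thread
SRAM_C_SIZE = 108 * KIB   # from this thread; directly follows SRAM A1
SPL_MAX_SIZE = 32 * KIB   # BROM load limit on the newer SoCs
STACK_SIZE = 8 * KIB      # stack size discussed in this thread

# Offsets are relative to the start of SRAM A1 (no absolute base assumed).
sram_c_start = SRAM_A1_SIZE
combined_size = SRAM_A1_SIZE + SRAM_C_SIZE

# Assumption for illustration: the stack sits right after the SPL area.
stack_bottom = SPL_MAX_SIZE
stack_top = stack_bottom + STACK_SIZE  # initial SP, the stack grows downwards

print(f"combined SRAM A1 + C: {combined_size // KIB}K")
print(f"SPL area:             0x{0:05x}..0x{SPL_MAX_SIZE - 1:05x}")
print(f"stack:                0x{stack_bottom:05x}..0x{stack_top - 1:05x}")
print(f"stack lies in SRAM C: {stack_bottom >= sram_c_start}")
```

With the SPL allowed to fill all of SRAM A1, any stack placed above it necessarily ends up in SRAM C, which is why the SRAM C quirk matters here.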
Now there are two possible ways to move forward on A64:
1) Try to use SRAM C in such a way that it does not fail (and hope that no additional quirks get discovered later).
2) Move the initial SPL stack to SRAM A2.
If we move everything to SRAM A2, then we will have to make sure that all the SRAM users (the FEL storage area, the SPL stack, the ATF and the yet to be implemented OpenRISC firmware) never clash with each other.
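A clash check of this kind is easy to automate. The sketch below is only an illustration: the 64K size of SRAM A2 is from this thread, but every offset and size in the example layout is invented, and find_clashes() is a hypothetical helper rather than anything that exists in U-Boot.

```python
from itertools import combinations

KIB = 1024
SRAM_A2_SIZE = 64 * KIB  # from this thread

def find_clashes(regions, total_size):
    """Return a list of problems for a {name: (start, size)} layout.

    Out-of-range regions are reported as (name, "out of range"),
    overlapping regions as (name_a, name_b).
    """
    problems = []
    for name, (start, size) in regions.items():
        if start < 0 or start + size > total_size:
            problems.append((name, "out of range"))
    for (a, (a_start, a_size)), (b, (b_start, b_size)) in combinations(
            regions.items(), 2):
        # Two half-open ranges overlap if each starts before the other ends.
        if a_start < b_start + b_size and b_start < a_start + a_size:
            problems.append((a, b))
    return problems

# Hypothetical layout: all offsets and sizes below are made up.
layout = {
    "FEL storage area": (0 * KIB, 4 * KIB),
    "SPL stack": (4 * KIB, 8 * KIB),
    "ATF": (16 * KIB, 32 * KIB),
    "OpenRISC firmware": (48 * KIB, 16 * KIB),
}

print(find_clashes(layout, SRAM_A2_SIZE))
```

Running such a check at build time for every SRAM user would at least catch accidental overlaps, although it cannot tell whether the chosen sizes are actually sufficient.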
About the 31K code size: this does not look good, because it is very close to the BROM limit (32K). Just using a different compiler, or making some minor code tweaks and feature additions, may get us into trouble.
Trying this in the past (with libdram) and compiling for (32-bit) Thumb2 worked, but I need to check what the actual size with Jens' patches is these days for Thumb2.
We have already discussed this off-list a long time ago. I know that both you and Alexander Graf are generally in favour of compiling the SPL as 64-bit code.
I think that this is the usual case of utility versus fashion. Everyone wants to plug every hole with 64-bit ARM code right now just because it is new and innovative. But this fad will fade away in a few years. Now just imagine an alternative reality, where ARM64 is an old and boring thing, while Thumb2 is a recent invention to improve code density in microcontrollers and other code space constrained systems. I'm sure that everyone would be trying to find a way to replace the legacy bloated 64-bit ARM code in the SPL with the new and shiny Thumb2 stuff for improving code density ;-)
If we take a pragmatic approach and try to evaluate the pros and cons, then we can see that the 64-bit code in the SPL on Allwinner A64 hardware does not give us any real improvements. Quite the contrary: it offers worse code density than 32-bit Thumb2, and functional USB FEL boot support also becomes much trickier (because the boot ROM implements FEL as 32-bit code).
The only real argument in favour of having a 64-bit SPL is that we can use a single AArch64 toolchain to build both the SPL and the main U-Boot. And we are in this situation only because the AArch64 toolchain does not support the "-m32" option. There is no technical justification for this. ARM decided to be different just for the sake of being different (every other architecture has the -m32 option in GCC if the processor is able to work in both modes). If the "-m32" option was supported, then building a 32-bit SPL would have been mostly a trivial matter of adding "-m32 -mthumb" options to CFLAGS.
But we can try to do a 32-bit SPL build by introducing something like a CROSS_COMPILE_SPL environment variable, as was suggested some time ago: http://lists.denx.de/pipermail/u-boot/2012-April/122236.html
Also I'm finally going to submit the runtime SPL code decompression patches for the next U-Boot release, because there is no need to delay the implementation of this feature any longer. Yes, I know that any saved space will be wasted almost instantly by various gimmicks, but that's just how it is.
Anyway, thanks for your patch, I will try tonight if I can squeeze all the bits in.
If you mean https://patchwork.ozlabs.org/patch/665608/ then it only gives us the freedom to move CONFIG_SYS_INIT_SP_ADDR somewhere else.
And moving the initial stack of the U-Boot proper into the DRAM would make a lot of sense. We only need to agree on which DRAM address to use. After all, even the SPL relocates the stack into the DRAM. Why does the U-Boot proper want to use the SRAM for its stack again?