[U-Boot] [PATCH] configs: Lower Lamobo R1 DRAM clock rate to 384 MHz

When running at 432 MHz, the Lamobo R1 DRAM tends to get corrupted under stressing workloads. Reducing the clock rate to 384 MHz results in significantly-improved stability.
One reliable way to trigger a corruption at 432 MHz is to run I/O-intensive operations on an attached SATA disk. The same operations when operating the DRAM at 384 MHz typically go fine.
For some unexplained reason, running at 408 MHz worsens the situation.
Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
---
 configs/Lamobo_R1_defconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configs/Lamobo_R1_defconfig b/configs/Lamobo_R1_defconfig
index 92e682128c..cf60fdfaf4 100644
--- a/configs/Lamobo_R1_defconfig
+++ b/configs/Lamobo_R1_defconfig
@@ -2,7 +2,7 @@ CONFIG_ARM=y
 CONFIG_ARCH_SUNXI=y
 CONFIG_SYS_TEXT_BASE=0x4a000000
 CONFIG_MACH_SUN7I=y
-CONFIG_DRAM_CLK=432
+CONFIG_DRAM_CLK=384
 CONFIG_MACPWR="PH23"
 CONFIG_MMC0_CD_PIN="PH10"
 CONFIG_SATAPWR="PB3"
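
When comparing several candidate clock rates (384, 408 and 432 MHz are the ones discussed here), it can help to generate the defconfig variants mechanically rather than editing them by hand. The following is a minimal Python sketch of that idea. The helper name `set_dram_clk` and the `SAMPLE` excerpt are hypothetical illustrations, not part of U-Boot; the sketch only rewrites the `CONFIG_DRAM_CLK` line that this patch changes.

```python
import re


def set_dram_clk(defconfig_text: str, mhz: int) -> str:
    """Return defconfig_text with CONFIG_DRAM_CLK set to mhz.

    Hypothetical helper for illustration; it is not part of U-Boot.
    """
    pattern = re.compile(r"^CONFIG_DRAM_CLK=\d+$", flags=re.MULTILINE)
    new_text, count = pattern.subn(f"CONFIG_DRAM_CLK={mhz}", defconfig_text)
    if count != 1:
        # Refuse to guess when the option is missing or duplicated.
        raise ValueError(f"expected one CONFIG_DRAM_CLK line, found {count}")
    return new_text


# Excerpt of configs/Lamobo_R1_defconfig, taken from the patch context.
SAMPLE = """CONFIG_ARM=y
CONFIG_ARCH_SUNXI=y
CONFIG_SYS_TEXT_BASE=0x4a000000
CONFIG_MACH_SUN7I=y
CONFIG_DRAM_CLK=432
CONFIG_MACPWR="PH23"
"""

if __name__ == "__main__":
    # Print the rewritten DRAM clock line for each rate under test.
    for mhz in (384, 408, 432):
        print(set_dram_clk(SAMPLE, mhz).splitlines()[4])
```

Each generated variant would still need to be built and stress-tested on the board; the script only saves the manual edit.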

On Fri, Jun 15, 2018 at 10:52:39PM +0200, Paul Kocialkowski wrote:
> When running at 432 MHz, the Lamobo R1 DRAM tends to get corrupted under stressing workloads. Reducing the clock rate to 384 MHz results in significantly-improved stability.
> One reliable way to trigger a corruption at 432 MHz is to run I/O-intensive operations on an attached SATA disk. The same operations when operating the DRAM at 384 MHz typically go fine.
> For some unexplained reason, running at 408 MHz worsens the situation.
> Signed-off-by: Paul Kocialkowski <contact@paulk.fr>

What RAM settings are used by the Allwinner BSP, and can you reproduce the issue there if they are the same?
Maxime

Hi,
On Mon, 2018-06-18 at 09:59 +0200, Maxime Ripard wrote:
> On Fri, Jun 15, 2018 at 10:52:39PM +0200, Paul Kocialkowski wrote:
> > When running at 432 MHz, the Lamobo R1 DRAM tends to get corrupted under stressing workloads. Reducing the clock rate to 384 MHz results in significantly-improved stability.
> > One reliable way to trigger a corruption at 432 MHz is to run I/O-intensive operations on an attached SATA disk. The same operations when operating the DRAM at 384 MHz typically go fine.
> > For some unexplained reason, running at 408 MHz worsens the situation.
> > Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
> What RAM settings are used by the Allwinner BSP, and can you reproduce the issue there if they are the same?

I forgot to mention it, but the fex uses 432 MHz (just like the u-boot defconfig we have currently). I doubt that building the Allwinner boot software (boot0 and so on) for comparison is really an option at this point, due to the trainwreck of build issues that may occur.
Would the linux-sunxi downstream u-boot be sufficient for this?
For the sake of completeness, I also looked into whether enabling ODT at 432 MHz could be a solution, but since the fex does not make use of it (and has the default Zq value of 0x7f), this is not an option.
Cheers,
Paul

On Fri, 15 Jun 2018 22:52:39 +0200 Paul Kocialkowski <contact@paulk.fr> wrote:
> When running at 432 MHz, the Lamobo R1 DRAM tends to get corrupted under stressing workloads.

Yes, it is well known that Allwinner devboards tend to ship with overclocked settings out of the box and have a poor reliability track record, because each board vendor tries to clock this low-end hardware as high as possible in order to look more "competitive". You can find more information about this problem here:
https://linux-sunxi.org/Hardware_Reliability_Tests

> Reducing the clock rate to 384 MHz results in significantly-improved stability.

It would be great if we could get the reliability problems completely resolved on this board rather than just improved.

> One reliable way to trigger a corruption at 432 MHz is to run I/O-intensive operations on an attached SATA disk. The same operations when operating the DRAM at 384 MHz typically go fine.

Yes, concurrent access to the DRAM controller from more than one peripheral exposes reliability problems. That's why we have the lima-memtester tool at least for A10/A20 hardware, which does a stress test for DRAM reliability by using CPU+Mali simultaneously:
https://github.com/ssvb/lima-memtester/
I also did some experiments with CPU+Mali+G2D (simultaneous access from 3 sources) and CPU+G2D (use G2D instead of Mali) and the highest reliable DRAM clock speeds under these workloads were pretty much the same. So I suspect that CPU+SATA is about as stressful as any other combination. And you can probably just run a regular memtester tool together with some SATA activity in the background (I'm assuming that you did just that when debugging this problem).
Still, I would suggest that you try the lima-memtester tool too. It requires a legacy 3.4 kernel. If you are really lazy, you can even try this kernel branch from my GitHub repository:
https://github.com/ssvb/linux-sunxi/tree/20151206-embedded-lima-memtester
The embedded initramfs automatically starts lima-memtester, so you only need to boot this kernel image and watch the serial console log. The only other thing is a proper script.bin file for your board (created from a fex file).
If you are interested in more advanced work (finding better DRAM settings rather than just downclocking the DRAM until it stops failing), you may want to check this wiki page:
https://linux-sunxi.org/A10_DRAM_Controller_Calibration
For example, I had to downclock the DRAM from 408 MHz to 360 MHz in Linksprite_pcDuino_defconfig in the past. You can find a detailed analysis in the commit log:
https://lists.denx.de/pipermail/u-boot/2015-October/229567.html

> For some unexplained reason, running at 408 MHz worsens the situation.
> Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
>  configs/Lamobo_R1_defconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/configs/Lamobo_R1_defconfig b/configs/Lamobo_R1_defconfig
> index 92e682128c..cf60fdfaf4 100644
> --- a/configs/Lamobo_R1_defconfig
> +++ b/configs/Lamobo_R1_defconfig
> @@ -2,7 +2,7 @@ CONFIG_ARM=y
>  CONFIG_ARCH_SUNXI=y
>  CONFIG_SYS_TEXT_BASE=0x4a000000
>  CONFIG_MACH_SUN7I=y
> -CONFIG_DRAM_CLK=432
> +CONFIG_DRAM_CLK=384
>  CONFIG_MACPWR="PH23"
>  CONFIG_MMC0_CD_PIN="PH10"
>  CONFIG_SATAPWR="PB3"

participants (3)
- Maxime Ripard
- Paul Kocialkowski
- Siarhei Siamashka