[U-Boot] [PATCH 0/2] sunxi: Fix MMC driver crashes

There were recent reports of MMC operations failing very often on boards with the Allwinner A64 SoC. Investigations pointed to a timing issue, and indeed a recent patch [1] introduced frequent arch timer reads to the Allwinner MMC driver. Reverting this patch made the problems go away, but also hurt performance considerably.

Unrelated reports also confirmed an Allwinner A64 arch timer erratum [2], which can lead to erroneous counter reads where the lower 11 bits become either all 0's or all 1's. This leads to random jumps forwards and backwards, with catastrophic consequences.

These two patches fix the issue while retaining the much improved MMC performance. The first patch refactors an already existing arch timer fix, so that the second patch, which introduces an Allwinner specific workaround, fits in more nicely.
Please have a look and apply as soon as possible.
Cheers, Andre.
[1] commit 5ff8e54888e4d26a352453564f7f599d29696dc9
    Author: Philipp Tomsich philipp.tomsich@theobroma-systems.com
    Date: Wed Mar 21 12:18:58 2018 +0100

    sunxi: improve throughput in the sunxi_mmc driver
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/576886.html
Andre Przywara (2):
  arm: timer: factor out FSL arch timer erratum workaround
  arm: timer: sunxi: add Allwinner timer erratum workaround

 arch/arm/cpu/armv8/generic_timer.c | 55 +++++++++++++++++++++++++++++++++-----
 arch/arm/mach-sunxi/Kconfig        |  4 +++
 2 files changed, 53 insertions(+), 6 deletions(-)
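
To illustrate why the counter jumps described in the cover letter are so harmful, here is a minimal host-side sketch of an unsigned elapsed-time check. The helper name `timed_out()` and the tick values are hypothetical and chosen for illustration only; this is not the code from the sunxi_mmc driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical timeout check of the kind a polling loop might use.
 * The subtraction is unsigned, so a counter value that is *lower* than
 * the start value wraps around to an enormous elapsed time.
 */
static bool timed_out(uint64_t start, uint64_t now, uint64_t timeout_ticks)
{
	return (now - start) >= timeout_ticks;
}
```

With a start value of 0x10000400, an erroneous read whose lower 11 bits collapsed to all 0's (0x10000000) makes `now - start` wrap to a value close to 2^64, so the check reports a timeout immediately, whereas a sane read such as 0x10000410 does not.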

At the moment we have the workaround for the Freescale arch timer erratum A-008585 merged into the generic timer_read_counter() routine. Split those two up, so that we can add other errata workarounds more easily. Also add an explanatory comment along the way.
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/cpu/armv8/generic_timer.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/arch/arm/cpu/armv8/generic_timer.c b/arch/arm/cpu/armv8/generic_timer.c
index bf07a706a0..3d04fde650 100644
--- a/arch/arm/cpu/armv8/generic_timer.c
+++ b/arch/arm/cpu/armv8/generic_timer.c
@@ -20,27 +20,46 @@ unsigned long get_tbclk(void)
 	return cntfrq;
 }

+#ifdef CONFIG_SYS_FSL_ERRATUM_A008585
 /*
- * Generic timer implementation of timer_read_counter()
+ * FSL erratum A-008585 says that the ARM generic timer counter "has the
+ * potential to contain an erroneous value for a small number of core
+ * clock cycles every time the timer value changes".
+ * This sometimes leads to a consecutive counter read returning a lower
+ * value than the previous one, thus reporting the time to go backwards.
+ * The workaround is to read the counter twice and only return when the value
+ * was the same in both reads.
+ * Assumes that the CPU runs in much higher frequency than the timer.
  */
 unsigned long timer_read_counter(void)
 {
 	unsigned long cntpct;
-#ifdef CONFIG_SYS_FSL_ERRATUM_A008585
-	/* This erratum number needs to be confirmed to match ARM document */
 	unsigned long temp;
-#endif
+
 	isb();
 	asm volatile("mrs %0, cntpct_el0" : "=r" (cntpct));
-#ifdef CONFIG_SYS_FSL_ERRATUM_A008585
 	asm volatile("mrs %0, cntpct_el0" : "=r" (temp));
 	while (temp != cntpct) {
 		asm volatile("mrs %0, cntpct_el0" : "=r" (cntpct));
 		asm volatile("mrs %0, cntpct_el0" : "=r" (temp));
 	}
-#endif
+
 	return cntpct;
 }
+#else
+/*
+ * timer_read_counter() using the Arm Generic Timer (aka arch timer).
+ */
+unsigned long timer_read_counter(void)
+{
+	unsigned long cntpct;
+
+	isb();
+	asm volatile("mrs %0, cntpct_el0" : "=r" (cntpct));
+
+	return cntpct;
+}
+#endif

 uint64_t get_ticks(void)
 {
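
For anyone who wants to exercise the double-read logic off-target, the following is a rough host-side sketch. It assumes a hypothetical `fake_counter` that replays a scripted list of samples in place of the `mrs %0, cntpct_el0` read; only the compare-and-retry loop mirrors the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for "mrs %0, cntpct_el0": replays a scripted trace. */
struct fake_counter {
	const uint64_t *samples;
	size_t len;
	size_t pos;
};

static uint64_t fake_read(struct fake_counter *c)
{
	/* Once the trace is exhausted, keep returning the last sample. */
	size_t i = (c->pos < c->len) ? c->pos++ : c->len - 1;

	return c->samples[i];
}

/*
 * Double-read scheme as in the patch: accept a value only when two
 * back-to-back reads agree, otherwise read another pair.
 */
static uint64_t read_counter_stable(struct fake_counter *c)
{
	uint64_t cntpct = fake_read(c);
	uint64_t temp = fake_read(c);

	while (temp != cntpct) {
		cntpct = fake_read(c);
		temp = fake_read(c);
	}

	return cntpct;
}
```

Feeding it the trace 100, 37, 101, 101 rejects the first pair (the reads disagree) and returns 101 from the second pair. On real hardware the scheme only terminates quickly if the CPU is much faster than the timer, as the comment in the patch notes.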

On Wed, Jun 27, 2018 at 6:12 AM, Andre Przywara andre.przywara@arm.com wrote:
> At the moment we have the workaround for the Freescale arch timer erratum A-008585 merged into the generic timer_read_counter() routine. Split those two up, so that we can add other errata workaround more easily. Also add an explaining comment on the way.
>
> Signed-off-by: Andre Przywara andre.przywara@arm.com
> ---
> arch/arm/cpu/armv8/generic_timer.c | 31 +++++++++++++++++++++++++------
> 1 file changed, 25 insertions(+), 6 deletions(-)

Are these two patches for the coming release?

Hi,
On 28/06/18 15:27, Jagan Teki wrote:
> Are these two patches for the coming release?

Yes, please. That's an issue that needs fixing now. The bug is in all A64 silicon, but the issue popped up with the recent commit 5ff8e54888e4 and can be triggered quite easily: Boot mainline U-Boot on any A64 board, and do:
ls mmc 0:1
on any SD card (with at least one partition defined). Then hold down the Enter key to repeat this last command over and over again. It usually takes less than 10 seconds to crash the board. People reported immediate crashes under certain circumstances (GPT partitioned SD cards): https://bugzilla.opensuse.org/show_bug.cgi?id=1098550
Cheers, Andre.

On Thu, Jun 28, 2018 at 8:07 PM, Andre Przywara andre.przywara@arm.com wrote:
> Yes, please. That's an issue that needs fixing now. The bug is in all A64 silicon, but the issue popped up with the recent commit 5ff8e54888e4 and can be triggered quite easily: Boot mainline U-Boot on any A64 board, and do:
>
>   ls mmc 0:1
>
> on any SD card (with at least one partition defined). Then hold down the Enter key to repeat this last command over and over again. It usually takes less than 10 seconds to crash the board. People reported immediate crashes under certain circumstances (GPT partitioned SD cards): https://bugzilla.opensuse.org/show_bug.cgi?id=1098550

Yes, I was able to reproduce the issue related to GPT partitions [1] with an SD card on the BPI-M64, and the issue seems to be fixed [2] after applying these two patches. I also tried another board, using an eMMC partition.
Tested-by: Jagan Teki jagan@amarulasolutions.com
[1] https://paste.ubuntu.com/p/hkHrQKXRj5/
[2] https://paste.ubuntu.com/p/JKrxC7ScNX/
Jagan.

Hi Jagan,
On 29/06/18 09:41, Jagan Teki wrote:
> Yes, I was able to reproduce the issue related to GPT partitions [1] with an SD card on the BPI-M64, and the issue seems to be fixed [2] after applying these two patches. I also tried another board, using an eMMC partition.
>
> Tested-by: Jagan Teki jagan@amarulasolutions.com

Thanks! That's much appreciated.
Cheers, Andre.

On Fri, Jun 29, 2018 at 2:17 PM, Andre Przywara andre.przywara@arm.com wrote:
> Thanks! That's much appreciated.

UW.
Hi All,
If anyone has any further questions on these two patches, please let me know. I would like to apply them soon.
Jagan.

On 2018-06-28 16:37, Andre Przywara wrote:
> Yes, please. That's an issue that needs fixing now.

To fix the issue fully, we also need to add the compatible string to the timer node in the DTS, so that kernels apply the erratum workaround as well. Booting is nice, but booting a kernel that still has the timer issue is not that great.

Hi,
On 06/29/2018 11:18 AM, Emmanuel Vadot wrote:
> To fix the issue fully, we also need to add the compatible string to the timer node in the DTS, so that kernels apply the erratum workaround as well. Booting is nice, but booting a kernel that still has the timer issue is not that great.

Yes, but that is a separate issue. The Linux patch [1] is on the list, but hasn't been merged yet. So the details of the DT property are not yet set. And since we need the fixup already early in the SPL, we have to use a compile time option for U-Boot anyway. So the DT property would be purely for OSes and we would get it during the normal DT sync anyway.
Cheers, Andre.
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/576886.html

On Wed, Jun 27, 2018 at 6:12 AM, Andre Przywara andre.przywara@arm.com wrote:
> At the moment we have the workaround for the Freescale arch timer erratum A-008585 merged into the generic timer_read_counter() routine. Split those two up, so that we can add other errata workaround more easily. Also add an explaining comment on the way.
>
> Signed-off-by: Andre Przywara andre.przywara@arm.com

Applied both patches to u-boot-sunxi/master.

On 02.07.2018 at 10:01, Jagan Teki wrote:
> Applied both patches to u-boot-sunxi/master.

Tested both on top of v2018.07-rc2, fixes the boot for me.
Thanks, Andreas

On 03/07/2018 at 01:08, Andreas Färber wrote:
> Tested both on top of v2018.07-rc2, fixes the boot for me.

Tested both on top of v2018.07-rc3, and it fixes the boot.
Thanks.
Guillaume

On Tue, Jul 3, 2018 at 5:20 PM, Guillaume Gardet guillaume.gardet@free.fr wrote:
> Tested both on top of v2018.07-rc3, and it fixes the boot.

Thanks, I've collected the Tested-by tags.

On 03.07.2018 at 01:08, Andreas Färber wrote:
> Tested both on top of v2018.07-rc2, fixes the boot for me.

Actually I saw it again just now, without having touched U-Boot at all. Unplugged power, retried, worked. So it seems we've reduced the likelihood, but something might still be astray...
Regards, Andreas

On 03.07.2018 at 22:51, Andreas Färber afaerber@suse.de wrote:
> Actually I saw it again just now, without having touched U-Boot at all. Unplugged power, retried, worked. So it seems we've reduced the likelihood, but something might still be astray...

So maybe we need to instead apply some logic that loops until cnt == prev_cnt+1?
Also, is there any way to just trap counter reads from EL3? It'd be quite tedious to fix up all OSs out there.
Alex

On 07/03/2018 09:59 PM, Alexander Graf wrote:
> So maybe we need to instead apply some logic that loops until cnt == prev_cnt+1?
>
> Also, is there any way to just trap counter reads from EL3?

I can't find anything for EL3, and I believe we can't trap the virtual timer at all except for EL0. Besides, that would be rather costly. The current solution normally gets away with just one sysreg read, so it would be just the comparison overhead we have to pay. You don't want to give this away easily.

> It'd be quite tedious to fix up all OSs out there.

Well, bad luck, it's a hardware erratum - and not the first one in this area. So chances are you can add just another one quite easily, as we do in Linux - where we actually have somewhat of an "arch timer errata framework".
Cheers, Andre.

Hi,
On 03/07/18 21:51, Andreas Färber wrote:
> Actually I saw it again just now, without having touched U-Boot at all. Unplugged power, retried, worked. So it seems we've reduced the likelihood, but something might still be astray...

There are reports for that happening on the kernel side as well: http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/588288.html (also see the follow-ups)
I suspect the TVAL access is affected as well (it internally accesses the counter), so we would need to cover that too. I'd suggest we wait for the kernel side solution and then copy it, but keep this patch in, as it seems to fix far more frequent problems.
Btw: I tried to use the Freescale workaround in U-Boot, but this at least requires another patch: to fix the problem when the CPU runs at 24MHz. Also it doesn't really help the MMC issue (I saw the same crashes), as it doesn't cover forward jumps.
Cheers, Andre.

The Allwinner A64 SoC suffers from an arch timer implementation erratum, where sometimes the lower 11 bits of the counter value erroneously become all 0's or all 1's [1]. This leads to sudden jumps, both forwards and backwards, with the latter often causing weird behaviour. Port the workaround proposed for Linux to U-Boot and activate it for all A64 boards. This fixes crashes when accessing MMC devices (SD cards), caused by a recent change to actually use the counter value for timeout checks.
Fixes: 5ff8e54888e4d26a352453564f7f599d29696dc9 ("sunxi: improve throughput in the sunxi_mmc driver")
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/576886.html
Signed-off-by: Andre Przywara andre.przywara@arm.com
---
 arch/arm/cpu/armv8/generic_timer.c | 24 ++++++++++++++++++++++++
 arch/arm/mach-sunxi/Kconfig        |  4 ++++
 2 files changed, 28 insertions(+)

diff --git a/arch/arm/cpu/armv8/generic_timer.c b/arch/arm/cpu/armv8/generic_timer.c
index 3d04fde650..c1706dcec1 100644
--- a/arch/arm/cpu/armv8/generic_timer.c
+++ b/arch/arm/cpu/armv8/generic_timer.c
@@ -46,6 +46,30 @@ unsigned long timer_read_counter(void)

 	return cntpct;
 }
+#elif CONFIG_SUNXI_A64_TIMER_ERRATUM
+/*
+ * This erratum sometimes flips the lower 11 bits of the counter value
+ * to all 0's or all 1's, leading to jumps forwards or backwards.
+ * Backwards jumps might be interpreted all roll-overs and be treated as
+ * huge jumps forward.
+ * The workaround is to check whether the lower 11 bits of the counter are
+ * all 0 or all 1, then discard this value and read again.
+ * This occasionally discards valid values, but will catch all erroneous
+ * reads and fixes the problem reliably. Also this mostly requires only a
+ * single read, so does not have any significant overhead.
+ * The algorithm was conceived by Samuel Holland.
+ */
+unsigned long timer_read_counter(void)
+{
+	unsigned long cntpct;
+
+	isb();
+	do {
+		asm volatile("mrs %0, cntpct_el0" : "=r" (cntpct));
+	} while (((cntpct + 1) & GENMASK(10, 0)) <= 1);
+
+	return cntpct;
+}
 #else
 /*
  * timer_read_counter() using the Arm Generic Timer (aka arch timer).
diff --git a/arch/arm/mach-sunxi/Kconfig b/arch/arm/mach-sunxi/Kconfig
index a3f7723028..3624a03947 100644
--- a/arch/arm/mach-sunxi/Kconfig
+++ b/arch/arm/mach-sunxi/Kconfig
@@ -84,6 +84,9 @@ config SUNXI_HIGH_SRAM
 	  Chips using the latter setup are supposed to select this option to
 	  adjust the addresses accordingly.

+config SUNXI_A64_TIMER_ERRATUM
+	bool
+
 # Note only one of these may be selected at a time! But hidden choices are
 # not supported by Kconfig
 config SUNXI_GEN_SUN4I
@@ -270,6 +273,7 @@ config MACH_SUN50I
 	select SUNXI_DRAM_DW_32BIT
 	select FIT
 	select SPL_LOAD_FIT
+	select SUNXI_A64_TIMER_ERRATUM

 config MACH_SUN50I_H5
 	bool "sun50i (Allwinner H5)"
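
The loop condition in the patch is compact, so here is a small host-side sketch of just the filter predicate. The names `A64_LOW11_MASK` and `a64_read_is_suspect()` are made up for this illustration; the expression itself is the one from the patch, with the mask written out as the value GENMASK(10, 0) evaluates to.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lower 11 bits, i.e. the value of GENMASK(10, 0) used in the patch. */
#define A64_LOW11_MASK 0x7ffULL

/*
 * True when the lower 11 bits are all 0's or all 1's, which is the
 * pattern an erroneous read shows. Adding 1 maps "all 1's" to 0 and
 * "all 0's" to 1, so a single "<= 1" comparison catches both cases.
 */
static bool a64_read_is_suspect(uint64_t cntpct)
{
	return ((cntpct + 1) & A64_LOW11_MASK) <= 1;
}
```

Only 2 out of every 2048 counter values match this pattern, so a valid read is discarded and repeated roughly 0.1% of the time, which fits the remark in the comment that the workaround mostly needs a single read.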

On 27 Jun 2018, at 02:42, Andre Przywara andre.przywara@arm.com wrote:
> The Allwinner A64 SoCs suffers from an arch timer implementation erratum, where sometimes the lower 11 bits of the counter value erroneously become all 0's or all 1's [1]. This leads to sudden jumps, both forwards and backwards, with the latter one often showing weird behaviour.

Feels like a throwback to a discussion between us from about 2 years back. ;-) Too bad that there's still no errata document for the A64...

> Port the workaround proposed for Linux to U-Boot and activate it for all A64 boards. This fixes crashes when accessing MMC devices (SD cards), caused by a recent change to actually use the counter value for timeout checks.
>
> Fixes: 5ff8e54888e4d26a352453564f7f599d29696dc9 ("sunxi: improve throughput in the sunxi_mmc driver")
>
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/576886.html
>
> Signed-off-by: Andre Przywara andre.przywara@arm.com

Reviewed-by: Philipp Tomsich philipp.tomsich@theobroma-systems.com

On Wed, Jun 27, 2018 at 6:12 AM, Andre Przywara andre.przywara@arm.com wrote:
> The Allwinner A64 SoCs suffers from an arch timer implementation erratum, where sometimes the lower 11 bits of the counter value erroneously become all 0's or all 1's [1]. This leads to sudden jumps, both forwards and backwards, with the latter one often showing weird behaviour. Port the workaround proposed for Linux to U-Boot and activate it for all A64 boards. This fixes crashes when accessing MMC devices (SD cards), caused by a recent change to actually use the counter value for timeout checks.
>
> Fixes: 5ff8e54888e4d26a352453564f7f599d29696dc9 ("sunxi: improve throughput in the sunxi_mmc driver")
>
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/576886.html
>
> Signed-off-by: Andre Przywara andre.przywara@arm.com

Tested-by: Jagan Teki jagan@amarulasolutions.com