[U-Boot] [PATCH] arm: fix memory coherency problem after relocation

On the xscale, the icache must be invalidated and the write buffers drained after writing code over the data bus, even if the caches are disabled. After rebasing with the main git repository, u-boot began crashing in odd places on my pxa270 board (palmtreo680) after the code relocation routine ran. This patch fixes it. Cache coherency problems are often hit-and-miss (ha ha), and this latent problem didn't rear its ugly head until now. Tested on the pxa270.
Signed-off-by: Mike Dunn mikedunn@newsguy.com ---
I realize that __ARM_ARCH_5TE__ does not necessarily mean xscale. But how else to test for pxa2xx/ixp in an assembly file? I wouldn't expect any ill effects on a non-xscale v5te because these CP15 operations are defined the same way in the ARM Architecture Reference Manual for v5 cores, but unfortunately I only have a pxa270 xscale to test on.
I experienced this same problem a couple years ago when I was getting OpenOCD working on the xscale. Often software breakpoints (replacing an instruction with the 'bkpt' instruction) would fail unless this operation was performed after replacing the instruction, even with caches off. The following two paragraphs are cut from section 4.2.7 of the xscale core developer's manual...
If the instruction cache is not enabled, or code is being written to a non-cacheable region, software must still invalidate the instruction cache before using the newly-written code. This precaution ensures that state associated with the new code is not buffered elsewhere in the processor, such as the fetch buffers or the BTB.
Naturally, when writing code as data, care must be taken to force it completely out of the processor into external memory before attempting to execute it. If writing into a non-cacheable region, flushing the write buffers is sufficient precaution (see Section 7.2.8 for a description of this
arch/arm/lib/relocate.S | 10 ++++++++++ 1 files changed, 10 insertions(+), 0 deletions(-)
diff --git a/arch/arm/lib/relocate.S b/arch/arm/lib/relocate.S index 4446da9..e2febed 100644 --- a/arch/arm/lib/relocate.S +++ b/arch/arm/lib/relocate.S @@ -92,6 +92,16 @@ fixnext:
relocate_done:
+#ifdef __ARM_ARCH_5TE__ + /* + * On xscale, icache must be invalidated and write buffers drained, + * even with cache disabled - 4.2.7 of xscale core developer's manual + */ + mov r0, #0 /* arm reference manual: "data SBZ" */ + mcr p15, 0, r0, c7, c7, 0 /* invalidate icache */ + mcr p15, 0, r0, c7, c10, 4 /* drain write buffer */ +#endif + /* ARMv4- don't know bx lr but the assembler fails to see that */
#ifdef __ARM_ARCH_4__

Dear Mike Dunn,
On the xscale, the icache must be invalidated and the write buffers drained after writing code over the data bus, even if the caches are disabled. After rebasing with the main git repository, u-boot began crashing in odd places on my pxa270 board (palmtreo680) after the code relocation routine ran. This patch fixes it. Cache coherency problems are often hit-and-miss (ha ha), and this latent problem didn't rear its ugly head until now. Tested on the pxa270.
Signed-off-by: Mike Dunn mikedunn@newsguy.com
I realize that __ARM_ARCH_5TE__ does not necessarily mean xscale.
We should introduce some CONFIG_PXA I guess.
Best regards, Marek Vasut

On 06/20/2013 06:56 AM, Marek Vasut wrote:
Dear Mike Dunn,
On the xscale, the icache must be invalidated and the write buffers drained after writing code over the data bus, even if the caches are disabled. After rebasing with the main git repository, u-boot began crashing in odd places on my pxa270 board (palmtreo680) after the code relocation routine ran. This patch fixes it. Cache coherency problems are often hit-and-miss (ha ha), and this latent problem didn't rear its ugly head until now. Tested on the pxa270.
Signed-off-by: Mike Dunn mikedunn@newsguy.com
I realize that __ARM_ARCH_5TE__ does not necessarily mean xscale.
We should introduce some CONFIG_PXA I guess.
I think CONFIG_XSCALE is more correct, because if I'm not mistaken more recent Marvell processors continue to use the 'pxa' nomenclature in their name but do not contain xscale cores.
I'll take a stab at a patch if you like. Maybe something like this macro in arch/arm/include/asm/arch-pxa/hardware.h
/* * Define CONFIG_CPU_MONAHANS in case some CPU of the PXA3xx family is selected. * PXA300/310/320 all have distinct register mappings in some cases, that's why * the exact CPU has to be selected. CONFIG_CPU_MONAHANS is a helper for common * drivers and compatibility glue with old source then. */ #ifndef CONFIG_CPU_MONAHANS #if defined(CONFIG_CPU_PXA300) || \ defined(CONFIG_CPU_PXA310) || \ defined(CONFIG_CPU_PXA320) #define CONFIG_CPU_MONAHANS #endif #endif
Thanks, Mike

Dear Mike Dunn,
On 06/20/2013 06:56 AM, Marek Vasut wrote:
Dear Mike Dunn,
On the xscale, the icache must be invalidated and the write buffers drained after writing code over the data bus, even if the caches are disabled. After rebasing with the main git repository, u-boot began crashing in odd places on my pxa270 board (palmtreo680) after the code relocation routine ran. This patch fixes it. Cache coherency problems are often hit-and-miss (ha ha), and this latent problem didn't rear its ugly head until now. Tested on the pxa270.
Signed-off-by: Mike Dunn mikedunn@newsguy.com
I realize that __ARM_ARCH_5TE__ does not necessarily mean xscale.
We should introduce some CONFIG_PXA I guess.
I think CONFIG_XSCALE is more correct, because if I'm not mistaken more recent Marvell processors continue to use the 'pxa' nomenclature in their name but do not contain xscale cores.
I suspect we might end up checking all the datasheets (PXA2xx, PXA3xx and PXA9xx) to see which CPUs still need this fixup.
Best regards, Marek Vasut
participants (2)
-
Marek Vasut
-
Mike Dunn