[U-Boot] memcpy/memset on arm64 platforms

Hi All, I am trying to enable CONFIG_USE_ARCH_MEMSET/MEMCPY on arm64 platforms and realized that there is no arm64-specific memcpy available in U-Boot. So I tried porting the arm64-specific memcpy from the kernel[1]. Memcpy stopped working after that, and I observed that the system hangs if the destination address is unaligned.
After doing a bit more research, I understood that unaligned accesses to device or strongly ordered memory will fail, and that the memory system (even normal RAM) behaves like strongly ordered memory when the MMU is disabled[2]. In U-Boot the MMU is enabled very late, after relocation, and SPL doesn't enable the MMU at all.
Before I proceed any further, I wanted to hear from others whether anyone has already tried this and has a working solution. If not, should we update the kernel memcpy to handle an unaligned destination address and use it in U-Boot?
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch...
[2] https://community.arm.com/processors/f/discussions/7557/when-and-where-will-...
Thanks and regards, Lokesh

Hi All, Any help is much appreciated.
Thanks and regards, Lokesh
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
https://lists.denx.de/listinfo/u-boot

On Mon, Jan 07, 2019 at 02:43:05PM +0530, Lokesh Vutla wrote:
> Hi All, I am trying to enable CONFIG_USE_ARCH_MEMSET/MEMCPY on arm64 platforms and realized that there is no arm64-specific memcpy available in U-Boot. So I tried porting the arm64-specific memcpy from the kernel[1]. Memcpy stopped working after that, and I observed that the system hangs if the destination address is unaligned.
> After doing a bit more research, I understood that unaligned accesses to device or strongly ordered memory will fail, and that the memory system (even normal RAM) behaves like strongly ordered memory when the MMU is disabled[2]. In U-Boot the MMU is enabled very late, after relocation, and SPL doesn't enable the MMU at all.
> Before I proceed any further, I wanted to hear from others whether anyone has already tried this and has a working solution. If not, should we update the kernel memcpy to handle an unaligned destination address and use it in U-Boot?
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch...
> [2] https://community.arm.com/processors/f/discussions/7557/when-and-where-will-...
I suspect we should, yes. Adding in some custodians of other arm64 platforms for comment.

On 1/11/19 5:13 PM, Tom Rini wrote:
> On Mon, Jan 07, 2019 at 02:43:05PM +0530, Lokesh Vutla wrote:
>> Hi All, I am trying to enable CONFIG_USE_ARCH_MEMSET/MEMCPY on arm64 platforms and realized that there is no arm64-specific memcpy available in U-Boot. So I tried porting the arm64-specific memcpy from the kernel[1]. Memcpy stopped working after that, and I observed that the system hangs if the destination address is unaligned.
Why do you want a custom memcpy() implementation in the first place?
>> After doing a bit more research, I understood that unaligned accesses to device or strongly ordered memory will fail, and that the memory system (even normal RAM) behaves like strongly ordered memory when the MMU is disabled[2]. In U-Boot the MMU is enabled very late, after relocation, and SPL doesn't enable the MMU at all.
The ARM64 MMU table setup seems to be a mess; it could indeed use some improvement.
>> Before I proceed any further, I wanted to hear from others whether anyone has already tried this and has a working solution. If not, should we update the kernel memcpy to handle an unaligned destination address and use it in U-Boot?
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch...
>> [2] https://community.arm.com/processors/f/discussions/7557/when-and-where-will-...
> I suspect we should, yes. Adding in some custodians of other arm64 platforms for comment.
Does the unaligned access exception happen only before the MMU is enabled, or even after?

On Fri, Jan 11, 2019 at 9:43 PM Tom Rini trini@konsulko.com wrote:
> On Mon, Jan 07, 2019 at 02:43:05PM +0530, Lokesh Vutla wrote:
>> Hi All, I am trying to enable CONFIG_USE_ARCH_MEMSET/MEMCPY on arm64 platforms and realized that there is no arm64-specific memcpy available in U-Boot. So I tried porting the arm64-specific memcpy from the kernel[1]. Memcpy stopped working after that, and I observed that the system hangs if the destination address is unaligned.
>> After doing a bit more research, I understood that unaligned accesses to device or strongly ordered memory will fail, and that the memory system (even normal RAM) behaves like strongly ordered memory when the MMU is disabled[2]. In U-Boot the MMU is enabled very late, after relocation, and SPL doesn't enable the MMU at all.
>> Before I proceed any further, I wanted to hear from others whether anyone has already tried this and has a working solution. If not, should we update the kernel memcpy to handle an unaligned destination address and use it in U-Boot?
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch...
>> [2] https://community.arm.com/processors/f/discussions/7557/when-and-where-will-...
> I suspect we should, yes. Adding in some custodians of other arm64 platforms for comment.
Right now we still use the generic memcpy from lib/string.c; we don't enable ARCH_MEMCPY for arm64. Indeed, there is no code for doing that, if I'm not wrong.

On 07.01.2019, at 10:13, Lokesh Vutla lokeshvutla@ti.com wrote:
> Hi All, I am trying to enable CONFIG_USE_ARCH_MEMSET/MEMCPY on arm64 platforms and realized that there is no arm64-specific memcpy available in U-Boot. So I tried porting the arm64-specific memcpy from the kernel[1]. Memcpy stopped working after that, and I observed that the system hangs if the destination address is unaligned.
> After doing a bit more research, I understood that unaligned accesses to device or strongly ordered memory will fail, and that the memory system (even normal RAM) behaves like strongly ordered memory when the MMU is disabled[2]. In U-Boot the MMU is enabled very late, after relocation, and SPL doesn't enable the MMU at all.
ARMv8 does not allow unaligned accesses in privileged modes (i.e. everywhere except EL0).
> Before I proceed any further, I wanted to hear from others whether anyone has already tried this and has a working solution. If not, should we update the kernel memcpy to handle an unaligned destination address and use it in U-Boot?
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch...
> [2] https://community.arm.com/processors/f/discussions/7557/when-and-where-will-...
> Thanks and regards, Lokesh

On 11.01.19 17:44, Philipp Tomsich wrote:
> On 07.01.2019, at 10:13, Lokesh Vutla lokeshvutla@ti.com wrote:
>> Hi All, I am trying to enable CONFIG_USE_ARCH_MEMSET/MEMCPY on arm64 platforms and realized that there is no arm64-specific memcpy available in U-Boot. So I tried porting the arm64-specific memcpy from the kernel[1]. Memcpy stopped working after that, and I observed that the system hangs if the destination address is unaligned.
>> After doing a bit more research, I understood that unaligned accesses to device or strongly ordered memory will fail, and that the memory system (even normal RAM) behaves like strongly ordered memory when the MMU is disabled[2]. In U-Boot the MMU is enabled very late, after relocation, and SPL doesn't enable the MMU at all.
> ARMv8 does not allow unaligned accesses in privileged modes (i.e. everywhere except EL0).
Do you have a reference to this in the spec somewhere?
So far, I always assumed that the only constraint on AArch64 to have workable unaligned accesses is to enable caching and have the maps be declared as normal (for which you need to enable the MMU and provide PTEs that do so).
>> Before I proceed any further, I wanted to hear from others whether anyone has already tried this and has a working solution. If not, should we update the kernel memcpy to handle an unaligned destination address and use it in U-Boot?
Just make sure you're only doing aligned accesses and you're fine? Most memcpy code checks start and end for alignment, does aligned operations on those bits and then bigger chunks for the middle piece. I would assume you will find something like that in glibc, for example.
The reason the kernel gets away without this is that it knows that hardware unaligned access is always available, because it controls the page tables and it only ever runs from normal RAM.
In SPL, however, we may run from XIP or SRAM or whatever. In that case, even if you wanted to, you could not declare memory as normal and thus could not rely on hardware unaligned fixups.
Alex
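As a rough illustration of the head/middle/tail approach described above, here is a minimal C sketch. It assumes 8-byte words; `copy_aligned`, `u64_alias` and `copy_aligned_selftest` are made-up names for this example, not existing U-Boot, kernel or glibc code, and the `may_alias` attribute is a GCC/Clang extension. When source and destination are misaligned relative to each other, the sketch simply falls back to byte copies, where an optimized routine would merge aligned loads with shifts instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: a 64-bit type that may alias any other object. */
typedef uint64_t __attribute__((may_alias)) u64_alias;

/* Hypothetical helper, not an existing U-Boot or kernel function. */
static void *copy_aligned(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	/* Head: byte copies until the destination is 8-byte aligned. */
	while (n > 0 && ((uintptr_t)d & 7) != 0) {
		*d++ = *s++;
		n--;
	}

	/* Middle: 64-bit copies, only if the source ended up aligned too. */
	if (((uintptr_t)s & 7) == 0) {
		while (n >= 8) {
			*(u64_alias *)d = *(const u64_alias *)s;
			d += 8;
			s += 8;
			n -= 8;
		}
	}

	/* Tail (or everything left, if the source is misaligned): bytes. */
	while (n > 0) {
		*d++ = *s++;
		n--;
	}

	return dest;
}

/* Exercise every combination of source and destination offset 0..7. */
static void copy_aligned_selftest(void)
{
	unsigned char src[64], dst[64];

	for (size_t so = 0; so < 8; so++) {
		for (size_t dof = 0; dof < 8; dof++) {
			for (size_t i = 0; i < sizeof(src); i++) {
				src[i] = (unsigned char)(i + 1);
				dst[i] = 0;
			}
			copy_aligned(dst + dof, src + so, 40);
			for (size_t i = 0; i < 40; i++)
				assert(dst[dof + i] == src[so + i]);
			/* Bytes outside the requested range stay untouched. */
			assert(dst[dof + 40] == 0);
			if (dof > 0)
				assert(dst[dof - 1] == 0);
		}
	}
}
```

In a real bootloader build the compiler would also have to be kept from merging the byte loops back into wider, possibly unaligned accesses, for example with GCC's AArch64 `-mstrict-align` option.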

On 11.01.2019, at 23:28, Alexander Graf agraf@suse.de wrote:
> On 11.01.19 17:44, Philipp Tomsich wrote:
>> On 07.01.2019, at 10:13, Lokesh Vutla lokeshvutla@ti.com wrote:
>>> Hi All, I am trying to enable CONFIG_USE_ARCH_MEMSET/MEMCPY on arm64 platforms and realized that there is no arm64-specific memcpy available in U-Boot. So I tried porting the arm64-specific memcpy from the kernel[1]. Memcpy stopped working after that, and I observed that the system hangs if the destination address is unaligned.
>>> After doing a bit more research, I understood that unaligned accesses to device or strongly ordered memory will fail, and that the memory system (even normal RAM) behaves like strongly ordered memory when the MMU is disabled[2]. In U-Boot the MMU is enabled very late, after relocation, and SPL doesn't enable the MMU at all.
>> ARMv8 does not allow unaligned accesses in privileged modes (i.e. everywhere except EL0).
> Do you have a reference to this in the spec somewhere?
> So far, I always assumed that the only constraint on AArch64 to have workable unaligned accesses is to enable caching and have the maps be declared as normal (for which you need to enable the MMU and provide PTEs that do so).
“B2.5.2 Alignment of data accesses” in the AArch64 Architecture Manual has the full rules.
For non-normal memory, unaligned accesses will almost always spell trouble. They are specifically forbidden for device memory.
And yes, I oversimplified a bit: there’s the SCTLR_ELx register that controls whether alignment checks are performed. Unfortunately, this field resets to “a value that is architecturally UNKNOWN”, so unless explicitly enabled these will be disallowed… which is the default on a number of implementations.
There’s some added ugliness in the specification of alignment faults if the faulting access crosses cache lines (I don’t want to go dig for the specific reference in the AArch64 ARM).
Unless we have trap handlers installed, the alignment fault will stop us with a data abort…
So from a bootloader perspective it will probably be best to stay away from unaligned accesses, unless we want to deal with the complexity of setting up SCTLR fully and differentiating between normal and non-normal memory for things like a memcpy.
>>> Before I proceed any further, I wanted to hear from others whether anyone has already tried this and has a working solution. If not, should we update the kernel memcpy to handle an unaligned destination address and use it in U-Boot?
> Just make sure you're only doing aligned accesses and you're fine? Most memcpy code checks start and end for alignment, does aligned operations on those bits and then bigger chunks for the middle piece. I would assume you will find something like that in glibc, for example.
> The reason the kernel gets away without this is that it knows that hardware unaligned access is always available, because it controls the page tables and it only ever runs from normal RAM.
> In SPL, however, we may run from XIP or SRAM or whatever. In that case, even if you wanted to, you could not declare memory as normal and thus could not rely on hardware unaligned fixups.
> Alex
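The SCTLR_ELx condition discussed in this message can be sketched as a small predicate. The bit positions (M is bit 0, A is bit 1) follow the Arm architecture's definition of SCTLR_ELx; the helper name and the idea of passing in a raw register value are assumptions made so that the example compiles on any host. It covers ordinary loads and stores only, and a `true` result still requires the accessed region to be mapped as Normal memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_ELX_M (UINT64_C(1) << 0) /* MMU enable */
#define SCTLR_ELX_A (UINT64_C(1) << 1) /* alignment check enable */

static_assert(SCTLR_ELX_A == 2, "A is bit 1 of SCTLR_ELx");

/*
 * Hypothetical helper: given a raw SCTLR_ELx value, say whether unaligned
 * loads and stores to Normal memory could be expected to work.
 * On an AArch64 target the value could be read with, for example,
 * "mrs x0, sctlr_el1" (or the register of the current exception level).
 */
static bool unaligned_ok_for_normal_memory(uint64_t sctlr)
{
	/* MMU off: data accesses are treated as Device memory. */
	if ((sctlr & SCTLR_ELX_M) == 0)
		return false;

	/* Alignment checking on: every unaligned access faults. */
	if ((sctlr & SCTLR_ELX_A) != 0)
		return false;

	return true;
}
```

Because the A bit resets to an architecturally UNKNOWN value, early boot code cannot assume a `true` result without programming the register first, which fits the advice above to keep a bootloader memcpy on aligned accesses only.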
participants (6)
- Alexander Graf
- Jagan Teki
- Lokesh Vutla
- Marek Vasut
- Philipp Tomsich
- Tom Rini