[RFC] mmc: Remove alignment hole for cmdidx in struct mmc_cmd

The alignment hole caused by cmdidx in struct mmc_cmd cause strange issues together with the peephole2 optimization on Amlogic SoCs. Following was observed while working on SPL support for Amlogic SoCs.
sd_get_capabilities() normally issue a CMD55 followed by a CMD51. However, on at least Amlogic S905 (Cortex-A53) and S905X3 (Cortex-A55), CMD55 was instead followed by CMD8 (and a few reties) in SPL.
Code from the call site:
cmd.cmdidx = SD_CMD_APP_SEND_SCR; // 51 ... data.blocksize = 8; ... err = mmc_send_cmd_retry(mmc, &cmd, &data, 3);
Running the code with MMC_TRACE enabled shows:
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:8 ARG 0x00000000 RET -110
Removing the alignment hole by changing cmdidx from ushort to uint or building with -fno-peephole2 flag seem to resolve this issue.
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:51 ARG 0x00000000 MMC_RSP_R1,5,6,7 0x00000920
Same issue was observed building U-Boot with gcc 8-13. Please advise on how to best work around this possible gcc optimization bug.
Signed-off-by: Jonas Karlman jonas@kwiboo.se --- arch/arm/cpu/armv8/config.mk | 2 ++ include/mmc.h | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/cpu/armv8/config.mk b/arch/arm/cpu/armv8/config.mk index 4d74b2a533e0..7177dcd7c73b 100644 --- a/arch/arm/cpu/armv8/config.mk +++ b/arch/arm/cpu/armv8/config.mk @@ -7,6 +7,8 @@ PLATFORM_RELFLAGS += $(call cc-option,-mbranch-protection=none) PF_NO_UNALIGNED := $(call cc-option, -mstrict-align) PLATFORM_CPPFLAGS += $(PF_NO_UNALIGNED)
+PLATFORM_CPPFLAGS += $(call cc-option,-fno-peephole2) + EFI_LDS := elf_aarch64_efi.lds EFI_CRT0 := crt0_aarch64_efi.o EFI_RELOC := reloc_aarch64_efi.o diff --git a/include/mmc.h b/include/mmc.h index 9aef31ea5deb..7b3868bb5664 100644 --- a/include/mmc.h +++ b/include/mmc.h @@ -413,7 +413,7 @@ struct mmc_cid { };
struct mmc_cmd { - ushort cmdidx; + uint cmdidx; uint resp_type; uint cmdarg; uint response[4];

On Sat Sep 30, 2023 at 1:06 AM CEST, Jonas Karlman wrote:
The alignment hole caused by cmdidx in struct mmc_cmd cause strange issues together with the peephole2 optimization on Amlogic SoCs. Following was observed while working on SPL support for Amlogic SoCs.
sd_get_capabilities() normally issue a CMD55 followed by a CMD51. However, on at least Amlogic S905 (Cortex-A53) and S905X3 (Cortex-A55), CMD55 was instead followed by CMD8 (and a few reties) in SPL.
Code from the call site:
cmd.cmdidx = SD_CMD_APP_SEND_SCR; // 51 ... data.blocksize = 8; ... err = mmc_send_cmd_retry(mmc, &cmd, &data, 3);
Running the code with MMC_TRACE enabled shows:
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:8 ARG 0x00000000 RET -110
Removing the alignment hole by changing cmdidx from ushort to uint or building with -fno-peephole2 flag seem to resolve this issue.
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:51 ARG 0x00000000 MMC_RSP_R1,5,6,7 0x00000920
Same issue was observed building U-Boot with gcc 8-13. Please advise on how to best work around this possible gcc optimization bug.
Signed-off-by: Jonas Karlman jonas@kwiboo.se
I'll try reproducing this. I'm using this: https://git.vitali64.duckdns.org/misc/u-boot-kii-pro.git (branch: wip/spl, known to boot on S905/GXBB)
For reference, I'm using this version of aarch64 gcc (from the Arch Linux repos):
$ aarch64-linux-gnu-gcc --version aarch64-linux-gnu-gcc (GCC) 13.1.0 Copyright (C) 2023 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
And this version of (host) gcc:
$ gcc --version gcc (GCC) 13.1.1 20230429 Copyright (C) 2023 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Cheers.

On Fri, 29 Sept 2023 at 17:07, Jonas Karlman jonas@kwiboo.se wrote:
The alignment hole caused by cmdidx in struct mmc_cmd cause strange issues together with the peephole2 optimization on Amlogic SoCs. Following was observed while working on SPL support for Amlogic SoCs.
sd_get_capabilities() normally issue a CMD55 followed by a CMD51. However, on at least Amlogic S905 (Cortex-A53) and S905X3 (Cortex-A55), CMD55 was instead followed by CMD8 (and a few reties) in SPL.
Code from the call site:
cmd.cmdidx = SD_CMD_APP_SEND_SCR; // 51 ... data.blocksize = 8; ... err = mmc_send_cmd_retry(mmc, &cmd, &data, 3);
Running the code with MMC_TRACE enabled shows:
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:8 ARG 0x00000000 RET -110
Removing the alignment hole by changing cmdidx from ushort to uint or building with -fno-peephole2 flag seem to resolve this issue.
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:51 ARG 0x00000000 MMC_RSP_R1,5,6,7 0x00000920
Same issue was observed building U-Boot with gcc 8-13. Please advise on how to best work around this possible gcc optimization bug.
Signed-off-by: Jonas Karlman jonas@kwiboo.se
arch/arm/cpu/armv8/config.mk | 2 ++ include/mmc.h | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/cpu/armv8/config.mk b/arch/arm/cpu/armv8/config.mk index 4d74b2a533e0..7177dcd7c73b 100644 --- a/arch/arm/cpu/armv8/config.mk +++ b/arch/arm/cpu/armv8/config.mk @@ -7,6 +7,8 @@ PLATFORM_RELFLAGS += $(call cc-option,-mbranch-protection=none) PF_NO_UNALIGNED := $(call cc-option, -mstrict-align) PLATFORM_CPPFLAGS += $(PF_NO_UNALIGNED)
+PLATFORM_CPPFLAGS += $(call cc-option,-fno-peephole2)
Eek really?
EFI_LDS := elf_aarch64_efi.lds EFI_CRT0 := crt0_aarch64_efi.o EFI_RELOC := reloc_aarch64_efi.o diff --git a/include/mmc.h b/include/mmc.h index 9aef31ea5deb..7b3868bb5664 100644 --- a/include/mmc.h +++ b/include/mmc.h @@ -413,7 +413,7 @@ struct mmc_cid { };
struct mmc_cmd {
ushort cmdidx;
uint cmdidx;
This bit looks OK.
uint resp_type; uint cmdarg; uint response[4];
-- 2.42.0
Regards, Simon

On 2023-10-02 20:56, Simon Glass wrote:
On Fri, 29 Sept 2023 at 17:07, Jonas Karlman jonas@kwiboo.se wrote:
The alignment hole caused by cmdidx in struct mmc_cmd cause strange issues together with the peephole2 optimization on Amlogic SoCs. Following was observed while working on SPL support for Amlogic SoCs.
sd_get_capabilities() normally issue a CMD55 followed by a CMD51. However, on at least Amlogic S905 (Cortex-A53) and S905X3 (Cortex-A55), CMD55 was instead followed by CMD8 (and a few reties) in SPL.
Code from the call site:
cmd.cmdidx = SD_CMD_APP_SEND_SCR; // 51 ... data.blocksize = 8; ... err = mmc_send_cmd_retry(mmc, &cmd, &data, 3);
Running the code with MMC_TRACE enabled shows:
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:8 ARG 0x00000000 RET -110
Removing the alignment hole by changing cmdidx from ushort to uint or building with -fno-peephole2 flag seem to resolve this issue.
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:51 ARG 0x00000000 MMC_RSP_R1,5,6,7 0x00000920
Same issue was observed building U-Boot with gcc 8-13. Please advise on how to best work around this possible gcc optimization bug.
Signed-off-by: Jonas Karlman jonas@kwiboo.se
arch/arm/cpu/armv8/config.mk | 2 ++ include/mmc.h | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/cpu/armv8/config.mk b/arch/arm/cpu/armv8/config.mk index 4d74b2a533e0..7177dcd7c73b 100644 --- a/arch/arm/cpu/armv8/config.mk +++ b/arch/arm/cpu/armv8/config.mk @@ -7,6 +7,8 @@ PLATFORM_RELFLAGS += $(call cc-option,-mbranch-protection=none) PF_NO_UNALIGNED := $(call cc-option, -mstrict-align) PLATFORM_CPPFLAGS += $(PF_NO_UNALIGNED)
+PLATFORM_CPPFLAGS += $(call cc-option,-fno-peephole2)
Eek really?
As mentioned the issue could be resolved with changing cmdidx from ushort to uint or disabling the peephole2 optimization.
For the purpose of trying to get more feedback on the issue I included both changes that would resolve my issue.
Also if there is one struct affected, there could be other similar structs affected by a similar issue.
I do not really know if this could be a cpu errata, compiler bug or something else, including some assembly from a non-working and working case (peephole2 optimization enabled/disabled).
Code generated with ushort and peephole2 enabled (not working):
60 06 80 52 mov w0,#0x33 // 51 f5 3f 06 91 add x21,sp,#0x18f e0 23 01 79 strh w0,[sp, #cmd] e0 03 00 b2 mov x0,#0x100000001 b5 e6 7a 92 and x21,x21,#-0x40 f5 03 06 a9 stp x21,x0,[sp, #data] 00 01 80 52 mov w0,#0x8 e2 83 01 91 add x2,sp,#0x60 e1 43 02 91 add x1,sp,#0x90 63 00 80 52 mov w3,#0x3 e0 73 00 b9 str w0,[sp, #blocksize] e0 03 13 aa mov x0,x19 f6 ff 12 29 stp w22,wzr,[sp, #resp_type] 50 fd ff 97 bl mmc_send_cmd_retry
Code generated with ushort and peephole2 disabled (working):
60 06 80 52 mov w0,#0x33 // 51 e0 23 01 79 strh w0,[sp, #cmd] e0 03 00 b2 mov x0,#0x100000001 f5 3f 06 91 add x21,sp,#0x18f e0 37 00 f9 str x0,[sp, #local_168] 00 01 80 52 mov w0,#0x8 b5 e6 7a 92 and x21,x21,#-0x40 e2 83 01 91 add x2,sp,#0x60 e1 43 02 91 add x1,sp,#0x90 63 00 80 52 mov w3,#0x3 f5 33 00 f9 str x21,[sp, #data] e0 73 00 b9 str w0,[sp, #blocksize] e0 03 13 aa mov x0,x19 f6 97 00 b9 str w22,[sp, #resp_type] ff 9b 00 b9 str wzr,[sp, #cmdarg] 4a fd ff 97 bl mmc_send_cmd_retry
Hoping someone can see something strange I cannot.
Regards, Jonas
EFI_LDS := elf_aarch64_efi.lds EFI_CRT0 := crt0_aarch64_efi.o EFI_RELOC := reloc_aarch64_efi.o diff --git a/include/mmc.h b/include/mmc.h index 9aef31ea5deb..7b3868bb5664 100644 --- a/include/mmc.h +++ b/include/mmc.h @@ -413,7 +413,7 @@ struct mmc_cid { };
struct mmc_cmd {
ushort cmdidx;
uint cmdidx;
This bit looks OK.
uint resp_type; uint cmdarg; uint response[4];
-- 2.42.0
Regards, Simon

Hi Jonas,
On Mon, 2 Oct 2023 at 13:29, Jonas Karlman jonas@kwiboo.se wrote:
On 2023-10-02 20:56, Simon Glass wrote:
On Fri, 29 Sept 2023 at 17:07, Jonas Karlman jonas@kwiboo.se wrote:
The alignment hole caused by cmdidx in struct mmc_cmd cause strange issues together with the peephole2 optimization on Amlogic SoCs. Following was observed while working on SPL support for Amlogic SoCs.
sd_get_capabilities() normally issue a CMD55 followed by a CMD51. However, on at least Amlogic S905 (Cortex-A53) and S905X3 (Cortex-A55), CMD55 was instead followed by CMD8 (and a few reties) in SPL.
Code from the call site:
cmd.cmdidx = SD_CMD_APP_SEND_SCR; // 51 ... data.blocksize = 8; ... err = mmc_send_cmd_retry(mmc, &cmd, &data, 3);
Running the code with MMC_TRACE enabled shows:
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:8 ARG 0x00000000 RET -110
Removing the alignment hole by changing cmdidx from ushort to uint or building with -fno-peephole2 flag seem to resolve this issue.
CMD_SEND:55 ARG 0x50480000 MMC_RSP_R1,5,6,7 0x00000920 CMD_SEND:51 ARG 0x00000000 MMC_RSP_R1,5,6,7 0x00000920
Same issue was observed building U-Boot with gcc 8-13. Please advise on how to best work around this possible gcc optimization bug.
Signed-off-by: Jonas Karlman jonas@kwiboo.se
arch/arm/cpu/armv8/config.mk | 2 ++ include/mmc.h | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/cpu/armv8/config.mk b/arch/arm/cpu/armv8/config.mk index 4d74b2a533e0..7177dcd7c73b 100644 --- a/arch/arm/cpu/armv8/config.mk +++ b/arch/arm/cpu/armv8/config.mk @@ -7,6 +7,8 @@ PLATFORM_RELFLAGS += $(call cc-option,-mbranch-protection=none) PF_NO_UNALIGNED := $(call cc-option, -mstrict-align) PLATFORM_CPPFLAGS += $(PF_NO_UNALIGNED)
+PLATFORM_CPPFLAGS += $(call cc-option,-fno-peephole2)
Eek really?
As mentioned the issue could be resolved with changing cmdidx from ushort to uint or disabling the peephole2 optimization.
For the purpose of trying to get more feedback on the issue I included both changes that would resolve my issue.
Also if there is one struct affected, there could be other similar structs affected by a similar issue.
I do not really know if this could be a cpu errata, compiler bug or something else, including some assembly from a non-working and working case (peephole2 optimization enabled/disabled).
Code generated with ushort and peephole2 enabled (not working):
60 06 80 52 mov w0,#0x33 // 51 f5 3f 06 91 add x21,sp,#0x18f e0 23 01 79 strh w0,[sp, #cmd] e0 03 00 b2 mov x0,#0x100000001 b5 e6 7a 92 and x21,x21,#-0x40 f5 03 06 a9 stp x21,x0,[sp, #data] 00 01 80 52 mov w0,#0x8 e2 83 01 91 add x2,sp,#0x60 e1 43 02 91 add x1,sp,#0x90 63 00 80 52 mov w3,#0x3 e0 73 00 b9 str w0,[sp, #blocksize] e0 03 13 aa mov x0,x19 f6 ff 12 29 stp w22,wzr,[sp, #resp_type] 50 fd ff 97 bl mmc_send_cmd_retry
Code generated with ushort and peephole2 disabled (working):
60 06 80 52 mov w0,#0x33 // 51 e0 23 01 79 strh w0,[sp, #cmd] e0 03 00 b2 mov x0,#0x100000001 f5 3f 06 91 add x21,sp,#0x18f e0 37 00 f9 str x0,[sp, #local_168] 00 01 80 52 mov w0,#0x8 b5 e6 7a 92 and x21,x21,#-0x40 e2 83 01 91 add x2,sp,#0x60 e1 43 02 91 add x1,sp,#0x90 63 00 80 52 mov w3,#0x3 f5 33 00 f9 str x21,[sp, #data] e0 73 00 b9 str w0,[sp, #blocksize] e0 03 13 aa mov x0,x19 f6 97 00 b9 str w22,[sp, #resp_type] ff 9b 00 b9 str wzr,[sp, #cmdarg] 4a fd ff 97 bl mmc_send_cmd_retry
Hoping someone can see something strange I cannot.
I used to be able to see what compilers were doing, but sometime over the past 10 years they got a lot smarter and I didn't.
I don't see anything wrong with the C code. The setting of cmd.resp_type to the same value could perhaps be fooling the compiler?
Regards, Simon
participants (3)
-
Ferass El Hafidi
-
Jonas Karlman
-
Simon Glass