[PATCH 0/3] Revert HAFDBS changes

As discussed this series reverts the HAFDBS changes that caused an issue on AC5/AC5X. I think there are some improvements that can be made to the initial memory map for the AC5/AC5X but so far nothing I've found makes it compatible with the HAFDBS changes.
Chris Packham (3): Revert "armv8: enable HAFDBS for other ELx when FEAT_HAFDBS is present" Revert "arm64: Use level-2 for largest block mappings when FEAT_HAFDBS is present" Revert "arm64: Use FEAT_HAFDBS to track dirty pages when available"
arch/arm/cpu/armv8/cache_v8.c | 30 +++--------------------------- arch/arm/include/asm/armv8/mmu.h | 20 ++++---------------- arch/arm/include/asm/global_data.h | 2 -- 3 files changed, 7 insertions(+), 45 deletions(-)

This reverts commit c1da6fdb5c239b432440721772d993e63cfdeb20. This is part of a series trying to make use of the arm64 hardware features for tracking dirty pages. Unfortunately this series causes problems for the AC5/AC5X SoCs. Having exhausted other options the consensus seems to be reverting this series is the best course of action.
Signed-off-by: Chris Packham judge.packham@gmail.com ---
arch/arm/cpu/armv8/cache_v8.c | 6 +----- arch/arm/include/asm/armv8/mmu.h | 10 ++-------- 2 files changed, 3 insertions(+), 13 deletions(-)
diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c index cb1131a0480e..4c6a1b1d6c5e 100644 --- a/arch/arm/cpu/armv8/cache_v8.c +++ b/arch/arm/cpu/armv8/cache_v8.c @@ -94,15 +94,11 @@ u64 get_tcr(u64 *pips, u64 *pva_bits) if (el == 1) { tcr = TCR_EL1_RSVD | (ips << 32) | TCR_EPD1_DISABLE; if (gd->arch.has_hafdbs) - tcr |= TCR_EL1_HA | TCR_EL1_HD; + tcr |= TCR_HA | TCR_HD; } else if (el == 2) { tcr = TCR_EL2_RSVD | (ips << 16); - if (gd->arch.has_hafdbs) - tcr |= TCR_EL2_HA | TCR_EL2_HD; } else { tcr = TCR_EL3_RSVD | (ips << 16); - if (gd->arch.has_hafdbs) - tcr |= TCR_EL3_HA | TCR_EL3_HD; }
/* PTWs cacheable, inner/outer WBWA and inner shareable */ diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index 19a9e112a434..98a27db3166b 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -102,14 +102,8 @@ #define TCR_TG0_16K (2 << 14) #define TCR_EPD1_DISABLE (1 << 23)
-#define TCR_EL1_HA BIT(39) -#define TCR_EL1_HD BIT(40) - -#define TCR_EL2_HA BIT(21) -#define TCR_EL2_HD BIT(22) - -#define TCR_EL3_HA BIT(21) -#define TCR_EL3_HD BIT(22) +#define TCR_HA BIT(39) +#define TCR_HD BIT(40)
#define TCR_EL1_RSVD (1U << 31) #define TCR_EL2_RSVD (1U << 31 | 1 << 23)

On Fri, Oct 27, 2023 at 01:23:52PM +1300, Chris Packham wrote:
This reverts commit c1da6fdb5c239b432440721772d993e63cfdeb20. This is part of a series trying to make use of the arm64 hardware features for tracking dirty pages. Unfortunately this series causes problems for the AC5/AC5X SoCs. Having exhausted other options the consensus seems to be reverting this series is the best course of action.
Signed-off-by: Chris Packham judge.packham@gmail.com
Applied to u-boot/master, thanks!

This reverts commit 836b8d4b205d2175b57cb9ef271e638b0c116e89. This is part of a series trying to make use of the arm64 hardware features for tracking dirty pages. Unfortunately this series causes problems for the AC5/AC5X SoCs. Having exhausted other options the consensus seems to be reverting this series is the best course of action.
Signed-off-by: Chris Packham judge.packham@gmail.com ---
arch/arm/cpu/armv8/cache_v8.c | 14 ++++---------- arch/arm/include/asm/global_data.h | 1 - 2 files changed, 4 insertions(+), 11 deletions(-)
diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c index 4c6a1b1d6c5e..4760064ee18f 100644 --- a/arch/arm/cpu/armv8/cache_v8.c +++ b/arch/arm/cpu/armv8/cache_v8.c @@ -314,7 +314,7 @@ static void map_range(u64 virt, u64 phys, u64 size, int level, for (i = idx; size; i++) { u64 next_size, *next_table;
- if (level >= gd->arch.first_block_level && + if (level >= 1 && size >= map_size && !(virt & (map_size - 1))) { if (level == 3) table[i] = phys | attrs | PTE_TYPE_PAGE; @@ -353,9 +353,6 @@ static void add_map(struct mm_region *map) if (va_bits < 39) level = 1;
- if (!gd->arch.first_block_level) - gd->arch.first_block_level = 1; - if (gd->arch.has_hafdbs) attrs |= PTE_DBM | PTE_RDONLY;
@@ -372,7 +369,7 @@ static void count_range(u64 virt, u64 size, int level, int *cntp) for (i = idx; size; i++) { u64 next_size;
- if (level >= gd->arch.first_block_level && + if (level >= 1 && size >= map_size && !(virt & (map_size - 1))) { virt += map_size; size -= map_size; @@ -413,13 +410,10 @@ __weak u64 get_page_table_size(void) u64 size, mmfr1;
asm volatile("mrs %0, id_aa64mmfr1_el1" : "=r" (mmfr1)); - if ((mmfr1 & 0xf) == 2) { + if ((mmfr1 & 0xf) == 2) gd->arch.has_hafdbs = true; - gd->arch.first_block_level = 2; - } else { + else gd->arch.has_hafdbs = false; - gd->arch.first_block_level = 1; - }
/* Account for all page tables we would need to cover our memory map */ size = one_pt * count_ranges(); diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h index b385bae02669..1325b0644248 100644 --- a/arch/arm/include/asm/global_data.h +++ b/arch/arm/include/asm/global_data.h @@ -52,7 +52,6 @@ struct arch_global_data { #if defined(CONFIG_ARM64) unsigned long tlb_fillptr; unsigned long tlb_emerg; - unsigned int first_block_level; bool has_hafdbs; #endif #endif

On Fri, Oct 27, 2023 at 01:23:53PM +1300, Chris Packham wrote:
This reverts commit 836b8d4b205d2175b57cb9ef271e638b0c116e89. This is part of a series trying to make use of the arm64 hardware features for tracking dirty pages. Unfortunately this series causes problems for the AC5/AC5X SoCs. Having exhausted other options the consensus seems to be reverting this series is the best course of action.
Signed-off-by: Chris Packham judge.packham@gmail.com
Applied to u-boot/master, thanks!

This reverts commit 6cdf6b7a340db4ddd008516181de7e08e3f8c213. This is part of a series trying to make use of the arm64 hardware features for tracking dirty pages. Unfortunately this series causes problems for the AC5/AC5X SoCs. Having exhausted other options the consensus seems to be reverting this series is the best course of action.
Signed-off-by: Chris Packham judge.packham@gmail.com ---
arch/arm/cpu/armv8/cache_v8.c | 16 +--------------- arch/arm/include/asm/armv8/mmu.h | 14 ++++---------- arch/arm/include/asm/global_data.h | 1 - 3 files changed, 5 insertions(+), 26 deletions(-)
diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c index 4760064ee18f..697334086fdc 100644 --- a/arch/arm/cpu/armv8/cache_v8.c +++ b/arch/arm/cpu/armv8/cache_v8.c @@ -93,8 +93,6 @@ u64 get_tcr(u64 *pips, u64 *pva_bits)
if (el == 1) { tcr = TCR_EL1_RSVD | (ips << 32) | TCR_EPD1_DISABLE; - if (gd->arch.has_hafdbs) - tcr |= TCR_HA | TCR_HD; } else if (el == 2) { tcr = TCR_EL2_RSVD | (ips << 16); } else { @@ -202,9 +200,6 @@ static void __cmo_on_leaves(void (*cmo_fn)(unsigned long, unsigned long), attrs != PTE_BLOCK_MEMTYPE(MT_NORMAL_NC)) continue;
- if (gd->arch.has_hafdbs && (pte & (PTE_RDONLY | PTE_DBM)) != PTE_DBM) - continue; - end = va + BIT(level2shift(level)) - 1;
/* No intersection with RAM? */ @@ -353,9 +348,6 @@ static void add_map(struct mm_region *map) if (va_bits < 39) level = 1;
- if (gd->arch.has_hafdbs) - attrs |= PTE_DBM | PTE_RDONLY; - map_range(map->virt, map->phys, map->size, level, (u64 *)gd->arch.tlb_addr, attrs); } @@ -407,13 +399,7 @@ static int count_ranges(void) __weak u64 get_page_table_size(void) { u64 one_pt = MAX_PTE_ENTRIES * sizeof(u64); - u64 size, mmfr1; - - asm volatile("mrs %0, id_aa64mmfr1_el1" : "=r" (mmfr1)); - if ((mmfr1 & 0xf) == 2) - gd->arch.has_hafdbs = true; - else - gd->arch.has_hafdbs = false; + u64 size;
/* Account for all page tables we would need to cover our memory map */ size = one_pt * count_ranges(); diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index 98a27db3166b..9f58cedb650c 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -49,13 +49,10 @@ #define PTE_TYPE_BLOCK (1 << 0) #define PTE_TYPE_VALID (1 << 0)
-#define PTE_RDONLY BIT(7) -#define PTE_DBM BIT(51) - -#define PTE_TABLE_PXN BIT(59) -#define PTE_TABLE_XN BIT(60) -#define PTE_TABLE_AP BIT(61) -#define PTE_TABLE_NS BIT(63) +#define PTE_TABLE_PXN (1UL << 59) +#define PTE_TABLE_XN (1UL << 60) +#define PTE_TABLE_AP (1UL << 61) +#define PTE_TABLE_NS (1UL << 63)
/* * Block @@ -102,9 +99,6 @@ #define TCR_TG0_16K (2 << 14) #define TCR_EPD1_DISABLE (1 << 23)
-#define TCR_HA BIT(39) -#define TCR_HD BIT(40) - #define TCR_EL1_RSVD (1U << 31) #define TCR_EL2_RSVD (1U << 31 | 1 << 23) #define TCR_EL3_RSVD (1U << 31 | 1 << 23) diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h index 1325b0644248..75bd9d56f893 100644 --- a/arch/arm/include/asm/global_data.h +++ b/arch/arm/include/asm/global_data.h @@ -52,7 +52,6 @@ struct arch_global_data { #if defined(CONFIG_ARM64) unsigned long tlb_fillptr; unsigned long tlb_emerg; - bool has_hafdbs; #endif #endif #ifdef CFG_SYS_MEM_RESERVE_SECURE

On Fri, Oct 27, 2023 at 01:23:54PM +1300, Chris Packham wrote:
This reverts commit 6cdf6b7a340db4ddd008516181de7e08e3f8c213. This is part of a series trying to make use of the arm64 hardware features for tracking dirty pages. Unfortunately this series causes problems for the AC5/AC5X SoCs. Having exhausted other options the consensus seems to be reverting this series is the best course of action.
Signed-off-by: Chris Packham judge.packham@gmail.com
Applied to u-boot/master, thanks!

Hi Chris,
On Fri, Oct 27, 2023 at 01:23:51PM +1300, Chris Packham wrote:
As discussed this series reverts the HAFDBS changes that caused an issue on AC5/AC5X. I think there are some improvements that can be made to the initial memory map for the AC5/AC5X but so far nothing I've found makes it compatible with the HAFDBS changes.
Would you mind briefly explaining what those issues are and/or point me to the discussion where it was decided to revert these patches?
The feature is quite useful for users of CONFIG_CMO_BY_VA_ONLY, to speed up the boot sequence: instead of reverting these patches altogether, would a reasonable alternative be to put them behind a build option?
Also, could you confirm that the "initial memory map" you are referring to above only describes actual memory as, IIRC, some boards were using mappings **much** larger than their DRAM address space?
Chris Packham (3): Revert "armv8: enable HAFDBS for other ELx when FEAT_HAFDBS is present" Revert "arm64: Use level-2 for largest block mappings when FEAT_HAFDBS is present" Revert "arm64: Use FEAT_HAFDBS to track dirty pages when available"
arch/arm/cpu/armv8/cache_v8.c | 30 +++--------------------------- arch/arm/include/asm/armv8/mmu.h | 20 ++++---------------- arch/arm/include/asm/global_data.h | 2 -- 3 files changed, 7 insertions(+), 45 deletions(-)
-- 2.42.0

Hi Pierre-Clément,
On Fri, 27 Oct 2023 10:49:47 +0100, Pierre-Clément Tosi ptosi@google.com wrote:
Hi Chris,
On Fri, Oct 27, 2023 at 01:23:51PM +1300, Chris Packham wrote:
As discussed this series reverts the HAFDBS changes that caused an issue on AC5/AC5X. I think there are some improvements that can be made to the initial memory map for the AC5/AC5X but so far nothing I've found makes it compatible with the HAFDBS changes.
Would you mind briefly explaining what those issues are and/or point me to the discussion where it was decided to revert these patches?
It was me[1] who offered to revert them, as the "fix" that Chris came up with didn't make much sense on its own, was provided without much of an explanation, and nobody else with sufficient understanding of the ARMv8 translation model seem to have access to this HW to debug it (though if someone sends me a board, I'll happily look into it).
To be clear, I'm pretty much convinced that the issue is due to the way page tables are managed on the arm64 port. The whole "emergency page table" switch doesn't make much sense, specially when changing something as simple as some page attributes, and it isn't like "break before make" is a brand new concept.
The feature is quite useful for users of CONFIG_CMO_BY_VA_ONLY, to speed up the boot sequence: instead of reverting these patches altogether, would a reasonable alternative be to put them behind a build option?
I don't think this is right. Either we have something that works for all ARMV8.1+ systems, or it just makes things a lot less maintainable.
But maybe the u-boot maintainers have a different view on this.
The core of the problem is that there is a truckload of code that already exists in the tree and that somehow relies on the existing (and funky) behaviours[2]. How do we go about fixing the core u-boot? I have no idea.
Cheers,
M.
[1] https://lore.kernel.org/r/063a16d559dc9cb327c9459803005006@kernel.org [2] https://lore.kernel.org/r/CAFOYHZC_Dveax85n0fLr5BFyZcLqsvUssn=J0oHyvN75bTaCU...

Hi Marc,
On Fri, Oct 27, 2023 at 03:11:15PM +0100, Marc Zyngier wrote:
Hi Pierre-Clément,
On Fri, 27 Oct 2023 10:49:47 +0100, Pierre-Clément Tosi ptosi@google.com wrote:
Hi Chris,
On Fri, Oct 27, 2023 at 01:23:51PM +1300, Chris Packham wrote:
As discussed this series reverts the HAFDBS changes that caused an issue on AC5/AC5X. I think there are some improvements that can be made to the initial memory map for the AC5/AC5X but so far nothing I've found makes it compatible with the HAFDBS changes.
Would you mind briefly explaining what those issues are and/or point me to the discussion where it was decided to revert these patches?
It was me[1] who offered to revert them, as the "fix" that Chris came up with didn't make much sense on its own, was provided without much of an explanation, and nobody else with sufficient understanding of the ARMv8 translation model seem to have access to this HW to debug it (though if someone sends me a board, I'll happily look into it).
Thanks for the context.
I have checked with our internal users of the patches being reverted and they don't mind the performance hit; if no one else is worried about the potential regression, reverting (instead of properly addressing the root cause) might be a reasonable option.
To be clear, I'm pretty much convinced that the issue is due to the way page tables are managed on the arm64 port. The whole "emergency page table" switch doesn't make much sense, specially when changing something as simple as some page attributes, and it isn't like "break before make" is a brand new concept.
It might not "make sense" from a functional/performance perspective (as it isn't strictly necessary) but I was assuming that it at least complied with the architecture (i.e. should not cause the CPU to lockup, as Chris reports).
Even if the reverts get merged, it would be good to clear up if it was a bug in the logic (e.g. how is gd->arch.tlb_emerg affected by these patches?) or a violation of the architecture and if the issue was actually coming from the HAFDBS changes or if they were triggering it.
Otherwise, these reverts might be masking the symptom instead of addressing the cause. As a result, we would be left with code that seems to "work" but is still as fragile as before, where any future patch is at risk of triggering the faulty behavior and of ending up also being reverted.
The feature is quite useful for users of CONFIG_CMO_BY_VA_ONLY, to speed up the boot sequence: instead of reverting these patches altogether, would a reasonable alternative be to put them behind a build option?
I don't think this is right. Either we have something that works for all ARMV8.1+ systems, or it just makes things a lot less maintainable.
But maybe the u-boot maintainers have a different view on this.
The core of the problem is that there is a truckload of code that already exists in the tree and that somehow relies on the existing (and funky) behaviours[2]. How do we go about fixing the core u-boot? I have no idea.
Cheers,
M.
[1] https://lore.kernel.org/r/063a16d559dc9cb327c9459803005006@kernel.org [2] https://lore.kernel.org/r/CAFOYHZC_Dveax85n0fLr5BFyZcLqsvUssn=J0oHyvN75bTaCU...
-- Without deviation from the norm, progress is not possible.
Thanks,

On Fri, Oct 27, 2023 at 10:49:47AM +0100, Pierre-Clément Tosi wrote:
Hi Chris,
On Fri, Oct 27, 2023 at 01:23:51PM +1300, Chris Packham wrote:
As discussed this series reverts the HAFDBS changes that caused an issue on AC5/AC5X. I think there are some improvements that can be made to the initial memory map for the AC5/AC5X but so far nothing I've found makes it compatible with the HAFDBS changes.
Would you mind briefly explaining what those issues are and/or point me to the discussion where it was decided to revert these patches?
The feature is quite useful for users of CONFIG_CMO_BY_VA_ONLY, to speed up the boot sequence: instead of reverting these patches altogether, would a reasonable alternative be to put them behind a build option?
Also, could you confirm that the "initial memory map" you are referring to above only describes actual memory as, IIRC, some boards were using mappings **much** larger than their DRAM address space?
The most details are in this thread: https://lore.kernel.org/u-boot/CAFOYHZC_Dveax85n0fLr5BFyZcLqsvUssn=J0oHyvN75... with some follow-up in: https://lore.kernel.org/u-boot/CAFOYHZB7jJWwD24oWzx6u55w2whHYjK56f=QyN0LWU4Z...

On Fri, Oct 27, 2023 at 01:22:37PM -0400, Tom Rini wrote:
On Fri, Oct 27, 2023 at 10:49:47AM +0100, Pierre-Clément Tosi wrote:
Hi Chris,
On Fri, Oct 27, 2023 at 01:23:51PM +1300, Chris Packham wrote:
As discussed this series reverts the HAFDBS changes that caused an issue on AC5/AC5X. I think there are some improvements that can be made to the initial memory map for the AC5/AC5X but so far nothing I've found makes it compatible with the HAFDBS changes.
Would you mind briefly explaining what those issues are and/or point me to the discussion where it was decided to revert these patches?
The feature is quite useful for users of CONFIG_CMO_BY_VA_ONLY, to speed up the boot sequence: instead of reverting these patches altogether, would a reasonable alternative be to put them behind a build option?
Also, could you confirm that the "initial memory map" you are referring to above only describes actual memory as, IIRC, some boards were using mappings **much** larger than their DRAM address space?
The most details are in this thread: https://lore.kernel.org/u-boot/CAFOYHZC_Dveax85n0fLr5BFyZcLqsvUssn=J0oHyvN75... with some follow-up in: https://lore.kernel.org/u-boot/CAFOYHZB7jJWwD24oWzx6u55w2whHYjK56f=QyN0LWU4Z...
Checking to see if you have any feedback for these platforms? I would like to have them working again one way or another in v2024.01, thanks.

On Fri, 10 Nov 2023, 10:33 am Tom Rini, trini@konsulko.com wrote:
On Fri, Oct 27, 2023 at 01:22:37PM -0400, Tom Rini wrote:
On Fri, Oct 27, 2023 at 10:49:47AM +0100, Pierre-Clément Tosi wrote:
Hi Chris,
On Fri, Oct 27, 2023 at 01:23:51PM +1300, Chris Packham wrote:
As discussed this series reverts the HAFDBS changes that caused an
issue
on AC5/AC5X. I think there are some improvements that can be made to
the
initial memory map for the AC5/AC5X but so far nothing I've found
makes
it compatible with the HAFDBS changes.
Would you mind briefly explaining what those issues are and/or point
me to the
discussion where it was decided to revert these patches?
The feature is quite useful for users of CONFIG_CMO_BY_VA_ONLY, to
speed up the
boot sequence: instead of reverting these patches altogether, would a
reasonable
alternative be to put them behind a build option?
Also, could you confirm that the "initial memory map" you are
referring to above
only describes actual memory as, IIRC, some boards were using mappings
**much**
larger than their DRAM address space?
The most details are in this thread:
https://lore.kernel.org/u-boot/CAFOYHZC_Dveax85n0fLr5BFyZcLqsvUssn=J0oHyvN75...
with some follow-up in:
https://lore.kernel.org/u-boot/CAFOYHZB7jJWwD24oWzx6u55w2whHYjK56f=QyN0LWU4Z...
Checking to see if you have any feedback for these platforms? I would like to have them working again one way or another in v2024.01, thanks.
Either this series or https://lore.kernel.org/u-boot/20231018205358.1557234-1-judge.packham@gmail.... will get the AC5X boards back working. I'm out of other ideas but happy to test patches.

On Fri, Nov 10, 2023 at 06:38:40PM +1300, Chris Packham wrote:
On Fri, 10 Nov 2023, 10:33 am Tom Rini, trini@konsulko.com wrote:
On Fri, Oct 27, 2023 at 01:22:37PM -0400, Tom Rini wrote:
On Fri, Oct 27, 2023 at 10:49:47AM +0100, Pierre-Clément Tosi wrote:
Hi Chris,
On Fri, Oct 27, 2023 at 01:23:51PM +1300, Chris Packham wrote:
As discussed this series reverts the HAFDBS changes that caused an
issue
on AC5/AC5X. I think there are some improvements that can be made to
the
initial memory map for the AC5/AC5X but so far nothing I've found
makes
it compatible with the HAFDBS changes.
Would you mind briefly explaining what those issues are and/or point
me to the
discussion where it was decided to revert these patches?
The feature is quite useful for users of CONFIG_CMO_BY_VA_ONLY, to
speed up the
boot sequence: instead of reverting these patches altogether, would a
reasonable
alternative be to put them behind a build option?
Also, could you confirm that the "initial memory map" you are
referring to above
only describes actual memory as, IIRC, some boards were using mappings
**much**
larger than their DRAM address space?
The most details are in this thread:
https://lore.kernel.org/u-boot/CAFOYHZC_Dveax85n0fLr5BFyZcLqsvUssn=J0oHyvN75...
with some follow-up in:
https://lore.kernel.org/u-boot/CAFOYHZB7jJWwD24oWzx6u55w2whHYjK56f=QyN0LWU4Z...
Checking to see if you have any feedback for these platforms? I would like to have them working again one way or another in v2024.01, thanks.
Either this series or https://lore.kernel.org/u-boot/20231018205358.1557234-1-judge.packham@gmail.... will get the AC5X boards back working. I'm out of other ideas but happy to test patches.
Following up here on the cover letter as well that I've applied the revert for now. I'm happy to revert the revert if we can get the AC5X platforms fixed, and perhaps a little documentation about what was going wrong as I believe the other thread at least hinted that other platforms might be doing a workaround as well but didn't spell out why.
participants (4)
-
Chris Packham
-
Marc Zyngier
-
Pierre-Clément Tosi
-
Tom Rini