[PATCH 0/3] Use FEAT_HAFDBS to track dirty pages

For some ARM64 cores there is a facility that allows the page table walker to track the dirty state of a page. These commits enhances the performance of CMOs by VA.
Marc Zyngier (2): arm64: Use FEAT_HAFDBS to track dirty pages when available arm64: Use level-2 for largest block mappings when FEAT_HAFDBS is present
meitao (1): armv8: enable HAFDBS for other ELx when FEAT_HAFDBS is present
arch/arm/cpu/armv8/cache_v8.c | 30 +++++++++++++++++++++++++++--- arch/arm/include/asm/armv8/mmu.h | 20 ++++++++++++++++---- arch/arm/include/asm/global_data.h | 2 ++ 3 files changed, 45 insertions(+), 7 deletions(-)

From: Marc Zyngier maz@kernel.org
Some recent arm64 cores have a facility that allows the page table walker to track the dirty state of a page. This makes it really efficient to perform CMOs by VA as we only need to look at dirty pages.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6... --- arch/arm/cpu/armv8/cache_v8.c | 16 +++++++++++++++- arch/arm/include/asm/armv8/mmu.h | 14 ++++++++++---- arch/arm/include/asm/global_data.h | 1 + 3 files changed, 26 insertions(+), 5 deletions(-)
diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c index 697334086f..4760064ee1 100644 --- a/arch/arm/cpu/armv8/cache_v8.c +++ b/arch/arm/cpu/armv8/cache_v8.c @@ -93,6 +93,8 @@ u64 get_tcr(u64 *pips, u64 *pva_bits)
if (el == 1) { tcr = TCR_EL1_RSVD | (ips << 32) | TCR_EPD1_DISABLE; + if (gd->arch.has_hafdbs) + tcr |= TCR_HA | TCR_HD; } else if (el == 2) { tcr = TCR_EL2_RSVD | (ips << 16); } else { @@ -200,6 +202,9 @@ static void __cmo_on_leaves(void (*cmo_fn)(unsigned long, unsigned long), attrs != PTE_BLOCK_MEMTYPE(MT_NORMAL_NC)) continue;
+ if (gd->arch.has_hafdbs && (pte & (PTE_RDONLY | PTE_DBM)) != PTE_DBM) + continue; + end = va + BIT(level2shift(level)) - 1;
/* No intersection with RAM? */ @@ -348,6 +353,9 @@ static void add_map(struct mm_region *map) if (va_bits < 39) level = 1;
+ if (gd->arch.has_hafdbs) + attrs |= PTE_DBM | PTE_RDONLY; + map_range(map->virt, map->phys, map->size, level, (u64 *)gd->arch.tlb_addr, attrs); } @@ -399,7 +407,13 @@ static int count_ranges(void) __weak u64 get_page_table_size(void) { u64 one_pt = MAX_PTE_ENTRIES * sizeof(u64); - u64 size; + u64 size, mmfr1; + + asm volatile("mrs %0, id_aa64mmfr1_el1" : "=r" (mmfr1)); + if ((mmfr1 & 0xf) == 2) + gd->arch.has_hafdbs = true; + else + gd->arch.has_hafdbs = false;
/* Account for all page tables we would need to cover our memory map */ size = one_pt * count_ranges(); diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index 9f58cedb65..98a27db316 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -49,10 +49,13 @@ #define PTE_TYPE_BLOCK (1 << 0) #define PTE_TYPE_VALID (1 << 0)
-#define PTE_TABLE_PXN (1UL << 59) -#define PTE_TABLE_XN (1UL << 60) -#define PTE_TABLE_AP (1UL << 61) -#define PTE_TABLE_NS (1UL << 63) +#define PTE_RDONLY BIT(7) +#define PTE_DBM BIT(51) + +#define PTE_TABLE_PXN BIT(59) +#define PTE_TABLE_XN BIT(60) +#define PTE_TABLE_AP BIT(61) +#define PTE_TABLE_NS BIT(63)
/* * Block @@ -99,6 +102,9 @@ #define TCR_TG0_16K (2 << 14) #define TCR_EPD1_DISABLE (1 << 23)
+#define TCR_HA BIT(39) +#define TCR_HD BIT(40) + #define TCR_EL1_RSVD (1U << 31) #define TCR_EL2_RSVD (1U << 31 | 1 << 23) #define TCR_EL3_RSVD (1U << 31 | 1 << 23) diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h index 9e746e380a..eda99b5b41 100644 --- a/arch/arm/include/asm/global_data.h +++ b/arch/arm/include/asm/global_data.h @@ -52,6 +52,7 @@ struct arch_global_data { #if defined(CONFIG_ARM64) unsigned long tlb_fillptr; unsigned long tlb_emerg; + bool has_hafdbs; #endif #endif #ifdef CFG_SYS_MEM_RESERVE_SECURE

On Sat, Mar 18, 2023 at 12:22:51AM +0800, Ying-Chun Liu (PaulLiu) wrote:
From: Marc Zyngier maz@kernel.org
Some recent arm64 cores have a facility that allows the page table walker to track the dirty state of a page. This makes it really efficient to perform CMOs by VA as we only need to look at dirty pages.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6...
Applied to u-boot/master, thanks!

Hi Marc, Paul,
On Sat, Mar 18, 2023 at 5:23 AM Ying-Chun Liu (PaulLiu) paul.liu@linaro.org wrote:
From: Marc Zyngier maz@kernel.org
Some recent arm64 cores have a facility that allows the page table walker to track the dirty state of a page. This makes it really efficient to perform CMOs by VA as we only need to look at dirty pages.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6...
I think this may have caused a regression for the Marvell AC5X board(s). I found that v2023.07 locked up at boot but v2023.01 was fine. The lockup seemed to be in the 'Net:' init probably just as the mvneta driver was being initialised.
A git bisect led me to this change although for this specific change instead of the lockup I get a crash so maybe I'm actually hitting a different issue.
Any thoughts as to why this may have caused problems?
arch/arm/cpu/armv8/cache_v8.c | 16 +++++++++++++++- arch/arm/include/asm/armv8/mmu.h | 14 ++++++++++---- arch/arm/include/asm/global_data.h | 1 + 3 files changed, 26 insertions(+), 5 deletions(-)
diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c index 697334086f..4760064ee1 100644 --- a/arch/arm/cpu/armv8/cache_v8.c +++ b/arch/arm/cpu/armv8/cache_v8.c @@ -93,6 +93,8 @@ u64 get_tcr(u64 *pips, u64 *pva_bits)
if (el == 1) { tcr = TCR_EL1_RSVD | (ips << 32) | TCR_EPD1_DISABLE;
if (gd->arch.has_hafdbs)
tcr |= TCR_HA | TCR_HD; } else if (el == 2) { tcr = TCR_EL2_RSVD | (ips << 16); } else {
@@ -200,6 +202,9 @@ static void __cmo_on_leaves(void (*cmo_fn)(unsigned long, unsigned long), attrs != PTE_BLOCK_MEMTYPE(MT_NORMAL_NC)) continue;
if (gd->arch.has_hafdbs && (pte & (PTE_RDONLY | PTE_DBM)) != PTE_DBM)
continue;
end = va + BIT(level2shift(level)) - 1; /* No intersection with RAM? */
@@ -348,6 +353,9 @@ static void add_map(struct mm_region *map) if (va_bits < 39) level = 1;
if (gd->arch.has_hafdbs)
attrs |= PTE_DBM | PTE_RDONLY;
map_range(map->virt, map->phys, map->size, level, (u64 *)gd->arch.tlb_addr, attrs);
} @@ -399,7 +407,13 @@ static int count_ranges(void) __weak u64 get_page_table_size(void) { u64 one_pt = MAX_PTE_ENTRIES * sizeof(u64);
u64 size;
u64 size, mmfr1;
asm volatile("mrs %0, id_aa64mmfr1_el1" : "=r" (mmfr1));
if ((mmfr1 & 0xf) == 2)
gd->arch.has_hafdbs = true;
else
gd->arch.has_hafdbs = false; /* Account for all page tables we would need to cover our memory map */ size = one_pt * count_ranges();
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index 9f58cedb65..98a27db316 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -49,10 +49,13 @@ #define PTE_TYPE_BLOCK (1 << 0) #define PTE_TYPE_VALID (1 << 0)
-#define PTE_TABLE_PXN (1UL << 59) -#define PTE_TABLE_XN (1UL << 60) -#define PTE_TABLE_AP (1UL << 61) -#define PTE_TABLE_NS (1UL << 63) +#define PTE_RDONLY BIT(7) +#define PTE_DBM BIT(51)
+#define PTE_TABLE_PXN BIT(59) +#define PTE_TABLE_XN BIT(60) +#define PTE_TABLE_AP BIT(61) +#define PTE_TABLE_NS BIT(63)
/*
- Block
@@ -99,6 +102,9 @@ #define TCR_TG0_16K (2 << 14) #define TCR_EPD1_DISABLE (1 << 23)
+#define TCR_HA BIT(39) +#define TCR_HD BIT(40)
#define TCR_EL1_RSVD (1U << 31) #define TCR_EL2_RSVD (1U << 31 | 1 << 23) #define TCR_EL3_RSVD (1U << 31 | 1 << 23) diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h index 9e746e380a..eda99b5b41 100644 --- a/arch/arm/include/asm/global_data.h +++ b/arch/arm/include/asm/global_data.h @@ -52,6 +52,7 @@ struct arch_global_data { #if defined(CONFIG_ARM64) unsigned long tlb_fillptr; unsigned long tlb_emerg;
bool has_hafdbs;
#endif #endif
#ifdef CFG_SYS_MEM_RESERVE_SECURE
2.39.2

On 2023-10-13 03:40, Chris Packham wrote:
Hi Marc, Paul,
On Sat, Mar 18, 2023 at 5:23 AM Ying-Chun Liu (PaulLiu) paul.liu@linaro.org wrote:
From: Marc Zyngier maz@kernel.org
Some recent arm64 cores have a facility that allows the page table walker to track the dirty state of a page. This makes it really efficient to perform CMOs by VA as we only need to look at dirty pages.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6...
I think this may have caused a regression for the Marvell AC5X board(s). I found that v2023.07 locked up at boot but v2023.01 was fine. The lockup seemed to be in the 'Net:' init probably just as the mvneta driver was being initialised.
A git bisect led me to this change although for this specific change instead of the lockup I get a crash so maybe I'm actually hitting a different issue.
Any thoughts as to why this may have caused problems?
Not really. What CPUs does this platform have? What is the offending driver doing to trigger the issue? Can you provide some level of tracing?
M.

On Sat, 14 Oct 2023, 11:04 am Marc Zyngier, maz@kernel.org wrote:
On 2023-10-13 03:40, Chris Packham wrote:
Hi Marc, Paul,
On Sat, Mar 18, 2023 at 5:23 AM Ying-Chun Liu (PaulLiu) paul.liu@linaro.org wrote:
From: Marc Zyngier maz@kernel.org
Some recent arm64 cores have a facility that allows the page table walker to track the dirty state of a page. This makes it really efficient to perform CMOs by VA as we only need to look at dirty pages.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link:
https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6...
I think this may have caused a regression for the Marvell AC5X board(s). I found that v2023.07 locked up at boot but v2023.01 was fine. The lockup seemed to be in the 'Net:' init probably just as the mvneta driver was being initialised.
A git bisect led me to this change although for this specific change instead of the lockup I get a crash so maybe I'm actually hitting a different issue.
Any thoughts as to why this may have caused problems?
Not really. What CPUs does this platform have? What is the offending driver doing to trigger the issue? Can you provide some level of tracing?
The Marvell AC5X is a network switch ASIC with an integrated ARMv8 CPU (8.1 specifically I think).
I think there is something that the mvneta driver is doing triggering the issue. I have another AC5X based board without an Ethernet port that boots just fine (this is also why I didn't notice earlier).
I'll try and get some more debug out when I'm back in the office
M.
-- Jazz is not dead. It just smells funny...

On Sun, Oct 15, 2023 at 10:29 AM Chris Packham judge.packham@gmail.com wrote:
On Sat, 14 Oct 2023, 11:04 am Marc Zyngier, maz@kernel.org wrote:
On 2023-10-13 03:40, Chris Packham wrote:
Hi Marc, Paul,
On Sat, Mar 18, 2023 at 5:23 AM Ying-Chun Liu (PaulLiu) paul.liu@linaro.org wrote:
From: Marc Zyngier maz@kernel.org
Some recent arm64 cores have a facility that allows the page table walker to track the dirty state of a page. This makes it really efficient to perform CMOs by VA as we only need to look at dirty pages.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6...
I think this may have caused a regression for the Marvell AC5X board(s). I found that v2023.07 locked up at boot but v2023.01 was fine. The lockup seemed to be in the 'Net:' init probably just as the mvneta driver was being initialised.
A git bisect led me to this change although for this specific change instead of the lockup I get a crash so maybe I'm actually hitting a different issue.
Any thoughts as to why this may have caused problems?
Not really. What CPUs does this platform have? What is the offending driver doing to trigger the issue? Can you provide some level of tracing?
The Marvell AC5X is a network switch ASIC with an integrated ARMv8 CPU (8.1 specifically I think).
I think there is something that the mvneta driver is doing triggering the issue. I have another AC5X based board without an Ethernet port that boots just fine (this is also why I didn't notice earlier).
I'll try and get some more debug out when I'm back in the office
The thing the mvneta driver does that upsets things appears to be
mmu_set_region_dcache_behaviour((phys_addr_t)bd_space, BD_SPACE, DCACHE_OFF);
I can comment that line out and everything works.

On Mon, 16 Oct 2023 02:42:08 +0100, Chris Packham judge.packham@gmail.com wrote:
On Sun, Oct 15, 2023 at 10:29 AM Chris Packham judge.packham@gmail.com wrote:
On Sat, 14 Oct 2023, 11:04 am Marc Zyngier, maz@kernel.org wrote:
On 2023-10-13 03:40, Chris Packham wrote:
Hi Marc, Paul,
On Sat, Mar 18, 2023 at 5:23 AM Ying-Chun Liu (PaulLiu) paul.liu@linaro.org wrote:
From: Marc Zyngier maz@kernel.org
Some recent arm64 cores have a facility that allows the page table walker to track the dirty state of a page. This makes it really efficient to perform CMOs by VA as we only need to look at dirty pages.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6...
I think this may have caused a regression for the Marvell AC5X board(s). I found that v2023.07 locked up at boot but v2023.01 was fine. The lockup seemed to be in the 'Net:' init probably just as the mvneta driver was being initialised.
A git bisect led me to this change although for this specific change instead of the lockup I get a crash so maybe I'm actually hitting a different issue.
Any thoughts as to why this may have caused problems?
Not really. What CPUs does this platform have? What is the offending driver doing to trigger the issue? Can you provide some level of tracing?
The Marvell AC5X is a network switch ASIC with an integrated ARMv8 CPU (8.1 specifically I think).
I think there is something that the mvneta driver is doing triggering the issue. I have another AC5X based board without an Ethernet port that boots just fine (this is also why I didn't notice earlier).
I'll try and get some more debug out when I'm back in the office
The thing the mvneta driver does that upsets things appears to be
mmu_set_region_dcache_behaviour((phys_addr_t)bd_space, BD_SPACE, DCACHE_OFF);
I can comment that line out and everything works.
This leads to two questions:
- is the device cache coherent, in which case it doesn't need the memory being non-cacheable? If everything is OK, then why the switch to device memory?
- what goes wrong when these attributes are applied? do we have to split a block mapping?
Instrumenting the MMU code would certainly help understanding what goes wrong here.
Thanks,
M.

On Tue, Oct 17, 2023 at 12:21 AM Marc Zyngier maz@kernel.org wrote:
On Mon, 16 Oct 2023 02:42:08 +0100, Chris Packham judge.packham@gmail.com wrote:
On Sun, Oct 15, 2023 at 10:29 AM Chris Packham judge.packham@gmail.com wrote:
On Sat, 14 Oct 2023, 11:04 am Marc Zyngier, maz@kernel.org wrote:
On 2023-10-13 03:40, Chris Packham wrote:
Hi Marc, Paul,
On Sat, Mar 18, 2023 at 5:23 AM Ying-Chun Liu (PaulLiu) paul.liu@linaro.org wrote:
From: Marc Zyngier maz@kernel.org
Some recent arm64 cores have a facility that allows the page table walker to track the dirty state of a page. This makes it really efficient to perform CMOs by VA as we only need to look at dirty pages.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6...
I think this may have caused a regression for the Marvell AC5X board(s). I found that v2023.07 locked up at boot but v2023.01 was fine. The lockup seemed to be in the 'Net:' init probably just as the mvneta driver was being initialised.
A git bisect led me to this change although for this specific change instead of the lockup I get a crash so maybe I'm actually hitting a different issue.
Any thoughts as to why this may have caused problems?
Not really. What CPUs does this platform have? What is the offending driver doing to trigger the issue? Can you provide some level of tracing?
The Marvell AC5X is a network switch ASIC with an integrated ARMv8 CPU (8.1 specifically I think).
I think there is something that the mvneta driver is doing triggering the issue. I have another AC5X based board without an Ethernet port that boots just fine (this is also why I didn't notice earlier).
I'll try and get some more debug out when I'm back in the office
The thing the mvneta driver does that upsets things appears to be
mmu_set_region_dcache_behaviour((phys_addr_t)bd_space, BD_SPACE, DCACHE_OFF);
I can comment that line out and everything works.
This leads to two questions:
- is the device cache coherent, in which case it doesn't need the memory being non-cacheable? If everything is OK, then why the switch to device memory?
I'll be honest and say I understand less than 50% of that. The network transfer does seem to work without the call so perhaps the device is cache coherent but this seems to be a common thing in many drivers so I'd assume that on such platforms this should be innocuous. It's totally possible I haven't done a good job of setting up the CPU or informing the rest of the system about it. I did just take a lot of the code from the Marvell SDK and clean it up without really understanding what most of it did.
- what goes wrong when these attributes are applied? do we have to split a block mapping?
Instrumenting the MMU code would certainly help understanding what goes wrong here.
I did do that a little bit. At first I thought there was a possible infinite loop in mmu_set_region_dcache_behaviour(). Squinting at things you could naively say that if set_one_region() failed to find an entry then it would loop forever but if that happened I'd have some debug saying that it failed. Things seem to go south after __asm_switch_ttbr(gd->arch.tlb_emerg) which did get me thinking that perhaps the emergency tables aren't setup (or at least aren't set up in a way that allows debug output). That's about as far as I got debugging wise, I'll try and spend some more time digging into the MMU code.
Thanks,
M.
-- Without deviation from the norm, progress is not possible.

From: Marc Zyngier maz@kernel.org
In order to make invalidation by VA more efficient, set the largest block mapping to 2MB, mapping it onto level-2. This has no material impact on u-boot's runtime performance, and allows a huge speedup when cleaning the cache.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/417a73581a72ff6d... --- arch/arm/cpu/armv8/cache_v8.c | 14 ++++++++++---- arch/arm/include/asm/global_data.h | 1 + 2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c index 4760064ee1..4c6a1b1d6c 100644 --- a/arch/arm/cpu/armv8/cache_v8.c +++ b/arch/arm/cpu/armv8/cache_v8.c @@ -314,7 +314,7 @@ static void map_range(u64 virt, u64 phys, u64 size, int level, for (i = idx; size; i++) { u64 next_size, *next_table;
- if (level >= 1 && + if (level >= gd->arch.first_block_level && size >= map_size && !(virt & (map_size - 1))) { if (level == 3) table[i] = phys | attrs | PTE_TYPE_PAGE; @@ -353,6 +353,9 @@ static void add_map(struct mm_region *map) if (va_bits < 39) level = 1;
+ if (!gd->arch.first_block_level) + gd->arch.first_block_level = 1; + if (gd->arch.has_hafdbs) attrs |= PTE_DBM | PTE_RDONLY;
@@ -369,7 +372,7 @@ static void count_range(u64 virt, u64 size, int level, int *cntp) for (i = idx; size; i++) { u64 next_size;
- if (level >= 1 && + if (level >= gd->arch.first_block_level && size >= map_size && !(virt & (map_size - 1))) { virt += map_size; size -= map_size; @@ -410,10 +413,13 @@ __weak u64 get_page_table_size(void) u64 size, mmfr1;
asm volatile("mrs %0, id_aa64mmfr1_el1" : "=r" (mmfr1)); - if ((mmfr1 & 0xf) == 2) + if ((mmfr1 & 0xf) == 2) { gd->arch.has_hafdbs = true; - else + gd->arch.first_block_level = 2; + } else { gd->arch.has_hafdbs = false; + gd->arch.first_block_level = 1; + }
/* Account for all page tables we would need to cover our memory map */ size = one_pt * count_ranges(); diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h index eda99b5b41..9d94cbe665 100644 --- a/arch/arm/include/asm/global_data.h +++ b/arch/arm/include/asm/global_data.h @@ -52,6 +52,7 @@ struct arch_global_data { #if defined(CONFIG_ARM64) unsigned long tlb_fillptr; unsigned long tlb_emerg; + unsigned int first_block_level; bool has_hafdbs; #endif #endif

On Sat, Mar 18, 2023 at 12:22:52AM +0800, Ying-Chun Liu (PaulLiu) wrote:
From: Marc Zyngier maz@kernel.org
In order to make invalidation by VA more efficient, set the largest block mapping to 2MB, mapping it onto level-2. This has no material impact on u-boot's runtime performance, and allows a huge speedup when cleaning the cache.
Signed-off-by: Marc Zyngier maz@kernel.org [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/417a73581a72ff6d...
Applied to u-boot/master, thanks!

From: meitao meitaogao@asrmicro.com
u-boot could be run at EL1/EL2/EL3. so we set it as same as EL1 does. otherwise it will hang when enable mmu, that is what we encounter in our SOC.
Signed-off-by: meitao meitaogao@asrmicro.com [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/3bf38943aeab4700... --- arch/arm/cpu/armv8/cache_v8.c | 6 +++++- arch/arm/include/asm/armv8/mmu.h | 10 ++++++++-- 2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c index 4c6a1b1d6c..cb1131a048 100644 --- a/arch/arm/cpu/armv8/cache_v8.c +++ b/arch/arm/cpu/armv8/cache_v8.c @@ -94,11 +94,15 @@ u64 get_tcr(u64 *pips, u64 *pva_bits) if (el == 1) { tcr = TCR_EL1_RSVD | (ips << 32) | TCR_EPD1_DISABLE; if (gd->arch.has_hafdbs) - tcr |= TCR_HA | TCR_HD; + tcr |= TCR_EL1_HA | TCR_EL1_HD; } else if (el == 2) { tcr = TCR_EL2_RSVD | (ips << 16); + if (gd->arch.has_hafdbs) + tcr |= TCR_EL2_HA | TCR_EL2_HD; } else { tcr = TCR_EL3_RSVD | (ips << 16); + if (gd->arch.has_hafdbs) + tcr |= TCR_EL3_HA | TCR_EL3_HD; }
/* PTWs cacheable, inner/outer WBWA and inner shareable */ diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index 98a27db316..19a9e112a4 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -102,8 +102,14 @@ #define TCR_TG0_16K (2 << 14) #define TCR_EPD1_DISABLE (1 << 23)
-#define TCR_HA BIT(39) -#define TCR_HD BIT(40) +#define TCR_EL1_HA BIT(39) +#define TCR_EL1_HD BIT(40) + +#define TCR_EL2_HA BIT(21) +#define TCR_EL2_HD BIT(22) + +#define TCR_EL3_HA BIT(21) +#define TCR_EL3_HD BIT(22)
#define TCR_EL1_RSVD (1U << 31) #define TCR_EL2_RSVD (1U << 31 | 1 << 23)

On Sat, Mar 18, 2023 at 12:22:53AM +0800, Ying-Chun Liu (PaulLiu) wrote:
From: meitao meitaogao@asrmicro.com
u-boot could be run at EL1/EL2/EL3. so we set it as same as EL1 does. otherwise it will hang when enable mmu, that is what we encounter in our SOC.
Signed-off-by: meitao meitaogao@asrmicro.com [ Paul: pick from the Android tree. Rebase to the upstream ] Signed-off-by: Ying-Chun Liu (PaulLiu) paul.liu@linaro.org Cc: Tom Rini trini@konsulko.com Link: https://android.googlesource.com/platform/external/u-boot/+/3bf38943aeab4700...
Applied to u-boot/master, thanks!
participants (4)
-
Chris Packham
-
Marc Zyngier
-
Tom Rini
-
Ying-Chun Liu (PaulLiu)