[PATCH v2 0/3] arm64: add a software pagetable walker

MMU issues are some of the most frustrating to debug. To make this slightly less unbearable, introduce a software pagetable walker for ARMv8. This can be called to dump a pagetable with the default formatter, or a custom callback can be provided to implement more complicated parsing.
This can also be useful to dump the pagetable used by a previous bootloader stage (by reading out the ttbr register).
Here is an example of the output when walking U-Boot's own memory map on a Qualcomm RB3 board:
Walking pagetable at 000000017df90000, va_bits: 36. Using 3 levels [0x17df91000] | Table | | [0x17df92000] | Table | | [0x000001000 - 0x000200000] | Pages | Device-nGnRnE | Non-shareable [0x000200000 - 0x040000000] | Block | Device-nGnRnE | Non-shareable [0x040000000 - 0x080000000] | Block | Device-nGnRnE | Non-shareable [0x080000000 - 0x140000000] | Block | Normal | Inner-shareable [0x17df93000] | Table | | [0x140000000 - 0x17de00000] | Block | Normal | Inner-shareable [0x17df94000] | Table | | [0x17de00000 - 0x17dfa0000] | Pages | Normal | Inner-shareable
--- Changes in v2: - Document the MMU debugging features in doc/arch/arm64.rst - Mention that functions are stripped out when unused. - Link to v1: https://lore.kernel.org/r/20240607-caleb-upstreaming-v1-0-99094dabb3d9@linar...
--- Caleb Connolly (3): arm64: mmu.h: fix PTE_TABLE_AP arm64: add software pagetable walker doc: arch: arm64: describe pagetable debugging
arch/arm/cpu/armv8/cache_v8.c | 202 +++++++++++++++++++++++++++++++++++++++ arch/arm/include/asm/armv8/mmu.h | 57 ++++++++++- doc/arch/arm64.rst | 37 +++++++ 3 files changed, 295 insertions(+), 1 deletion(-) --- change-id: 20240607-caleb-upstreaming-82744a2786b3 base-commit: 22ca7e13dc02be83c29450ae4444add8e150ebdd
// Caleb (they/them)

The APTable attribute is two bits wide according to the ARMv8-A architecture reference manual. Fix the macro accordingly.
Signed-off-by: Caleb Connolly caleb.connolly@linaro.org --- arch/arm/include/asm/armv8/mmu.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index ce655ce7a952..52cb18b9ed5e 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -50,9 +50,9 @@ #define PTE_TYPE_VALID (1 << 0)
#define PTE_TABLE_PXN (1UL << 59) #define PTE_TABLE_XN (1UL << 60) -#define PTE_TABLE_AP (1UL << 61) +#define PTE_TABLE_AP (3UL << 61) #define PTE_TABLE_NS (1UL << 63)
/* * Block

Add a basic software implementation of the ARM64 pagetable walker. This can be used for debugging U-Boot's pagetable, as well as dumping the pagetable from the previous bootloader stage if it used one (by reading out the ttbr address).
One can either call dump_pagetable() to print the pagetable to the console with the default formatter, or implement their own pagetable handler using walke_pagetable() with a custom pte_walker_cb_t callback.
All of the added code is discarded when unused, hence there is no need to add an additional Kconfig option for this.
Signed-off-by: Caleb Connolly caleb.connolly@linaro.org --- arch/arm/cpu/armv8/cache_v8.c | 202 +++++++++++++++++++++++++++++++++++++++ arch/arm/include/asm/armv8/mmu.h | 55 +++++++++++ 2 files changed, 257 insertions(+)
diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c index d4c64f2d60d9..cb7c0e61ba70 100644 --- a/arch/arm/cpu/armv8/cache_v8.c +++ b/arch/arm/cpu/armv8/cache_v8.c @@ -395,8 +395,210 @@ static int count_ranges(void)
return count; }
+#define ALL_ATTRS (3 << 8 | PMD_ATTRINDX_MASK) +#define PTE_IS_TABLE(pte, level) (pte_type(&(pte)) == PTE_TYPE_TABLE && (level) < 3) + +enum walker_state { + WALKER_STATE_START = 0, + WALKER_STATE_TABLE, + WALKER_STATE_REGION, /* block or page, depending on level */ +}; + +static void __pagetable_walk(u64 addr, u64 tcr, int level, pte_walker_cb_t cb, void *priv) +{ + u64 *table = (u64 *)addr; + u64 attrs, last_attrs = 0, last_addr = 0, entry_start = 0; + int i; + u64 va_bits = 64 - (tcr & (BIT(6) - 1)); + static enum walker_state state[4] = { 0 }; + static bool exit; + + if (!level) { + exit = false; + if (va_bits < 39) + level = 1; + } + + state[level] = WALKER_STATE_START; + + /* Walk through the table entries */ + for (i = 0; i < MAX_PTE_ENTRIES; i++) { + u64 pte = table[i]; + u64 _addr = pte & GENMASK_ULL(va_bits, PAGE_SHIFT); + + if (exit) + return; + + if (pte_type(&pte) == PTE_TYPE_FAULT) + continue; + + attrs = pte & ALL_ATTRS; + /* If we're currently inside a block or set of pages */ + if (state[level] > WALKER_STATE_START && state[level] != WALKER_STATE_TABLE) { + /* + * Continue walking if this entry has the same attributes as the last and + * is one page/block away -- it's a contiguous region. + */ + if (attrs == last_attrs && _addr == last_addr + (1 << level2shift(level))) { + last_attrs = attrs; + last_addr = _addr; + continue; + } else { + /* We either hit a table or a new region */ + exit = cb(entry_start, last_addr + (1 << level2shift(level)), + va_bits, level, priv); + if (exit) + return; + state[level] = WALKER_STATE_START; + } + } + last_attrs = attrs; + last_addr = _addr; + + if (PTE_IS_TABLE(pte, level)) { + /* After the end of the table might be corrupted data */ + if (!_addr || (pte & 0xfff) > 0x3ff) + return; + state[level] = WALKER_STATE_TABLE; + /* Signify the start of a table */ + exit = cb(pte, 0, va_bits, level, priv); + if (exit) + return; + + /* Go down a level */ + __pagetable_walk(_addr, tcr, level + 1, cb, priv); + state[level] = WALKER_STATE_START; + } else if (pte_type(&pte) == PTE_TYPE_BLOCK || pte_type(&pte) == PTE_TYPE_PAGE) { + /* We foud a block or page, start walking */ + entry_start = pte; + state[level] = WALKER_STATE_REGION; + } + } + + if (state[level] > WALKER_STATE_START) + exit = cb(entry_start, last_addr + (1 << level2shift(level)), va_bits, level, priv); +} + +static void pretty_print_pte_type(u64 pte) +{ + switch (pte_type(&pte)) { + case PTE_TYPE_FAULT: + printf(" %-5s", "Fault"); + break; + case PTE_TYPE_BLOCK: + printf(" %-5s", "Block"); + break; + case PTE_TYPE_PAGE: + printf(" %-5s", "Pages"); + break; + default: + printf(" %-5s", "Unk"); + } +} + +static void pretty_print_table_attrs(u64 pte) +{ + int ap = (pte & PTE_TABLE_AP) >> 61; + + printf(" | %2s %10s", + (ap & 2) ? "RO" : "", + (ap & 1) ? "!EL0" : ""); + printf(" | %3s %2s %2s", + (pte & PTE_TABLE_PXN) ? "PXN" : "", + (pte & PTE_TABLE_XN) ? "XN" : "", + (pte & PTE_TABLE_NS) ? "NS" : ""); +} + +static void pretty_print_block_attrs(u64 pte) +{ + u64 attrs = pte & PMD_ATTRINDX_MASK; + + switch (attrs) { + case PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE): + printf(" | %-13s", "Device-nGnRnE"); + break; + case PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRE): + printf(" | %-13s", "Device-nGnRE"); + break; + case PTE_BLOCK_MEMTYPE(MT_DEVICE_GRE): + printf(" | %-13s", "Device-GRE"); + break; + case PTE_BLOCK_MEMTYPE(MT_NORMAL_NC): + printf(" | %-13s", "Normal-NC"); + break; + case PTE_BLOCK_MEMTYPE(MT_NORMAL): + printf(" | %-13s", "Normal"); + break; + default: + printf(" | %-13s", "Unknown"); + } +} + +static void pretty_print_block_memtype(u64 pte) +{ + u64 share = pte & (3 << 8); + + switch (share) { + case PTE_BLOCK_NON_SHARE: + printf(" | %-16s", "Non-shareable"); + break; + case PTE_BLOCK_OUTER_SHARE: + printf(" | %-16s", "Outer-shareable"); + break; + case PTE_BLOCK_INNER_SHARE: + printf(" | %-16s", "Inner-shareable"); + break; + default: + printf(" | %-16s", "Unknown"); + } +} + +static void print_pte(u64 pte, int level) +{ + if (PTE_IS_TABLE(pte, level)) { + printf(" %-5s", "Table"); + pretty_print_table_attrs(pte); + } else { + pretty_print_pte_type(pte); + pretty_print_block_attrs(pte); + pretty_print_block_memtype(pte); + } + printf("\n"); +} + +static bool pagetable_print_entry(u64 start_attrs, u64 end, int va_bits, int level, void *priv) +{ + u64 _addr = start_attrs & GENMASK_ULL(va_bits, PAGE_SHIFT); + int indent = va_bits < 39 ? level - 1 : level; + + printf("%*s", indent * 2, ""); + if (PTE_IS_TABLE(start_attrs, level)) + printf("[%#011llx]%14s", _addr, ""); + else + printf("[%#011llx - %#011llx]", _addr, end); + + printf("%*s | ", (3 - level) * 2, ""); + print_pte(start_attrs, level); + + return false; +} + +void walk_pagetable(u64 ttbr, u64 tcr, pte_walker_cb_t cb, void *priv) +{ + __pagetable_walk(ttbr, tcr, 0, cb, priv); +} + +void dump_pagetable(u64 ttbr, u64 tcr) +{ + u64 va_bits = 64 - (tcr & (BIT(6) - 1)); + + printf("Walking pagetable at %p, va_bits: %lld. Using %d levels\n", (void *)ttbr, + va_bits, va_bits < 39 ? 3 : 4); + walk_pagetable(ttbr, tcr, pagetable_print_entry, NULL); +} + /* Returns the estimated required size of all page tables */ __weak u64 get_page_table_size(void) { u64 one_pt = MAX_PTE_ENTRIES * sizeof(u64); diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index 52cb18b9ed5e..eed121784247 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -128,8 +128,63 @@ static inline void set_ttbr_tcr_mair(int el, u64 table, u64 tcr, u64 attr) } asm volatile("isb"); }
+static inline void get_ttbr_tcr_mair(int el, u64 *table, u64 *tcr, u64 *attr) +{ + if (el == 1) { + asm volatile("mrs %0, ttbr0_el1" : "=r" (*table)); + asm volatile("mrs %0, tcr_el1" : "=r" (*tcr)); + asm volatile("mrs %0, mair_el1" : "=r" (*attr)); + } else if (el == 2) { + asm volatile("mrs %0, ttbr0_el2" : "=r" (*table)); + asm volatile("mrs %0, tcr_el2" : "=r" (*tcr)); + asm volatile("mrs %0, mair_el2" : "=r" (*attr)); + } else if (el == 3) { + asm volatile("mrs %0, ttbr0_el3" : "=r" (*table)); + asm volatile("mrs %0, tcr_el3" : "=r" (*tcr)); + asm volatile("mrs %0, mair_el3" : "=r" (*attr)); + } else { + hang(); + } +} + +/** + * pte_walker_cb_t - callback function for walk_pagetable. + * + * This function is called when the walker finds a table entry + * or after parsing a block or pages. For a table the @end address + * is 0, and @addr is the address of the table. Otherwise, they + * are the start and end physical addresses of the block or page. + * + * @addr: PTE start address (PA), or address of table. Includes attributes. + * @end: End address of the region (or 0 for a table) + * @va_bits: Number of bits in the virtual address + * @level: Table level + * @priv: Private data for the callback + * + * Return: true to stop walking, false to continue + */ +typedef bool (*pte_walker_cb_t)(u64 addr, u64 end, int va_bits, int level, void *priv); + +/** + * walk_pagetable() - Walk the pagetable at ttbr and call cb for each region + * + * @ttbr: Address of the pagetable to dump + * @tcr: TCR value to use + * @cb: Callback function to call for each entry + * @priv: Private data for the callback + */ +void walk_pagetable(u64 ttbr, u64 tcr, pte_walker_cb_t cb, void *priv); + +/** + * dump_pagetable() - Dump the pagetable at ttbr by printing each region + * + * @ttbr: Address of the pagetable to dump + * @tcr: TCR value to use + */ +void dump_pagetable(u64 ttbr, u64 tcr); + struct mm_region { u64 virt; u64 phys; u64 size;

Add some brief documentation on using dump_pagetables() to print out U-Boot's pagetables during boot.
Signed-off-by: Caleb Connolly caleb.connolly@linaro.org --- doc/arch/arm64.rst | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+)
diff --git a/doc/arch/arm64.rst b/doc/arch/arm64.rst index 7c0713504c47..0c851304aad4 100644 --- a/doc/arch/arm64.rst +++ b/doc/arch/arm64.rst @@ -47,8 +47,45 @@ Notes
6. CONFIG_ARM64 instead of CONFIG_ARMV8 is used to distinguish aarch64 and aarch32 specific codes.
+MMU +--- + +U-Boot uses a simple page table for MMU setup. It uses the smallest number of bits +possible for the virtual address based on the maximum memory address (see the logic +in ``get_tcr()``). + +As with all platforms, U-Boot on ARM64 uses a 1:1 mapping of virtual to physical addresses. +In general, the memory map is expected to remain static once the MMU is enabled. + +It is possible to debug the pagetable generated by U-Boot with the built in +``dump_pagetable()`` and ``walk_pagetable()`` functions (the former being a simple +wrapper for the latter). For example the following can be added to ``setup_all_pgtables()`` +after the first call to ``setup_pgtables()``: + +.. code-block:: c + + dump_pagetable(gd->arch.tlb_addr, get_tcr(NULL, NULL)); + +This will result in a print like the following: + +.. code-block:: text + + Walking pagetable at 000000017df90000, va_bits: 36. Using 3 levels + [0x17df91000] | Table | | + [0x17df92000] | Table | | + [0x000001000 - 0x000200000] | Pages | Device-nGnRnE | Non-shareable + [0x000200000 - 0x040000000] | Block | Device-nGnRnE | Non-shareable + [0x040000000 - 0x080000000] | Block | Device-nGnRnE | Non-shareable + [0x080000000 - 0x140000000] | Block | Normal | Inner-shareable + [0x17df93000] | Table | | + [0x140000000 - 0x17de00000] | Block | Normal | Inner-shareable + [0x17df94000] | Table | | + [0x17de00000 - 0x17dfa0000] | Pages | Normal | Inner-shareable + +For more information, please refer to the function documentation in +``arch/arm/include/asm/armv8/mmu.h``.
Contributors ------------ * Tom Rini trini@ti.com

On Fri, Jun 07, 2024 at 10:42:51PM +0200, Caleb Connolly wrote:
Add some brief documentation on using dump_pagetables() to print out U-Boot's pagetables during boot.
Signed-off-by: Caleb Connolly caleb.connolly@linaro.org
doc/arch/arm64.rst | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+)
diff --git a/doc/arch/arm64.rst b/doc/arch/arm64.rst index 7c0713504c47..0c851304aad4 100644 --- a/doc/arch/arm64.rst +++ b/doc/arch/arm64.rst @@ -47,8 +47,45 @@ Notes
- CONFIG_ARM64 instead of CONFIG_ARMV8 is used to distinguish aarch64 and aarch32 specific codes.
+MMU +---
+U-Boot uses a simple page table for MMU setup. It uses the smallest number of bits +possible for the virtual address based on the maximum memory address (see the logic +in ``get_tcr()``).
+As with all platforms, U-Boot on ARM64 uses a 1:1 mapping of virtual to physical addresses. +In general, the memory map is expected to remain static once the MMU is enabled.
+It is possible to debug the pagetable generated by U-Boot with the built in +``dump_pagetable()`` and ``walk_pagetable()`` functions (the former being a simple +wrapper for the latter). For example the following can be added to ``setup_all_pgtables()`` +after the first call to ``setup_pgtables()``:
+.. code-block:: c
- dump_pagetable(gd->arch.tlb_addr, get_tcr(NULL, NULL));
+This will result in a print like the following:
+.. code-block:: text
- Walking pagetable at 000000017df90000, va_bits: 36. Using 3 levels
- [0x17df91000] | Table | |
[0x17df92000] | Table | |
[0x000001000 - 0x000200000] | Pages | Device-nGnRnE | Non-shareable
[0x000200000 - 0x040000000] | Block | Device-nGnRnE | Non-shareable
- [0x040000000 - 0x080000000] | Block | Device-nGnRnE | Non-shareable
- [0x080000000 - 0x140000000] | Block | Normal | Inner-shareable
- [0x17df93000] | Table | |
[0x140000000 - 0x17de00000] | Block | Normal | Inner-shareable
[0x17df94000] | Table | |
[0x17de00000 - 0x17dfa0000] | Pages | Normal | Inner-shareable
+For more information, please refer to the function documentation in +``arch/arm/include/asm/armv8/mmu.h``.
Since the function comments are kernel-doc style, can you please rework this to leverage "kernel-doc::" and make sure it renders nicely? And this may show we should be putting some function comments in 2/3 of this series as well.
participants (2)
-
Caleb Connolly
-
Tom Rini