[PATCH 1/6] arm: a37xx: pci: Don't put link into LTSSM Recovery state during probe

During our debugging of the Aardvark driver in Linux we have discovered that the PCIE_CORE_LINK_CTRL_STAT_REG register in fact controls standard PCIe Link Control Register for PCIe Root Bridge. This led us to discover that the name of the PCIE_CORE_LINK_TRAINING macro and the corresponding comment by this macro's usage is misleading; this bit in fact controls Retrain Link, which, according to PCIe base spec is defined as:
A write of 1b to this bit initiates Link retraining by directing the Physical Layer LTSSM to the Recovery state. If the LTSSM is already in Recovery or Configuration, re-entering Recovery is permitted but not required.
Entering Recovery state is normally done from LTSSM L0, L0s and L1 states. But since the pci-aardvark.c driver enables Link Training just a few lines above, the controller is not in L0 ready state yet. So setting aardvark bit PCIE_CORE_LINK_TRAINING does not actually enter Recovery state at this place.
Moreover, trying to enter LTSSM Recovery state without other configuration is causing issues for some cards (e.g. Atheros AR9xxx and QCA9xxx). Since Recovery state is not entered, these issues are not triggered.
Remove code which tries to enter LTSSM Recovery state completely.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- drivers/pci/pci-aardvark.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index c43d4f309b19..06c567e236f9 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -613,11 +613,6 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
- /* Start link training */ - reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG); - reg |= PCIE_CORE_LINK_TRAINING; - advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG); - /* Wait for PCIe link up */ if (pcie_advk_wait_for_link(pcie)) return -ENXIO;

Disable Root Bridge I/O space, memory space and bus mastering in Aardvark's remove method, which is called before booting Linux kernel.
This ensures that PCIe device which was initialized and used by U-Boot cannot do new DMA transfers until Linux initializes PCI subsystem and loads appropriate drivers for the device.
During initialization of PCI subsystem Linux in fact disables this bus mastering on Root Bridge (and later enables it when driver is loaded and configured), but there is a possibility of a small window after U-Boot boots Linux when bus mastering is enabled, which is not correct.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- drivers/pci/pci-aardvark.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index 06c567e236f9..ee81b2ea46d3 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -675,6 +675,12 @@ static int pcie_advk_remove(struct udevice *dev) struct pcie_advk *pcie = dev_get_priv(dev); u32 reg;
+ reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); + reg &= ~(PCIE_CORE_CMD_MEM_ACCESS_EN | + PCIE_CORE_CMD_IO_ACCESS_EN | + PCIE_CORE_CMD_MEM_IO_REQ_EN); + advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG); + reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); reg &= ~LINK_TRAINING_EN; advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);

Change DT compatible string for A3700 PCIe from 'marvell,armada-37xx-pcie' to 'marvell,armada-3700-pcie' to make U-Boot A3700 PCIe DT node compatible with Linux' DT node.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- arch/arm/dts/armada-37xx.dtsi | 2 +- drivers/pci/pci-aardvark.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/dts/armada-37xx.dtsi b/arch/arm/dts/armada-37xx.dtsi index a1052add0cca..b7d325b40577 100644 --- a/arch/arm/dts/armada-37xx.dtsi +++ b/arch/arm/dts/armada-37xx.dtsi @@ -323,7 +323,7 @@ };
pcie0: pcie@d0070000 { - compatible = "marvell,armada-37xx-pcie"; + compatible = "marvell,armada-3700-pcie"; reg = <0 0xd0070000 0 0x20000>; #address-cells = <3>; #size-cells = <2>; diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index ee81b2ea46d3..ae1a20551fed 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -717,7 +717,7 @@ static const struct dm_pci_ops pcie_advk_ops = { };
static const struct udevice_id pcie_advk_ids[] = { - { .compatible = "marvell,armada-37xx-pcie" }, + { .compatible = "marvell,armada-3700-pcie" }, { } };

Find PCIe DT node by compatible string instead of retrieving it by using hardcoded DT path.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- arch/arm/mach-mvebu/armada3700/cpu.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/arm/mach-mvebu/armada3700/cpu.c b/arch/arm/mach-mvebu/armada3700/cpu.c index 0cf60d7cdd7d..1abac7c9a47a 100644 --- a/arch/arm/mach-mvebu/armada3700/cpu.c +++ b/arch/arm/mach-mvebu/armada3700/cpu.c @@ -53,8 +53,6 @@ #define A3700_PTE_BLOCK_DEVICE \ (PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | PTE_BLOCK_NON_SHARE)
-#define PCIE_PATH "/soc/pcie@d0070000" - DECLARE_GLOBAL_DATA_PTR;
static struct mm_region mvebu_mem_map[MAX_MEM_MAP_REGIONS] = { @@ -288,7 +286,7 @@ int a3700_fdt_fix_pcie_regions(void *blob) const u32 *ranges; int node, len;
- node = fdt_path_offset(blob, PCIE_PATH); + node = fdt_node_offset_by_compatible(blob, -1, "marvell,armada-3700-pcie"); if (node < 0) return node;

Current version of this function uses a lot of incorrect assumptions about the `ranges` DT property:
* parent(#address-cells) == 2 * #size-cells == 2 * number of entries == 2 * address size of first entry == 0x1000000 * second child address entry == base + 0x1000000
Trying to increase PCIe MEM space to more than 16 MiB leads to an overlap with PCIe IO space, and trying to define additional MEM space (as a third entry in the `ranges` DT property) causes U-Boot to crash when booting the kernel.
## Flattened Device Tree blob at 04f00000 Booting using the fdt blob at 0x4f00000 Loading Device Tree to 000000001fb01000, end 000000001fb08f12 ... OK ERROR: board-specific fdt fixup failed: <unknown error> - must RESET the board to recover.
Fix a3700_fdt_fix_pcie_regions() to properly parse and update all addresses in the `ranges` property according to https://elinux.org/Device_Tree_Usage#PCI_Address_Translation
Now it is possible to increase PCIe MEM space from 16MiB to maximal value of 128MiB - 64KiB.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz Fixes: cb2ddb291ee6 ("arm64: mvebu: a37xx: add device-tree fixer for PCIe regions") --- arch/arm/mach-mvebu/armada3700/cpu.c | 74 ++++++++++++++++++++++------ 1 file changed, 60 insertions(+), 14 deletions(-)
diff --git a/arch/arm/mach-mvebu/armada3700/cpu.c b/arch/arm/mach-mvebu/armada3700/cpu.c index 1abac7c9a47a..9aec0ce9a430 100644 --- a/arch/arm/mach-mvebu/armada3700/cpu.c +++ b/arch/arm/mach-mvebu/armada3700/cpu.c @@ -8,6 +8,7 @@ #include <cpu_func.h> #include <dm.h> #include <fdtdec.h> +#include <fdt_support.h> #include <init.h> #include <asm/global_data.h> #include <linux/bitops.h> @@ -280,36 +281,81 @@ static u32 find_pcie_window_base(void) return -1; }
+static int fdt_setprop_inplace_u32_partial(void *blob, int node, + const char *name, + u32 idx, u32 val) +{ + val = cpu_to_fdt32(val); + + return fdt_setprop_inplace_namelen_partial(blob, node, name, + strlen(name), + idx * sizeof(u32), + &val, sizeof(u32)); +} + int a3700_fdt_fix_pcie_regions(void *blob) { - u32 new_ranges[14], base; + int acells, pacells, scells; + u32 base, fix_offset; const u32 *ranges; - int node, len; + int node, pnode; + int ret, i, len; + + base = find_pcie_window_base(); + if (base == -1) + return -ENOENT;
node = fdt_node_offset_by_compatible(blob, -1, "marvell,armada-3700-pcie"); if (node < 0) return node;
ranges = fdt_getprop(blob, node, "ranges", &len); - if (!ranges) + if (!ranges || len % sizeof(u32)) return -ENOENT;
- if (len != sizeof(new_ranges)) - return -EINVAL; - - memcpy(new_ranges, ranges, len); + /* + * The "ranges" property is an array of + * { <child address> <parent address> <size in child address space> } + * + * All 3 elements can span a diffent number of cells. Fetch their sizes. + */ + pnode = fdt_parent_offset(blob, node); + acells = fdt_address_cells(blob, node); + pacells = fdt_address_cells(blob, pnode); + scells = fdt_size_cells(blob, node);
- base = find_pcie_window_base(); - if (base == -1) + /* Child PCI addresses always use 3 cells */ + if (acells != 3) return -ENOENT;
- new_ranges[2] = cpu_to_fdt32(base); - new_ranges[4] = new_ranges[2]; + /* Calculate fixup offset from first child address (in last cell) */ + fix_offset = base - fdt32_to_cpu(ranges[2]);
- new_ranges[9] = cpu_to_fdt32(base + 0x1000000); - new_ranges[11] = new_ranges[9]; + /* + * Fix address (last cell) of each child address and each parent + * address + */ + for (i = 0; i < len / sizeof(u32); i += acells + pacells + scells) { + int idx; + + /* fix child address */ + idx = i + acells - 1; + ret = fdt_setprop_inplace_u32_partial(blob, node, "ranges", idx, + fdt32_to_cpu(ranges[idx]) + + fix_offset); + if (ret) + return ret; + + /* fix parent address */ + idx = i + acells + pacells - 1; + ret = fdt_setprop_inplace_u32_partial(blob, node, "ranges", idx, + fdt32_to_cpu(ranges[idx]) + + fix_offset); + if (ret) + return ret; + }
- return fdt_setprop_inplace(blob, node, "ranges", new_ranges, len); + return 0; }
void reset_cpu(void)

For some configurations with more PCIe cards and PCIe bridges 16 MiB of PCIe MEM space is not enough. Since TF-A already allocates a 128 MiB CPU window for PCIe and IO port space is only 64 KiB in total, use all the remaining space for PCIe MEM.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- arch/arm/dts/armada-37xx.dtsi | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm/dts/armada-37xx.dtsi b/arch/arm/dts/armada-37xx.dtsi index b7d325b40577..f85fe38ad11c 100644 --- a/arch/arm/dts/armada-37xx.dtsi +++ b/arch/arm/dts/armada-37xx.dtsi @@ -333,9 +333,9 @@
bus-range = <0 0xff>; ranges = <0x82000000 0 0xe8000000 - 0 0xe8000000 0 0x1000000 /* Port 0 MEM */ - 0x81000000 0 0xe9000000 - 0 0xe9000000 0 0x10000>; /* Port 0 IO*/ + 0 0xe8000000 0 0xefff0000 /* Port 0 MEM */ + 0x81000000 0 0xefff0000 + 0 0xefff0000 0 0x10000>; /* Port 0 IO*/ }; }; };

On Monday 17 May 2021 08:39:56 Pali Rohár wrote:
For some configurations with more PCIe cards and PCIe bridges 16 MiB of PCIe MEM space is not enough. Since TF-A already allocates a 128 MiB CPU window for PCIe and IO port space is only 64 KiB in total, use all the remaining space for PCIe MEM.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
arch/arm/dts/armada-37xx.dtsi | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm/dts/armada-37xx.dtsi b/arch/arm/dts/armada-37xx.dtsi index b7d325b40577..f85fe38ad11c 100644 --- a/arch/arm/dts/armada-37xx.dtsi +++ b/arch/arm/dts/armada-37xx.dtsi @@ -333,9 +333,9 @@
bus-range = <0 0xff>; ranges = <0x82000000 0 0xe8000000
0 0xe8000000 0 0x1000000 /* Port 0 MEM */
0x81000000 0 0xe9000000
0 0xe9000000 0 0x10000>; /* Port 0 IO*/
0 0xe8000000 0 0xefff0000 /* Port 0 MEM */
Ou, this is mistake. Size should be 0xefff0000-0xe8000000 = 0x7ff0000.
I will send a new patch version.
0x81000000 0 0xefff0000
}; };0 0xefff0000 0 0x10000>; /* Port 0 IO*/
};
2.20.1

During our debugging of the Aardvark driver in Linux we have discovered that the PCIE_CORE_LINK_CTRL_STAT_REG register in fact controls standard PCIe Link Control Register for PCIe Root Bridge. This led us to discover that the name of the PCIE_CORE_LINK_TRAINING macro and the corresponding comment by this macro's usage is misleading; this bit in fact controls Retrain Link, which, according to PCIe base spec is defined as:
A write of 1b to this bit initiates Link retraining by directing the Physical Layer LTSSM to the Recovery state. If the LTSSM is already in Recovery or Configuration, re-entering Recovery is permitted but not required.
Entering Recovery state is normally done from LTSSM L0, L0s and L1 states. But since the pci-aardvark.c driver enables Link Training just a few lines above, the controller is not in L0 ready state yet. So setting aardvark bit PCIE_CORE_LINK_TRAINING does not actually enter Recovery state at this place.
Moreover, trying to enter LTSSM Recovery state without other configuration is causing issues for some cards (e.g. Atheros AR9xxx and QCA9xxx). Since Recovery state is not entered, these issues are not triggered.
Remove code which tries to enter LTSSM Recovery state completely.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- drivers/pci/pci-aardvark.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index c43d4f309b19..06c567e236f9 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -613,11 +613,6 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
- /* Start link training */ - reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG); - reg |= PCIE_CORE_LINK_TRAINING; - advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG); - /* Wait for PCIe link up */ if (pcie_advk_wait_for_link(pcie)) return -ENXIO;

Disable Root Bridge I/O space, memory space and bus mastering in Aardvark's remove method, which is called before booting Linux kernel.
This ensures that PCIe device which was initialized and used by U-Boot cannot do new DMA transfers until Linux initializes PCI subsystem and loads appropriate drivers for the device.
During initialization of PCI subsystem Linux in fact disables this bus mastering on Root Bridge (and later enables it when driver is loaded and configured), but there is a possibility of a small window after U-Boot boots Linux when bus mastering is enabled, which is not correct.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- drivers/pci/pci-aardvark.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index 06c567e236f9..ee81b2ea46d3 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -675,6 +675,12 @@ static int pcie_advk_remove(struct udevice *dev) struct pcie_advk *pcie = dev_get_priv(dev); u32 reg;
+ reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); + reg &= ~(PCIE_CORE_CMD_MEM_ACCESS_EN | + PCIE_CORE_CMD_IO_ACCESS_EN | + PCIE_CORE_CMD_MEM_IO_REQ_EN); + advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG); + reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); reg &= ~LINK_TRAINING_EN; advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);

On 26.05.21 17:59, Pali Rohár wrote:
Disable Root Bridge I/O space, memory space and bus mastering in Aardvark's remove method, which is called before booting Linux kernel.
This ensures that PCIe device which was initialized and used by U-Boot cannot do new DMA transfers until Linux initializes PCI subsystem and loads appropriate drivers for the device.
During initialization of PCI subsystem Linux in fact disables this bus mastering on Root Bridge (and later enables it when driver is loaded and configured), but there is a possibility of a small window after U-Boot boots Linux when bus mastering is enabled, which is not correct.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
Reviewed-by: Stefan Roese sr@denx.de
Thanks, Stefan
drivers/pci/pci-aardvark.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index 06c567e236f9..ee81b2ea46d3 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -675,6 +675,12 @@ static int pcie_advk_remove(struct udevice *dev) struct pcie_advk *pcie = dev_get_priv(dev); u32 reg;
- reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG);
- reg &= ~(PCIE_CORE_CMD_MEM_ACCESS_EN |
PCIE_CORE_CMD_IO_ACCESS_EN |
PCIE_CORE_CMD_MEM_IO_REQ_EN);
- advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG);
- reg = advk_readl(pcie, PCIE_CORE_CTRL0_REG); reg &= ~LINK_TRAINING_EN; advk_writel(pcie, reg, PCIE_CORE_CTRL0_REG);
Viele Grüße, Stefan

Change DT compatible string for A3700 PCIe from 'marvell,armada-37xx-pcie' to 'marvell,armada-3700-pcie' to make U-Boot A3700 PCIe DT node compatible with Linux' DT node.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- arch/arm/dts/armada-37xx.dtsi | 2 +- drivers/pci/pci-aardvark.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/dts/armada-37xx.dtsi b/arch/arm/dts/armada-37xx.dtsi index a1052add0cca..b7d325b40577 100644 --- a/arch/arm/dts/armada-37xx.dtsi +++ b/arch/arm/dts/armada-37xx.dtsi @@ -323,7 +323,7 @@ };
pcie0: pcie@d0070000 { - compatible = "marvell,armada-37xx-pcie"; + compatible = "marvell,armada-3700-pcie"; reg = <0 0xd0070000 0 0x20000>; #address-cells = <3>; #size-cells = <2>; diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index ee81b2ea46d3..ae1a20551fed 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -717,7 +717,7 @@ static const struct dm_pci_ops pcie_advk_ops = { };
static const struct udevice_id pcie_advk_ids[] = { - { .compatible = "marvell,armada-37xx-pcie" }, + { .compatible = "marvell,armada-3700-pcie" }, { } };

On 26.05.21 17:59, Pali Rohár wrote:
Change DT compatible string for A3700 PCIe from 'marvell,armada-37xx-pcie' to 'marvell,armada-3700-pcie' to make U-Boot A3700 PCIe DT node compatible with Linux' DT node.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
Reviewed-by: Stefan Roese sr@denx.de
Thanks, Stefan
arch/arm/dts/armada-37xx.dtsi | 2 +- drivers/pci/pci-aardvark.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/dts/armada-37xx.dtsi b/arch/arm/dts/armada-37xx.dtsi index a1052add0cca..b7d325b40577 100644 --- a/arch/arm/dts/armada-37xx.dtsi +++ b/arch/arm/dts/armada-37xx.dtsi @@ -323,7 +323,7 @@ };
pcie0: pcie@d0070000 {
compatible = "marvell,armada-37xx-pcie";
compatible = "marvell,armada-3700-pcie"; reg = <0 0xd0070000 0 0x20000>; #address-cells = <3>; #size-cells = <2>;
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index ee81b2ea46d3..ae1a20551fed 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -717,7 +717,7 @@ static const struct dm_pci_ops pcie_advk_ops = { };
static const struct udevice_id pcie_advk_ids[] = {
- { .compatible = "marvell,armada-37xx-pcie" },
- { .compatible = "marvell,armada-3700-pcie" }, { } };
Viele Grüße, Stefan

Find PCIe DT node by compatible string instead of retrieving it by using hardcoded DT path.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- arch/arm/mach-mvebu/armada3700/cpu.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/arm/mach-mvebu/armada3700/cpu.c b/arch/arm/mach-mvebu/armada3700/cpu.c index 0cf60d7cdd7d..1abac7c9a47a 100644 --- a/arch/arm/mach-mvebu/armada3700/cpu.c +++ b/arch/arm/mach-mvebu/armada3700/cpu.c @@ -53,8 +53,6 @@ #define A3700_PTE_BLOCK_DEVICE \ (PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | PTE_BLOCK_NON_SHARE)
-#define PCIE_PATH "/soc/pcie@d0070000" - DECLARE_GLOBAL_DATA_PTR;
static struct mm_region mvebu_mem_map[MAX_MEM_MAP_REGIONS] = { @@ -288,7 +286,7 @@ int a3700_fdt_fix_pcie_regions(void *blob) const u32 *ranges; int node, len;
- node = fdt_path_offset(blob, PCIE_PATH); + node = fdt_node_offset_by_compatible(blob, -1, "marvell,armada-3700-pcie"); if (node < 0) return node;

On 26.05.21 17:59, Pali Rohár wrote:
Find PCIe DT node by compatible string instead of retrieving it by using hardcoded DT path.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
Reviewed-by: Stefan Roese sr@denx.de
Thanks, Stefan
arch/arm/mach-mvebu/armada3700/cpu.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/arm/mach-mvebu/armada3700/cpu.c b/arch/arm/mach-mvebu/armada3700/cpu.c index 0cf60d7cdd7d..1abac7c9a47a 100644 --- a/arch/arm/mach-mvebu/armada3700/cpu.c +++ b/arch/arm/mach-mvebu/armada3700/cpu.c @@ -53,8 +53,6 @@ #define A3700_PTE_BLOCK_DEVICE \ (PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | PTE_BLOCK_NON_SHARE)
-#define PCIE_PATH "/soc/pcie@d0070000"
DECLARE_GLOBAL_DATA_PTR;
static struct mm_region mvebu_mem_map[MAX_MEM_MAP_REGIONS] = {
@@ -288,7 +286,7 @@ int a3700_fdt_fix_pcie_regions(void *blob) const u32 *ranges; int node, len;
- node = fdt_path_offset(blob, PCIE_PATH);
- node = fdt_node_offset_by_compatible(blob, -1, "marvell,armada-3700-pcie"); if (node < 0) return node;
Viele Grüße, Stefan

Current version of this function uses a lot of incorrect assumptions about the `ranges` DT property:
* parent(#address-cells) == 2 * #size-cells == 2 * number of entries == 2 * address size of first entry == 0x1000000 * second child address entry == base + 0x1000000
Trying to increase PCIe MEM space to more than 16 MiB leads to an overlap with PCIe IO space, and trying to define additional MEM space (as a third entry in the `ranges` DT property) causes U-Boot to crash when booting the kernel.
## Flattened Device Tree blob at 04f00000 Booting using the fdt blob at 0x4f00000 Loading Device Tree to 000000001fb01000, end 000000001fb08f12 ... OK ERROR: board-specific fdt fixup failed: <unknown error> - must RESET the board to recover.
Fix a3700_fdt_fix_pcie_regions() to properly parse and update all addresses in the `ranges` property according to https://elinux.org/Device_Tree_Usage#PCI_Address_Translation
Now it is possible to increase PCIe MEM space from 16 MiB to maximal value of 127 MiB.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz Fixes: cb2ddb291ee6 ("arm64: mvebu: a37xx: add device-tree fixer for PCIe regions") --- arch/arm/mach-mvebu/armada3700/cpu.c | 74 ++++++++++++++++++++++------ 1 file changed, 60 insertions(+), 14 deletions(-)
diff --git a/arch/arm/mach-mvebu/armada3700/cpu.c b/arch/arm/mach-mvebu/armada3700/cpu.c index 1abac7c9a47a..9aec0ce9a430 100644 --- a/arch/arm/mach-mvebu/armada3700/cpu.c +++ b/arch/arm/mach-mvebu/armada3700/cpu.c @@ -8,6 +8,7 @@ #include <cpu_func.h> #include <dm.h> #include <fdtdec.h> +#include <fdt_support.h> #include <init.h> #include <asm/global_data.h> #include <linux/bitops.h> @@ -280,36 +281,81 @@ static u32 find_pcie_window_base(void) return -1; }
+static int fdt_setprop_inplace_u32_partial(void *blob, int node, + const char *name, + u32 idx, u32 val) +{ + val = cpu_to_fdt32(val); + + return fdt_setprop_inplace_namelen_partial(blob, node, name, + strlen(name), + idx * sizeof(u32), + &val, sizeof(u32)); +} + int a3700_fdt_fix_pcie_regions(void *blob) { - u32 new_ranges[14], base; + int acells, pacells, scells; + u32 base, fix_offset; const u32 *ranges; - int node, len; + int node, pnode; + int ret, i, len; + + base = find_pcie_window_base(); + if (base == -1) + return -ENOENT;
node = fdt_node_offset_by_compatible(blob, -1, "marvell,armada-3700-pcie"); if (node < 0) return node;
ranges = fdt_getprop(blob, node, "ranges", &len); - if (!ranges) + if (!ranges || len % sizeof(u32)) return -ENOENT;
- if (len != sizeof(new_ranges)) - return -EINVAL; - - memcpy(new_ranges, ranges, len); + /* + * The "ranges" property is an array of + * { <child address> <parent address> <size in child address space> } + * + * All 3 elements can span a diffent number of cells. Fetch their sizes. + */ + pnode = fdt_parent_offset(blob, node); + acells = fdt_address_cells(blob, node); + pacells = fdt_address_cells(blob, pnode); + scells = fdt_size_cells(blob, node);
- base = find_pcie_window_base(); - if (base == -1) + /* Child PCI addresses always use 3 cells */ + if (acells != 3) return -ENOENT;
- new_ranges[2] = cpu_to_fdt32(base); - new_ranges[4] = new_ranges[2]; + /* Calculate fixup offset from first child address (in last cell) */ + fix_offset = base - fdt32_to_cpu(ranges[2]);
- new_ranges[9] = cpu_to_fdt32(base + 0x1000000); - new_ranges[11] = new_ranges[9]; + /* + * Fix address (last cell) of each child address and each parent + * address + */ + for (i = 0; i < len / sizeof(u32); i += acells + pacells + scells) { + int idx; + + /* fix child address */ + idx = i + acells - 1; + ret = fdt_setprop_inplace_u32_partial(blob, node, "ranges", idx, + fdt32_to_cpu(ranges[idx]) + + fix_offset); + if (ret) + return ret; + + /* fix parent address */ + idx = i + acells + pacells - 1; + ret = fdt_setprop_inplace_u32_partial(blob, node, "ranges", idx, + fdt32_to_cpu(ranges[idx]) + + fix_offset); + if (ret) + return ret; + }
- return fdt_setprop_inplace(blob, node, "ranges", new_ranges, len); + return 0; }
void reset_cpu(void)

On 26.05.21 17:59, Pali Rohár wrote:
Current version of this function uses a lot of incorrect assumptions about the `ranges` DT property:
- parent(#address-cells) == 2
- #size-cells == 2
- number of entries == 2
- address size of first entry == 0x1000000
- second child address entry == base + 0x1000000
Trying to increase PCIe MEM space to more than 16 MiB leads to an overlap with PCIe IO space, and trying to define additional MEM space (as a third entry in the `ranges` DT property) causes U-Boot to crash when booting the kernel.
## Flattened Device Tree blob at 04f00000 Booting using the fdt blob at 0x4f00000 Loading Device Tree to 000000001fb01000, end 000000001fb08f12 ... OK ERROR: board-specific fdt fixup failed: <unknown error> - must RESET the board to recover.
Fix a3700_fdt_fix_pcie_regions() to properly parse and update all addresses in the `ranges` property according to https://elinux.org/Device_Tree_Usage#PCI_Address_Translation
Now it is possible to increase PCIe MEM space from 16 MiB to maximal value of 127 MiB.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz Fixes: cb2ddb291ee6 ("arm64: mvebu: a37xx: add device-tree fixer for PCIe regions")
Reviewed-by: Stefan Roese sr@denx.de
Thanks, Stefan
arch/arm/mach-mvebu/armada3700/cpu.c | 74 ++++++++++++++++++++++------ 1 file changed, 60 insertions(+), 14 deletions(-)
diff --git a/arch/arm/mach-mvebu/armada3700/cpu.c b/arch/arm/mach-mvebu/armada3700/cpu.c index 1abac7c9a47a..9aec0ce9a430 100644 --- a/arch/arm/mach-mvebu/armada3700/cpu.c +++ b/arch/arm/mach-mvebu/armada3700/cpu.c @@ -8,6 +8,7 @@ #include <cpu_func.h> #include <dm.h> #include <fdtdec.h> +#include <fdt_support.h> #include <init.h> #include <asm/global_data.h> #include <linux/bitops.h> @@ -280,36 +281,81 @@ static u32 find_pcie_window_base(void) return -1; }
+static int fdt_setprop_inplace_u32_partial(void *blob, int node,
const char *name,
u32 idx, u32 val)
+{
- val = cpu_to_fdt32(val);
- return fdt_setprop_inplace_namelen_partial(blob, node, name,
strlen(name),
idx * sizeof(u32),
&val, sizeof(u32));
+}
- int a3700_fdt_fix_pcie_regions(void *blob) {
- u32 new_ranges[14], base;
- int acells, pacells, scells;
- u32 base, fix_offset; const u32 *ranges;
- int node, len;
int node, pnode;
int ret, i, len;
base = find_pcie_window_base();
if (base == -1)
return -ENOENT;
node = fdt_node_offset_by_compatible(blob, -1, "marvell,armada-3700-pcie"); if (node < 0) return node;
ranges = fdt_getprop(blob, node, "ranges", &len);
- if (!ranges)
- if (!ranges || len % sizeof(u32)) return -ENOENT;
- if (len != sizeof(new_ranges))
return -EINVAL;
- memcpy(new_ranges, ranges, len);
- /*
* The "ranges" property is an array of
* { <child address> <parent address> <size in child address space> }
*
* All 3 elements can span a diffent number of cells. Fetch their sizes.
*/
- pnode = fdt_parent_offset(blob, node);
- acells = fdt_address_cells(blob, node);
- pacells = fdt_address_cells(blob, pnode);
- scells = fdt_size_cells(blob, node);
- base = find_pcie_window_base();
- if (base == -1)
- /* Child PCI addresses always use 3 cells */
- if (acells != 3) return -ENOENT;
- new_ranges[2] = cpu_to_fdt32(base);
- new_ranges[4] = new_ranges[2];
- /* Calculate fixup offset from first child address (in last cell) */
- fix_offset = base - fdt32_to_cpu(ranges[2]);
- new_ranges[9] = cpu_to_fdt32(base + 0x1000000);
- new_ranges[11] = new_ranges[9];
- /*
* Fix address (last cell) of each child address and each parent
* address
*/
- for (i = 0; i < len / sizeof(u32); i += acells + pacells + scells) {
int idx;
/* fix child address */
idx = i + acells - 1;
ret = fdt_setprop_inplace_u32_partial(blob, node, "ranges", idx,
fdt32_to_cpu(ranges[idx]) +
fix_offset);
if (ret)
return ret;
/* fix parent address */
idx = i + acells + pacells - 1;
ret = fdt_setprop_inplace_u32_partial(blob, node, "ranges", idx,
fdt32_to_cpu(ranges[idx]) +
fix_offset);
if (ret)
return ret;
- }
- return fdt_setprop_inplace(blob, node, "ranges", new_ranges, len);
return 0; }
void reset_cpu(void)
Viele Grüße, Stefan

For some configurations with more PCIe cards and PCIe bridges, 16 MiB of PCIe MEM space may not be enough. Since TF-A already allocates a 128 MiB CPU window for PCIe, and since IO port space is only 64 KiB in total, use all the remaining space (64 + 32 + 16 + 8 + 4 + 2 + 1 = 127 MiB) for PCIe MEM.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
--- Changes in v2: * Fix size for PCIe MEM --- arch/arm/dts/armada-37xx.dtsi | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/arch/arm/dts/armada-37xx.dtsi b/arch/arm/dts/armada-37xx.dtsi index b7d325b40577..2615b8c748c1 100644 --- a/arch/arm/dts/armada-37xx.dtsi +++ b/arch/arm/dts/armada-37xx.dtsi @@ -332,10 +332,17 @@ status = "disabled";
bus-range = <0 0xff>; + /* + * The 128 MiB address range [0xe8000000-0xf0000000] is + * dedicated for PCIe and can be assigned to 8 windows + * with size a power of two. Use one 64 KiB window for + * IO at the end and the remaining seven windows + * (totaling 127 MiB) for MEM. + */ ranges = <0x82000000 0 0xe8000000 - 0 0xe8000000 0 0x1000000 /* Port 0 MEM */ - 0x81000000 0 0xe9000000 - 0 0xe9000000 0 0x10000>; /* Port 0 IO*/ + 0 0xe8000000 0 0x7f00000 /* Port 0 MEM */ + 0x81000000 0 0xefff0000 + 0 0xefff0000 0 0x10000>; /* Port 0 IO*/ }; }; };

On 26.05.21 17:59, Pali Rohár wrote:
For some configurations with more PCIe cards and PCIe bridges, 16 MiB of PCIe MEM space may not be enough. Since TF-A already allocates a 128 MiB CPU window for PCIe, and since IO port space is only 64 KiB in total, use all the remaining space (64 + 32 + 16 + 8 + 4 + 2 + 1 = 127 MiB) for PCIe MEM.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
Reviewed-by: Stefan Roese sr@denx.de
Thanks, Stefan
Changes in v2:
- Fix size for PCIe MEM
arch/arm/dts/armada-37xx.dtsi | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/arch/arm/dts/armada-37xx.dtsi b/arch/arm/dts/armada-37xx.dtsi index b7d325b40577..2615b8c748c1 100644 --- a/arch/arm/dts/armada-37xx.dtsi +++ b/arch/arm/dts/armada-37xx.dtsi @@ -332,10 +332,17 @@ status = "disabled";
bus-range = <0 0xff>;
/*
* The 128 MiB address range [0xe8000000-0xf0000000] is
* dedicated for PCIe and can be assigned to 8 windows
* with size a power of two. Use one 64 KiB window for
* IO at the end and the remaining seven windows
* (totaling 127 MiB) for MEM.
*/ ranges = <0x82000000 0 0xe8000000
0 0xe8000000 0 0x1000000 /* Port 0 MEM */
0x81000000 0 0xe9000000
0 0xe9000000 0 0x10000>; /* Port 0 IO*/
0 0xe8000000 0 0x7f00000 /* Port 0 MEM */
0x81000000 0 0xefff0000
}; }; };0 0xefff0000 0 0x10000>; /* Port 0 IO*/
Viele Grüße, Stefan

The `ranges` DT property of the PCIe node is currently ignored by Aardvark driver - all entries are used as transparent PCIe MEM, despite some of them being defined for IO in DT.
This is because the driver does not setup PCIe outbound windows and thus a default configuration is used.
This can cause an external abort on CPU when a device driver tries to access non-MEM space.
Setup the PCIe windows according to the `ranges` property for all non-MEM resources (currently only IO) and also non-transparent MEM resources.
Because Linux expects that bootloader does not setup Aardvark PCIe windows, disable them before booting Linux.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz --- drivers/pci/pci-aardvark.c | 158 ++++++++++++++++++++++++++++++++++++- 1 file changed, 157 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index ae1a20551fed..96aa039bdc26 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -99,6 +99,46 @@ #define PCIE_CORE_CTRL2_STRICT_ORDER_ENABLE BIT(5) #define PCIE_CORE_CTRL2_ADDRWIN_MAP_ENABLE BIT(6)
+/* PCIe window configuration */ +#define OB_WIN_BASE_ADDR 0x4c00 +#define OB_WIN_BLOCK_SIZE 0x20 +#define OB_WIN_COUNT 8 +#define OB_WIN_REG_ADDR(win, offset) (OB_WIN_BASE_ADDR + \ + OB_WIN_BLOCK_SIZE * (win) + \ + (offset)) +#define OB_WIN_MATCH_LS(win) OB_WIN_REG_ADDR(win, 0x00) +#define OB_WIN_ENABLE BIT(0) +#define OB_WIN_MATCH_MS(win) OB_WIN_REG_ADDR(win, 0x04) +#define OB_WIN_REMAP_LS(win) OB_WIN_REG_ADDR(win, 0x08) +#define OB_WIN_REMAP_MS(win) OB_WIN_REG_ADDR(win, 0x0c) +#define OB_WIN_MASK_LS(win) OB_WIN_REG_ADDR(win, 0x10) +#define OB_WIN_MASK_MS(win) OB_WIN_REG_ADDR(win, 0x14) +#define OB_WIN_ACTIONS(win) OB_WIN_REG_ADDR(win, 0x18) +#define OB_WIN_DEFAULT_ACTIONS (OB_WIN_ACTIONS(OB_WIN_COUNT-1) + 0x4) +#define OB_WIN_FUNC_NUM_MASK GENMASK(31, 24) +#define OB_WIN_FUNC_NUM_SHIFT 24 +#define OB_WIN_FUNC_NUM_ENABLE BIT(23) +#define OB_WIN_BUS_NUM_BITS_MASK GENMASK(22, 20) +#define OB_WIN_BUS_NUM_BITS_SHIFT 20 +#define OB_WIN_MSG_CODE_ENABLE BIT(22) +#define OB_WIN_MSG_CODE_MASK GENMASK(21, 14) +#define OB_WIN_MSG_CODE_SHIFT 14 +#define OB_WIN_MSG_PAYLOAD_LEN BIT(12) +#define OB_WIN_ATTR_ENABLE BIT(11) +#define OB_WIN_ATTR_TC_MASK GENMASK(10, 8) +#define OB_WIN_ATTR_TC_SHIFT 8 +#define OB_WIN_ATTR_RELAXED BIT(7) +#define OB_WIN_ATTR_NOSNOOP BIT(6) +#define OB_WIN_ATTR_POISON BIT(5) +#define OB_WIN_ATTR_IDO BIT(4) +#define OB_WIN_TYPE_MASK GENMASK(3, 0) +#define OB_WIN_TYPE_SHIFT 0 +#define OB_WIN_TYPE_MEM 0x0 +#define OB_WIN_TYPE_IO 0x4 +#define OB_WIN_TYPE_CONFIG_TYPE0 0x8 +#define OB_WIN_TYPE_CONFIG_TYPE1 0x9 +#define OB_WIN_TYPE_MSG 0xc + /* LMI registers base address and register offsets */ #define LMI_BASE_ADDR 0x6000 #define CFG_REG (LMI_BASE_ADDR + 0x0) @@ -522,6 +562,86 @@ static int pcie_advk_wait_for_link(struct pcie_advk *pcie) return -ETIMEDOUT; }
+/* + * Set PCIe address window register which could be used for memory + * mapping. + */ +static void pcie_advk_set_ob_win(struct pcie_advk *pcie, u8 win_num, + phys_addr_t match, phys_addr_t remap, + phys_addr_t mask, u32 actions) +{ + advk_writel(pcie, OB_WIN_ENABLE | + lower_32_bits(match), OB_WIN_MATCH_LS(win_num)); + advk_writel(pcie, upper_32_bits(match), OB_WIN_MATCH_MS(win_num)); + advk_writel(pcie, lower_32_bits(remap), OB_WIN_REMAP_LS(win_num)); + advk_writel(pcie, upper_32_bits(remap), OB_WIN_REMAP_MS(win_num)); + advk_writel(pcie, lower_32_bits(mask), OB_WIN_MASK_LS(win_num)); + advk_writel(pcie, upper_32_bits(mask), OB_WIN_MASK_MS(win_num)); + advk_writel(pcie, actions, OB_WIN_ACTIONS(win_num)); +} + +static void pcie_advk_disable_ob_win(struct pcie_advk *pcie, u8 win_num) +{ + advk_writel(pcie, 0, OB_WIN_MATCH_LS(win_num)); + advk_writel(pcie, 0, OB_WIN_MATCH_MS(win_num)); + advk_writel(pcie, 0, OB_WIN_REMAP_LS(win_num)); + advk_writel(pcie, 0, OB_WIN_REMAP_MS(win_num)); + advk_writel(pcie, 0, OB_WIN_MASK_LS(win_num)); + advk_writel(pcie, 0, OB_WIN_MASK_MS(win_num)); + advk_writel(pcie, 0, OB_WIN_ACTIONS(win_num)); +} + +static void pcie_advk_set_ob_region(struct pcie_advk *pcie, int *wins, + struct pci_region *region, u32 actions) +{ + phys_addr_t phys_start = region->phys_start; + pci_addr_t bus_start = region->bus_start; + pci_size_t size = region->size; + phys_addr_t win_mask; + u64 win_size; + + if (*wins == -1) + return; + + /* + * The n-th PCIe window is configured by tuple (match, remap, mask) + * and an access to address A uses this window it if A matches the + * match with given mask. + * So every PCIe window size must be a power of two and every start + * address must be aligned to window size. Minimal size is 64 KiB + * because lower 16 bits of mask must be zero. + */ + while (*wins < OB_WIN_COUNT && size > 0) { + /* Calculate the largest aligned window size */ + win_size = (1ULL << (fls64(size) - 1)) | + (phys_start ? (1ULL << __ffs64(phys_start)) : 0); + win_size = 1ULL << __ffs64(win_size); + if (win_size < 0x10000) + break; + + dev_dbg(pcie->dev, + "Configuring PCIe window %d: [0x%llx-0x%llx] as 0x%x\n", + *wins, (u64)phys_start, (u64)phys_start + win_size, + actions); + win_mask = ~(win_size - 1) & ~0xffff; + pcie_advk_set_ob_win(pcie, *wins, phys_start, bus_start, + win_mask, actions); + + phys_start += win_size; + bus_start += win_size; + size -= win_size; + (*wins)++; + } + + if (size > 0) { + *wins = -1; + dev_err(pcie->dev, + "Invalid PCIe region [0x%llx-0x%llx]\n", + (u64)region->phys_start, + (u64)region->phys_start + region->size); + } +} + /** * pcie_advk_setup_hw() - PCIe initailzation * @@ -531,6 +651,8 @@ static int pcie_advk_wait_for_link(struct pcie_advk *pcie) */ static int pcie_advk_setup_hw(struct pcie_advk *pcie) { + struct pci_region *io, *mem, *pref; + int i, wins; u32 reg;
/* Set to Direct mode */ @@ -597,7 +719,9 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) * configurations (Default User Field: 0xD0074CFC) * are used to transparent address translation for * the outbound transactions. Thus, PCIe address - * windows are not required. + * windows are not required for transparent memory + * access when default outbound window configuration + * is set for memory access. */ reg = advk_readl(pcie, PCIE_CORE_CTRL2_REG); reg |= PCIE_CORE_CTRL2_ADDRWIN_MAP_ENABLE; @@ -613,6 +737,34 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
+ /* + * Set memory access in Default User Field so it + * is not required to configure PCIe address for + * transparent memory access. + */ + advk_writel(pcie, OB_WIN_TYPE_MEM, OB_WIN_DEFAULT_ACTIONS); + + /* + * Configure PCIe address windows for non-memory or + * non-transparent access as by default PCIe uses + * transparent memory access. + */ + wins = 0; + pci_get_regions(pcie->dev, &io, &mem, &pref); + if (io) + pcie_advk_set_ob_region(pcie, &wins, io, OB_WIN_TYPE_IO); + if (mem && mem->phys_start != mem->bus_start) + pcie_advk_set_ob_region(pcie, &wins, mem, OB_WIN_TYPE_MEM); + if (pref && pref->phys_start != pref->bus_start) + pcie_advk_set_ob_region(pcie, &wins, pref, OB_WIN_TYPE_MEM); + + /* Disable remaining PCIe outbound windows */ + for (i = ((wins >= 0) ? wins : 0); i < OB_WIN_COUNT; i++) + pcie_advk_disable_ob_win(pcie, i); + + if (wins == -1) + return -EINVAL; + /* Wait for PCIe link up */ if (pcie_advk_wait_for_link(pcie)) return -ENXIO; @@ -674,6 +826,10 @@ static int pcie_advk_remove(struct udevice *dev) { struct pcie_advk *pcie = dev_get_priv(dev); u32 reg; + int i; + + for (i = 0; i < OB_WIN_COUNT; i++) + pcie_advk_disable_ob_win(pcie, i);
reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); reg &= ~(PCIE_CORE_CMD_MEM_ACCESS_EN |

On 26.05.21 17:59, Pali Rohár wrote:
The `ranges` DT property of the PCIe node is currently ignored by Aardvark driver - all entries are used as transparent PCIe MEM, despite some of them being defined for IO in DT.
This is because the driver does not setup PCIe outbound windows and thus a default configuration is used.
This can cause an external abort on CPU when a device driver tries to access non-MEM space.
Setup the PCIe windows according to the `ranges` property for all non-MEM resources (currently only IO) and also non-transparent MEM resources.
Because Linux expects that bootloader does not setup Aardvark PCIe windows, disable them before booting Linux.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
Reviewed-by: Stefan Roese sr@denx.de
Thanks, Stefan
drivers/pci/pci-aardvark.c | 158 ++++++++++++++++++++++++++++++++++++- 1 file changed, 157 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index ae1a20551fed..96aa039bdc26 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -99,6 +99,46 @@ #define PCIE_CORE_CTRL2_STRICT_ORDER_ENABLE BIT(5) #define PCIE_CORE_CTRL2_ADDRWIN_MAP_ENABLE BIT(6)
+/* PCIe window configuration */ +#define OB_WIN_BASE_ADDR 0x4c00 +#define OB_WIN_BLOCK_SIZE 0x20 +#define OB_WIN_COUNT 8 +#define OB_WIN_REG_ADDR(win, offset) (OB_WIN_BASE_ADDR + \
OB_WIN_BLOCK_SIZE * (win) + \
(offset))
+#define OB_WIN_MATCH_LS(win) OB_WIN_REG_ADDR(win, 0x00) +#define OB_WIN_ENABLE BIT(0) +#define OB_WIN_MATCH_MS(win) OB_WIN_REG_ADDR(win, 0x04) +#define OB_WIN_REMAP_LS(win) OB_WIN_REG_ADDR(win, 0x08) +#define OB_WIN_REMAP_MS(win) OB_WIN_REG_ADDR(win, 0x0c) +#define OB_WIN_MASK_LS(win) OB_WIN_REG_ADDR(win, 0x10) +#define OB_WIN_MASK_MS(win) OB_WIN_REG_ADDR(win, 0x14) +#define OB_WIN_ACTIONS(win) OB_WIN_REG_ADDR(win, 0x18) +#define OB_WIN_DEFAULT_ACTIONS (OB_WIN_ACTIONS(OB_WIN_COUNT-1) + 0x4) +#define OB_WIN_FUNC_NUM_MASK GENMASK(31, 24) +#define OB_WIN_FUNC_NUM_SHIFT 24 +#define OB_WIN_FUNC_NUM_ENABLE BIT(23) +#define OB_WIN_BUS_NUM_BITS_MASK GENMASK(22, 20) +#define OB_WIN_BUS_NUM_BITS_SHIFT 20 +#define OB_WIN_MSG_CODE_ENABLE BIT(22) +#define OB_WIN_MSG_CODE_MASK GENMASK(21, 14) +#define OB_WIN_MSG_CODE_SHIFT 14 +#define OB_WIN_MSG_PAYLOAD_LEN BIT(12) +#define OB_WIN_ATTR_ENABLE BIT(11) +#define OB_WIN_ATTR_TC_MASK GENMASK(10, 8) +#define OB_WIN_ATTR_TC_SHIFT 8 +#define OB_WIN_ATTR_RELAXED BIT(7) +#define OB_WIN_ATTR_NOSNOOP BIT(6) +#define OB_WIN_ATTR_POISON BIT(5) +#define OB_WIN_ATTR_IDO BIT(4) +#define OB_WIN_TYPE_MASK GENMASK(3, 0) +#define OB_WIN_TYPE_SHIFT 0 +#define OB_WIN_TYPE_MEM 0x0 +#define OB_WIN_TYPE_IO 0x4 +#define OB_WIN_TYPE_CONFIG_TYPE0 0x8 +#define OB_WIN_TYPE_CONFIG_TYPE1 0x9 +#define OB_WIN_TYPE_MSG 0xc
- /* LMI registers base address and register offsets */ #define LMI_BASE_ADDR 0x6000 #define CFG_REG (LMI_BASE_ADDR + 0x0)
@@ -522,6 +562,86 @@ static int pcie_advk_wait_for_link(struct pcie_advk *pcie) return -ETIMEDOUT; }
+/*
- Set PCIe address window register which could be used for memory
- mapping.
- */
+static void pcie_advk_set_ob_win(struct pcie_advk *pcie, u8 win_num,
phys_addr_t match, phys_addr_t remap,
phys_addr_t mask, u32 actions)
+{
- advk_writel(pcie, OB_WIN_ENABLE |
lower_32_bits(match), OB_WIN_MATCH_LS(win_num));
- advk_writel(pcie, upper_32_bits(match), OB_WIN_MATCH_MS(win_num));
- advk_writel(pcie, lower_32_bits(remap), OB_WIN_REMAP_LS(win_num));
- advk_writel(pcie, upper_32_bits(remap), OB_WIN_REMAP_MS(win_num));
- advk_writel(pcie, lower_32_bits(mask), OB_WIN_MASK_LS(win_num));
- advk_writel(pcie, upper_32_bits(mask), OB_WIN_MASK_MS(win_num));
- advk_writel(pcie, actions, OB_WIN_ACTIONS(win_num));
+}
+static void pcie_advk_disable_ob_win(struct pcie_advk *pcie, u8 win_num) +{
- advk_writel(pcie, 0, OB_WIN_MATCH_LS(win_num));
- advk_writel(pcie, 0, OB_WIN_MATCH_MS(win_num));
- advk_writel(pcie, 0, OB_WIN_REMAP_LS(win_num));
- advk_writel(pcie, 0, OB_WIN_REMAP_MS(win_num));
- advk_writel(pcie, 0, OB_WIN_MASK_LS(win_num));
- advk_writel(pcie, 0, OB_WIN_MASK_MS(win_num));
- advk_writel(pcie, 0, OB_WIN_ACTIONS(win_num));
+}
+static void pcie_advk_set_ob_region(struct pcie_advk *pcie, int *wins,
struct pci_region *region, u32 actions)
+{
- phys_addr_t phys_start = region->phys_start;
- pci_addr_t bus_start = region->bus_start;
- pci_size_t size = region->size;
- phys_addr_t win_mask;
- u64 win_size;
- if (*wins == -1)
return;
- /*
* The n-th PCIe window is configured by tuple (match, remap, mask)
* and an access to address A uses this window it if A matches the
* match with given mask.
* So every PCIe window size must be a power of two and every start
* address must be aligned to window size. Minimal size is 64 KiB
* because lower 16 bits of mask must be zero.
*/
- while (*wins < OB_WIN_COUNT && size > 0) {
/* Calculate the largest aligned window size */
win_size = (1ULL << (fls64(size) - 1)) |
(phys_start ? (1ULL << __ffs64(phys_start)) : 0);
win_size = 1ULL << __ffs64(win_size);
if (win_size < 0x10000)
break;
dev_dbg(pcie->dev,
"Configuring PCIe window %d: [0x%llx-0x%llx] as 0x%x\n",
*wins, (u64)phys_start, (u64)phys_start + win_size,
actions);
win_mask = ~(win_size - 1) & ~0xffff;
pcie_advk_set_ob_win(pcie, *wins, phys_start, bus_start,
win_mask, actions);
phys_start += win_size;
bus_start += win_size;
size -= win_size;
(*wins)++;
- }
- if (size > 0) {
*wins = -1;
dev_err(pcie->dev,
"Invalid PCIe region [0x%llx-0x%llx]\n",
(u64)region->phys_start,
(u64)region->phys_start + region->size);
- }
+}
- /**
- pcie_advk_setup_hw() - PCIe initailzation
@@ -531,6 +651,8 @@ static int pcie_advk_wait_for_link(struct pcie_advk *pcie) */ static int pcie_advk_setup_hw(struct pcie_advk *pcie) {
struct pci_region *io, *mem, *pref;
int i, wins; u32 reg;
/* Set to Direct mode */
@@ -597,7 +719,9 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) * configurations (Default User Field: 0xD0074CFC) * are used to transparent address translation for * the outbound transactions. Thus, PCIe address
* windows are not required.
* windows are not required for transparent memory
* access when default outbound window configuration
*/ reg = advk_readl(pcie, PCIE_CORE_CTRL2_REG); reg |= PCIE_CORE_CTRL2_ADDRWIN_MAP_ENABLE;* is set for memory access.
@@ -613,6 +737,34 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
- /*
* Set memory access in Default User Field so it
* is not required to configure PCIe address for
* transparent memory access.
*/
- advk_writel(pcie, OB_WIN_TYPE_MEM, OB_WIN_DEFAULT_ACTIONS);
- /*
* Configure PCIe address windows for non-memory or
* non-transparent access as by default PCIe uses
* transparent memory access.
*/
- wins = 0;
- pci_get_regions(pcie->dev, &io, &mem, &pref);
- if (io)
pcie_advk_set_ob_region(pcie, &wins, io, OB_WIN_TYPE_IO);
- if (mem && mem->phys_start != mem->bus_start)
pcie_advk_set_ob_region(pcie, &wins, mem, OB_WIN_TYPE_MEM);
- if (pref && pref->phys_start != pref->bus_start)
pcie_advk_set_ob_region(pcie, &wins, pref, OB_WIN_TYPE_MEM);
- /* Disable remaining PCIe outbound windows */
- for (i = ((wins >= 0) ? wins : 0); i < OB_WIN_COUNT; i++)
pcie_advk_disable_ob_win(pcie, i);
- if (wins == -1)
return -EINVAL;
- /* Wait for PCIe link up */ if (pcie_advk_wait_for_link(pcie)) return -ENXIO;
@@ -674,6 +826,10 @@ static int pcie_advk_remove(struct udevice *dev) { struct pcie_advk *pcie = dev_get_priv(dev); u32 reg;
int i;
for (i = 0; i < OB_WIN_COUNT; i++)
pcie_advk_disable_ob_win(pcie, i);
reg = advk_readl(pcie, PCIE_CORE_CMD_STATUS_REG); reg &= ~(PCIE_CORE_CMD_MEM_ACCESS_EN |
Viele Grüße, Stefan

On 26.05.21 17:59, Pali Rohár wrote:
During our debugging of the Aardvark driver in Linux we have discovered that the PCIE_CORE_LINK_CTRL_STAT_REG register in fact controls standard PCIe Link Control Register for PCIe Root Bridge. This led us to discover that the name of the PCIE_CORE_LINK_TRAINING macro and the corresponding comment by this macro's usage is misleading; this bit in fact controls Retrain Link, which, according to PCIe base spec is defined as:
A write of 1b to this bit initiates Link retraining by directing the Physical Layer LTSSM to the Recovery state. If the LTSSM is already in Recovery or Configuration, re-entering Recovery is permitted but not required.
Entering Recovery state is normally done from LTSSM L0, L0s and L1 states. But since the pci-aardvark.c driver enables Link Training just a few lines above, the controller is not in L0 ready state yet. So setting aardvark bit PCIE_CORE_LINK_TRAINING does not actually enter Recovery state at this place.
Moreover, trying to enter LTSSM Recovery state without other configuration is causing issues for some cards (e.g. Atheros AR9xxx and QCA9xxx). Since Recovery state is not entered, these issues are not triggered.
Remove code which tries to enter LTSSM Recovery state completely.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
Reviewed-by: Stefan Roese sr@denx.de
Thanks, Stefan
drivers/pci/pci-aardvark.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index c43d4f309b19..06c567e236f9 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -613,11 +613,6 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
- /* Start link training */
- reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG);
- reg |= PCIE_CORE_LINK_TRAINING;
- advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG);
- /* Wait for PCIe link up */ if (pcie_advk_wait_for_link(pcie)) return -ENXIO;
Viele Grüße, Stefan

On Thursday 27 May 2021 08:19:32 Stefan Roese wrote:
On 26.05.21 17:59, Pali Rohár wrote:
During our debugging of the Aardvark driver in Linux we have discovered that the PCIE_CORE_LINK_CTRL_STAT_REG register in fact controls standard PCIe Link Control Register for PCIe Root Bridge. This led us to discover that the name of the PCIE_CORE_LINK_TRAINING macro and the corresponding comment by this macro's usage is misleading; this bit in fact controls Retrain Link, which, according to PCIe base spec is defined as:
A write of 1b to this bit initiates Link retraining by directing the Physical Layer LTSSM to the Recovery state. If the LTSSM is already in Recovery or Configuration, re-entering Recovery is permitted but not required.
Entering Recovery state is normally done from LTSSM L0, L0s and L1 states. But since the pci-aardvark.c driver enables Link Training just a few lines above, the controller is not in L0 ready state yet. So setting aardvark bit PCIE_CORE_LINK_TRAINING does not actually enter Recovery state at this place.
Moreover, trying to enter LTSSM Recovery state without other configuration is causing issues for some cards (e.g. Atheros AR9xxx and QCA9xxx). Since Recovery state is not entered, these issues are not triggered.
Remove code which tries to enter LTSSM Recovery state completely.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
Reviewed-by: Stefan Roese sr@denx.de
Thanks, Stefan
Hello Stefan! Thank you for review. Would you be sending these A3720 patches to 2021.07 version?
drivers/pci/pci-aardvark.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index c43d4f309b19..06c567e236f9 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -613,11 +613,6 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
- /* Start link training */
- reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG);
- reg |= PCIE_CORE_LINK_TRAINING;
- advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG);
- /* Wait for PCIe link up */ if (pcie_advk_wait_for_link(pcie)) return -ENXIO;
Viele Grüße, Stefan
-- DENX Software Engineering GmbH, Managing Director: Wolfgang Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-51 Fax: (+49)-8142-66989-80 Email: sr@denx.de

Hi Pali,
On 01.06.21 14:57, Pali Rohár wrote:
On Thursday 27 May 2021 08:19:32 Stefan Roese wrote:
On 26.05.21 17:59, Pali Rohár wrote:
During our debugging of the Aardvark driver in Linux we have discovered that the PCIE_CORE_LINK_CTRL_STAT_REG register in fact controls standard PCIe Link Control Register for PCIe Root Bridge. This led us to discover that the name of the PCIE_CORE_LINK_TRAINING macro and the corresponding comment by this macro's usage is misleading; this bit in fact controls Retrain Link, which, according to PCIe base spec is defined as:
A write of 1b to this bit initiates Link retraining by directing the Physical Layer LTSSM to the Recovery state. If the LTSSM is already in Recovery or Configuration, re-entering Recovery is permitted but not required.
Entering Recovery state is normally done from LTSSM L0, L0s and L1 states. But since the pci-aardvark.c driver enables Link Training just a few lines above, the controller is not in L0 ready state yet. So setting aardvark bit PCIE_CORE_LINK_TRAINING does not actually enter Recovery state at this place.
Moreover, trying to enter LTSSM Recovery state without other configuration is causing issues for some cards (e.g. Atheros AR9xxx and QCA9xxx). Since Recovery state is not entered, these issues are not triggered.
Remove code which tries to enter LTSSM Recovery state completely.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
Reviewed-by: Stefan Roese sr@denx.de
Thanks, Stefan
Hello Stefan! Thank you for review. Would you be sending these A3720 patches to 2021.07 version?
My plan was to postpone these patches to the next release, as they seem quite intrusive. But please let me know if you think this is not the case and they should be added as fixes into this release still.
Thanks, Stefan
drivers/pci/pci-aardvark.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index c43d4f309b19..06c567e236f9 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -613,11 +613,6 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
- /* Start link training */
- reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG);
- reg |= PCIE_CORE_LINK_TRAINING;
- advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG);
- /* Wait for PCIe link up */ if (pcie_advk_wait_for_link(pcie)) return -ENXIO;
Viele Grüße, Stefan
-- DENX Software Engineering GmbH, Managing Director: Wolfgang Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-51 Fax: (+49)-8142-66989-80 Email: sr@denx.de
Viele Grüße, Stefan

On Wed, 2 Jun 2021 07:12:50 +0200 Stefan Roese sr@denx.de wrote:
Hello Stefan! Thank you for review. Would you be sending these A3720 patches to 2021.07 version?
My plan was to postpone these patches to the next release, as they seem quite intrusive. But please let me know if you think this is not the case and they should be added as fixes into this release still.
Hello Stefan,
some of these patches only fix behaviour of PCIe in U-Boot, but patch 5 also fixes booting kernel when PCIe regions in kernel's DTS are changed (by user, so that they have more memory to support more cards).
Since all these are fixes, we think they should be added into this release.
Marek

On 02.06.21 14:42, Marek Behún wrote:
On Wed, 2 Jun 2021 07:12:50 +0200 Stefan Roese sr@denx.de wrote:
Hello Stefan! Thank you for review. Would you be sending these A3720 patches to 2021.07 version?
My plan was to postpone these patches to the next release, as they seem quite intrusive. But please let me know if you think this is not the case and they should be added as fixes into this release still.
Hello Stefan,
some of these patches only fix behaviour of PCIe in U-Boot, but patch 5 also fixes booting kernel when PCIe regions in kernel's DTS are changed (by user, so that they have more memory to support more cards).
Since all these are fixes, we think they should be added into this release.
Understood. I'll do some testing on these patches and will prepare a pull request, if everything works as expected.
Thanks, Stefan

On 26.05.21 17:59, Pali Rohár wrote:
During our debugging of the Aardvark driver in Linux we have discovered that the PCIE_CORE_LINK_CTRL_STAT_REG register in fact controls standard PCIe Link Control Register for PCIe Root Bridge. This led us to discover that the name of the PCIE_CORE_LINK_TRAINING macro and the corresponding comment by this macro's usage is misleading; this bit in fact controls Retrain Link, which, according to PCIe base spec is defined as:
A write of 1b to this bit initiates Link retraining by directing the Physical Layer LTSSM to the Recovery state. If the LTSSM is already in Recovery or Configuration, re-entering Recovery is permitted but not required.
Entering Recovery state is normally done from LTSSM L0, L0s and L1 states. But since the pci-aardvark.c driver enables Link Training just a few lines above, the controller is not in L0 ready state yet. So setting aardvark bit PCIE_CORE_LINK_TRAINING does not actually enter Recovery state at this place.
Moreover, trying to enter LTSSM Recovery state without other configuration is causing issues for some cards (e.g. Atheros AR9xxx and QCA9xxx). Since Recovery state is not entered, these issues are not triggered.
Remove code which tries to enter LTSSM Recovery state completely.
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: Marek Behún marek.behun@nic.cz
Complete series:
Applied to u-boot-marvell/master
Thanks, Stefan
drivers/pci/pci-aardvark.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/pci/pci-aardvark.c b/drivers/pci/pci-aardvark.c index c43d4f309b19..06c567e236f9 100644 --- a/drivers/pci/pci-aardvark.c +++ b/drivers/pci/pci-aardvark.c @@ -613,11 +613,6 @@ static int pcie_advk_setup_hw(struct pcie_advk *pcie) reg |= PIO_CTRL_ADDR_WIN_DISABLE; advk_writel(pcie, reg, PIO_CTRL);
- /* Start link training */
- reg = advk_readl(pcie, PCIE_CORE_LINK_CTRL_STAT_REG);
- reg |= PCIE_CORE_LINK_TRAINING;
- advk_writel(pcie, reg, PCIE_CORE_LINK_CTRL_STAT_REG);
- /* Wait for PCIe link up */ if (pcie_advk_wait_for_link(pcie)) return -ENXIO;
Viele Grüße, Stefan
participants (3)
-
Marek Behún
-
Pali Rohár
-
Stefan Roese