[RFC] dev_phys_to_bus() and PCI

I'm working on support for Apple's M1 systems in U-Boot. The idea is that this can be used as a "payload" for the m1n1 bootloader that is being developed by Hector Martin for Asahi Linux in order to provide an UEFI implementation that can boot a standard arm64 OS.
My current code, which can be found on the "apple-m1-m1n1" branch at:
https://github.com/kettenis/u-boot/tree/apple-m1-m1n1
can already do this. I'm booting OpenBSD/arm64 this way and I've also booted into grub using a standard arm64 Debian installer.
One of the big challenges I'm facing is that the integrated PCIe devices on the system sit behind an IOMMU that (as far as we can tell) doesn't support any kind of bypass mode. I don't think we want full IOMMU support in U-Boot, so the approach I'm currently taking is that I set up the IOMMU to map a chunk of memory at the "top of RAM" where U-Boot resides after relocation. But in order to use this mapping I need to do DMA address translation.
Fortunately Nicolas Saenz Julienne recently introduced dev_phys_to_bus() and dev_bus_to_phys() interfaces to do this. Those interfaces make use of a dma-ranges property in the device tree which doesn't work so well for PCI devices though. However, the PCI code in U-Boot already has a way to describe DMA address translation through its regions support. Hooking this up to dev_phys_to_bus() and dev_bus_to_phys() is fairly easy as illustrated by the diff below.
Would this be viable approach? This could also help adding support for PCIe devices on the Raspberry Pi CM4.
commit 4f0e989c7a765291a38b7d10da562f23c5f31239 Author: Mark Kettenis kettenis@openbsd.org Date: Fri Mar 12 20:23:11 2021 +0100
dm: Add PCI support to dev_phys_to_bus()/dev_bus_to_phys()
For PCI devices, call dm_pci_phys_to_bus()/dm_pci_bus_to_phys() to do the address translation. This uses the regions set up by the PCI host controller driver to do the translation as a single translation offset may not be sufficient in this case.
Signed-off-by: Mark Kettenis kettenis@openbsd.org
diff --git a/include/phys2bus.h b/include/phys2bus.h index 866b8b51a8..13d23ef4bb 100644 --- a/include/phys2bus.h +++ b/include/phys2bus.h @@ -23,14 +23,21 @@ static inline unsigned long bus_to_phys(unsigned long bus)
#if CONFIG_IS_ENABLED(DM) #include <dm/device.h> +#include <pci.h>
static inline dma_addr_t dev_phys_to_bus(struct udevice *dev, phys_addr_t phys) { + if (device_is_on_pci_bus(dev)) + return dm_pci_phys_to_bus(dev, phys, PCI_REGION_SYS_MEMORY); + return phys - dev_get_dma_offset(dev); }
static inline phys_addr_t dev_bus_to_phys(struct udevice *dev, dma_addr_t bus) { + if (device_is_on_pci_bus(dev)) + return dm_pci_bus_to_phys(dev, bus, PCI_REGION_SYS_MEMORY); + return bus + dev_get_dma_offset(dev); } #else

+Tom Rini
Hi Mark,
On Sat, 13 Mar 2021 at 02:24, Mark Kettenis mark.kettenis@xs4all.nl wrote:
I'm working on support for Apple's M1 systems in U-Boot. The idea is that this can be used as a "payload" for the m1n1 bootloader that is being developed by Hector Martin for Asahi Linux in order to provide an UEFI implementation that can boot a standard arm64 OS.
My current code, which can be found on the "apple-m1-m1n1" branch at:
https://github.com/kettenis/u-boot/tree/apple-m1-m1n1
can already do this. I'm booting OpenBSD/arm64 this way and I've also booted into grub using a standard arm64 Debian installer.
One of the big challenges I'm facing is that the integrated PCIe devices on the system sit behind an IOMMU that (as far as we can tell) doesn't support any kind of bypass mode. I don't think we want full IOMMU support in U-Boot, so the approach I'm currently taking is that I set up the IOMMU to map a chunk of memory at the "top of RAM" where U-Boot resides after relocation. But in order to use this mapping I need to do DMA address translation.
Fortunately Nicolas Saenz Julienne recently introduced dev_phys_to_bus() and dev_bus_to_phys() interfaces to do this. Those interfaces make use of a dma-ranges property in the device tree which doesn't work so well for PCI devices though. However, the PCI code in U-Boot already has a way to describe DMA address translation through its regions support. Hooking this up to dev_phys_to_bus() and dev_bus_to_phys() is fairly easy as illustrated by the diff below.
Would this be viable approach? This could also help adding support for PCIe devices on the Raspberry Pi CM4.
It seems OK to me.
I suspect IOMMU support will be needed to some degree eventually in U-Boot.
Regards, Simon
commit 4f0e989c7a765291a38b7d10da562f23c5f31239 Author: Mark Kettenis kettenis@openbsd.org Date: Fri Mar 12 20:23:11 2021 +0100
dm: Add PCI support to dev_phys_to_bus()/dev_bus_to_phys() For PCI devices, call dm_pci_phys_to_bus()/dm_pci_bus_to_phys() to do the address translation. This uses the regions set up by the PCI host controller driver to do the translation as a single translation offset may not be sufficient in this case. Signed-off-by: Mark Kettenis <kettenis@openbsd.org>
diff --git a/include/phys2bus.h b/include/phys2bus.h index 866b8b51a8..13d23ef4bb 100644 --- a/include/phys2bus.h +++ b/include/phys2bus.h @@ -23,14 +23,21 @@ static inline unsigned long bus_to_phys(unsigned long bus)
#if CONFIG_IS_ENABLED(DM) #include <dm/device.h> +#include <pci.h>
static inline dma_addr_t dev_phys_to_bus(struct udevice *dev, phys_addr_t phys) {
if (device_is_on_pci_bus(dev))
return dm_pci_phys_to_bus(dev, phys, PCI_REGION_SYS_MEMORY);
return phys - dev_get_dma_offset(dev);
}
static inline phys_addr_t dev_bus_to_phys(struct udevice *dev, dma_addr_t bus) {
if (device_is_on_pci_bus(dev))
return dm_pci_bus_to_phys(dev, bus, PCI_REGION_SYS_MEMORY);
return bus + dev_get_dma_offset(dev);
} #else

Hi Mark!
On Sat, 2021-03-13 at 10:24 +0100, Mark Kettenis wrote: [...]
Fortunately Nicolas Saenz Julienne recently introduced dev_phys_to_bus() and dev_bus_to_phys() interfaces to do this. Those interfaces make use of a dma-ranges property in the device tree which doesn't work so well for PCI devices though.
Why doesn't it work with PCI devices? Raspberry Pi 4 has a PCIe bus that needs DMA translations. This is handled by parsing its 'dma-ranges' property. Here's how rpi4's devicetree looks like[1]:
pcie0: pcie@7d500000 { compatible = "brcm,bcm2711-pcie"; ranges = <0x02000000 0x0 0xf8000000 0x6 0x00000000 0x0 0x04000000>; /* * The wrapper around the PCIe block has a bug preventing it * from accessing beyond the first 3GB of memory. */ dma-ranges = <0x02000000 0x0 0x00000000 0x0 0x00000000 0x0 0xc0000000>; };
However, the PCI code in U-Boot already has a way to describe DMA address translation through its regions support.
regions contain a representation of devicetree's 'ranges' property, which doesn't describe the DMA address translations, but CPU's view of PCI memory.
diff --git a/include/phys2bus.h b/include/phys2bus.h index 866b8b51a8..13d23ef4bb 100644 --- a/include/phys2bus.h +++ b/include/phys2bus.h @@ -23,14 +23,21 @@ static inline unsigned long bus_to_phys(unsigned long bus)
#if CONFIG_IS_ENABLED(DM) #include <dm/device.h> +#include <pci.h>
static inline dma_addr_t dev_phys_to_bus(struct udevice *dev, phys_addr_t phys) {
- if (device_is_on_pci_bus(dev))
return dm_pci_phys_to_bus(dev, phys, PCI_REGION_SYS_MEMORY);
Note that this would break USB support on RPi4.
Regards, Nicolas
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch...

From: Nicolas Saenz Julienne nsaenzjulienne@suse.de Date: Mon, 15 Mar 2021 11:26:18 +0100
Hi Mark!
Hi Nicolas,
On Sat, 2021-03-13 at 10:24 +0100, Mark Kettenis wrote: [...]
Fortunately Nicolas Saenz Julienne recently introduced dev_phys_to_bus() and dev_bus_to_phys() interfaces to do this. Those interfaces make use of a dma-ranges property in the device tree which doesn't work so well for PCI devices though.
Why doesn't it work with PCI devices? Raspberry Pi 4 has a PCIe bus that needs DMA translations. This is handled by parsing its 'dma-ranges' property. Here's how rpi4's devicetree looks like[1]:
pcie0: pcie@7d500000 { compatible = "brcm,bcm2711-pcie"; ranges = <0x02000000 0x0 0xf8000000 0x6 0x00000000 0x0 0x04000000>; /* * The wrapper around the PCIe block has a bug preventing it * from accessing beyond the first 3GB of memory. */ dma-ranges = <0x02000000 0x0 0x00000000 0x0 0x00000000 0x0 0xc0000000>; };
Does that work even though there are no device tree nodes for the PCI devices themselves?
However, the PCI code in U-Boot already has a way to describe DMA address translation through its regions support.
regions contain a representation of devicetree's 'ranges' property, which doesn't describe the DMA address translations, but CPU's view of PCI memory.
It describes both. The DMA address translations are described by regions that have the PCI_REGION_SYS_MEMORY flag set. Some of the PCI host bridge drivers set up these regions, and the generic DM_PCI code by default adds such regions for the system RAM banks. And there is a fdt_pci_dma_ranges() function to populate a "dma-ranges" property based on these regions.
The benefit of the PCI regions code is that it can describe more complex setups where a single translation offset isn't enough. And the code will print a warbing if you try to do DMA to memory that isn't "visable" for the PCI device.
In the particular example of the M1 code I don't really want to add a dma-ranges property to the device tree since the IOMMU setup is only used by u-boot and not really usable by Linux or another OS as it only covers a (relatively) small part of system memory.
diff --git a/include/phys2bus.h b/include/phys2bus.h index 866b8b51a8..13d23ef4bb 100644 --- a/include/phys2bus.h +++ b/include/phys2bus.h @@ -23,14 +23,21 @@ static inline unsigned long bus_to_phys(unsigned long bus)
#if CONFIG_IS_ENABLED(DM) #include <dm/device.h> +#include <pci.h>
static inline dma_addr_t dev_phys_to_bus(struct udevice *dev, phys_addr_t phys) {
- if (device_is_on_pci_bus(dev))
return dm_pci_phys_to_bus(dev, phys, PCI_REGION_SYS_MEMORY);
Note that this would break USB support on RPi4.
It should work if the PCI host bridge driver sets up the regions properly based on the dma-ranges property. Looks like the current driver doesn't do this, so I'll have to fix that.
Cheers,
Mark

Hi Mark,
Hi Nicolas,
On Sat, 2021-03-13 at 10:24 +0100, Mark Kettenis wrote: [...]
Fortunately Nicolas Saenz Julienne recently introduced dev_phys_to_bus() and dev_bus_to_phys() interfaces to do this. Those interfaces make use of a dma-ranges property in the device tree which doesn't work so well for PCI devices though.
Why doesn't it work with PCI devices? Raspberry Pi 4 has a PCIe bus that needs DMA translations. This is handled by parsing its 'dma-ranges' property. Here's how rpi4's devicetree looks like[1]:
pcie0: pcie@7d500000 { compatible = "brcm,bcm2711-pcie"; ranges = <0x02000000 0x0 0xf8000000 0x6 0x00000000 0x0 0x04000000>; /* * The wrapper around the PCIe block has a bug preventing it * from accessing beyond the first 3GB of memory. */ dma-ranges = <0x02000000 0x0 0x00000000 0x0 0x00000000 0x0 0xc0000000>; };
Does that work even though there are no device tree nodes for the PCI devices themselves?
'dma-ranges' only makes sense on bus nodes. So whenever you perform a DMA address tranlation you start parsing from the endpoint device's parent. In this case the PCIe root bridge, which has a proper DT node.
However, the PCI code in U-Boot already has a way to describe DMA address translation through its regions support.
regions contain a representation of devicetree's 'ranges' property, which doesn't describe the DMA address translations, but CPU's view of PCI memory.
It describes both. The DMA address translations are described by regions that have the PCI_REGION_SYS_MEMORY flag set. Some of the PCI host bridge drivers set up these regions, and the generic DM_PCI code by default adds such regions for the system RAM banks. And there is a fdt_pci_dma_ranges() function to populate a "dma-ranges" property based on these regions.
Sorry, I wasn't aware PCI_REGION_SYS_MEMORY existed.
The benefit of the PCI regions code is that it can describe more complex setups where a single translation offset isn't enough. And the code will print a warbing if you try to do DMA to memory that isn't "visable" for the PCI device.
In the particular example of the M1 code I don't really want to add a dma-ranges property to the device tree since the IOMMU setup is only used by u-boot and not really usable by Linux or another OS as it only covers a (relatively) small part of system memory.
So I understand that this region will be hard-coded in your PCI driver. And there will be no trace of it in DT. If so, I think I understand your strategy.
diff --git a/include/phys2bus.h b/include/phys2bus.h index 866b8b51a8..13d23ef4bb 100645 --- a/include/phys2bus.h +++ b/include/phys2bus.h @@ -23,14 +23,21 @@ static inline unsigned long bus_to_phys(unsigned long bus)
#if CONFIG_IS_ENABLED(DM) #include <dm/device.h> +#include <pci.h>
static inline dma_addr_t dev_phys_to_bus(struct udevice *dev, phys_addr_t phys) {
- if (device_is_on_pci_bus(dev))
return dm_pci_phys_to_bus(dev, phys, PCI_REGION_SYS_MEMORY);
Note that this would break USB support on RPi4.
It should work if the PCI host bridge driver sets up the regions properly based on the dma-ranges property. Looks like the current driver doesn't do this, so I'll have to fix that.
That would work. But it would be better to integrate your setup with struct udevice's DMA tranlation attributes. It's as simple as performing the PCI_REGION_SYS_MEMORY parsing in device_get_dma_constraints(). Then you'd be able to use plain, generic, dev_phys_to_bus(). With the added benefit of not having the parse all resources everytime you perform a translation, and not having to add churn in pcie-brcmstb.
Regards, Nicolas
participants (3)
-
Mark Kettenis
-
Nicolas Saenz Julienne
-
Simon Glass