[U-Boot] [PATCH 1/3] Create API to map between CPU physical and bus addresses

On some SoCs, DMA-capable peripherals see a different address space to the CPU's physical address space. Create an API to allow platform-agnostic drivers to convert between the two address spaces when programming DMA operations.
This API will exist on all platforms, but will have a dummy implementation when this feature is not required. Other platforms will enable CONFIG_PHYS_TO_BUS and provide the required implementation.
Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
---
These patches depend on previous DWC2 rework that's in the topic/dwc2 branch of the USB repo.
These patches conflict with patches Masahiro has posted to move arch/arm/cpu/arm1176/bcm2835 to arch/arm/mach-bcm283x(?).
I expect I'll have to rebase this series after the upcoming release once those two things are merged into u-boot.git. Still, reviews could begin before that.
---
 drivers/Kconfig    |  8 ++++++++
 include/phys2bus.h | 25 +++++++++++++++++++++++++
 2 files changed, 33 insertions(+)
 create mode 100644 include/phys2bus.h
diff --git a/drivers/Kconfig b/drivers/Kconfig
index dcce532e2df2..941aa0c2612a 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -53,3 +53,11 @@ source "drivers/crypto/Kconfig"
 source "drivers/thermal/Kconfig"
 
 endmenu
+
+config PHYS_TO_BUS
+	bool
+	help
+	  Some SoCs use a different address map for CPU physical addresses and
+	  peripheral DMA master accesses. If yours does, select this option in
+	  your platform's Kconfig, and implement the appropriate mapping
+	  functions in your platform's support code.
diff --git a/include/phys2bus.h b/include/phys2bus.h
new file mode 100644
index 000000000000..87b6d69aa617
--- /dev/null
+++ b/include/phys2bus.h
@@ -0,0 +1,25 @@
+/*
+ * Copyright 2015 Stephen Warren
+ *
+ * SPDX-License-Identifier: GPL-2.0+
+ */
+
+#ifndef _BUS_ADDR_H
+#define _BUS_ADDR_H
+
+#ifdef CONFIG_PHYS_TO_BUS
+unsigned long phys_to_bus(unsigned long phys);
+unsigned long bus_to_phys(unsigned long bus);
+#else
+static inline unsigned long phys_to_bus(unsigned long phys)
+{
+	return phys;
+}
+
+static inline unsigned long bus_to_phys(unsigned long bus)
+{
+	return bus;
+}
+#endif
+
+#endif
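As an illustration of why the stub variant lets platform-agnostic drivers call the API unconditionally, here is a minimal sketch that can be built outside U-Boot. It copies the header's #ifdef structure and leaves CONFIG_PHYS_TO_BUS undefined, so the identity stubs are used. The demo_dma_desc structure and demo_make_desc() helper are hypothetical names invented for this sketch; they are not part of the patch or of U-Boot.

```c
#include <assert.h>

/*
 * Same shape as include/phys2bus.h from the patch. CONFIG_PHYS_TO_BUS is
 * left undefined here, so the identity stubs in the #else branch are used.
 */
#ifdef CONFIG_PHYS_TO_BUS
unsigned long phys_to_bus(unsigned long phys);
unsigned long bus_to_phys(unsigned long bus);
#else
static inline unsigned long phys_to_bus(unsigned long phys)
{
	return phys;
}

static inline unsigned long bus_to_phys(unsigned long bus)
{
	return bus;
}
#endif

/* Hypothetical DMA descriptor, for illustration only. */
struct demo_dma_desc {
	unsigned long bus_addr;	/* address as seen by the DMA master */
	unsigned long len;	/* transfer length in bytes */
};

/*
 * Hypothetical driver helper: it converts the CPU physical address without
 * any platform #ifdef of its own. With the stubs above, the conversion
 * compiles down to nothing.
 */
static inline struct demo_dma_desc demo_make_desc(unsigned long phys,
						  unsigned long len)
{
	struct demo_dma_desc desc;

	assert(len > 0);
	desc.bus_addr = phys_to_bus(phys);
	desc.len = len;
	return desc;
}
```

On a platform that selects CONFIG_PHYS_TO_BUS, the same helper would pick up the platform's out-of-line implementation instead, with no change to the driver code.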

The BCM283[56] contain both an L1 and an L2 cache between the GPU (a/k/a VideoCore CPU?) and DRAM. DMA-capable peripherals can also optionally access DRAM via this same L2 cache (although they always bypass the L1 cache). Peripherals select whether to use or bypass the cache via the top two bits of the bus address.
An IOMMU exists between the ARM CPU and the rest of the system. This controls whether the ARM CPU's accesses use or bypass the L1 and/or L2 cache. This IOMMU is configured/controlled exclusively by the VideoCore CPU.
In order for DRAM accesses made by the ARM core to be coherent with accesses made by other DMA peripherals, we must program a bus address into those peripherals that causes the peripheral's accesses to use the same set of caches that the ARM core's accesses will use.
On the RPi1, the VideoCore firmware sets up the IOMMU to enable use of the L2 cache. This corresponds to addresses based at 0x40000000.
On the RPi2, the VideoCore firmware sets up the IOMMU to disable use of the L2 cache. This corresponds to addresses based at 0xc0000000.
This patch implements U-Boot's phys_to_bus/bus_to_phys APIs according to those rules.
For full details of this setup, please see Dom Cobley's description at:
http://lists.denx.de/pipermail/u-boot/2015-March/208201.html
http://permalink.gmane.org/gmane.comp.boot-loaders.u-boot/215038
https://www.mail-archive.com/u-boot@lists.denx.de/msg166568.html
Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
---
 arch/arm/cpu/arm1176/bcm2835/Kconfig    |  3 +++
 arch/arm/cpu/arm1176/bcm2835/Makefile   |  1 +
 arch/arm/cpu/arm1176/bcm2835/phys2bus.c | 22 ++++++++++++++++++++++
 arch/arm/cpu/armv7/bcm2835/Makefile     |  1 +
 4 files changed, 27 insertions(+)
 create mode 100644 arch/arm/cpu/arm1176/bcm2835/phys2bus.c
diff --git a/arch/arm/cpu/arm1176/bcm2835/Kconfig b/arch/arm/cpu/arm1176/bcm2835/Kconfig
index 73cc72b41185..3181747fbfd7 100644
--- a/arch/arm/cpu/arm1176/bcm2835/Kconfig
+++ b/arch/arm/cpu/arm1176/bcm2835/Kconfig
@@ -9,4 +9,7 @@ config DM_SERIAL
 config DM_GPIO
 	default y
 
+config PHYS_TO_BUS
+	default y
+
 endif
diff --git a/arch/arm/cpu/arm1176/bcm2835/Makefile b/arch/arm/cpu/arm1176/bcm2835/Makefile
index 7e5dbe1fdeaf..6d1b74158773 100644
--- a/arch/arm/cpu/arm1176/bcm2835/Makefile
+++ b/arch/arm/cpu/arm1176/bcm2835/Makefile
@@ -6,3 +6,4 @@
 
 obj-y	:= lowlevel_init.o
 obj-y	+= init.o reset.o timer.o mbox.o
+obj-y	+= phys2bus.o
diff --git a/arch/arm/cpu/arm1176/bcm2835/phys2bus.c b/arch/arm/cpu/arm1176/bcm2835/phys2bus.c
new file mode 100644
index 000000000000..fc1c29905de3
--- /dev/null
+++ b/arch/arm/cpu/arm1176/bcm2835/phys2bus.c
@@ -0,0 +1,22 @@
+/*
+ * Copyright 2015 Stephen Warren
+ *
+ * SPDX-License-Identifier: GPL-2.0+
+ */
+
+#include <config.h>
+#include <phys2bus.h>
+
+unsigned long phys_to_bus(unsigned long phys)
+{
+#ifdef CONFIG_BCM2836
+	return 0xc0000000 | phys;
+#else
+	return 0x40000000 | phys;
+#endif
+}
+
+unsigned long bus_to_phys(unsigned long bus)
+{
+	return bus & ~0xc0000000;
+}
diff --git a/arch/arm/cpu/armv7/bcm2835/Makefile b/arch/arm/cpu/armv7/bcm2835/Makefile
index ed1ee4753d49..5d14d8bdcac3 100644
--- a/arch/arm/cpu/armv7/bcm2835/Makefile
+++ b/arch/arm/cpu/armv7/bcm2835/Makefile
@@ -11,3 +11,4 @@ obj-y += $(src_dir)/init.o
 obj-y += $(src_dir)/reset.o
 obj-y += $(src_dir)/timer.o
 obj-y += $(src_dir)/mbox.o
+obj-y += $(src_dir)/phys2bus.o
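To make the address arithmetic concrete, the following host-side sketch reproduces the mapping described in the commit message: the top two bits of the bus address select the alias, 0x40000000 on the RPi1 and 0xc0000000 on the RPi2. The sketch_* function names, the BUS_ALIAS_* macros, and the run-time alias parameter (standing in for the CONFIG_BCM2836 #ifdef) are assumptions made for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Alias bases as given in the commit message. */
#define BUS_ALIAS_RPI1	UINT32_C(0x40000000)	/* firmware enables the L2 cache */
#define BUS_ALIAS_RPI2	UINT32_C(0xc0000000)	/* firmware disables the L2 cache */
#define BUS_ALIAS_MASK	UINT32_C(0xc0000000)	/* top two bits of the bus address */

/*
 * Hypothetical helper: 'alias' replaces the compile-time CONFIG_BCM2836
 * switch so that both variants can be exercised in a single build.
 */
static uint32_t sketch_phys_to_bus(uint32_t phys, uint32_t alias)
{
	/* The mapping only round-trips if phys leaves the alias bits clear. */
	assert((phys & BUS_ALIAS_MASK) == 0);
	return alias | phys;
}

/* Clearing the top two bits recovers the physical address for either alias. */
static uint32_t sketch_bus_to_phys(uint32_t bus)
{
	return bus & ~BUS_ALIAS_MASK;
}
```

For example, physical address 0x00100000 becomes bus address 0x40100000 with the RPi1 alias and 0xc0100000 with the RPi2 alias, and both map back to 0x00100000. This also shows why a single mask of 0xc0000000 is enough in bus_to_phys() for both boards.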

On Wednesday, March 25, 2015 at 03:07:34 AM, Stephen Warren wrote:
> The BCM283[56] contain both an L1 and an L2 cache between the GPU (a/k/a VideoCore CPU?) and DRAM. DMA-capable peripherals can also optionally access DRAM via this same L2 cache (although they always bypass the L1 cache). Peripherals select whether to use or bypass the cache via the top two bits of the bus address.
> An IOMMU exists between the ARM CPU and the rest of the system. This controls whether the ARM CPU's accesses use or bypass the L1 and/or L2 cache. This IOMMU is configured/controlled exclusively by the VideoCore CPU.
> In order for DRAM accesses made by the ARM core to be coherent with accesses made by other DMA peripherals, we must program a bus address into those peripherals that causes the peripheral's accesses to use the same set of caches that the ARM core's accesses will use.
> On the RPi1, the VideoCore firmware sets up the IOMMU to enable use of the L2 cache. This corresponds to addresses based at 0x40000000.
> On the RPi2, the VideoCore firmware sets up the IOMMU to disable use of the L2 cache. This corresponds to addresses based at 0xc0000000.
> This patch implements U-Boot's phys_to_bus/bus_to_phys APIs according to those rules.
> For full details of this setup, please see Dom Cobley's description at:
> http://lists.denx.de/pipermail/u-boot/2015-March/208201.html
> http://permalink.gmane.org/gmane.comp.boot-loaders.u-boot/215038
> https://www.mail-archive.com/u-boot@lists.denx.de/msg166568.html
> Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
Applied to -next, thanks!
Best regards, Marek Vasut

Use of these APIs is required on the Raspberry Pi. With this change, USB on RPi1 should be more reliable, and USB on the RPi2 will start working.
Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
---
We likely should enable use of these functions for mbox, SDHCI, and LCD display too. However, I haven't validated those yet.
---
 drivers/usb/host/dwc2.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/dwc2.c b/drivers/usb/host/dwc2.c
index 5f4ca7abf7bf..8f7c269dd1a5 100644
--- a/drivers/usb/host/dwc2.c
+++ b/drivers/usb/host/dwc2.c
@@ -9,6 +9,7 @@
 #include <errno.h>
 #include <usb.h>
 #include <malloc.h>
+#include <phys2bus.h>
 #include <usbroothubdes.h>
 #include <asm/io.h>
 
@@ -795,7 +796,8 @@ int chunk_msg(struct usb_device *dev, unsigned long pipe, int *pid, int in,
 		if (!in)
 			memcpy(aligned_buffer, (char *)buffer + done, len);
 
-		writel((uint32_t)aligned_buffer, &hc_regs->hcdma);
+		writel(phys_to_bus((unsigned long)aligned_buffer),
+		       &hc_regs->hcdma);
 
 		/* Set host channel enable after all other setup is complete. */
 		clrsetbits_le32(&hc_regs->hcchar, DWC2_HCCHAR_MULTICNT_MASK |
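The effect of this one-line change can be sketched on a host machine with a mocked register. Everything below is hypothetical scaffolding: demo_hc_regs stands in for the real DWC2 host channel register block, a plain store replaces writel(), and demo_phys_to_bus() assumes the RPi2 alias (0xc0000000) from the previous patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the memory-mapped host channel registers. */
struct demo_hc_regs {
	volatile uint32_t hcdma;	/* DMA address register, 32 bits wide */
};

/* Assumed platform mapping for this sketch: RPi2 alias at 0xc0000000. */
static unsigned long demo_phys_to_bus(unsigned long phys)
{
	return 0xc0000000UL | phys;
}

/*
 * Program the DMA address the way the patched driver does: convert the CPU
 * physical address to a bus address first, then write it to the register.
 * Returns the value read back from the mocked register.
 */
static uint32_t demo_program_dma(struct demo_hc_regs *regs,
				 unsigned long buf_phys)
{
	unsigned long bus = demo_phys_to_bus(buf_phys);

	/* The register holds 32 bits, so the bus address has to fit. */
	assert(bus <= UINT32_MAX);
	regs->hcdma = (uint32_t)bus;	/* stands in for writel() */
	return regs->hcdma;
}
```

Before the patch, the register would have received the untranslated address 0x00200000 for a buffer at that location; with the conversion it receives 0xc0200000, which is the alias that matches the ARM core's cache configuration on the RPi2.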

On Wednesday, March 25, 2015 at 03:07:35 AM, Stephen Warren wrote:
> Use of these APIs is required on the Raspberry Pi. With this change, USB on RPi1 should be more reliable, and USB on the RPi2 will start working.
> Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
Applied to -next, thanks!
Best regards, Marek Vasut

On Wednesday, March 25, 2015 at 03:07:33 AM, Stephen Warren wrote:
> On some SoCs, DMA-capable peripherals see a different address space to the CPU's physical address space. Create an API to allow platform-agnostic drivers to convert between the two address spaces when programming DMA operations.
> This API will exist on all platforms, but will have a dummy implementation when this feature is not required. Other platforms will enable CONFIG_PHYS_TO_BUS and provide the required implementation.
> Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
Applied to -next, thanks!
btw. can't you use __weak here instead of a new ifdef macro (which is not documented btw)?
Best regards, Marek Vasut

On 03/25/2015 05:55 AM, Marek Vasut wrote:
> On Wednesday, March 25, 2015 at 03:07:33 AM, Stephen Warren wrote:
> > On some SoCs, DMA-capable peripherals see a different address space to the CPU's physical address space. Create an API to allow platform-agnostic drivers to convert between the two address spaces when programming DMA operations.
> > This API will exist on all platforms, but will have a dummy implementation when this feature is not required. Other platforms will enable CONFIG_PHYS_TO_BUS and provide the required implementation.
> > Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
> Applied to -next, thanks!
> btw. can't you use __weak here instead of a new ifdef macro (which is not documented btw)?
__weak won't work with inlines, which I used to ensure zero code overhead in the case the functions aren't needed. If we were OK with calling a no-op function in all cases, we could indeed provide a weak default implementation and get rid of the ifdef.
The new option is documented in the Kconfig file. I assume we don't need to document options in multiple places (both Kconfig and README), since if we do, the documentation is bound to become inconsistent in those two places. Hopefully README goes away once everything is in Kconfig.
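For comparison, the weak-symbol alternative discussed above might look roughly like the sketch below. This is not what the patch does. The demo_ names are hypothetical, and __attribute__((weak)) is the GCC/Clang attribute that U-Boot's __weak macro expands to. A platform would override the defaults by providing ordinary (non-weak) definitions of the same functions in another object file.

```c
#include <assert.h>

/*
 * Weak identity defaults, as they might appear in common code. Because the
 * linker decides which definition wins, callers always make a real function
 * call, even on platforms where the mapping is a no-op.
 */
__attribute__((weak)) unsigned long demo_phys_to_bus(unsigned long phys)
{
	return phys;
}

__attribute__((weak)) unsigned long demo_bus_to_phys(unsigned long bus)
{
	return bus;
}

/* Check that whichever implementation was linked in round-trips. */
void demo_check_round_trip(unsigned long phys)
{
	assert(demo_bus_to_phys(demo_phys_to_bus(phys)) == phys);
}
```

The trade-off is the one described in the reply: the weak variant removes the #ifdef and the Kconfig symbol, while the static inline stubs let the compiler drop the conversion entirely when no mapping is needed.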

On Wednesday, March 25, 2015 at 03:40:28 PM, Stephen Warren wrote:
> On 03/25/2015 05:55 AM, Marek Vasut wrote:
> > On Wednesday, March 25, 2015 at 03:07:33 AM, Stephen Warren wrote:
> > > On some SoCs, DMA-capable peripherals see a different address space to the CPU's physical address space. Create an API to allow platform-agnostic drivers to convert between the two address spaces when programming DMA operations.
> > > This API will exist on all platforms, but will have a dummy implementation when this feature is not required. Other platforms will enable CONFIG_PHYS_TO_BUS and provide the required implementation.
> > > Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
> > Applied to -next, thanks!
> > btw. can't you use __weak here instead of a new ifdef macro (which is not documented btw)?
> __weak won't work with inlines, which I used to ensure zero code overhead in the case the functions aren't needed. If we were OK with calling a no-op function in all cases, we could indeed provide a weak default implementation and get rid of the ifdef.
OK, makes sense.
> The new option is documented in the Kconfig file. I assume we don't need to document options in multiple places (both Kconfig and README), since if we do, the documentation is bound to become inconsistent in those two places. Hopefully README goes away once everything is in Kconfig.
Yup, agreed.
Thanks for clearing this up :)
Best regards, Marek Vasut