IMX8MM 4GiB boundary issue

Greetings,
I'm working with an IMX8MM board that has 4GiB of DRAM. I've found that in this configuration the MMC driver and FEC network driver appear to have some issues with crossing the 4GiB address space. If I tell U-Boot I have 3GiB everything is ok, but when I set it to 4GiB I see the following:
MMC: FSL_SDHC: 0, FSL_SDHC: 1, FSL_SDHC: 2
Loading Environment from MMC... Error found for upper 32 bits
Error found for upper 32 bits
Error found for upper 32 bits
*** Warning - No block device, using default environment
In: serial
Out: serial
Err: serial
Net: DP83867
Warning: ethernet@30be0000 (eth0) using random MAC address - ea:22:3a:4d:8f:d5
eth0: ethernet@30be0000 [PRIME]
Hit any key to stop autoboot: 0
On the FEC ethernet side I don't see any errors reported, but pings fail with 4GiB DRAM.
I suspect the drivers have 32-bit addressing issues: the base of DRAM on IMX8MM is at 1GiB, so anything over 3GiB of DRAM runs past the 32-bit (4GiB) address boundary.
Anyone run into this yet?
Marek, I noticed you are the maintainer for the technexion pico-imx8mq which has support for 1, 2, 3, and 4 GiB DRAM. Did you encounter such issues on the 4GiB variant?
Best Regards,
Tim
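
The arithmetic behind that suspicion can be sketched as follows. This is only an illustration: the 1GiB DRAM base comes from the message above, while the helper names dram_end() and bytes_above_4g() are made up for this example and are not U-Boot functions.

```c
#include <assert.h>
#include <stdint.h>

#define DRAM_BASE   0x40000000ULL   /* 1 GiB: DRAM base on IMX8MM, per the message above */
#define GiB         (1ULL << 30)
#define LIMIT_32BIT (1ULL << 32)    /* first address a 32-bit DMA engine cannot reach */

/* End address (exclusive) of a DRAM bank of the given size. Hypothetical helper. */
static uint64_t dram_end(uint64_t size)
{
    assert(size <= UINT64_MAX - DRAM_BASE);   /* guard against wrap-around */
    return DRAM_BASE + size;
}

/* Number of DRAM bytes that sit at or above the 4 GiB address. Hypothetical helper. */
static uint64_t bytes_above_4g(uint64_t size)
{
    uint64_t end = dram_end(size);

    return end > LIMIT_32BIT ? end - LIMIT_32BIT : 0;
}
```

With 3GiB the bank ends exactly at 0x100000000, so nothing is out of reach; with 4GiB the top 1GiB (0x100000000 to 0x13fffffff) cannot be expressed in a 32-bit address.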

On 9/25/20 4:52 PM, Tim Harvey wrote:
Anyone run into this yet?
I saw similar things on RCar3 with 32bit IPs and used bounce buffers to work around the 32bit limitations where applicable.
Marek, I noticed you are the maintainer for the technexion pico-imx8mq which has support for 1, 2, 3, and 4 GiB DRAM. Did you encounter such issues on the 4GiB variant?
I don't have the 4 GiB variant.
I can imagine that either the FEC/SDHCI is limited to 32-bit addressing in hardware (the DMA can only operate on a 32-bit range because the IP comes from 32-bit systems), or the drivers need to be patched to properly support 64-bit addresses on 64-bit SoCs and 64-bit variants of the IPs.

On Fri, Sep 25, 2020 at 8:05 AM Marek Vasut marex@denx.de wrote:
Marek, I noticed you are the maintainer for the technexion pico-imx8mq which has support for 1, 2, 3, and 4 GiB DRAM. Did you encounter such issues on the 4GiB variant?
I dont have the 4 GiB variant.
ah... that explains why you didn't see it. Note a patch I just sent 'imx8m: fix cache setup for dynamic sdram size' that your board will need as well in order to boot with 4GiB.
I can imagine that either the FEC/SDHCI is limited to 32bit addressing in hardware (the DMA can only operate on 32bit range due to it coming from 32bit systems), OR, the drivers need to be patched to support the 64bit addresses properly on 64bit SoCs and 64bit variants of the IPs
I hadn't thought about the DMA boundary issue. I'll wait for NXP to weigh in before I start digging through drivers. I wonder if there is a simple workaround to make sure U-Boot is running in lower DRAM? I'm not all that clear where U-Boot gets allocated.
Tim

On 9/25/20 5:12 PM, Tim Harvey wrote:
I hadn't thought about the DMA boundary issue. I'll wait for NXP to weigh in before I start digging through drivers. I wonder if there is a simple workaround to make sure U-Boot is running in lower DRAM? I'm not all that clear where U-Boot gets allocated.
Use the whole DRAM and fix the drivers. U-Boot gets relocated to the end of DRAM by default.

Subject: Re: IMX8MM 4GiB boundary issue
On Fri, Sep 25, 2020 at 8:05 AM Marek Vasut marex@denx.de wrote:
I can imagine that either the FEC/SDHCI is limited to 32bit addressing in hardware (the DMA can only operate on 32bit range due to it coming from 32bit systems), OR, the drivers need to be patched to support the 64bit addresses properly on 64bit SoCs and 64bit variants of the IPs
I hadn't thought about the DMA boundary issue. I'll wait for NXP to weigh in before I start digging through drivers. I wonder if there is a simple workaround to make sure U-Boot is running in lower DRAM? I'm not all that clear where U-Boot gets allocated.
The IP only supports 32-bit DMA. You could make U-Boot relocate only to the end of the 4GB memory address space by using get_effective_memsize.
Regards, Peng.
Tim

On 9/27/20 2:56 AM, Peng Fan wrote:
[...]
I can imagine that either the FEC/SDHCI is limited to 32bit addressing in hardware (the DMA can only operate on 32bit range due to it coming from 32bit systems), OR, the drivers need to be patched to support the 64bit addresses properly on 64bit SoCs and 64bit variants of the IPs
I hadn't thought about the DMA boundary issue. I'll wait for NXP to weigh in before I start digging through drivers. I wonder if there is a simple workaround to make sure U-Boot is running in lower DRAM? I'm not all that clear where U-Boot gets allocated.
The IP only support 32bits DMA, you could let U-Boot only relocated to the end of 4GB memory address space using get_effective_memsize
Surely the ARM64 core can address more than 4 GiB of DRAM, and can execute code from above the 4 GiB boundary, right? In that case, get_effective_memsize cannot be used.
What you describe here is a limitation of the old IP blocks which were taken from previously 32bit SoCs and they are incapable of accessing DRAM above the 4 GiB boundary with their limited DMAs. The solution for that is to fix those drivers, e.g. by placing their buffers below the 4 GiB boundary, or by using bounce buffers if needed.
Placing U-Boot below the 4 GiB boundary is NOT a solution in any way, but a broken workaround. There is still nothing preventing user from placing a buffer above the 4 GiB boundary and passing that to the driver, at which point the driver will fail (e.g. a simple "$ load mmc 0:1 0x100000000 file" will just fail, unless e.g. a bounce buffer is used).
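
The bounce-buffer idea described above can be sketched in a few lines. This is a self-contained simulation, not driver code: fake_dma_read(), read_with_bounce(), and low_bounce are hypothetical names, and the "device" is just a function that refuses bus addresses that do not fit in 32 bits. In U-Boot itself the generic helpers declared in include/bouncebuf.h (bounce_buffer_start() and bounce_buffer_stop()) would normally be used rather than hand-rolled copies.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LIMIT_32BIT (1ULL << 32)
#define BOUNCE_SIZE 512

/* Stand-in for a buffer known to live below 4 GiB (hypothetical). */
static uint8_t low_bounce[BOUNCE_SIZE];

/* Hypothetical device model: it can only write to bus addresses below 4 GiB. */
static int fake_dma_read(uint64_t bus_addr, uint8_t *cpu_ptr, size_t len)
{
    if (bus_addr + len > LIMIT_32BIT)
        return -1;               /* comparable to "Error found for upper 32 bits" */
    memset(cpu_ptr, 0xA5, len);  /* pretend the device delivered data */
    return 0;
}

/* Read into dst; go through the low buffer when dst is out of DMA reach. */
static int read_with_bounce(uint64_t dst_bus, uint8_t *dst, size_t len,
                            uint64_t bounce_bus)
{
    assert(len <= BOUNCE_SIZE);

    if (dst_bus + len <= LIMIT_32BIT)
        return fake_dma_read(dst_bus, dst, len);   /* direct DMA works */

    if (fake_dma_read(bounce_bus, low_bounce, len))
        return -1;
    memcpy(dst, low_bounce, len);                  /* the CPU copies to the real target */
    return 0;
}
```

The extra memcpy() is the cost of the approach: the CPU, which can reach all of DRAM, finishes the transfer that the DMA engine could not.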

Subject: Re: IMX8MM 4GiB boundary issue
On 9/27/20 2:56 AM, Peng Fan wrote:
[...]
Surely the ARM64 core can address more than 4 GiB of DRAM, and can execute code from above the 4 GiB boundary, right ?
Yes
In that case, get_effective_memsize cannot be used.
What you describe here is a limitation of the old IP blocks which were taken from previously 32bit SoCs and they are incapable of accessing DRAM above the 4 GiB boundary with their limited DMAs. The solution for that is to fix those drivers, e.g. by placing their buffers below the 4 GiB boundary, or by using bounce buffers if needed.
Placing U-Boot below the 4 GiB boundary is NOT a solution in any way, but a broken workaround. There is still nothing preventing user from placing a buffer above the 4 GiB boundary and passing that to the driver, at which point the driver will fail (e.g. a simple "$ load mmc 0:1 0x100000000 file" will just fail, unless e.g. a bounce buffer is used).
That means several drivers will need to use bounce buffers: sdhc/fec/usb/nand/video. Let's see how to address the drivers.
Thanks, Peng.

On 9/27/20 4:35 AM, Peng Fan wrote:
That will be several drivers need to use bounce buffer, sdhc/fec/usb/nand/video. Let's see how to address the drivers.
R-Car had the same problem, so you can look there.

On Sun, Sep 27, 2020 at 7:47 AM Marek Vasut marex@denx.de wrote:
R-Car had the same problem, so you can look there.
Marek,
Are you referring to d2661d8 ("mmc: tmio: sdhi: Use bounce buffer to avoid DMA limitations")?
Do you know the state of the Linux kernel drivers with regard to this issue, and whether there is a performance hit due to the bounce buffers?
Best Regards,
Tim

On 10/1/20 6:33 PM, Tim Harvey wrote:
Marek,
Are you referring to d2661d8: mmc: tmio: sdhi: Use bounce buffer to avoid DMA limitations
Yes, but on R-Car3 this could also be solved better by using IOMMU (Linux does it). IOMMU isn't available on MX8M though, to my knowledge.
Do you know the state of the Linux kernel drivers with regards to this issue and if there is a performance hit due to the bounce buffers?
In Linux, R-Car3 uses IOMMU, so there is no performance hit on that specific hardware. On iMX8M, you would likely need to set some bit which indicates the hardware supports only 32bit DMA, so the DMA buffers would be allocated below the 32bit barrier, also no big problem. I think it is one of the DMA flags, DMA_BIT_MASK(32) or so.
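
For readers unfamiliar with the mask mentioned above, here is a small standalone sketch of what a 32-bit DMA mask expresses. The macro mirrors the definition of DMA_BIT_MASK() in the Linux kernel; dma_range_ok() is a hypothetical helper written for this example and is not a kernel API. In a real Linux driver the mask would be handed to the DMA layer (for example via dma_set_mask_and_coherent()) rather than checked by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the Linux kernel macro of the same name. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: does [addr, addr + len) fit entirely under the mask? */
static int dma_range_ok(uint64_t addr, uint64_t len, uint64_t mask)
{
    assert(len > 0);
    /* Written to avoid overflow in addr + len. */
    return addr <= mask && len - 1 <= mask - addr;
}
```

A buffer ending exactly at the 4 GiB boundary passes with DMA_BIT_MASK(32); one byte more, or any buffer starting at 0x100000000, does not.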

On Thu, Oct 1, 2020 at 1:50 PM Marek Vasut marex@denx.de wrote:
In Linux, R-Car3 uses IOMMU, so there is no performance hit on that specific hardware. On iMX8M, you would likely need to set some bit which indicates the hardware supports only 32bit DMA, so the DMA buffers would be allocated below the 32bit barrier, also no big problem. I think it is one of the DMA flags, DMA_BIT_MASK(32) or so.
Just saw a recent discussion on this and the recommendation was to use the 'dma-ranges' property.
Please check:

On Sat, Sep 26, 2020 at 7:35 PM Peng Fan peng.fan@nxp.com wrote:
That will be several drivers need to use bounce buffer, sdhc/fec/usb/nand/video. Let's see how to address the drivers.
Peng,
I assume the Linux sdhc/fec/usb/nand/video drivers handle those IPs having only 32-bit DMA by using bounce buffers, at the cost of some performance (such as not being able to use zero-copy buffers)? By video, I assume you mean the CSI/DSI bridge has 32-bit DMA, but hopefully the GPU/VPU has 64-bit DMA?
This makes me wonder if more than 3GiB is worth much on IMX8M.
Tim

Subject: Re: IMX8MM 4GiB boundary issue
Peng,
I assume the Linux sdhc/fec/usb/nand/video drivers take care of those IP's only having 32bit DMA's via bounce buffers at the cost of some performance hit (like not being able to use zero-copy buffers)? By video I assume you mean CSI/DSI bridge has 32bit DMA but hopefully GPU/VPU has 64bit DMA?
I did not check all the IP details. The GPU may support a 36-bit address space.
We use a CMA area within the lower 4GB address space for DMA.
This makes me wonder if >3GiB is worth much on IMX8M
The DMA limitation does not prevent the CPU from using the larger DRAM.
Thanks, Peng.
Tim

Hi Tim,
Sorry for resurrecting such an old thread.
On Fri, Sep 25, 2020 at 11:52 AM Tim Harvey tharvey@gateworks.com wrote:
On the FEC ethernet side I don't see any errors reported but ping's fail with 4GiB DRAM.
Yes, just noticed the "Error found for upper 32 bits" and FEC breakage on an imx8mm iotgate board.
Is there any progress with regards to the support of 4GB of RAM on imx8mm?
Thanks

On 2/24/22 20:50, Fabio Estevam wrote:
Is there any progress with regards to the support of 4GB of RAM on imx8mm?
If the IP doesn't support access above 32-bit bus addresses, but mmc_data->src/dst is above 4 GiB, use 'struct bounce_buffer' on that buffer and let U-Boot "bounce" into 32-bit space, which the IP can handle. The same applies to ethernet and other IPs with such limitations.

On Thu, Feb 24, 2022 at 12:03 PM Marek Vasut marex@denx.de wrote:
If the IP doesn't support access above 32bit bus addresses, but mmc_data->src/dst is above 4 GiB, use 'struct bounce_buffer' on that buffer and let U-Boot "bounce" into 32bit space which the IP can handle . Same thing for ethernet and other IPs with such limitations.
Fabio,
As Marek points out, the individual peripheral drivers have not been updated properly, but this should be worked around by commit e27bddff4b97 ("imx8m: Restrict usable memory to space below 4G boundary").
Best regards,
Tim

On 2/24/22 21:19, Tim Harvey wrote:
Fabio,
As Marek points out the individual peripheral drivers have not been updated properly but this should be worked around with commit e27bddff4b97 ("imx8m: Restrict usable memory to space below 4G boundary").
Ah, ok ... so drivers were left broken, a workaround was added instead to paper over the bug. Pity.

Hi Tim,
On Thu, Feb 24, 2022 at 5:20 PM Tim Harvey tharvey@gateworks.com wrote:
Fabio,
As Marek points out the individual peripheral drivers have not been updated properly but this should be worked around with commit e27bddff4b97 ("imx8m: Restrict usable memory to space below 4G boundary").
I am using 2021.07, which contains that commit, but it does not help in my case. FEC and esdhc are still broken. I will try Marek's suggestion to fix the drivers.
Just curious: don't you see such errors anymore with your board?
Thanks

On Thu, Feb 24, 2022 at 12:32 PM Fabio Estevam festevam@gmail.com wrote:
I am using 2021.07 which contains such commit, but it does not help in my case. FEC and esdhc are still broken. I will try Marek's suggestion to fix the drivers.
Just curious: don't you see such errors anymore with your board?
Fabio,
No, that commit is 'not' in v2021.07. Please test with master and you should see that go away.
Regardless, Marek's suggestion is the right fix if you can manage that... we really don't want to limit 4GB boards to 3GB. I was hoping NXP would step up and address the peripheral drivers for this.
Tim

Hi Tim,
On Thu, Feb 24, 2022 at 6:46 PM Tim Harvey tharvey@gateworks.com wrote:
Fabio,
No, that commit is 'not' in v2021.07. Please test with master and you should see that go away.
Yes, you are right.
Regardless, Marek's suggestion is the right fix if you can manage that... we really don't want to limit 4GB boards to 3GB. I was hoping NXP would step up and address the peripheral drivers for this.
Agreed, thanks!

From: Fabio Estevam festevam@gmail.com Date: Fri, 25 Feb 2022 08:12:58 -0300
Hi Tim,
On Thu, Feb 24, 2022 at 6:46 PM Tim Harvey tharvey@gateworks.com wrote:
Fabio,
No, that commit is 'not' in v2021.07. Please test with master and you should see that go away.
Yes, you are right.
Regardless, Marek's suggestion is the right fix if you can manage that... we really don't want to limit 4GB boards to 3GB. I was hoping NXP would step up and address the peripheral drivers for this.
Agreed, thanks!
But isn't the problem here that (some of) the hardware peripherals simply can't address memory above the 4GB boundary?
OS kernels can work around such limitations by using an IOMMU (if provided by the hardware) or by using bounce buffers (swiotlb in Linux speak). The traditional way to deal with this in u-boot is to make sure that u-boot only uses memory below the 4GB boundary by implementing board_get_usable_ram_top() and making sure that all the addresses in the u-boot environment are in "low" memory. For EFI support there is the CONFIG_EFI_LOADER_BOUNCE_BUFFER option, which should be set to "y" in this case.
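The clamping approach described above can be modelled in a few lines of standalone C. This is only a sketch: usable_ram_top() is a hypothetical helper, not the actual board_get_usable_ram_top() hook, whose signature and return convention in U-Boot may differ. The 1 GiB DRAM base used in the usage note is the IMX8MM value mentioned earlier in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4G (1ULL << 32)

/*
 * Hypothetical model: return the exclusive top of the RAM that the
 * bootloader may use, clamped so nothing is placed at or above 4 GiB.
 */
uint64_t usable_ram_top(uint64_t ram_base, uint64_t ram_size)
{
	/* Guard against wrap-around in the addition below. */
	assert(ram_size <= UINT64_MAX - ram_base);

	uint64_t top = ram_base + ram_size;

	return top > SZ_4G ? SZ_4G : top;
}
```

With a base of 0x40000000, a 4 GiB board gets a top of 0x100000000, the same as a 3 GiB board, so the last 1 GiB is simply left unused by the bootloader. That is the cost being objected to elsewhere in this thread.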

On 2/25/22 12:37, Mark Kettenis wrote:
From: Fabio Estevam festevam@gmail.com Date: Fri, 25 Feb 2022 08:12:58 -0300
Hi Tim,
On Thu, Feb 24, 2022 at 6:46 PM Tim Harvey tharvey@gateworks.com wrote:
Fabio,
No, that commit is 'not' in v2021.07. Please test with master and you should see that go away.
Yes, you are right.
Regardless, Marek's suggestion is the right fix if you can manage that... we really don't want to limit 4GB boards to 3GB. I was hoping NXP would step up and address the peripheral drivers for this.
Agreed, thanks!
But isn't the problem here that (some of) the hardware peripherals simply can't address memory above the 4GB boundary?
OS kernels can work around such limitations by using an IOMMU (if provided by the hardware) or by using bounce buffers (swiotlb in Linux speak).
Right, see bounce_buffer in U-Boot.
The traditional way to deal with this in u-boot is to make sure that u-boot only uses memory below the 4GB boundary by implementing board_get_usable_ram_top() and making sure that all the addresses in the u-boot environment are in "low" memory.
The purpose of board_get_usable_ram_top() was originally something else entirely; only later did it start being misused to work around driver issues instead of fixing them, and that is utterly wrong.
For EFI support there is the CONFIG_EFI_LOADER_BOUNCE_BUFFER option, which should be set to "y" in this case.
There is a generic bounce buffer for drivers, see common/bouncebuf.c.

Date: Fri, 25 Feb 2022 14:50:59 +0100 From: Marek Vasut marex@denx.de
On 2/25/22 12:37, Mark Kettenis wrote:
From: Fabio Estevam festevam@gmail.com Date: Fri, 25 Feb 2022 08:12:58 -0300
Hi Tim,
On Thu, Feb 24, 2022 at 6:46 PM Tim Harvey tharvey@gateworks.com wrote:
Fabio,
No, that commit is 'not' in v2021.07. Please test with master and you should see that go away.
Yes, you are right.
Regardless, Marek's suggestion is the right fix if you can manage that... we really don't want to limit 4GB boards to 3GB. I was hoping NXP would step up and address the peripheral drivers for this.
Agreed, thanks!
But isn't the problem here that (some of) the hardware peripherals simply can't address memory above the 4GB boundary?
OS kernels can work around such limitations by using an IOMMU (if provided by the hardware) or by using bounce buffers (swiotlb in Linux speak).
Right, see bounce_buffer in U-Boot.
The traditional way to deal with this in u-boot is to make sure that u-boot only uses memory below the 4GB boundary by implementing board_get_usable_ram_top() and making sure that all the addresses in the u-boot environment are in "low" memory.
The purpose of board_get_usable_ram_top() was originally something else entirely; only later did it start being misused to work around driver issues instead of fixing them, and that is utterly wrong.
For EFI support there is the CONFIG_EFI_LOADER_BOUNCE_BUFFER option, which should be set to "y" in this case.
There is a generic bounce buffer for drivers, see common/bouncebuf.c.
That implementation only seems to exist to handle misaligned buffers. As far as I can tell it doesn't make any attempt to make sure it allocates memory in a specific address range. Although I suppose that using memalign() means it allocates from the heap and boards have some control over where the heap lives. But doesn't that rely on board_get_usable_ram_top()?
I'm following this discussion since I'm trying to work out the best way to add PCIe support for the Apple M1 "boards". There the issue isn't so much that the hardware peripherals can't address memory above the 4GB boundary (there is no memory below the 4GB boundary!), but that the IOMMU only has a 4GB iova window, which means that I cannot have the IOMMU map all physical memory 1:1. So I either have to make sure that U-Boot (including the efi_loader subsystem) only uses memory in a particular 4GB range, or I have to add an interface to have drivers explicitly map memory through the IOMMU (and have them unmap when they're done). Such an interface would look somewhat similar to the bounce buffer interface.
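To make the shape of such a map/unmap interface concrete, here is a toy standalone C model. Everything in it is an assumption made for illustration: the toy_iommu names, the 4 KiB page size, the eight-slot window and the iova base are all hypothetical, and none of it reflects the Apple hardware or any existing U-Boot API.

```c
#include <stdint.h>

#define TOY_PAGE_SIZE	4096ULL
#define TOY_SLOTS	8ULL			/* pages in the iova window */
#define TOY_IOVA_BASE	0x80000000ULL		/* window sits below 4 GiB */
#define TOY_IOVA_INVALID UINT64_MAX

struct toy_iommu {
	uint64_t phys_page[TOY_SLOTS];	/* physical page number per slot */
	unsigned char used[TOY_SLOTS];
};

/* Map [phys, phys + len) into the window; returns the device-visible iova. */
uint64_t toy_iommu_map(struct toy_iommu *mmu, uint64_t phys, uint64_t len)
{
	if (len == 0)
		return TOY_IOVA_INVALID;

	uint64_t first = phys / TOY_PAGE_SIZE;
	uint64_t last = (phys + len - 1) / TOY_PAGE_SIZE;
	uint64_t npages = last - first + 1;

	if (npages > TOY_SLOTS)
		return TOY_IOVA_INVALID;

	/* First-fit search for npages contiguous free slots. */
	for (uint64_t start = 0; start + npages <= TOY_SLOTS; start++) {
		uint64_t i = 0;

		while (i < npages && !mmu->used[start + i])
			i++;
		if (i < npages)
			continue;

		for (i = 0; i < npages; i++) {
			mmu->used[start + i] = 1;
			mmu->phys_page[start + i] = first + i;
		}
		return TOY_IOVA_BASE + start * TOY_PAGE_SIZE +
		       phys % TOY_PAGE_SIZE;
	}
	return TOY_IOVA_INVALID;
}

/* What the device would reach through iova, or TOY_IOVA_INVALID. */
uint64_t toy_iommu_translate(const struct toy_iommu *mmu, uint64_t iova)
{
	if (iova < TOY_IOVA_BASE)
		return TOY_IOVA_INVALID;

	uint64_t off = iova - TOY_IOVA_BASE;
	uint64_t slot = off / TOY_PAGE_SIZE;

	if (slot >= TOY_SLOTS || !mmu->used[slot])
		return TOY_IOVA_INVALID;
	return mmu->phys_page[slot] * TOY_PAGE_SIZE + off % TOY_PAGE_SIZE;
}

/* Release the slots covering [iova, iova + len). */
void toy_iommu_unmap(struct toy_iommu *mmu, uint64_t iova, uint64_t len)
{
	if (len == 0 || iova < TOY_IOVA_BASE)
		return;

	uint64_t off = iova - TOY_IOVA_BASE;
	uint64_t first = off / TOY_PAGE_SIZE;
	uint64_t last = (off + len - 1) / TOY_PAGE_SIZE;

	for (uint64_t i = first; i <= last && i < TOY_SLOTS; i++)
		mmu->used[i] = 0;
}
```

The calling pattern mirrors a bounce buffer: map before starting the transfer, program the device with the returned iova, unmap when the transfer is done. Unlike a bounce buffer, no data is copied, but the window can run out of slots, so callers have to handle a failed map.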

On 2/26/22 14:30, Mark Kettenis wrote:
Date: Fri, 25 Feb 2022 14:50:59 +0100 From: Marek Vasut marex@denx.de
On 2/25/22 12:37, Mark Kettenis wrote:
From: Fabio Estevam festevam@gmail.com Date: Fri, 25 Feb 2022 08:12:58 -0300
Hi Tim,
On Thu, Feb 24, 2022 at 6:46 PM Tim Harvey tharvey@gateworks.com wrote:
Fabio,
No, that commit is 'not' in v2021.07. Please test with master and you should see that go away.
Yes, you are right.
Regardless, Marek's suggestion is the right fix if you can manage that... we really don't want to limit 4GB boards to 3GB. I was hoping NXP would step up and address the peripheral drivers for this.
Agreed, thanks!
But isn't the problem here that (some of) the hardware peripherals simply can't address memory above the 4GB boundary?
OS kernels can work around such limitations by using an IOMMU (if provided by the hardware) or by using bounce buffers (swiotlb in Linux speak).
Right, see bounce_buffer in U-Boot.
The traditional way to deal with this in u-boot is to make sure that u-boot only uses memory below the 4GB boundary by implementing board_get_usable_ram_top() and making sure that all the addresses in the u-boot environment are in "low" memory.
The purpose of board_get_usable_ram_top() was originally something else entirely; only later did it start being misused to work around driver issues instead of fixing them, and that is utterly wrong.
For EFI support there is the CONFIG_EFI_LOADER_BOUNCE_BUFFER option, which should be set to "y" in this case.
There is a generic bounce buffer for drivers, see common/bouncebuf.c.
That implementation only seems to exist to handle misaligned buffers. As far as I can tell it doesn't make any attempt to make sure it allocates memory in a specific address range. Although I suppose that using memalign() means it allocates from the heap and boards have some control over where the heap lives. But doesn't that rely on board_get_usable_ram_top()?
Possibly, I suspect someone will have to take a deeper look into this and maybe implement some better bounce buffer.
I'm following this discussion since I'm trying to work out the best way to add PCIe support for the Apple M1 "boards". There the issue isn't so much that the hardware peripherals can't address memory above the 4GB boundary (there is no memory below the 4GB boundary!), but that the IOMMU only has a 4GB iova window, which means that I cannot have the IOMMU map all physical memory 1:1. So I either have to make sure that U-Boot (including the efi_loader subsystem) only uses memory in a particular 4GB range, or I have to add an interface to have drivers explicitly map memory through the IOMMU (and have them unmap when they're done). Such an interface would look somewhat similar to the bounce buffer interface.
Maybe now is the right time to implement such an interface? Isn't that what Linux uses swiotlb for?
participants (5): Fabio Estevam, Marek Vasut, Mark Kettenis, Peng Fan, Tim Harvey