[U-Boot] i.MX dcache issues

Hi all,
Has anyone tested i.MX25 or i.MX35 with dcache on?
I'm working with U-Boot 2012.04.01 on several custom platforms using these processors.
With dcache off, everything works fine, but slowly.
With dcache on, it's much faster (e.g. mtest), but it's impossible to read files through the eSDHC, and U-Boot hangs if trying to ping using the FEC.
Shouldn't mtest disable dcache automatically, then set it back to its original state once finished? Otherwise, some of its tests to the memory may actually test the dcache rather than the memory chips connections.
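The save/disable/restore idea could look roughly like the sketch below. It assumes U-Boot-style dcache_status(), dcache_enable() and dcache_disable() helpers, stubbed here with a plain flag so the example stands alone. run_memory_test() and mtest_uncached() are hypothetical names, not existing U-Boot functions.

```c
#include <stdbool.h>

/* Stand-ins for U-Boot's cache helpers; here they only toggle a flag. */
static bool dcache_on = true;
static int dcache_status(void) { return dcache_on; }
static void dcache_enable(void) { dcache_on = true; }
static void dcache_disable(void) { dcache_on = false; }

/* Hypothetical test body: records whether it ran with the dcache on. */
static bool test_saw_dcache_on;
static int run_memory_test(void)
{
	test_saw_dcache_on = dcache_status();
	return 0; /* 0 = no errors found */
}

/* Run the test uncached, then restore the previous dcache state. */
static int mtest_uncached(void)
{
	int was_enabled = dcache_status();
	int ret;

	if (was_enabled)
		dcache_disable();

	ret = run_memory_test();

	if (was_enabled)
		dcache_enable();

	return ret;
}
```

The dcache is only re-enabled if it was on before, so a board built with the dcache off is left untouched.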
dcache seems to be enabled for mx35pdk, but disabled for spear3, so I'm wondering if it has been thoroughly tested on mx35pdk.
I have added traces to arch/arm/lib/cache-cp15.c to check that the MMU is properly initialized with the appropriate addresses, and it is.
Defining CONFIG_SYS_CACHELINE_SIZE to 32 in the board file, which sets ARCH_DMA_MINALIGN to 32 instead of 64 does not worsen things (as expected with 32-byte cache lines).
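To make the alignment requirement concrete, here is a small sketch of rounding a length up to whole cache lines and checking that a range is line-aligned. CACHELINE_SIZE and DMA_MINALIGN are stand-ins for CONFIG_SYS_CACHELINE_SIZE and ARCH_DMA_MINALIGN, and both helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for CONFIG_SYS_CACHELINE_SIZE and ARCH_DMA_MINALIGN. */
#define CACHELINE_SIZE 32u
#define DMA_MINALIGN   CACHELINE_SIZE

/* Round a length up to a whole number of cache lines. */
static size_t dma_round_up(size_t len)
{
	return (len + DMA_MINALIGN - 1) & ~(size_t)(DMA_MINALIGN - 1);
}

/* True if both the start address and the length cover whole cache lines. */
static int dma_range_is_aligned(uintptr_t start, size_t len)
{
	return ((start | len) & (DMA_MINALIGN - 1)) == 0;
}
```

With 32-byte lines, an 8-byte transfer rounds up to 32 bytes, and a 512-byte block starting on a 32-byte boundary passes the check.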
Defining CONFIG_MMC_BOUNCE_BUFFER and/or setting the no_snoop option of the eSDHC driver does not change anything.
Defining DEBUG in mmc.c shows that the 1st mmc_send_cmd following the retry_scr label uses a cacheline-unaligned size that gets caught by the buffer bouncing mechanism that reallocs it for nothing since scr has already been allocated with a cacheline-aligned size. The requested transfer length is left unchanged by bouncing, but it seems normal for MMC commands, and it should not be an issue as long as allocated sizes are aligned.
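For readers unfamiliar with bouncing, the general idea can be sketched as follows. This is not the CONFIG_MMC_BOUNCE_BUFFER implementation: fake_dma_read(), bounced_read() and the 32-byte DMA_ALIGN are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DMA_ALIGN 32u /* assumed cache line size */

/* Hypothetical DMA read: the "device" fills dst with len bytes of 0xA5. */
static void fake_dma_read(void *dst, size_t len)
{
	memset(dst, 0xA5, len);
}

static int bounce_count; /* how many transfers needed a bounce buffer */

/* Read len bytes into buf, bouncing if buf or len is not line-aligned. */
static int bounced_read(void *buf, size_t len)
{
	size_t padded = (len + DMA_ALIGN - 1) & ~(size_t)(DMA_ALIGN - 1);
	void *raw, *bounce;

	if ((((uintptr_t)buf | len) & (DMA_ALIGN - 1)) == 0) {
		fake_dma_read(buf, len); /* already safe for DMA */
		return 0;
	}

	/* Over-allocate, then align the pointer by hand. */
	raw = malloc(padded + DMA_ALIGN);
	if (!raw)
		return -1;
	bounce = (void *)(((uintptr_t)raw + DMA_ALIGN - 1) &
			  ~(uintptr_t)(DMA_ALIGN - 1));

	bounce_count++;
	fake_dma_read(bounce, len); /* requested length is left unchanged */
	memcpy(buf, bounce, len);   /* copy back only what was asked for */
	free(raw);
	return 0;
}
```

In this sketch a short transfer such as 8 bytes is bounced even when the destination pointer itself is aligned, because the length does not cover a whole cache line.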
Also, the mmc read commands to free RAM regions seem to work fine, so I'm wondering if this issue is not caused only by stack-allocated buffers. I'm not sure the data ending up in the buffers is random when there is an issue: the pattern f783 appears very often at the position of the expected 55aa DOS partition marker.
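The check behind the 55aa observation is simple enough to sketch. The helper name is hypothetical; the signature bytes 0x55 0xAA at offsets 510 and 511 of the first sector are the standard DOS partition table marker.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* True if the sector ends with the 0x55 0xAA boot signature. */
static int has_dos_signature(const uint8_t *sector)
{
	return sector[SECTOR_SIZE - 2] == 0x55 &&
	       sector[SECTOR_SIZE - 1] == 0xAA;
}
```

A sector that ends in f7 83 instead fails this check, which is the point at which partition and filesystem commands give up.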
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
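A toy model may help show why a recently touched buffer (such as one on the stack) reads back stale data after DMA. It simulates a single write-back cache line in plain C; every name in it is made up for the illustration. In U-Boot the corresponding operations would be flush_dcache_range() and invalidate_dcache_range() over the DMA buffer.

```c
#include <stdint.h>
#include <string.h>

#define LINE 32u

static uint8_t ram[LINE];   /* what the DMA engine sees */
static uint8_t cache[LINE]; /* CPU-side copy of the same line */
static int line_valid, line_dirty;

/* CPU read: served from the cache line, filled from RAM on a miss. */
static uint8_t cpu_read(unsigned i)
{
	if (!line_valid) {
		memcpy(cache, ram, LINE);
		line_valid = 1;
		line_dirty = 0;
	}
	return cache[i];
}

/* CPU write: write-back policy, RAM is not updated until a flush. */
static void cpu_write(unsigned i, uint8_t v)
{
	(void)cpu_read(i); /* allocate the line first */
	cache[i] = v;
	line_dirty = 1;
}

/* Toy counterpart of a flush: write the dirty line back to RAM. */
static void toy_flush(void)
{
	if (line_valid && line_dirty) {
		memcpy(ram, cache, LINE);
		line_dirty = 0;
	}
}

/* Toy counterpart of an invalidate: discard the cached copy. */
static void toy_invalidate(void)
{
	line_valid = 0;
	line_dirty = 0;
}

/* Device-to-memory DMA bypasses the cache entirely. */
static void dma_write(uint8_t v)
{
	memset(ram, v, LINE);
}
```

In this model, a read after dma_write() keeps returning the cached value until toy_invalidate() is called, while a buffer in RAM that the CPU never touched has no cached line and reads correctly. That matches the stack-versus-free-RAM difference described above.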
As to the FEC, dcache issues with DMA seem to have been taken care of, so I'll add traces to the FEC driver to see if there is any cacheline-unaligned buffer passed in.
These issues could also be related to L2 on i.MX35, but I don't see any code enabling it.
BTW, why isn't there an enable_caches function in arch/arm/cpu/arm926ejs/cache.c or arch/arm/cpu/arm926ejs/cpu.c, just like in arch/arm/cpu/arm1136/cpu.c, so that dcache can be enabled by default if CONFIG_SYS_DCACHE_OFF isn't defined?
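The kind of hook being asked about could be sketched like this. It is only loosely modeled on the pattern described for arch/arm/cpu/arm1136/cpu.c, and the upstream code may differ; dcache_enable() is stubbed with a flag so the fragment compiles on its own.

```c
/* Stand-in for U-Boot's dcache_enable(); here it only sets a flag. */
static int dcache_on;
static void dcache_enable(void) { dcache_on = 1; }

/* Define CONFIG_SYS_DCACHE_OFF (e.g. in the board config) to opt out. */
#ifndef CONFIG_SYS_DCACHE_OFF
void enable_caches(void)
{
	dcache_enable();
}
#else
void enable_caches(void)
{
	/* dcache intentionally left off */
}
#endif
```

With such a hook present for both CPU families, the default behaviour without CONFIG_SYS_DCACHE_OFF would at least be the same on ARM926EJ-S and ARM1136 parts.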
Regards, Benoît

On Sat, Jul 14, 2012 at 11:28:03PM +0200, Benoît Thébaudeau wrote:
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
That will help: http://git.denx.de/?p=u-boot/u-boot-mmc.git;a=commitdiff;h=e576bd90f940806b9...
I can't wait for the 2012.07 release! ;)
Benoît

On 15.07.2012 00:08, Benoît Thébaudeau wrote:
On Sat, Jul 14, 2012 at 11:28:03PM +0200, Benoît Thébaudeau wrote:
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
That will help: http://git.denx.de/?p=u-boot/u-boot-mmc.git;a=commitdiff;h=e576bd90f940806b9...
Are you sure that this patch does really help?
If I remember correctly (will re-check) we have this patch locally applied. But even with this patch, we have issues so that we enabled CONFIG_SYS_DCACHE_OFF, i.e. disabled the dcache.
The issues we observed *without* CONFIG_SYS_DCACHE_OFF: The SD card was detected as 1-bit only (mmcinfo), while with dcache off it was used as 4-bit. Debugging this showed that wrong configuration data was read [1]. Having a fat partition on the card, mmc part/fatls etc failed, too, with cache enabled.
Best regards
Dirk
[1]
http://git.denx.de/?p=u-boot.git;a=blob;f=drivers/mmc/mmc.c;h=aebe578ff6f2e0...
mmc->scr[0]/scr[1] contained random data

On Sun, Jul 15, 2012 at 08:56:35AM +0200, Dirk Behme wrote:
On 15.07.2012 00:08, Benoît Thébaudeau wrote:
On Sat, Jul 14, 2012 at 11:28:03PM +0200, Benoît Thébaudeau wrote:
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
That will help: http://git.denx.de/?p=u-boot/u-boot-mmc.git;a=commitdiff;h=e576bd90f940806b9...
Are you sure that this patch does really help?
I meant: It's necessary, but perhaps not sufficient. I have not tested it yet.
If I remember correctly (will re-check) we have this patch locally applied. But even with this patch, we have issues so that we enabled CONFIG_SYS_DCACHE_OFF, i.e. disabled the dcache.
The issues we observed *without* CONFIG_SYS_DCACHE_OFF: The SD card was detected as 1-bit only (mmcinfo), while with dcache off it was used as 4-bit. Debugging this showed that wrong configuration data was read [1]. Having a fat partition on the card, mmc part/fatls etc failed, too, with cache enabled.
It's exactly the kind of issues I currently get. Was CONFIG_MMC_BOUNCE_BUFFER defined for your tests to make sure no unaligned buffer was used? I'll tell you if it works better for me with this patch.
Best regards, Benoît

Dear Dirk Behme,
On 15.07.2012 00:08, Benoît Thébaudeau wrote:
On Sat, Jul 14, 2012 at 11:28:03PM +0200, Benoît Thébaudeau wrote:
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
That will help: http://git.denx.de/?p=u-boot/u-boot-mmc.git;a=commitdiff;h=e576bd90f940806b989ffd666552081f17f032c8
Are you sure that this patch does really help?
If I remember correctly (will re-check) we have this patch locally applied. But even with this patch, we have issues so that we enabled CONFIG_SYS_DCACHE_OFF, i.e. disabled the dcache.
Try using the bounce buffer as I do on mx28.
The issues we observed *without* CONFIG_SYS_DCACHE_OFF: The SD card was detected as 1-bit only (mmcinfo), while with dcache off it was used as 4-bit. Debugging this showed that wrong configuration data was read [1]. Having a fat partition on the card, mmc part/fatls etc failed, too, with cache enabled.
Ad caches -- FEC should be fixed on these platforms, USB might work, MMC should work if you use the bounce buffer. FAT should be fixed, so should be ext2
Best regards
Dirk
[1]
http://git.denx.de/?p=u-boot.git;a=blob;f=drivers/mmc/mmc.c;h=aebe578ff6f2e0228baa3e5d010f6808ea269760;hb=HEAD#l856
mmc->scr[0]/scr[1] contained random data
Best regards, Marek Vasut

Hi all,
On Tue, Jul 17, 2012 at 12:30:22AM +0200, Marek Vasut wrote:
Dear Dirk Behme,
On 15.07.2012 00:08, Benoît Thébaudeau wrote:
On Sat, Jul 14, 2012 at 11:28:03PM +0200, Benoît Thébaudeau wrote:
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
That will help: http://git.denx.de/?p=u-boot/u-boot-mmc.git;a=commitdiff;h=e576bd90f940806b989ffd666552081f17f032c8
Are you sure that this patch does really help?
If I remember correctly (will re-check) we have this patch locally applied. But even with this patch, we have issues so that we enabled CONFIG_SYS_DCACHE_OFF, i.e. disabled the dcache.
Try using the bounce buffer as I do on mx28.
The issues we observed *without* CONFIG_SYS_DCACHE_OFF: The SD card was detected as 1-bit only (mmcinfo), while with dcache off it was used as 4-bit. Debugging this showed that wrong configuration data was read [1]. Having a fat partition on the card, mmc part/fatls etc failed, too, with cache enabled.
Ad caches -- FEC should be fixed on these platforms, USB might work, MMC should work if you use the bounce buffer. FAT should be fixed, so should be ext2
I have switched to the git head, and applied all the latest EHCI/MSC patches from u-boot-usb-next. Since then, all my MMC/EHCI/FEC i.MX DMA issues are resolved, except on i.MX35 because of a huge bug in the handling of cache range checks on ARM1136. I have just posted a patch that fixes this issue.
Best regards, Benoît

Dear Benoît Thébaudeau,
Hi all,
On Tue, Jul 17, 2012 at 12:30:22AM +0200, Marek Vasut wrote:
Dear Dirk Behme,
On 15.07.2012 00:08, Benoît Thébaudeau wrote:
On Sat, Jul 14, 2012 at 11:28:03PM +0200, Benoît Thébaudeau wrote:
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
That will help: http://git.denx.de/?p=u-boot/u-boot-mmc.git;a=commitdiff;h=e576bd90f940806b989ffd666552081f17f032c8
Are you sure that this patch does really help?
If I remember correctly (will re-check) we have this patch locally applied. But even with this patch, we have issues so that we enabled CONFIG_SYS_DCACHE_OFF, i.e. disabled the dcache.
Try using the bounce buffer as I do on mx28.
The issues we observed *without* CONFIG_SYS_DCACHE_OFF: The SD card was detected as 1-bit only (mmcinfo), while with dcache off it was used as 4-bit. Debugging this showed that wrong configuration data was read [1]. Having a fat partition on the card, mmc part/fatls etc failed, too, with cache enabled.
Ad caches -- FEC should be fixed on these platforms, USB might work, MMC should work if you use the bounce buffer. FAT should be fixed, so should be ext2
I have switched to the git head, and applied all the latest EHCI/MSC patches from u-boot-usb-next. Since then, all my MMC/EHCI/FEC i.MX DMA issues are resolved, except on i.MX35 because of a huge bug in the handling of cache range checks on ARM1136. I have just posted a patch that fixes this issue.
AARGH! Thank you very much!
Best regards, Benoît
Best regards, Marek Vasut

On 19/07/2012 13:43, Benoît Thébaudeau wrote:
Hi all,
Hi,
I have switched to the git head, and applied all the latest EHCI/MSC patches from u-boot-usb-next.
Marek, should we maybe still try to merge these patches into the upcoming release? It makes no sense if the release is broken and we already have a solution to fix it.
Since then, all my MMC/EHCI/FEC i.MX DMA issues are resolved, except on i.MX35 because of a huge bug in the handling of cache range checks on ARM1136. I have just posted a patch that fixes this issue.
Thanks for fixing!
Best regards, Stefano Babic

Dear Stefano Babic,
On 19/07/2012 13:43, Benoît Thébaudeau wrote:
Hi all,
Hi,
I have switched to the git head, and applied all the latest EHCI/MSC patches from u-boot-usb-next.
Marek, should we maybe still try to merge these patches into the upcoming release? It makes no sense if the release is broken and we already have a solution to fix it.
But Wolfgang will make three small ones of me if they break anything ;-) As with the last release, I'm really a bit cautious here. But OK then, I'll grow some balls and submit a pull request ;-)
Since then, all my MMC/EHCI/FEC i.MX DMA issues are resolved, except on i.MX35 because of a huge bug in the handling of cache range checks on ARM1136. I have just posted a patch that fixes this issue.
Thanks for fixing!
Best regards, Stefano Babic
Best regards, Marek Vasut

On 15/07/2012 00:08, Benoît Thébaudeau wrote:
On Sat, Jul 14, 2012 at 11:28:03PM +0200, Benoît Thébaudeau wrote:
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
That will help: http://git.denx.de/?p=u-boot/u-boot-mmc.git;a=commitdiff;h=e576bd90f940806b9...
This patch is merged into u-boot-mmc. It will flow to mainline with the next pull request from Andy (the MMC custodian).
Best regards, Stefano Babic

On 14/07/2012 23:28, Benoît Thébaudeau wrote:
Hi all,
Hi,
Has anyone tested i.MX25 or i.MX35 with dcache on?
I'm working with U-Boot 2012.04.01 on several custom platforms using these processors.
With dcache off, everything works fine, but slowly.
With dcache on, it's much faster (e.g. mtest), but it's impossible to read files through the eSDHC, and U-Boot hangs if trying to ping using the FEC.
This is known - the buffers must be invalidated; there is a pending patch doing this.
Shouldn't mtest disable dcache automatically, then set it back to its original state once finished? Otherwise, some of its tests to the memory may actually test the dcache rather than the memory chips connections.
dcache seems to be enabled for mx35pdk, but disabled for spear3, so I'm wondering if it has been thoroughly tested on mx35pdk.
My last status is that ESDHC does not work with cache on, but support for cache was already merged into the FEC driver.
I have added traces to arch/arm/lib/cache-cp15.c to check that the MMU is properly initialized with the appropriate addresses, and it is.
Defining CONFIG_SYS_CACHELINE_SIZE to 32 in the board file, which sets ARCH_DMA_MINALIGN to 32 instead of 64 does not worsen things (as expected with 32-byte cache lines).
Defining CONFIG_MMC_BOUNCE_BUFFER and/or setting the no_snoop option of the eSDHC driver does not change anything.
Snoop is a feature of PowerQUICC SoCs; it has no meaning on i.MX.
Defining DEBUG in mmc.c shows that the 1st mmc_send_cmd following the retry_scr label uses a cacheline-unaligned size that gets caught by the buffer bouncing mechanism that reallocs it for nothing since scr has already been allocated with a cacheline-aligned size. The requested transfer length is left unchanged by bouncing, but it seems normal for MMC commands, and it should not be an issue as long as allocated sizes are aligned.
Also, the mmc read commands to free RAM regions seem to work fine, so I'm wondering if this issue is not caused only by stack-allocated buffers. I'm not sure the data ending up in the buffers is random when there is an issue: the pattern f783 appears very often at the position of the expected 55aa DOS partition marker.
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
Yes, and this is what the pending patch is supposed to do.
As to the FEC, dcache issues with DMA seem to have been taken care of, so I'll add traces to the FEC driver to see if there is any cacheline-unaligned buffer passed in.
Let us know what the results are here - FEC is supposed to work.
BTW, why isn't there an enable_caches function in arch/arm/cpu/arm926ejs/cache.c or arch/arm/cpu/arm926ejs/cpu.c, just like in arch/arm/cpu/arm1136/cpu.c, so that dcache can be enabled by default if CONFIG_SYS_DCACHE_OFF isn't defined?
For all these reasons - because the drivers do not yet support cache, cache on MX35 remains disabled. When support for cache (= buffers are invalidated) flows into mainline, it will be possible to activate cache by default.
Best regards, Stefano Babic

On 15/07/2012 15:45, Stefano Babic wrote:
On 14/07/2012 23:28, Benoît Thébaudeau wrote:
Hi all,
Hi,
Has anyone tested i.MX25 or i.MX35 with dcache on?
I'm working with U-Boot 2012.04.01 on several custom platforms using these processors.
With dcache off, everything works fine, but slowly.
With dcache on, it's much faster (e.g. mtest), but it's impossible to read files through the eSDHC, and U-Boot hangs if trying to ping using the FEC.
This is known - the buffers must be invalidated; there is a pending patch doing this.
I've seen it since, but it is not sufficient according to Dirk Behme.
Shouldn't mtest disable dcache automatically, then set it back to its original state once finished? Otherwise, some of its tests to the memory may actually test the dcache rather than the memory chips connections.
dcache seems to be enabled for mx35pdk, but disabled for spear3, so I'm wondering if it has been thoroughly tested on mx35pdk.
My last status is that ESDHC does not work with cache on, but support for cache was already merged into the FEC driver.
OK.
I have added traces to arch/arm/lib/cache-cp15.c to check that the MMU is properly initialized with the appropriate addresses, and it is.
Defining CONFIG_SYS_CACHELINE_SIZE to 32 in the board file, which sets ARCH_DMA_MINALIGN to 32 instead of 64 does not worsen things (as expected with 32-byte cache lines).
Defining CONFIG_MMC_BOUNCE_BUFFER and/or setting the no_snoop option of the eSDHC driver does not change anything.
Snoop is a feature of PowerQUICC SoCs; it has no meaning on i.MX.
OK. I tried it because no_snoop is set for i.MX5 boards, so I was wondering if it was an undocumented feature of this IP.
Defining DEBUG in mmc.c shows that the 1st mmc_send_cmd following the retry_scr label uses a cacheline-unaligned size that gets caught by the buffer bouncing mechanism that reallocs it for nothing since scr has already been allocated with a cacheline-aligned size. The requested transfer length is left unchanged by bouncing, but it seems normal for MMC commands, and it should not be an issue as long as allocated sizes are aligned.
Also, the mmc read commands to free RAM regions seem to work fine, so I'm wondering if this issue is not caused only by stack-allocated buffers. I'm not sure the data ending up in the buffers is random when there is an issue: the pattern f783 appears very often at the position of the expected 55aa DOS partition marker.
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
Yes, and this is what the pending patch is supposed to do.
OK.
As to the FEC, dcache issues with DMA seem to have been taken care of, so I'll add traces to the FEC driver to see if there is any cacheline-unaligned buffer passed in.
Let us know what the results are here - FEC is supposed to work.
I'll tell you.
BTW, why isn't there an enable_caches function in arch/arm/cpu/arm926ejs/cache.c or arch/arm/cpu/arm926ejs/cpu.c, just like in arch/arm/cpu/arm1136/cpu.c, so that dcache can be enabled by default if CONFIG_SYS_DCACHE_OFF isn't defined?
For all these reasons - because the drivers do not yet support cache, cache on MX35 remains disabled. When support for cache (= buffers are invalidated) flows into mainline, it will be possible to activate cache by default.
OK, but there is an inconsistency in that case between i.MX25 and i.MX35: These issues are present on both, but without CONFIG_SYS_DCACHE_OFF defined, dcache is enabled by default on i.MX35, but not on i.MX25.
Best regards, Benoît

On 15.07.2012 16:45, Benoît Thébaudeau wrote:
On 15/07/2012 15:45, Stefano Babic wrote:
On 14/07/2012 23:28, Benoît Thébaudeau wrote:
Hi all,
Hi,
Has anyone tested i.MX25 or i.MX35 with dcache on?
I'm working with U-Boot 2012.04.01 on several custom platforms using these processors.
With dcache off, everything works fine, but slowly.
With dcache on, it's much faster (e.g. mtest), but it's impossible to read files through the eSDHC, and U-Boot hangs if trying to ping using the FEC.
This is known - the buffers must be invalidated; there is a pending patch doing this.
I've seen it since, but it is not sufficient according to Dirk Behme.
As promised I re-checked this again: The test and issue reported previously in this thread were done on a plain v2012.04.01, so *without* "i.MX: fsl_esdhc: allow use with cache enabled." applied.
So I can't tell if "i.MX: fsl_esdhc: allow use with cache enabled." does completely help or not.
Sorry for the noise, too many branches ;)
Best regards
Dirk
Shouldn't mtest disable dcache automatically, then set it back to its original state once finished? Otherwise, some of its tests to the memory may actually test the dcache rather than the memory chips connections.
dcache seems to be enabled for mx35pdk, but disabled for spear3, so I'm wondering if it has been thoroughly tested on mx35pdk.
My last status is that ESDHC does not work with cache on, but support for cache was already merged into the FEC driver.
OK.
I have added traces to arch/arm/lib/cache-cp15.c to check that the MMU is properly initialized with the appropriate addresses, and it is.
Defining CONFIG_SYS_CACHELINE_SIZE to 32 in the board file, which sets ARCH_DMA_MINALIGN to 32 instead of 64 does not worsen things (as expected with 32-byte cache lines).
Defining CONFIG_MMC_BOUNCE_BUFFER and/or setting the no_snoop option of the eSDHC driver does not change anything.
Snoop is a feature of PowerQUICC SoCs; it has no meaning on i.MX.
OK. I tried it because no_snoop is set for i.MX5 boards, so I was wondering if it was an undocumented feature of this IP.
Defining DEBUG in mmc.c shows that the 1st mmc_send_cmd following the retry_scr label uses a cacheline-unaligned size that gets caught by the buffer bouncing mechanism that reallocs it for nothing since scr has already been allocated with a cacheline-aligned size. The requested transfer length is left unchanged by bouncing, but it seems normal for MMC commands, and it should not be an issue as long as allocated sizes are aligned.
Also, the mmc read commands to free RAM regions seem to work fine, so I'm wondering if this issue is not caused only by stack-allocated buffers. I'm not sure the data ending up in the buffers is random when there is an issue: the pattern f783 appears very often at the position of the expected 55aa DOS partition marker.
Shouldn't the MMC/eSDHC drivers flush/invalidate the dcache ranges that they use for DMA operations? Not doing so would explain why stack-allocated buffers are more affected than buffers in unused RAM areas.
Yes, and this is what the pending patch is supposed to do.
OK.
As to the FEC, dcache issues with DMA seem to have been taken care of, so I'll add traces to the FEC driver to see if there is any cacheline-unaligned buffer passed in.
Let us know what the results are here - FEC is supposed to work.
I'll tell you.
BTW, why isn't there an enable_caches function in arch/arm/cpu/arm926ejs/cache.c or arch/arm/cpu/arm926ejs/cpu.c, just like in arch/arm/cpu/arm1136/cpu.c, so that dcache can be enabled by default if CONFIG_SYS_DCACHE_OFF isn't defined?
For all these reasons - because the drivers do not yet support cache, cache on MX35 remains disabled. When support for cache (= buffers are invalidated) flows into mainline, it will be possible to activate cache by default.
OK, but there is an inconsistency in that case between i.MX25 and i.MX35: These issues are present on both, but without CONFIG_SYS_DCACHE_OFF defined, dcache is enabled by default on i.MX35, but not on i.MX25.
Best regards, Benoît