
On 07/13/2018 02:50 PM, Jason Rush wrote:
On 7/11/2018 1:54 PM, Marek Vasut wrote:
On 07/11/2018 07:30 PM, Trent Piepho wrote:
On Wed, 2018-07-11 at 08:56 -0500, Jason Rush wrote:
On 7/11/2018 8:48 AM, Marek Vasut wrote:
On 07/11/2018 03:49 PM, Jason Rush wrote:
My mistake. I did disable the dcache after scrubbing too. The code is almost identical to the Arria10 code where after scrubbing it flushes the dcache, then turns it off.
The weird reset problem happens if I scrub the area where U-Boot relocates to with the dcache on, then turn the dcache off.
I also tried turning the MMU off, but that didn't help.
Maybe there is some data used by the SPL there? I think the SPL has a malloc area in RAM at some point.
I thought something similar, so I narrowed it down to clearing just from where U-Boot relocates to the end of DRAM. If I'm correct, that includes where U-Boot relocates and where the MMU tables are normally stored.
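For reference, the narrowed-down region (relocation address to end of DRAM) can be written as a start address plus a length. This is only a host-side sketch: the struct, the function name and the example addresses are hypothetical, and where the real scrub code gets its DRAM bounds and relocation address from is not stated in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the region to scrub. */
struct scrub_region {
	uintptr_t start;	/* first byte to clear */
	size_t len;		/* number of bytes to clear */
};

/*
 * Compute the region from the relocation address to the end of DRAM.
 * Returns a zero-length region if reloc_addr lies outside DRAM.
 * All three inputs are assumptions supplied by the caller.
 */
struct scrub_region reloc_to_ram_end(uintptr_t ram_base, size_t ram_size,
				     uintptr_t reloc_addr)
{
	struct scrub_region r = { 0, 0 };
	uintptr_t ram_end = ram_base + ram_size;	/* one past the last byte */

	if (reloc_addr < ram_base || reloc_addr >= ram_end)
		return r;

	r.start = reloc_addr;
	r.len = (size_t)(ram_end - reloc_addr);
	return r;
}
```

With a made-up 1 GiB DRAM at address 0 and a relocation address of 0x3ff00000, this gives a 1 MiB region at the top of DRAM.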
I wonder if the reset does not properly reset the CPU caches? The idea being that the CPU cache has stale data from before the reset, or maybe just stale tag bits, and the hang is due to using this bad data from the cache.
That's why we do dcache_off() and icache_off() first, to make sure the caches are in a consistent state. But a good point nonetheless, this should be checked.
I call dcache_off() and icache_off() after scrubbing, and I verified the MMU control register indicated they were off.
Or perhaps there is always something done incorrectly, but it is only the state of DRAM after a reset, vs a power cycle, that consistently triggers a hang?
The SoCFPGA has some weird warm/cold reset hooks even in the bootrom, could be. But the DRAM should be torn down in either case.
Maybe there is something wrong with the reset hooks? I could try to clear the on-chip RAM after U-Boot relocates just to see what happens. Or there is probably an enable for the hooks I could try to disable.
Warm reset just jumps into OCRAM. Maybe the code in OCRAM is corrupted.
Cold reset reloads the OCRAM content from flash/sdmmc.
If possible, try to add code before the hang point to invalidate both the i-cache and d-cache for the problem region above. Perhaps the SPL is doing something wrong w.r.t. cache invalidation, e.g. moving code around and not updating the i-cache, because it assumes nothing has yet used the caches, which is now no longer the case since you turn them on for scrubbing.
After scrubbing I first call flush_dcache_all(), then I added calls to invalidate_icache_all() and invalidate_dcache_all(), and finally I call dcache_off() and icache_off(). I wasn't sure about the order I should call them, but there was no change.
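To make the order unambiguous, here is the sequence above written out. This is a host-side sketch only: the five cache functions are stand-ins that just record that they were called (the names are the ones used in this message, not a statement about the real implementations), and post_scrub_cache_teardown() is a hypothetical wrapper name.

```c
#include <stddef.h>

/* One tag per cache operation mentioned in the message. */
enum cache_op { FLUSH_D, INVAL_I, INVAL_D, D_OFF, I_OFF };

/* Log of operations, in the order they were called. */
enum cache_op op_log[8];
size_t op_count;

static void record(enum cache_op op)
{
	if (op_count < sizeof(op_log) / sizeof(op_log[0]))
		op_log[op_count++] = op;
}

/* Host-side stand-ins: they only log, they do not touch any cache. */
void flush_dcache_all(void)      { record(FLUSH_D); }
void invalidate_icache_all(void) { record(INVAL_I); }
void invalidate_dcache_all(void) { record(INVAL_D); }
void dcache_off(void)            { record(D_OFF); }
void icache_off(void)            { record(I_OFF); }

/* The sequence as tried: flush first, then invalidate, then disable. */
void post_scrub_cache_teardown(void)
{
	flush_dcache_all();		/* write dirty lines back to DRAM */
	invalidate_icache_all();
	invalidate_dcache_all();
	dcache_off();
	icache_off();
}
```

The flush has to come before the dcache invalidate, since invalidating dirty lines would discard the scrubbed data instead of writing it back. The relative order of the remaining calls is the part I was unsure about.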