Re: [U-Boot] SoCFPGA PL330 DMA driver and ECC scrubbing

13 Jul 2018


On 07/13/2018 02:50 PM, Jason Rush wrote:
...
On 7/11/2018 1:54 PM, Marek Vasut wrote:
...
On 07/11/2018 07:30 PM, Trent Piepho wrote:
...
On Wed, 2018-07-11 at 08:56 -0500, Jason Rush wrote:
...
On 7/11/2018 8:48 AM, Marek Vasut wrote:
...
On 07/11/2018 03:49 PM, Jason Rush wrote:
...
My mistake.  I did disable the dcache after scrubbing too.  The
code is almost identical to the Arria10 code where after
scrubbing it flushes the dcache, then turns it off.
The weird reset problem happens if I scrub the area where
U-Boot relocates to with the dcache on, then turn the dcache off.
I also tried turning the MMU off, but that didn't help.
Maybe there is some data used by the SPL there? I think the SPL has
a malloc area in RAM at some point.
I thought something similar, so I narrowed it down to clearing
just from where U-Boot relocates to the end of DRAM.  If I'm
correct, that includes where U-Boot relocates and where the
MMU tables are normally stored.
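A rough host-side sketch of the range described above, from the relocation
address to the end of DRAM. The struct and function names are hypothetical
and only illustrate the arithmetic; in U-Boot the inputs would presumably
come from values such as gd->relocaddr and gd->ram_top.

```c
#include <stdint.h>

/* Hypothetical description of the DRAM layout; not a U-Boot structure. */
struct dram_layout {
	uint32_t ram_base;   /* first DRAM address */
	uint32_t ram_size;   /* DRAM size in bytes */
	uint32_t relocaddr;  /* address U-Boot relocates itself to */
};

/* Start of the region running from the relocation address to end of DRAM. */
uint32_t scrub_tail_start(const struct dram_layout *l)
{
	return l->relocaddr;
}

/* Length in bytes of that region, or 0 if relocaddr lies outside DRAM. */
uint32_t scrub_tail_len(const struct dram_layout *l)
{
	/* 64-bit sum so that base + size cannot wrap around. */
	uint64_t ram_end = (uint64_t)l->ram_base + l->ram_size;

	if (l->relocaddr < l->ram_base || l->relocaddr >= ram_end)
		return 0;
	return (uint32_t)(ram_end - l->relocaddr);
}
```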
I wonder if the reset does not properly reset the CPU caches?  The idea
being that the CPU cache has stale data from before the reset, or maybe
just stale tag bits, and the hang is due to using this bad data from
the cache.
That's why we do dcache_off() icache_off() first, to make sure the
caches are in a consistent state. But a good point nonetheless, this
should be checked.
I call dcache_off() and icache_off() after scrubbing, and I
verified the MMU control register indicated they were off.
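For reference, a minimal sketch of that kind of check, written so it can be
compiled on a host. It assumes the ARMv7-A SCTLR layout (M = bit 0, C = bit 2,
I = bit 12); the function names are hypothetical. On the target the value
would be read from the control register, for example with get_cr() on ARM
U-Boot.

```c
#include <stdbool.h>
#include <stdint.h>

/* ARMv7-A SCTLR enable bits. */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data/unified cache enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* True when both the data cache and instruction cache enables are clear. */
bool sctlr_caches_off(uint32_t sctlr)
{
	return (sctlr & (SCTLR_C | SCTLR_I)) == 0;
}

/* True when the MMU enable bit is clear. */
bool sctlr_mmu_off(uint32_t sctlr)
{
	return (sctlr & SCTLR_M) == 0;
}
```

Note that these bits only report whether the caches are enabled; they say
nothing about stale lines that may still be held in them.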
...
...
Or perhaps there is always something done incorrectly, but it is only
the state of DRAM after a reset, vs a power cycle, that consistently
triggers a hang?
The SoCFPGA has some weird warm/cold reset hooks even in the bootrom,
so that could be. But the DRAM should be torn down in either case.
Maybe there is something wrong with the reset hooks? I could try
and clear the on chip RAM after U-Boot relocates just to see what
happens.  Or there is probably an enable for the hooks I could try
and disable.
Warm reset just jumps into OCRAM. Maybe the code in OCRAM is corrupted.
Cold reset reloads the OCRAM content from flash/sdmmc.
...
...
...
If possible, try to add code before the hang point to invalidate both
the i-cache and d-cache for the problem region above.  Perhaps the SPL
is doing something wrong w.r.t. cache invalidation, e.g. moving code
around and not updating the i-cache, because it assumes nothing has yet
used the caches, which is now no longer the case since you turn them on
for scrubbing.
After scrubbing I first call flush_dcache_all(), then I added calls to
invalidate_icache_all() and invalidate_dcache_all(), and finally I
call dcache_off() and icache_off().  I wasn't sure about the order
in which I should call them, but there was no change.
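The call order described above can be sketched as follows. This is a
host-side illustration only: the five functions are stubs that share their
names with the U-Boot cache helpers and merely record that they were called,
and post_scrub_cache_teardown() is a hypothetical wrapper, not code from the
tree.

```c
#include <stddef.h>

/* Operations recorded by the host-side stubs below. */
enum cache_op {
	OP_FLUSH_DCACHE_ALL,
	OP_INVALIDATE_ICACHE_ALL,
	OP_INVALIDATE_DCACHE_ALL,
	OP_DCACHE_OFF,
	OP_ICACHE_OFF,
};

#define MAX_OPS 8
enum cache_op call_log[MAX_OPS];
size_t call_count;

/* Append one operation to the log, ignoring overflow. */
static void record(enum cache_op op)
{
	if (call_count < MAX_OPS)
		call_log[call_count++] = op;
}

/* Stand-ins named after the U-Boot helpers; they only log the call. */
void flush_dcache_all(void)      { record(OP_FLUSH_DCACHE_ALL); }
void invalidate_icache_all(void) { record(OP_INVALIDATE_ICACHE_ALL); }
void invalidate_dcache_all(void) { record(OP_INVALIDATE_DCACHE_ALL); }
void dcache_off(void)            { record(OP_DCACHE_OFF); }
void icache_off(void)            { record(OP_ICACHE_OFF); }

/* Hypothetical wrapper: teardown order used after scrubbing. */
void post_scrub_cache_teardown(void)
{
	flush_dcache_all();       /* write dirty lines back to DRAM first */
	invalidate_icache_all();  /* drop possibly stale instructions */
	invalidate_dcache_all();  /* drop remaining data cache lines */
	dcache_off();
	icache_off();
}
```

Flushing before invalidating matters here: invalidating the dcache first
would discard dirty lines written during the scrub.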
-- 
Best regards,
Marek Vasut