[U-Boot] Memory problems on mpc8572ds

Hello,
my mpc8572ds (2x 512 MiB RAM) initially had u-boot 1.3.0-rc2 and I hadn't noticed any problems. Today I updated u-boot to v2009.03 because I was not able to boot the current vanilla linux kernel in SMP mode. Since then I experience random kernel oopses. I think it has something to do with the memory setup. The settings in the DDR controller changed from (v1.3.0-rc2):
- ddr->sdram_mode = 0x00400a62
- ddr->sdram_data_init = 0x00000000
- ddr->sdram_cfg_2 = 0x04401000
to (v2009.03):
- ddr->sdram_mode = 0x00440a62
- ddr->sdram_data_init = 0xdeadbeef
- ddr->sdram_cfg_2 = 0x24401000
I cherry-picked commit 6a819783 aka "fsl-ddr: Fix two bugs in the ddr infrastructure" but nothing improved.
I have now gone back to v1.3.0-rc2 together with linux v2.6.29. With that combination everything seems to work; however, I sometimes get an ICE (internal compiler error) while compiling. If I continue compiling from that point, the ICE does not recur, so it does not look like a gcc bug to me.
Any suggestions? Someone with similar problems?
Sebastian
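
The three changed values quoted in the message above can be reduced to the bits that actually flipped by XOR-ing the old and new words. The sketch below is only an illustration and not something posted in the thread: the helper name changed_bits is made up, and both bit numberings are printed because Power Architecture manuals count bit 0 as the most significant bit. It does not say what those bits mean; that still has to be looked up in the MPC8572 reference manual.

```python
# Register values copied from the message above.
OLD = {  # u-boot v1.3.0-rc2
    "sdram_mode": 0x00400A62,
    "sdram_data_init": 0x00000000,
    "sdram_cfg_2": 0x04401000,
}
NEW = {  # u-boot v2009.03
    "sdram_mode": 0x00440A62,
    "sdram_data_init": 0xDEADBEEF,
    "sdram_cfg_2": 0x24401000,
}


def changed_bits(old, new, width=32):
    """Return (xor_mask, [(lsb0_bit, msb0_bit), ...]) for every differing bit.

    changed_bits is a hypothetical helper name, not a U-Boot function.
    """
    mask = (old ^ new) & ((1 << width) - 1)
    bits = [(b, width - 1 - b)
            for b in range(width - 1, -1, -1)
            if (mask >> b) & 1]
    return mask, bits


if __name__ == "__main__":
    for name in OLD:
        mask, bits = changed_bits(OLD[name], NEW[name])
        print(f"{name}: xor=0x{mask:08x}, {len(bits)} bit(s) changed")
        if len(bits) <= 4:  # list positions only when the change is small
            for lsb0, msb0 in bits:
                print(f"  bit {lsb0} (LSB-0) = bit {msb0} (MSB-0)")
```

For sdram_mode and sdram_cfg_2 exactly one bit differs in each word, so the comparison narrows the search to two single-bit fields plus the sdram_data_init value.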

Any suggestions? Someone with similar problems?
It seems like you are hitting a DDR memory issue. Dump all of the DDR registers for both v1.3.0-rc2 and v2009.03 and check whether anything is different.
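
One way to carry out the comparison suggested above is to save both register dumps as text and diff them by register name. This is a minimal sketch under assumptions: the dump format (one "name value" pair per line, optionally with a leading "- ", "=" and trailing ";") simply mirrors how the values were written in this thread, and parse_dump and diff_dumps are hypothetical helper names. Only the values already posted here are used as sample data.

```python
import re

# Assumed line format, modelled on the lists in this thread:
#   "- ddr->sdram_mode 0x00400a62;"  or  "ddr->sdram_cfg_2 = 0x04401000"
LINE_RE = re.compile(
    r"^\s*(?:-\s+)?([A-Za-z_][\w.>\-]*)\s*[=:]?\s*(0x[0-9a-fA-F]+)\s*;?\s*$"
)


def parse_dump(text):
    """Parse a text dump into {register_name: value}; unparsable lines are skipped."""
    regs = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            regs[m.group(1)] = int(m.group(2), 16)
    return regs


def diff_dumps(old_text, new_text):
    """Return [(name, old_value, new_value)] for registers that differ or are missing."""
    old, new = parse_dump(old_text), parse_dump(new_text)
    rows = []
    for name in sorted(old.keys() | new.keys()):
        a, b = old.get(name), new.get(name)
        if a != b:
            rows.append((name, a, b))
    return rows


def fmt(value):
    return "missing" if value is None else f"0x{value:08x}"


if __name__ == "__main__":
    old_dump = """
    - ddr->sdram_mode 0x00400a62;
    - ddr->sdram_data_init = 0x00000000;
    - ddr->sdram_cfg_2 = 0x04401000;
    """
    new_dump = """
    - ddr->sdram_mode 0x00440a62
    - ddr->sdram_data_init 0xdeadbeef
    - ddr->sdram_cfg_2 0x24401000
    """
    for name, a, b in diff_dumps(old_dump, new_dump):
        print(f"{name}: {fmt(a)} -> {fmt(b)}")
```

Registers present in only one dump are reported as "missing" on the other side, which also catches registers that one U-Boot version programs and the other leaves untouched.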

* Liu Dave-R63238 | 2009-05-15 08:47:04 [+0800]:
It seems like you are hitting a DDR memory issue. Dump all of the DDR registers for both v1.3.0-rc2 and v2009.03 and check whether anything is different.
The three I posted are the only ones that have changed. I compared the 16 registers that are set in fixed_sdram(). Do you want me to go through _all_ of them? As I wrote in my first mail, with v1.3.0-rc2 I get some ICEs during compile sessions. So it looks to me like those timings are not bulletproof, but friendlier than the ones in v2009.03.
Sebastian

On 2009-05-14, at 19:22, Sebastian Andrzej Siewior wrote:
Any suggestions? Someone with similar problems?
I can confirm this. We have been suffering from instabilities of mainline U-Boot on MPC8572DS for a long time. They are strongly correlated with memory controller settings: the same h/w unit works fine with 1.3.0-rc2 (Freescale LTIB), while anything newer from mainline leads to hangs and/or corruption. We tried to nail down the issue, but could not find anything conclusive in the given time, so we just stick with that old (but stable) 1.3.0 derivative...
Sometimes the corruption could be observed using U-Boot mtest, sometimes not, but usually everything looked fine until more memory-intensive operations ran at the OS level: heavier network traffic would cause checksum errors to show up, along with other corruption, including hangs.
Interleaved mode seemed to have *some* effect: typically, disabling the interleaved config would give somewhat less corruption-prone behaviour (it takes much longer to trigger the faults, but they would pop up eventually).
Another data point: we observed the above problems much more often with rev1.1 of the silicon; rev1.0 also fails, but 1.1 is very quick to err.
Rafal

* Rafal Jaworowski | 2009-05-15 20:12:48 [+0200]:
I can confirm this. We have been suffering from instabilities of mainline U-Boot on MPC8572DS for a long time. They are strongly correlated with memory controller settings: the same h/w unit works fine with 1.3.0-rc2 (Freescale LTIB), while anything newer from mainline leads to hangs and/or corruption. We tried to nail down the issue, but could not find anything conclusive in the given time, so we just stick with that old (but stable) 1.3.0 derivative...
With v1.3.0-rc2 I was able to compile qt4-x11 (one make job) with no problems. When I then compiled the package six times in parallel (again one make job each), some of the compile jobs ended in an ICE at random points. So I think there is also some instability with v1.3.0-rc2, but it is harder to trigger.
Another data point: we observed the above problems much more often with rev1.1 of the silicon; rev1.0 also fails, but 1.1 is very quick to err.
I only have rev1.1.
Sebastian
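
The parallel-build experiment described above can be turned into a repeatable stress run that counts how many jobs fail. The following is a rough sketch, not the procedure actually used in the thread: run_parallel and failure_rate are hypothetical names, and the command in the demo is a harmless placeholder that would have to be replaced by the real build command, with each job working in its own build directory.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_parallel(cmd, jobs):
    """Run `jobs` copies of cmd at the same time and return their exit codes."""
    def one(_):
        result = subprocess.run(cmd,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        return result.returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(one, range(jobs)))


def failure_rate(cmd, jobs, rounds):
    """Repeat the parallel run `rounds` times; return (failed_runs, total_runs)."""
    failed = total = 0
    for _ in range(rounds):
        codes = run_parallel(cmd, jobs)
        failed += sum(1 for code in codes if code != 0)
        total += len(codes)
    return failed, total


if __name__ == "__main__":
    # Placeholder command (assumption): substitute the actual build here.
    cmd = [sys.executable, "-c", "pass"]
    failed, total = failure_rate(cmd, jobs=6, rounds=2)
    print(f"{failed} of {total} runs failed")
```

Running the same workload under both U-Boot versions and comparing the failure counts gives a number to quote instead of "harder to trigger", although a handful of rounds is unlikely to be statistically meaningful.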
participants (3)
- Liu Dave-R63238
- Rafal Jaworowski
- Sebastian Andrzej Siewior