[U-Boot] FW: P4080 target has 16G memory stability issues ...

Dear list(s),
Our P4080 target board is using 2 SODIMMs on each of 2 controllers (4x4G DDR3), and we are seeing some memory problems (Linux panics) when beating up large amounts of memory (just under the 16G) on multiple threads (7 or 8 CPUs).
Our DDR3 configuration is derived from the SPD dump of U-Boot, and we are using a version based upon the 2011.09 release of U-Boot. Our firmware memory test, limited as it is to 2G chunks and a single CPU, shows no problem. It is only when running a small test program under Linux on multiple CPUs that we see the problems. We can reproduce the problem at will, although reducing our memory speed via the RCW does seem to ameliorate it somewhat.
Questions:
- is anyone using a similar configuration?
- is anyone aware of limitations in the U-Boot 2011.09R version of the mpc8xxx/ddr/* code we need to be aware of?
- any ideas?
We've been pounding our heads on this for a while now, and I'm just wondering if we are covering old territory here.
Cheers, Rob.
Robert Sciuk Senior Designer, R&D. 905.738.3741 xt 22621

Dear Robert,
please note: it is NOT a good idea to post the same message to several mailing lists separately. Normally such cross-posts should be avoided completely; if they appear to make sense, you should really cross-post, so threading works, and people are aware that this is a message they have already seen on another list. Thanks.
In message 2DD52030B5146141BEB762A11AE97C4C014C6791@SPQCEXC05.exfo.com you wrote:
> Our P4080 target board is using 2 SODIMMs on each of 2 controllers (4x4G DDR3), and we are seeing some memory problems (Linux panics) when beating up large amounts of memory (just under the 16G) on multiple threads (7 or 8 CPUs).
This is actually a somewhat frequent problem. It takes some experience to get DDR3 designs right. We have done some hardware design reviews which showed quite a number of issues in this area, typically resulting in issues similar to yours.
> Our DDR3 configuration is derived from the SPD dump of U-Boot, and we are using a version based upon the 2011.09 release of U-Boot. Our firmware memory test, limited as it is to 2G chunks and a single CPU, shows no problem. It is only when running a small test program under Linux on multiple CPUs that we see the problems. We can reproduce the problem at will, although reducing our memory speed via the RCW does seem to ameliorate it somewhat.
Most memory test routines don't help you here - they exercise the memory with plain read / write cycles, which results in pretty relaxed timings. Even if these tests work perfectly, your memory may fail seriously when you manage to load it with back-to-back burst mode accesses. The easiest way to do this is running Linux with the root file system over NFS, and then running some bigger application (like compiling a Linux kernel on the target). This results in many context switches (cache flush / cache fetch) and lots of DMA (network drivers). If everything else works, and this test crashes your system, you can be pretty sure that burst mode accesses have some problem.
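As a rough illustration of the difference (not a substitute for the NFS-root / kernel-compile test described above, which also exercises DMA), a multi-threaded test along the following lines generates cache-line burst traffic from several cores at once. This is only a minimal sketch: run_stress(), struct worker and worker_main() are hypothetical names, and it assumes each buffer is chosen much larger than the caches so that the fills, copies and read-backs actually reach DRAM.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread state; all names here are illustrative only. */
struct worker {
    pthread_t tid;
    size_t bytes;          /* size of each of the two buffers */
    unsigned passes;       /* number of fill/copy/verify rounds */
    uint64_t seed;         /* per-thread pattern seed */
    unsigned long errors;  /* mismatches seen by this thread */
};

/* Simple 64-bit LCG step used to generate a reproducible pattern. */
static uint64_t next_pattern(uint64_t x)
{
    return x * 6364136223846793005ULL + 1442695040888963407ULL;
}

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    size_t words = w->bytes / sizeof(uint64_t);
    uint64_t *src = malloc(words * sizeof(uint64_t));
    uint64_t *dst = malloc(words * sizeof(uint64_t));

    if (!src || !dst) {
        free(src);
        free(dst);
        w->errors = 1;  /* report allocation failure as an error */
        return NULL;
    }

    for (unsigned p = 0; p < w->passes; p++) {
        /* Sequential fill: cache lines are written back in bursts. */
        uint64_t x = w->seed + p;
        for (size_t i = 0; i < words; i++) {
            x = next_pattern(x);
            src[i] = x;
        }

        /* Bulk copy: back-to-back line fills and write-backs. */
        memcpy(dst, src, words * sizeof(uint64_t));

        /* Verify the copy by regenerating the same pattern. */
        x = w->seed + p;
        for (size_t i = 0; i < words; i++) {
            x = next_pattern(x);
            if (dst[i] != x)
                w->errors++;
        }
    }

    free(src);
    free(dst);
    return NULL;
}

/* Returns the total mismatch count, or -1 if the threads could not be set up. */
long run_stress(unsigned nthreads, size_t bytes_per_thread, unsigned passes)
{
    assert(nthreads > 0 && bytes_per_thread >= sizeof(uint64_t));

    struct worker *w = calloc(nthreads, sizeof(*w));
    if (!w)
        return -1;

    unsigned started = 0;
    for (unsigned i = 0; i < nthreads; i++) {
        w[i].bytes = bytes_per_thread;
        w[i].passes = passes;
        w[i].seed = 0x9E3779B97F4A7C15ULL * (i + 1);
        if (pthread_create(&w[i].tid, NULL, worker_main, &w[i]) != 0)
            break;
        started++;
    }

    long total = 0;
    for (unsigned i = 0; i < started; i++) {
        pthread_join(w[i].tid, NULL);
        total += (long)w[i].errors;
    }
    if (started != nthreads)
        total = -1;

    free(w);
    return total;
}
```

Calling run_stress() from a small main() with one thread per core and buffers that together cover most of the free memory (built with -pthread) would approximate the multi-CPU load described in the original report. A non-zero result points at data corruption, but a zero result proves little, since DMA traffic is not exercised at all.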
> - is anyone using a similar configuration?
I don't think the configuration is a problem here. My bet is either incomplete / incorrect initialization of the memory controller, and/or problems with the hardware design.
> - is anyone aware of limitations in the U-Boot 2011.09R version of
> the mpc8xxx/ddr/* code we need to be aware of?
I have no idea what "2011.09R" might be, sorry.
> - any ideas?
> We've been pounding our heads on this for a while now, and I'm just wondering if we are covering old territory here.
This _is_ a well known problem. Memory errors like this have always been a major issue when running an OS like Linux which really loads the hardware to the limits.
Best regards,
Wolfgang Denk