
Hi, Murray!
(Sorry, this has drifted off-topic now.)
Murray.Jensen@csiro.au wrote:
Thanks very much for the quick response... I use the "mtest" command in U-Boot. There is also a Linux memory tester called "memtester" (URL: http://pyropus.ca/software/memtester/ - another page that might interest you resides at: http://linuxquality.sunsite.dk/articles/testsuites/).
Good stuff. Thanks!
But the dead giveaway is while running in Linux, getting an Illegal Instruction exception on a legal instruction in a program that ran fine a number of times before and after, doing identical tasks (also get random Segmentation Violations) - plus I get random exceptions in kernel mode that crash Linux. I am mounting the root filesystem via NFS.
Okay, so you are talking about an error rate of about 1..10 cph (crashes per hour)?! I have never seen this over here.
I can't really rule out some problem in the Linux virtual memory system
Oh, I wouldn't blame the kernel in that case. 2.6.8 to 2.6.13 work really well over here. Which Linux version do you use?
A relevant point is that our DDR memory bus tracks are relatively short, I believe - all are the same length @ 54mm (~300ps propagation delay). I base this on various things I have read e.g. the worked CPO example in AN2583 uses 800-1000ps of propagation delay on the PCB - three times ours (but the difference isn't enough to make the default CPO value not work). We also do not use ECC - I was wondering if ECC might protect others in a similar situation e.g. if they only get single bit errors which are corrected.
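As a rough cross-check of the figures quoted above, here is a small sketch in C. It only redoes the arithmetic from the mail (54 mm, ~300 ps, and the 800-1000 ps from the AN2583 example). The function names are made up for illustration, and the per-millimetre delay is assumed to be constant along the track.

```c
/* Delay per millimetre implied by a delay/length pair
 * (assumes the delay scales linearly with track length). */
double delay_per_mm(double delay_ps, double len_mm)
{
    return delay_ps / len_mm;
}

/* Track length that would give delay_ps at the given per-mm delay. */
double equivalent_len_mm(double delay_ps, double ps_per_mm)
{
    return delay_ps / ps_per_mm;
}
```

With the quoted numbers, delay_per_mm(300.0, 54.0) comes out at roughly 5.6 ps/mm. At that rate the 800 ps and 1000 ps from the application note would correspond to about 144 mm and 180 mm of track, i.e. roughly 2.7 to 3.3 times the 54 mm here.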
Check the datasheet regarding ECC. AFAIK it can correct single-bit errors and detect double-bit errors. If you get the chance to use an ECC-capable system, you can pretty easily check the reliability of the DDR by looking at the ECC status registers. And just for completeness: it's possible to turn off ECC during Linux runtime pretty reliably with something like this:
    tmp = immap->im_ddr.sdram_cfg;
    mb();   /* sync */
    wmb();  /* eieio */
    immap->im_ddr.sdram_cfg = tmp & ~0x20000000;
    mb();   /* sync */
    wmb();  /* eieio */
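For the ECC status check mentioned above, something along these lines might do as a starting point. This is only a sketch: the bit masks and the position of the single-bit error counter are my assumptions about the DDR controller's error registers and must be verified against the reference manual, and the names (ddr_ecc_decode, struct ddr_ecc_report) are hypothetical. How the raw register values are read (e.g. through immap) is left out; the function just decodes two 32-bit values.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED bit masks for the DDR error-detect register.
 * Check them against the reference manual before use. */
#define DDR_ERR_MME 0x80000000u /* multiple errors seen (assumed) */
#define DDR_ERR_MBE 0x00000008u /* multi-bit ECC error (assumed) */
#define DDR_ERR_SBE 0x00000004u /* single-bit ECC error (assumed) */

/* ASSUMED: single-bit error counter sits in the low 8 bits
 * of the single-bit error management register. */
#define DDR_SBE_COUNT_MASK 0x000000ffu

struct ddr_ecc_report {
    bool single_bit;     /* correctable error flagged */
    bool multi_bit;      /* uncorrectable error flagged */
    bool multiple;       /* more than one error of the same type */
    unsigned sbe_count;  /* single-bit errors counted so far */
};

/* Decode raw register values into a readable report. */
struct ddr_ecc_report ddr_ecc_decode(uint32_t err_detect, uint32_t err_sbe)
{
    struct ddr_ecc_report r;

    r.single_bit = (err_detect & DDR_ERR_SBE) != 0;
    r.multi_bit  = (err_detect & DDR_ERR_MBE) != 0;
    r.multiple   = (err_detect & DDR_ERR_MME) != 0;
    r.sbe_count  = (unsigned)(err_sbe & DDR_SBE_COUNT_MASK);
    return r;
}
```

If single_bit shows up or sbe_count keeps growing while a memory test runs, the DDR is marginal even though ECC hides it; multi_bit would point at errors ECC cannot correct.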
I am hoping someone will reply saying "you fool - you forgot xxxxx" :-)
There is a story around here (from a guy who did a DDR design for the MPC8540) that you might have to put really good capacitors on some of the DDR Vref voltages on the CPU side, because something is bouncing there more than you would expect...
You also might want to check what happens if you increase/decrease supply voltages here and there... well... you know, all these EE tricks.
Greets,
Clemens Koller
_______________________________
R&D Imaging Devices
Anagramm GmbH
Rupert-Mayer-Str. 45/1
81379 Muenchen
Germany

http://www.anagramm.de
Phone: +49-89-741518-50
Fax: +49-89-741518-19