[U-Boot-Users] Moving u-boot location in ram to do full mem-test?

Hi,
I'm using "mtest" to test the SDRAM somewhat, but parts of the RAM are of course left untested doing so - due to stack and U-Boot positioning.
With DEBUG enabled, I narrowed the plausible test segment down to something like:
=> mtest 1090 3f8af37
where the last value is based on the information:
Top of RAM usable for U-Boot at: 04000000
Reserving 208k for U-Boot at: 03fcb000
Reserving 256k for malloc() at: 03f8b000
Reserving 128 Bytes for Board Info at: 03f8af80
Reserving 48 Bytes for Global Data at: 03f8af50
Stack Pointer at: 03f8af38
New Stack Pointer is: 03f8af38
Now running in RAM - U-Boot at: 03fcb000
FLASH: [flash_init, 771] Entering ...
[flash_init, 772] flash_info = 0x03FFEB20
...
U-Boot relocated to 03fcb000
and the first from manual testing.
So, I would like to make a new u-boot binary that offered me the possibility to test the first part - from "0x0" to "0x108F" and from "0x3f8af37" to "0x3ffffff".
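For reference, the reservations in the boot log can be cross-checked with a little arithmetic. The following is only a sketch in Python, not U-Boot code: the constants are copied from the log and from the mtest command above, and the function names are made up for illustration.

```python
# Constants copied from the boot log and the mtest command above.
TOP_OF_RAM = 0x04000000       # "Top of RAM usable for U-Boot at"
UBOOT_BASE = 0x03FCB000       # "Reserving 208k for U-Boot at"
MALLOC_SIZE = 256 * 1024      # "Reserving 256k for malloc()"
BOARD_INFO_SIZE = 128         # "Reserving 128 Bytes for Board Info"
GLOBAL_DATA_SIZE = 48         # "Reserving 48 Bytes for Global Data"
STACK_POINTER = 0x03F8AF38    # "New Stack Pointer is"
MTEST_START = 0x1090          # lower mtest bound, found by manual testing


def reserved_layout():
    """Recompute the addresses U-Boot printed, working down from its base."""
    malloc_base = UBOOT_BASE - MALLOC_SIZE
    board_info = malloc_base - BOARD_INFO_SIZE
    global_data = board_info - GLOBAL_DATA_SIZE
    return malloc_base, board_info, global_data


def untested_bytes():
    """Bytes below the mtest start, and bytes from the stack pointer upwards."""
    low = MTEST_START                    # 0x0 .. 0x108F
    high = TOP_OF_RAM - STACK_POINTER    # stack pointer .. top of RAM
    return low, high


if __name__ == "__main__":
    names = ("malloc", "board info", "global data")
    for name, addr in zip(names, reserved_layout()):
        print(f"{name:12s} at {addr:08x}")
    low, high = untested_bytes()
    share = 100.0 * (low + high) / TOP_OF_RAM
    print(f"untested: {low} bytes low, {high} bytes high, {share:.2f}% of RAM")
```

The recomputed addresses match the ones in the log, and the two untested regions together come to well under one percent of the RAM below the reported top address.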
I've scanned my board header to see if any of these addresses are directly present there, but without luck. The nearest I get to something that smells of the lower boundary are "CFG_INIT_RAM_ADDR" and "CFG_OCM_DATA_SIZE", both having the value 0x1000.
Any clues?
BR, Martin Egholm

In message dlveaq$o5p$1@sea.gmane.org you wrote:
I'm using "mtest" to test the SDRAM somewhat, but parts of the RAM are of course left untested doing so - due to stack and U-Boot positioning.
Not to forget exception vectors.
So, I would like to make a new u-boot binary that offered me the possibility to test the first part - from "0x0" to "0x108F" and from "0x3f8af37" to "0x3ffffff".
Ummm... I seriously doubt if this is worth the effort. Typical problems like unconnected or shorted data or address lines or crosstalk will show up very reliably with the existing test. And the really nasty problems usually happen with burst mode accesses, and these are *not* covered by any such memory test at all.
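To illustrate why a data-line fault does not depend on which addresses are tested, here is a minimal sketch in Python against a simulated memory. It is not the mtest implementation; the class FaultyRam, its stuck_low parameter and the function walking_ones are hypothetical names chosen for this example.

```python
class FaultyRam:
    """Simulated RAM of 32-bit words; bits set in `stuck_low` always read 0."""

    def __init__(self, words, stuck_low=0):
        self.cells = [0] * words
        self.stuck_low = stuck_low

    def write(self, index, value):
        # A stuck or unconnected data line loses its bit on every write.
        self.cells[index] = value & 0xFFFFFFFF & ~self.stuck_low

    def read(self, index):
        return self.cells[index]


def walking_ones(ram, index):
    """Return the mask of data bits that failed a walking-ones test at one word."""
    bad = 0
    for bit in range(32):
        pattern = 1 << bit
        ram.write(index, pattern)
        if ram.read(index) != pattern:
            bad |= pattern
    return bad


if __name__ == "__main__":
    print(hex(walking_ones(FaultyRam(16), 3)))
    print(hex(walking_ones(FaultyRam(16, stuck_low=1 << 5), 3)))
```

Because a broken data line affects every word, the fault in this model is reported at whatever word index is probed, so the exact placement of the test window does not matter for this class of error.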
In my opinion the test as is is good enough to find coarse problems, and if you really want to stress test your memory just boot Linux with root file system mounted over NFS and compile a Linux kernel on the target. No smiley here, I really mean it.
Best regards,
Wolfgang Denk

I'm using "mtest" to test the SDRAM somewhat, but parts of the RAM are of course left untested doing so - due to stack and U-Boot positioning.
Not to forget exception vectors.
Oki :-)
So, I would like to make a new u-boot binary that offered me the possibility to test the first part - from "0x0" to "0x108F" and from "0x3f8af37" to "0x3ffffff".
Ummm... I seriously doubt if this is worth the effort. Typical problems like unconnected or shorted data or address lines or crosstalk will show up very reliably with the existing test.
But couldn't there be an error for a specific address segment - say "0x3ff0000"-"0x3ff00ff", which contains u-boot data never being used in u-boot, and not possible to test with mtest?
And the really nasty problems usually happen with burst mode accesses, and these are *not* covered by any such memory test at all. In my opinion the test as is is good enough to find coarse problems, and if you really want to stress test your memory just boot Linux with root file system mounted over NFS and compile a Linux kernel on the target. No smiley here, I really mean it.
Yes, but that would take days, if at all possible, on my 133 MHz PPC405EP with 32 megs.
Then, I would rather have a "similar" memory exhausting test-application for Linux...
BR, Martin Egholm

In message dm1a2e$cdt$1@sea.gmane.org you wrote:
But couldn't there be an error for a specific address segment - say "0x3ff0000"-"0x3ff00ff", which contains u-boot data never being used in u-boot, and not possible to test with mtest?
In theory there could be such a problem. But 99% of all RAM memory errors fall into different patterns - at least from what I've seen.
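As an illustration of one such pattern, the sketch below models an unconnected address line in Python and shows that it is visible from a handful of probe locations. This is a simplified variant of the usual power-of-two address test, not U-Boot's code; AliasedRam, dead_lines and address_line_faults are hypothetical names, and shorted address lines are not covered by this check.

```python
class AliasedRam:
    """Simulated word-addressed RAM that ignores the address bits in `dead_lines`."""

    def __init__(self, dead_lines=0):
        self.cells = {}
        self.dead_lines = dead_lines

    def write(self, index, value):
        self.cells[index & ~self.dead_lines] = value

    def read(self, index):
        return self.cells.get(index & ~self.dead_lines, 0)


def address_line_faults(ram, address_bits):
    """Return the mask of address lines whose location aliases to offset 0."""
    pattern, antipattern = 0xAAAAAAAA, 0x55555555
    offsets = [1 << bit for bit in range(address_bits)]
    # Fill every power-of-two offset with the pattern.
    for offset in offsets:
        ram.write(offset, pattern)
    # Writing to offset 0 must not disturb any of those locations.
    ram.write(0, antipattern)
    bad = 0
    for offset in offsets:
        if ram.read(offset) != pattern:
            bad |= offset
    return bad


if __name__ == "__main__":
    print(hex(address_line_faults(AliasedRam(), 16)))
    print(hex(address_line_faults(AliasedRam(dead_lines=1 << 10), 16)))
```

In this model a dead address line corrupts locations far away from each other, which is why such a fault tends to show up even when only part of the address range is exercised.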
Linux with root file system mounted over NFS and compile a Linux kernel on the target. No smiley here, I really mean it.
Yes, but that would take days, if at all possible, on my 133 MHz PPC405EP with 32 megs.
No. It takes 2...3 hours on an MPC860 with 50 MHz, so you might be done with approx. one hour or so (assuming a 2.4 kernel tree). And of course you can stop any time you like - as long as the system does not crash you are fine.
Then, I would rather have a "similar" memory exhausting test-application for Linux...
It's not just "memory exhaustion". It's the combination of context switches, code fetching, DMA going on all simultaneously. I have yet to find any other test code that produces back-to-back burst mode accesses in such a density. It's really difficult to come up with a similarly strong memory test.
And by the way - if you're testing your memory you *want* to have this test running for a long time, probably changing operational parameters like temperature, voltages, ... while running the test. Or injecting EM "noise" if you're testing for EMC...
Best regards,
Wolfgang Denk

Hi Wolfgang,
But couldn't there be an error for a specific address segment - say "0x3ff0000"-"0x3ff00ff", which contains u-boot data never being used in u-boot, and not possible to test with mtest?
In theory there could be such a problem. But 99% of all RAM memory errors fall into different patterns - at least from what I've seen.
Ok... I have a specific board acting strange, so I was for starters suspecting memory. But I will solder in a new memory chip to see if that indeed was the problem...
Linux with root file system mounted over NFS and compile a Linux kernel on the target. No smiley here, I really mean it.
Yes, but that would take days, if at all possible, on my 133 MHz PPC405EP with 32 megs.
No. It takes 2...3 hours on an MPC860 with 50 MHz, so you might be done with approx. one hour or so (assuming a 2.4 kernel tree). And of course you can stop any time you like - as long as the system does not crash you are fine.
Right...
Then, I would rather have a "similar" memory exhausting test-application for Linux...
It's not just "memory exhaustion". It's the combination of context switches, code fetching, DMA going on all simultaneously. I have yet to find any other test code that produces back-to-back burst mode accesses in such a density. It's really difficult to come up with a similarly strong memory test.
I see your point...
And by the way - if you're testing your memory you *want* to have this test running for a long time, probably changing operational parameters like temperature, voltages, ... while running the test. Or injecting EM "noise" if you're testing for EMC...
Sure - but this is really too extensive for the individual board. We're doing EMC tests on a couple of units to see the general picture, but all units will run a 24h burn-in test at 50 degrees Celsius...
Ok, for now I'll abandon the idea of expanding the mtest area, and change the chip on the board that is causing me headaches...
Thanks, Martin

In message dm1dph$mrc$1@sea.gmane.org you wrote:
Ok, for now I'll abandon the idea of expanding the mtest area, and change the chip on the board that is causing me headaches...
Right. That is always an excellent idea from the point of view of a software engineer :-)
Best regards,
Wolfgang Denk

Ok, for now I'll abandon the idea of expanding the mtest area, and change the chip on the board that is causing me headaches...
Right. That is always an excellent idea from the point of view of a software engineer :-)
Baaah :-)

In message dm1guv$c0$1@sea.gmane.org you wrote:
Ok, for now I'll abandon the idea of expanding the mtest area, and change the chip on the board that is causing me headaches...
Right. That is always an excellent idea from the point of view of a software engineer :-)
Baaah :-)
Really - there is even a terminus technicus for it: it's called a SEP (Somebody Else's Problem :-)
Best regards,
Wolfgang Denk

Ok, for now I'll abandon the idea of expanding the mtest area, and change the chip on the board that is causing me headaches...
Right. That is always an excellent idea from the point of view of a software engineer :-)
Baaah :-)
Really - there is even a terminus technicus for it: it's called a SEP (Somebody Else's Problem :-)
Just for the record: It did help to change the memory-chip!
Now, _I'm_ stuck with finding a way to detect it :-)
Wolfgang, do you by chance have a gcc package that runs on 405 so I could try the kernel compile strategy?
// Martin

In message dm4ebj$ke7$1@sea.gmane.org you wrote:
Just for the record: It did help to change the memory-chip!
Fine.
Wolfgang, do you by chance have a gcc package that runs on 405 so I could try the kernel compile strategy?
Sure, our ELDK contains a full native development environment, too - with binutils, GCC, make, rpm, autoconf, ...
Best regards,
Wolfgang Denk

Just for the record: It did help to change the memory-chip!
Fine.
Yes! Kinda... :-)
Wolfgang, do you by chance have a gcc package that runs on 405 so I could try the kernel compile strategy?
Sure, our ELDK contains a full native development environment, too - with binutils, GCC, make, rpm, autoconf, ...
Most excellent dude! I forgot - but now that you mention it...
Thanks again...
// Martin
Participants (2): Martin Egholm Nielsen, Wolfgang Denk