
On 02/10/2014 01:14 PM, Andreas Bießmann wrote:
- we have a hardware design bug
- we have a few hundred i.MX31 TT-01 devices in the field
- the i.MX31 rom boot loader is only capable of using 1bit HW-ECC
(loading the first page (2k) from the NAND)
- the NAND chip specifies a requirement of 1bit ECC for the first 128kB
(PEB) and 4bit ECC for the rest
- our current u-boot uses 1-bit HW-ECC, the kernel uses UBIFS and 1bit
HW-ECC
D'oh! Just about what I thought ...
- we face increasing bit errors in the field in the PEBs used by u-boot.
Using UBIFS in the kernel mitigates the requirement of 4bit ECC for the whole NAND because it moves PEBs when bit errors show up. The real problem is the area where u-boot is located (currently approx. 450kB, including UBIFS, USB ethernet support and more ..).
I wouldn't say it is a good solution to have 1-bit ECC on a NAND that requires 4-bit, even though there is another layer reacting to bit errors. I guess your BBT will grow significantly in a very short time.
Most operations are reads (we use a separate YAFFS partition for time-predictable writes), so UBI will relocate read-only blocks anyway (due to read disturbance). I think the effect won't be too dramatic, but don't make me prove that ;-)
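The trade-off in this exchange can be put as a small sketch. It assumes a simplified model in which an ECC scheme corrects up to `strength` bit flips per ECC step; the function name and the result strings are made up for illustration and are not taken from U-Boot or the kernel:

```python
def classify_read(bitflips: int, strength: int) -> str:
    """Classify one ECC step of a NAND read (simplified, hypothetical model).

    bitflips: number of flipped bits found in the step
    strength: number of bit flips the ECC scheme can correct per step
    """
    if bitflips < 0 or strength < 1:
        raise ValueError("bitflips must be >= 0 and strength >= 1")
    if bitflips == 0:
        return "clean"
    if bitflips <= strength:
        return "corrected"
    return "uncorrectable"


# Compare 1-bit ECC with 4-bit ECC for a growing number of bit flips.
for flips in range(6):
    print(flips, classify_read(flips, 1), classify_read(flips, 4))
```

In the UBI area a "corrected" result gives the upper layer a chance to move the data before further flips arrive. In the static u-boot area nothing moves the data, so under this model a second flip in the same step is already fatal with 1-bit ECC.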
So the idea was:
- use a small u-boot (<128kB) in the first PEB of the NAND (written with
1bit HW-ECC) that supports 4bit BCH
How about using SPL here? I don't know the Freescale universe, but I wonder if SPL is fixed to 2k. Building SPL with SW BCH in less than 2k does not seem doable to me.
SPL on i.MX31 is limited to 2kB so we can't use BCH 4 here, just as you guessed.
- let it load a second u-boot (<512kB) from the next 4 PEBs (written
with 4bit BCH)
- jump to the second u-boot and load the kernel from an UBI volume using
1bit HW-ECC again
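The intended NAND layout can be sketched as follows. This assumes the 128kB PEB size mentioned above and treats the size limits as whole-PEB budgets; the table and helper names are illustrative only:

```python
PEB_SIZE = 128 * 1024  # erase block (PEB) size taken from the thread

# (name, first PEB, number of PEBs, ECC used when writing); illustrative table
LAYOUT = [
    ("SPL + first u-boot", 0, 1, "1bit HW-ECC"),
    ("second u-boot", 1, 4, "4bit BCH"),
]


def region(first_peb: int, num_pebs: int) -> tuple:
    """Return the (start, end) byte offsets of a run of PEBs; end is exclusive."""
    start = first_peb * PEB_SIZE
    return start, start + num_pebs * PEB_SIZE


def fits(image_size: int, num_pebs: int) -> bool:
    """Check whether an image of image_size bytes fits into num_pebs PEBs."""
    return image_size <= num_pebs * PEB_SIZE


for name, first, count, ecc in LAYOUT:
    start, end = region(first, count)
    print(f"{name:20s} 0x{start:06x}-0x{end:06x}  {ecc}")
```

With these numbers, an image of the current size (approx. 450kB) fits into the four PEBs reserved for the second u-boot but not into the first PEB, which is why the first stage has to be a stripped-down build.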
I did all that and it seemed to work just fine, but jumping to the second u-boot almost always crashes the system. In detail we do:
- romboot loads the SPL (2kB)
- SPL loads the first u-boot stage (which relocates and runs nicely)
- the first u-boot 'boots' the second u-boot by loading it from the NAND
- the second u-boot is loaded to the link address minus 2kB (for SPL)
- this is the same for the first and the second u-boot (link address
0x87e00000 - 0x800 = 0x87dff800)
The offset is about 126 MiB; current mainline code tells me that the tt-01 board has just 128 MiB. It is likely that your second u-boot overwrites the code of your first one while copying. You should link your code to run at a far-away address, maybe 0x80000000 ;)
We have 256MiB (not yet contributed). The first u-boot is loaded to 0x87e00000, then relocates to 0x8f... something. The second u-boot is loaded to 0x87e00000 again and relocates to 0x8f..., the same locations for both. The second u-boot is verified in RAM before jumping to it. If I set a breakpoint in do_go_exec() I can step right into the second u-boot.
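The address arithmetic behind this exchange can be checked with a short sketch. The RAM base of 0x80000000 is an assumption (it is the address suggested above), and the two RAM sizes are the 128 MiB and 256 MiB figures from the thread. The relocation address is only given as 0x8f..., so the overlap helper takes it as a parameter instead of guessing it:

```python
MIB = 1024 * 1024
RAM_BASE = 0x80000000   # assumption: start of SDRAM
LINK_ADDR = 0x87E00000  # link address of both u-boot stages (from the thread)
SPL_SIZE = 0x800        # 2kB reserved for the SPL

# Address the second u-boot is loaded to: link address minus the SPL size.
load_addr = LINK_ADDR - SPL_SIZE


def headroom(ram_size: int) -> int:
    """Bytes between the load address and the end of RAM."""
    return RAM_BASE + ram_size - load_addr


def overlaps(a_start: int, a_size: int, b_start: int, b_size: int) -> bool:
    """True if the half-open ranges [a_start, a_start+a_size) and
    [b_start, b_start+b_size) share at least one byte."""
    return a_start < b_start + b_size and b_start < a_start + a_size


print(hex(load_addr))
print((LINK_ADDR - RAM_BASE) // MIB, "MiB offset of the link address")
for size in (128 * MIB, 256 * MIB):
    print(size // MIB, "MiB RAM:", headroom(size), "bytes above the load address")
```

With 128 MiB there are only about 2 MiB between the load address and the end of RAM, so a relocated first stage would have to sit very close to the load area. With 256 MiB the headroom is about 130 MiB, and whether the two regions collide depends on the actual relocation address and image sizes, which can be fed to `overlaps()`.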
Well, it may be related to some Freescale internals I do not know. However, it is likely that you really overwrite the first u-boot version with the
You'd be right for 128MiB! I'll try to crc32 the relocated area of the first u-boot, anyway.
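One way to do such a check offline, as a rough sketch: dump the relocated region twice (before and after loading the second u-boot) and compare the two dumps. How the dumps are obtained is left open here, and the helper names are made up for illustration. The checksum is the standard CRC-32 as implemented by Python's zlib module:

```python
import zlib


def crc32_of(data: bytes) -> int:
    """Standard CRC-32 of a memory dump, as an unsigned 32-bit value."""
    return zlib.crc32(data) & 0xFFFFFFFF


def region_intact(before: bytes, after: bytes) -> bool:
    """True if both dumps have the same length and the same CRC-32."""
    return len(before) == len(after) and crc32_of(before) == crc32_of(after)


def first_difference(before: bytes, after: bytes):
    """Offset of the first differing byte, or None if the dumps are identical."""
    for offset, (a, b) in enumerate(zip(before, after)):
        if a != b:
            return offset
    if len(before) != len(after):
        return min(len(before), len(after))
    return None


# Stand-in data; real dumps would be read from files instead.
before = bytes(range(16))
after = bytearray(before)
after[5] ^= 0x01  # simulate one overwritten byte
print(region_intact(before, bytes(after)), first_difference(before, bytes(after)))
```

If the checksums differ, the offset of the first differing byte shows where in the relocated image the overwrite starts.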
The strange thing is that the serial output does not show up unless I set breakpoints. This might be pointing to some clock setup problem?!
Thx for caring, Helmut