
Hi Roger,
I really appreciate the help!
On Thu, May 18, 2023 at 01:55:38PM +0300, Roger Quadros wrote:
Hi Colin,
On 17/05/2023 22:39, Colin Foster wrote:
I swapped in just U-Boot (not the SPL) with your patch, and everything seems to work!
The spam of uncorrectable ECC errors came from the SPL. Here's a snippet of the boot log with the "ecc" print as well as your patch:
Thanks for the tests. Glad to hear the issue is narrowed down to the SPL.
I can "fix" the issue by just commenting out the "ECC uncorrectable errors" print :-)
U-Boot SPL 2023.04-00029-g26a9ce5314-dirty (May 17 2023 - 12:06:49 -0700)
OMAP4460-GP ES1.1
Trying to boot from NAND
ecc: 2420106
ecc: ebd922f6
ecc: 333f844f
ecc: ab812f72
This is clearly the issue. They should all have been 0.
Interesting. With the "ecc" prints in U-Boot I also get some non-zero values:
ecc: 0
ecc: 6bff997b
ecc: 6bff997b
ecc: 6bff997b
Once I'm booted, I can use nanddump. It seems like everything is correct from the Linux side of things:
# nanddump -f mlo_dump /dev/mtd0
ECC failed: 0
ECC corrected: 0
Number of bad blocks: 0
Number of bbt blocks: 0
Block size 131072, page size 2048, OOB size 64
Dumping data starting at 0x00000000 and ending at 0x00020000...

# nanddump -f uboot1_dump /dev/mtd1
ECC failed: 0
ECC corrected: 0
Number of bad blocks: 0
Number of bbt blocks: 0
Block size 131072, page size 2048, OOB size 64
Dumping data starting at 0x00000000 and ending at 0x00180000...

# nanddump -f uboot2_dump /dev/mtd2
ECC failed: 0
ECC corrected: 0
Number of bad blocks: 0
Number of bbt blocks: 0
Block size 131072, page size 2048, OOB size 64
Dumping data starting at 0x00000000 and ending at 0x00180000...

# nanddump -f /dev/null /dev/mtd3
ECC failed: 0
ECC corrected: 6
Number of bad blocks: 0
Number of bbt blocks: 0
Block size 131072, page size 2048, OOB size 64
Dumping data starting at 0x00000000 and ending at 0x1fce0000...
ECC: 1 corrected bitflip(s) at offset 0x0ab30800
ECC: 1 corrected bitflip(s) at offset 0x0b008800
ECC: 1 corrected bitflip(s) at offset 0x0deaa000
ECC: 1 corrected bitflip(s) at offset 0x0ea5b000
ECC: 1 corrected bitflip(s) at offset 0x0ecbc000
ECC: 1 corrected bitflip(s) at offset 0x0ed61800
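
For anyone who wants to map the corrected-bitflip offsets above to a physical location, here is a minimal sketch in C. It only uses the block size (131072) and page size (2048) that nanddump reports; the names nand_loc and nand_locate are made up for illustration and are not part of mtd-utils or U-Boot. Offsets are taken to be relative to the start of the partition being dumped.

```c
#include <stdint.h>

/* Geometry as reported by nanddump in this thread. */
#define NAND_BLOCK_SIZE 131072u
#define NAND_PAGE_SIZE  2048u

/* Hypothetical helper type: erase block index and page index within it. */
struct nand_loc {
	uint32_t block;
	uint32_t page;
};

/* Translate a byte offset within the partition to (block, page). */
static struct nand_loc nand_locate(uint32_t offset)
{
	struct nand_loc loc;

	loc.block = offset / NAND_BLOCK_SIZE;
	loc.page = (offset % NAND_BLOCK_SIZE) / NAND_PAGE_SIZE;
	return loc;
}
```

With these numbers, nand_locate(0x0ab30800) gives block 1369, page 33 of mtd3.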
Can you please share your spl/u-boot.cfg?
Attached
We have a stripped-down driver, "am335x_spl_bch.c", that handles NAND in the SPL. I haven't looked closely at that driver, but it relies on omap_gpmc.c for:
  ecc.hwctl()
  read_buf()
  ecc.calculate()
We didn't make any functional changes to these functions in commit 04fcd25873, unless something slipped through the cracks.
I'll take a look at am335x_spl_bch.c and see what I'm doing differently. I was sad to see that `dump_stack()` didn't work off the bat for me.
It seems to rely on the following config options:
CFG_SYS_NAND_ECCPOS
{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, \
 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, \
 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, \
 33, 34, 35, 36, 37, 38, 39, 40, 41, \
 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, \
 52, 53, 54, 55, 56, 57}
CONFIG_SYS_NAND_PAGE_COUNT
0x40
CONFIG_SYS_NAND_PAGE_SIZE
0x800
CONFIG_SYS_NAND_5_ADDR_CYCLE
1
CFG_SYS_NAND_ECCSIZE
512
CFG_SYS_NAND_ECCBYTES
14
CONFIG_SYS_NAND_OOBSIZE
0x40
Could you please share what they are set to for your SPL build?
All the CFG_* values should be identical for the SPL and U-Boot.
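
As a quick sanity check on the values quoted above, here is a minimal sketch in C. It assumes (I have not verified this against the driver) that the number of ECC steps is page size / ECC size, that the total ECC byte count is steps * ECC bytes, and that this total must match the number of ECCPOS entries, all of which must fit in the OOB area. The NAND_* macro names and nand_ecc_layout_ok() are hypothetical and only mirror the config options for illustration.

```c
#include <stddef.h>

/* Values quoted in this thread. */
#define NAND_PAGE_SIZE 0x800 /* CONFIG_SYS_NAND_PAGE_SIZE */
#define NAND_OOB_SIZE  0x40  /* CONFIG_SYS_NAND_OOBSIZE */
#define NAND_ECC_SIZE  512   /* CFG_SYS_NAND_ECCSIZE */
#define NAND_ECC_BYTES 14    /* CFG_SYS_NAND_ECCBYTES */

/* CFG_SYS_NAND_ECCPOS as quoted above. */
static const unsigned char nand_eccpos[] = {
	2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
	13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
	23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
	33, 34, 35, 36, 37, 38, 39, 40, 41,
	42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
	52, 53, 54, 55, 56, 57
};

/*
 * Hypothetical helper: returns 1 if the layout is self-consistent under
 * the assumptions stated above, 0 otherwise.
 */
static int nand_ecc_layout_ok(size_t page_size, size_t oob_size,
			      size_t ecc_size, size_t ecc_bytes,
			      const unsigned char *eccpos, size_t eccpos_len)
{
	size_t steps, i;

	/* The page must divide evenly into ECC steps. */
	if (ecc_size == 0 || page_size % ecc_size != 0)
		return 0;
	steps = page_size / ecc_size;

	/* Every ECC byte needs exactly one position in the OOB. */
	if (steps * ecc_bytes != eccpos_len)
		return 0;

	/* Each position must lie inside the OOB area. */
	for (i = 0; i < eccpos_len; i++)
		if (eccpos[i] >= oob_size)
			return 0;

	return 1;
}
```

For the values above this works out to 4 steps of 14 bytes, i.e. 56 ECC bytes, matching the 56 ECCPOS entries (2 through 57) within the 64-byte OOB. Running the same check with the values from the SPL's u-boot.cfg would show whether the two builds agree.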
Meanwhile, I'll try to reproduce this on the AM335x-EVM.
If you find anything, let me know. I'll keep digging on my side as well.
Colin