[U-Boot] NAND write error with HW ECC on OMAP3

Hi,
When using 'nandecc hw' on an OMAP3 platform, data is not being written to NAND correctly. I see the issue on 2013.10-rc2 and 2013.07 but not on 2012.10. Specifically, when I read back an SPL binary written with hardware Hamming ECC, I don't get a matching CRC. With the BCH8 ECC algorithm, the CRC is correct (but SPL must be written with Hamming, otherwise the board doesn't boot).
I've shown my steps here: http://pastebin.com/tLZsr9zH The expected CRC is 46745188.
Any suggestions/ideas much appreciated!
--Ash

Dear Ash Charles,
On 09/03/2013 09:34 PM, Ash Charles wrote:
> Hi,
> When using 'nandecc hw' on an OMAP3 platform, data is not being correctly written to NAND. I see the issue on 2013.10-rc2 and 2013.07 but not on 2012.10. Specifically, when I read back a SPL binary written with hardware Hamming ECC, I don't get a matching CRC. With the BCH8 ECC algorithm, the CRC is correct (but SPL must be written with Hamming otherwise the board doesn't boot)
> I've shown my steps here: http://pastebin.com/tLZsr9zH The expected CRC is 46745188.
> Any suggestions/ideas much appreciated!
I'd swear this worked when I changed the command to use BCH ... I'll check it again today.
Best regards
Andreas Bießmann

Dear Ash Charles,
On 09/04/2013 09:35 AM, Andreas Bießmann wrote:
> Dear Ash Charles,
> On 09/03/2013 09:34 PM, Ash Charles wrote:
>> Hi,
>> When using 'nandecc hw' on an OMAP3 platform, data is not being correctly written to NAND. I see the issue on 2013.10-rc2 and 2013.07 but not on 2012.10. Specifically, when I read back a SPL binary written with hardware Hamming ECC, I don't get a matching CRC. With the BCH8 ECC algorithm, the CRC is correct (but SPL must be written with Hamming otherwise the board doesn't boot)
>> I've shown my steps here: http://pastebin.com/tLZsr9zH The expected CRC is 46745188.
>> Any suggestions/ideas much appreciated!
> I'd swear this worked when I changed the command to use BCH ... I'll check it again today.
I can't confirm your complaints. Here it works (at least on tricorder, which utilizes BCH for U-Boot section in SPL):
---8<---
U-Boot SPL 2013.10-rc2-00014-g2d5878e (Sep 04 2013 - 10:39:57)
reading u-boot.img
reading u-boot.img
U-Boot 2013.10-rc2-00014-g2d5878e (Sep 04 2013 - 10:39:57)
OMAP36XX/37XX-GP ES1.2, CPU-OPP2, L3-200MHz, Max CPU Clock 1 Ghz
OMAP3 Tricorder + LPDDR/NAND
I2C: ready
DRAM: 128 MiB
NAND: 512 MiB
MMC: OMAP SD/MMC: 0
*** Warning - bad CRC, using default environment
In: serial
Out: serial
Err: serial
Board : serial
Die ID #246000029e38000001683b060102002d
Hit any key to stop autoboot: 0
OMAP3 Tricorder # fatload mmc 0 ${loadaddr} MLO
reading MLO
55052 bytes read in 11 ms (4.8 MiB/s)
OMAP3 Tricorder # nandecc hw
1-bit hamming HW ECC selected
OMAP3 Tricorder # nand erase.part SPL
NAND erase.part: device 0 offset 0x0, size 0x20000
Erasing at 0x0 -- 100% complete.
OK
OMAP3 Tricorder # crc32 ${loadaddr} ${filesize}
CRC32 for 82000000 ... 8200d70b ==> b182f0a2
OMAP3 Tricorder # nand write ${loadaddr} SPL ${filesize}
NAND write: device 0 offset 0x0, size 0xd70c
55052 bytes written: OK
OMAP3 Tricorder # nand read 0x80000000 SPL ${filesize}
NAND read: device 0 offset 0x0, size 0xd70c
55052 bytes read: OK
OMAP3 Tricorder # crc32 0x80000000 ${filesize}
CRC32 for 80000000 ... 8000d70b ==> b182f0a2
OMAP3 Tricorder #
--->8---
The 14 patches on top of -rc2 are just local board changes (will be sent soon):
---8<---
2d5878e (HEAD, tricorder-TOT) tricorder: fixup for led.c
c3806ea tricorder: read kernel directly from NAND
c158ec1 tricorder: switch to alternative memtest
9bcc57b tricorder: Make u-boot faster
441857b tricorder: add led support
b8bb65a tricorder: panic() on unknown board
8c5e0a8 tricorder: add tricordereeprom command
4c86cad tricorder: add mtdparts to environment
3ac9838 tricorder: add cmdline history
7fe344d tricorder: move commonargs to common settings
3bd1a05 tricorder: add configuration for a flashcard u-boot
d2578c5 tricorder: use generic provided loadaddr
649f8dc tricorder: update flash partitioning
e34c01a tricorder: remove lcdmode from bootargs
fb18fa9 (tag: v2013.10-rc2, origin/master, origin/HEAD, cs/master) Prepare v2013.10-rc2
--->8---
Best regards
Andreas Bießmann
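
Note: the verification in the log above boils down to computing a CRC-32 over the image in RAM, writing it to NAND, reading it back to a second buffer, and comparing the two checksums. A minimal C sketch of that comparison follows. The function names crc32_buf and readback_matches are hypothetical and are not U-Boot APIs; the bitwise loop is the common reflected CRC-32 (polynomial 0xEDB88320), assumed here to be the checksum the crc32 command reports.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
 * Slow but dependency-free; hypothetical helper, not a U-Boot function. */
static uint32_t crc32_buf(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			/* XOR in the polynomial only when the low bit is set. */
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 when the buffer read back from flash has the same CRC-32
 * as the source image, 0 otherwise. */
static int readback_matches(const uint8_t *src, const uint8_t *readback,
			    size_t len)
{
	return crc32_buf(src, len) == crc32_buf(readback, len);
}
```

A mismatch only says that the data read back differs from what was loaded; it does not by itself tell whether the write path or the read path is at fault.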

On Wed, Sep 4, 2013 at 1:54 AM, Andreas Bießmann andreas.devel@googlemail.com wrote:
> I can't confirm your complaints. Here it works (at least on tricorder, which utilizes BCH for U-Boot section in SPL):
Hi Andreas,
Thanks for your response---this was very helpful. When I boot my board using the tricorder board file, it flashes nand correctly. Likewise, I moved over some of the NAND configuration from include/configs/tricorder.h to include/configs/omap3_overo.h and, after a little rearranging to enlarge SPL, it also flashed NAND correctly.
So...any guesses what it is about setting these variables that gets NAND flashing to work properly?
+#define CONFIG_NAND_OMAP_BCH8
+#define CONFIG_BCH
-#define CONFIG_SYS_NAND_ECCPOS {2, 3, 4, 5, 6, 7, 8, 9,\
-        10, 11, 12, 13}
+#define CONFIG_SYS_NAND_ECCPOS {12, 13, 14, 15, 16, 17, 18, 19, 20, \
+        21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,\
+        34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46,\
+        47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,\
+        60, 61, 62, 63}
-#define CONFIG_SYS_NAND_ECCBYTES 3
+#define CONFIG_SYS_NAND_ECCBYTES 13
--Ash

Hi,
I did a little bit more work with git bisect and found an issue on commit c788ecfdc3eb577757ffc1bfb8416added07ef33 "nand: Move the sub-page read support enable to a flag".
Making this change on top of v2013.07 allowed me to again write to NAND correctly.
-#define NAND_HAS_SUBPAGE_READ(chip) ((chip->options & NAND_SUBPAGE_READ))
+#define NAND_HAS_SUBPAGE_READ(chip) ((chip->ecc.mode == NAND_ECC_SOFT) \
+        && (chip->page_shift > 9))
Like some other OMAP3 platforms, my platform uses 1-bit hardware ECC for the first NAND partition and software ECC elsewhere. Does this ecc.mode switch need to be partition specific?
--Ash

Dear Ash Charles,
On 09/05/2013 01:02 AM, Ash Charles wrote:
> Hi,
> I did a little bit more work with git bisect and found an issue on commit c788ecfdc3eb577757ffc1bfb8416added07ef33 "nand: Move the sub-page read support enable to a flag".
> Making this change on top of v2013.07 allowed me to again write to NAND correctly.
> -#define NAND_HAS_SUBPAGE_READ(chip) ((chip->options & NAND_SUBPAGE_READ))
> +#define NAND_HAS_SUBPAGE_READ(chip) ((chip->ecc.mode == NAND_ECC_SOFT) \
> +        && (chip->page_shift > 9))
This check moved into nand_scan_tail(), which is also run when calling nandecc from the U-Boot command line, so at first sight your change isn't necessary. Can you please check whether chip->options is modified somewhere between nand_scan_tail() and the place where the NAND_SUBPAGE_READ flag is checked?
> Like some other OMAP3 platforms, my platform uses 1-bit hardware ECC for the first NAND partition and software ECC elsewhere. Does this ecc.mode switch need to be partition specific?
This cannot be partition-specific by design. The ECC scheme is bound to a NAND device, and therefore we introduced the nandecc command for OMAP3 (because the ROM code can only handle 1-bit Hamming, but today's devices sometimes require more than 1-bit ECC).
Best regards
Andreas Bießmann

Dear Andreas,
Thanks for your responses.
Based on a little more testing, I found this: with 'nandecc hw', NAND_HAS_SUBPAGE_READ must be false, otherwise data is not written correctly to NAND.
My hardware (a Micron flash chip) supports subpage reads and behaves correctly with 'nandecc sw'. On boot, chip->ecc.mode == NAND_ECC_SOFT, which sets (correctly for my NAND) the NAND_SUBPAGE_READ option. If I change to 'nandecc hw', chip->ecc.mode == NAND_ECC_HARD (seems correct) but NAND_SUBPAGE_READ is still set.
I suspect, then, that commit c788ecfdc3eb577757ffc1bfb8416added07ef33 is not to blame. It just exposes a bug where, with 'nandecc hw', subpage reads don't work properly even if the NAND flash supports them. Does that seem right?
Thanks again for your help.
--Ash
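
Note: the behaviour described in this message can be modelled in a few lines of C. This is a simplified sketch, not U-Boot code: struct chip_model, OPT_SUBPAGE_READ, and all function names are hypothetical stand-ins, and the scan-time rule is taken from the old macro quoted in the thread (software ECC and page_shift > 9). The second switch function shows one possible remedy, offered purely as an assumption and not as the upstream fix.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the real NAND definitions. */
enum ecc_mode { ECC_SOFT, ECC_HARD };
#define OPT_SUBPAGE_READ 0x1u

struct chip_model {
	enum ecc_mode ecc_mode;
	uint32_t options;
	unsigned int page_shift;
};

/* Scan-time decision as described in the thread: the flag is set for
 * software ECC on large-page chips, and never cleared here. */
static void scan_tail_model(struct chip_model *chip)
{
	if (chip->ecc_mode == ECC_SOFT && chip->page_shift > 9)
		chip->options |= OPT_SUBPAGE_READ;
}

/* Switching ECC mode and rescanning leaves a previously set flag in
 * place, which matches the observation after 'nandecc hw'. */
static void switch_ecc_stale(struct chip_model *chip, enum ecc_mode mode)
{
	chip->ecc_mode = mode;
	scan_tail_model(chip);
}

/* Possible remedy (assumption): clear the flag before re-evaluating. */
static void switch_ecc_clearing(struct chip_model *chip, enum ecc_mode mode)
{
	chip->ecc_mode = mode;
	chip->options &= ~OPT_SUBPAGE_READ;
	scan_tail_model(chip);
}

static int has_subpage_read(const struct chip_model *chip)
{
	return (chip->options & OPT_SUBPAGE_READ) != 0;
}
```

In this model, booting with software ECC and then calling switch_ecc_stale() with ECC_HARD leaves has_subpage_read() true, while switch_ecc_clearing() makes it false again.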

Dear Ash Charles,
On 09/04/2013 08:00 PM, Ash Charles wrote:
> On Wed, Sep 4, 2013 at 1:54 AM, Andreas Bießmann andreas.devel@googlemail.com wrote:
>> I can't confirm your complaints. Here it works (at least on tricorder, which utilizes BCH for U-Boot section in SPL):
> Hi Andreas,
> Thanks for your response---this was very helpful. When I boot my board using the tricorder board file, it flashes nand correctly. Likewise, I moved over some of the NAND configuration from include/configs/tricorder.h to include/configs/omap3_overo.h and, after a little rearranging to enlarge SPL, it also flashed NAND correctly.
> So...any guesses what it is about setting these variables that gets NAND flashing to work properly?
> +#define CONFIG_NAND_OMAP_BCH8
> +#define CONFIG_BCH
> -#define CONFIG_SYS_NAND_ECCPOS {2, 3, 4, 5, 6, 7, 8, 9,\
> -        10, 11, 12, 13}
> +#define CONFIG_SYS_NAND_ECCPOS {12, 13, 14, 15, 16, 17, 18, 19, 20, \
> +        21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,\
> +        34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46,\
> +        47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59,\
> +        60, 61, 62, 63}
> -#define CONFIG_SYS_NAND_ECCBYTES 3
> +#define CONFIG_SYS_NAND_ECCBYTES 13
These settings are for BCH8! The original settings you replaced are for so-called 'SW hamming' (1-bit ECC with a special OOB layout for OMAP3, which differs from the 1-bit ECC mapping used by the OMAP3 HW calculation in ROM code).
If you need higher ECC schemes for your NAND, you should update your setup (U-Boot + SPL _and_ Linux) to use some BCH codec. BCH4 calculation seems buggy on some OMAP3, therefore I used BCH8 here. Obviously it is a required step, since even SLC needs 4-bit ECC nowadays; some allow just 1 bit for the first sector if only a few erasures occur.
Best regards
Andreas Bießmann
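
Note: the two configurations discussed in this message can be cross-checked with simple arithmetic. The sketch below assumes a 2048-byte page protected in 512-byte ECC steps; the page size is not stated in the thread and is an assumption. Under that assumption, 3 ECC bytes per step give 12 bytes per page, matching the 12 OOB positions {2..13}, and 13 ECC bytes per step give 52 bytes per page, matching the 52 positions {12..63}. Both helper names are hypothetical.

```c
#include <stddef.h>

/* ECC bytes needed for one page, given the ECC bytes per 512-byte step
 * (the value of CONFIG_SYS_NAND_ECCBYTES in the diff above). */
static size_t ecc_bytes_per_page(size_t page_size, size_t ecc_bytes_per_step)
{
	return (page_size / 512) * ecc_bytes_per_step;
}

/* Number of OOB positions in an inclusive range such as {2..13}. */
static size_t eccpos_count(size_t first, size_t last)
{
	return last - first + 1;
}
```

If the two values disagree for a given configuration, the ECCPOS list and ECCBYTES setting do not describe the same layout.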