[U-Boot] [PATCH v1 0/2] mtd: nand: omap: booting from NAND using u-boot

As there were parallel set of patches running between u-boot and kernel. hence, some patch-sets caused regression for OMAP3x platforms when booting using u-boot specifically for ecc-schemes (like BCH4_SW).
Hence this patch series fixes those regressions, and tests complete NAND boot sequence for multiple ecc-schemes on AM335x-EVM board. (following configurations are required for building u-boot driver which is compatible to kernel ecclayout for selected ecc-scheme)
(BCH8_HW) (HAM1_HW) (HAM1_HW) (HAM1_HW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
(BCH8_HW) (BCH8_SW) (BCH8_SW) (BCH8_SW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
(BCH8_HW) (BCH8_HW) (BCH8_HW) (BCH8_HW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
*Configurations used to build u-boot and kernel for end-to-end NAND boot* +------------+--------------------------------------------+------------------+ | ecc-scheme | u-boot/SPL configs | kernel DTS | +------------+--------------------------------------------+------------------+ | | | | | HAM1_HW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_HAM1_CODE_HW | "ham1" | | (1-bit | #define CONFIG_SYS_NAND_ECCBYTES 3 | | | Hamming | #define CONFIG_SYS_NAND_ECCPOS \ | | | using h/w) | { 1, 2, 3, 4, 5, 6, 7, 8, 9, \ | | | | 10, 11, 12 } | | | | (for NAND page-size=2048) | | | | | | +------------+--------------------------------------------+------------------+ | | | | | BCH8_SW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_BCH8_CODE_HW_DETECTION_SW | "bch8" | |(8-bit BCH | #define CONFIG_SYS_NAND_ECCBYTES 13 |(without ELM node)| | using s/w | #define CONFIG_BCH | | |library for | #undef CONFIG_SPL_NAND_AM33XX_BCH | | |for ECC | #define CONFIG_SPL_NAND_SIMPLE | | | error | #define CONFIG_SYS_NAND_ECCPOS \ | | |correction) | {2, 3, 4, 5, 6, 7, 8, 9, 10, \ | | | | 11, 12, 13, 14, \ | | | | 16, 17, 18, 19, 20, 21, 22, 23, 24, \ | | | | 25, 26, 27, 28, \ | | | | 30, 31, 32, 33, 34, 35, 36, 37, 38, \ | | | | 39, 40, 41, 42, \ | | | | 44, 45, 46, 47, 48, 49, 50, 51, 52, \ | | | | 53, 54, 55, 56, } | | | | (for NAND page-size=2048) | | | | #define CONFIG_SYS_NAND_ECCSIZE 512 | | | | | | +------------+--------------------------------------------+------------------+ | | | | | BCH8_HW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_BCH8_CODE_HW | "bch8" | |(8-bit BCH | #define CONFIG_SYS_NAND_ECCBYTES 14 | | | using ELM | #define CONFIG_SPL_NAND_AM33XX_BCH |(with ELM node) | | h/w engine | #define CONFIG_SYS_NAND_ECCPOS \ |ti,elm-id=<&elm> | |for ECC | {2, 3, 4, 5, 6, 7, 8, 9, \ | | | error | 10, 11, 12, 13, 14, 15, 16, 17, \ | | |correction) | 18, 19, 20, 21, 22, 23, 24, 25, \ | | | | 26, 27, 28, 29, 30, 31, 32, 33, \ | | | | 34, 35, 36, 37, 38, 39, 40, 41, \ | | | | 42, 43, 44, 45, 46, 47, 48, 49, \ | | | | 50, 51, 52, 53, 54, 55, 56, 57, } | | | | (for NAND page-size=2048) | | | | #define CONFIG_SYS_NAND_ECCSIZE 512 | | | | | | +------------+--------------------------------------------+------------------+
#* In addition following patches need to be pulled for u-boot: http://lists.denx.de/pipermail/u-boot/2013-December/168506.html http://lists.denx.de/pipermail/u-boot/2013-December/169021.html
Test1: flash ubi image from u-boot and boot the kernel U-boot> mw 0x82000000 0xff <u-boot.img size aligned to NAND block boundary> U-boot> fatload mmc 0 0x82000000 u-boot.img U-boot> nand erase <u-boot_offset> <u-boot.img size> U-boot> nand write 0x82000000 <u-boot_offset> <u-boot.img size> U-boot> setenv bootargs 'console=ttyO0,115200n8 noinitrd mem=256M \ root=ubi0 rw rootfstype=ubifs ubi.mtd=<mtdpart-of-rootfs>,\ <page-size> ip=off init=/init' U-boot> bootm <kernel_offset>
Test2: update u-boot.img from kernel and re-boot Kernel> flash_erase /dev/<mtdpart-of-u-boot> 0 0 Kernel> nandwrite -s 0 /dev/<mtdpart-of-u-boot> u-boot.img Kernel> reboot
Signed-off-by: Pekon Gupta pekon@ti.com --- drivers/mtd/nand/omap2.c | 33 +++++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 8 deletions(-)
Pekon Gupta (2): mtd: nand: omap: fix ecclayout->oobfree->offset mtd: nand: omap: fix ecclayout to be in sync with u-boot NAND driver
drivers/mtd/nand/omap2.c | 50 ++++++++++++++++++++++++++++-------------------- 1 file changed, 29 insertions(+), 21 deletions(-)

This patch updates starting offset for free bytes in OOB which can be used by file-systems to store their metadata (like clean-marker in case of JFFS2).
Signed-off-by: Pekon Gupta pekon@ti.com --- drivers/mtd/nand/omap2.c | 17 ++++------------- 1 file changed, 4 insertions(+), 13 deletions(-)
diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c index f777250..bbdb5e8 100644 --- a/drivers/mtd/nand/omap2.c +++ b/drivers/mtd/nand/omap2.c @@ -1835,8 +1835,6 @@ static int omap_nand_probe(struct platform_device *pdev) ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; else ecclayout->eccpos[0] = 1; - ecclayout->oobfree->offset = ecclayout->eccpos[0] + - ecclayout->eccbytes; break;
case OMAP_ECC_BCH4_CODE_HW_DETECTION_SW: @@ -1854,8 +1852,6 @@ static int omap_nand_probe(struct platform_device *pdev) (mtd->writesize / nand_chip->ecc.size); ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; - ecclayout->oobfree->offset = ecclayout->eccpos[0] + - ecclayout->eccbytes; /* software bch library is used for locating errors */ nand_chip->ecc.priv = nand_bch_init(mtd, nand_chip->ecc.size, @@ -1890,8 +1886,6 @@ static int omap_nand_probe(struct platform_device *pdev) (mtd->writesize / nand_chip->ecc.size); ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; - ecclayout->oobfree->offset = ecclayout->eccpos[0] + - ecclayout->eccbytes; /* This ECC scheme requires ELM H/W block */ if (is_elm_present(info, pdata->elm_of_node, BCH4_ECC) < 0) { pr_err("nand: error: could not initialize ELM\n"); @@ -1920,8 +1914,6 @@ static int omap_nand_probe(struct platform_device *pdev) (mtd->writesize / nand_chip->ecc.size); ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; - ecclayout->oobfree->offset = ecclayout->eccpos[0] + - ecclayout->eccbytes; /* software bch library is used for locating errors */ nand_chip->ecc.priv = nand_bch_init(mtd, nand_chip->ecc.size, @@ -1963,8 +1955,6 @@ static int omap_nand_probe(struct platform_device *pdev) (mtd->writesize / nand_chip->ecc.size); ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; - ecclayout->oobfree->offset = ecclayout->eccpos[0] + - ecclayout->eccbytes; break; #else pr_err("nand: error: CONFIG_MTD_NAND_OMAP_BCH not enabled\n"); @@ -1978,9 +1968,6 @@ static int omap_nand_probe(struct platform_device *pdev) goto return_error; }
- /* populate remaining ECC layout data */ - ecclayout->oobfree->length = mtd->oobsize - (BADBLOCK_MARKER_LENGTH + - ecclayout->eccbytes); for (i = 1; i < ecclayout->eccbytes; i++) ecclayout->eccpos[i] = ecclayout->eccpos[0] + i; /* check if NAND device's OOB is enough to store ECC signatures */ @@ -1990,6 +1977,10 @@ static int omap_nand_probe(struct platform_device *pdev) err = -EINVAL; goto return_error; } + /* populate remaining ECC layout data */ + ecclayout->oobfree->offset = ecclayout->eccpos[ecclayout->eccbytes] + 1; + ecclayout->oobfree->length = mtd->oobsize - (BADBLOCK_MARKER_LENGTH + + ecclayout->eccbytes);
/* second phase scan */ if (nand_scan_tail(mtd)) {

On Fri, Dec 13, 2013 at 02:42:57PM +0530, Pekon Gupta wrote:
This patch updates starting offset for free bytes in OOB which can be used by file-systems to store their metadata (like clean-marker in case of JFFS2).
This should be describing a regression fix, right? We don't just arbitrarily change the "OOB free" layout; we need a reason. So please describe it here.
(It seems like an off-by-one, or off-by-<N> error, where the oobfree region was miscalculated.)
Possibly you can paste an example intended ecclayout as well as an incorrect layout that was calculated before this fix.
Signed-off-by: Pekon Gupta pekon@ti.com
drivers/mtd/nand/omap2.c | 17 ++++------------- 1 file changed, 4 insertions(+), 13 deletions(-)
diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c index f777250..bbdb5e8 100644 --- a/drivers/mtd/nand/omap2.c +++ b/drivers/mtd/nand/omap2.c @@ -1835,8 +1835,6 @@ static int omap_nand_probe(struct platform_device *pdev) ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; else ecclayout->eccpos[0] = 1;
ecclayout->oobfree->offset = ecclayout->eccpos[0] +
ecclayout->eccbytes;
break;
case OMAP_ECC_BCH4_CODE_HW_DETECTION_SW:
@@ -1854,8 +1852,6 @@ static int omap_nand_probe(struct platform_device *pdev) (mtd->writesize / nand_chip->ecc.size); ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH;
ecclayout->oobfree->offset = ecclayout->eccpos[0] +
/* software bch library is used for locating errors */ nand_chip->ecc.priv = nand_bch_init(mtd, nand_chip->ecc.size,ecclayout->eccbytes;
@@ -1890,8 +1886,6 @@ static int omap_nand_probe(struct platform_device *pdev) (mtd->writesize / nand_chip->ecc.size); ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH;
ecclayout->oobfree->offset = ecclayout->eccpos[0] +
/* This ECC scheme requires ELM H/W block */ if (is_elm_present(info, pdata->elm_of_node, BCH4_ECC) < 0) { pr_err("nand: error: could not initialize ELM\n");ecclayout->eccbytes;
@@ -1920,8 +1914,6 @@ static int omap_nand_probe(struct platform_device *pdev) (mtd->writesize / nand_chip->ecc.size); ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH;
ecclayout->oobfree->offset = ecclayout->eccpos[0] +
/* software bch library is used for locating errors */ nand_chip->ecc.priv = nand_bch_init(mtd, nand_chip->ecc.size,ecclayout->eccbytes;
@@ -1963,8 +1955,6 @@ static int omap_nand_probe(struct platform_device *pdev) (mtd->writesize / nand_chip->ecc.size); ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH;
ecclayout->oobfree->offset = ecclayout->eccpos[0] +
break;ecclayout->eccbytes;
#else pr_err("nand: error: CONFIG_MTD_NAND_OMAP_BCH not enabled\n"); @@ -1978,9 +1968,6 @@ static int omap_nand_probe(struct platform_device *pdev) goto return_error; }
- /* populate remaining ECC layout data */
- ecclayout->oobfree->length = mtd->oobsize - (BADBLOCK_MARKER_LENGTH +
for (i = 1; i < ecclayout->eccbytes; i++) ecclayout->eccpos[i] = ecclayout->eccpos[0] + i; /* check if NAND device's OOB is enough to store ECC signatures */ecclayout->eccbytes);
@@ -1990,6 +1977,10 @@ static int omap_nand_probe(struct platform_device *pdev) err = -EINVAL; goto return_error; }
- /* populate remaining ECC layout data */
- ecclayout->oobfree->offset = ecclayout->eccpos[ecclayout->eccbytes] + 1;
Hmm, this line seems a little odd. The ecclayout->eccpos[] array and the value of ecclayout->eccbytes sould be related as follows:
let N = ecclayout->eccbytes
This means the eccpos[] array should have N entries, indexed 0 to N-1, and eccpos[N] is out of bounds. But you are accessing eccpos[N] above (i.e., eccpos[ecclayout->eccbytes]). Are you sure this is correct? It seems like it should be:
ecclayout->oobfree->offset = ecclayout->eccpos[ecclayout->eccbytes - 1] + 1;
ecclayout->oobfree->length = mtd->oobsize - (BADBLOCK_MARKER_LENGTH +
ecclayout->eccbytes);
/* second phase scan */ if (nand_scan_tail(mtd)) {
Brian

Hi Brian,
From: Brian Norris [mailto:computersforpeace@gmail.com]
On Fri, Dec 13, 2013 at 02:42:57PM +0530, Pekon Gupta wrote: This patch updates starting offset for free bytes in OOB which can be used by file-systems to store their metadata (like clean-marker in case of JFFS2).
This should be describing a regression fix, right? We don't just arbitrarily change the "OOB free" layout; we need a reason. So please describe it here.
(It seems like an off-by-one, or off-by-<N> error, where the oobfree region was miscalculated.)
Possibly you can paste an example intended ecclayout as well as an incorrect layout that was calculated before this fix.
Signed-off-by: Pekon Gupta pekon@ti.com
drivers/mtd/nand/omap2.c | 17 ++++------------- 1 file changed, 4 insertions(+), 13 deletions(-)
diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
[...]
- /* populate remaining ECC layout data */
- ecclayout->oobfree->length = mtd->oobsize - (BADBLOCK_MARKER_LENGTH +
for (i = 1; i < ecclayout->eccbytes; i++) ecclayout->eccpos[i] = ecclayout->eccpos[0] + i; /* check if NAND device's OOB is enough to store ECC signatures */ecclayout->eccbytes);
@@ -1990,6 +1977,10 @@ static int omap_nand_probe(struct platform_device *pdev) err = -EINVAL; goto return_error; }
- /* populate remaining ECC layout data */
- ecclayout->oobfree->offset = ecclayout->eccpos[ecclayout->eccbytes] + 1;
Hmm, this line seems a little odd. The ecclayout->eccpos[] array and the value of ecclayout->eccbytes sould be related as follows:
let N = ecclayout->eccbytes
This means the eccpos[] array should have N entries, indexed 0 to N-1, and eccpos[N] is out of bounds. But you are accessing eccpos[N] above (i.e., eccpos[ecclayout->eccbytes]). Are you sure this is correct? It seems like it should be:
ecclayout->oobfree->offset = ecclayout->eccpos[ecclayout->eccbytes - 1] + 1;
Thanks for this catch. Yes, you are correct. It's a typo. This wasn't caught as I had tested everything on UBIFS which does not use 'oobfree'. Also, as ecclayout->eccpos is defined as large static array, so this dint caused problems either. #define MTD_MAX_ECCPOS_ENTRIES_LARGE 640 struct nand_ecclayout { __u32 eccpos[MTD_MAX_ECCPOS_ENTRIES_LARGE];
But, I think mtd_tests.nand_oobtest would have caught this. I'll include this change in next version.
<stripping down the CC list to avoid getting moderated by u-boot mailman>
with regards, pekon

This patch mainly fixes ecc-layout for following ecc-schemes, to bring them in sync with u-boot omap_gpmc NAND driver: - BCH4_SW: OMAP_ECC_BCH4_CODE_HW_DETECTION_SW This ecc-scheme is mainly used on AM35xx and other legacy platforms.
- BCH8_SW: OMAP_ECC_BCH8_CODE_HW_DETECTION_SW This ecc-scheme is mainly used on OMAP3x and other legacy platforms.
Apart from above, this patch also touches other ecc-schemes as the fix required refactoring common code, into ecc-scheme specific code. Hence, end-to-end NAND boot sequence was tested on AM335x-EVM to avoid any further regression on legacy or current platforms.
(BCH8_HW) (HAM1_HW) (HAM1_HW) (HAM1_HW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
(BCH8_HW) (BCH8_SW) (BCH8_SW) (BCH8_SW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
(BCH8_HW) (BCH8_HW) (BCH8_HW) (BCH8_HW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
*Configurations used to build u-boot and kernel for end-to-end NAND boot* +------------+--------------------------------------------+------------------+ | ecc-scheme | u-boot/SPL configs | kernel DTS | +------------+--------------------------------------------+------------------+ | | | | | HAM1_HW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_HAM1_CODE_HW | "ham1" | | (1-bit | #define CONFIG_SYS_NAND_ECCBYTES 3 | | | Hamming | #define CONFIG_SYS_NAND_ECCPOS \ | | | using h/w) | { 1, 2, 3, 4, 5, 6, 7, 8, 9, \ | | | | 10, 11, 12 } | | | | (for NAND page-size=2048) | | | | | | +------------+--------------------------------------------+------------------+ | | | | | BCH8_SW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_BCH8_CODE_HW_DETECTION_SW | "bch8" | |(8-bit BCH | #define CONFIG_SYS_NAND_ECCBYTES 13 |(without ELM node)| | using s/w | #define CONFIG_BCH | | |library for | #undef CONFIG_SPL_NAND_AM33XX_BCH | | |for ECC | #define CONFIG_SPL_NAND_SIMPLE | | | error | #define CONFIG_SYS_NAND_ECCPOS \ | | |correction) | {2, 3, 4, 5, 6, 7, 8, 9, 10, \ | | | | 11, 12, 13, 14, \ | | | | 16, 17, 18, 19, 20, 21, 22, 23, 24, \ | | | | 25, 26, 27, 28, \ | | | | 30, 31, 32, 33, 34, 35, 36, 37, 38, \ | | | | 39, 40, 41, 42, \ | | | | 44, 45, 46, 47, 48, 49, 50, 51, 52, \ | | | | 53, 54, 55, 56, } | | | | (for NAND page-size=2048) | | | | #define CONFIG_SYS_NAND_ECCSIZE 512 | | | | | | +------------+--------------------------------------------+------------------+ | | | | | BCH8_HW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_BCH8_CODE_HW | "bch8" | |(8-bit BCH | #define CONFIG_SYS_NAND_ECCBYTES 14 | | | using ELM | #define CONFIG_SPL_NAND_AM33XX_BCH |(with ELM node) | | h/w engine | #define CONFIG_SYS_NAND_ECCPOS \ |ti,elm-id=<&elm> | |for ECC | {2, 3, 4, 5, 6, 7, 8, 9, \ | | | error | 10, 11, 12, 13, 14, 15, 16, 17, \ | | |correction) | 18, 19, 20, 21, 22, 23, 24, 25, \ | | | | 26, 27, 28, 29, 30, 31, 32, 33, \ | | | | 34, 35, 36, 37, 38, 39, 40, 41, \ | | | | 42, 43, 44, 45, 46, 47, 48, 49, \ | | | | 50, 51, 52, 53, 54, 55, 56, 57, } | | | | (for NAND page-size=2048) | | | | #define CONFIG_SYS_NAND_ECCSIZE 512 | | | | | | +------------+--------------------------------------------+------------------+
Test1: flash ubi image from u-boot and boot the kernel U-boot> mw 0x82000000 0xff <u-boot.img size aligned to NAND block boundary> U-boot> fatload mmc 0 0x82000000 u-boot.img U-boot> nand erase <u-boot_offset> <u-boot.img size> U-boot> nand write 0x82000000 <u-boot_offset> <u-boot.img size> U-boot> setenv bootargs 'console=ttyO0,115200n8 noinitrd mem=256M \ root=ubi0 rw rootfstype=ubifs ubi.mtd=<mtdpart-of-rootfs>,\ <page-size> ip=off init=/init' U-boot> bootm <kernel_offset>
Test2: update u-boot.img from kernel and re-boot Kernel> flash_erase /dev/<mtdpart-of-u-boot> 0 0 Kernel> nandwrite -s 0 /dev/<mtdpart-of-u-boot> u-boot.img Kernel> reboot
Signed-off-by: Pekon Gupta pekon@ti.com --- drivers/mtd/nand/omap2.c | 33 +++++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 8 deletions(-)
diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c index bbdb5e8..e7836bf 100644 --- a/drivers/mtd/nand/omap2.c +++ b/drivers/mtd/nand/omap2.c @@ -1633,6 +1633,7 @@ static int omap_nand_probe(struct platform_device *pdev) int i; dma_cap_mask_t mask; unsigned sig; + unsigned oob_index; struct resource *res; struct mtd_part_parser_data ppdata = {};
@@ -1832,9 +1833,11 @@ static int omap_nand_probe(struct platform_device *pdev) (mtd->writesize / nand_chip->ecc.size); if (nand_chip->options & NAND_BUSWIDTH_16) - ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; + oob_index = BADBLOCK_MARKER_LENGTH; else - ecclayout->eccpos[0] = 1; + oob_index = 1; + for (i = 0; i < ecclayout->eccbytes; i++, oob_index++) + ecclayout->eccpos[i] = oob_index; break;
case OMAP_ECC_BCH4_CODE_HW_DETECTION_SW: @@ -1851,7 +1854,13 @@ static int omap_nand_probe(struct platform_device *pdev) ecclayout->eccbytes = nand_chip->ecc.bytes * (mtd->writesize / nand_chip->ecc.size); - ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; + oob_index = BADBLOCK_MARKER_LENGTH; + for (i = 0; i < ecclayout->eccbytes; i++, oob_index++) { + if ((i % nand_chip->ecc.bytes) || (i == 0)) + ecclayout->eccpos[i] = oob_index; + else + ecclayout->eccpos[i] = ++oob_index; + } /* software bch library is used for locating errors */ nand_chip->ecc.priv = nand_bch_init(mtd, nand_chip->ecc.size, @@ -1885,7 +1894,9 @@ static int omap_nand_probe(struct platform_device *pdev) ecclayout->eccbytes = nand_chip->ecc.bytes * (mtd->writesize / nand_chip->ecc.size); - ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; + oob_index = BADBLOCK_MARKER_LENGTH; + for (i = 0; i < ecclayout->eccbytes; i++, oob_index++) + ecclayout->eccpos[i] = oob_index; /* This ECC scheme requires ELM H/W block */ if (is_elm_present(info, pdata->elm_of_node, BCH4_ECC) < 0) { pr_err("nand: error: could not initialize ELM\n"); @@ -1913,7 +1924,13 @@ static int omap_nand_probe(struct platform_device *pdev) ecclayout->eccbytes = nand_chip->ecc.bytes * (mtd->writesize / nand_chip->ecc.size); - ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; + oob_index = BADBLOCK_MARKER_LENGTH; + for (i = 0; i < ecclayout->eccbytes; i++, oob_index++) { + if ((i % nand_chip->ecc.bytes) || (i == 0)) + ecclayout->eccpos[i] = oob_index; + else + ecclayout->eccpos[i] = ++oob_index; + } /* software bch library is used for locating errors */ nand_chip->ecc.priv = nand_bch_init(mtd, nand_chip->ecc.size, @@ -1954,7 +1971,9 @@ static int omap_nand_probe(struct platform_device *pdev) ecclayout->eccbytes = nand_chip->ecc.bytes * (mtd->writesize / nand_chip->ecc.size); - ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH; + oob_index = BADBLOCK_MARKER_LENGTH; + for (i = 0; i < ecclayout->eccbytes; i++, oob_index++) + ecclayout->eccpos[i] = oob_index; break; #else pr_err("nand: error: CONFIG_MTD_NAND_OMAP_BCH not enabled\n"); @@ -1968,8 +1987,6 @@ static int omap_nand_probe(struct platform_device *pdev) goto return_error; }
- for (i = 1; i < ecclayout->eccbytes; i++) - ecclayout->eccpos[i] = ecclayout->eccpos[0] + i; /* check if NAND device's OOB is enough to store ECC signatures */ if (mtd->oobsize < (ecclayout->eccbytes + BADBLOCK_MARKER_LENGTH)) { pr_err("not enough OOB bytes required = %d, available=%d\n",

On Fri, Dec 13, 2013 at 02:42:58PM +0530, Pekon Gupta wrote:
@@ -1851,7 +1854,13 @@ static int omap_nand_probe(struct platform_device *pdev) ecclayout->eccbytes = nand_chip->ecc.bytes * (mtd->writesize / nand_chip->ecc.size);
ecclayout->eccpos[0] = BADBLOCK_MARKER_LENGTH;
oob_index = BADBLOCK_MARKER_LENGTH;
for (i = 0; i < ecclayout->eccbytes; i++, oob_index++) {
if ((i % nand_chip->ecc.bytes) || (i == 0))
ecclayout->eccpos[i] = oob_index;
else
ecclayout->eccpos[i] = ++oob_index;
This if-else structure, with a combined assignment and increment kind of obscures the logic of what you're really trying to do. Could this be better as follows?
for (i = 0; i < ecclayout->eccbytes; i++, oob_index++) { ecclayout->eccpos[i] = oob_index; if (((i + 1) % nand_chip->ecc.bytes) == 0) oob_index++; }
/* software bch library is used for locating errors */ nand_chip->ecc.priv = nand_bch_init(mtd, nand_chip->ecc.size,}
Brian

Hello Enric, Nikita, and other OMAP3 users,
As there were parallel set of patches running between u-boot and kernel. hence, some patch-sets caused regression for OMAP3x platforms when booting using u-boot specifically for ecc-schemes (like BCH4_SW).
Hence this patch series fixes those regressions, and tests complete NAND boot sequence for multiple ecc-schemes on AM335x-EVM board. (following configurations are required for building u-boot driver which is compatible to kernel ecclayout for selected ecc-scheme)
(BCH8_HW) (HAM1_HW) (HAM1_HW) (HAM1_HW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
(BCH8_HW) (BCH8_SW) (BCH8_SW) (BCH8_SW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
(BCH8_HW) (BCH8_HW) (BCH8_HW) (BCH8_HW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
*Configurations used to build u-boot and kernel for end-to-end NAND boot* +------------+--------------------------------------------+------------------+ | ecc-scheme | u-boot/SPL configs | kernel DTS | +------------+--------------------------------------------+------------------+ | | | | | HAM1_HW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_HAM1_CODE_HW | "ham1" | | (1-bit | #define CONFIG_SYS_NAND_ECCBYTES 3 | | | Hamming | #define CONFIG_SYS_NAND_ECCPOS \ | | | using h/w) | { 1, 2, 3, 4, 5, 6, 7, 8, 9, \ | | | | 10, 11, 12 } | | | | (for NAND page-size=2048) | | | | | | +------------+--------------------------------------------+------------------+ | | | | | BCH8_SW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_BCH8_CODE_HW_DETECTION_SW | "bch8" | |(8-bit BCH | #define CONFIG_SYS_NAND_ECCBYTES 13 |(without ELM node)| | using s/w | #define CONFIG_BCH | | |library for | #undef CONFIG_SPL_NAND_AM33XX_BCH | | |for ECC | #define CONFIG_SPL_NAND_SIMPLE | | | error | #define CONFIG_SYS_NAND_ECCPOS \ | | |correction) | {2, 3, 4, 5, 6, 7, 8, 9, 10, \ | | | | 11, 12, 13, 14, \ | | | | 16, 17, 18, 19, 20, 21, 22, 23, 24, \ | | | | 25, 26, 27, 28, \ | | | | 30, 31, 32, 33, 34, 35, 36, 37, 38, \ | | | | 39, 40, 41, 42, \ | | | | 44, 45, 46, 47, 48, 49, 50, 51, 52, \ | | | | 53, 54, 55, 56, } | | | | (for NAND page-size=2048) | | | | #define CONFIG_SYS_NAND_ECCSIZE 512 | | | | | | +------------+--------------------------------------------+------------------+ | | | | | BCH8_HW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_BCH8_CODE_HW | "bch8" | |(8-bit BCH | #define CONFIG_SYS_NAND_ECCBYTES 14 | | | using ELM | #define CONFIG_SPL_NAND_AM33XX_BCH |(with ELM node) | | h/w engine | #define CONFIG_SYS_NAND_ECCPOS \ |ti,elm-id=<&elm> | |for ECC | {2, 3, 4, 5, 6, 7, 8, 9, \ | | | error | 10, 11, 12, 13, 14, 15, 16, 17, \ | | |correction) | 18, 19, 20, 21, 22, 23, 24, 25, \ | | | | 26, 27, 28, 29, 30, 31, 32, 33, \ | | | | 34, 35, 36, 37, 38, 39, 40, 41, \ | | | | 42, 43, 44, 45, 46, 47, 48, 49, \ | | | | 50, 51, 52, 53, 54, 55, 56, 57, } | | | | (for NAND page-size=2048) | | | | #define CONFIG_SYS_NAND_ECCSIZE 512 | | | | | | +------------+--------------------------------------------+------------------+
#* In addition following patches need to be pulled for u-boot: http://lists.denx.de/pipermail/u-boot/2013-December/168506.html http://lists.denx.de/pipermail/u-boot/2013-December/169021.html
Test1: flash ubi image from u-boot and boot the kernel U-boot> mw 0x82000000 0xff <u-boot.img size aligned to NAND block boundary> U-boot> fatload mmc 0 0x82000000 u-boot.img U-boot> nand erase <u-boot_offset> <u-boot.img size> U-boot> nand write 0x82000000 <u-boot_offset> <u-boot.img size> U-boot> setenv bootargs 'console=ttyO0,115200n8 noinitrd mem=256M \ root=ubi0 rw rootfstype=ubifs ubi.mtd=<mtdpart-of-rootfs>,\ <page-size> ip=off init=/init' U-boot> bootm <kernel_offset>
Test2: update u-boot.img from kernel and re-boot Kernel> flash_erase /dev/<mtdpart-of-u-boot> 0 0 Kernel> nandwrite -s 0 /dev/<mtdpart-of-u-boot> u-boot.img Kernel> reboot
Signed-off-by: Pekon Gupta pekon@ti.com
Though I have done initial level of testing on AM335x as mentioned above, But will it be possible for you to test and confirm if these set of patches solve regressions on your OMAP3 boards ?
with regards, pekon

Hi Pekon,
On 06.01.2014 08:40, Gupta, Pekon wrote:
Hello Enric, Nikita, and other OMAP3 users,
As there were parallel set of patches running between u-boot and kernel. hence, some patch-sets caused regression for OMAP3x platforms when booting using u-boot specifically for ecc-schemes (like BCH4_SW).
Hence this patch series fixes those regressions, and tests complete NAND boot sequence for multiple ecc-schemes on AM335x-EVM board. (following configurations are required for building u-boot driver which is compatible to kernel ecclayout for selected ecc-scheme)
(BCH8_HW) (HAM1_HW) (HAM1_HW) (HAM1_HW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
(BCH8_HW) (BCH8_SW) (BCH8_SW) (BCH8_SW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
(BCH8_HW) (BCH8_HW) (BCH8_HW) (BCH8_HW) (UBIFS) ROM ---------> SPL ---------> U-Boot ---------> Kernel ---------> File-System
*Configurations used to build u-boot and kernel for end-to-end NAND boot* +------------+--------------------------------------------+------------------+ | ecc-scheme | u-boot/SPL configs | kernel DTS | +------------+--------------------------------------------+------------------+ | | | | | HAM1_HW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_HAM1_CODE_HW | "ham1" | | (1-bit | #define CONFIG_SYS_NAND_ECCBYTES 3 | | | Hamming | #define CONFIG_SYS_NAND_ECCPOS \ | | | using h/w) | { 1, 2, 3, 4, 5, 6, 7, 8, 9, \ | | | | 10, 11, 12 } | | | | (for NAND page-size=2048) | | | | | | +------------+--------------------------------------------+------------------+ | | | | | BCH8_SW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_BCH8_CODE_HW_DETECTION_SW | "bch8" | |(8-bit BCH | #define CONFIG_SYS_NAND_ECCBYTES 13 |(without ELM node)| | using s/w | #define CONFIG_BCH | | |library for | #undef CONFIG_SPL_NAND_AM33XX_BCH | | |for ECC | #define CONFIG_SPL_NAND_SIMPLE | | | error | #define CONFIG_SYS_NAND_ECCPOS \ | | |correction) | {2, 3, 4, 5, 6, 7, 8, 9, 10, \ | | | | 11, 12, 13, 14, \ | | | | 16, 17, 18, 19, 20, 21, 22, 23, 24, \ | | | | 25, 26, 27, 28, \ | | | | 30, 31, 32, 33, 34, 35, 36, 37, 38, \ | | | | 39, 40, 41, 42, \ | | | | 44, 45, 46, 47, 48, 49, 50, 51, 52, \ | | | | 53, 54, 55, 56, } | | | | (for NAND page-size=2048) | | | | #define CONFIG_SYS_NAND_ECCSIZE 512 | | | | | | +------------+--------------------------------------------+------------------+ | | | | | BCH8_HW | #define CONFIG_NAND_OMAP_ECCSCHEME \ |ti,nand-ecc-opts= | | | OMAP_ECC_BCH8_CODE_HW | "bch8" | |(8-bit BCH | #define CONFIG_SYS_NAND_ECCBYTES 14 | | | using ELM | #define CONFIG_SPL_NAND_AM33XX_BCH |(with ELM node) | | h/w engine | #define CONFIG_SYS_NAND_ECCPOS \ |ti,elm-id=<&elm> | |for ECC | {2, 3, 4, 5, 6, 7, 8, 9, \ | | | error | 10, 11, 12, 13, 14, 15, 16, 17, \ | | |correction) | 18, 19, 20, 21, 22, 23, 24, 25, \ | | | | 26, 27, 28, 29, 30, 31, 32, 33, \ | | | | 34, 35, 36, 37, 38, 39, 40, 41, \ | | | | 42, 43, 44, 45, 46, 47, 48, 49, \ | | | | 50, 51, 52, 53, 54, 55, 56, 57, } | | | | (for NAND page-size=2048) | | | | #define CONFIG_SYS_NAND_ECCSIZE 512 | | | | | | +------------+--------------------------------------------+------------------+
#* In addition following patches need to be pulled for u-boot: http://lists.denx.de/pipermail/u-boot/2013-December/168506.html http://lists.denx.de/pipermail/u-boot/2013-December/169021.html
Test1: flash ubi image from u-boot and boot the kernel U-boot> mw 0x82000000 0xff <u-boot.img size aligned to NAND block boundary> U-boot> fatload mmc 0 0x82000000 u-boot.img U-boot> nand erase <u-boot_offset> <u-boot.img size> U-boot> nand write 0x82000000 <u-boot_offset> <u-boot.img size> U-boot> setenv bootargs 'console=ttyO0,115200n8 noinitrd mem=256M \ root=ubi0 rw rootfstype=ubifs ubi.mtd=<mtdpart-of-rootfs>,\ <page-size> ip=off init=/init' U-boot> bootm <kernel_offset>
Test2: update u-boot.img from kernel and re-boot Kernel> flash_erase /dev/<mtdpart-of-u-boot> 0 0 Kernel> nandwrite -s 0 /dev/<mtdpart-of-u-boot> u-boot.img Kernel> reboot
Signed-off-by: Pekon Gupta pekon@ti.com
Though I have done initial level of testing on AM335x as mentioned above, But will it be possible for you to test and confirm if these set of patches solve regressions on your OMAP3 boards ?
Those patches work fine on our custom AM335x board. So:
Tested-by: Stefan Roese sr@denx.de
Thanks, Stefan

Hi Pekon,
Sorry, I'm revisiting your patch series a bit late. There are a few factors that contributed to this, though.
1. This patch series talks extensively about U-Boot. U-Boot is not my interest, nor should it be the focus of kernel (driver) development. Any work done here should be framed in the kernel driver context. [1]
2. You have previously CC'd me on U-Boot patches. Or at least, I think so. More about this in the next point...
3. The U-Boot source code structure is rather similar to pieces of Linux subsystems. So without additional effort, it is hard to discern whether a patch is intended for Linux MTD or U-Boot MTD.
4. You cross-posted to both the U-Boot and Linux-MTD mailing lists.
So the sum of all this is that I ignored these patches at first, as they weren't clearly intended for me.
On Fri, Dec 13, 2013 at 02:42:56PM +0530, Pekon Gupta wrote:
As there were parallel set of patches running between u-boot and kernel.
I don't know what patches you're talking about.
hence, some patch-sets caused regression for OMAP3x platforms when booting
"Hence" is not the right word. The previous sentence doesn't imply that there were regressions.
using u-boot specifically for ecc-schemes (like BCH4_SW).
Huh? What does "specifically for ecc-schemes" mean? You mean that only some schemes had regressions? Can you be more specific? What are these regressions?
The rest of this cover letter (and most of the patch descriptions) describes the configurations and testing (which is all very good, don't get me wrong), with a lot of focus on things I don't follow (namely, U-Boot), but you omit many of the important details, like
* What is the regression? I honestly don't see a description of the actual regression. (I can read the code to intuit that, but what's the point of your pages of description if it doesn't mention this?) Please give some logical description of the problem * What are the effects of the regression? ECC bytes in the wrong place, so you see correction failures/corruption? JFFS2 failures? (The "oobfree" changes, for instance, should only affect something like JFFS2.) * What Linux commit caused the regression? * Should this patch set be marked for -stable? And for what kernel release? (the previous question would help answer this)
Hence this patch series fixes those regressions, and tests complete
[...] snip
I'm preparing a 3.14 pull request soon, and since you seem committed to fixing and properly testing a known regression here, I'd like to see this go in. But given the late timing and the unanswered questions, I think it's unlikely to go in -rc1. Perhaps I can send a later pull request if we can get it together properly.
So I'd like to first of all get a few answers to my questions, and then I think we might want to refresh this patch series with a little better commit messages and perhaps a separate documentation file (not necessarily in the same patch series, as the fix is more urgent).
Brian
[1] For instance, rather than saying "the ECC layout doesn't match U-Boot", you should probably describe what the intended ECC layout should look like and show that Linux does not match it. Perhaps it's time for some better documentation, like Ezequiel stashed under Documentation/mtd/nand/ for pxa3xx. You could also take some cues from Huang's comments on ECC layouts, etc., in gpmi-nand.c.

Hi Brian,
From: Brian Norris
- This patch series talks extensively about U-Boot. U-Boot is not my
interest, nor should it be the focus of kernel (driver) development. Any work done here should be framed in the kernel driver context. [1]
Apologies for cross-posting, I understand that you are already flooded by emails from linux-mtd list. But my intention was to keep all users of OMAP3 informed, as this regression was Reported-by: Enric Balletbo Serra eballetbo@iseebcn.com while testing mainline u-boot & kernel on OMAP3 platform.
Thanks for you feedbacks. I'll fix the commit messages with proper description of the regression, And incorporate other comments when I re-send it. Also, I'll cut-down the CC list, as u-boot mailman blocks emails with long CC list.
with regards, pekon

On Mon, Jan 27, 2014 at 9:46 AM, Gupta, Pekon pekon@ti.com wrote:
From: Brian Norris
- This patch series talks extensively about U-Boot. U-Boot is not my
interest, nor should it be the focus of kernel (driver) development. Any work done here should be framed in the kernel driver context. [1]
Apologies for cross-posting, I understand that you are already flooded by emails from linux-mtd list. But my intention was to keep all users of OMAP3 informed, as this regression was Reported-by: Enric Balletbo Serra eballetbo@iseebcn.com while testing mainline u-boot & kernel on OMAP3 platform.
This last line ("while testing mainline u-boot & kernel on OMAP3 platform") is part of what worries me and requires more explanation. An upgrade to *either* U-Boot or kernel should not cause regressions for already-supported platforms (if this is new platform support, then that's different, and it's not exactly a "regression" in that case; but I know some of the kernel features are new platform support).
Thanks for you feedbacks. I'll fix the commit messages with proper description of the regression, And incorporate other comments when I re-send it.
Can you please respond to a few of the concerns before sending a new patch set? I'd like to have the proper explanation and discussion up front here, rather than burying it in your long patch descriptions with test results. Then the end result can go into a proper commit message, once we're all happy.
Also, I'll cut-down the CC list, as u-boot mailman blocks emails with long CC list.
Yeah, I noticed that after I sent my replies... IMO, you can drop the u-boot list if that helps.
Brian

Hi Brian,
From: Brian Norris
Hi Pekon,
Sorry, I'm revisiting your patch series a bit late. There are a few factors that contributed to this, though.
- This patch series talks extensively about U-Boot. U-Boot is not my
interest, nor should it be the focus of kernel (driver) development. Any work done here should be framed in the kernel driver context. [1]
- You have previously CC'd me on U-Boot patches. Or at least, I think
so. More about this in the next point...
- The U-Boot source code structure is rather similar to pieces of Linux
subsystems. So without additional effort, it is hard to discern whether a patch is intended for Linux MTD or U-Boot MTD.
- You cross-posted to both the U-Boot and Linux-MTD mailing lists.
So the sum of all this is that I ignored these patches at first, as they weren't clearly intended for me.
On Fri, Dec 13, 2013 at 02:42:56PM +0530, Pekon Gupta wrote:
As there were parallel set of patches running between u-boot and kernel.
I don't know what patches you're talking about.
Following patch-sets touched ecc.layout while cleaning up OMAP NAND driver. u-boot: http://lists.denx.de/pipermail/u-boot/2013-November/167551.html kernel: http://lists.infradead.org/pipermail/linux-mtd/2013-October/049414.html
hence, some patch-sets caused regression for OMAP3x platforms when booting
"Hence" is not the right word. The previous sentence doesn't imply that there were regressions.
using u-boot specifically for ecc-schemes (like BCH4_SW).
Huh? What does "specifically for ecc-schemes" mean? You mean that only some schemes had regressions? Can you be more specific? What are these regressions?
It was reported by Enric Balletbo Serra eballetbo@iseebcn.com [2] that when using 'OMAP_ECC_BCH8_CODE_HW_DETECTION_SW' ecc-scheme there was a mismatch in ECC layout between mainline u-boot and kernel. This caused boot failure when kernel tried to mount UBIFS with image flashed from u-boot.
This was missed while testing earlier patch-sets because only OMAP3 boards use above ecc-scheme as OMAP3 SoC do not have ELM hardware engine. All the later platforms use 'OMAP_ECC_BCH8_CODE_HW' ecc-scheme. Both ecc-scheme are based on same algorithm of BCH8 ECC code except that: (1) 'OMAP_ECC_BCH8_CODE_HW_DETECTION_SW' uses /lib/bch.c software library for ecc-correction. (2) 'OMAP_ECC_BCH8_CODE_HW' uses ELM hardware engine for ecc-correction.
This regression affects following ecc-schemes, as ecc.layout logic is just copy-paste of each-other. - 'OMAP_ECC_BCH4_CODE_HW_DETECTION_SW' and - 'OMAP_ECC_BCH8_CODE_HW_DETECTION_SW'
The rest of this cover letter (and most of the patch descriptions) describes the configurations and testing (which is all very good, don't get me wrong), with a lot of focus on things I don't follow (namely, U-Boot), but you omit many of the important details, like
- What is the regression? I honestly don't see a description of the actual regression. (I can read the code to intuit that, but what's the point of your pages of description if it doesn't mention this?) Please give some logical description of the problem
[Patch 1/2] of this series actually simplification of 'ecclayout->oobfree->offset' It does not fix the regression, but it's important to clean it, before any new feature additions. [Patch 2/2] Actually fixes the regression for affected ecc-schemes. But "this patch also touches other ecc-schemes as the fix required refactoring common code, into ecc-scheme specific code."
- What are the effects of the regression? ECC bytes in the wrong place, so you see correction failures/corruption? JFFS2 failures? (The "oobfree" changes, for instance, should only affect something like JFFS2.)
As mentioned above [PATCH 1/2] which touches 'ecclayout->oobfree->offset' is a clean-up not affecting regression, but it should be taken in before any further feature or ecc-scheme changes. Now, what is your preference? (a) Take [PATCH 1/2] ecc.layout clean-up separately. (b) Take ecc.layout as part of this series, but may be with different patch title.
- What Linux commit caused the regression?
- Should this patch set be marked for -stable? And for what kernel release? (the previous question would help answer this)
This regression was caused in commit b491da7233d5dc1a24d46ca1ad0209900329c5d0 mtd: nand: omap: clean-up ecc layout for BCH ecc schemes So, this should be applicable from 3.13.x+
Hence this patch series fixes those regressions, and tests complete
[...] snip
I'm preparing a 3.14 pull request soon, and since you seem committed to fixing and properly testing a known regression here, I'd like to see this go in. But given the late timing and the unanswered questions, I think it's unlikely to go in -rc1. Perhaps I can send a later pull request if we can get it together properly.
Yes, if the above details are sufficient, then I'll: (1) re-word [PATCH 1/2] commit log & title to mention that it's a clean-up (2) Add above description and details of regression into commit log of [PATCH 2/2] (3) Include your other comments on individual patches.
So I'd like to first of all get a few answers to my questions, and then I think we might want to refresh this patch series with a little better commit messages and perhaps a separate documentation file (not necessarily in the same patch series, as the fix is more urgent).
Yes, I plan to get some documentation in kernel docs. But for now I do have a 'un-polished' page for OMAP NAND driver on TI wiki [3]. Please let me know if you would like something similar for kernel docs.
I'm happy that you are scrutinizing each patch and their commit logs wisely. Thanks much for that. This helps me and probably everyone else to tighten-up so that only good quality patch-sets go in. This eventually will make the sub-system and the drivers more stable for long-term.
Also, if you don't mind, I'll still continue to cross-post this particular Patch-series on both u-boot and kernel mailing lists just to maintain continuity, as I know there are lot of OMAP3 users which are following only one of the two lists.. (but I'm cutting down on the CC list). I'll avoid cross posting in future.
Brian
[1] For instance, rather than saying "the ECC layout doesn't match U-Boot", you should probably describe what the intended ECC layout should look like and show that Linux does not match it. Perhaps it's time for some better documentation, like Ezequiel stashed under Documentation/mtd/nand/ for pxa3xx. You could also take some cues from Huang's comments on ECC layouts, etc., in gpmi-nand.c.
[2] http://lists.denx.de/pipermail/u-boot/2013-December/168440.html [3] https://processors.wiki.ti.com/index.php/Linux_Core_NAND_User%27s_Guide (under construction)
with regards, pekon

Hi Brian,
From: Gupta, Pekon
I'm preparing a 3.14 pull request soon, and since you seem committed to fixing and properly testing a known regression here, I'd like to see this go in. But given the late timing and the unanswered questions, I think it's unlikely to go in -rc1. Perhaps I can send a later pull request if we can get it together properly.
Yes, if the above details are sufficient, then I'll: (1) re-word [PATCH 1/2] commit log & title to mention that it's a clean-up (2) Add above description and details of regression into commit log of [PATCH 2/2] (3) Include your other comments on individual patches.
So I'd like to first of all get a few answers to my questions, and then I think we might want to refresh this patch series with a little better commit messages and perhaps a separate documentation file (not necessarily in the same patch series, as the fix is more urgent).
Do my previous responses look complete to you for re-submission ?
with regards, pekon

Hi Pekon,
On Tue, Jan 28, 2014 at 07:42:09AM +0000, Pekon Gupta wrote:
From: Brian Norris
On Fri, Dec 13, 2013 at 02:42:56PM +0530, Pekon Gupta wrote:
As there were parallel set of patches running between u-boot and kernel.
I don't know what patches you're talking about.
Following patch-sets touched ecc.layout while cleaning up OMAP NAND driver. u-boot: http://lists.denx.de/pipermail/u-boot/2013-November/167551.html kernel: http://lists.infradead.org/pipermail/linux-mtd/2013-October/049414.html
hence, some patch-sets caused regression for OMAP3x platforms when booting
"Hence" is not the right word. The previous sentence doesn't imply that there were regressions.
using u-boot specifically for ecc-schemes (like BCH4_SW).
Huh? What does "specifically for ecc-schemes" mean? You mean that only some schemes had regressions? Can you be more specific? What are these regressions?
It was reported by Enric Balletbo Serra eballetbo@iseebcn.com [2] that when using 'OMAP_ECC_BCH8_CODE_HW_DETECTION_SW' ecc-scheme there was a mismatch in ECC layout between mainline u-boot and kernel. This caused boot failure when kernel tried to mount UBIFS with image flashed from u-boot.
This was missed while testing earlier patch-sets because only OMAP3 boards use above ecc-scheme as OMAP3 SoC do not have ELM hardware engine. All the later platforms use 'OMAP_ECC_BCH8_CODE_HW' ecc-scheme. Both ecc-scheme are based on same algorithm of BCH8 ECC code except that: (1) 'OMAP_ECC_BCH8_CODE_HW_DETECTION_SW' uses /lib/bch.c software library for ecc-correction. (2) 'OMAP_ECC_BCH8_CODE_HW' uses ELM hardware engine for ecc-correction.
This regression affects following ecc-schemes, as ecc.layout logic is just copy-paste of each-other.
- 'OMAP_ECC_BCH4_CODE_HW_DETECTION_SW' and
- 'OMAP_ECC_BCH8_CODE_HW_DETECTION_SW'
OK
The rest of this cover letter (and most of the patch descriptions) describes the configurations and testing (which is all very good, don't get me wrong), with a lot of focus on things I don't follow (namely, U-Boot), but you omit many of the important details, like
- What is the regression? I honestly don't see a description of the actual regression. (I can read the code to intuit that, but what's the point of your pages of description if it doesn't mention this?) Please give some logical description of the problem
[Patch 1/2] of this series actually simplification of 'ecclayout->oobfree->offset' It does not fix the regression, but it's important to clean it, before any new feature additions.
By my reading, it is actually wrong, and therefore not just a clean up. Can you address my comment there about your indexing into the eccpos[] array?
Did you actually check (e.g., print out) the resulting ecclayout before and after your cleanup to make sure it's exactly the same? I'm looking at the oobfree[] values, which looked wrong to me.
[Patch 2/2] Actually fixes the regression for affected ecc-schemes. But "this patch also touches other ecc-schemes as the fix required refactoring common code, into ecc-scheme specific code."
- What are the effects of the regression? ECC bytes in the wrong place, so you see correction failures/corruption? JFFS2 failures? (The "oobfree" changes, for instance, should only affect something like JFFS2.)
As mentioned above [PATCH 1/2] which touches 'ecclayout->oobfree->offset' is a clean-up not affecting regression, but it should be taken in before any further feature or ecc-scheme changes.
That's fine; a fixup patch can come just before a bugfix, if that helps simplify the bufix.
Now, what is your preference? (a) Take [PATCH 1/2] ecc.layout clean-up separately. (b) Take ecc.layout as part of this series, but may be with different patch title.
I think we'll keep these patches together, but they both need to have better descriptions to tell what's actually going on.
- What Linux commit caused the regression?
- Should this patch set be marked for -stable? And for what kernel release? (the previous question would help answer this)
This regression was caused in commit b491da7233d5dc1a24d46ca1ad0209900329c5d0 mtd: nand: omap: clean-up ecc layout for BCH ecc schemes So, this should be applicable from 3.13.x+
Hence this patch series fixes those regressions, and tests complete
[...] snip
I'm preparing a 3.14 pull request soon, and since you seem committed to fixing and properly testing a known regression here, I'd like to see this go in. But given the late timing and the unanswered questions, I think it's unlikely to go in -rc1. Perhaps I can send a later pull request if we can get it together properly.
Yes, if the above details are sufficient, then I'll: (1) re-word [PATCH 1/2] commit log & title to mention that it's a clean-up (2) Add above description and details of regression into commit log of [PATCH 2/2] (3) Include your other comments on individual patches.
That's a good plan. But please, before sending another version, can you at least address my comment on patch 1 about the eccpos[] indexing you are using? It doesn't look right to me.
Also, can you please print out your full layout (eccpos, eccbytes, oobavail, and oobfree) and check it?
So I'd like to first of all get a few answers to my questions, and then I think we might want to refresh this patch series with a little better commit messages and perhaps a separate documentation file (not necessarily in the same patch series, as the fix is more urgent).
Yes, I plan to get some documentation in kernel docs. But for now I do have a 'un-polished' page for OMAP NAND driver on TI wiki [3]. Please let me know if you would like something similar for kernel docs.
Thanks for the link. Some portion of that doc is probably useful to include in Linux eventually.
I'm happy that you are scrutinizing each patch and their commit logs wisely. Thanks much for that. This helps me and probably everyone else to tighten-up so that only good quality patch-sets go in. This eventually will make the sub-system and the drivers more stable for long-term.
Also, if you don't mind, I'll still continue to cross-post this particular Patch-series on both u-boot and kernel mailing lists just to maintain continuity, as I know there are lot of OMAP3 users which are following only one of the two lists.. (but I'm cutting down on the CC list). I'll avoid cross posting in future.
Cross-posting is OK with me if it's made a little more clear who is targeted. The restrictive U-Boot mailing list policy (which rejects non-member / too-large-CC-list email) diminishes the usefulness of cross-posting too.
But most of all, using "u-boot" in the subject of the cover letter email does not clearly convey that this is a Linux patch!
Brian
[1] For instance, rather than saying "the ECC layout doesn't match U-Boot", you should probably describe what the intended ECC layout should look like and show that Linux does not match it. Perhaps it's time for some better documentation, like Ezequiel stashed under Documentation/mtd/nand/ for pxa3xx. You could also take some cues from Huang's comments on ECC layouts, etc., in gpmi-nand.c.
[2] http://lists.denx.de/pipermail/u-boot/2013-December/168440.html [3] https://processors.wiki.ti.com/index.php/Linux_Core_NAND_User%27s_Guide (under construction)
I look forward to you addressing my last concerns and sending a good v2. Thanks!
Brian
participants (4)
-
Brian Norris
-
Gupta, Pekon
-
Pekon Gupta
-
Stefan Roese