[PATCH] mtd: nand: pxa3xx: Incorrect bitflip return on page read

Ravi Minnikanti

27 Apr 2024

6:15 p.m.

Once a page is read with a high bitflip count, all subsequent reads return the same bitflip value even though those pages have no bitflips. The max_bitflips variable is not reset to 0 between page reads.

This causes problems such as UBI incorrectly marking erase blocks as bad, and read failures.

Verified the change with both MTD reads and UBI. This change is in line with other NFC drivers.

Sample error log where a block is marked bad incorrectly:

ubi0: fixable bit-flip detected at PEB 125
ubi0: run torture test for PEB 125
ubi0: fixable bit-flip detected at PEB 125
ubi0 error: torture_peb: read problems on freshly erased PEB 125, must be bad
ubi0 error: erase_worker: failed to erase PEB 125, error -5
ubi0: mark PEB 125 as bad

Signed-off-by: rminnikanti rminnikanti@marvell.com
---
 drivers/mtd/nand/raw/pxa3xx_nand.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/mtd/nand/raw/pxa3xx_nand.c b/drivers/mtd/nand/raw/pxa3xx_nand.c
index 1d9a6d107b..d2a4faad56 100644
--- a/drivers/mtd/nand/raw/pxa3xx_nand.c
+++ b/drivers/mtd/nand/raw/pxa3xx_nand.c
@@ -800,6 +800,11 @@ static void prepare_start_command(struct pxa3xx_nand_info *info, int command)
 	info->ecc_err_cnt = 0;
 	info->ndcb3 = 0;
 	info->need_wait = 0;
+	/*
+	 * Reset max_bitflips to zero. Once command is complete,
+	 * max_bitflips for this READ is returned in ecc.read_page()
+	 */
+	info->max_bitflips = 0;
 
 	switch (command) {
 	case NAND_CMD_READ0:
-- 
2.17.1
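
As an illustration of the failure mode described in the commit message, here is a minimal, self-contained C sketch. It is not driver code: `fake_nand_info`, `fake_ecc_update` and `fake_read_page` are hypothetical names, and the sketch assumes the controller state keeps a running maximum of corrected bitflips that nothing clears between reads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-controller state (not the real struct). */
struct fake_nand_info {
	unsigned int max_bitflips;	/* highest corrected-bitflip count recorded */
};

/* Assumed accumulation rule: the recorded value only ever goes up. */
static void fake_ecc_update(struct fake_nand_info *info, unsigned int flips)
{
	assert(info != NULL);
	if (flips > info->max_bitflips)
		info->max_bitflips = flips;
}

/*
 * Model one page read and return the bitflip count reported to the caller.
 * reset_first != 0 models the patched behaviour (clear before the read).
 */
static unsigned int fake_read_page(struct fake_nand_info *info,
				   unsigned int flips_on_page, int reset_first)
{
	assert(info != NULL);
	if (reset_first)
		info->max_bitflips = 0;
	fake_ecc_update(info, flips_on_page);
	return info->max_bitflips;
}
```

With `reset_first` set to 0, reading a clean page after a page with 3 bitflips still reports 3, which is the kind of stale value that would make a freshly erased block look as if it had bitflips. With `reset_first` set to 1, the same sequence reports 3 and then 0.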

Chris Packham

29 Apr

6:22 p.m.

On Sun, Apr 28, 2024 at 4:15 AM Ravi Minnikanti rminnikanti@marvell.com wrote:

...

Looks good to me

Reviewed-by: Chris Packham judge.packham@gmail.com

Michael Nazzareno Trimarchi

6:59 p.m.

On Mon, Apr 29, 2024 at 6:22 PM Chris Packham judge.packham@gmail.com wrote:

...

On Sun, Apr 28, 2024 at 4:15 AM Ravi Minnikanti rminnikanti@marvell.com wrote:

...
> diff --git a/drivers/mtd/nand/raw/pxa3xx_nand.c b/drivers/mtd/nand/raw/pxa3xx_nand.c
> index 1d9a6d107b..d2a4faad56 100644
> --- a/drivers/mtd/nand/raw/pxa3xx_nand.c
> +++ b/drivers/mtd/nand/raw/pxa3xx_nand.c
> @@ -800,6 +800,11 @@ static void prepare_start_command(struct pxa3xx_nand_info *info, int command)
>  	info->ecc_err_cnt = 0;
>  	info->ndcb3 = 0;
>  	info->need_wait = 0;
> +	/*
> +	 * Reset max_bitflips to zero. Once command is complete,
> +	 * max_bitflips for this READ is returned in ecc.read_page()
> +	 */
> +	info->max_bitflips = 0;

Why shouldn't this be set to 0 in read_page instead of in prepare_start_command?

Michael

Ravi Minnikanti

30 Apr

6:25 a.m.

New subject: [EXTERNAL] Re: [PATCH] mtd: nand: pxa3xx: Incorrect bitflip return on page read

On 4/29/24 09:59, Michael Nazzareno Trimarchi wrote:

...

> Why shouldn't this be set to 0 in read_page instead of in prepare_start_command?
>
> Michael

ecc.read_page is invoked after the read command has executed. First chip->cmdfunc is executed with NAND_CMD_READ0, and then ecc.read_page is invoked to read the page from the buffer. So, by the time read_page is invoked, info->max_bitflips must already hold the bitflip count for that read.

Thanks, Ravi.
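
The ordering argument above can be sketched in a small, self-contained C model. All names here (`model_info`, `reset_site`, `model_cmdfunc_read0`, `model_read_page`, `model_read`) are hypothetical and only mimic the two-step sequence described in this reply: the command step records the ECC result, and the read_page step reports it. It is a sketch under those assumptions, not the driver's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Where the hypothetical model clears the counter. */
enum reset_site { RESET_IN_CMDFUNC, RESET_IN_READ_PAGE };

struct model_info {
	unsigned int max_bitflips;
};

/* Step 1: models chip->cmdfunc(NAND_CMD_READ0): run the read, record ECC result. */
static void model_cmdfunc_read0(struct model_info *info, enum reset_site site,
				unsigned int flips_on_page)
{
	assert(info != NULL);
	if (site == RESET_IN_CMDFUNC)
		info->max_bitflips = 0;	/* cleared before the command runs */
	if (flips_on_page > info->max_bitflips)
		info->max_bitflips = flips_on_page;
}

/* Step 2: models ecc.read_page(): report the bitflip count to the caller. */
static unsigned int model_read_page(struct model_info *info, enum reset_site site)
{
	assert(info != NULL);
	if (site == RESET_IN_READ_PAGE)
		info->max_bitflips = 0;	/* too late: discards this read's result */
	return info->max_bitflips;
}

/* Full read sequence in the order described above. */
static unsigned int model_read(struct model_info *info, enum reset_site site,
			       unsigned int flips_on_page)
{
	model_cmdfunc_read0(info, site, flips_on_page);
	return model_read_page(info, site);
}
```

In this model, clearing the counter in the read_page step makes a page with 4 bitflips report 0, because the value was already recorded by the command step. Clearing it before the command runs reports 4 for that page and 0 for a following clean page.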

Michael Nazzareno Trimarchi

6 May

9:35 a.m.

New subject: [EXTERNAL] Re: [PATCH] mtd: nand: pxa3xx: Incorrect bitflip return on page read

Hi Ravi

On Tue, Apr 30, 2024 at 6:25 AM Ravi Minnikanti rminnikanti@marvell.com wrote:

...

> ecc.read_page is invoked after the read command has executed. First chip->cmdfunc is executed with NAND_CMD_READ0, and then ecc.read_page is invoked to read the page from the buffer. So, by the time read_page is invoked, info->max_bitflips must already hold the bitflip count for that read.

All the other implementations handle this in a slightly different way. From what you said, the reset only needs to be done for the NAND_CMD_READ0 command, and that should be sufficient. Technically, it should be moved into the switch rather than done unconditionally.

Michael
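
The suggestion above, resetting only on the read command and letting that case fall through to the shared handling, can be sketched as follows. This is an illustrative model, not the driver: `demo_cmd`, `demo_info` and `demo_prepare_start_command` are hypothetical names, the enum values are placeholders rather than real NAND opcodes, and `use_ecc` is an assumed stand-in for whatever the shared case body does.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder command identifiers (not the real NAND opcode values). */
enum demo_cmd {
	DEMO_CMD_READ0,
	DEMO_CMD_READOOB,
	DEMO_CMD_PAGEPROG,
	DEMO_CMD_ERASE1
};

struct demo_info {
	unsigned int max_bitflips;
	int use_ecc;		/* assumed stand-in for the shared case body */
};

static void demo_prepare_start_command(struct demo_info *info, enum demo_cmd command)
{
	assert(info != NULL);
	info->use_ecc = 0;

	switch (command) {
	case DEMO_CMD_READ0:
		/* Only a page read starts a new bitflip measurement. */
		info->max_bitflips = 0;
		/* fall through */
	case DEMO_CMD_READOOB:
	case DEMO_CMD_PAGEPROG:
		info->use_ecc = 1;
		break;
	default:
		break;
	}
}
```

In this sketch a program or erase command leaves `max_bitflips` untouched, while a page read clears it and still runs the shared body through the fall-through.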

Ravi Minnikanti

7:32 p.m.

On 5/6/24 00:35, Michael Nazzareno Trimarchi wrote:

...

> All the other implementations handle this in a slightly different way. From what you said, the reset only needs to be done for the NAND_CMD_READ0 command, and that should be sufficient. Technically, it should be moved into the switch rather than done unconditionally.
>
> Michael

max_bitflips is not being reset to 0 across page reads. Once a page is read with a high bitflip count, all subsequent reads return the same bitflip value even though those pages have no bitflips.

This causes problems such as UBI incorrectly marking erase blocks as bad, and read failures.

Tested it with both MTD reads and UBI attach. This change is in line with other NFC drivers.

Sample error log where a block is marked bad incorrectly:

ubi0: fixable bit-flip detected at PEB 125
ubi0: run torture test for PEB 125
ubi0: fixable bit-flip detected at PEB 125
ubi0 error: torture_peb: read problems on freshly erased PEB 125, must be bad
ubi0 error: erase_worker: failed to erase PEB 125, error -5
ubi0: mark PEB 125 as bad

Signed-off-by: rminnikanti rminnikanti@marvell.com
---
 drivers/mtd/nand/raw/pxa3xx_nand.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/mtd/nand/raw/pxa3xx_nand.c b/drivers/mtd/nand/raw/pxa3xx_nand.c
index 1d9a6d107b..97f250483f 100644
--- a/drivers/mtd/nand/raw/pxa3xx_nand.c
+++ b/drivers/mtd/nand/raw/pxa3xx_nand.c
@@ -803,6 +803,11 @@ static void prepare_start_command(struct pxa3xx_nand_info *info, int command)
 
 	switch (command) {
 	case NAND_CMD_READ0:
+		/*
+		 * Reset max_bitflips to zero. Once command is complete,
+		 * max_bitflips for this READ is returned in ecc.read_page()
+		 */
+		info->max_bitflips = 0;
 	case NAND_CMD_READOOB:
 	case NAND_CMD_PAGEPROG:
 		if (!info->force_raw)
-- 
2.17.1

Thanks Michael. Fixed it.

Michael Nazzareno Trimarchi

8:28 p.m.

Hi Ravi

On Mon, May 6, 2024 at 7:33 PM Ravi Minnikanti rminnikanti@marvell.com wrote:

...

Acked-by: Michael Trimarchi michael@amarulasolutions.com

Ravi Minnikanti

21 May

4:30 p.m.

New subject: [EXTERNAL] Re: [PATCH] mtd: nand: pxa3xx: Incorrect bitflip return on page read

Hi,

Can you please merge this PR, if there are no more review comments?

Thanks, Ravi

On 5/6/24 11:28, Michael Nazzareno Trimarchi wrote:

...

Michael Nazzareno Trimarchi

10:15 p.m.

New subject: [EXTERNAL] Re: [PATCH] mtd: nand: pxa3xx: Incorrect bitflip return on page read

Hi Dario

Can you add this to the next pull?

Michael

On Tue, May 21, 2024, 4:31 PM Ravi Minnikanti rminnikanti@marvell.com wrote:

...

Ravi Minnikanti

16 Jun

12:24 p.m.

Can this be merged? Let me know if I missed any process.

Thanks, Ravi

On 5/21/24 13:15, Michael Nazzareno Trimarchi wrote:

...
