[U-Boot] [PATCH] sf: Set current flash bank to 0 in clean_bar()

The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
This is because the erase command will call write_bar()+clean_bar(), which will leave flash->bank_curr = 1 while the hardware BAR registers will be set to 0 through clean_bar(). The subsequent write will also trigger write_bar()+clean_bar(), but write_bar checks if the target bank == flash->bank_curr and if so, does NOT reconfigure the BAR in the SPI NOR. Since flash->bank_curr is still 1 and out of sync with the HW, the condition matches, BAR programming is skipped and write ends up at address 0x0, thus corrupting flash content.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@openedev.com Cc: Tom Rini trini@konsulko.com --- drivers/mtd/spi/spi_flash.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/mtd/spi/spi_flash.c b/drivers/mtd/spi/spi_flash.c index 2911729b28..1b61095859 100644 --- a/drivers/mtd/spi/spi_flash.c +++ b/drivers/mtd/spi/spi_flash.c @@ -128,6 +128,7 @@ static int clean_bar(struct spi_flash *flash) if (flash->bank_curr == 0) return 0; cmd = flash->bank_write_cmd; + flash->bank_curr = 0;
return spi_flash_write_common(flash, &cmd, 1, &bank_sel, 1); }

On 05/24/2018 09:58 PM, Marek Vasut wrote:
The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
This is because the erase command will call write_bar()+clean_bar(), which will leave flash->bank_curr = 1 while the hardware BAR registers will be set to 0 through clean_bar(). The subsequent write will also trigger write_bar()+clean_bar(), but write_bar checks if the target bank == flash->bank_curr and if so, does NOT reconfigure the BAR in the SPI NOR. Since flash->bank_curr is still 1 and out of sync with the HW, the condition matches, BAR programming is skipped and write ends up at address 0x0, thus corrupting flash content.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@openedev.com Cc: Tom Rini trini@konsulko.com
This is a critical fix and it's been broken for multiple releases, so bump.
drivers/mtd/spi/spi_flash.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/mtd/spi/spi_flash.c b/drivers/mtd/spi/spi_flash.c index 2911729b28..1b61095859 100644 --- a/drivers/mtd/spi/spi_flash.c +++ b/drivers/mtd/spi/spi_flash.c @@ -128,6 +128,7 @@ static int clean_bar(struct spi_flash *flash) if (flash->bank_curr == 0) return 0; cmd = flash->bank_write_cmd;
flash->bank_curr = 0;
return spi_flash_write_common(flash, &cmd, 1, &bank_sel, 1);
}

On Fri, May 25, 2018 at 1:28 AM, Marek Vasut marex@denx.de wrote:
The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
Why it would write it at 0x0
write ops -> write_bar -> identifies bank sel (here it's 1) -> write bank register -> flash->bank_curr
then actual data will written from 16MB offset and call clean_bar
Jagan.

On 06/04/2018 03:27 PM, Jagan Teki wrote:
On Fri, May 25, 2018 at 1:28 AM, Marek Vasut marex@denx.de wrote:
The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
Why it would write it at 0x0
write ops -> write_bar -> identifies bank sel (here it's 1) -> write bank register -> flash->bank_curr
then actual data will written from 16MB offset and call clean_bar
I am not sure what you're trying to say here, but maybe re-read the commit message again ? The erase cycle is needed before the write cycle to put the SPI NOR framework into inconsistent state, at which point it will corrupt the content of the flash at 0x0.

On Mon, Jun 4, 2018 at 7:05 PM, Marek Vasut marex@denx.de wrote:
On 06/04/2018 03:27 PM, Jagan Teki wrote:
On Fri, May 25, 2018 at 1:28 AM, Marek Vasut marex@denx.de wrote:
The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
What I'm trying to say is both erase and write ops we have write_bar to switch specific offset of flash based on the bank_sel and the same bank_sel is updated in flash->bank_curr. If there is an immediate read after we still computing bank_sel and do necessary ops. for each operation completion the clean_bar will switch the flash to 0x0 offset by writing bank_sel as 0
What is corrupting here? can you explain?

On 06/04/2018 03:49 PM, Jagan Teki wrote:
On Mon, Jun 4, 2018 at 7:05 PM, Marek Vasut marex@denx.de wrote:
On 06/04/2018 03:27 PM, Jagan Teki wrote:
On Fri, May 25, 2018 at 1:28 AM, Marek Vasut marex@denx.de wrote:
The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
What I'm trying to say is both erase and write ops we have write_bar to switch specific offset of flash based on the bank_sel and the same bank_sel is updated in flash->bank_curr. If there is an immediate read after we still computing bank_sel and do necessary ops. for each operation completion the clean_bar will switch the flash to 0x0 offset by writing bank_sel as 0
What is corrupting here? can you explain?
Clearly that is not what is happening, as explained in the commit message.
So maybe run that testcase on some system and observe what is actually happening ? Print the bank_curr and see how it goes out of sync maybe?

On Fri, May 25, 2018 at 1:28 AM, Marek Vasut marex@denx.de wrote:
The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
This is because the erase command will call write_bar()+clean_bar(), which will leave flash->bank_curr = 1 while the hardware BAR registers will be set to 0 through clean_bar(). The subsequent write will also trigger write_bar()+clean_bar(), but write_bar checks if the target bank == flash->bank_curr and if so, does NOT reconfigure the BAR in the SPI NOR. Since flash->bank_curr is still 1 and out of sync with the HW, the condition matches, BAR programming is skipped and write ends up at address 0x0, thus corrupting flash content.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@openedev.com Cc: Tom Rini trini@konsulko.com
Reviewed-by: Jagan Teki jagan@openedev.com
Applied to u-boot-spi/master

On 06/04/2018 08:03 PM, Jagan Teki wrote:
On Fri, May 25, 2018 at 1:28 AM, Marek Vasut marex@denx.de wrote:
The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
This is because the erase command will call write_bar()+clean_bar(), which will leave flash->bank_curr = 1 while the hardware BAR registers will be set to 0 through clean_bar(). The subsequent write will also trigger write_bar()+clean_bar(), but write_bar checks if the target bank == flash->bank_curr and if so, does NOT reconfigure the BAR in the SPI NOR. Since flash->bank_curr is still 1 and out of sync with the HW, the condition matches, BAR programming is skipped and write ends up at address 0x0, thus corrupting flash content.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@openedev.com Cc: Tom Rini trini@konsulko.com
Reviewed-by: Jagan Teki jagan@openedev.com
Applied to u-boot-spi/master
So I presume you see it now ?

On Tue, Jun 5, 2018 at 12:02 AM, Marek Vasut marex@denx.de wrote:
On 06/04/2018 08:03 PM, Jagan Teki wrote:
On Fri, May 25, 2018 at 1:28 AM, Marek Vasut marex@denx.de wrote:
The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
This is because the erase command will call write_bar()+clean_bar(), which will leave flash->bank_curr = 1 while the hardware BAR registers will be set to 0 through clean_bar(). The subsequent write will also trigger write_bar()+clean_bar(), but write_bar checks if the target bank == flash->bank_curr and if so, does NOT reconfigure the BAR in the SPI NOR. Since flash->bank_curr is still 1 and out of sync with the HW, the condition matches, BAR programming is skipped and write ends up at address 0x0, thus corrupting flash content.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@openedev.com Cc: Tom Rini trini@konsulko.com
Reviewed-by: Jagan Teki jagan@openedev.com
Applied to u-boot-spi/master
So I presume you see it now ?
Yes, I understand the issue, thanks.

On 06/04/2018 08:35 PM, Jagan Teki wrote:
On Tue, Jun 5, 2018 at 12:02 AM, Marek Vasut marex@denx.de wrote:
On 06/04/2018 08:03 PM, Jagan Teki wrote:
On Fri, May 25, 2018 at 1:28 AM, Marek Vasut marex@denx.de wrote:
The clean_bar() function resets the SPI NOR BAR register to 0, but does not set the flash->curr_bar to 0 , therefore those two can get out of sync, which could ultimatelly result in corrupted flash content.
The simplest test case is this:
=> mw 0x10000000 0x1234abcd 0x4000 => sf probe => sf erase 0x1000000 0x10000 => sf write 0x10000000 0x1000000 0x10000
=> sf probe ; sf read 0x12000000 0 0x10000 ; md 0x12000000
That is, erase a sector above the 16 MiB boundary and write it with random pre-configured data. What will actually happen without this patch is the sector will be erased, but the data will be written to BAR 0 offset 0x0 in the flash.
This is because the erase command will call write_bar()+clean_bar(), which will leave flash->bank_curr = 1 while the hardware BAR registers will be set to 0 through clean_bar(). The subsequent write will also trigger write_bar()+clean_bar(), but write_bar checks if the target bank == flash->bank_curr and if so, does NOT reconfigure the BAR in the SPI NOR. Since flash->bank_curr is still 1 and out of sync with the HW, the condition matches, BAR programming is skipped and write ends up at address 0x0, thus corrupting flash content.
Signed-off-by: Marek Vasut marex@denx.de Cc: Jagan Teki jagan@openedev.com Cc: Tom Rini trini@konsulko.com
Reviewed-by: Jagan Teki jagan@openedev.com
Applied to u-boot-spi/master
So I presume you see it now ?
Yes, I understand the issue, thanks.
Excellent
participants (3)
-
Jagan Teki
-
Marek Vasut
-
Marek Vasut