Re: [U-Boot] Bricked when trying to attach UBI

19 Dec 2012


Hi Andreas,
Andreas Bießmann wrote:
...
Hi Luca,
On 19.12.2012 16:56, Luca Ceresoli wrote:
...
Hi Andreas,
Andreas Bießmann wrote:
...
Creating 1 MTD partitions on "nand0":
0x000000100000-0x000010000000 : "mtd=3"
UBI: attaching mtd1 to ubi0
UBI: physical eraseblock size:   131072 bytes (128 KiB)
UBI: logical eraseblock size:    129024 bytes
UBI: smallest flash I/O unit:    2048
UBI: sub-page size:              512
UBI: VID header offset:          512 (aligned 512)
UBI: data offset:                2048
UBI error: ubi_wl_init_scan: no enough physical eraseblocks (0, need 1)
Now the device is totally blocked, and power cycling does not change
the result.
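As a sanity check, the sizes in the attach log above are consistent with each other: the logical eraseblock size is the physical eraseblock size minus the data offset (131072 - 2048 = 129024), and the partition range 0x100000-0x10000000 holds 2040 eraseblocks of 128 KiB. A minimal sketch of that arithmetic follows; the struct and helper names are illustrative assumptions, not U-Boot or Linux definitions.

```c
#include <stdint.h>

/* Illustrative stand-in, NOT the real struct ubi_device. */
struct ubi_geometry {
	uint32_t peb_size;    /* physical eraseblock size, in bytes */
	uint32_t data_offset; /* offset of the LEB data inside a PEB, in bytes */
};

/* Logical eraseblock size: what is left of a PEB after the UBI headers. */
static uint32_t leb_size(const struct ubi_geometry *g)
{
	return g->peb_size - g->data_offset;
}

/* Number of whole PEBs in a partition spanning [start, end). */
static uint32_t peb_count(uint64_t start, uint64_t end, uint32_t peb_size)
{
	return (uint32_t)((end - start) / peb_size);
}
```

With the values from the log, leb_size() gives 129024 and peb_count(0x100000, 0x10000000, 131072) gives 2040.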
Have you tried increasing the malloc arena in U-Boot
(CONFIG_SYS_MALLOC_LEN)?
We have had errors like this before [1], [2] and [3], maybe others,
though with a different error message, but please give it a try. We know
UBI recovery needs some RAM and 1 MiB may not be enough.
Thanks for your suggestion.
Unfortunately this does not seem to be the cause of my problem: I tried
increasing my CONFIG_SYS_MALLOC_LEN in include/configs/dig297.h from
(1024 << 10), i.e. 1 MiB, to both (1024 << 12) and (1024 << 14), i.e.
4 MiB and 16 MiB, but it made no difference.
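For reference, the shift expressions above translate to arena sizes as follows. This is only a sketch of the arithmetic; malloc_len_bytes() and malloc_len_mib() are hypothetical helpers written for this illustration, not part of U-Boot.

```c
/* Hypothetical helper: bytes for a "(1024 << shift)" style
 * CONFIG_SYS_MALLOC_LEN value. */
static unsigned long malloc_len_bytes(unsigned int shift)
{
	return 1024UL << shift;
}

/* Same value expressed in MiB (1 MiB = 1 << 20 bytes). */
static unsigned long malloc_len_mib(unsigned int shift)
{
	return malloc_len_bytes(shift) >> 20;
}
```

Shifts of 10, 12 and 14 correspond to 1 MiB, 4 MiB and 16 MiB respectively, so the largest value tried was 16 times the original arena.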
Well, ok ... The malloc arena is always my first thought when I read about
problems with UBI in U-Boot.
Have you looked at the differences in drivers/mtd/ubi/ between your U-Boot
and Linux trees? Maybe you can see something obviously different in
ubi_wl_init_scan()?
I did a few days ago, but I have now double-checked as you suggested.
Indeed there is an important difference: attach_by_scanning() (build.c)
calls ubi_wl_init_scan() and ubi_eba_init_scan() just like Linux does,
but in swapped order!
This swap dates back to:

commit d63894654df72b010de2abb4b3f07d0d755f65b6
Author: Holger Brunck <holger.brunck@keymile.com>
Date:   Mon Oct 10 13:08:19 2011 +0200

     UBI: init eba tables before wl when attaching a device

     This fixes that u-boot gets stuck when a bitflip was detected
     during "ubi part <ubi_device>". If a bitflip was detected UBI tries
     to copy the PEB to a different place. This needs that the eba table
     are initialized, but this was done after the wear levelling worker
     detects the bitflip. So changes the initialisation of these two
     tasks in u-boot.

     This is a u-boot specific patch and not needed in the linux layer,
     because due to commit 1b1f9a9d00447d
     UBI: Ensure that "background thread" operations are really executed
     we schedule these tasks in place and not as in linux after the inital
     task which schedule this new task is finished.

     Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
     cc: Stefan Roese <sr@denx.de>
     Signed-off-by: Stefan Roese <sr@denx.de>

I tried reverting that commit and... surprise! U-Boot can now attach UBI
and boot properly!
But the cited commit actually fixed a bug that bit our board a few
months back, so it should not be reverted without thinking twice. Now
it has apparently introduced another bug. :-(
I'm Cc:ing the commit author for comments.
Nonetheless, I have evidence of a different behaviour between U-Boot
and Linux even before the two swapped functions are called.
What attach_by_scanning() does in Linux is (abbreviated):
static int attach_by_scanning(struct ubi_device *ubi)
{
        si = ubi_scan(ubi);
        ...fill ubi->some_fields...;
        err = ubi_read_volume_table(ubi, si);
        /* MARK */
        err = ubi_eba_init_scan(ubi, si); /* swapped in U-Boot */
        err = ubi_wl_init_scan(ubi, si);  /* swapped in U-Boot */
        ubi_scan_destroy_si(si);
        return 0;
}
Note the two swapped calls.
At MARK, I printed some of the peb counters in *ubi, and I got
different results for ubi->avail_pebs between U-Boot and Linux:
U-Boot: UBI: POST_TBL: rsvd=2018, avail=21, beb_rsvd_{pebs,level}=0,0
Linux:  UBI: POST_TBL: rsvd=2018, avail=22, beb_rsvd_{pebs,level}=0,0
The printed values were equal before calling ubi_read_volume_table().
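A debug print of this kind could look roughly like the sketch below. It is not the exact instrumentation used: struct peb_counters is a stand-in whose field names are assumed to mirror the corresponding counters in struct ubi_device, and format_post_tbl() and avail_delta() are hypothetical helpers written only to reproduce the output format shown above.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the few counters of interest; NOT the real struct ubi_device. */
struct peb_counters {
	int rsvd_pebs;
	int avail_pebs;
	int beb_rsvd_pebs;
	int beb_rsvd_level;
};

/* Format the counters in the same layout as the POST_TBL lines above.
 * Returns what snprintf() returns. */
static int format_post_tbl(char *buf, size_t len, const struct peb_counters *c)
{
	return snprintf(buf, len,
			"UBI: POST_TBL: rsvd=%d, avail=%d, beb_rsvd_{pebs,level}=%d,%d",
			c->rsvd_pebs, c->avail_pebs,
			c->beb_rsvd_pebs, c->beb_rsvd_level);
}

/* Difference in available PEBs between two snapshots (b minus a). */
static int avail_delta(const struct peb_counters *a,
		       const struct peb_counters *b)
{
	return b->avail_pebs - a->avail_pebs;
}
```

Fed with the two sets of values above, avail_delta() reports that Linux sees exactly one more available PEB than U-Boot at the same point.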
I have no idea where this difference comes from, nor whether it can
cause my troubles.
I will investigate further tomorrow by looking into ubi_read_volume_table().
Luca