
Hi Andreas,
Andreas Bießmann wrote:
Hi Luca,
On 19.12.2012 16:56, Luca Ceresoli wrote:
Hi Andreas,
Andreas Bießmann wrote: ...
Creating 1 MTD partitions on "nand0":
0x000000100000-0x000010000000 : "mtd=3"
UBI: attaching mtd1 to ubi0
UBI: physical eraseblock size: 131072 bytes (128 KiB)
UBI: logical eraseblock size: 129024 bytes
UBI: smallest flash I/O unit: 2048
UBI: sub-page size: 512
UBI: VID header offset: 512 (aligned 512)
UBI: data offset: 2048
UBI error: ubi_wl_init_scan: no enough physical eraseblocks (0, need 1)
Now the device is totally blocked, and power cycling does not change the result.
Have you tried increasing the malloc arena in U-Boot (CONFIG_SYS_MALLOC_LEN)? We have had errors like this before [1], [2] and [3], and maybe others, admittedly with a different error message, but please give it a try. We know UBI recovery needs some RAM, and 1 MiB may not be enough.
Thanks for your suggestion.
Unfortunately this does not seem to be the cause of my problem: I tried increasing my CONFIG_SYS_MALLOC_LEN in include/configs/dig297.h from (1024 << 10) to both (1024 << 12) and (1024 << 14), but without any difference.
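For reference, the three values tried above work out to 1 MiB, 4 MiB and 16 MiB. A minimal sketch of that arithmetic follows; the helper names malloc_len_mib() and print_malloc_lens() are made up for illustration and are not part of U-Boot.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a U-Boot function): evaluate (1024 << shift),
 * the form used for CONFIG_SYS_MALLOC_LEN above, and return it in MiB.
 */
unsigned long malloc_len_mib(unsigned int shift)
{
	unsigned long bytes = 1024UL << shift;

	return bytes >> 20; /* 1 MiB = 2^20 bytes */
}

/* Print the three arena sizes that were tried. */
void print_malloc_lens(void)
{
	static const unsigned int shifts[] = { 10, 12, 14 };
	size_t i;

	for (i = 0; i < sizeof(shifts) / sizeof(shifts[0]); i++)
		printf("(1024 << %u) = %lu MiB\n",
		       shifts[i], malloc_len_mib(shifts[i]));
}
```

So the largest setting gave the allocator sixteen times the original arena, and the attach still failed in the same way.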
Well, ok ... The malloc arena is always my first thought when I read about problems with UBI in U-Boot. Have you compared drivers/mtd/ubi/ in your U-Boot and Linux trees? Maybe you can spot an obvious difference in ubi_wl_init_scan()?
I did a few days ago, but I have double-checked now as you suggested. Indeed there is an important difference: attach_by_scanning() (build.c) calls ubi_wl_init_scan() and ubi_eba_init_scan() just like Linux does, but in swapped order!
This swap dates back to:
commit d63894654df72b010de2abb4b3f07d0d755f65b6
Author: Holger Brunck <holger.brunck@keymile.com>
Date: Mon Oct 10 13:08:19 2011 +0200
UBI: init eba tables before wl when attaching a device
This fixes that u-boot gets stuck when a bitflip was detected during "ubi part <ubi_device>". If a bitflip was detected UBI tries to copy the PEB to a different place. This needs that the eba table are initialized, but this was done after the wear levelling worker detects the bitflip. So changes the initialisation of these two tasks in u-boot.
This is a u-boot specific patch and not needed in the linux layer, because due to commit 1b1f9a9d00447d UBI: Ensure that "background thread" operations are really executed we schedule these tasks in place and not as in linux after the inital task which schedule this new task is finished.
Signed-off-by: Holger Brunck <holger.brunck@keymile.com>
cc: Stefan Roese <sr@denx.de>
Signed-off-by: Stefan Roese <sr@denx.de>
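The ordering dependency described in that commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: every type and function below is hypothetical and merely borrows UBI naming, the real failure was a hang rather than an error return, and the model does not reproduce the "no enough physical eraseblocks" error reported in this thread.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy state, hypothetical: not the real struct ubi_device. */
struct toy_ubi {
	bool eba_ready;    /* set once the EBA tables are initialized */
	bool bitflip_seen; /* a PEB with a bitflip was found while scanning */
};

/* Copying a PEB elsewhere needs the EBA tables, per the commit message. */
int toy_move_peb(const struct toy_ubi *ubi)
{
	return ubi->eba_ready ? 0 : -1;
}

int toy_eba_init(struct toy_ubi *ubi)
{
	ubi->eba_ready = true;
	return 0;
}

/*
 * Assumption taken from the commit message: in U-Boot the scheduled work
 * runs in place, so a detected bitflip triggers the PEB move right here.
 */
int toy_wl_init(struct toy_ubi *ubi)
{
	if (ubi->bitflip_seen)
		return toy_move_peb(ubi);
	return 0;
}

/* Run the two init steps in either order and report the result. */
int toy_attach(bool eba_first, bool bitflip)
{
	struct toy_ubi ubi = { .eba_ready = false, .bitflip_seen = bitflip };
	int err;

	if (eba_first) {
		err = toy_eba_init(&ubi);
		if (!err)
			err = toy_wl_init(&ubi);
	} else {
		err = toy_wl_init(&ubi);
		if (!err)
			err = toy_eba_init(&ubi);
	}
	return err;
}
```

In this model, toy_attach(false, true) fails while toy_attach(true, true) succeeds, which is the situation the commit set out to fix. Without a bitflip both orders succeed, so the model says nothing about why the EBA-first order breaks on this board.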
I tried reverting that commit and... surprise! U-Boot can now attach UBI and boot properly!
But the cited commit actually fixed a bug that bit our board a few months back, so it should not be reverted without thinking twice. Now it has apparently introduced another bug. :-(
I'm Cc:ing the commit author for comments.
Nonetheless, I have evidence of a different behaviour between U-Boot and Linux even before the two swapped functions are called.
What attach_by_scanning() does in Linux is (abbreviated):
static int attach_by_scanning(struct ubi_device *ubi)
{
        si = ubi_scan(ubi);
        ...fill ubi->some_fields...;
        err = ubi_read_volume_table(ubi, si);
        /* MARK */
        err = ubi_eba_init_scan(ubi, si); /* swapped in U-Boot */
        err = ubi_wl_init_scan(ubi, si); /* swapped in U-Boot */
        ubi_scan_destroy_si(si);
        return 0;
}
See the two swapped calls.
At MARK, I printed some of the PEB counters in *ubi, and I got different results for ubi->avail_pebs between U-Boot and Linux:
U-Boot: UBI: POST_TBL: rsvd=2018, avail=21, beb_rsvd_{pebs,level}=0,0
Linux:  UBI: POST_TBL: rsvd=2018, avail=22, beb_rsvd_{pebs,level}=0,0
The printed values were equal before calling ubi_read_volume_table(). I have no idea where this difference comes from, or whether it can cause my troubles. I will investigate further tomorrow, looking into ubi_read_volume_table().
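As a sanity check on these numbers, the sketch below redoes the arithmetic using only values copied from the attach log and the POST_TBL printouts in this mail. The function names are made up for illustration. Whether rsvd + avail is really expected to match the number of PEBs in the partition is an assumption here, not something verified against the UBI code.

```c
#include <stdio.h>

/* Values copied from the attach log quoted earlier in this mail. */
#define PART_START  0x000000100000ULL
#define PART_END    0x000010000000ULL
#define PEB_SIZE    131072ULL /* physical eraseblock size */
#define DATA_OFFSET 2048ULL   /* "data offset" from the log */

/* Number of physical eraseblocks covered by the MTD partition. */
unsigned long long part_pebs(void)
{
	return (PART_END - PART_START) / PEB_SIZE;
}

/* Logical eraseblock size: PEB size minus the space before the data. */
unsigned long long leb_size(void)
{
	return PEB_SIZE - DATA_OFFSET;
}

/* PEBs accounted for by the two counters printed at MARK. */
unsigned long long accounted_pebs(unsigned long long rsvd,
				  unsigned long long avail)
{
	return rsvd + avail;
}

/* Hypothetical helper that prints the comparison. */
void print_peb_summary(void)
{
	printf("partition PEBs: %llu\n", part_pebs());
	printf("LEB size:       %llu\n", leb_size());
	printf("U-Boot total:   %llu\n", accounted_pebs(2018, 21));
	printf("Linux total:    %llu\n", accounted_pebs(2018, 22));
}
```

The partition holds 2040 PEBs and the LEB size comes out as 129024 bytes, matching the log. The Linux counters add up to 2040 and the U-Boot counters to 2039, so it looks as if U-Boot ends up with one usable PEB fewer after ubi_read_volume_table(). That is only a hint about where to look, not a diagnosis.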
Luca