[PATCH 0/4] fs: ubifs: Fix crash and add safeguards

Hei hei,
filesystem handling is different in U-Boot and beyond that UBI/UBIFS is different from other filesystems in U-Boot. There's UBI and UBIFS code ported from Linux (quite old already now, maybe someone wants to update that?), and there's "glue code" or "wrapper code" to interface with U-Boot scripts, commands, and filesystem handling. The fixes and improvements in this patch series are for this U-Boot specific glue code.
I'm no filesystem expert, but after days of debugging I'm quite sure the bug is in U-Boot since UBIFS support was added in 2009, and it was repeated in 2015 when generic filesystem support for UBIFS was added. So please review carefully!
The crashes were not easily reproducible, only with boards using the old distroboot _and_ a boot script inspired by (but not equal to) the one proposed by RAUC [1], which basically boils down to:
ubifsmount ubi0:boot (from distroboot) test -e (from distroboot) ubifsmount ubi0:rootfs1 (this time from the boot script, triggering a ubifs_umount)
Crashes can be triggered more easily, if patch order is changed and patch 2 (resetting pointers to NULL after free) comes first, or if patch 2 is applied on its own only.
The fix is in the first patch, and on my boards I see no crashes anymore. I also tested all kinds of combinations of calling `ubi part`, `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, `load`, `size`, and `test -e` and got no crashes anymore after the fix.
The three additional patches (2 to 4) are more or less safeguards and improvements for the future, and come from me trying and my debugging code done on the way, more or less optional, but I think nice to have.
Greets Alex
[1] https://github.com/rauc/rauc/blob/master/contrib/uboot.sh
Alexander Dahl (4): fs: ubifs: Fix memleak and double free in u-boot wrapper functions fs: ubifs: Set pointers to NULL after free fs: ubifs: Make k(z)alloc/kfree symmetric fs: ubifs: Add volume mounted check
fs/ubifs/super.c | 8 ++++++-- fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ 2 files changed, 25 insertions(+), 14 deletions(-)
base-commit: 65fbdab27224ee3943a89496b21862db83c34da2

When mounting ubifs e.g. through command 'ubifsmount' one global static superblock 'ubifs_sb' is used _and_ the requested volume is opened (like in Linux). The pointer returned by 'ubifs_open_volume()' is stored in that superblock struct and freed later on cmd 'ubifsumount' or another call to 'ubifsmount' with a different volume, through ubifs_umount() and ubi_close_volume().
In ubifs_ls(), ubifs_exists(), ubifs_size(), and ubifs_read() the volume was opened again, which is technically no problem with regard to refcounting, but here the still valid pointer in sb was overwritten, leading to a memory leak. Even worse, when using one of those functions and calling ubifsumount later, ubi_close_volume() was called again but now on an already freed pointer, leading to a double free. This actually crashed with different invalid memory accesses on a board using the old distro boot and a rather long script handling RAUC updates.
Example:
> ubi part UBI > ubifsmount ubi0:boot > test -e ubi ubi0:boot /boot.scr.uimg > ubifsumount
The ubifs specific commands 'ubifsls' and 'ubifsload' check for a mounted volume by themselves, for the generic fs variants 'ls', 'load', (and 'size', and 'test -e') this is covered by special ubifs handling in fs_set_blk_dev() and deeper down blk_get_device_part_str() then. So for ubifs_ls(), ubifs_exists(), ubifs_size(), and ubifs_read() we can be sure the volume is opened and the necessary struct pointer in sb is valid, so it is not needed to open volume again.
Fixes: 9eefe2a2b37 ("UBIFS: Implement read-only UBIFS support in U-Boot") Fixes: 29cc5bcadfc ("ubifs: Add functions for generic fs use") Signed-off-by: Alexander Dahl ada@thorsis.com --- fs/ubifs/ubifs.c | 12 ------------ 1 file changed, 12 deletions(-)
diff --git a/fs/ubifs/ubifs.c b/fs/ubifs/ubifs.c index 75de01e95f7..58b3f3659c1 100644 --- a/fs/ubifs/ubifs.c +++ b/fs/ubifs/ubifs.c @@ -566,7 +566,6 @@ int ubifs_set_blk_dev(struct blk_desc *rbdd, struct disk_partition *info)
int ubifs_ls(const char *filename) { - struct ubifs_info *c = ubifs_sb->s_fs_info; struct file *file; struct dentry *dentry; struct inode *dir; @@ -574,7 +573,6 @@ int ubifs_ls(const char *filename) unsigned long inum; int ret = 0;
- c->ubi = ubi_open_volume(c->vi.ubi_num, c->vi.vol_id, UBI_READONLY); inum = ubifs_findfile(ubifs_sb, (char *)filename); if (!inum) { ret = -1; @@ -608,31 +606,24 @@ out_mem: free(dir);
out: - ubi_close_volume(c->ubi); return ret; }
int ubifs_exists(const char *filename) { - struct ubifs_info *c = ubifs_sb->s_fs_info; unsigned long inum;
- c->ubi = ubi_open_volume(c->vi.ubi_num, c->vi.vol_id, UBI_READONLY); inum = ubifs_findfile(ubifs_sb, (char *)filename); - ubi_close_volume(c->ubi);
return inum != 0; }
int ubifs_size(const char *filename, loff_t *size) { - struct ubifs_info *c = ubifs_sb->s_fs_info; unsigned long inum; struct inode *inode; int err = 0;
- c->ubi = ubi_open_volume(c->vi.ubi_num, c->vi.vol_id, UBI_READONLY); - inum = ubifs_findfile(ubifs_sb, (char *)filename); if (!inum) { err = -1; @@ -650,7 +641,6 @@ int ubifs_size(const char *filename, loff_t *size)
ubifs_iput(inode); out: - ubi_close_volume(c->ubi); return err; }
@@ -845,7 +835,6 @@ int ubifs_read(const char *filename, void *buf, loff_t offset, return -1; }
- c->ubi = ubi_open_volume(c->vi.ubi_num, c->vi.vol_id, UBI_READONLY); /* ubifs_findfile will resolve symlinks, so we know that we get * the real file here */ inum = ubifs_findfile(ubifs_sb, (char *)filename); @@ -909,7 +898,6 @@ put_inode: ubifs_iput(inode);
out: - ubi_close_volume(c->ubi); return err; }

Global superblock pointer 'ubifs_sb' and volume pointer 'ubi' of type struct ubi_volume_desc in private member sb->s_fs_info of type struct ubifs_info, can be allocated and freed at runtime, and allocated and freed again, depending which console or script commands are run. In some cases ubifs_sb is even tested to determine if the filesystem is mounted. Reset those pointers to NULL after free to clearly mark them as not valid. This avoids potential double free on invalid pointers.
(The ubifs_sb pointer was already reset, but that statement was moved now to directly after the free() to make it easier to understand.)
Signed-off-by: Alexander Dahl ada@thorsis.com --- fs/ubifs/super.c | 4 ++++ fs/ubifs/ubifs.c | 1 - 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c index 788f88f0495..321c9b9351e 100644 --- a/fs/ubifs/super.c +++ b/fs/ubifs/super.c @@ -1758,11 +1758,13 @@ void ubifs_umount(struct ubifs_info *c) ubifs_debugging_exit(c); #ifdef __UBOOT__ ubi_close_volume(c->ubi); + c->ubi = NULL; mutex_unlock(&c->umount_mutex); /* Finally free U-Boot's global copy of superblock */ if (ubifs_sb != NULL) { free(ubifs_sb->s_fs_info); free(ubifs_sb); + ubifs_sb = NULL; } #endif } @@ -2061,6 +2063,7 @@ static void ubifs_put_super(struct super_block *sb) #ifndef __UBOOT__ bdi_destroy(&c->bdi); ubi_close_volume(c->ubi); + c->ubi = NULL; mutex_unlock(&c->umount_mutex); #endif } @@ -2340,6 +2343,7 @@ out_bdi: out_close: #endif ubi_close_volume(c->ubi); + c->ubi = NULL; out: return err; } diff --git a/fs/ubifs/ubifs.c b/fs/ubifs/ubifs.c index 58b3f3659c1..506e29958c3 100644 --- a/fs/ubifs/ubifs.c +++ b/fs/ubifs/ubifs.c @@ -928,6 +928,5 @@ void uboot_ubifs_umount(void) printf("Unmounting UBIFS volume %s!\n", ((struct ubifs_info *)(ubifs_sb->s_fs_info))->vi.name); ubifs_umount(ubifs_sb->s_fs_info); - ubifs_sb = NULL; } }

Although kfree() is in fact only a slim wrapper to free() in U-Boot, use kfree() here, because those structs where allocated with kalloc() or kzalloc().
Signed-off-by: Alexander Dahl ada@thorsis.com --- fs/ubifs/super.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c index 321c9b9351e..e2d988790a3 100644 --- a/fs/ubifs/super.c +++ b/fs/ubifs/super.c @@ -1762,8 +1762,8 @@ void ubifs_umount(struct ubifs_info *c) mutex_unlock(&c->umount_mutex); /* Finally free U-Boot's global copy of superblock */ if (ubifs_sb != NULL) { - free(ubifs_sb->s_fs_info); - free(ubifs_sb); + kfree(ubifs_sb->s_fs_info); + kfree(ubifs_sb); ubifs_sb = NULL; } #endif

Safety guard in the U-Boot filesystem glue code, because these functions are called from different parts of the codebase. For generic filesystem handling this should have been checked in blk_get_device_part_str() already. Commands from cmd/ubifs.c should also check this before calling those functions, but you never know?!
Signed-off-by: Alexander Dahl ada@thorsis.com --- fs/ubifs/ubifs.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
diff --git a/fs/ubifs/ubifs.c b/fs/ubifs/ubifs.c index 506e29958c3..9722cc440ab 100644 --- a/fs/ubifs/ubifs.c +++ b/fs/ubifs/ubifs.c @@ -573,6 +573,11 @@ int ubifs_ls(const char *filename) unsigned long inum; int ret = 0;
+ if (!ubifs_is_mounted()) { + debug("UBIFS not mounted, use ubifsmount to mount volume first!\n"); + return -1; + } + inum = ubifs_findfile(ubifs_sb, (char *)filename); if (!inum) { ret = -1; @@ -613,6 +618,11 @@ int ubifs_exists(const char *filename) { unsigned long inum;
+ if (!ubifs_is_mounted()) { + debug("UBIFS not mounted, use ubifsmount to mount volume first!\n"); + return -1; + } + inum = ubifs_findfile(ubifs_sb, (char *)filename);
return inum != 0; @@ -624,6 +634,11 @@ int ubifs_size(const char *filename, loff_t *size) struct inode *inode; int err = 0;
+ if (!ubifs_is_mounted()) { + debug("UBIFS not mounted, use ubifsmount to mount volume first!\n"); + return -1; + } + inum = ubifs_findfile(ubifs_sb, (char *)filename); if (!inum) { err = -1; @@ -827,6 +842,11 @@ int ubifs_read(const char *filename, void *buf, loff_t offset, int count; int last_block_size = 0;
+ if (!ubifs_is_mounted()) { + debug("UBIFS not mounted, use ubifsmount to mount volume first!\n"); + return -1; + } + *actread = 0;
if (offset & (PAGE_SIZE - 1)) {

Hello Alexander,
On 03.07.24 12:12, Alexander Dahl wrote:
Hei hei,
filesystem handling is different in U-Boot and beyond that UBI/UBIFS is different from other filesystems in U-Boot. There's UBI and UBIFS code ported from Linux (quite old already now, maybe someone wants to update that?), and there's "glue code" or "wrapper code" to interface with U-Boot scripts, commands, and filesystem handling. The fixes and improvements in this patch series are for this U-Boot specific glue code.
Yes, the linux base is very old ... patches are welcome!
And for me it is not that easy, as I do not have a hardware with current mainline U-Boot running on it... I want to update a hardware I have to current mainline, but I had no time yet for it...
I'm no filesystem expert, but after days of debugging I'm quite sure the bug is in U-Boot since UBIFS support was added in 2009, and it was repeated in 2015 when generic filesystem support for UBIFS was added. So please review carefully!
Which bug?
The crashes were not easily reproducible, only with boards using the old distroboot _and_ a boot script inspired by (but not equal to) the one proposed by RAUC [1], which basically boils down to:
ubifsmount ubi0:boot (from distroboot) test -e (from distroboot) ubifsmount ubi0:rootfs1 (this time from the boot script, triggering a ubifs_umount)
So, you have a special sequence you execute to trigger the bug, good!
In special 2 ubifsmount in a row... may not often needed for booting! (I ask me, why that is needed? Boottime is not good than...)
BTW: Is this really a good bootcmd in [1] as on *every* boot your Environment is saved? This is not good for lifetime of your storage device ... why not using bootcounter?
Crashes can be triggered more easily, if patch order is changed and patch 2 (resetting pointers to NULL after free) comes first, or if patch 2 is applied on its own only.
Hmm...
The fix is in the first patch, and on my boards I see no crashes anymore. I also tested all kinds of combinations of calling `ubi part`, `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, `load`, `size`, and `test -e` and got no crashes anymore after the fix.
That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
On what hardware do you test? Is it in mainline?
The three additional patches (2 to 4) are more or less safeguards and improvements for the future, and come from me trying and my debugging code done on the way, more or less optional, but I think nice to have.
I will look at them .. but give me some time, as I am in holidays the next 2 weeks ... Hmm.. and it would be good to get some Tested-by from people with hardware...
bye, Heiko
Greets Alex
[1] https://github.com/rauc/rauc/blob/master/contrib/uboot.sh
Alexander Dahl (4): fs: ubifs: Fix memleak and double free in u-boot wrapper functions fs: ubifs: Set pointers to NULL after free fs: ubifs: Make k(z)alloc/kfree symmetric fs: ubifs: Add volume mounted check
fs/ubifs/super.c | 8 ++++++-- fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ 2 files changed, 25 insertions(+), 14 deletions(-)
base-commit: 65fbdab27224ee3943a89496b21862db83c34da2

Hello Heiko,
Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 03.07.24 12:12, Alexander Dahl wrote:
Hei hei,
filesystem handling is different in U-Boot and beyond that UBI/UBIFS is different from other filesystems in U-Boot. There's UBI and UBIFS code ported from Linux (quite old already now, maybe someone wants to update that?), and there's "glue code" or "wrapper code" to interface with U-Boot scripts, commands, and filesystem handling. The fixes and improvements in this patch series are for this U-Boot specific glue code.
Yes, the linux base is very old ... patches are welcome!
The last sync was back in 2015 from linux v4.2, there were 800+ changes to ubi/ubifs in Linux since then. :-/
And for me it is not that easy, as I do not have a hardware with current mainline U-Boot running on it... I want to update a hardware I have to current mainline, but I had no time yet for it...
Besides the custom hardware here, I used Microchip SAM9X60-Curiosity lately, which is coming with a raw NAND flash and can boot from it.
I'm no filesystem expert, but after days of debugging I'm quite sure the bug is in U-Boot since UBIFS support was added in 2009, and it was repeated in 2015 when generic filesystem support for UBIFS was added. So please review carefully!
Which bug?
The memory leak and double free fixed with patch 1 of the series.
The crashes were not easily reproducible, only with boards using the old distroboot _and_ a boot script inspired by (but not equal to) the one proposed by RAUC [1], which basically boils down to:
ubifsmount ubi0:boot (from distroboot) test -e (from distroboot) ubifsmount ubi0:rootfs1 (this time from the boot script, triggering a ubifs_umount)
So, you have a special sequence you execute to trigger the bug, good!
In special 2 ubifsmount in a row... may not often needed for booting! (I ask me, why that is needed? Boottime is not good than...)
Using distroboot with a script here. The script is in a separate UBI volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 however. So there is 'ubifsmount ubi0:boot' from distroboot and in the script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or rootfs2) later. ubifsmount calls ubifsumount internally if some volume is mounted already.
BTW: Is this really a good bootcmd in [1] as on *every* boot your Environment is saved? This is not good for lifetime of your storage device ... why not using bootcounter?
Well, I was not aware of bootcounter, but I had a look and the actual counter must be stored somewhere. Possible are:
- pmic → has no storage possibility on my board - rtc → soc internal only, volatile in the end (if battery is empty) - i2c eeprom → missing - spi flash → missing - filesystem → ends up on the flash - nvmem → no other nvmems present - syscon or some cpu register or sram → volatile
So none of these are possible in my case, I only have a raw NAND as storage and thus I have to use U-Boot env, which is stored in UBI here btw to not stress the flash too much.
I could investigate if it would be possible to let RAUC use the U-Boot bootcounter infrastructure, but currently RAUC depends on U-Boot environment variables for tracking boot attempts.
btw: documentation of bootcount is sparse, I only found the very short 'doc/README.bootcount' and it's not even migrated to recent U-Boot sphinx based docs. ;-)
But from what I understood the concept is the same, U-Boot counts something and Linux userspace has to reset it. The counter must be stored somewhere, for example in U-Boot env if no other storage is possible.
Crashes can be triggered more easily, if patch order is changed and patch 2 (resetting pointers to NULL after free) comes first, or if patch 2 is applied on its own only.
Hmm...
The fix is in the first patch, and on my boards I see no crashes anymore. I also tested all kinds of combinations of calling `ubi part`, `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, `load`, `size`, and `test -e` and got no crashes anymore after the fix.
That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
Oh it has, 'test -e' calls file_exists() which calls fs_exists() which ends up calling ubifs_exists() which is one of the functions causing an immediate memory leak, see patch 1.
On what hardware do you test? Is it in mainline?
Tested on custom hardware, but I'm confident it should be reproducible on any board using ubifs, especially after applying patch 2 resetting pointers of freed memory to NULL. This should trigger the bug with the simple sequence already described:
ubifsmount ubi0:anyvolume ls ubi ubi0:anyvolume / # (or load, or test -e, or size) ubifsumount
ubifsumount will call ubifs_umount() which calls ubi_close_volume(c->ubi), that pointer is either invalid leading to a double free inside of ubi_close_volume() and it will crash only in certain conditions or the pointer is NULL after applying patch 2 of the series, then ubi_close_volume() crashes right away with a NULL pointer exception.
Note: without patch 2 it very much depends on the sequence of commands, but after the first ubi_close_volume() triggered by ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later by the second ubi_close_volume() triggered by ubifs_umount(). If you do something in between those using the freed memory by something else again, the second ubi_close_volume() access might get corrupted data or access things outside of RAM. Patch 2 redirects this on a clean NULL pointer exception you can easily trigger.
In my case I got a pointer variable actually containing a string "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid pointer on the platform somewhere in RAM between 0x20000000 and 0x28000000 so it took me two days to realize it's not a pointer. ;-)
The three additional patches (2 to 4) are more or less safeguards and improvements for the future, and come from me trying and my debugging code done on the way, more or less optional, but I think nice to have.
I will look at them .. but give me some time, as I am in holidays the next 2 weeks ... Hmm.. and it would be good to get some Tested-by from people with hardware...
Take your time, no need to work in holidays. Would appreciate a Tested-by by anyone else though, maybe some of the raw NAND folks?
Greets Alex
bye, Heiko
Greets Alex
[1] https://github.com/rauc/rauc/blob/master/contrib/uboot.sh
Alexander Dahl (4): fs: ubifs: Fix memleak and double free in u-boot wrapper functions fs: ubifs: Set pointers to NULL after free fs: ubifs: Make k(z)alloc/kfree symmetric fs: ubifs: Add volume mounted check
fs/ubifs/super.c | 8 ++++++-- fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ 2 files changed, 25 insertions(+), 14 deletions(-)
base-commit: 65fbdab27224ee3943a89496b21862db83c34da2
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl:
Hello Heiko,
Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 03.07.24 12:12, Alexander Dahl wrote:
Hei hei,
filesystem handling is different in U-Boot and beyond that UBI/UBIFS is different from other filesystems in U-Boot. There's UBI and UBIFS code ported from Linux (quite old already now, maybe someone wants to update that?), and there's "glue code" or "wrapper code" to interface with U-Boot scripts, commands, and filesystem handling. The fixes and improvements in this patch series are for this U-Boot specific glue code.
Yes, the linux base is very old ... patches are welcome!
The last sync was back in 2015 from linux v4.2, there were 800+ changes to ubi/ubifs in Linux since then. :-/
And for me it is not that easy, as I do not have a hardware with current mainline U-Boot running on it... I want to update a hardware I have to current mainline, but I had no time yet for it...
Besides the custom hardware here, I used Microchip SAM9X60-Curiosity lately, which is coming with a raw NAND flash and can boot from it.
I'm no filesystem expert, but after days of debugging I'm quite sure the bug is in U-Boot since UBIFS support was added in 2009, and it was repeated in 2015 when generic filesystem support for UBIFS was added. So please review carefully!
Which bug?
The memory leak and double free fixed with patch 1 of the series.
The crashes were not easily reproducible, only with boards using the old distroboot _and_ a boot script inspired by (but not equal to) the one proposed by RAUC [1], which basically boils down to:
ubifsmount ubi0:boot (from distroboot) test -e (from distroboot) ubifsmount ubi0:rootfs1 (this time from the boot script, triggering a ubifs_umount)
So, you have a special sequence you execute to trigger the bug, good!
In special 2 ubifsmount in a row... may not often needed for booting! (I ask me, why that is needed? Boottime is not good than...)
Using distroboot with a script here. The script is in a separate UBI volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 however. So there is 'ubifsmount ubi0:boot' from distroboot and in the script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or rootfs2) later. ubifsmount calls ubifsumount internally if some volume is mounted already.
BTW: Is this really a good bootcmd in [1] as on *every* boot your Environment is saved? This is not good for lifetime of your storage device ... why not using bootcounter?
Well, I was not aware of bootcounter, but I had a look and the actual counter must be stored somewhere. Possible are:
- pmic → has no storage possibility on my board
- rtc → soc internal only, volatile in the end (if battery is empty)
- i2c eeprom → missing
- spi flash → missing
- filesystem → ends up on the flash
- nvmem → no other nvmems present
- syscon or some cpu register or sram → volatile
So none of these are possible in my case, I only have a raw NAND as storage and thus I have to use U-Boot env, which is stored in UBI here btw to not stress the flash too much.
I could investigate if it would be possible to let RAUC use the U-Boot bootcounter infrastructure, but currently RAUC depends on U-Boot environment variables for tracking boot attempts.
btw: documentation of bootcount is sparse, I only found the very short 'doc/README.bootcount' and it's not even migrated to recent U-Boot sphinx based docs. ;-)
But from what I understood the concept is the same, U-Boot counts something and Linux userspace has to reset it. The counter must be stored somewhere, for example in U-Boot env if no other storage is possible.
Crashes can be triggered more easily, if patch order is changed and patch 2 (resetting pointers to NULL after free) comes first, or if patch 2 is applied on its own only.
Hmm...
The fix is in the first patch, and on my boards I see no crashes anymore. I also tested all kinds of combinations of calling `ubi part`, `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, `load`, `size`, and `test -e` and got no crashes anymore after the fix.
That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
Oh it has, 'test -e' calls file_exists() which calls fs_exists() which ends up calling ubifs_exists() which is one of the functions causing an immediate memory leak, see patch 1.
On what hardware do you test? Is it in mainline?
Tested on custom hardware, but I'm confident it should be reproducible on any board using ubifs, especially after applying patch 2 resetting pointers of freed memory to NULL. This should trigger the bug with the simple sequence already described:
ubifsmount ubi0:anyvolume ls ubi ubi0:anyvolume / # (or load, or test -e, or size) ubifsumount
ubifsumount will call ubifs_umount() which calls ubi_close_volume(c->ubi), that pointer is either invalid leading to a double free inside of ubi_close_volume() and it will crash only in certain conditions or the pointer is NULL after applying patch 2 of the series, then ubi_close_volume() crashes right away with a NULL pointer exception.
Note: without patch 2 it very much depends on the sequence of commands, but after the first ubi_close_volume() triggered by ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later by the second ubi_close_volume() triggered by ubifs_umount(). If you do something in between those using the freed memory by something else again, the second ubi_close_volume() access might get corrupted data or access things outside of RAM. Patch 2 redirects this on a clean NULL pointer exception you can easily trigger.
In my case I got a pointer variable actually containing a string "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid pointer on the platform somewhere in RAM between 0x20000000 and 0x28000000 so it took me two days to realize it's not a pointer. ;-)
The three additional patches (2 to 4) are more or less safeguards and improvements for the future, and come from me trying and my debugging code done on the way, more or less optional, but I think nice to have.
I will look at them .. but give me some time, as I am in holidays the next 2 weeks ... Hmm.. and it would be good to get some Tested-by from people with hardware...
Take your time, no need to work in holidays. Would appreciate a Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
Greets Alex
Greets Alex
bye, Heiko
Greets Alex
[1] https://github.com/rauc/rauc/blob/master/contrib/uboot.sh
Alexander Dahl (4): fs: ubifs: Fix memleak and double free in u-boot wrapper functions fs: ubifs: Set pointers to NULL after free fs: ubifs: Make k(z)alloc/kfree symmetric fs: ubifs: Add volume mounted check
fs/ubifs/super.c | 8 ++++++-- fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ 2 files changed, 25 insertions(+), 14 deletions(-)
base-commit: 65fbdab27224ee3943a89496b21862db83c34da2
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

Hi all
On Thu, Aug 1, 2024 at 8:50 AM Alexander Dahl ada@thorsis.com wrote:
Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl:
Hello Heiko,
Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 03.07.24 12:12, Alexander Dahl wrote:
Hei hei,
filesystem handling is different in U-Boot and beyond that UBI/UBIFS is different from other filesystems in U-Boot. There's UBI and UBIFS code ported from Linux (quite old already now, maybe someone wants to update that?), and there's "glue code" or "wrapper code" to interface with U-Boot scripts, commands, and filesystem handling. The fixes and improvements in this patch series are for this U-Boot specific glue code.
Yes, the linux base is very old ... patches are welcome!
The last sync was back in 2015 from linux v4.2, there were 800+ changes to ubi/ubifs in Linux since then. :-/
And for me it is not that easy, as I do not have a hardware with current mainline U-Boot running on it... I want to update a hardware I have to current mainline, but I had no time yet for it...
Besides the custom hardware here, I used Microchip SAM9X60-Curiosity lately, which is coming with a raw NAND flash and can boot from it.
I read it now, I have one mainline board with ubifs running on it. If the patch are not get applied I will take a look today of the thread. Sync to newest version of the kernel it's a good idea. I will check if someone in the company can start this task
Michael
I'm no filesystem expert, but after days of debugging I'm quite sure the bug is in U-Boot since UBIFS support was added in 2009, and it was repeated in 2015 when generic filesystem support for UBIFS was added. So please review carefully!
Which bug?
The memory leak and double free fixed with patch 1 of the series.
The crashes were not easily reproducible, only with boards using the old distroboot _and_ a boot script inspired by (but not equal to) the one proposed by RAUC [1], which basically boils down to:
ubifsmount ubi0:boot (from distroboot) test -e (from distroboot) ubifsmount ubi0:rootfs1 (this time from the boot script, triggering a ubifs_umount)
So, you have a special sequence you execute to trigger the bug, good!
In special 2 ubifsmount in a row... may not often needed for booting! (I ask me, why that is needed? Boottime is not good than...)
Using distroboot with a script here. The script is in a separate UBI volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 however. So there is 'ubifsmount ubi0:boot' from distroboot and in the script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or rootfs2) later. ubifsmount calls ubifsumount internally if some volume is mounted already.
BTW: Is this really a good bootcmd in [1] as on *every* boot your Environment is saved? This is not good for lifetime of your storage device ... why not using bootcounter?
Well, I was not aware of bootcounter, but I had a look and the actual counter must be stored somewhere. Possible are:
- pmic → has no storage possibility on my board
- rtc → soc internal only, volatile in the end (if battery is empty)
- i2c eeprom → missing
- spi flash → missing
- filesystem → ends up on the flash
- nvmem → no other nvmems present
- syscon or some cpu register or sram → volatile
So none of these are possible in my case, I only have a raw NAND as storage and thus I have to use U-Boot env, which is stored in UBI here btw to not stress the flash too much.
I could investigate if it would be possible to let RAUC use the U-Boot bootcounter infrastructure, but currently RAUC depends on U-Boot environment variables for tracking boot attempts.
btw: documentation of bootcount is sparse, I only found the very short 'doc/README.bootcount' and it's not even migrated to recent U-Boot sphinx based docs. ;-)
But from what I understood the concept is the same, U-Boot counts something and Linux userspace has to reset it. The counter must be stored somewhere, for example in U-Boot env if no other storage is possible.
Crashes can be triggered more easily, if patch order is changed and patch 2 (resetting pointers to NULL after free) comes first, or if patch 2 is applied on its own only.
Hmm...
The fix is in the first patch, and on my boards I see no crashes anymore. I also tested all kinds of combinations of calling `ubi part`, `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, `load`, `size`, and `test -e` and got no crashes anymore after the fix.
That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
Oh it has, 'test -e' calls file_exists() which calls fs_exists() which ends up calling ubifs_exists() which is one of the functions causing an immediate memory leak, see patch 1.
On what hardware do you test? Is it in mainline?
Tested on custom hardware, but I'm confident it should be reproducible on any board using ubifs, especially after applying patch 2 resetting pointers of freed memory to NULL. This should trigger the bug with the simple sequence already described:
ubifsmount ubi0:anyvolume ls ubi ubi0:anyvolume / # (or load, or test -e, or size) ubifsumount
ubifsumount will call ubifs_umount() which calls ubi_close_volume(c->ubi), that pointer is either invalid leading to a double free inside of ubi_close_volume() and it will crash only in certain conditions or the pointer is NULL after applying patch 2 of the series, then ubi_close_volume() crashes right away with a NULL pointer exception.
Note: without patch 2 it very much depends on the sequence of commands, but after the first ubi_close_volume() triggered by ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later by the second ubi_close_volume() triggered by ubifs_umount(). If you do something in between those using the freed memory by something else again, the second ubi_close_volume() access might get corrupted data or access things outside of RAM. Patch 2 redirects this on a clean NULL pointer exception you can easily trigger.
In my case I got a pointer variable actually containing a string "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid pointer on the platform somewhere in RAM between 0x20000000 and 0x28000000 so it took me two days to realize it's not a pointer. ;-)
The three additional patches (2 to 4) are more or less safeguards and improvements for the future, and come from me trying and my debugging code done on the way, more or less optional, but I think nice to have.
I will look at them .. but give me some time, as I am in holidays the next 2 weeks ... Hmm.. and it would be good to get some Tested-by from people with hardware...
Take your time, no need to work in holidays. Would appreciate a Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
Greets Alex
Greets Alex
bye, Heiko
Greets Alex
[1] https://github.com/rauc/rauc/blob/master/contrib/uboot.sh
Alexander Dahl (4): fs: ubifs: Fix memleak and double free in u-boot wrapper functions fs: ubifs: Set pointers to NULL after free fs: ubifs: Make k(z)alloc/kfree symmetric fs: ubifs: Add volume mounted check
fs/ubifs/super.c | 8 ++++++-- fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ 2 files changed, 25 insertions(+), 14 deletions(-)
base-commit: 65fbdab27224ee3943a89496b21862db83c34da2
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

Hello Michael,
On 01.08.24 08:53, Michael Nazzareno Trimarchi wrote:
Hi all
On Thu, Aug 1, 2024 at 8:50 AM Alexander Dahl ada@thorsis.com wrote:
Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl:
Hello Heiko,
Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 03.07.24 12:12, Alexander Dahl wrote:
Hei hei,
filesystem handling is different in U-Boot and beyond that UBI/UBIFS is different from other filesystems in U-Boot. There's UBI and UBIFS code ported from Linux (quite old already now, maybe someone wants to update that?), and there's "glue code" or "wrapper code" to interface with U-Boot scripts, commands, and filesystem handling. The fixes and improvements in this patch series are for this U-Boot specific glue code.
Yes, the linux base is very old ... patches are welcome!
The last sync was back in 2015 from linux v4.2, there were 800+ changes to ubi/ubifs in Linux since then. :-/
And for me it is not that easy, as I do not have a hardware with current mainline U-Boot running on it... I want to update a hardware I have to current mainline, but I had no time yet for it...
Besides the custom hardware here, I used Microchip SAM9X60-Curiosity lately, which is coming with a raw NAND flash and can boot from it.
I read it now, I have one mainline board with ubifs running on it. If the patch are not get applied I will take a look today of the thread. Sync to newest version of the kernel it's a good idea. I will check if someone in the company can start this task
That would be nice!
bye, Heiko
Michael
I'm no filesystem expert, but after days of debugging I'm quite sure the bug is in U-Boot since UBIFS support was added in 2009, and it was repeated in 2015 when generic filesystem support for UBIFS was added. So please review carefully!
Which bug?
The memory leak and double free fixed with patch 1 of the series.
The crashes were not easily reproducible, only with boards using the old distroboot _and_ a boot script inspired by (but not equal to) the one proposed by RAUC [1], which basically boils down to:
ubifsmount ubi0:boot (from distroboot) test -e (from distroboot) ubifsmount ubi0:rootfs1 (this time from the boot script, triggering a ubifs_umount)
So, you have a special sequence you execute to trigger the bug, good!
In special 2 ubifsmount in a row... may not often needed for booting! (I ask me, why that is needed? Boottime is not good than...)
Using distroboot with a script here. The script is in a separate UBI volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 however. So there is 'ubifsmount ubi0:boot' from distroboot and in the script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or rootfs2) later. ubifsmount calls ubifsumount internally if some volume is mounted already.
BTW: Is this really a good bootcmd in [1] as on *every* boot your Environment is saved? This is not good for lifetime of your storage device ... why not using bootcounter?
Well, I was not aware of bootcounter, but I had a look and the actual counter must be stored somewhere. Possible are:
- pmic → has no storage possibility on my board
- rtc → soc internal only, volatile in the end (if battery is empty)
- i2c eeprom → missing
- spi flash → missing
- filesystem → ends up on the flash
- nvmem → no other nvmems present
- syscon or some cpu register or sram → volatile
So none of these are possible in my case, I only have a raw NAND as storage and thus I have to use U-Boot env, which is stored in UBI here btw to not stress the flash too much.
I could investigate if it would be possible to let RAUC use the U-Boot bootcounter infrastructure, but currently RAUC depends on U-Boot environment variables for tracking boot attempts.
btw: documentation of bootcount is sparse, I only found the very short 'doc/README.bootcount' and it's not even migrated to recent U-Boot sphinx based docs. ;-)
But from what I understood the concept is the same, U-Boot counts something and Linux userspace has to reset it. The counter must be stored somewhere, for example in U-Boot env if no other storage is possible.
Crashes can be triggered more easily, if patch order is changed and patch 2 (resetting pointers to NULL after free) comes first, or if patch 2 is applied on its own only.
Hmm...
The fix is in the first patch, and on my boards I see no crashes anymore. I also tested all kinds of combinations of calling `ubi part`, `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, `load`, `size`, and `test -e` and got no crashes anymore after the fix.
That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
Oh it has, 'test -e' calls file_exists() which calls fs_exists() which ends up calling ubifs_exists() which is one of the functions causing an immediate memory leak, see patch 1.
On what hardware do you test? Is it in mainline?
Tested on custom hardware, but I'm confident it should be reproducible on any board using ubifs, especially after applying patch 2 resetting pointers of freed memory to NULL. This should trigger the bug with the simple sequence already described:
ubifsmount ubi0:anyvolume ls ubi ubi0:anyvolume / # (or load, or test -e, or size) ubifsumount
ubifsumount will call ubifs_umount() which calls ubi_close_volume(c->ubi), that pointer is either invalid leading to a double free inside of ubi_close_volume() and it will crash only in certain conditions or the pointer is NULL after applying patch 2 of the series, then ubi_close_volume() crashes right away with a NULL pointer exception.
Note: without patch 2 it very much depends on the sequence of commands, but after the first ubi_close_volume() triggered by ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later by the second ubi_close_volume() triggered by ubifs_umount(). If you do something in between those using the freed memory by something else again, the second ubi_close_volume() access might get corrupted data or access things outside of RAM. Patch 2 redirects this on a clean NULL pointer exception you can easily trigger.
In my case I got a pointer variable actually containing a string "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid pointer on the platform somewhere in RAM between 0x20000000 and 0x28000000 so it took me two days to realize it's not a pointer. ;-)
The three additional patches (2 to 4) are more or less safeguards and improvements for the future, and come from me trying and my debugging code done on the way, more or less optional, but I think nice to have.
I will look at them .. but give me some time, as I am in holidays the next 2 weeks ... Hmm.. and it would be good to get some Tested-by from people with hardware...
Take your time, no need to work in holidays. Would appreciate a Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
Greets Alex
Greets Alex
bye, Heiko
Greets Alex
[1] https://github.com/rauc/rauc/blob/master/contrib/uboot.sh
Alexander Dahl (4): fs: ubifs: Fix memleak and double free in u-boot wrapper functions fs: ubifs: Set pointers to NULL after free fs: ubifs: Make k(z)alloc/kfree symmetric fs: ubifs: Add volume mounted check
fs/ubifs/super.c | 8 ++++++-- fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ 2 files changed, 25 insertions(+), 14 deletions(-)
base-commit: 65fbdab27224ee3943a89496b21862db83c34da2
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

Hello Alexander,
On 01.08.24 08:50, Alexander Dahl wrote:
Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl:
Hello Heiko,
Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 03.07.24 12:12, Alexander Dahl wrote:
Hei hei,
filesystem handling is different in U-Boot and beyond that UBI/UBIFS is different from other filesystems in U-Boot. There's UBI and UBIFS code ported from Linux (quite old already now, maybe someone wants to update that?), and there's "glue code" or "wrapper code" to interface with U-Boot scripts, commands, and filesystem handling. The fixes and improvements in this patch series are for this U-Boot specific glue code.
Yes, the linux base is very old ... patches are welcome!
The last sync was back in 2015 from linux v4.2, there were 800+ changes to ubi/ubifs in Linux since then. :-/
And for me it is not that easy, as I do not have a hardware with current mainline U-Boot running on it... I want to update a hardware I have to current mainline, but I had no time yet for it...
Besides the custom hardware here, I used Microchip SAM9X60-Curiosity lately, which is coming with a raw NAND flash and can boot from it.
I'm no filesystem expert, but after days of debugging I'm quite sure the bug is in U-Boot since UBIFS support was added in 2009, and it was repeated in 2015 when generic filesystem support for UBIFS was added. So please review carefully!
Which bug?
The memory leak and double free fixed with patch 1 of the series.
The crashes were not easily reproducible, only with boards using the old distroboot _and_ a boot script inspired by (but not equal to) the one proposed by RAUC [1], which basically boils down to:
ubifsmount ubi0:boot (from distroboot) test -e (from distroboot) ubifsmount ubi0:rootfs1 (this time from the boot script, triggering a ubifs_umount)
So, you have a special sequence you execute to trigger the bug, good!
In special 2 ubifsmount in a row... may not often needed for booting! (I ask me, why that is needed? Boottime is not good than...)
Using distroboot with a script here. The script is in a separate UBI volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 however. So there is 'ubifsmount ubi0:boot' from distroboot and in the script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or rootfs2) later. ubifsmount calls ubifsumount internally if some volume is mounted already.
BTW: Is this really a good bootcmd in [1] as on *every* boot your Environment is saved? This is not good for lifetime of your storage device ... why not using bootcounter?
Well, I was not aware of bootcounter, but I had a look and the actual counter must be stored somewhere. Possible are:
- pmic → has no storage possibility on my board
- rtc → soc internal only, volatile in the end (if battery is empty)
- i2c eeprom → missing
- spi flash → missing
- filesystem → ends up on the flash
- nvmem → no other nvmems present
- syscon or some cpu register or sram → volatile
So none of these are possible in my case, I only have a raw NAND as storage and thus I have to use U-Boot env, which is stored in UBI here btw to not stress the flash too much.
I could investigate if it would be possible to let RAUC use the U-Boot bootcounter infrastructure, but currently RAUC depends on U-Boot environment variables for tracking boot attempts.
btw: documentation of bootcount is sparse, I only found the very short 'doc/README.bootcount' and it's not even migrated to recent U-Boot sphinx based docs. ;-)
But from what I understood the concept is the same, U-Boot counts something and Linux userspace has to reset it. The counter must be stored somewhere, for example in U-Boot env if no other storage is possible.
Crashes can be triggered more easily, if patch order is changed and patch 2 (resetting pointers to NULL after free) comes first, or if patch 2 is applied on its own only.
Hmm...
The fix is in the first patch, and on my boards I see no crashes anymore. I also tested all kinds of combinations of calling `ubi part`, `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, `load`, `size`, and `test -e` and got no crashes anymore after the fix.
That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
Oh it has, 'test -e' calls file_exists() which calls fs_exists() which ends up calling ubifs_exists() which is one of the functions causing an immediate memory leak, see patch 1.
On what hardware do you test? Is it in mainline?
Tested on custom hardware, but I'm confident it should be reproducible on any board using ubifs, especially after applying patch 2 resetting pointers of freed memory to NULL. This should trigger the bug with the simple sequence already described:
ubifsmount ubi0:anyvolume ls ubi ubi0:anyvolume / # (or load, or test -e, or size) ubifsumount
ubifsumount will call ubifs_umount() which calls ubi_close_volume(c->ubi), that pointer is either invalid leading to a double free inside of ubi_close_volume() and it will crash only in certain conditions or the pointer is NULL after applying patch 2 of the series, then ubi_close_volume() crashes right away with a NULL pointer exception.
Note: without patch 2 it very much depends on the sequence of commands, but after the first ubi_close_volume() triggered by ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later by the second ubi_close_volume() triggered by ubifs_umount(). If you do something in between those using the freed memory by something else again, the second ubi_close_volume() access might get corrupted data or access things outside of RAM. Patch 2 redirects this on a clean NULL pointer exception you can easily trigger.
In my case I got a pointer variable actually containing a string "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid pointer on the platform somewhere in RAM between 0x20000000 and 0x28000000 so it took me two days to realize it's not a pointer. ;-)
The three additional patches (2 to 4) are more or less safeguards and improvements for the future, and come from me trying and my debugging code done on the way, more or less optional, but I think nice to have.
I will look at them .. but give me some time, as I am in holidays the next 2 weeks ... Hmm.. and it would be good to get some Tested-by from people with hardware...
Take your time, no need to work in holidays. Would appreciate a Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
I asked Ravi and Alexey (added to cc) if they have time to look it they can reproduce the issue and test your patches...
bye, Heiko
Greets Alex
Greets Alex
bye, Heiko
Greets Alex
[1] https://github.com/rauc/rauc/blob/master/contrib/uboot.sh
Alexander Dahl (4): fs: ubifs: Fix memleak and double free in u-boot wrapper functions fs: ubifs: Set pointers to NULL after free fs: ubifs: Make k(z)alloc/kfree symmetric fs: ubifs: Add volume mounted check
fs/ubifs/super.c | 8 ++++++-- fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ 2 files changed, 25 insertions(+), 14 deletions(-)
base-commit: 65fbdab27224ee3943a89496b21862db83c34da2
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

Hi Heiko, Alexander,
On 7/31/24 23:54, Heiko Schocher wrote:
Hello Alexander, On 01. 08. 24 08: 50, Alexander Dahl wrote: > Hei, > > Am Thu, Jul 04, 2024 at 10: 18: 55AM +0200 schrieb Alexander Dahl: >> Hello Heiko, >> >> Am Thu, Jul 04, 2024 at 06: 28: 31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 01.08.24 08:50, Alexander Dahl wrote:
Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl:
Hello Heiko,
Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 03.07.24 12:12, Alexander Dahl wrote:
Hei hei,
filesystem handling is different in U-Boot and beyond that UBI/UBIFS is different from other filesystems in U-Boot. There's UBI and UBIFS code ported from Linux (quite old already now, maybe someone wants to update that?), and there's "glue code" or "wrapper code" to interface with U-Boot scripts, commands, and filesystem handling. The fixes and improvements in this patch series are for this U-Boot specific glue code.
Yes, the linux base is very old ... patches are welcome!
The last sync was back in 2015 from linux v4.2, there were 800+ changes to ubi/ubifs in Linux since then. :-/
And for me it is not that easy, as I do not have a hardware with current mainline U-Boot running on it... I want to update a hardware I have to current mainline, but I had no time yet for it...
Besides the custom hardware here, I used Microchip SAM9X60-Curiosity lately, which is coming with a raw NAND flash and can boot from it.
I'm no filesystem expert, but after days of debugging I'm quite sure the bug is in U-Boot since UBIFS support was added in 2009, and it was repeated in 2015 when generic filesystem support for UBIFS was added. So please review carefully!
Which bug?
The memory leak and double free fixed with patch 1 of the series.
The crashes were not easily reproducible, only with boards using the old distroboot _and_ a boot script inspired by (but not equal to) the one proposed by RAUC [1], which basically boils down to:
ubifsmount ubi0:boot (from distroboot) test -e (from distroboot) ubifsmount ubi0:rootfs1 (this time from the boot script, triggering a ubifs_umount)
So, you have a special sequence you execute to trigger the bug, good!
In special 2 ubifsmount in a row... may not often needed for booting! (I ask me, why that is needed? Boottime is not good than...)
Using distroboot with a script here. The script is in a separate UBI volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 however. So there is 'ubifsmount ubi0:boot' from distroboot and in the script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or rootfs2) later. ubifsmount calls ubifsumount internally if some volume is mounted already.
BTW: Is this really a good bootcmd in [1] as on *every* boot your Environment is saved? This is not good for lifetime of your storage device ... why not using bootcounter?
Well, I was not aware of bootcounter, but I had a look and the actual counter must be stored somewhere. Possible are:
- pmic → has no storage possibility on my board
- rtc → soc internal only, volatile in the end (if battery is empty)
- i2c eeprom → missing
- spi flash → missing
- filesystem → ends up on the flash
- nvmem → no other nvmems present
- syscon or some cpu register or sram → volatile
So none of these are possible in my case, I only have a raw NAND as storage and thus I have to use U-Boot env, which is stored in UBI here btw to not stress the flash too much.
I could investigate if it would be possible to let RAUC use the U-Boot bootcounter infrastructure, but currently RAUC depends on U-Boot environment variables for tracking boot attempts.
btw: documentation of bootcount is sparse, I only found the very short 'doc/README.bootcount' and it's not even migrated to recent U-Boot sphinx based docs. ;-)
But from what I understood the concept is the same, U-Boot counts something and Linux userspace has to reset it. The counter must be stored somewhere, for example in U-Boot env if no other storage is possible.
Crashes can be triggered more easily, if patch order is changed and patch 2 (resetting pointers to NULL after free) comes first, or if patch 2 is applied on its own only.
Hmm...
The fix is in the first patch, and on my boards I see no crashes anymore. I also tested all kinds of combinations of calling `ubi part`, `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, `load`, `size`, and `test -e` and got no crashes anymore after the fix.
That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
Oh it has, 'test -e' calls file_exists() which calls fs_exists() which ends up calling ubifs_exists() which is one of the functions causing an immediate memory leak, see patch 1.
On what hardware do you test? Is it in mainline?
Tested on custom hardware, but I'm confident it should be reproducible on any board using ubifs, especially after applying patch 2 resetting pointers of freed memory to NULL. This should trigger the bug with the simple sequence already described:
ubifsmount ubi0:anyvolume ls ubi ubi0:anyvolume / # (or load, or test -e, or size) ubifsumount
ubifsumount will call ubifs_umount() which calls ubi_close_volume(c->ubi), that pointer is either invalid leading to a double free inside of ubi_close_volume() and it will crash only in certain conditions or the pointer is NULL after applying patch 2 of the series, then ubi_close_volume() crashes right away with a NULL pointer exception.
Note: without patch 2 it very much depends on the sequence of commands, but after the first ubi_close_volume() triggered by ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later by the second ubi_close_volume() triggered by ubifs_umount(). If you do something in between those using the freed memory by something else again, the second ubi_close_volume() access might get corrupted data or access things outside of RAM. Patch 2 redirects this on a clean NULL pointer exception you can easily trigger.
In my case I got a pointer variable actually containing a string "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid pointer on the platform somewhere in RAM between 0x20000000 and 0x28000000 so it took me two days to realize it's not a pointer. ;-)
The three additional patches (2 to 4) are more or less safeguards and improvements for the future, and come from me trying and my debugging code done on the way, more or less optional, but I think nice to have.
I will look at them .. but give me some time, as I am in holidays the next 2 weeks ... Hmm.. and it would be good to get some Tested-by from people with hardware...
Take your time, no need to work in holidays. Would appreciate a Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
I asked Ravi and Alexey (added to cc) if they have time to look it they can reproduce the issue and test your patches...
bye, Heiko
I tried to reproduce with CONFIG_SYS_MALLOC_LEN reduced to 1MB and calling ubifsls in a loop, logic below:
ubi part ubi0:rootfs ubifsmount ubi0:rootfs setenv i 1 while test $i -le 200; do ubifsls a/b/test.bin setexpr i $i + 1 sleep 1 echo "Loop count: $i" done
It crashes with or without patch. With patch it takes 4,5 loops more to crash.
Log: UBIFS error (ubi0:0 pid 0): ubifs_iget: failed to read inode 85, error -12 "Synchronous Abort" handler, esr 0x96000004
Thanks, Ravi
Greets Alex
Greets Alex
bye, Heiko
Greets Alex
[1] https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_bl... https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_blob_master_contrib_uboot.sh&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=lpy642UhqtTXHV0dd0xnOtQxkQSO5RC6FwcUbzt-FsNwd3QSXZsulfqC6dP9qV7G&s=wbnw_T9_r6taMKIFqJ2-wtj9Zv02OD7mbIy-ZZFRjRE&e=
Alexander Dahl (4): fs: ubifs: Fix memleak and double free in u-boot wrapper functions fs: ubifs: Set pointers to NULL after free fs: ubifs: Make k(z)alloc/kfree symmetric fs: ubifs: Add volume mounted check
fs/ubifs/super.c | 8 ++++++-- fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ 2 files changed, 25 insertions(+), 14 deletions(-)
base-commit: 65fbdab27224ee3943a89496b21862db83c34da2
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

Hello Ravi,
On 01.08.24 21:39, Ravi Minnikanti wrote:
Hi Heiko, Alexander,
On 7/31/24 23:54, Heiko Schocher wrote:
Hello Alexander, On 01. 08. 24 08: 50, Alexander Dahl wrote: > Hei, > > Am Thu, Jul 04, 2024 at 10: 18: 55AM +0200 schrieb Alexander Dahl: >> Hello Heiko, >> >> Am Thu, Jul 04, 2024 at 06: 28: 31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 01.08.24 08:50, Alexander Dahl wrote:
Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl:
Hello Heiko,
Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 03.07.24 12:12, Alexander Dahl wrote:
Hei hei,
filesystem handling is different in U-Boot and beyond that UBI/UBIFS is different from other filesystems in U-Boot. There's UBI and UBIFS code ported from Linux (quite old already now, maybe someone wants to update that?), and there's "glue code" or "wrapper code" to interface with U-Boot scripts, commands, and filesystem handling. The fixes and improvements in this patch series are for this U-Boot specific glue code.
Yes, the linux base is very old ... patches are welcome!
The last sync was back in 2015 from linux v4.2, there were 800+ changes to ubi/ubifs in Linux since then. :-/
And for me it is not that easy, as I do not have a hardware with current mainline U-Boot running on it... I want to update a hardware I have to current mainline, but I had no time yet for it...
Besides the custom hardware here, I used Microchip SAM9X60-Curiosity lately, which is coming with a raw NAND flash and can boot from it.
I'm no filesystem expert, but after days of debugging I'm quite sure the bug is in U-Boot since UBIFS support was added in 2009, and it was repeated in 2015 when generic filesystem support for UBIFS was added. So please review carefully!
Which bug?
The memory leak and double free fixed with patch 1 of the series.
The crashes were not easily reproducible, only with boards using the old distroboot _and_ a boot script inspired by (but not equal to) the one proposed by RAUC [1], which basically boils down to:
ubifsmount ubi0:boot (from distroboot) test -e (from distroboot) ubifsmount ubi0:rootfs1 (this time from the boot script, triggering a ubifs_umount)
So, you have a special sequence you execute to trigger the bug, good!
In special 2 ubifsmount in a row... may not often needed for booting! (I ask me, why that is needed? Boottime is not good than...)
Using distroboot with a script here. The script is in a separate UBI volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 however. So there is 'ubifsmount ubi0:boot' from distroboot and in the script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or rootfs2) later. ubifsmount calls ubifsumount internally if some volume is mounted already.
BTW: Is this really a good bootcmd in [1] as on *every* boot your Environment is saved? This is not good for lifetime of your storage device ... why not using bootcounter?
Well, I was not aware of bootcounter, but I had a look and the actual counter must be stored somewhere. Possible are:
- pmic → has no storage possibility on my board
- rtc → soc internal only, volatile in the end (if battery is empty)
- i2c eeprom → missing
- spi flash → missing
- filesystem → ends up on the flash
- nvmem → no other nvmems present
- syscon or some cpu register or sram → volatile
So none of these are possible in my case, I only have a raw NAND as storage and thus I have to use U-Boot env, which is stored in UBI here btw to not stress the flash too much.
I could investigate if it would be possible to let RAUC use the U-Boot bootcounter infrastructure, but currently RAUC depends on U-Boot environment variables for tracking boot attempts.
btw: documentation of bootcount is sparse, I only found the very short 'doc/README.bootcount' and it's not even migrated to recent U-Boot sphinx based docs. ;-)
But from what I understood the concept is the same, U-Boot counts something and Linux userspace has to reset it. The counter must be stored somewhere, for example in U-Boot env if no other storage is possible.
Crashes can be triggered more easily, if patch order is changed and patch 2 (resetting pointers to NULL after free) comes first, or if patch 2 is applied on its own only.
Hmm...
The fix is in the first patch, and on my boards I see no crashes anymore. I also tested all kinds of combinations of calling `ubi part`, `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, `load`, `size`, and `test -e` and got no crashes anymore after the fix.
That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
Oh it has, 'test -e' calls file_exists() which calls fs_exists() which ends up calling ubifs_exists() which is one of the functions causing an immediate memory leak, see patch 1.
On what hardware do you test? Is it in mainline?
Tested on custom hardware, but I'm confident it should be reproducible on any board using ubifs, especially after applying patch 2 resetting pointers of freed memory to NULL. This should trigger the bug with the simple sequence already described:
> ubifsmount ubi0:anyvolume > ls ubi ubi0:anyvolume / # (or load, or test -e, or size) > ubifsumount
ubifsumount will call ubifs_umount() which calls ubi_close_volume(c->ubi), that pointer is either invalid leading to a double free inside of ubi_close_volume() and it will crash only in certain conditions or the pointer is NULL after applying patch 2 of the series, then ubi_close_volume() crashes right away with a NULL pointer exception.
Note: without patch 2 it very much depends on the sequence of commands, but after the first ubi_close_volume() triggered by ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later by the second ubi_close_volume() triggered by ubifs_umount(). If you do something in between those using the freed memory by something else again, the second ubi_close_volume() access might get corrupted data or access things outside of RAM. Patch 2 redirects this on a clean NULL pointer exception you can easily trigger.
In my case I got a pointer variable actually containing a string "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid pointer on the platform somewhere in RAM between 0x20000000 and 0x28000000 so it took me two days to realize it's not a pointer. ;-)
The three additional patches (2 to 4) are more or less safeguards and improvements for the future, and come from me trying and my debugging code done on the way, more or less optional, but I think nice to have.
I will look at them .. but give me some time, as I am in holidays the next 2 weeks ... Hmm.. and it would be good to get some Tested-by from people with hardware...
Take your time, no need to work in holidays. Would appreciate a Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
I asked Ravi and Alexey (added to cc) if they have time to look it they can reproduce the issue and test your patches...
bye, Heiko
I tried to reproduce with CONFIG_SYS_MALLOC_LEN reduced to 1MB and calling ubifsls in a loop, logic below:
ubi part ubi0:rootfs ubifsmount ubi0:rootfs setenv i 1 while test $i -le 200; do ubifsls a/b/test.bin setexpr i $i + 1 sleep 1 echo "Loop count: $i" done
Thanks for testing! But it is not exactly the same sequence Alexander described ... but you also trigger a bug.
It crashes with or without patch. With patch it takes 4,5 loops more to crash.
Log: UBIFS error (ubi0:0 pid 0): ubifs_iget: failed to read inode 85, error -12 "Synchronous Abort" handler, esr 0x96000004
-12 = -ENOMEM
So it seems, we have a memleak...
Yes, ubi/ubifs linux base is very old, but I tend to think that problem is in U-Boot adaption layer...
bye, Heiko
Thanks, Ravi
Greets Alex
Greets Alex
bye, Heiko
Greets Alex
[1] https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_bl... https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_blob_master_contrib_uboot.sh&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=lpy642UhqtTXHV0dd0xnOtQxkQSO5RC6FwcUbzt-FsNwd3QSXZsulfqC6dP9qV7G&s=wbnw_T9_r6taMKIFqJ2-wtj9Zv02OD7mbIy-ZZFRjRE&e=
Alexander Dahl (4): fs: ubifs: Fix memleak and double free in u-boot wrapper functions fs: ubifs: Set pointers to NULL after free fs: ubifs: Make k(z)alloc/kfree symmetric fs: ubifs: Add volume mounted check
fs/ubifs/super.c | 8 ++++++-- fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ 2 files changed, 25 insertions(+), 14 deletions(-)
base-commit: 65fbdab27224ee3943a89496b21862db83c34da2
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

Hello,
Am Mon, Aug 05, 2024 at 07:28:19AM +0200 schrieb Heiko Schocher:
Hello Ravi,
On 01.08.24 21:39, Ravi Minnikanti wrote:
Hi Heiko, Alexander,
On 7/31/24 23:54, Heiko Schocher wrote:
Hello Alexander, On 01. 08. 24 08: 50, Alexander Dahl wrote: > Hei, > > Am Thu, Jul 04, 2024 at 10: 18: 55AM +0200 schrieb Alexander Dahl: >> Hello Heiko, >> >> Am Thu, Jul 04, 2024 at 06: 28: 31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 01.08.24 08:50, Alexander Dahl wrote:
Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl:
Hello Heiko,
Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 03.07.24 12:12, Alexander Dahl wrote: > Hei hei, > > filesystem handling is different in U-Boot and beyond that UBI/UBIFS is > different from other filesystems in U-Boot. There's UBI and UBIFS code > ported from Linux (quite old already now, maybe someone wants to update > that?), and there's "glue code" or "wrapper code" to interface with > U-Boot scripts, commands, and filesystem handling. The fixes and > improvements in this patch series are for this U-Boot specific glue > code.
Yes, the linux base is very old ... patches are welcome!
The last sync was back in 2015 from linux v4.2, there were 800+ changes to ubi/ubifs in Linux since then. :-/
And for me it is not that easy, as I do not have a hardware with current mainline U-Boot running on it... I want to update a hardware I have to current mainline, but I had no time yet for it...
Besides the custom hardware here, I used Microchip SAM9X60-Curiosity lately, which is coming with a raw NAND flash and can boot from it.
> I'm no filesystem expert, but after days of debugging I'm quite sure the > bug is in U-Boot since UBIFS support was added in 2009, and it was > repeated in 2015 when generic filesystem support for UBIFS was added. > So please review carefully!
Which bug?
The memory leak and double free fixed with patch 1 of the series.
> The crashes were not easily reproducible, only with boards using the old > distroboot _and_ a boot script inspired by (but not equal to) the one > proposed by RAUC [1], which basically boils down to: > > ubifsmount ubi0:boot (from distroboot) > test -e (from distroboot) > ubifsmount ubi0:rootfs1 (this time from the boot script, > triggering a ubifs_umount)
So, you have a special sequence you execute to trigger the bug, good!
In special 2 ubifsmount in a row... may not often needed for booting! (I ask me, why that is needed? Boottime is not good than...)
Using distroboot with a script here. The script is in a separate UBI volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 however. So there is 'ubifsmount ubi0:boot' from distroboot and in the script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or rootfs2) later. ubifsmount calls ubifsumount internally if some volume is mounted already.
BTW: Is this really a good bootcmd in [1] as on *every* boot your Environment is saved? This is not good for lifetime of your storage device ... why not using bootcounter?
Well, I was not aware of bootcounter, but I had a look and the actual counter must be stored somewhere. Possible are:
- pmic → has no storage possibility on my board
- rtc → soc internal only, volatile in the end (if battery is empty)
- i2c eeprom → missing
- spi flash → missing
- filesystem → ends up on the flash
- nvmem → no other nvmems present
- syscon or some cpu register or sram → volatile
So none of these are possible in my case, I only have a raw NAND as storage and thus I have to use U-Boot env, which is stored in UBI here btw to not stress the flash too much.
I could investigate if it would be possible to let RAUC use the U-Boot bootcounter infrastructure, but currently RAUC depends on U-Boot environment variables for tracking boot attempts.
btw: documentation of bootcount is sparse, I only found the very short 'doc/README.bootcount' and it's not even migrated to recent U-Boot sphinx based docs. ;-)
But from what I understood the concept is the same, U-Boot counts something and Linux userspace has to reset it. The counter must be stored somewhere, for example in U-Boot env if no other storage is possible.
> Crashes can be triggered more easily, if patch order is changed and > patch 2 (resetting pointers to NULL after free) comes first, or if patch > 2 is applied on its own only.
Hmm...
> The fix is in the first patch, and on my boards I see no crashes > anymore. I also tested all kinds of combinations of calling `ubi part`, > `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, > `load`, `size`, and `test -e` and got no crashes anymore after the fix.
That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
Oh it has, 'test -e' calls file_exists() which calls fs_exists() which ends up calling ubifs_exists() which is one of the functions causing an immediate memory leak, see patch 1.
On what hardware do you test? Is it in mainline?
Tested on custom hardware, but I'm confident it should be reproducible on any board using ubifs, especially after applying patch 2 resetting pointers of freed memory to NULL. This should trigger the bug with the simple sequence already described:
> ubifsmount ubi0:anyvolume > ls ubi ubi0:anyvolume / # (or load, or test -e, or size) > ubifsumount
ubifsumount will call ubifs_umount() which calls ubi_close_volume(c->ubi), that pointer is either invalid leading to a double free inside of ubi_close_volume() and it will crash only in certain conditions or the pointer is NULL after applying patch 2 of the series, then ubi_close_volume() crashes right away with a NULL pointer exception.
Note: without patch 2 it very much depends on the sequence of commands, but after the first ubi_close_volume() triggered by ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later by the second ubi_close_volume() triggered by ubifs_umount(). If you do something in between those using the freed memory by something else again, the second ubi_close_volume() access might get corrupted data or access things outside of RAM. Patch 2 redirects this on a clean NULL pointer exception you can easily trigger.
In my case I got a pointer variable actually containing a string "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid pointer on the platform somewhere in RAM between 0x20000000 and 0x28000000 so it took me two days to realize it's not a pointer. ;-)
> The three additional patches (2 to 4) are more or less safeguards and > improvements for the future, and come from me trying and my debugging > code done on the way, more or less optional, but I think nice to have.
I will look at them .. but give me some time, as I am in holidays the next 2 weeks ... Hmm.. and it would be good to get some Tested-by from people with hardware...
Take your time, no need to work in holidays. Would appreciate a Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
I asked Ravi and Alexey (added to cc) if they have time to look it they can reproduce the issue and test your patches...
bye, Heiko
I tried to reproduce with CONFIG_SYS_MALLOC_LEN reduced to 1MB and calling ubifsls in a loop, logic below:
ubi part ubi0:rootfs ubifsmount ubi0:rootfs setenv i 1 while test $i -le 200; do ubifsls a/b/test.bin setexpr i $i + 1 sleep 1 echo "Loop count: $i" done
Thanks for testing! But it is not exactly the same sequence Alexander described ... but you also trigger a bug.
+1 … thanks for testing. :-)
It crashes with or without patch. With patch it takes 4,5 loops more to crash.
Log: UBIFS error (ubi0:0 pid 0): ubifs_iget: failed to read inode 85, error -12 "Synchronous Abort" handler, esr 0x96000004
-12 = -ENOMEM
So it seems, we have a memleak...
If with my patch it takes more loops to crash, this suggests we have more than one memory leak, and of them got fixed with my patch(es).
Yes, ubi/ubifs linux base is very old, but I tend to think that problem is in U-Boot adaption layer...
I would also suspect the adaption layer.
However, although my patch fixes one memory leak, that was not the main reason doing this. The reason of the crash in my case was the double free, and accessing invalid memory afterwards. So from my perspective the patches are okay on their own, and looking after more memory leaks should not be part of this series.
Greets Alex
P.S.: not sure what's causing this, but I only got the mails from you all through the mailing list. You can keep me in Cc, so I won't overlook your answers easily.
bye, Heiko
Thanks, Ravi
Greets Alex
Greets Alex
bye, Heiko > > Greets > Alex > > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_bl... https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_blob_master_contrib_uboot.sh&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=lpy642UhqtTXHV0dd0xnOtQxkQSO5RC6FwcUbzt-FsNwd3QSXZsulfqC6dP9qV7G&s=wbnw_T9_r6taMKIFqJ2-wtj9Zv02OD7mbIy-ZZFRjRE&e= > > Alexander Dahl (4): > fs: ubifs: Fix memleak and double free in u-boot wrapper functions > fs: ubifs: Set pointers to NULL after free > fs: ubifs: Make k(z)alloc/kfree symmetric > fs: ubifs: Add volume mounted check > > fs/ubifs/super.c | 8 ++++++-- > fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ > 2 files changed, 25 insertions(+), 14 deletions(-) > > > base-commit: 65fbdab27224ee3943a89496b21862db83c34da2 >
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

Hi all
On Mon, Aug 5, 2024 at 10:49 AM Alexander Dahl ada@thorsis.com wrote:
Hello,
Am Mon, Aug 05, 2024 at 07:28:19AM +0200 schrieb Heiko Schocher:
Hello Ravi,
On 01.08.24 21:39, Ravi Minnikanti wrote:
Hi Heiko, Alexander,
On 7/31/24 23:54, Heiko Schocher wrote:
Hello Alexander, On 01. 08. 24 08: 50, Alexander Dahl wrote: > Hei, > > Am Thu, Jul 04, 2024 at 10: 18: 55AM +0200 schrieb Alexander Dahl: >> Hello Heiko, >> >> Am Thu, Jul 04, 2024 at 06: 28: 31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 01.08.24 08:50, Alexander Dahl wrote:
Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl:
Hello Heiko,
Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher: > Hello Alexander, > > On 03.07.24 12:12, Alexander Dahl wrote: > > Hei hei, > > > > filesystem handling is different in U-Boot and beyond that UBI/UBIFS is > > different from other filesystems in U-Boot. There's UBI and UBIFS code > > ported from Linux (quite old already now, maybe someone wants to update > > that?), and there's "glue code" or "wrapper code" to interface with > > U-Boot scripts, commands, and filesystem handling. The fixes and > > improvements in this patch series are for this U-Boot specific glue > > code. > > Yes, the linux base is very old ... patches are welcome!
The last sync was back in 2015 from linux v4.2, there were 800+ changes to ubi/ubifs in Linux since then. :-/
> And for me it is not that easy, as I do not have a hardware with > current mainline U-Boot running on it... I want to update a hardware > I have to current mainline, but I had no time yet for it...
Besides the custom hardware here, I used Microchip SAM9X60-Curiosity lately, which is coming with a raw NAND flash and can boot from it.
> > > I'm no filesystem expert, but after days of debugging I'm quite sure the > > bug is in U-Boot since UBIFS support was added in 2009, and it was > > repeated in 2015 when generic filesystem support for UBIFS was added. > > So please review carefully! > > Which bug?
The memory leak and double free fixed with patch 1 of the series.
> > > The crashes were not easily reproducible, only with boards using the old > > distroboot _and_ a boot script inspired by (but not equal to) the one > > proposed by RAUC [1], which basically boils down to: > > > > ubifsmount ubi0:boot (from distroboot) > > test -e (from distroboot) > > ubifsmount ubi0:rootfs1 (this time from the boot script, > > triggering a ubifs_umount) > > So, you have a special sequence you execute to trigger the bug, good! > > In special 2 ubifsmount in a row... may not often needed for booting! > (I ask me, why that is needed? Boottime is not good than...)
Using distroboot with a script here. The script is in a separate UBI volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 however. So there is 'ubifsmount ubi0:boot' from distroboot and in the script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or rootfs2) later. ubifsmount calls ubifsumount internally if some volume is mounted already.
> > BTW: Is this really a good bootcmd in [1] as on *every* boot your > Environment is saved? This is not good for lifetime of your > storage device ... why not using bootcounter?
Well, I was not aware of bootcounter, but I had a look and the actual counter must be stored somewhere. Possible are:
- pmic → has no storage possibility on my board
- rtc → soc internal only, volatile in the end (if battery is empty)
- i2c eeprom → missing
- spi flash → missing
- filesystem → ends up on the flash
- nvmem → no other nvmems present
- syscon or some cpu register or sram → volatile
So none of these are possible in my case, I only have a raw NAND as storage and thus I have to use U-Boot env, which is stored in UBI here btw to not stress the flash too much.
I could investigate if it would be possible to let RAUC use the U-Boot bootcounter infrastructure, but currently RAUC depends on U-Boot environment variables for tracking boot attempts.
btw: documentation of bootcount is sparse, I only found the very short 'doc/README.bootcount' and it's not even migrated to recent U-Boot sphinx based docs. ;-)
But from what I understood the concept is the same, U-Boot counts something and Linux userspace has to reset it. The counter must be stored somewhere, for example in U-Boot env if no other storage is possible.
> > > Crashes can be triggered more easily, if patch order is changed and > > patch 2 (resetting pointers to NULL after free) comes first, or if patch > > 2 is applied on its own only. > > Hmm... > > > The fix is in the first patch, and on my boards I see no crashes > > anymore. I also tested all kinds of combinations of calling `ubi part`, > > `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, > > `load`, `size`, and `test -e` and got no crashes anymore after the fix. > > That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think.
Oh it has, 'test -e' calls file_exists() which calls fs_exists() which ends up calling ubifs_exists() which is one of the functions causing an immediate memory leak, see patch 1.
> On what hardware do you test? Is it in mainline?
Tested on custom hardware, but I'm confident it should be reproducible on any board using ubifs, especially after applying patch 2 resetting pointers of freed memory to NULL. This should trigger the bug with the simple sequence already described:
> ubifsmount ubi0:anyvolume > ls ubi ubi0:anyvolume / # (or load, or test -e, or size) > ubifsumount
ubifsumount will call ubifs_umount() which calls ubi_close_volume(c->ubi), that pointer is either invalid leading to a double free inside of ubi_close_volume() and it will crash only in certain conditions or the pointer is NULL after applying patch 2 of the series, then ubi_close_volume() crashes right away with a NULL pointer exception.
Note: without patch 2 it very much depends on the sequence of commands, but after the first ubi_close_volume() triggered by ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later by the second ubi_close_volume() triggered by ubifs_umount(). If you do something in between those using the freed memory by something else again, the second ubi_close_volume() access might get corrupted data or access things outside of RAM. Patch 2 redirects this on a clean NULL pointer exception you can easily trigger.
In my case I got a pointer variable actually containing a string "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid pointer on the platform somewhere in RAM between 0x20000000 and 0x28000000 so it took me two days to realize it's not a pointer. ;-)
> > > The three additional patches (2 to 4) are more or less safeguards and > > improvements for the future, and come from me trying and my debugging > > code done on the way, more or less optional, but I think nice to have. > > I will look at them .. but give me some time, as I am in holidays the > next 2 weeks ... Hmm.. and it would be good to get some Tested-by > from people with hardware...
Take your time, no need to work in holidays. Would appreciate a Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
I asked Ravi and Alexey (added to cc) if they have time to look it they can reproduce the issue and test your patches...
bye, Heiko
I tried to reproduce with CONFIG_SYS_MALLOC_LEN reduced to 1MB and calling ubifsls in a loop, logic below:
ubi part ubi0:rootfs ubifsmount ubi0:rootfs setenv i 1 while test $i -le 200; do ubifsls a/b/test.bin setexpr i $i + 1 sleep 1 echo "Loop count: $i" done
Thanks for testing! But it is not exactly the same sequence Alexander described ... but you also trigger a bug.
diff --git a/fs/ubifs/ubifs.c b/fs/ubifs/ubifs.c index 048730db7f..33df4ff51f 100644 --- a/fs/ubifs/ubifs.c +++ b/fs/ubifs/ubifs.c @@ -557,6 +557,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename)
/* We have some sort of symlink recursion, bail out */ if (symlink_count++ > 8) { + ubifs_iput(inode); printf("Symlink recursion, aborting\n"); return 0; } @@ -568,6 +569,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) * the leading slash */ next = name = link_name + 1; root_inum = 1; + ubifs_iput(inode); continue; } /* Relative to cur dir */ @@ -575,6 +577,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) link_name, next == NULL ? "" : next); memcpy(symlinkpath, buf, sizeof(buf)); next = name = symlinkpath; + ubifs_iput(inode); continue; }
@@ -583,8 +586,10 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) */
/* Found the node! */ - if (!next || *next == '\0') + if (!next || *next == '\0') { + ubifs_iput(inode); return inum; + }
root_inum = inum; name = next;
Should we not free inode_iget?
Michael
+1 … thanks for testing. :-)
It crashes with or without patch. With patch it takes 4,5 loops more to crash.
Log: UBIFS error (ubi0:0 pid 0): ubifs_iget: failed to read inode 85, error -12 "Synchronous Abort" handler, esr 0x96000004
-12 = -ENOMEM
So it seems, we have a memleak...
If with my patch it takes more loops to crash, this suggests we have more than one memory leak, and of them got fixed with my patch(es).
Yes, ubi/ubifs linux base is very old, but I tend to think that problem is in U-Boot adaption layer...
I would also suspect the adaption layer.
However, although my patch fixes one memory leak, that was not the main reason doing this. The reason of the crash in my case was the double free, and accessing invalid memory afterwards. So from my perspective the patches are okay on their own, and looking after more memory leaks should not be part of this series.
Greets Alex
P.S.: not sure what's causing this, but I only got the mails from you all through the mailing list. You can keep me in Cc, so I won't overlook your answers easily.
bye, Heiko
Thanks, Ravi
Greets Alex
Greets Alex
> > bye, > Heiko > > > > Greets > > Alex > > > > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_bl... https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_blob_master_contrib_uboot.sh&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=lpy642UhqtTXHV0dd0xnOtQxkQSO5RC6FwcUbzt-FsNwd3QSXZsulfqC6dP9qV7G&s=wbnw_T9_r6taMKIFqJ2-wtj9Zv02OD7mbIy-ZZFRjRE&e= > > > > Alexander Dahl (4): > > fs: ubifs: Fix memleak and double free in u-boot wrapper functions > > fs: ubifs: Set pointers to NULL after free > > fs: ubifs: Make k(z)alloc/kfree symmetric > > fs: ubifs: Add volume mounted check > > > > fs/ubifs/super.c | 8 ++++++-- > > fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ > > 2 files changed, 25 insertions(+), 14 deletions(-) > > > > > > base-commit: 65fbdab27224ee3943a89496b21862db83c34da2 > > > > -- > DENX Software Engineering GmbH, Managing Director: Erika Unter > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

On 8/5/24 01:58, Michael Nazzareno Trimarchi wrote:
Hi all On Mon, Aug 5, 2024 at 10: 49 AM Alexander Dahl <ada@ thorsis. com> wrote: > > Hello, > > Am Mon, Aug 05, 2024 at 07: 28: 19AM +0200 schrieb Heiko Schocher: > > Hello Ravi, > > > > On 01. 08. 24 21: 39, Ravi
Hi all
On Mon, Aug 5, 2024 at 10:49 AM Alexander Dahl ada@thorsis.com wrote:
Hello,
Am Mon, Aug 05, 2024 at 07:28:19AM +0200 schrieb Heiko Schocher:
Hello Ravi,
On 01.08.24 21:39, Ravi Minnikanti wrote:
Hi Heiko, Alexander,
On 7/31/24 23:54, Heiko Schocher wrote:
Hello Alexander, On 01. 08. 24 08: 50, Alexander Dahl wrote: > Hei, > > Am Thu, Jul 04, 2024 at 10: 18: 55AM +0200 schrieb Alexander Dahl: >> Hello Heiko, >> >> Am Thu, Jul 04, 2024 at 06: 28: 31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 01.08.24 08:50, Alexander Dahl wrote:
Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl: > Hello Heiko, > > Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher: > > Hello Alexander, > > > > On 03.07.24 12:12, Alexander Dahl wrote: > > > Hei hei, > > > > > > filesystem handling is different in U-Boot and beyond that UBI/UBIFS is > > > different from other filesystems in U-Boot. There's UBI and UBIFS code > > > ported from Linux (quite old already now, maybe someone wants to update > > > that?), and there's "glue code" or "wrapper code" to interface with > > > U-Boot scripts, commands, and filesystem handling. The fixes and > > > improvements in this patch series are for this U-Boot specific glue > > > code. > > > > Yes, the linux base is very old ... patches are welcome! > > The last sync was back in 2015 from linux v4.2, there were 800+ > changes to ubi/ubifs in Linux since then. :-/ > > > And for me it is not that easy, as I do not have a hardware with > > current mainline U-Boot running on it... I want to update a hardware > > I have to current mainline, but I had no time yet for it... > > Besides the custom hardware here, I used Microchip SAM9X60-Curiosity > lately, which is coming with a raw NAND flash and can boot from it. > > > > > > I'm no filesystem expert, but after days of debugging I'm quite sure the > > > bug is in U-Boot since UBIFS support was added in 2009, and it was > > > repeated in 2015 when generic filesystem support for UBIFS was added. > > > So please review carefully! > > > > Which bug? > > The memory leak and double free fixed with patch 1 of the series. > > > > > > The crashes were not easily reproducible, only with boards using the old > > > distroboot _and_ a boot script inspired by (but not equal to) the one > > > proposed by RAUC [1], which basically boils down to: > > > > > > ubifsmount ubi0:boot (from distroboot) > > > test -e (from distroboot) > > > ubifsmount ubi0:rootfs1 (this time from the boot script, > > > triggering a ubifs_umount) > > > > So, you have a special sequence you execute to trigger the bug, good! > > > > In special 2 ubifsmount in a row... may not often needed for booting! > > (I ask me, why that is needed? Boottime is not good than...) > > Using distroboot with a script here. The script is in a separate UBI > volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 > however. So there is 'ubifsmount ubi0:boot' from distroboot and in the > script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or > rootfs2) later. ubifsmount calls ubifsumount internally if some > volume is mounted already. > > > > > BTW: Is this really a good bootcmd in [1] as on *every* boot your > > Environment is saved? This is not good for lifetime of your > > storage device ... why not using bootcounter? > > Well, I was not aware of bootcounter, but I had a look and the actual > counter must be stored somewhere. Possible are: > > - pmic → has no storage possibility on my board > - rtc → soc internal only, volatile in the end (if battery is empty) > - i2c eeprom → missing > - spi flash → missing > - filesystem → ends up on the flash > - nvmem → no other nvmems present > - syscon or some cpu register or sram → volatile > > So none of these are possible in my case, I only have a raw NAND as > storage and thus I have to use U-Boot env, which is stored in UBI here > btw to not stress the flash too much. > > I could investigate if it would be possible to let RAUC use the > U-Boot bootcounter infrastructure, but currently RAUC depends on > U-Boot environment variables for tracking boot attempts. > > btw: documentation of bootcount is sparse, I only found the very short > 'doc/README.bootcount' and it's not even migrated to recent U-Boot > sphinx based docs. ;-) > > But from what I understood the concept is the same, U-Boot counts > something and Linux userspace has to reset it. The counter must be > stored somewhere, for example in U-Boot env if no other storage is > possible. > > > > > > Crashes can be triggered more easily, if patch order is changed and > > > patch 2 (resetting pointers to NULL after free) comes first, or if patch > > > 2 is applied on its own only. > > > > Hmm... > > > > > The fix is in the first patch, and on my boards I see no crashes > > > anymore. I also tested all kinds of combinations of calling `ubi part`, > > > `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, > > > `load`, `size`, and `test -e` and got no crashes anymore after the fix. > > > > That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think. > > Oh it has, 'test -e' calls file_exists() which calls fs_exists() which > ends up calling ubifs_exists() which is one of the functions causing > an immediate memory leak, see patch 1. > > > On what hardware do you test? Is it in mainline? > > Tested on custom hardware, but I'm confident it should be reproducible > on any board using ubifs, especially after applying patch 2 resetting > pointers of freed memory to NULL. This should trigger the bug with > the simple sequence already described: > > > ubifsmount ubi0:anyvolume > > ls ubi ubi0:anyvolume / # (or load, or test -e, or size) > > ubifsumount > > ubifsumount will call ubifs_umount() which calls > ubi_close_volume(c->ubi), that pointer is either invalid leading to a > double free inside of ubi_close_volume() and it will crash only in > certain conditions or the pointer is NULL after applying patch 2 of > the series, then ubi_close_volume() crashes right away with a NULL > pointer exception. > > Note: without patch 2 it very much depends on the sequence of > commands, but after the first ubi_close_volume() triggered by > ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later > by the second ubi_close_volume() triggered by ubifs_umount(). If you > do something in between those using the freed memory by something else > again, the second ubi_close_volume() access might get corrupted data > or access things outside of RAM. Patch 2 redirects this on a clean > NULL pointer exception you can easily trigger. > > In my case I got a pointer variable actually containing a string > "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid > pointer on the platform somewhere in RAM between 0x20000000 and > 0x28000000 so it took me two days to realize it's not a pointer. ;-) > > > > > > The three additional patches (2 to 4) are more or less safeguards and > > > improvements for the future, and come from me trying and my debugging > > > code done on the way, more or less optional, but I think nice to have. > > > > I will look at them .. but give me some time, as I am in holidays the > > next 2 weeks ... Hmm.. and it would be good to get some Tested-by > > from people with hardware... > > Take your time, no need to work in holidays. Would appreciate a > Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
I asked Ravi and Alexey (added to cc) if they have time to look it they can reproduce the issue and test your patches...
bye, Heiko
I tried to reproduce with CONFIG_SYS_MALLOC_LEN reduced to 1MB and calling ubifsls in a loop, logic below:
ubi part ubi0:rootfs ubifsmount ubi0:rootfs setenv i 1 while test $i -le 200; do ubifsls a/b/test.bin setexpr i $i + 1 sleep 1 echo "Loop count: $i" done
Thanks for testing! But it is not exactly the same sequence Alexander described ... but you also trigger a bug.
It is not same sequence but it is similar (multiple access to ubifs file exists and ubifsls) Test involves ubifsls which is also part of the patchset fix. Double free issue is visible on reading the code, but doesn't show up any crash on multiple mount -> file access -> unmount sequence.
Thanks, Ravi
diff --git a/fs/ubifs/ubifs.c b/fs/ubifs/ubifs.c index 048730db7f..33df4ff51f 100644 --- a/fs/ubifs/ubifs.c +++ b/fs/ubifs/ubifs.c @@ -557,6 +557,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename)
/* We have some sort of symlink recursion, bail out */ if (symlink_count++ > 8) {
ubifs_iput(inode); printf("Symlink recursion, aborting\n"); return 0; }
@@ -568,6 +569,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) * the leading slash */ next = name = link_name + 1; root_inum = 1;
ubifs_iput(inode); continue; } /* Relative to cur dir */
@@ -575,6 +577,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) link_name, next == NULL ? "" : next); memcpy(symlinkpath, buf, sizeof(buf)); next = name = symlinkpath;
ubifs_iput(inode); continue; }
@@ -583,8 +586,10 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) */
/* Found the node! */
if (!next || *next == '\0')
if (!next || *next == '\0') {
ubifs_iput(inode); return inum;
} root_inum = inum; name = next;
Should we not free inode_iget?
Michael
+1 … thanks for testing. :-)
It crashes with or without patch. With patch it takes 4,5 loops more to crash.
Log: UBIFS error (ubi0:0 pid 0): ubifs_iget: failed to read inode 85, error -12 "Synchronous Abort" handler, esr 0x96000004
-12 = -ENOMEM
So it seems, we have a memleak...
If with my patch it takes more loops to crash, this suggests we have more than one memory leak, and of them got fixed with my patch(es).
Yes, ubi/ubifs linux base is very old, but I tend to think that problem is in U-Boot adaption layer...
I would also suspect the adaption layer.
However, although my patch fixes one memory leak, that was not the main reason doing this. The reason of the crash in my case was the double free, and accessing invalid memory afterwards. So from my perspective the patches are okay on their own, and looking after more memory leaks should not be part of this series.
Greets Alex
P.S.: not sure what's causing this, but I only got the mails from you all through the mailing list. You can keep me in Cc, so I won't overlook your answers easily.
bye, Heiko
Thanks, Ravi
Greets Alex
> > Greets > Alex > > > > > bye, > > Heiko > > > > > > Greets > > > Alex > > > > > > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_bl... https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_blob_master_contrib_uboot.sh&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=lpy642UhqtTXHV0dd0xnOtQxkQSO5RC6FwcUbzt-FsNwd3QSXZsulfqC6dP9qV7G&s=wbnw_T9_r6taMKIFqJ2-wtj9Zv02OD7mbIy-ZZFRjRE&e= https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_blob_master_contrib_uboot.sh&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=lpy642UhqtTXHV0dd0xnOtQxkQSO5RC6FwcUbzt-FsNwd3QSXZsulfqC6dP9qV7G&s=wbnw_T9_r6taMKIFqJ2-wtj9Zv02OD7mbIy-ZZFRjRE&e= > https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_blob_master_contrib_uboot.sh&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=lpy642UhqtTXHV0dd0xnOtQxkQSO5RC6FwcUbzt-FsNwd3QSXZsulfqC6dP9qV7G&s=wbnw_T9_r6taMKIFqJ2-wtj9Zv02OD7mbIy-ZZFRjRE&e= > > > > > > > Alexander Dahl (4): > > > fs: ubifs: Fix memleak and double free in u-boot wrapper functions > > > fs: ubifs: Set pointers to NULL after free > > > fs: ubifs: Make k(z)alloc/kfree symmetric > > > fs: ubifs: Add volume mounted check > > > > > > fs/ubifs/super.c | 8 ++++++-- > > > fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ > > > 2 files changed, 25 insertions(+), 14 deletions(-) > > > > > > > > > base-commit: 65fbdab27224ee3943a89496b21862db83c34da2 > > > > > > > -- > > DENX Software Engineering GmbH, Managing Director: Erika Unter > > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany > > Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- Michael Nazzareno Trimarchi Co-Founder & Chief Executive Officer M. +39 347 913 2170 michael@amarulasolutions.com __________________________________
Amarula Solutions BV Joop Geesinkweg 125, 1114 AB, Amsterdam, NL T. +31 (0)85 111 9172 info@amarulasolutions.com https://urldefense.proofpoint.com/v2/url?u=http-3A__www.amarulasolutions.com... https://urldefense.proofpoint.com/v2/url?u=http-3A__www.amarulasolutions.com&d=DwIFaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=U1Rv-XKqb0aXhWgOxWsZGY-rDRlRGztjPzwhC8OmxVOMsZS6Vgna6DmYv_FXd5_9&s=25Kc666V9Nv8qBX85mtwQCjOd9WayHgtVr2EU2RbtL4&e=

Hello Michael,
On 05.08.24 10:58, Michael Nazzareno Trimarchi wrote:
Hi all
On Mon, Aug 5, 2024 at 10:49 AM Alexander Dahl ada@thorsis.com wrote:
Hello,
Am Mon, Aug 05, 2024 at 07:28:19AM +0200 schrieb Heiko Schocher:
Hello Ravi,
On 01.08.24 21:39, Ravi Minnikanti wrote:
Hi Heiko, Alexander,
On 7/31/24 23:54, Heiko Schocher wrote:
Hello Alexander, On 01. 08. 24 08: 50, Alexander Dahl wrote: > Hei, > > Am Thu, Jul 04, 2024 at 10: 18: 55AM +0200 schrieb Alexander Dahl: >> Hello Heiko, >> >> Am Thu, Jul 04, 2024 at 06: 28: 31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 01.08.24 08:50, Alexander Dahl wrote:
Hei,
Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl: > Hello Heiko, > > Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher: >> Hello Alexander, >> >> On 03.07.24 12:12, Alexander Dahl wrote: >>> Hei hei, >>> >>> filesystem handling is different in U-Boot and beyond that UBI/UBIFS is >>> different from other filesystems in U-Boot. There's UBI and UBIFS code >>> ported from Linux (quite old already now, maybe someone wants to update >>> that?), and there's "glue code" or "wrapper code" to interface with >>> U-Boot scripts, commands, and filesystem handling. The fixes and >>> improvements in this patch series are for this U-Boot specific glue >>> code. >> >> Yes, the linux base is very old ... patches are welcome! > > The last sync was back in 2015 from linux v4.2, there were 800+ > changes to ubi/ubifs in Linux since then. :-/ > >> And for me it is not that easy, as I do not have a hardware with >> current mainline U-Boot running on it... I want to update a hardware >> I have to current mainline, but I had no time yet for it... > > Besides the custom hardware here, I used Microchip SAM9X60-Curiosity > lately, which is coming with a raw NAND flash and can boot from it. > >> >>> I'm no filesystem expert, but after days of debugging I'm quite sure the >>> bug is in U-Boot since UBIFS support was added in 2009, and it was >>> repeated in 2015 when generic filesystem support for UBIFS was added. >>> So please review carefully! >> >> Which bug? > > The memory leak and double free fixed with patch 1 of the series. > >> >>> The crashes were not easily reproducible, only with boards using the old >>> distroboot _and_ a boot script inspired by (but not equal to) the one >>> proposed by RAUC [1], which basically boils down to: >>> >>> ubifsmount ubi0:boot (from distroboot) >>> test -e (from distroboot) >>> ubifsmount ubi0:rootfs1 (this time from the boot script, >>> triggering a ubifs_umount) >> >> So, you have a special sequence you execute to trigger the bug, good! >> >> In special 2 ubifsmount in a row... may not often needed for booting! >> (I ask me, why that is needed? Boottime is not good than...) > > Using distroboot with a script here. The script is in a separate UBI > volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 > however. So there is 'ubifsmount ubi0:boot' from distroboot and in the > script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or > rootfs2) later. ubifsmount calls ubifsumount internally if some > volume is mounted already. > >> >> BTW: Is this really a good bootcmd in [1] as on *every* boot your >> Environment is saved? This is not good for lifetime of your >> storage device ... why not using bootcounter? > > Well, I was not aware of bootcounter, but I had a look and the actual > counter must be stored somewhere. Possible are: > > - pmic → has no storage possibility on my board > - rtc → soc internal only, volatile in the end (if battery is empty) > - i2c eeprom → missing > - spi flash → missing > - filesystem → ends up on the flash > - nvmem → no other nvmems present > - syscon or some cpu register or sram → volatile > > So none of these are possible in my case, I only have a raw NAND as > storage and thus I have to use U-Boot env, which is stored in UBI here > btw to not stress the flash too much. > > I could investigate if it would be possible to let RAUC use the > U-Boot bootcounter infrastructure, but currently RAUC depends on > U-Boot environment variables for tracking boot attempts. > > btw: documentation of bootcount is sparse, I only found the very short > 'doc/README.bootcount' and it's not even migrated to recent U-Boot > sphinx based docs. ;-) > > But from what I understood the concept is the same, U-Boot counts > something and Linux userspace has to reset it. The counter must be > stored somewhere, for example in U-Boot env if no other storage is > possible. > >> >>> Crashes can be triggered more easily, if patch order is changed and >>> patch 2 (resetting pointers to NULL after free) comes first, or if patch >>> 2 is applied on its own only. >> >> Hmm... >> >>> The fix is in the first patch, and on my boards I see no crashes >>> anymore. I also tested all kinds of combinations of calling `ubi part`, >>> `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, >>> `load`, `size`, and `test -e` and got no crashes anymore after the fix. >> >> That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think. > > Oh it has, 'test -e' calls file_exists() which calls fs_exists() which > ends up calling ubifs_exists() which is one of the functions causing > an immediate memory leak, see patch 1. > >> On what hardware do you test? Is it in mainline? > > Tested on custom hardware, but I'm confident it should be reproducible > on any board using ubifs, especially after applying patch 2 resetting > pointers of freed memory to NULL. This should trigger the bug with > the simple sequence already described: > > > ubifsmount ubi0:anyvolume > > ls ubi ubi0:anyvolume / # (or load, or test -e, or size) > > ubifsumount > > ubifsumount will call ubifs_umount() which calls > ubi_close_volume(c->ubi), that pointer is either invalid leading to a > double free inside of ubi_close_volume() and it will crash only in > certain conditions or the pointer is NULL after applying patch 2 of > the series, then ubi_close_volume() crashes right away with a NULL > pointer exception. > > Note: without patch 2 it very much depends on the sequence of > commands, but after the first ubi_close_volume() triggered by > ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later > by the second ubi_close_volume() triggered by ubifs_umount(). If you > do something in between those using the freed memory by something else > again, the second ubi_close_volume() access might get corrupted data > or access things outside of RAM. Patch 2 redirects this on a clean > NULL pointer exception you can easily trigger. > > In my case I got a pointer variable actually containing a string > "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid > pointer on the platform somewhere in RAM between 0x20000000 and > 0x28000000 so it took me two days to realize it's not a pointer. ;-) > >> >>> The three additional patches (2 to 4) are more or less safeguards and >>> improvements for the future, and come from me trying and my debugging >>> code done on the way, more or less optional, but I think nice to have. >> >> I will look at them .. but give me some time, as I am in holidays the >> next 2 weeks ... Hmm.. and it would be good to get some Tested-by >> from people with hardware... > > Take your time, no need to work in holidays. Would appreciate a > Tested-by by anyone else though, maybe some of the raw NAND folks?
Well, apparently nobody had a look in the month of July, I add the raw NAND maintainers in Cc, maybe I should have done in the first place.
Would be happy if someone could have a look at the fix, maybe read the patches first before the discussion? ;-)
I asked Ravi and Alexey (added to cc) if they have time to look it they can reproduce the issue and test your patches...
bye, Heiko
I tried to reproduce with CONFIG_SYS_MALLOC_LEN reduced to 1MB and calling ubifsls in a loop, logic below:
ubi part ubi0:rootfs ubifsmount ubi0:rootfs setenv i 1 while test $i -le 200; do ubifsls a/b/test.bin setexpr i $i + 1 sleep 1 echo "Loop count: $i" done
Thanks for testing! But it is not exactly the same sequence Alexander described ... but you also trigger a bug.
diff --git a/fs/ubifs/ubifs.c b/fs/ubifs/ubifs.c index 048730db7f..33df4ff51f 100644 --- a/fs/ubifs/ubifs.c +++ b/fs/ubifs/ubifs.c @@ -557,6 +557,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename)
/* We have some sort of symlink recursion, bail out */ if (symlink_count++ > 8) {
ubifs_iput(inode); printf("Symlink recursion, aborting\n"); return 0; }
@@ -568,6 +569,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) * the leading slash */ next = name = link_name + 1; root_inum = 1;
ubifs_iput(inode); continue; } /* Relative to cur dir */
@@ -575,6 +577,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) link_name, next == NULL ? "" : next); memcpy(symlinkpath, buf, sizeof(buf)); next = name = symlinkpath;
ubifs_iput(inode); continue; }
@@ -583,8 +586,10 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) */
/* Found the node! */
if (!next || *next == '\0')
if (!next || *next == '\0') {
ubifs_iput(inode); return inum;
} root_inum = inum; name = next;
Should we not free inode_iget?
Yep, that patchsnipset seems valid to me ... We should check all calls from ubifs_get() ... so I think we need to correct also filldir() fucntion in this file?
and
fs/ubifs/recovery.c ubifs_recover_size() fs/ubifs/super.c ubifs_fill_super()
?
bye, Heiko
Michael
+1 … thanks for testing. :-)
It crashes with or without patch. With patch it takes 4,5 loops more to crash.
Log: UBIFS error (ubi0:0 pid 0): ubifs_iget: failed to read inode 85, error -12 "Synchronous Abort" handler, esr 0x96000004
-12 = -ENOMEM
So it seems, we have a memleak...
If with my patch it takes more loops to crash, this suggests we have more than one memory leak, and of them got fixed with my patch(es).
Yes, ubi/ubifs linux base is very old, but I tend to think that problem is in U-Boot adaption layer...
I would also suspect the adaption layer.
However, although my patch fixes one memory leak, that was not the main reason doing this. The reason of the crash in my case was the double free, and accessing invalid memory afterwards. So from my perspective the patches are okay on their own, and looking after more memory leaks should not be part of this series.
Greets Alex
P.S.: not sure what's causing this, but I only got the mails from you all through the mailing list. You can keep me in Cc, so I won't overlook your answers easily.
bye, Heiko
Thanks, Ravi
Greets Alex
> > Greets > Alex > >> >> bye, >> Heiko >>> >>> Greets >>> Alex >>> >>> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_bl... https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_blob_master_contrib_uboot.sh&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=lpy642UhqtTXHV0dd0xnOtQxkQSO5RC6FwcUbzt-FsNwd3QSXZsulfqC6dP9qV7G&s=wbnw_T9_r6taMKIFqJ2-wtj9Zv02OD7mbIy-ZZFRjRE&e= >>> >>> Alexander Dahl (4): >>> fs: ubifs: Fix memleak and double free in u-boot wrapper functions >>> fs: ubifs: Set pointers to NULL after free >>> fs: ubifs: Make k(z)alloc/kfree symmetric >>> fs: ubifs: Add volume mounted check >>> >>> fs/ubifs/super.c | 8 ++++++-- >>> fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ >>> 2 files changed, 25 insertions(+), 14 deletions(-) >>> >>> >>> base-commit: 65fbdab27224ee3943a89496b21862db83c34da2 >>> >> >> -- >> DENX Software Engineering GmbH, Managing Director: Erika Unter >> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany >> Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de

Hi
I found few places, but I can not test changes against my code. Nice if someone start to commit and split by file this changes so we can test everything
Micahel
On Mon, Aug 5, 2024 at 12:54 PM Heiko Schocher hs@denx.de wrote:
Hello Michael,
On 05.08.24 10:58, Michael Nazzareno Trimarchi wrote:
Hi all
On Mon, Aug 5, 2024 at 10:49 AM Alexander Dahl ada@thorsis.com wrote:
Hello,
Am Mon, Aug 05, 2024 at 07:28:19AM +0200 schrieb Heiko Schocher:
Hello Ravi,
On 01.08.24 21:39, Ravi Minnikanti wrote:
Hi Heiko, Alexander,
On 7/31/24 23:54, Heiko Schocher wrote:
Hello Alexander, On 01. 08. 24 08: 50, Alexander Dahl wrote: > Hei, > > Am Thu, Jul 04, 2024 at 10: 18: 55AM +0200 schrieb Alexander Dahl: >> Hello Heiko, >> >> Am Thu, Jul 04, 2024 at 06: 28: 31AM +0200 schrieb Heiko Schocher:
Hello Alexander,
On 01.08.24 08:50, Alexander Dahl wrote: > Hei, > > Am Thu, Jul 04, 2024 at 10:18:55AM +0200 schrieb Alexander Dahl: >> Hello Heiko, >> >> Am Thu, Jul 04, 2024 at 06:28:31AM +0200 schrieb Heiko Schocher: >>> Hello Alexander, >>> >>> On 03.07.24 12:12, Alexander Dahl wrote: >>>> Hei hei, >>>> >>>> filesystem handling is different in U-Boot and beyond that UBI/UBIFS is >>>> different from other filesystems in U-Boot. There's UBI and UBIFS code >>>> ported from Linux (quite old already now, maybe someone wants to update >>>> that?), and there's "glue code" or "wrapper code" to interface with >>>> U-Boot scripts, commands, and filesystem handling. The fixes and >>>> improvements in this patch series are for this U-Boot specific glue >>>> code. >>> >>> Yes, the linux base is very old ... patches are welcome! >> >> The last sync was back in 2015 from linux v4.2, there were 800+ >> changes to ubi/ubifs in Linux since then. :-/ >> >>> And for me it is not that easy, as I do not have a hardware with >>> current mainline U-Boot running on it... I want to update a hardware >>> I have to current mainline, but I had no time yet for it... >> >> Besides the custom hardware here, I used Microchip SAM9X60-Curiosity >> lately, which is coming with a raw NAND flash and can boot from it. >> >>> >>>> I'm no filesystem expert, but after days of debugging I'm quite sure the >>>> bug is in U-Boot since UBIFS support was added in 2009, and it was >>>> repeated in 2015 when generic filesystem support for UBIFS was added. >>>> So please review carefully! >>> >>> Which bug? >> >> The memory leak and double free fixed with patch 1 of the series. >> >>> >>>> The crashes were not easily reproducible, only with boards using the old >>>> distroboot _and_ a boot script inspired by (but not equal to) the one >>>> proposed by RAUC [1], which basically boils down to: >>>> >>>> ubifsmount ubi0:boot (from distroboot) >>>> test -e (from distroboot) >>>> ubifsmount ubi0:rootfs1 (this time from the boot script, >>>> triggering a ubifs_umount) >>> >>> So, you have a special sequence you execute to trigger the bug, good! >>> >>> In special 2 ubifsmount in a row... may not often needed for booting! >>> (I ask me, why that is needed? Boottime is not good than...) >> >> Using distroboot with a script here. The script is in a separate UBI >> volume ubi0:boot, kernel is loaded from ubi0:rootfs1 or ubi0:rootfs2 >> however. So there is 'ubifsmount ubi0:boot' from distroboot and in the >> script found, loaded, and run there is 'ubifsmount ubi0:rootfs1' (or >> rootfs2) later. ubifsmount calls ubifsumount internally if some >> volume is mounted already. >> >>> >>> BTW: Is this really a good bootcmd in [1] as on *every* boot your >>> Environment is saved? This is not good for lifetime of your >>> storage device ... why not using bootcounter? >> >> Well, I was not aware of bootcounter, but I had a look and the actual >> counter must be stored somewhere. Possible are: >> >> - pmic → has no storage possibility on my board >> - rtc → soc internal only, volatile in the end (if battery is empty) >> - i2c eeprom → missing >> - spi flash → missing >> - filesystem → ends up on the flash >> - nvmem → no other nvmems present >> - syscon or some cpu register or sram → volatile >> >> So none of these are possible in my case, I only have a raw NAND as >> storage and thus I have to use U-Boot env, which is stored in UBI here >> btw to not stress the flash too much. >> >> I could investigate if it would be possible to let RAUC use the >> U-Boot bootcounter infrastructure, but currently RAUC depends on >> U-Boot environment variables for tracking boot attempts. >> >> btw: documentation of bootcount is sparse, I only found the very short >> 'doc/README.bootcount' and it's not even migrated to recent U-Boot >> sphinx based docs. ;-) >> >> But from what I understood the concept is the same, U-Boot counts >> something and Linux userspace has to reset it. The counter must be >> stored somewhere, for example in U-Boot env if no other storage is >> possible. >> >>> >>>> Crashes can be triggered more easily, if patch order is changed and >>>> patch 2 (resetting pointers to NULL after free) comes first, or if patch >>>> 2 is applied on its own only. >>> >>> Hmm... >>> >>>> The fix is in the first patch, and on my boards I see no crashes >>>> anymore. I also tested all kinds of combinations of calling `ubi part`, >>>> `ubi detach`, `ubifsmount`, `ubifsumount`, `ubifsls`, `ubifsload`, `ls`, >>>> `load`, `size`, and `test -e` and got no crashes anymore after the fix. >>> >>> That sounds good! Hmm.. test -e has nothing to do with ubi/ubifs I think. >> >> Oh it has, 'test -e' calls file_exists() which calls fs_exists() which >> ends up calling ubifs_exists() which is one of the functions causing >> an immediate memory leak, see patch 1. >> >>> On what hardware do you test? Is it in mainline? >> >> Tested on custom hardware, but I'm confident it should be reproducible >> on any board using ubifs, especially after applying patch 2 resetting >> pointers of freed memory to NULL. This should trigger the bug with >> the simple sequence already described: >> >> > ubifsmount ubi0:anyvolume >> > ls ubi ubi0:anyvolume / # (or load, or test -e, or size) >> > ubifsumount >> >> ubifsumount will call ubifs_umount() which calls >> ubi_close_volume(c->ubi), that pointer is either invalid leading to a >> double free inside of ubi_close_volume() and it will crash only in >> certain conditions or the pointer is NULL after applying patch 2 of >> the series, then ubi_close_volume() crashes right away with a NULL >> pointer exception. >> >> Note: without patch 2 it very much depends on the sequence of >> commands, but after the first ubi_close_volume() triggered by >> ls/load/size/exists the pointer in ubifs_sb is invalid, but accessed later >> by the second ubi_close_volume() triggered by ubifs_umount(). If you >> do something in between those using the freed memory by something else >> again, the second ubi_close_volume() access might get corrupted data >> or access things outside of RAM. Patch 2 redirects this on a clean >> NULL pointer exception you can easily trigger. >> >> In my case I got a pointer variable actually containing a string >> "ng.." aka 0x2e2e676e which looked suspiciously similar to a valid >> pointer on the platform somewhere in RAM between 0x20000000 and >> 0x28000000 so it took me two days to realize it's not a pointer. ;-) >> >>> >>>> The three additional patches (2 to 4) are more or less safeguards and >>>> improvements for the future, and come from me trying and my debugging >>>> code done on the way, more or less optional, but I think nice to have. >>> >>> I will look at them .. but give me some time, as I am in holidays the >>> next 2 weeks ... Hmm.. and it would be good to get some Tested-by >>> from people with hardware... >> >> Take your time, no need to work in holidays. Would appreciate a >> Tested-by by anyone else though, maybe some of the raw NAND folks? > > Well, apparently nobody had a look in the month of July, I add the raw > NAND maintainers in Cc, maybe I should have done in the first place. > > Would be happy if someone could have a look at the fix, maybe read the > patches first before the discussion? ;-)
I asked Ravi and Alexey (added to cc) if they have time to look it they can reproduce the issue and test your patches...
bye, Heiko
I tried to reproduce with CONFIG_SYS_MALLOC_LEN reduced to 1MB and calling ubifsls in a loop, logic below:
ubi part ubi0:rootfs ubifsmount ubi0:rootfs setenv i 1 while test $i -le 200; do ubifsls a/b/test.bin setexpr i $i + 1 sleep 1 echo "Loop count: $i" done
Thanks for testing! But it is not exactly the same sequence Alexander described ... but you also trigger a bug.
diff --git a/fs/ubifs/ubifs.c b/fs/ubifs/ubifs.c index 048730db7f..33df4ff51f 100644 --- a/fs/ubifs/ubifs.c +++ b/fs/ubifs/ubifs.c @@ -557,6 +557,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename)
/* We have some sort of symlink recursion, bail out */ if (symlink_count++ > 8) {
ubifs_iput(inode); printf("Symlink recursion, aborting\n"); return 0; }
@@ -568,6 +569,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) * the leading slash */ next = name = link_name + 1; root_inum = 1;
ubifs_iput(inode); continue; } /* Relative to cur dir */
@@ -575,6 +577,7 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) link_name, next == NULL ? "" : next); memcpy(symlinkpath, buf, sizeof(buf)); next = name = symlinkpath;
ubifs_iput(inode); continue; }
@@ -583,8 +586,10 @@ static unsigned long ubifs_findfile(struct super_block *sb, char *filename) */
/* Found the node! */
if (!next || *next == '\0')
if (!next || *next == '\0') {
ubifs_iput(inode); return inum;
} root_inum = inum; name = next;
Should we not free inode_iget?
Yep, that patchsnipset seems valid to me ... We should check all calls from ubifs_get() ... so I think we need to correct also filldir() fucntion in this file?
and
fs/ubifs/recovery.c ubifs_recover_size() fs/ubifs/super.c ubifs_fill_super()
?
bye, Heiko
Michael
+1 … thanks for testing. :-)
It crashes with or without patch. With patch it takes 4,5 loops more to crash.
Log: UBIFS error (ubi0:0 pid 0): ubifs_iget: failed to read inode 85, error -12 "Synchronous Abort" handler, esr 0x96000004
-12 = -ENOMEM
So it seems, we have a memleak...
If with my patch it takes more loops to crash, this suggests we have more than one memory leak, and of them got fixed with my patch(es).
Yes, ubi/ubifs linux base is very old, but I tend to think that problem is in U-Boot adaption layer...
I would also suspect the adaption layer.
However, although my patch fixes one memory leak, that was not the main reason doing this. The reason of the crash in my case was the double free, and accessing invalid memory afterwards. So from my perspective the patches are okay on their own, and looking after more memory leaks should not be part of this series.
Greets Alex
P.S.: not sure what's causing this, but I only got the mails from you all through the mailing list. You can keep me in Cc, so I won't overlook your answers easily.
bye, Heiko
Thanks, Ravi
> > Greets > Alex > >> >> Greets >> Alex >> >>> >>> bye, >>> Heiko >>>> >>>> Greets >>>> Alex >>>> >>>> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_bl... https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rauc_rauc_blob_master_contrib_uboot.sh&d=DwIDaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=3apjoJolwzwH82pd_c1HzSLzzrwfu9x1lSm802edgOg&m=lpy642UhqtTXHV0dd0xnOtQxkQSO5RC6FwcUbzt-FsNwd3QSXZsulfqC6dP9qV7G&s=wbnw_T9_r6taMKIFqJ2-wtj9Zv02OD7mbIy-ZZFRjRE&e= >>>> >>>> Alexander Dahl (4): >>>> fs: ubifs: Fix memleak and double free in u-boot wrapper functions >>>> fs: ubifs: Set pointers to NULL after free >>>> fs: ubifs: Make k(z)alloc/kfree symmetric >>>> fs: ubifs: Add volume mounted check >>>> >>>> fs/ubifs/super.c | 8 ++++++-- >>>> fs/ubifs/ubifs.c | 31 +++++++++++++++++++------------ >>>> 2 files changed, 25 insertions(+), 14 deletions(-) >>>> >>>> >>>> base-commit: 65fbdab27224ee3943a89496b21862db83c34da2 >>>> >>> >>> -- >>> DENX Software Engineering GmbH, Managing Director: Erika Unter >>> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany >>> Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
-- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
participants (4)
-
Alexander Dahl
-
Heiko Schocher
-
Michael Nazzareno Trimarchi
-
Ravi Minnikanti