Re: [U-Boot] [LEDE-DEV] Older u-boot mangles UBI from ubinize 1.5.2

On 11.08.2016 at 13:49, Daniel Golle wrote:
Hi!
On Thu, Aug 11, 2016 at 04:28:47AM -0700, J Mo wrote:
I got that good old feeling... like I just jumped onto a bag of flaming poo. Ha ha
On 08/11/2016 03:40 AM, Daniel Golle wrote:
Understandable. However, we also need to experiment and figure out the mess left behind by $vendor, which often doesn't leave many reasonable options for installing 3rd-party firmware. With regard to that specific hack, I never truly understood why it was needed in the first place -- I'm not using it on any UBI-enabled device and believe it's some kind of work-around to allow ubinized images to be written via nandwrite, initially in order to support the vendor/stock sysupgrade format of a specific device (NETGEAR WNDR4300). Please correct me or add the missing bits needed to understand the use-case. It was added to OpenWrt long ago in r38681...r38683 and has since needed fixing several times, in r42940, r43287, r44658, r44801 and r44881. Later on it was re-used by a bunch of other devices, e.g. bcm4708-netgear-r6250, bcm4708-netgear-r6300-v2, bcm4708-buffalo-wzr-1750dhp, bcm47081-buffalo-wzr-600dhp2 and probably some more.
Gabor and Rafal should know more about it and why exactly this is needed and supposedly cannot be solved without this hack.
I'm also confused about WTF that patch does. If it was device-specific to comply with OEM-hackery, why apply it generally?
I reckon because it's generic in the sense that it's used by more than one target (ar71xx, bcm47xx) and we don't do any device/board specific patching at all.
Hm, I just found another example. I don't know why this didn't turn up in my searches yesterday since it's a perfect match with the EXACT error. This too was on a QSDK AP148:
https://patchwork.ozlabs.org/patch/509468/
I think I'll go rip that patch out here in a bit, recompile my image, and see what happens.
In the end, this will at least give you some consistency between U-Boot's and the kernel's UBI implementations, i.e. either both work or both fail (e.g. to attach a not entirely erased/formatted UBI device with left-overs from previous uses of the stock fw). If you are flashing the firmware using ubiformat, this shouldn't be a problem anyway.
[...]
Thanks for the insight.
The idea was to have a UBI with three volumes: kernel, rootfs(squashfs), and the rootfs_data overlay(ubifs).
One of my problems is that someone thought it was a great idea to name the SMEM NAND UBI partition "rootfs". There's a patch out there which is supposed to fix that (rename to "ubi"), but it's apparently not working for me. The auto rootfs selection method might be trying to use the smem/mtd partition named "rootfs" instead of the UBI volume named "rootfs"?
No, these are two different things and it shouldn't matter. However, in order to have your UBI device auto-attached without any cmdline parameters it needs to be named 'ubi', so simply changing the name of the MTD partition in the device-tree should do the trick.
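To see which MTD partition carries which name on a running system, the /proc/mtd listing can be parsed. The following is only a minimal sketch: the sample listing and the helper name find_mtd_by_name are made up for illustration, and only the usual /proc/mtd line layout (mtdN: size erasesize "name") is assumed.

```python
import re

# Hypothetical sample of /proc/mtd; partition names and sizes are invented.
SAMPLE_PROC_MTD = '''dev:    size   erasesize  name
mtd0: 00040000 00020000 "SBL1"
mtd10: 00040000 00020000 "APPSBL"
mtd11: 04000000 00020000 "rootfs"
'''

# One entry per line: index, size (hex), erase size (hex), quoted name.
LINE_RE = re.compile(r'^mtd(\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')

def find_mtd_by_name(proc_mtd_text, name):
    """Return the index of the first MTD partition called `name`, or None."""
    for line in proc_mtd_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group(4) == name:
            return int(m.group(1))
    return None

print(find_mtd_by_name(SAMPLE_PROC_MTD, "rootfs"))
print(find_mtd_by_name(SAMPLE_PROC_MTD, "ubi"))
```

With the invented sample, no partition is called 'ubi' yet, which is the situation where auto-attach would not kick in and the partition would need renaming in the device tree.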
And yes, my DTS has: bootargs = "console=ttyMSM0,115200n8 ubi.mtd=11 root=ubi0:rootfs rootfstype=squashfs";
Is that not valid? Looks right to me.
squashfs doesn't work on UBI character devices; like most filesystems, it needs a block device. Rootfs detection works automagically in OpenWrt/LEDE: having a UBI volume named 'rootfs' is enough. It is then decided automatically whether the volume is UBIFS, in which case it is mounted much like what you tried to do now, or whether to create a ubiblock device and select that to be mounted as rootfs. In any case, you shouldn't need any kernel command-line parameters for that, so simply drop everything past 'console=ttyMSM0,115200n8' (and btw, this can also be done more nicely by setting stdout-path rather than hacking the cmdline).
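The choice between mounting UBIFS directly and going through ubiblock can be pictured with a small Python sketch. How the OpenWrt/LEDE patches actually detect the filesystem is not stated in this thread; looking at magic bytes at the start of the volume is an assumption made purely for illustration, and pick_rootfs_mount is a hypothetical helper.

```python
SQUASHFS_MAGIC = b"hsqs"                      # little-endian squashfs superblock magic
UBIFS_NODE_MAGIC = bytes.fromhex("31181006")  # 0x06101831 as stored little-endian

def pick_rootfs_mount(volume_head):
    """Guess how a UBI volume named 'rootfs' would have to be mounted.

    volume_head: the first bytes read from the volume.
    """
    if volume_head[:4] == UBIFS_NODE_MAGIC:
        return "ubifs"      # mount the UBI volume directly
    if volume_head[:4] == SQUASHFS_MAGIC:
        return "ubiblock"   # squashfs needs a block device on top of the volume
    return "unknown"

print(pick_rootfs_mount(b"hsqs" + bytes(28)))
print(pick_rootfs_mount(bytes.fromhex("31181006") + bytes(28)))
```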
Right. Depending on whether U-Boot's UBI support or the kernel itself first touches the freshly-written UBI device, things go wrong, because only the hacked-up OpenWrt/LEDE kernel does the right magic on firstboot...
The kernel is in the UBI, so u-boot is going to attach it. I can't get around that without doing major reconstructive surgery to how this thing was designed to boot.
The number of OpenWRT/LEDE devices that have KERNEL_IN_UBI set is tiny. I think I only saw one or two others, and they were obscure or dev boards. This is likely why the issue hasn't come up before, and it could have been a problem for a while without anybody noticing.
I do the exact same for all boards on the oxnas target and it works great. I even store U-Boot's environment inside UBI volumes. I reckon it really depends on how you flash the device in the first place, i.e. using raw nandwrite (which may need the aforementioned hack to erase the remaining free space) or using ubiformat (which shouldn't need that).
I don't know who's to blame. That's why I started this three-way cross posting clusterfark. =)
Not too bad, at least we get to discuss some forgotten ugliness now before it starts to affect more people...
I'm most tempted to blame the kernel rather than u-boot. After all, I can change the kernel, and the old kernel worked fine.
I reckon it's somewhere between the way the image was generated and written to the flash, and the fact that it then didn't get the right treatment on first boot because U-Boot tried to access it before it got fixed up. Again, if you just use ubiformat to write the image, you won't need any EOF-markers or other hacks (i.e. you also shouldn't include them in the ubinized image!)
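One way to look at what a raw write left behind is to walk a flash dump erase block by erase block and see which blocks start with a UBI erase-counter header, which are fully erased, and which hold something else. The sketch below is only an illustration: classify_pebs is a hypothetical helper, and the 16-byte "erase block" and the dump contents are invented so the example stays small. Only the "UBI#" header magic is taken from UBI itself.

```python
UBI_EC_HDR_MAGIC = b"UBI#"  # magic at the start of a UBI erase-counter header

def classify_pebs(image, peb_size):
    """Label each erase block of a raw dump as 'ubi', 'erased' or 'other'."""
    labels = []
    for off in range(0, len(image), peb_size):
        block = image[off:off + peb_size]
        if block[:4] == UBI_EC_HDR_MAGIC:
            labels.append("ubi")
        elif block.count(0xFF) == len(block):
            labels.append("erased")   # all 0xFF, as after a flash erase
        else:
            labels.append("other")    # left-overs or non-UBI data
    return labels

# Tiny made-up dump: one UBI block, one erased block, one block of left-overs.
PEB = 16
dump = (UBI_EC_HDR_MAGIC + bytes(12)) + (b"\xff" * PEB) + b"stale data".ljust(PEB, b"\x00")
print(classify_pebs(dump, PEB))
```

Blocks labelled 'other' after the end of the image are the kind of left-overs that a raw write does not clean up, whereas ubiformat formats the whole partition.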
Did you intentionally drop linux-mtd from the CC's after I offered to discuss your patches on linux-mtd? ;-)
Thanks //richard

On Thu, Aug 11, 2016 at 02:22:58PM +0200, Richard Weinberger wrote:
Did you intentionally drop linux-mtd from the CC's after I offered to discuss your patches on linux-mtd? ;-)
I replied twice: once including all the CC's, with the intention to contribute to the general debate, and once to lede-dev, you and J Mo, intending to support J Mo in creating board support and figuring out how to work with UBI in OpenWrt/LEDE, which I assumed would be considered noise by most readers of the other lists involved.
Participants (2): Daniel Golle, Richard Weinberger