[U-Boot] freescale i.MX28 mxsboot NAND booting on mx28evk bad blocks

I'm prototyping a project that's going to need to boot linux from NAND on a mx28evk board.
I was able to successfully use the u-boot mxsboot utility to generate a nand image and burn it, then boot from it. I noticed one anomaly though, when using mxsboot/u-boot to generate and burn the bootstream to NAND, when the linux kernel boots it finds bad blocks:
[ 1.090000] NAND device: Manufacturer ID: 0x2c, Chip ID: 0xf1 (Micron MT29F14
[ 1.100000] Scanning device for bad blocks
[ 1.110000] Bad eraseblock 0 at 0x000000000000
[ 1.110000] Bad eraseblock 1 at 0x000000020000
[ 1.120000] Bad eraseblock 2 at 0x000000040000
[ 1.120000] Bad eraseblock 3 at 0x000000060000
When I burn the exact same bootstream with kobs-ng, linux does not find any bad blocks, so it seems to be a byproduct of either the image generated by mxsboot or the u-boot burning.
I don't think this is having any functional impact, as the scrub component of burning a new nand image wipes out the bad blocks, and once linux is booted it really has no need to read the bootstream from the bootloader mtd partition.
However, it seems anomalous, and I was wondering if other people have seen it, and whether or not it is something that might be fixed.
Thanks much…

On 03/18/2013 07:50:07 PM, Paul B. Henson wrote:
I'm prototyping a project that's going to need to boot linux from NAND on a mx28evk board.
I was able to successfully use the u-boot mxsboot utility to generate a nand image and burn it, then boot from it. I noticed one anomaly though, when using mxsboot/u-boot to generate and burn the bootstream to NAND, when the linux kernel boots it finds bad blocks:
[ 1.090000] NAND device: Manufacturer ID: 0x2c, Chip ID: 0xf1 (Micron MT29F14
[ 1.100000] Scanning device for bad blocks
[ 1.110000] Bad eraseblock 0 at 0x000000000000
[ 1.110000] Bad eraseblock 1 at 0x000000020000
[ 1.120000] Bad eraseblock 2 at 0x000000040000
[ 1.120000] Bad eraseblock 3 at 0x000000060000
When I burn the exact same bootstream with kobs-ng, linux does not find any bad blocks, so it seems to be a byproduct of either the image generated by mxsboot or the u-boot burning.
I don't think this is having any functional impact, as the scrub component of burning a new nand image wipes out the bad blocks,
You should not be routinely scrubbing NAND!
The manufacturers put bad block information there for a reason.
-Scott

On Tue, Mar 19, 2013 at 06:23:27PM -0500, Scott Wood wrote:
I don't think this is having any functional impact, as the scrub component of burning a new nand image wipes out the bad blocks,
You should not be routinely scrubbing NAND!
The manufacturers put bad block information there for a reason.
Hmm, I was following the instructions in doc/README.mx28_common, which says to use "run update_nand_full" to burn the NAND image, and one component of that per include/configs/mx28evk.h is:
nand scrub -y 0x0 ${filesize}
Are the instructions/env script incorrect?
I don't believe the bad blocks that linux finds are actual bad blocks, and definitely not factory bad blocks. They seem to show up as a byproduct of the way u-boot is burning the NAND image. They are always at the same addresses (I tried two different NAND chips), and only appear when u-boot is used to burn the bootstream, but not when kobs-ng is used.
Thanks...

On 03/20/2013 04:20:07 PM, Paul B. Henson wrote:
On Tue, Mar 19, 2013 at 06:23:27PM -0500, Scott Wood wrote:
I don't think this is having any functional impact, as the scrub component of burning a new nand image wipes out the bad blocks,
You should not be routinely scrubbing NAND!
The manufacturers put bad block information there for a reason.
Hmm, I was following the instructions in doc/README.mx28_common, which says to use "run update_nand_full" to burn the NAND image, and one component of that per include/configs/mx28evk.h is:
nand scrub -y 0x0 ${filesize}
Are the instructions/env script incorrect?
The env script is incorrect. Otavio, Marek, what's going on here?
I don't believe the bad blocks that linux finds are actual bad blocks, and definitely not factory bad blocks. They seem to show up as a byproduct of the way u-boot is burning the NAND image. They are always at the same addresses (I tried two different NAND chips), and only appear when u-boot is used to burn the bootstream, but not when kobs-ng is used.
My guess is there's some mismatch regarding NAND layout, but someone more familiar with mx28 will need to answer that...
-Scott

On Mon, Mar 18, 2013 at 5:50 PM, Paul B. Henson henson@acm.org wrote:
I'm prototyping a project that's going to need to boot linux from NAND on a mx28evk board.
I was able to successfully use the u-boot mxsboot utility to generate a nand image and burn it, then boot from it. I noticed one anomaly though, when using mxsboot/u-boot to generate and burn the bootstream to NAND, when the linux kernel boots it finds bad blocks:
[ 1.090000] NAND device: Manufacturer ID: 0x2c, Chip ID: 0xf1 (Micron MT29F14
[ 1.100000] Scanning device for bad blocks
[ 1.110000] Bad eraseblock 0 at 0x000000000000
[ 1.110000] Bad eraseblock 1 at 0x000000020000
[ 1.120000] Bad eraseblock 2 at 0x000000040000
[ 1.120000] Bad eraseblock 3 at 0x000000060000
When I burn the exact same bootstream with kobs-ng, linux does not find any bad blocks, so it seems to be a byproduct of either the image generated by mxsboot or the u-boot burning.
I get the same problem, with an iMX28 based board and the same NAND chip (correct name MT29F1G08ABADAWP, MT29F14 is from serial terminal line wrapping).
It's something to do with the way u-boot writes to nand. If I write with nandwrite it doesn't happen, nandtest doesn't find any bad blocks, the chip is supposed to guarantee that block 0 is not bad, and those four blocks contain all copies of the FCB so if they were all bad you couldn't boot.
A bad block on that chip is marked with a non-0xff as the first OOB byte in the 1st page of a block. So, my guess is that when u-boot writes the FCB data it also writes something to the OOB data. It's not entirely clear to me if the ROM bootloader uses the OOB data in the FCB blocks or not, which pages used by the ROM are BCH encoded, and how that affects the OOB data.
You said you've booted from NAND. Did you have to program any of the OTP fuses to do this? Try as I might, the ROM bootloader refuses to accept anything I flash into NAND. Of course the error code doesn't tell you WHY it didn't like the image. One thing I found was the OCOTP fuse bits for NAND_ROW_ADDRESS_BYTES. The default is for 3 bytes of row address. Yet the MT29F1 has 2 bytes of row address. Did you program the fuse bits to change the number of row address bytes to two?
I don't think this is having any functional impact, as the scrub component of burning a new nand image wipes out the bad blocks, and once linux is booted it really has no need to read the bootstream from the bootloader mtd partition.
nandwrite didn't seem to want to program the blocks after they were marked bad. The only way to fix this seemed to be to scrub nand from u-boot. So it's a problem if you want to be able to flash the bootloader from Linux, unless there is some way to get the blocks written when they have been marked bad.

On 4/4/2013 3:09 AM, Trent Piepho wrote:
It's something to do with the way u-boot writes to nand. If I write with nandwrite it doesn't happen, nandtest doesn't find any bad
Hmm, I'm pretty sure I tested burning the u-boot generated nand image with nandwrite under Linux with exactly the same result, it seems to be inherent in the underlying data, not the burn method.
Did you use the --oob option to nandwrite? The u-boot generated image is actually written in two separate steps: the initial piece is written raw and includes oob data, the second piece is written normally and the ecc/oob is generated by the hardware. To burn it under linux, you need to split the u-boot nand image into those two pieces, and write the first with --oob, and the second normally.
A bad block on that chip is marked with a non-0xff as the first OOB byte in the 1st page of a block. So, my guess is that when u-boot writes the FCB data it also writes something to the OOB data.
Yes, as would linux if you used the --oob option to nandwrite.
You said you've booted from NAND. Did you have to program any of the OTP fuses to do this?
No. All I did was install the actual NAND chip and update the boot dip switches. Testing u-boot, I followed the script in the default environment other than updating it to load the firmware from SD rather than tftp. For testing under Linux, I used dd to split the u-boot nand image into two pieces, corresponding to the u-boot burn instructions.
nandwrite didn't seem to want to program the blocks after they were marked bad. The only way to fix this seemed to be to scrub nand from u-boot. So it's a problem if you want to be able to flash the bootloader from Linux, unless there is some way to get the blocks written when they have been marked bad.
No, from what I understand there is no way to clear bad block markers from within linux short of modifying the mtd driver.
I followed up with Otavio off list, he said he had ordered some nand chips for his board and would get back to me once he had received them and had a chance to replicate the issue.
Are you targeting burning the nand with u-boot or linux? If you are using an older kernel, the kobs-ng that comes with the mx28 BSP works fine. It does not work with newer kernels though, there is a newer version of kobs-ng that comes with a different chip BSP that I've heard will work correctly on current kernels, it is on my to do list to try it out.

On Apr 5, 2013 9:28 PM, "Paul B. Henson" henson@acm.org wrote:
On 4/4/2013 3:09 AM, Trent Piepho wrote:
It's something to do with the way u-boot writes to nand. If I write with nandwrite it doesn't happen, nandtest doesn't find any bad
Hmm, I'm pretty sure I tested burning the u-boot generated nand image with nandwrite under Linux with exactly the same result, it seems to be inherent in the underlying data, not the burn method.
Did you use the --oob option to nandwrite? The u-boot generated image is actually written in two separate steps: the initial piece is written raw and includes oob data, the second piece is written normally and the ecc/oob is generated by the hardware. To burn it under linux, you need to split the u-boot nand image into those two pieces, and write the first with --oob, and the second normally.
Did you already have the bad sectors when you burnt under Linux? I hadn't used --oob under linux, which as you've said doesn't work. I've now burnt with kobs-ng a working nand image and have no bad sectors.
I looked into this more and haven't entirely figured it out. It's definitely something to do with the raw sectors vs BCH protected sectors.
I don't think the image u-boot mxsboot generates includes any OOB data. For me, it made an image which is *exactly* 24 blocks of 128 kiB each. If the FCB blocks had OOB data then there would need to be some multiple of 64 OOB bytes in the image (16 kiB I would think). I think maybe this is the problem. The update_nand_full script calls "nand write.raw ${loadaddr} 0x0 ${fcb_sz}" and write.raw expects loadaddr to contain $fcb_sz pages of (2048 + 64) bytes each. But here is a hexdump of the u-boot image:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00000010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
00020000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00020010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
There is the first FCB block at offset 0. And the second FCB block at offset 0x20000. That's 64 * 2048 bytes, not 64 * 2112 bytes. No OOB data. The next two FCBs are at 0x40000 and 0x60000, again not where they should be if they contained the OOB data.
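The hexdump check above can also be scripted. The following is only a rough sketch: it assumes the "FCB " tag always sits 0x10 bytes into an FCB page (as in the dump above) and that the tag does not occur by chance elsewhere in the image. The function names are made up for illustration and are not part of mxsboot or kobs-ng.

```python
FCB_FINGERPRINT = b"FCB "   # ASCII tag visible in the hexdump
FINGERPRINT_OFFSET = 0x10   # assumed position of the tag inside an FCB page


def find_fcb_offsets(image: bytes) -> list[int]:
    """Return the image offsets of every page that appears to carry an FCB."""
    offsets = []
    pos = image.find(FCB_FINGERPRINT)
    while pos != -1:
        if pos >= FINGERPRINT_OFFSET:
            offsets.append(pos - FINGERPRINT_OFFSET)
        pos = image.find(FCB_FINGERPRINT, pos + 1)
    return offsets


def fcb_stride(image: bytes) -> set[int]:
    """Return the set of distances between consecutive FCB copies."""
    offs = find_fcb_offsets(image)
    return {b - a for a, b in zip(offs, offs[1:])}
```

Run against the bytes of the generated image, a stride of 0x20000 (64 * 2048) would indicate flat 2048-byte pages, while 0x21000 (64 * 2112) would indicate that OOB bytes are included.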
So I think the reason flashing with u-boot didn't work for me is that the u-boot script vs mxsboot are broken. The script expects OOB data in the first 4 blocks while mxsboot doesn't put any there. I wonder if the way mxsboot or write.raw work has changed recently and one is out of date?
Now the BCH error correction is interesting. When it's used, the nand page does not consist of 2048 bytes of data and 64 oob bytes of ecc. Instead it's something like 10 bytes metadata, 512 bytes data, 13 bytes ecc, 512 bytes data, 13 ecc, etc. The data and the ECC are intermixed! So if you write a page in BCH mode and then read it in RAW mode, or vice versa, you get completely incorrect data back.
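To make the interleaving concrete, here is a small sketch that maps a logical data offset to where it would sit in the raw page. It uses only the rough figures quoted above (10 bytes of metadata, then 512 bytes of data followed by 13 bytes of ECC, repeated). Those numbers are approximations; the real layout depends on the BCH configuration and may not be byte aligned, so treat the constants as assumptions.

```python
PAGE_SIZE = 2048   # data bytes per page
OOB_SIZE = 64      # spare bytes per page
METADATA = 10      # assumed metadata bytes at the start of the raw page
CHUNK = 512        # data bytes per ECC chunk
ECC = 13           # assumed ECC bytes after each chunk


def raw_offset_of_data(logical: int) -> int:
    """Map a logical data offset (0..2047) to its offset in the raw page."""
    if not 0 <= logical < PAGE_SIZE:
        raise ValueError("logical offset outside the page")
    chunk, within = divmod(logical, CHUNK)
    return METADATA + chunk * (CHUNK + ECC) + within


def raw_bytes_used() -> int:
    """Total raw bytes occupied by metadata, data and ECC in this layout."""
    return METADATA + (PAGE_SIZE // CHUNK) * (CHUNK + ECC)
```

Under these assumptions no logical byte sits at the same offset in the raw page, which is why a page written in one mode and read back in the other looks like garbage.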
This is why writing with nandwrite doesn't work. The ROM bootloader expects the FCB blocks, which contain the BCH parameters, to be in raw mode and apparently expects the rest to be in BCH mode. If you write with nandwrite everything is in BCH mode and thus the FCBs will be in the wrong format and not read correctly. The FCB blocks actually only use the first 1036 bytes so the OOB data doesn't matter to the bootloader, but since it's written in BCH mode everything is in the wrong location.
A bad block on that chip is marked with a non-0xff as the first OOB byte in the 1st page of a block. So, my guess is that when u-boot writes the FCB data it also writes something to the OOB data.
Yes, as would linux if you used the --oob option to nandwrite.
Does this work for you? As I see it, one would need to first generate an image with OOB data, i.e. 2112 bytes pages, for the FCB blocks and mxsboot doesn't do that. Then you would need to get this written in RAW mode. I don't think nandwrite will do this, even with the --oob option. It's still in ECC mode. The GPMI driver does not support writing OOB data (see gpmi_ecc_write_oob(), all it does is return -EPERM). It does have a RAW mode write option, which is what kobs-ng uses to write the FCB blocks.
You said you've booted from NAND. Did you have to program any of the OTP fuses to do this?
No. All I did was install the actual NAND chip and update the boot dip switches. Testing u-boot, I followed the script in the default environment other than updating it to load the firmware from SD rather than tftp. For testing under Linux, I used dd to split the u-boot nand image into two pieces, corresponding to the u-boot burn instructions.
What instructions are those? I didn't see anything in the README.mx28_common file about that. I think that could work, if you re-blocked the FCB pages with ibs=2048 obs=2112 count=256.
I think this explains why the pages become "bad". If you did this then the OOB data added by dd will be all zero. That first byte not being 0xff anymore would mark the page as bad. The bootloader doesn't care since it ignores those marks, but Linux doesn't ignore them. The solution would be for u-boot to not write into the OOB data when writing the FCBs. This is what the kobs-ng program does. It puts the nand driver into raw mode and then writes 2048 byte pages for the FCB, leaving the OOB data untouched. U-boot appears to write the OOB data (with all zeros), causing the pages to become marked as bad.
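One possible way to get a 2112-byte-per-page image that does not clobber the markers is to pad each page with 0xff instead of zero. This is not what kobs-ng does (it leaves the OOB area unwritten); it is a substitute that relies on the fact that programming NAND can only clear bits, so 0xff bytes leave erased cells as they are. Whether the ROM accepts the result is untested here, and the function name is hypothetical.

```python
def add_blank_oob(pages: bytes, page_size: int = 2048, oob_size: int = 64) -> bytes:
    """Append oob_size bytes of 0xff after every page_size bytes of data."""
    if len(pages) % page_size:
        raise ValueError("input is not a whole number of pages")
    out = bytearray()
    for start in range(0, len(pages), page_size):
        out += pages[start:start + page_size]
        out += b"\xff" * oob_size   # 0xff leaves erased OOB cells unchanged
    return bytes(out)
```

With the defaults, 256 pages of 2048 bytes become 540672 bytes, and the first OOB byte of every page stays 0xff.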
I think the u-boot method is somewhat problematic with regards to how the NAND and ROM bootloader work. It's nice to have a "smart" image generator that does everything to make a raw image that you then flash with any "dumb" flasher. That's how we used to do things with NOR. So much easier. But the hardware really doesn't lend itself to this anymore. A "smart" flasher like kobs-ng seems more in line with what the hardware calls for.
1) To create a correct image one needs to have all these details of the hardware. Page size, block size, BCH mode, ROM bootloader stride and count configuration for the FCB and for the DBBT, nand timing to put in FCB, etc. A smart flasher can just query the drivers and get all these values directly. The smart image generator needs to get told all these by someone transcribing them from the system. Some can be calculated, but then the code for that duplicates the code in the driver and must be kept up to date as hardware and drivers change. It also means the image generated by the smart image generator is very hardware specific and cannot cope with minor hardware changes. E.g., the .sb file can be flashed onto any nand chip but the .nand file can't be.
2) Some parts of the image need to be written in raw mode, some in ecc mode. The image doesn't say. So the "dumb" flasher still needs to figure that out, by duplicating calculations from block size, ROM bootloader stride and count fuse settings, etc. that the image generator also did.
3) Lots of the image is actually blank. There is a difference between not programming a NAND page and programming a NAND page to all 1s.
4) This seems like the big problem to me. The mxsboot system can't cope with bad blocks. It just creates a blank DBBT table. What you need to do is query the mtd driver and get the bad block list and then use that to construct an accurate DBBT and then not use those bad blocks. kobs-ng can do this (not sure if it actually does).
Are you targeting burning the nand with u-boot or linux? If you are using an older kernel, the kobs-ng that comes with the mx28 BSP works fine. It does not work with newer kernels though, there is a newer version of kobs-ng that comes with a different chip BSP that I've heard will work correctly on current kernels, it is on my to do list to try it out.
I'd like to burn from Linux. It's easier to create an end user firmware update system in linux than in u-boot. The kobs-ng that comes with the BSP does indeed not support the mainline kernel. Why Freescale can't update the BSP I don't know. So many easy to fix bugs. I ported that kobs-ng to the current kernel, then discovered there was the new version that already worked. This does work to flash u-boot or a kernel to nand from linux. It doesn't write the OOB data when it flashes the FCBs and so the blocks don't get marked bad.

Let me just preface this reply with the disclaimer that I'm fairly new to embedded development, and it sounds like you know a lot more about what you're talking about than I do ;).
On 4/6/2013 12:18 AM, Trent Piepho wrote:
Did you already have the bad sectors when you burnt under Linux? I hadn't used --oob under linux, which as you've said doesn't work.
I did not have any bad blocks before I tried to burn the mxsboot generated image to nand from Linux using nandwrite. You misunderstood me though, I was able to exactly replicate the outcome from burning the image with u-boot using Linux nandwrite. The board successfully booted from NAND after burning the image with nandwrite, but resulted in the exact same bad blocks.
I've now burnt with kobs-ng a working nand image and have no bad sectors.
Yes, when I burn the bootstream with kobs-ng, I also do not get any bad blocks on the nand.
I don't think the image u-boot mxsboot generates includes any OOB data. For me, it made an image which is *exactly* 24 blocks of 128 kiB each. If the FCB blocks had OOB data then there would need to be some multiple of 64 OOB bytes in the image (16 kiB I would think). I think maybe this is the problem. The update_nand_full script calls "nand write.raw ${loadaddr} 0x0 ${fcb_sz}" and write.raw expects loadaddr to contain $fcb_sz pages of (2048 + 64) bytes each.
I'm not sure what you mean. According to u-boot:
Device 0: nand0, sector size 128 KiB
  Page size 2048 b
  OOB size 64 b
The page size is 2048 bytes, with 64 bytes of oob data, for a total of 2112 bytes.
When I burn the first part of the image with u-boot:
MX28EVK U-Boot > nand write.raw ${loadaddr} 0x0 ${fcb_sz}
NAND write: 540672 bytes written: OK
It writes 540672 bytes, which is evenly divisible by 2112 (256).
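As a quick sanity check of those numbers, here is a short sketch. It assumes 64 pages per 128 KiB block (128 KiB / 2048 bytes) and four FCB blocks; the function names are made up for illustration.

```python
def raw_write_size(blocks: int, pages_per_block: int = 64,
                   page_size: int = 2048, oob_size: int = 64) -> int:
    """Bytes consumed by a raw write that includes the OOB of every page."""
    return blocks * pages_per_block * (page_size + oob_size)


def flat_size(blocks: int, pages_per_block: int = 64,
              page_size: int = 2048) -> int:
    """Bytes the same blocks occupy in an image without OOB data."""
    return blocks * pages_per_block * page_size
```

For four blocks this gives 540672 bytes with OOB against 524288 (0x80000) without, a difference of 16384 bytes, which matches the "16 kiB" of OOB mentioned earlier in the thread.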
If you look at the mxsboot source code:
for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
	offset = i * nand_writesize;
	memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
}
It appears to be writing the FCB including oob data.
I wonder if the way mxsboot or write.raw work has changed recently and one is out of date?
I used the latest git head of the ARM branch when I was testing a few weeks ago.
This is why writing with nandwrite doesn't work. The ROM bootloader expects the FCB blocks, which contain the BCH parameters, to be in raw mode and apparently expects the rest to be in BCH mode.
Unless I am misunderstanding the u-boot instructions, the FCB blocks are written in raw mode:
MX28EVK U-Boot > nand write.raw ${loadaddr} 0x0 ${fcb_sz}
NAND write: 540672 bytes written: OK
and the rest is not:
MX28EVK U-Boot > setexpr update_off ${loadaddr} + ${update_nand_fcb}
MX28EVK U-Boot > setexpr update_sz ${filesize} - ${update_nand_fcb}
MX28EVK U-Boot > nand write ${update_off} ${update_nand_fcb} ${update_sz}
NAND write: device 0 offset 0x80000, size 0xa80000
 11010048 bytes written: OK
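The setexpr arithmetic behind that second write can be reproduced as follows. This is only a sketch: the load address 0x42000000 used in the usage note is an assumed example value, and the file size 0xb00000 is inferred from the log above (0x80000 + 0xa80000), not printed by u-boot.

```python
def tail_write_args(loadaddr: int, filesize: int, fcb_region: int = 0x80000):
    """Return (source address, NAND offset, length) for the non-raw write."""
    update_off = loadaddr + fcb_region   # setexpr update_off
    update_sz = filesize - fcb_region    # setexpr update_sz
    return update_off, fcb_region, update_sz
```

For example, tail_write_args(0x42000000, 0xb00000) yields a length of 0xa80000, i.e. the 11010048 bytes reported in the log.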
rather than tftp. For testing under Linux, I used dd to split the u-boot nand image into two pieces, corresponding to the u-boot burn instructions.
What instructions are those? I didn't see anything in the README.mx28_common file about that. I think that could work, if you re-blocked the FCB pages with ibs=2048 obs=2112 count=256.
There are no explicit instructions for nandwrite in u-boot, I simply split the mxsboot NAND image into two pieces to match the pieces that u-boot wrote:
dd if=test.nand bs=2112 count=256 of=test-head.nand
dd if=test.nand bs=1 skip=524288 of=test-tail.nand
And then wrote the first piece with nandwrite --oob at offset 0 and the second with regular nandwrite at offset 0x80000. This worked exactly the same as using u-boot, including the four blocks being marked as bad by Linux.
U-boot appears to write the OOB data (with all zeros), causing the pages to become marked as bad.
If you look at the mxsboot source code, they appear to be trying to calculate the ecc and generate the oob data, but maybe they are doing it wrong?
A "smart" flasher like kobs-ng seems more in line with what the hardware calls for.
Yes, but unfortunately I'm not sure that's something that could be implemented within a running u-boot?
The smart image generator needs to get told all these by someone transcribing them from the system.
The current mxsboot is not that smart :), for the most part it has values hardcoded and you would need to recompile it if you wanted to change them. But it is just the first iteration while they are trying to get something working.
cope with bad blocks. It just creates a blank DBBT table. What you need to do is query the mtd driver and get the bad block list and then use that to construct an accurate DBBT and then not use those bad blocks. kobs-ng can do this (not sure if it actually does).
Does the ROM IPL pay any attention to bad blocks? I don't know exactly how it works, but from what I've heard it doesn't sound like it deals with bad blocks very well if at all.
I'd like to burn from Linux. It's easier to create an end user firmware update system in linux than in u-boot.
Yes, agreed. We've been evaluating our bootloader options for the project I'm working on. Initially we were going to have u-boot be the bootloader and directly load our production kernel. However, as you say, it is somewhat difficult to create a flexible update/recovery system within u-boot. Next we looked into using the freescale bootlets to directly load a bootloader/recovery linux kernel with a bundled initramfs that would be used for the update/recovery process, and then have it use kexec to load the production kernel. This worked out pretty well.
What I think we're going to go with is actually a hybrid of both u-boot and a stripped-down linux/initramfs bootloader/recovery kernel. The bootstream actually contains two copies of whatever object is going to be loaded (I'm guessing, but I assume that is because the ROM IPL doesn't handle bad blocks, so if the first copy can't be read it will just try the second). Our recovery kernel/initramfs is probably going to be about 5M, so it would take about 10M to be loaded directly. u-boot is less than 1M, so by including both, it actually uses *less* space (2*1+5=7, vs 2*5=10). In addition, the recommendation seems to be to have the IPL load the minimal amount possible, so with this implementation the IPL loads a tiny u-boot, which again is a lot smarter and more reliable about loading the larger recovery kernel, which can then either perform recovery options or kexec the production kernel.
Why Freescale can't update the BSP I don't know.
Yes, it does seem in the embedded space chip manufacturers release something and then let it stagnate :(. I generally prefer to run more up to date stuff.
I ported that kobs-ng to the current kernel, then discovered there was the new version that already worked.
I worked on porting it, and got to the point where it wanted to read a sysfs node to determine the NAND geometry which no longer existed. I was inquiring on the linux mtd mailing list about the possibility of getting that information back, when I was directed to a newer version of kobs-ng that comes in a different chip's BSP that's supposed to work with a current kernel.
This does work to flash u-boot or a kernel to nand from linux. It doesn't write the OOB data when it flashes the FCBs and so the blocks don't get marked bad.
I'm glad to hear the newer kobs-ng works with a current kernel, I had not yet had a chance to try it out. It would be nice if Freescale just had a link to the latest kobs-ng, rather than trying to figure out which chip BSP has the newest version, but I guess that's not the way they are bundling things or want people to work.
At this point, while I think it would be nice in general for the u-boot mxsboot utility to be fixed and work correctly, I don't think either of us is going to need that? If the latest kobs-ng worked correctly under a current kernel, we can just build the u-boot.sb file and use kobs-ng under Linux to burn it to NAND.

I don't think the image u-boot mxsboot generates includes any OOB data. For me, it made an image which is *exactly* 24 blocks of 128 kiB each. If the FCB blocks had OOB data then there would need to be some multiple of 64 OOB bytes in the image (16 kiB I would think). I think maybe this is the problem. The update_nand_full script calls "nand write.raw ${loadaddr} 0x0 ${fcb_sz}" and write.raw expects loadaddr to contain $fcb_sz pages of (2048 + 64) bytes each.
I'm not sure what you mean. According to u-boot:
Device 0: nand0, sector size 128 KiB
  Page size 2048 b
  OOB size 64 b
The page size is 2048 bytes, with 64 bytes of oob data, for a total of 2112 bytes.
When I burn the first part of the image with u-boot:
MX28EVK U-Boot > nand write.raw ${loadaddr} 0x0 ${fcb_sz}
NAND write: 540672 bytes written: OK
I'm talking about the image file as generated by mxsboot. If I hex dump that, it's clearly written entirely with 2048 byte pages. If you hexdump your image are the FCB blocks exactly 128k apart? Or are they 64 * 2112 = 132k apart? It should be the latter, as 132k * 4 = 540672 bytes.
If you look at the mxsboot source code:
for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
	offset = i * nand_writesize;
	memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
}
It appears to be writing the FCB including oob data.
Looks wrong to me! Notice that offset is equal to i * nand_writesize, not i * (nand_writesize + nand_oobsize). I think this only produces a bootable image because:
1) The FCB data is only 1036 bytes in size. The remaining 1012 bytes of data and 64 oob bytes in the page aren't used. And the 63 pages after the first aren't used either. So they can be full of garbage and it doesn't matter. The image mxsboot creates is ok for the first 1036 bytes. Everything after that is wrong, but it doesn't matter.
2) There are four copies of the FCB blocks. The ROM bootloader looks for the first valid one and uses it. The first one is ok in the mxsboot image. All the rest are corrupted since they are written in the wrong location. But since the first one was ok the bootloader never even looks at the bad ones. If that NAND page goes bad, though, the whole point of having redundant copies will be defeated.
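To see where the copies end up, here is a sketch of the offset arithmetic. It assumes STRIDE_PAGES is 64 (consistent with the copies sitting 0x20000 apart in the hexdump) and that nand write.raw consumes the buffer as consecutive 2112-byte page-plus-OOB units, as described earlier. The function name is invented for illustration.

```python
PAGE, OOB, PAGES_PER_BLOCK = 2048, 64, 64
RAW_PAGE = PAGE + OOB   # 2112 bytes per page in a raw write


def landing(copy_index: int, stride_pages: int = PAGES_PER_BLOCK):
    """Return (block, page in block, byte in raw page) where FCB copy n lands."""
    # Offset as computed by the quoted loop: i * nand_writesize.
    image_offset = copy_index * stride_pages * PAGE
    raw_page, byte_in_page = divmod(image_offset, RAW_PAGE)
    block, page_in_block = divmod(raw_page, PAGES_PER_BLOCK)
    return block, page_in_block, byte_in_page
```

Under those assumptions only copy 0 lands at the start of a block. Copy 1 lands 128 bytes into page 62 of block 0 rather than at page 0 of block 1, and copies 2 and 3 drift further.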
Now, look at mx28_nand_fcb_block(), which generates the FCB block. It calls memset() to fill the entire 2112 bytes with ZERO. The mx28_nand_fcb struct is 512 bytes, so copying the struct into the buffer at offset 12 and then writing the FCB ECC at offset 512+12 only touches the first 1036 bytes. The remaining bytes, including the OOB, will all be zero. And a ZERO byte in the first OOB byte marks the NAND block as bad. So that's why burning the mxsboot generated image with nand write.raw makes the blocks bad. Using kobs-ng doesn't write the OOB data or erase any bad block markers, which is better.
I guess this is not just a bug in mxsboot, but also a deficiency in u-boot's nand support. It allows one to write 2048 bytes in ECC mode or 2112 bytes in raw mode. What one should actually do to flash these blocks is write 2048 bytes in raw mode.
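The effect can be modelled in a few lines. This is a sketch of the buffer layout as described above, not the actual mx28_nand_fcb_block() code; the FCB and ECC contents are dummy placeholders and the helper names are hypothetical.

```python
PAGE, OOB = 2048, 64
FCB_OFFSET = 12
FCB_SIZE = 512
ECC_OFFSET = FCB_OFFSET + FCB_SIZE   # 524
ECC_SIZE = 512


def build_fcb_page(fcb: bytes, ecc: bytes, fill: int = 0x00) -> bytearray:
    """Build a 2112-byte page+OOB buffer, pre-filled with `fill`."""
    if len(fcb) != FCB_SIZE or len(ecc) != ECC_SIZE:
        raise ValueError("fcb and ecc must be 512 bytes each")
    buf = bytearray([fill]) * (PAGE + OOB)
    buf[FCB_OFFSET:FCB_OFFSET + FCB_SIZE] = fcb
    buf[ECC_OFFSET:ECC_OFFSET + ECC_SIZE] = ecc
    return buf


def marks_block_bad(raw_page: bytes) -> bool:
    """True if the first OOB byte is not 0xff (the bad block marker)."""
    return raw_page[PAGE] != 0xFF
```

With the zero fill, marks_block_bad() is true for the generated page. Passing fill=0xFF instead leaves the marker byte at 0xff, although whether the ROM is happy with 0xff padding after byte 1035 is an untested assumption.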
This is why writing with nandwrite doesn't work. The ROM bootloader expects the FCB blocks, which contain the BCH parameters, to be in raw mode and apparently expects the rest to be in BCH mode.
Unless I am misunderstanding the u-boot instructions, the FCB blocks are written in raw mode:
It's not the flashing that is in the wrong mode, but the image mxsboot generates that is wrong.
There are no explicit instructions for nandwrite in u-boot, I simply split the mxsboot NAND image into two pieces to match the pieces that u-boot wrote:
dd if=test.nand bs=2112 count=256 of=test-head.nand
dd if=test.nand bs=1 skip=524288 of=test-tail.nand
And then wrote the first piece with nandwrite --oob at offset 0 and the second with regular nandwrite at offset 0x80000. This worked exactly the same as using u-boot, including the four blocks being marked as bad by Linux.
If the four blocks were already marked as bad, then nandwrite will not write them. So maybe you only have a working image because it was already working and wasn't modified? Can you erase flash in u-boot, verify that nand does not boot, and then make a working nand using just nandwrite --oob? I think you will also need to use the option --noecc to write in raw mode.
U-boot appears to write the OOB data (with all zeros), causing the pages to become marked as bad.
If you look at the mxsboot source code, they appear to be trying to calculate the ecc and generate the oob data, but maybe they are doing it wrong?
Look closer, they don't. The ECC they generate is just the 512 bytes of ecc data for the 512 bytes of FCB data. This is a special ecc just used for the FCBs. Nothing will actually write past the 1036th byte of the block and so it will still be all zero past that including the oob data.
A "smart" flasher like kobs-ng seems more in line with what the hardware calls for.
Yes, but unfortunately I'm not sure that's something that could be implemented within a running u-boot?
It would be harder. I think you'd need to write a u-boot app to do it.
cope with bad blocks. It just creates a blank DBBT table. What you need to do is query the mtd driver and get the bad block list and then use that to construct an accurate DBBT and then not use those bad blocks. kobs-ng can do this (not sure if it actually does).
Does the ROM IPL pay any attention to bad blocks? I don't know exactly how it works, but from what I've heard it doesn't sound like it deals with bad blocks very well if at all.
It is supposed to use something called the DBBT to skip bad blocks. mxsboot doesn't generate a real DBBT so bad blocks probably aren't handled.
I'd like to burn from Linux. It's easier to create an end user firmware update system in linux than in u-boot.
Yes, agreed. We've been evaluating our bootloader options for the project I'm working on. Initially we were going to have u-boot be the bootloader and directly load our production kernel. However, as you say, it is somewhat difficult to create a flexible update/recovery system within u-boot. Next we looked into using the freescale bootlets to directly load a bootloader/recovery linux kernel with a bundled initramfs that would be used for the update/recovery process, and then have it use kexec to load the production kernel. This worked out pretty well.
What I think we're going to go with is actually a hybrid of both u-boot and a stripped-down linux/initramfs bootloader/recovery kernel. The bootstream actually contains two copies of whatever object is going to be loaded (I'm guessing, but I assume that is because the ROM IPL doesn't handle bad blocks, so if the first copy can't be read it will just try the second). Our recovery kernel/initramfs is probably going to be about 5M, so it would take about 10M to be loaded directly. u-boot is less than 1M, so by including both, it actually uses *less* space (2*1+5=7, vs 2*5=10). In addition, the recommendation seems to be to have the IPL load the minimal amount possible, so with this implementation the IPL loads a tiny u-boot, which again is a lot smarter and more reliable about loading the larger recovery kernel, which can then either perform recovery options or kexec the production kernel.
I've done that before. The u-boot env was written from Linux to tell u-boot which kernel to boot, the firmware update kernel and rootfs or the main kernel and rootfs.
Another way is to use initramfs for your main filesystem. If you have no filesystems mounted from flash, then you can just flash without rebooting. You need a small filesystem for this of course.

On 4/11/2013 5:03 AM, Trent Piepho wrote:
I'm talking about the image file as generated by mxsimage. If I hex dump that, it's clearly written entirely with 2048 byte pages. If you hexdump your image are the FCB blocks exactly 128k apart?
Hmm, I don't have one in front of me to conveniently look at, but as I recall when I was working with it the FCB blocks did indeed appear to be evenly spaced at locations divisible by 1k.
Looks wrong to me! Notice that offset is equal to i * nand_writesize, not i * (nand_writesize + nand_oobsize).
Ah, good eye. They are writing the correct amount of data, but in the wrong places.
All the rest are corrupted since they are written in the wrong location. But since the first one was ok the bootloader never even looks at the bad ones. Unless the NAND page goes bad, then the whole point of having redundant copies will be defeated.
That sounds like a correct conclusion.
What one should actually do to flash these blocks is write 2048 bytes in raw mode.
I guess that would only work if whatever reading the blocks also read them in raw mode, as otherwise the lack of ECC in the OOB area would fail the read?
If the four blocks were already marked as bad, then nandwrite will not write them. So maybe you only have a working image because it was already working and wasn't modified? Can you erase flash in u-boot, verify that nand does not boot, and then make a working nand using just nandwrite --oob? I think you will also need to use the option --noecc to write in raw mode.
I did actually erase the NAND before testing the burn in Linux, so I can confirm it does actually work – the first time. After the first burn, the next time Linux is booted, it detects the blocks as bad, and will not overwrite them, even in raw mode. I unfortunately did not make good notes, and don't recall the specific flags I used with nandwrite during the test.
I've done that before. The u-boot env was written from Linux to tell u-boot which kernel to boot, the firmware update kernel and rootfs or the main kernel and rootfs.
I think we're going to always have u-boot boot the recovery kernel and have that bootstrap the production kernel. We plan to have a physical reset button on the device, which if held down when powered on will reset the device to factory defaults. The recovery kernel will check if that button is pressed when it loads and rewrite the production area of the flash if so from a recovery partition, otherwise just load the production kernel.
Hopefully Otavio is watching this thread and can address the issues you found with mxsboot.
Thanks much…

On Thu, Apr 11, 2013 at 11:33 AM, Paul B. Henson henson@acm.org wrote:
On 4/11/2013 5:03 AM, Trent Piepho wrote:
What one should actually do to flash these blocks is write 2048 bytes in raw mode.
I guess that would only work if whatever reading the blocks also read them in raw mode, as otherwise the lack of ECC in the OOB area would fail the read?
See my second message in the thread. The FCBs are in raw mode, all the rest are in BCH/ECC mode. The FCBs have the BCH parameters, so the IPL can't switch to ECC mode until it gets the parameters from them. Also, the BCH/ECC data is NOT in the OOB area. In BCH/ECC mode, the data and the ECC bytes are intermixed throughout the full 2112 byte page. So one MUST write the FCBs in raw mode and everything else in ECC mode for it to work, as the IPL will read the data in this manner. That's why the image from mxsboot needs to be split into two. Maybe it would make more sense for mxsboot to write two files? One with the FCBs and one with everything else?
The FCBs are only 1036 bytes long. The OOB isn't used by the FCB. So when writing the FCBs, the OOB should not be written and whatever bad/good flag is in there should be left alone. But u-boot flashes the entire block with zeros (the first 2112-byte page and also the 63 unused pages after it). So the OOB is also zeroed out, and that marks the block as bad.
I've done that before. The u-boot env was written from Linux to tell u-boot which kernel to boot, the firmware update kernel and rootfs or the main kernel and rootfs.
I think we're going to always have u-boot boot the recovery kernel and have that bootstrap the production kernel. We plan to have a physical reset button on the device, which if held down when powered on will reset the device to factory defaults. The recovery kernel will check if that button is pressed when it loads and rewrite the production area of the flash if so from a recovery partition, otherwise just load the production kernel.
You'll boot slower then, as you're basically booting twice. Maybe that doesn't matter for you. I'm usually trying to get booted and have system startup done in <500 ms and two boots make that a lot harder.

On 4/11/2013 4:25 PM, Trent Piepho wrote:
Maybe it would make more sense for mxsboot to write two files? One with the FCBs and one with everything else?
Hmm, possibly; I guess that would be conceptually simpler but require more commands to execute to get done.
The FCBs are only 1036 bytes long. The OOB isn't used by the FCB. So when writing the FCBs, the OOB should not be written and whatever bad/good flag is in there should be left alone. But u-boot flashes the entire block with zeros (the first 2112-byte page and also the 63 unused pages after it). So the OOB is also zeroed out, and that marks the block as bad.
I'm not that familiar with the intricacies of NAND, it sounds like you're saying each FCB should be written separately rather than in one fell swoop as it does currently?
There haven't been any responses or follow-ups to this thread, so I guess they either think it's working fine as is or aren't interested/don't have the time to follow up on the issue. I'm not accusing anything of being broken, just explaining what I'm seeing and offering to help :)...
I think we're going to always have u-boot boot the recovery kernel and have that bootstrap the production kernel. We plan to have a physical reset
You'll boot slower then, as you're basically booting twice. Maybe that doesn't matter for you.
Boot time doesn't matter too much for our application, it shouldn't boot very often and if it does a couple extra seconds won't be a problem.
What is your recovery plan in the case of the production kernel/file system becoming corrupt and unbootable? u-boot, per the environment variable, will try to load the production kernel, which then can't boot far enough to reset the environment variable to load the recovery kernel?

On Fri, Apr 19, 2013 at 6:03 PM, Paul B. Henson henson@acm.org wrote:
On 4/11/2013 4:25 PM, Trent Piepho wrote:
Maybe it would make more sense for mxsboot to write two files? One with the FCBs and one with everything else?
Hmm, possibly; I guess that would be conceptually simpler but require more commands to execute to get done.
Don't see why. If mxsboot wrote both files at once, there'd be the same number of commands to generate them. When flashing with nandwrite, the commands to split the file would no longer be necessary. U-boot would have to load and flash two files, but it could avoid having to calculate where to split the file like it does now.
The FCBs are only 1036 bytes long. The OOB isn't used by the FCB. So when writing the FCBs, the OOB should not be written and whatever bad/good flag is in there should be left alone. But u-boot flashes the entire block with zeros (the first 2112-byte page and also the 63 unused pages after it). So the OOB is also zeroed out, and that marks the block as bad.
I'm not that familiar with the intricacies of NAND, it sounds like you're saying each FCB should be written separately rather than in one fell swoop as it does currently?
Yes. Or at least it should not write the OOB. Basically you have a 136 KB NAND block, including data+OOB. The first 1036 bytes are FCB data. Bytes 2048 to 2112 contain a bad block marker. The remaining ~134 KB after byte 2112 are entirely unused. kobs-ng writes only the first 2048 bytes, and thus does not write bytes 2048-2112, which contain the marker. u-boot writes all 136 KB, including the marker. There are 4 FCB blocks like this. While U-boot gets the first one correct (other than the bad block marker), it doesn't write the 3 after it correctly.
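A minimal sketch of why a zero-filled raw page marks the block bad, assuming 2048+64 byte pages and that the marker is the first OOB byte (any value other than 0xFF meaning "bad"). The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_DATA_SIZE 2048u
#define PAGE_OOB_SIZE  64u
#define PAGE_RAW_SIZE  (PAGE_DATA_SIZE + PAGE_OOB_SIZE)

/*
 * Assumption: the bad block marker is the first OOB byte, and anything
 * other than 0xFF there is treated as "bad".
 */
static int raw_page_marks_block_bad(const uint8_t *raw_page)
{
    assert(raw_page != NULL);
    return raw_page[PAGE_DATA_SIZE] != 0xFF;
}

/* Fill a raw page buffer the way an erased page reads back (all 0xFF). */
static void raw_page_init_erased(uint8_t *raw_page)
{
    assert(raw_page != NULL);
    memset(raw_page, 0xFF, PAGE_RAW_SIZE);
}
```

Writing only the first 2048 bytes leaves byte 2048 at its erased value, while writing all 2112 bytes from a zeroed buffer clears it.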
I think we're going to always have u-boot boot the recovery kernel and have that bootstrap the production kernel. We plan to have a physical reset
You'll boot slower then, as you're basically booting twice. Maybe that doesn't matter for you.
Boot time doesn't matter too much for our application, it shouldn't boot very often and if it does a couple extra seconds won't be a problem.
What is your recovery plan in the case of the production kernel/file system becoming corrupt and unbootable? u-boot, per the environment variable, will try to load the production kernel, which then can't boot far enough to reset the environment variable to load the recovery kernel?
The production rootfs and kernel are read-only, so shouldn't become corrupted on SLC nand. So there isn't anything to detect that and switch to a backup. If something does happen, then recovery would be by microSD card boot/reflash. The ROM bootloader supposedly supports two firmware images to boot from. That's one of the reasons why the output of mxsboot is so big, as it contains two images. It's not clear to me if the bootloader supports switching between images in any useful way.

On 4/19/2013 6:22 PM, Trent Piepho wrote:
Hmm, possibly; I guess that would be conceptually simpler but require more commands to execute to get done.
Don't see why. If mxsboot wrote both files at once, there'd be the same number of commands to generate them.
Well, you'd have to copy two files instead of one to say SD, and run the mmcload command twice instead of once, but now I'm just being pedantic :).
While U-boot gets the first one correct (other than the bad block marker), it doesn't write the 3 after it correctly.
So if the first one were ever corrupted, the boot would fail. It seems like that would be worth fixing.
supports two firmware images to boot from. That's one of the reasons why the output of mxsboot is so big, as it contains two images. It's not clear to me if the bootloader supports switching between images in any useful way.
From the limited understanding I have of it, no, the second image is only loaded if the first one fails.

Dear Paul B. Henson,
On 4/19/2013 6:22 PM, Trent Piepho wrote:
Hmm, possibly; I guess that would be conceptually simpler but require more commands to execute to get done.
Don't see why. If mxsboot wrote both files at once, there'd be the same number of commands to generate them.
Well, you'd have to copy two files instead of one to say SD, and run the mmcload command twice instead of once, but now I'm just being pedantic :).
While U-boot gets the first one correct (other than the bad block marker), it doesn't write the 3 after it correctly.
So if the first one were ever corrupted, the boot would fail. It seems like that would be worth fixing.
supports two firmware images to boot from. That's one of the reasons why the output of mxsboot is so big, as it contains two images. It's not clear to me if the bootloader supports switching between images in any useful way.
From the limited understanding I have of it, no, the second image is only loaded if the first one fails.
I didn't really track the thread and I'm plenty busy, besides I had quite a clash with Trent in another thread, sorry about me being plenty unpleasant. Anyway, can you please sum what is going on and what you came up with?
Please also always CC Fabio, he is of great help.
Best regards, Marek Vasut

On 4/25/2013 6:13 PM, Marek Vasut wrote:
I didn't really track the thread and I'm plenty busy, besides I had quite a clash with Trent in another thread, sorry about me being plenty unpleasant. Anyway, can you please sum what is going on and what you came up with?
Most of the analysis came from Trent, but I can try to summarize the findings.
One problem is that the current mxsboot misaligns the FCB's:
    for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
            offset = i * nand_writesize;
            memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
    }
The code writes out nand_writesize+nand_oobsize bytes, but updates the offset only by nand_writesize, so every FCB but the first one isn't in the right place:
hexdump of the u-boot image:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00000010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
00020000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00020010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
The first FCB block is at offset 0. The second FCB block is at offset 0x20000, 64 * 2048 bytes, not 64 * 2112 bytes, no OOB data. The next two FCBs are at 0x40000 and 0x60000, again not where they should be if they contained the OOB data.
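The offsets can be checked with a few lines of arithmetic. This sketch assumes 64 pages per stride and four FCB copies, matching the numbers quoted in this thread; the helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define STRIDE_PAGES 64u   /* pages between FCB copies (assumed: one block) */
#define STRIDE_COUNT 4u    /* number of FCB copies */

/*
 * Byte offset of FCB copy n in the image when every page in the image
 * occupies page_bytes (2048 without OOB, 2112 with OOB).
 */
static size_t fcb_copy_offset(unsigned n, size_t page_bytes)
{
    assert(n < STRIDE_COUNT);
    return (size_t)n * STRIDE_PAGES * page_bytes;
}
```

With a 2048-byte stride the second copy lands at 0x20000, as seen in the hexdump. If the image carried OOB for every page, the second copy would instead sit at 64 * 2112 = 0x21000.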
Another problem is that the OOB section gets zeroed out.
If you look at mx28_nand_fcb_block(), which generates the FCB block, it calls memset() to fill the entire 2112 bytes with zero. The mx28_nand_fcb struct is 512 bytes, so copying the struct into the buffer at offset 12 and then writing the FCB ECC at offset 512+12 only fills the first 1036 bytes. The remaining bytes, including the OOB, are all zero. A zero in the first OOB byte marks the NAND block as bad, so burning the mxsboot-generated image with nand write.raw marks the blocks bad because it fills the OOB section with zeros.
It seems possibly either the FCB's should each be written separately, not overwriting the OOB area, or the image containing them needs to be aligned correctly and have proper OOB data?
The TL;DR summary is simply that mxsboot generates the image with misaligned FCB's and invalid OOB data.
While we're on the subject of mx28evk, I posted a couple simple questions to the list that I didn't see responses to; perhaps one of you guys knows the answers off the top of your head?
First, I was wondering why the mx28evk board config doesn't define CONFIG_FIT? It seemed like that was the new preferred image format as opposed to the legacy image. When I added it, it seemed to work fine, so I wasn't sure why it's not there by default.
Second, the config defines a load address for the kernel and device tree, but none for a ramdisk image. Is there any particular address that would be best for that that could perhaps be added to the default environment?
Thanks…

Dear Paul B. Henson,
On 4/25/2013 6:13 PM, Marek Vasut wrote:
I didn't really track the thread and I'm plenty busy, besides I had quite a clash with Trent in another thread, sorry about me being plenty unpleasant. Anyway, can you please sum what is going on and what you came up with?
Most of the analysis came from Trent, but I can try to summarize the findings.
One problem is that the current mxsboot misaligns the FCB's:
    for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
            offset = i * nand_writesize;
            memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
    }
The code writes out nand_writesize+nand_oobsize bytes, but updates the offset only by nand_writesize, so every FCB but the first one isn't in the right place:
hexdump of the u-boot image:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00000010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
00020000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00020010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
The first FCB block is at offset 0. The second FCB block is at offset 0x20000, 64 * 2048 bytes, not 64 * 2112 bytes, no OOB data. The next two FCBs are at 0x40000 and 0x60000, again not where they should be if they contained the OOB data.
Another problem is that the OOB section gets zeroed out.
If you look at mx28_nand_fcb_block(), which generates the FCB block, it calls memset() to fill the entire 2112 bytes with zero. The mx28_nand_fcb struct is 512 bytes, so copying the struct into the buffer at offset 12 and then writing the FCB ECC at offset 512+12 only fills the first 1036 bytes. The remaining bytes, including the OOB, are all zero. A zero in the first OOB byte marks the NAND block as bad, so burning the mxsboot-generated image with nand write.raw marks the blocks bad because it fills the OOB section with zeros.
It seems possibly either the FCB's should each be written separately, not overwriting the OOB area, or the image containing them needs to be aligned correctly and have proper OOB data?
I'll take one more stab at reading this tomorrow.
The TL;DR summary is simply that mxsboot generates the image with misaligned FCB's and invalid OOB data.
While we're on the subject of mx28evk, I posted a couple simple questions to the list that I didn't see responses to; perhaps one of you guys knows the answers off the top of your head?
CC me and Fabio, then you have good chance of having them answered.
First, I was wondering why the mx28evk board config doesn't define CONFIG_FIT? It seemed like that was the new preferred image format as opposed to the legacy image. When I added it, it seemed to work fine, so I wasn't sure why it's not there by default.
It's just disabled as we use uImage on those boards. Sure, you can enable FIT image and yes, it's the new preferred format.
Second, the config defines a load address for the kernel and device tree, but none for a ramdisk image. Is there any particular address that would be best for that that could perhaps be added to the default environment?
I don't know many people who still use ramdisk, but any address above kernel works.
Best regards, Marek Vasut

Dear Paul B. Henson,
On 4/25/2013 6:13 PM, Marek Vasut wrote:
I didn't really track the thread and I'm plenty busy, besides I had quite a clash with Trent in another thread, sorry about me being plenty unpleasant. Anyway, can you please sum what is going on and what you came up with?
Most of the analysis came from Trent, but I can try to summarize the findings.
One problem is that the current mxsboot misaligns the FCB's:
    for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
            offset = i * nand_writesize;
            memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
    }
The code writes out nand_writesize+nand_oobsize bytes, but updates the offset only by nand_writesize, so every FCB but the first one isn't in the right place:
hexdump of the u-boot image:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00000010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
00020000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00020010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
The first FCB block is at offset 0. The second FCB block is at offset 0x20000, 64 * 2048 bytes, not 64 * 2112 bytes, no OOB data. The next two FCBs are at 0x40000 and 0x60000, again not where they should be if they contained the OOB data.
Another problem is that the OOB section gets zeroed out.
Ok, I see the problem, but I don't see an easy solution. For some reason, the BCH doesn't compute the same ECC as mx28_nand_parity_13_8() when writing regular data, do you know why?
Best regards, Marek Vasut

On Fri, May 3, 2013 at 5:08 PM, Marek Vasut marex@denx.de wrote:
On 4/25/2013 6:13 PM, Marek Vasut wrote:
I didn't really track the thread and I'm plenty busy, besides I had quite a clash with Trent in another thread, sorry about me being plenty unpleasant. Anyway, can you please sum what is going on and what you came up with?
Most of the analysis came from Trent, but I can try to summarize the findings.
One problem is that the current mxsboot misaligns the FCB's:
    for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
            offset = i * nand_writesize;
            memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
    }
The code writes out nand_writesize+nand_oobsize bytes, but updates the offset only by nand_writesize, so every FCB but the first one isn't in the right place:
hexdump of the u-boot image:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00000010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
00020000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00020010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
The first FCB block is at offset 0. The second FCB block is at offset 0x20000, 64 * 2048 bytes, not 64 * 2112 bytes, no OOB data. The next two FCBs are at 0x40000 and 0x60000, again not where they should be if they contained the OOB data.
Another problem is that the OOB section gets zeroed out.
Ok, I see the problem, but I don't see an easy solution. For some reason, the BCH doesn't compute the same ECC as mx28_nand_parity_13_8() when writing regular data, do you know why?
Completely different algorithms. The BCH ECC is computed on a 512 byte block at a time using a vastly more complex algorithm. The 13_8 parity is only used by the ROM bootloader code for checking the FCB blocks (and maybe some other boot blocks too? Not sure about that off the top of my head). It produces one byte of parity (only 6 bits used, I think) for each byte of data. This is because the FCB blocks are not in BCH format, as those blocks contain the BCH parameters that are necessary to decode BCH encoded blocks. Thus the FCB blocks are raw so the BCH parameters can be read, then the rest of the blocks can be in BCH mode.
I think what needs to be done is for u-boot to have a nand write.raw that writes a block in non-BCH mode WITHOUT writing the OOB data. Since the OOB data in the FCB blocks should not actually be written, it's not necessarily a problem that mxsboot fails to include it in the image. The problem really is that u-boot nand write expects it and the image does not have it.

Dear Trent Piepho,
On Fri, May 3, 2013 at 5:08 PM, Marek Vasut marex@denx.de wrote:
On 4/25/2013 6:13 PM, Marek Vasut wrote:
I didn't really track the thread and I'm plenty busy, besides I had quite a clash with Trent in another thread, sorry about me being plenty unpleasant. Anyway, can you please sum what is going on and what you came up with?
Most of the analysis came from Trent, but I can try to summarize the findings.
One problem is that the current mxsboot misaligns the FCB's:
    for (i = 0; i < STRIDE_PAGES * STRIDE_COUNT; i += STRIDE_PAGES) {
            offset = i * nand_writesize;
            memcpy(buf + offset, fcbblock, nand_writesize + nand_oobsize);
    }
The code writes out nand_writesize+nand_oobsize bytes, but updates the offset only by nand_writesize, so every FCB but the first one isn't in the right place:
hexdump of the u-boot image:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00000010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
00020000 00 00 00 00 00 00 00 00 00 00 00 00 d6 fc ff ff |................|
00020010 46 43 42 20 00 00 00 01 50 3c 19 06 00 00 00 00 |FCB ....P<......|
The first FCB block is at offset 0. The second FCB block is at offset 0x20000, 64 * 2048 bytes, not 64 * 2112 bytes, no OOB data. The next two FCBs are at 0x40000 and 0x60000, again not where they should be if they contained the OOB data.
Another problem is that the OOB section gets zeroed out.
Ok, I see the problem, but I don't see an easy solution. For some reason, the BCH doesn't compute the same ECC as mx28_nand_parity_13_8() when writing regular data, do you know why?
Completely different algorithms. The BCH ECC is computed on a 512 byte block at a time using a vastly more complex algorithm. The 13_8 parity is only used by the ROM bootloader code for checking the FCB blocks (and maybe some other boot blocks too?
It's only the first block, I checked the bootrom source.
Not sure about that off the top of my head). It produces one byte of parity (only 6 bits used, I think) for each byte of data. This is because the FCB blocks are not in BCH format, as those blocks contain the BCH parameters that are necessary to decode BCH encoded blocks. Thus the FCB blocks are raw so the BCH parameters can be read, then the rest of the blocks can be in BCH mode.
I think what needs to be done is for u-boot to have a nand write.raw that writes a block in non-BCH mode WITHOUT writing the OOB data. Since the OOB data in the FCB blocks should not actually be written, it's not necessarily a problem that mxsboot fails to include it in the image. The problem really is that u-boot nand write expects it and the image does not have it.
What OOB data exactly are you talking about? The 10b metadata at the beginning? Or the parity blocks? Or what? The other option would be to replace the zero'd bytes in the image with 0xff maybe, but I didn't test that.
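For illustration, the 0xff idea might look something like the sketch below. It is untested against real hardware, is not the actual mx28_nand_fcb_block() code, and assumes the 2048+64 byte page and the 12/512/512 byte layout discussed in this thread. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_DATA_SIZE  2048u
#define PAGE_OOB_SIZE   64u
#define PAGE_RAW_SIZE   (PAGE_DATA_SIZE + PAGE_OOB_SIZE)
#define FCB_DATA_OFFSET 12u
#define FCB_DATA_SIZE   512u
#define FCB_ECC_SIZE    512u

/*
 * Build one raw FCB page. Bytes past the FCB content are left at 0xFF
 * (the erased state) instead of zero, so the first OOB byte does not
 * read as a bad block marker.
 */
static void build_fcb_raw_page(uint8_t *page,
                               const uint8_t *fcb, const uint8_t *ecc)
{
    assert(page != NULL && fcb != NULL && ecc != NULL);

    /* Start from the erased state so untouched bytes stay 0xFF. */
    memset(page, 0xFF, PAGE_RAW_SIZE);
    /* Keep the zeroed lead-in ahead of the FCB, as in the existing image. */
    memset(page, 0x00, FCB_DATA_OFFSET);
    /* FCB struct, then one parity byte per FCB byte right after it. */
    memcpy(page + FCB_DATA_OFFSET, fcb, FCB_DATA_SIZE);
    memcpy(page + FCB_DATA_OFFSET + FCB_DATA_SIZE, ecc, FCB_ECC_SIZE);
}
```

The first 1036 bytes come out the same as with a zero-filled buffer; only the padding and OOB differ. Whether the ROM accepts 0xFF padding there is an open question.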
Best regards, Marek Vasut

Dear Paul B. Henson,
Let me just preface this reply with the disclaimer that I'm fairly new to embedded development, and it sounds like you know a lot more about what you're talking about than I do ;).
[...]
I'm not reading the thread as it -- again -- contains loads of baseless "is broken" and "doesn't work" accusations left and right, sorry. I am CCing Fabio.
The issue with the bad sectors is known. This is because kobs-ng scans the NAND and fills the DBBT. This is not something that can be done off-line, so mxsboot can never generate such an image per se. It can, on the other hand, generate a bootable image. Note that mxsboot is by default configured for 2048+64 bps flashes.
Best regards, Marek Vasut

On Sat, Apr 13, 2013 at 7:42 AM, Marek Vasut marex@denx.de wrote:
Dear Paul B. Henson,
Let me just preface this reply with the disclaimer that I'm fairly new to embedded development, and it sounds like you know a lot more about what you're talking about than I do ;).
[...]
I'm not reading the thread as it -- again -- contains loads of baseless "is broken" and "doesn't work" accusations left and right, sorry. I am CCing Fabio.
So why did you respond? Just to tell us you don't care if there are bugs in mxsboot?
The issue with the bad sectors is known. This is because the kobs-ng scans the NAND and fills the DBBT. This is not something that can be done off-line, so the
You misunderstand, flashing with u-boot marks the sectors as bad. It's not necessary to ever use kobs-ng to see the problem, so it can't possibly be the cause.

Dear Trent Piepho,
On Sat, Apr 13, 2013 at 7:42 AM, Marek Vasut marex@denx.de wrote:
Dear Paul B. Henson,
Let me just preface this reply with the disclaimer that I'm fairly new to embedded development, and it sounds like you know a lot more about what you're talking about than I do ;).
[...]
I'm not reading the thread as it -- again -- contains loads of baseless "is broken" and "doesn't work" accusations left and right, sorry. I am CCing Fabio.
So why did you respond? Just to tell us you don't care if there are bugs in mxsboot?
To CC Fabio to let him handle this issue, we obviously cannot work together.
The issue with the bad sectors is known. This is because the kobs-ng scans the NAND and fills the DBBT. This is not something that can be done off-line, so the
You misunderstand, flashing with u-boot marks the sectors as bad. It's not necessary to ever use kobs-ng to see the problem, so it can't possibly be the cause.
Then blank DBBT is also a problem.
Best regards, Marek Vasut
participants (4): Marek Vasut, Paul B. Henson, Scott Wood, Trent Piepho