[U-Boot] fw_setenv broken?

I'm seeing some strange behavior with the fw_setenv tools on OMAP.
Here's what I see when using the tools on OMAP (overo in this case):
1. fw_printenv prints the environment with no issues [1]
2. fw_setenv allows me to change a variable with no reported errors [2]
3. fw_printenv will print the changed environment, however the variables are not sorted [3]
4. Since all seems well at this point, I reboot. There is an error reading the new environment [4]
I added debug printf's to readenv() in env_nand.c and the root cause is an error return from ret = nand_read(&nand_info[0], offset, &len, char_ptr). I get an error code of -74.
Before I spend too much time on this I wanted to check to see if others are seeing this issue, or whether it might be OMAP specific.
Regards,
Steve
[1]
root@omap3-multi:~# fw_printenv
baudrate=115200
bootcmd=if mmc rescan ${mmcdev}; then if run loadbootscript; then run bootscript; else if run loaduimage; then run mmcboot; else run nandboot; fi; fi; else run nandboot; fi
bootdelay=5
bootscript=echo Running bootscript from mmc ...; source ${loadaddr}
console=ttyS2,115200n8
defaultdisplay=dvi
dieid#=1e1e00040000000004032d460d01900b
dvimode=1024x768MR-16@60
ethact=smc911x-0
loadaddr=0x82000000
loadbootscript=fatload mmc ${mmcdev} ${loadaddr} boot.scr
loaduimage=fatload mmc ${mmcdev} ${loadaddr} uImage
mmcargs=setenv bootargs console=${console} mpurate=${mpurate} vram=${vram} omapfb.mode=dvi:${dvimode} omapfb.debug=y omapdss.def_disp=${defaultdisplay} root=${mmcroot} rootfstype=${mmcrootfstype}
mmcboot=echo Booting from mmc ...; run mmcargs; bootm ${loadaddr}
mmcdev=0
mmcroot=/dev/mmcblk0p2 rw
mmcrootfstype=ext3 rootwait
mpurate=500
nandargs=setenv bootargs console=${console} mpurate=${mpurate} vram=${vram} omapfb.mode=dvi:${dvimode} omapfb.debug=y omapdss.def_disp=${defaultdisplay} root=${nandroot} rootfstype=${nandrootfstype}
nandboot=echo Booting from nand ...; run nandargs; nand read ${loadaddr} 280000 400000; bootm ${loadaddr}
nandroot=/dev/mtdblock4 rw
nandrootfstype=jffs2
stderr=serial
stdin=serial
stdout=serial
vram=12M
[2]
root@omap3-multi:~# fw_setenv mpurate 720
[3]
root@omap3-multi:~# fw_printenv
baudrate=115200
bootcmd=if mmc rescan ${mmcdev}; then if run loadbootscript; then run bootscript; else if run loaduimage; then run mmcboot; else run nandboot; fi; fi; else run nandboot; fi
bootdelay=5
bootscript=echo Running bootscript from mmc ...; source ${loadaddr}
console=ttyS2,115200n8
defaultdisplay=dvi
dieid#=1e1e00040000000004032d460d01900b
dvimode=1024x768MR-16@60
ethact=smc911x-0
loadaddr=0x82000000
loadbootscript=fatload mmc ${mmcdev} ${loadaddr} boot.scr
loaduimage=fatload mmc ${mmcdev} ${loadaddr} uImage
mmcargs=setenv bootargs console=${console} mpurate=${mpurate} vram=${vram} omapfb.mode=dvi:${dvimode} omapfb.debug=y omapdss.def_disp=${defaultdisplay} root=${mmcroot} rootfstype=${mmcrootfstype}
mmcboot=echo Booting from mmc ...; run mmcargs; bootm ${loadaddr}
mmcdev=0
mmcroot=/dev/mmcblk0p2 rw
mmcrootfstype=ext3 rootwait
nandargs=setenv bootargs console=${console} mpurate=${mpurate} vram=${vram} omapfb.mode=dvi:${dvimode} omapfb.debug=y omapdss.def_disp=${defaultdisplay} root=${nandroot} rootfstype=${nandrootfstype}
nandboot=echo Booting from nand ...; run nandargs; nand read ${loadaddr} 280000 400000; bootm ${loadaddr}
nandroot=/dev/mtdblock4 rw
nandrootfstype=jffs2
stderr=serial
stdin=serial
stdout=serial
vram=12M
mpurate=720
[4]
U-Boot 2010.12-rc1 (Nov 17 2010 - 08:04:09)
OMAP3530-GP ES3.1, CPU-OPP2, L3-165MHz, Max CPU Clock 600 mHz
Gumstix Overo board + LPDDR/NAND
I2C: ready
DRAM: 256 MiB
NAND: 256 MiB
MMC: OMAP SD/MMC: 0
*** Warning - readenv() failed, using default environment

On 11/17/2010 05:30 PM, Steve Sakoman wrote:
> I'm seeing some strange behavior with the fw_setenv tools on OMAP.
> Here's what I see when using the tools on OMAP (overo in this case):
> - fw_printenv prints the environment with no issues [1]
> - fw_setenv allows me to change a variable with no reported errors [2]
> - fw_printenv will print the changed environment, however the variables
> are not sorted [3]
I tested yesterday on a davinci board and can confirm this behavior; I had not thought it was an error. I do not see any code in fw_env.c to sort variables. I know the variables are sorted in U-Boot, but did we ever have this feature in the userland fw_printenv?
> - Since all seems well at this point, I reboot. There is an error
> reading the new environment [4]
I cannot confirm that. I made the same test with the environment stored on a SPI flash. No CRC error, and the environment was restored correctly after a reset and/or power cycle.
> I added debug printf's to readenv() in env_nand.c and the root cause is an error return from ret = nand_read(&nand_info[0], offset, &len, char_ptr). I get an error code of -74
> Before I spend too much time on this I wanted to check to see if others are seeing this issue, or whether it might be OMAP specific.
At least this should not be a general failure, because it works on my target. It could also be NAND specific.
Best regards,
Stefano

Dear Stefano Babic,
In message 4CE4092B.7090209@denx.de you wrote:
> I tested yesterday on a davinci board and can confirm this behavior; I had not thought it was an error. I do not see any code in fw_env.c to sort variables. I know the variables are sorted in U-Boot, but did we ever have this feature in the userland fw_printenv?
Indeed this behaviour is normal. fw_printenv does not sort the output (not yet - patches welcome).
>> I added debug printf's to readenv() in env_nand.c and the root cause is an error return from ret = nand_read(&nand_info[0], offset, &len, char_ptr). I get an error code of -74
>> Before I spend too much time on this I wanted to check to see if others are seeing this issue, or whether it might be OMAP specific.
> At least this should not be a general failure, because it works on my target. It could also be NAND specific.
Thanks for confirming this.
Well, the next step should be a review of the code, where error -74 gets set and what that probably means...
Best regards,
Wolfgang Denk

On Wed, 2010-11-17 at 18:39 +0100, Wolfgang Denk wrote:
> Well, the next step should be a review of the code, where error -74 gets set and what that probably means...
Well, since -74 is EBADMSG, I suspect the error occurs at the following code in nand_do_read_ops() in nand_base.c:
    if (mtd->ecc_stats.failed - stats.failed)
        return -EBADMSG;
I'm not very familiar with the NAND driver code, so I'll add some debug printfs and see if I can determine why this is happening.
Steve
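
The two return codes that come up in this thread can be decoded with a small helper. This is only a sketch: the numeric values are the Linux asm-generic errno numbers (74 and 117), which U-Boot's headers are assumed to share, and `nand_err_name` and the `MTD_*` macros are hypothetical names, not part of U-Boot.

```c
/* Linux asm-generic errno values; assumed to match U-Boot's errno.h. */
#define MTD_EBADMSG 74   /* ECC could not repair the data */
#define MTD_EUCLEAN 117  /* ECC repaired the data; buffer is still valid */

/* Hypothetical helper: turn a nand_read()-style return code into text. */
const char *nand_err_name(int ret)
{
    switch (ret) {
    case 0:
        return "OK";
    case -MTD_EBADMSG:
        return "EBADMSG";
    case -MTD_EUCLEAN:
        return "EUCLEAN";
    default:
        return "unknown";
    }
}
```

Calling `nand_err_name(ret)` from a debug printf in readenv() would print the symbolic name next to the raw number.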

On Wednesday, November 17, 2010 12:39:33 Wolfgang Denk wrote:
> Indeed this behaviour is normal. fw_printenv does not sort the output (not yet - patches welcome).
why bloat the code ? why can't people simply run `fw_printenv | sort` ?
-mike

Dear Mike Frysinger,
In message 201011171313.27696.vapier@gentoo.org you wrote:
>> Indeed this behaviour is normal. fw_printenv does not sort the output (not yet - patches welcome).
> why bloat the code ? why can't people simply run `fw_printenv | sort` ?
Well, you are of course right, but some people expect consistent behaviour. And in U-Boot "printenv" will sort the output.
Best regards,
Wolfgang Denk
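
If sorting were done inside the tool rather than through a pipe, it could look roughly like the sketch below. This is not code from fw_env.c; `sort_env_entries` and `cmp_entry` are hypothetical names, and the entries are assumed to be already collected into an array of "name=value" strings. Note that it compares whole lines byte by byte (like `sort` in the C locale), which is not guaranteed to give exactly the same order as sorting by variable name only.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of "name=value" string pointers. */
static int cmp_entry(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcmp(*sa, *sb);
}

/* Sort n environment entries in place, bytewise, whole line at a time. */
void sort_env_entries(const char **entries, size_t n)
{
    qsort(entries, n, sizeof(entries[0]), cmp_entry);
}
```

With the output from [3], this would move the trailing mpurate=720 back between mmcrootfstype and nandargs.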

On Wed, Nov 17, 2010 at 9:39 AM, Wolfgang Denk wd@denx.de wrote:
> Well, the next step should be a review of the code, where error -74 gets set and what that probably means...
I've experimented on a couple of boards and it seems to always fail.
The nand_do_read_ops function in nand_base.c ends like so:
    if (mtd->ecc_stats.failed - stats.failed)
        return -EBADMSG;
    return mtd->ecc_stats.corrected - stats.corrected ? -EUCLEAN : 0;
}
After writing the environment with fw_setenv in linux, u-boot's read of the environment on the subsequent boot always fails with either EBADMSG or EUCLEAN.
I'll keep digging, but perhaps the above might mean something to someone with more knowledge of the nand driver.
Steve
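
The quoted tail of nand_do_read_ops() compares the ECC counters before and after the read. A stand-alone model of that decision is sketched below; `struct ecc_stats` is a simplified stand-in for the real MTD statistics structure, `read_result` is a hypothetical name, and the errno numbers are the Linux values, assumed to apply here.

```c
#define MTD_EBADMSG 74   /* ECC could not repair the data */
#define MTD_EUCLEAN 117  /* ECC repaired the data; buffer is still valid */

/* Simplified stand-in for the MTD ECC counters. */
struct ecc_stats {
    unsigned int corrected; /* bitflips that ECC repaired */
    unsigned int failed;    /* reads that ECC could not repair */
};

/*
 * Same decision as the quoted code: any new failure wins over any new
 * correction, and a clean read returns 0.
 */
int read_result(const struct ecc_stats *before, const struct ecc_stats *after)
{
    if (after->failed - before->failed)
        return -MTD_EBADMSG;
    return after->corrected - before->corrected ? -MTD_EUCLEAN : 0;
}
```

Under this model a read returns -EUCLEAN only when the data was repaired, which is why the two codes have to be handled differently by the caller.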

Dear Steve Sakoman,
In message AANLkTikaLbZG5ED=p-_0MWoLOjH=kfna-P8SYAC=Nisx@mail.gmail.com you wrote:
> After writing the environment with fw_setenv in linux, u-boot's read of the environment on the subsequent boot always fails with either EBADMSG or EUCLEAN.
Can you read - in U-Boot - any other data written in Linux? Possibly there is some discrepancy, for example in the use of software versus hardware ECC or such?
Best regards,
Wolfgang Denk

On Wed, Nov 17, 2010 at 12:47 PM, Wolfgang Denk wd@denx.de wrote:
> Can you read - in U-Boot - any other data written in Linux? Possibly there is some discrepancy, for example in the use of software versus hardware ECC or such?
I just did that experiment!
As I mentioned, after writing with fw_setenv, I get an error in u-boot (I added a couple of printf's to indicate the offset being read from as well as any error codes returned):
U-Boot 2010.12-rc1 (Nov 17 2010 - 11:20:23)
OMAP3630/3730-GP ES1.0, CPU-OPP2, L3-165MHz, Max CPU Clock 1 Ghz
Gumstix Overo board + LPDDR/NAND
I2C: ready
DRAM: 256 MiB
NAND: 256 MiB
MMC: OMAP SD/MMC: 0
readenv: offset = 240000
readenv: nand_read failure = -117
*** Warning - readenv() failed, using default environment
I then immediately tried to use the nand read command to read the same block, and it was successful!
Overo # nand read 82000000 240000 20000
NAND read: device 0 offset 0x240000, size 0x20000
 131072 bytes read: OK
Not only that, the data read looks correct!
Overo # md 82000000 140
82000000: bd8c0c16 64756162 65746172 3531313d    ....baudrate=115
82000010: 00303032 746f6f62 3d646d63 6d206669    200.bootcmd=if m
82000020: 7220636d 61637365 7b24206e 64636d6d    mc rescan ${mmcd
82000030: 3b7d7665 65687420 6669206e 6e757220    ev}; then if run
82000040: 616f6c20 6f6f6264 72637374 3b747069     loadbootscript;
82000050: 65687420 7572206e 6f62206e 6373746f     then run bootsc
82000060: 74706972 6c6d203b 69206573 75722066    ript; mlse if ru
82000070: 6f6c206e 69756461 6567616d 6874203b    n loaduimage; th
82000080: 72206e65 6d206e75 6f62636d 203b746f    en run mmcboot; 
82000090: 65736c65 6e757220 6e616e20 6f6f6264    else run nandboo
820000a0: 66203b74 66203b69 65203b69 2065736c    t; fi; fi; else 
820000b0: 206e7572 646e616e 746f6f62 6966203b    run nandboot; fi
820000c0: 6f6f6200 6c656474 353d7961 6f6f6200    .bootdelay=5.boo
820000d0: 72637374 3d747069 6f686365 6e755220    tscript=echo Run
820000e0: 676e696e 6f6f6220 72637374 20747069    ning bootscript 
820000f0: 6d6f7266 636d6d20 2e2e2e20 6f73203b    from mmc ...; so
82000100: 65637275 6c7b2420 6164616f 7d726464    urce ${loadaddr}
82000110: 6e6f6300 656c6f73 7974743d 312c3253    .console=ttyS2,1
82000120: 30323531 00386e30 61666564 64746c75    15200n8.defaultd
82000130: 6c707369 643d7961 64006976 64696569    isplay=dvi.dieid
82000140: 32363d23 30306535 62313030 30303066    #=625e00001bf000
82000150: 31303030 39333735 37306165 30663230    00015739ea0702f0
82000160: 64006530 6f6d6976 313d6564 78343230    0e.dvimode=1024x
82000170: 4d383637 36312d52 00303640 61687465    768MR-16@60.etha
82000180: 733d7463 3139636d 302d7831 616f6c00    ct=smc911x-0.loa
82000190: 64646164 78303d72 30303238 30303030    daddr=0x82000000
820001a0: 616f6c00 6f6f6264 72637374 3d747069    .loadbootscript=
820001b0: 6c746166 2064616f 20636d6d 6d6d7b24    fatload mmc ${mm
820001c0: 76656463 7b24207d 64616f6c 72646461    cdev} ${loadaddr
820001d0: 6f62207d 732e746f 6c007263 7564616f    } boot.scr.loadu
820001e0: 67616d69 61663d65 616f6c74 6d6d2064    image=fatload mm
820001f0: 7b242063 64636d6d 207d7665 6f6c7b24    c ${mmcdev} ${lo
82000200: 64616461 207d7264 616d4975 6d006567    adaddr} uImage.m
82000210: 7261636d 733d7367 6e657465 6f622076    mcargs=setenv bo
82000220: 7261746f 63207367 6f736e6f 243d656c    otargs console=$
82000230: 6e6f637b 656c6f73 706d207d 74617275    {console} mpurat
82000240: 7b243d65 7275706d 7d657461 61727620    e=${mpurate} vra
82000250: 7b243d6d 6d617276 6d6f207d 62667061    m=${vram} omapfb
82000260: 646f6d2e 76643d65 7b243a69 6d697664    .mode=dvi:${dvim
82000270: 7d65646f 616d6f20 2e626670 75626564    ode} omapfb.debu
82000280: 20793d67 70616d6f 2e737364 5f666564    g=y omapdss.def_
82000290: 70736964 647b243d 75616665 6964746c    disp=${defaultdi
820002a0: 616c7073 72207d79 3d746f6f 6d6d7b24    splay} root=${mm
820002b0: 6f6f7263 72207d74 66746f6f 70797473    croot} rootfstyp
820002c0: 7b243d65 72636d6d 66746f6f 70797473    e=${mmcrootfstyp
820002d0: 6d007d65 6f62636d 653d746f 206f6863    e}.mmcboot=echo 
820002e0: 746f6f42 20676e69 6d6f7266 636d6d20    Booting from mmc
820002f0: 2e2e2e20 7572203b 6d6d206e 67726163     ...; run mmcarg
82000300: 62203b73 6d746f6f 6c7b2420 6164616f    s; bootm ${loada
82000310: 7d726464 636d6d00 3d766564 6d6d0030    ddr}.mmcdev=0.mm
82000320: 6f6f7263 642f3d74 6d2f7665 6c62636d    croot=/dev/mmcbl
82000330: 3270306b 00777220 72636d6d 66746f6f    k0p2 rw.mmcrootf
82000340: 70797473 78653d65 72203374 77746f6f    stype=ext3 rootw
82000350: 00746961 646e616e 73677261 7465733d    ait.nandargs=set
82000360: 20766e65 746f6f62 73677261 6e6f6320    env bootargs con
82000370: 656c6f73 637b243d 6f736e6f 207d656c    sole=${console} 
82000380: 7275706d 3d657461 706d7b24 74617275    mpurate=${mpurat
82000390: 76207d65 3d6d6172 72767b24 207d6d61    e} vram=${vram} 
820003a0: 70616d6f 6d2e6266 3d65646f 3a697664    omapfb.mode=dvi:
820003b0: 76647b24 646f6d69 6f207d65 6670616d    ${dvimode} omapf
820003c0: 65642e62 3d677562 6d6f2079 73647061    b.debug=y omapds
820003d0: 65642e73 69645f66 243d7073 6665647b    s.def_disp=${def
820003e0: 746c7561 70736964 7d79616c 6f6f7220    aultdisplay} roo
820003f0: 7b243d74 646e616e 746f6f72 6f72207d    t=${nandroot} ro
82000400: 7366746f 65707974 6e7b243d 72646e61    otfstype=${nandr
82000410: 66746f6f 70797473 6e007d65 62646e61    ootfstype}.nandb
82000420: 3d746f6f 6f686365 6f6f4220 676e6974    oot=echo Booting
82000430: 6f726620 616e206d 2e20646e 203b2e2e     from nand ...; 
82000440: 206e7572 646e616e 73677261 616e203b    run nandargs; na
82000450: 7220646e 20646165 6f6c7b24 64616461    nd read ${loadad
82000460: 207d7264 30303832 34203030 30303030    dr} 280000 40000
82000470: 62203b30 6d746f6f 6c7b2420 6164616f    0; bootm ${loada
82000480: 7d726464 6e616e00 6f6f7264 642f3d74    ddr}.nandroot=/d
82000490: 6d2f7665 6c626474 346b636f 00777220    ev/mtdblock4 rw.
820004a0: 646e616e 746f6f72 79747366 6a3d6570    nandrootfstype=j
820004b0: 32736666 64747300 3d727265 69726573    ffs2.stderr=seri
820004c0: 73006c61 6e696474 7265733d 006c6169    al.stdin=serial.
820004d0: 6f647473 733d7475 61697265 7276006c    stdout=serial.vr
820004e0: 313d6d61 6d004d34 61727570 363d6574    am=14M.mpurate=6
820004f0: 00003030 00000000 00000000 00000000    00..............
Overo #
Any ideas?
Steve
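
The dump shows the layout of the stored environment: a 32-bit word first (the CRC), then NUL-terminated "name=value" strings, with an empty string ending the list. A minimal sketch of looking up one variable in such an image follows. `env_blob_get` and `ENV_HEADER_SIZE` are hypothetical names, and the 4-byte header assumes a non-redundant environment as in this dump.

```c
#include <stddef.h>
#include <string.h>

#define ENV_HEADER_SIZE 4 /* 32-bit CRC in front of the data, as in the dump */

/*
 * Hypothetical helper: find "name" in a raw environment image of "size"
 * bytes. Returns a pointer to the value, or NULL if it is not present
 * or the image is not properly terminated.
 */
const char *env_blob_get(const unsigned char *blob, size_t size,
                         const char *name)
{
    size_t pos = ENV_HEADER_SIZE;
    size_t name_len = strlen(name);

    while (pos < size && blob[pos] != '\0') {
        const char *entry = (const char *)blob + pos;
        const char *nul = memchr(entry, '\0', size - pos);
        size_t len;

        if (nul == NULL)
            return NULL; /* unterminated entry: image is truncated */
        len = (size_t)(nul - entry);
        if (len > name_len && entry[name_len] == '=' &&
            strncmp(entry, name, name_len) == 0)
            return entry + name_len + 1;
        pos += len + 1; /* skip the entry and its NUL */
    }
    return NULL;
}
```

Run against the buffer at 0x82000000 with a size of 0x500, a lookup of "mpurate" would be expected to find the entry near the end of the dump.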

Dear Steve Sakoman,
In message AANLkTimrfQ5+AWfdFy_fueTMH=x=xrkaZGNtK8fiSD48@mail.gmail.com you wrote:
> readenv: offset = 240000
> readenv: nand_read failure = -117
> *** Warning - readenv() failed, using default environment
> I then immediately tried to use the nand read command to read the same block, and it was successful!
Hm... any chance that - for example - your timers are not working correctly before relocation (maybe because they try to write to the not yet available data segment)? This could cause timeouts or delays to be too short, so the NAND driver is misbehaving?
Best regards,
Wolfgang Denk

On Wed, Nov 17, 2010 at 1:40 PM, Wolfgang Denk wd@denx.de wrote:
> Hm... any chance that - for example - your timers are not working correctly before relocation (maybe because they try to write to the not yet available data segment)? This could cause timeouts or delays to be too short, so the NAND driver is misbehaving?
Hmm . . . I suppose that is possible, but it doesn't seem to explain why environment data written by u-boot will always be read successfully, while reads of linux-written data fail.
Steve

On Wed, 17 Nov 2010 22:40:49 +0100 Wolfgang Denk wd@denx.de wrote:
> Hm... any chance that - for example - your timers are not working correctly before relocation (maybe because they try to write to the not yet available data segment)? This could cause timeouts or delays to be too short, so the NAND driver is misbehaving?
The NAND driver only works after relocation.
It looks like the problem is that -EUCLEAN is a non-fatal error (indicates a correctable ECC error). The code invoked by the "nand read" command succeeds if nand_read() returns either 0 or -EUCLEAN, but readenv() is missing this check.
-Scott
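
The missing check described above amounts to accepting two return values instead of one. A minimal sketch, with `nand_read_ok` as a hypothetical helper name and 117 as the Linux errno value for EUCLEAN (assumed to be the same in U-Boot):

```c
#define MTD_EUCLEAN 117  /* ECC repaired the data; buffer is still valid */

/*
 * Hypothetical helper: returns 1 when a nand_read()-style return code
 * means the buffer holds valid data, 0 otherwise.
 */
int nand_read_ok(int ret)
{
    return ret == 0 || ret == -MTD_EUCLEAN;
}
```

A caller such as readenv() would then bail out only when `nand_read_ok(ret)` is false, so a corrected bitflip no longer discards a good environment.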

On Wed, 2010-11-17 at 16:08 -0600, Scott Wood wrote:
> The NAND driver only works after relocation.
> It looks like the problem is that -EUCLEAN is a non-fatal error (indicates a correctable ECC error). The code invoked by the "nand read" command succeeds if nand_read() returns either 0 or -EUCLEAN, but readenv() is missing this check.
OK, we seem to be peeling back the layers of the onion now.
I patched readenv to use the same nand_read_skip_bad function used in the command line "nand read" tool. I no longer get the -EUCLEAN errors when reading the environment after using fw_setenv to write from linux. Now I get:
*** Warning - bad CRC, using default environment
Checking the data with the "nand read" command line shows that the changes I made in linux are indeed there, so I suspect that there is also some mismatch in the CRC computation between the fw tools and the u-boot code (i.e. I'm pretty sure this error does *not* refer to the nand CRC)
Steve

Dear Steve Sakoman,
In message 1290034139.2927.1192.camel@quadra you wrote:
> I patched readenv to use the same nand_read_skip_bad function used in the command line "nand read" tool. I no longer get the -EUCLEAN errors when reading the environment after using fw_setenv to write from linux. Now I get:
> *** Warning - bad CRC, using default environment
> Checking the data with the "nand read" command line shows that the changes I made in linux are indeed there, so I suspect that there is also some mismatch in the CRC computation between the fw tools and the u-boot code (i.e. I'm pretty sure this error does *not* refer to the nand CRC)
Try the "crc32" command in U-Boot to check whether the checksum is correct. Check that the environment size in your board config file matches the one in the fw_env.config file. Check that both also agree on whether or not a redundant environment is used.
Best regards,
Wolfgang Denk
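
The checks suggested above can be modelled off-target. The sketch below recomputes a standard CRC-32 over the data area of an environment image and compares it with the stored word. It assumes a non-redundant layout (4-byte CRC, no flag byte), a CRC stored in the byte order of the machine running the check, and a checksum that covers the whole data area up to the configured environment size; `crc32_calc` and `env_crc_ok` are hypothetical names. Under those assumptions, a size mismatch between the two configurations changes the checksummed range and fails the check even when the variables themselves are intact.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), as used by zlib. */
uint32_t crc32_calc(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Returns 1 if the 32-bit CRC at the start of the image matches the CRC
 * of the remaining env_size - 4 bytes, 0 otherwise.
 */
int env_crc_ok(const unsigned char *image, size_t env_size)
{
    uint32_t stored;

    if (env_size <= sizeof(stored))
        return 0;
    memcpy(&stored, image, sizeof(stored));
    return stored == crc32_calc(image + sizeof(stored),
                                env_size - sizeof(stored));
}
```

Calling `env_crc_ok` twice on the same image, once with each configured size, is a quick way to see whether the two sides disagree about the environment size.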

On Wed, 2010-11-17 at 16:08 -0600, Scott Wood wrote:
> The NAND driver only works after relocation.
> It looks like the problem is that -EUCLEAN is a non-fatal error (indicates a correctable ECC error). The code invoked by the "nand read" command succeeds if nand_read() returns either 0 or -EUCLEAN, but readenv() is missing this check.
Changing readenv to use nand_read_skip_bad eliminated the -117 (EUCLEAN) failures.
Now I am getting just the -74 (EBADMSG) errors for fw_setenv written environments.
It seems that fw_printenv can always read u-boot written environments, but 99.9% of the time I get a -74 (EBADMSG) error in u-boot for environments written by fw_setenv:
NAND read from offset 240000 failed -74
*** Warning - readenv() failed, using default environment
If I try to read the environment using the nand read tool I get the same error.
Using fw_printenv always seems to work -- whether u-boot or fw_setenv was the writer.
The code generating both errors is in the nand_do_read_ops function in nand_base.c:
    if (mtd->ecc_stats.failed - stats.failed)
        return -EBADMSG;
    return mtd->ecc_stats.corrected - stats.corrected ? -EUCLEAN : 0;
}
I understand that the -EUCLEAN error indicates a correctable ECC error. What does the -EBADMSG error indicate?
This condition doesn't seem to bother the linux driver, but u-boot doesn't like it at all!
Steve

On Thu, 18 Nov 2010 16:13:52 -0800 Steve Sakoman steve@sakoman.com wrote:
> The code generating both errors is in the nand_do_read_ops function in nand_base.c:
>     if (mtd->ecc_stats.failed - stats.failed)
>         return -EBADMSG;
>     return mtd->ecc_stats.corrected - stats.corrected ? -EUCLEAN : 0;
> }
> I understand that the -EUCLEAN error indicates a correctable ECC error. What does the -EBADMSG error indicate?
An uncorrectable ECC error (or other failure).
> This condition doesn't seem to bother the linux driver, but u-boot doesn't like it at all!
Check whether the ECC layout and code are the same for this driver in both U-Boot and Linux.
-Scott

On Thu, 2010-11-18 at 18:20 -0600, Scott Wood wrote:
> Check whether the ECC layout and code are the same for this driver in both U-Boot and Linux.
Since fw_printenv in Linux can always successfully read an environment written by U-Boot (as well as those written by fw_setenv), wouldn't this indicate that they are using the same ECC layout? If they were different I would expect compatibility to be broken in both directions.
This is not my area of expertise, so perhaps I am just ignorant of how things work.
Steve

On Thu, Nov 18, 2010 at 4:20 PM, Scott Wood scottwood@freescale.com wrote:
> Check whether the ECC layout and code are the same for this driver in both U-Boot and Linux.
Well, the mystery is solved.
The strange behavior was a combination of the -EUCLEAN issue in u-boot and the following bizarre bug that crept into the Linux OMAP NAND driver in 2.6.26:
http://article.gmane.org/gmane.linux.ports.arm.omap/46545
I will submit a patch to deal with the u-boot issue tomorrow, and it seems that a fix is already queued for Linux 2.6.37.
Thanks to Scott Wood for helping with the -EUCLEAN issue and Scott Ellis for noticing that what might be the same issue was being discussed on both the u-boot and linux lists today.
Steve
participants (6)
- Mike Frysinger
- Scott Wood
- Stefano Babic
- Steve Sakoman
- Steve Sakoman
- Wolfgang Denk