[U-Boot-Users] [PATCH] Fixed cfi flash read uchar bug.

From: Zhang Wei wei.zhang@freescale.com
For the flash of port width more than 8bit, a completetly read must be performed. For example, 16bit port width flash must perform a ushort read. Otherwise, some flashes will get error data.
Signed-off-by: Zhang Wei wei.zhang@freescale.com --- drivers/cfi_flash.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cfi_flash.c b/drivers/cfi_flash.c index 9b10220..f02b047 100644 --- a/drivers/cfi_flash.c +++ b/drivers/cfi_flash.c @@ -250,9 +250,9 @@ #endif */ inline uchar flash_read_uchar (flash_info_t * info, uint offset) { - uchar *cp; + uchar cp[FLASH_CFI_64BIT];
- cp = flash_make_addr (info, 0, offset); + memcpy(cp, flash_make_addr (info, 0, offset), info->portwidth); #if defined(__LITTLE_ENDIAN) return (cp[0]); #else

In message <1166772458178-git-send-email-> you wrote:
For the flash of port width more than 8bit, a completetly read must be performed. For example, 16bit port width flash must perform a ushort read. Otherwise, some flashes will get error data.
Please explain under which circumstances you think this is necessary.
Signed-off-by: Zhang Wei wei.zhang@freescale.com
drivers/cfi_flash.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cfi_flash.c b/drivers/cfi_flash.c index 9b10220..f02b047 100644 --- a/drivers/cfi_flash.c +++ b/drivers/cfi_flash.c @@ -250,9 +250,9 @@ #endif */ inline uchar flash_read_uchar (flash_info_t * info, uint offset) {
- uchar *cp;
- uchar cp[FLASH_CFI_64BIT];
- cp = flash_make_addr (info, 0, offset);
- memcpy(cp, flash_make_addr (info, 0, offset), info->portwidth);
This most probably does NOT perform any ushort reads at all, but does a simple byte by byte copy, so your patch does not make much sense to me.
Please explain in which sort of problems you see that are supposed to be fixed by this modification.
Best regards,
Wolfgang Denk

Hi, Wolfgang,
This cfi-flash issue was found in u-boot 1.1.6 top git tree. On our board (MPC8641HPCn), sometimes the u-boot will get the wrong 'num_erase_regions' (0xff) when it boots up. It causes the u-boot can not find and access the flash. If I remove the patch -- '[PATCH] CFI driver AMD Command Set Top boot geometry reversal, etc.' from Stefan Roese sr@denx.de in Nov. 13, 2006. All of the flash in u-boot are OK.
That patch adds support for reading JEDEC Manufacturer ID and Device ID causes. -- function flash_read_jedec_ids(). In our board, we have 'Spinsion S29GL064M90TFIR6' flash with 16bit width port connection. After perform flash_read_jedec_ids(), the cfi query read will get an '0xff'. From this flash's specification, the read for flash commands in 16bit port connection should perform 16bit read and ignore the high 8bit data. Although this issue maybe occurs in the special chip, perform the fully 16bit read is a safety operation.
The modification refers to the cfi flash drivers in Linux kernel (The cfi_read_query() function in include/linux/mtd/cfi.h file). It's more easy and clear than making the different type date read for different portwidth (such as ushort_read() for x16, ulong_read() for x32).
Thanks!
Best Regards, Zhang Wei
Wolfgang Denk wrote:
In message <1166772458178-git-send-email-> you wrote:
For the flash of port width more than 8bit, a completetly read must be performed. For example, 16bit port width flash must perform a ushort read. Otherwise, some flashes will get error data.
Please explain under which circumstances you think this is necessary.
Signed-off-by: Zhang Wei wei.zhang@freescale.com
drivers/cfi_flash.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cfi_flash.c b/drivers/cfi_flash.c index 9b10220..f02b047 100644 --- a/drivers/cfi_flash.c +++ b/drivers/cfi_flash.c @@ -250,9 +250,9 @@ #endif */ inline uchar flash_read_uchar (flash_info_t * info, uint offset) {
- uchar *cp;
- uchar cp[FLASH_CFI_64BIT];
- cp = flash_make_addr (info, 0, offset);
- memcpy(cp, flash_make_addr (info, 0, offset), info->portwidth);
This most probably does NOT perform any ushort reads at all, but does a simple byte by byte copy, so your patch does not make much sense to me.
Please explain in which sort of problems you see that are supposed to be fixed by this modification.
Best regards,
Wolfgang Denk

In message 458BBB5D.1030005@freescale.com you wrote:
After perform flash_read_jedec_ids(), the cfi query read will get an '0xff'. From this flash's specification, the read for flash commands in 16bit port connection should perform 16bit read and ignore the high 8bit data. Although this issue maybe occurs in the special chip, perform the fully 16bit read is a safety operation.
But your patch does NOT perform a 16 bit read. It calls memcpy(); the default implementation of memcpy [see lib_generic/string.c] does this:
char *tmp = (char *) dest, *s = (char *) src;
while (count--) *tmp++ = *s++;
i. e. it performs a character copy, too, so it makes no difference compared to the original code.
The modification refers to the cfi flash drivers in Linux kernel (The cfi_read_query() function in include/linux/mtd/cfi.h file). It's more easy and clear than making the different type date read for different portwidth (such as ushort_read() for x16, ulong_read() for x32).
I understand your intentions, but your patch does not do what you think it does, so if it really fixes the problem on your system there must be another cause or effect.
Best regards,
Wolfgang Denk

Hi, Wolfgang,
Yes, the memcpy() is just a byte copy. But a x16 read can be emulated by two x8 read. And in fact, the flash_read_ushort(), flash_read_long() in cfi_flash.c are using the same implementation.
In addition, the original code only reads 8bit, not the full 16bit. My patch ensures the full 16bit data are read completely.
Thanks!
Best Regards, Zhang Wei
-----Original Message----- From: Wolfgang Denk [mailto:wd@denx.de] Sent: 2006-12-22 (星期五) 22:11 To: Zhang Wei-r63237 Cc: u-boot-users@lists.sourceforge.net Subject: Re: [U-Boot-Users] [PATCH] Fixed cfi flash read uchar bug.
In message 458BBB5D.1030005@freescale.com you wrote:
After perform flash_read_jedec_ids(), the cfi query read will get an '0xff'. From this flash's specification, the read for flash commands in 16bit port connection should perform 16bit read and ignore the high 8bit data. Although this issue maybe occurs in the special chip, perform the fully 16bit read is a safety operation.
But your patch does NOT perform a 16 bit read. It calls memcpy(); the default implementation of memcpy [see lib_generic/string.c] does this:
char *tmp = (char *) dest, *s = (char *) src;
while (count--) *tmp++ = *s++;
i. e. it performs a character copy, too, so it makes no difference compared to the original code.
The modification refers to the cfi flash drivers in Linux kernel (The cfi_read_query() function in include/linux/mtd/cfi.h file). It's more easy and clear than making the different type date read for different portwidth (such as ushort_read() for x16, ulong_read() for x32).
I understand your intentions, but your patch does not do what you think it does, so if it really fixes the problem on your system there must be another cause or effect.
Best regards,
Wolfgang Denk

In message 2176B872C0407E44887F07CCAA869293832458@zch01exm21.fsl.freescale.net you wrote:
Yes, the memcpy() is just a byte copy. But a x16 read can be emulated by two x8 read. And in fact, the flash_read_ushort(), flash_read_long() in cfi_flash.c are using the same implementation.
But these are supposed to read more than one byte.
In addition, the original code only reads 8bit, not the full 16bit. My patch ensures the full 16bit data are read completely.
I still don't understand why flash_read_uchar() should read more than one byte? If you need a 16 bit read operation I would expect you to use flash_read_ushort() instead.
Best regards,
Wolfgang Denk

Hi, Wolfgang,
Merry X'mas!
I want to explain two key points. I also take the x16 connection flash example.
1. Reading 16bit data in flash_read_uchar() will not cause an error. We can get an example from the CFI specification. We run the CFI Query-unique ACSII string "QRY" command on x16 device with x16 mode connection. From the word address 0x10, 0x11, 0x12, we could get "Q", "R", "Y". If using byte address, we should get them like below: byte address 0x20 got "Q", 0x21 got 0x0, 0x22 got "R", 0x23 got 0x0, 0x24 got "Y", 0x25 got 0x0. In the u-boot, The word address offset '0x10' in flash_read_uchar() will be translated to byte address '0x20' by flash_make_addr(). I get the high 8-bit in flash_read_uchar(), which is just a NULL and discarded.
2. Reading 16bit data in flash_read_uchar() is a safety operation, which will avoid the potential problem. In the Linux kernel, the cfi driver using the similar implementation. I think reading the full width bit data from the flash port may clear the flash data buffer. I've tested to add 'volatile' keyword to the flash_get_size() and flash_read_jedec_ids() functions' declaration. The flash can not work also. From Chris' work, the compiler's optimizing causes some flash to get the error.
Thanks!
Best Regards, Zhang Wei
Wolfgang Denk wrote:
In message 2176B872C0407E44887F07CCAA869293832458@zch01exm21.fsl.freescale.net you wrote:
Yes, the memcpy() is just a byte copy. But a x16 read can be emulated by two x8 read. And in fact, the flash_read_ushort(), flash_read_long() in cfi_flash.c are using the same implementation.
But these are supposed to read more than one byte.
In addition, the original code only reads 8bit, not the full 16bit. My patch ensures the full 16bit data are read completely.
I still don't understand why flash_read_uchar() should read more than one byte? If you need a 16 bit read operation I would expect you to use flash_read_ushort() instead.
Best regards,
Wolfgang Denk

Dear Zhang Wei,
in message 458F77D0.10203@freescale.com you wrote:
Merry X'mas!
Thanks, and the same to you.
I want to explain two key points. I also take the x16 connection flash example.
OK.
- Reading 16bit data in flash_read_uchar() will not cause an error.
OK.
- Reading 16bit data in flash_read_uchar() is a safety operation, which
will avoid the potential problem.
Now this is what I want to understand. What exactly is the "potential problem"?
In the Linux kernel, the cfi driver using the similar
Can you please point out which specific part of the Linux MTD code you mean? And which version of the code?
implementation. I think reading the full width bit data from the flash port may clear the flash data buffer. I've tested to add 'volatile'
I have to admit that I don't really understand this. I would not be surprised that some statement like this can be found in some chip errata, but I would like to know this for certain first.
keyword to the flash_get_size() and flash_read_jedec_ids() functions' declaration. The flash can not work also. From Chris' work, the compiler's optimizing causes some flash to get the error.
For me this is an indication that the problem is actually somewhere else, and while your modification might actually fix the symptoms, I doubt that it is the correct fix - or the correct problem. If your explanation was right, this should not depend on compiler versions.
Best regards,
Wolfgang Denk

Hi, Wolfgang,
Wolfgang Denk wrote:
Now this is what I want to understand. What exactly is the "potential problem"?
That's the issue in the flash 'Spinsion S29GL064M90TFIR6' with x16 connection. After running flash_read_jedec_ids(), any follow CFI query command will get the data with high 8bit = 0xff, but the low 8bit is valid. And if we only read low 8bit, we'll get the 0xff too. In addition, the second follow CFI query command has no that issue. So, I read the full 16bit date and only take the valid low 8bit.
Can you please point out which specific part of the Linux MTD code you mean? And which version of the code?
The reference codes is in Linux Kernel 2.6.19.
drivers/mtd/chips/cfi_probe.c: cfi_chip_setup() calls: int num_erase_regions = cfi_read_query(map, base + (0x10 + 28)*ofs_factor); include/linux/mtd/cfi.h: cfi_read_query() calls: map_word val = map_read(map, addr); include/linux/mtd/map.h defines: #define map_read(map, ofs) inline_map_read(map, ofs) include/linux/mtd/map.h: function inline_map_read() body: static inline map_word inline_map_read(struct map_info *map, unsigned long ofs) { map_word r;
if (map_bankwidth_is_1(map)) r.x[0] = __raw_readb(map->virt + ofs); else if (map_bankwidth_is_2(map)) r.x[0] = __raw_readw(map->virt + ofs); else if (map_bankwidth_is_4(map)) r.x[0] = __raw_readl(map->virt + ofs); #if BITS_PER_LONG >= 64 else if (map_bankwidth_is_8(map)) r.x[0] = __raw_readq(map->virt + ofs); #endif else if (map_bankwidth_is_large(map)) memcpy_fromio(r.x, map->virt+ofs, map->bankwidth);
return r; } And the 'map_word' definition in include/linux/mtd/map.h is below: typedef union { unsigned long x[MAX_MAP_LONGS]; } map_word;
I have to admit that I don't really understand this. I would not be surprised that some statement like this can be found in some chip errata, but I would like to know this for certain first.
I only find one clue from 'Spinsion S29GL064M90TFIR6' specification, which is the comment 'Data bits DQ15–DQ8 are don’t care. ' for 'RESET' command. And the comment has not found in some other AMD, Spinsion chips specifications.
For me this is an indication that the problem is actually somewhere else, and while your modification might actually fix the symptoms, I doubt that it is the correct fix - or the correct problem. If your explanation was right, this should not depend on compiler versions.
This is a real weird issue. The compiler is also a factor. Chris's different compilers get the different results. In fact, the same gcc with different size codes will also get different results.
Thanks!
Best Regards, Zhang Wei

Hi, Wolfgang,
Any feedback about this mail?
Thanks!
Zhang Wei
-----Original Message----- From: u-boot-users-bounces@lists.sourceforge.net [mailto:u-boot-users-bounces@lists.sourceforge.net] On Behalf Of Zhang Wei Sent: Tuesday, December 26, 2006 2:05 PM To: Wolfgang Denk Cc: u-boot-users@lists.sourceforge.net Subject: Re: [U-Boot-Users] [PATCH] Fixed cfi flash read uchar bug.
Hi, Wolfgang,
Wolfgang Denk wrote:
Now this is what I want to understand. What exactly is the
"potential
problem"?
That's the issue in the flash 'Spinsion S29GL064M90TFIR6' with x16 connection. After running flash_read_jedec_ids(), any follow CFI query command will get the data with high 8bit = 0xff, but the low 8bit is valid. And if we only read low 8bit, we'll get the 0xff too. In addition, the second follow CFI query command has no that issue. So, I read the full 16bit date and only take the valid low 8bit.
Can you please point out which specific part of the Linux
MTD code you
mean? And which version of the code?
The reference codes is in Linux Kernel 2.6.19.
drivers/mtd/chips/cfi_probe.c: cfi_chip_setup() calls: int num_erase_regions = cfi_read_query(map, base + (0x10 + 28)*ofs_factor); include/linux/mtd/cfi.h: cfi_read_query() calls: map_word val = map_read(map, addr); include/linux/mtd/map.h defines: #define map_read(map, ofs) inline_map_read(map, ofs) include/linux/mtd/map.h: function inline_map_read() body: static inline map_word inline_map_read(struct map_info *map, unsigned long ofs) { map_word r;
if (map_bankwidth_is_1(map)) r.x[0] = __raw_readb(map->virt + ofs); else if (map_bankwidth_is_2(map)) r.x[0] = __raw_readw(map->virt + ofs); else if (map_bankwidth_is_4(map)) r.x[0] = __raw_readl(map->virt + ofs);
#if BITS_PER_LONG >= 64 else if (map_bankwidth_is_8(map)) r.x[0] = __raw_readq(map->virt + ofs); #endif else if (map_bankwidth_is_large(map)) memcpy_fromio(r.x, map->virt+ofs, map->bankwidth);
return r;
} And the 'map_word' definition in include/linux/mtd/map.h is below: typedef union { unsigned long x[MAX_MAP_LONGS]; } map_word;
I have to admit that I don't really understand this. I would not be surprised that some statement like this can be found in some chip errata, but I would like to know this for certain first.
I only find one clue from 'Spinsion S29GL064M90TFIR6' specification, which is the comment 'Data bits DQ15–DQ8 are don’t care. ' for 'RESET' command. And the comment has not found in some other AMD, Spinsion chips specifications.
For me this is an indication that the problem is actually
somewhere
else, and while your modification might actually fix the
symptoms, I
doubt that it is the correct fix - or the correct problem.
If your
explanation was right, this should not depend on compiler versions.
This is a real weird issue. The compiler is also a factor. Chris's different compilers get the different results. In fact, the same gcc with different size codes will also get different results.
Thanks!
Best Regards, Zhang Wei
Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys - and earn cash http://www.techsay.com/default.php?page=join.php&p=sourceforge
&CID=DEVDEV
U-Boot-Users mailing list U-Boot-Users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/u-boot-users

In message 46B96294322F7D458F9648B60E15112C03EE48@zch01exm26.fsl.freescale.net you wrote:
SGksIFdvbGZnYW5nLA0KDQpBbnkgZmVlZGJhY2sgYWJvdXQgdGhpcyBtYWlsPw0KDQpUaGFua3Mh DQoNClpoYW5nIFdlaSANCg0KPiAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KPiBGcm9tOiB1 LWJvb3QtdXNlcnMtYm91bmNlc0BsaXN0cy5zb3VyY2Vmb3JnZS5uZXQgDQo+IFttYWlsdG86dS1i b290LXVzZXJzLWJvdW5jZXNAbGlzdHMuc291cmNlZm9yZ2UubmV0XSBPbiBCZWhhbGYgDQo+IE9m IFpoYW5nIFdlaQ0KPiBTZW50OiBUdWVzZGF5LCBEZWNlbWJlciAyNiwgMjAwNiAyOjA1IFBNDQo+ IFRvOiBXb2xmZ2FuZyBEZW5rDQo+IENjOiB1LWJvb3QtdXNlcnNAbGlzdHMuc291cmNlZm9yZ2Uu bmV0DQo+IFN1YmplY3Q6IFJlOiBbVS1Cb290LVVzZXJzXSBbUEFUQ0hdIEZpeGVkIGNmaSBmbGFz aCByZWFkIHVjaGFyIGJ1Zy4NCj4gDQo+IEhpLCBXb2xmZ2FuZywNCj4gDQo+IFdvbGZnYW5nIERl bmsgd3JvdGU6DQo+ID4gTm93IHRoaXMgaXMgd2hhdCBJIHdhbnQgdG8gdW5kZXJzdGFuZC4gV2hh dCBleGFjdGx5IGlzIHRoZSANCj4gInBvdGVudGlhbA0KPiA+IHByb2JsZW0iPw0KPiA+ICAgDQo+ IFRoYXQncyB0aGUgaXNzdWUgaW4gdGhlIGZsYXNoICdTcGluc2lvbiBTMjlHTDA2NE05MFRGSVI2 JyB3aXRoIHgxNiANCj4gY29ubmVjdGlvbi4gQWZ0ZXIgcnVubmluZyBmbGFzaF9yZWFkX2plZGVj X2lkcygpLCBhbnkgZm9sbG93IA0KPiBDRkkgcXVlcnkgDQo+IGNvbW1hbmQgd2lsbCBnZXQgdGhl IGRhdGEgd2l0aCBoaWdoIDhiaXQgPSAweGZmLCBidXQgdGhlIGxvdyA4Yml0IGlzIA0KPiB2YWxp ZC4gQW5kIGlmIHdlIG9ubHkgcmVhZCBsb3cgOGJpdCwgd2UnbGwgZ2V0IHRoZSAweGZmIHRvby4g SW4gDQo+IGFkZGl0aW9uLCB0aGUgc2Vjb25kIGZvbGxvdyBDRkkgcXVlcnkgY29tbWFuZCBoYXMg bm8gdGhhdCANCj4gaXNzdWUuIFNvLCBJIA0KPiByZWFkIHRoZSBmdWxsIDE2Yml0IGRhdGUgYW5k IG9ubHkgdGFrZSB0aGUgdmFsaWQgbG93IDhiaXQuDQo+ID4gQ2FuIHlvdSBwbGVhc2UgcG9pbnQg b3V0IHdoaWNoIHNwZWNpZmljIHBhcnQgb2YgdGhlIExpbnV4IA0KPiBNVEQgY29kZSB5b3UNCj4g PiBtZWFuPyBBbmQgd2hpY2ggdmVyc2lvbiBvZiB0aGUgY29kZT8NCj4gPiAgIA0KPiBUaGUgcmVm ZXJlbmNlIGNvZGVzIGlzIGluIExpbnV4IEtlcm5lbCAyLjYuMTkuDQo+IA0KPiBkcml2ZXJzL210 ZC9jaGlwcy9jZmlfcHJvYmUuYzogY2ZpX2NoaXBfc2V0dXAoKSBjYWxsczoNCj4gICAgIGludCBu dW1fZXJhc2VfcmVnaW9ucyA9IGNmaV9yZWFkX3F1ZXJ5KG1hcCwgYmFzZSArICgweDEwICsgDQo+ IDI4KSpvZnNfZmFjdG9yKTsNCj4gaW5jbHVkZS9saW51eC9tdGQvY2ZpLmg6IGNmaV9yZWFkX3F1 ZXJ5KCkgY2FsbHM6DQo+ICAgICBtYXBfd29yZCB2YWwgPSBtYXBfcmVhZChtYXAsIGFkZHIpOw0K PiBpbmNsdWRlL2xpbnV4L210ZC9tYXAuaCBkZWZpbmVzOg0KPiAgICAgI2RlZmluZSBtYXBfcmVh ZChtYXAsIG9mcykgaW5saW5lX21hcF9yZWFkKG1hcCwgb2ZzKQ0KPiBpbmNsdWRlL2xpbnV4L210 ZC9tYXAuaDogZnVuY3Rpb24gaW5saW5lX21hcF9yZWFkKCkgYm9keToNCj4gc3RhdGljIGlubGlu ZSBtYXBfd29yZCBpbmxpbmVfbWFwX3JlYWQoc3RydWN0IG1hcF9pbmZvICptYXAsIHVuc2lnbmVk IA0KPiBsb25nIG9mcykNCj4gew0KPiAgICAgbWFwX3dvcmQgcjsNCj4gDQo+ICAgICBpZiAobWFw X2Jhbmt3aWR0aF9pc18xKG1hcCkpDQo+ICAgICAgICAgci54WzBdID0gX19yYXdfcmVhZGIobWFw LT52aXJ0ICsgb2ZzKTsNCj4gICAgIGVsc2UgaWYgKG1hcF9iYW5rd2lkdGhfaXNfMihtYXApKQ0K PiAgICAgICAgIHIueFswXSA9IF9fcmF3X3JlYWR3KG1hcC0+dmlydCArIG9mcyk7DQo+ICAgICBl bHNlIGlmIChtYXBfYmFua3dpZHRoX2lzXzQobWFwKSkNCj4gICAgICAgICByLnhbMF0gPSBfX3Jh d19yZWFkbChtYXAtPnZpcnQgKyBvZnMpOw0KPiAjaWYgQklUU19QRVJfTE9ORyA+PSA2NA0KPiAg ICAgZWxzZSBpZiAobWFwX2Jhbmt3aWR0aF9pc184KG1hcCkpDQo+ICAgICAgICAgci54WzBdID0g X19yYXdfcmVhZHEobWFwLT52aXJ0ICsgb2ZzKTsNCj4gI2VuZGlmDQo+ICAgICBlbHNlIGlmICht YXBfYmFua3dpZHRoX2lzX2xhcmdlKG1hcCkpDQo+ICAgICAgICAgbWVtY3B5X2Zyb21pbyhyLngs IG1hcC0+dmlydCtvZnMsIG1hcC0+YmFua3dpZHRoKTsNCj4gDQo+ICAgICByZXR1cm4gcjsNCj4g fQ0KPiBBbmQgdGhlICdtYXBfd29yZCcgZGVmaW5pdGlvbiBpbiBpbmNsdWRlL2xpbnV4L210ZC9t YXAuaCBpcyBiZWxvdzoNCj4gdHlwZWRlZiB1bmlvbiB7DQo+ICAgICB1bnNpZ25lZCBsb25nIHhb TUFYX01BUF9MT05HU107DQo+IH0gbWFwX3dvcmQ7DQo+IA0KPiA+IEkgaGF2ZSB0byBhZG1pdCB0 aGF0IEkgZG9uJ3QgcmVhbGx5IHVuZGVyc3RhbmQgdGhpcy4gSSB3b3VsZCBub3QgYmUNCj4gPiBz dXJwcmlzZWQgdGhhdCBzb21lIHN0YXRlbWVudCBsaWtlIHRoaXMgY2FuIGJlIGZvdW5kIGluIHNv bWUgY2hpcA0KPiA+IGVycmF0YSwgYnV0IEkgd291bGQgbGlrZSB0byBrbm93IHRoaXMgZm9yIGNl cnRhaW4gZmlyc3QuDQo+ID4gICANCj4gSSBvbmx5IGZpbmQgb25lIGNsdWUgZnJvbSAnU3BpbnNp b24gUzI5R0wwNjRNOTBURklSNicgc3BlY2lmaWNhdGlvbiwgDQo+IHdoaWNoIGlzIHRoZSBjb21t ZW50ICdEYXRhIGJpdHMgRFExNeKAk0RROCBhcmUgZG9u4oCZdCBjYXJlLiAnIA0KPiBmb3IgJ1JF U0VUJyANCj4gY29tbWFuZC4gQW5kIHRoZSBjb21tZW50IGhhcyBub3QgZm91bmQgaW4gc29tZSBv dGhlciBBTUQsIA0KPiBTcGluc2lvbiBjaGlwcyANCj4gc3BlY2lmaWNhdGlvbnMuDQo+ID4gRm9y IG1lIHRoaXMgaXMgYW4gaW5kaWNhdGlvbiB0aGF0IHRoZSBwcm9ibGVtIGlzICBhY3R1YWxseSAg DQo+IHNvbWV3aGVyZQ0KPiA+IGVsc2UsICBhbmQgd2hpbGUgeW91ciBtb2RpZmljYXRpb24gbWln aHQgYWN0dWFsbHkgZml4IHRoZSANCj4gc3ltcHRvbXMsIEkNCj4gPiBkb3VidCB0aGF0IGl0IGlz IHRoZSBjb3JyZWN0IGZpeCAtIG9yIHRoZSBjb3JyZWN0ICBwcm9ibGVtLiANCj4gIElmICB5b3Vy DQo+ID4gZXhwbGFuYXRpb24gd2FzIHJpZ2h0LCB0aGlzIHNob3VsZCBub3QgZGVwZW5kIG9uIGNv bXBpbGVyIHZlcnNpb25zLg0KPiA+ICAgDQo+IA0KPiBUaGlzIGlzIGEgcmVhbCB3ZWlyZCBpc3N1 ZS4gVGhlIGNvbXBpbGVyIGlzIGFsc28gYSBmYWN0b3IuIENocmlzJ3MgDQo+IGRpZmZlcmVudCBj b21waWxlcnMgZ2V0IHRoZSBkaWZmZXJlbnQgcmVzdWx0cy4gSW4gZmFjdCwgdGhlIHNhbWUgZ2Nj IA0KPiB3aXRoIGRpZmZlcmVudCBzaXplIGNvZGVzIHdpbGwgYWxzbyBnZXQgZGlmZmVyZW50IHJl c3VsdHMuDQo+IA0KPiBUaGFua3MhDQo+IA0KPiBCZXN0IFJlZ2FyZHMsDQo+IFpoYW5nIFdlaQ0K PiANCj4gDQo+IC0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tDQo+IC0tLS0tLS0tLS0tDQo+IFRha2UgU3VydmV5cy4gRWFybiBDYXNo LiBJbmZsdWVuY2UgdGhlIEZ1dHVyZSBvZiBJVA0KPiBKb2luIFNvdXJjZUZvcmdlLm5ldCdzIFRl Y2hzYXkgcGFuZWwgYW5kIHlvdSdsbCBnZXQgdGhlIA0KPiBjaGFuY2UgdG8gc2hhcmUgeW91cg0K PiBvcGluaW9ucyBvbiBJVCAmIGJ1c2luZXNzIHRvcGljcyB0aHJvdWdoIGJyaWVmIHN1cnZleXMg LSBhbmQgZWFybiBjYXNoDQo+IGh0dHA6Ly93d3cudGVjaHNheS5jb20vZGVmYXVsdC5waHA/cGFn ZT1qb2luLnBocCZwPXNvdXJjZWZvcmdlDQomQ0lEPURFVkRFVg0KPiBfX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXw0KPiBVLUJvb3QtVXNlcnMgbWFpbGluZyBs aXN0DQo+IFUtQm9vdC1Vc2Vyc0BsaXN0cy5zb3VyY2Vmb3JnZS5uZXQNCj4gaHR0cHM6Ly9saXN0 cy5zb3VyY2Vmb3JnZS5uZXQvbGlzdHMvbGlzdGluZm8vdS1ib290LXVzZXJzDQo+IA0K
Please do not send base 64 encoded messages.
Please do not send HTML messages.
Please send plain text only.
Message unreadable, ignored. Sorry.

Hi, Wolfgang,
Sorry about the wrong encoding mail.
The below is the previous email:
Wolfgang Denk wrote:
Now this is what I want to understand. What exactly is the "potential problem"?
That's the issue in the flash 'Spinsion S29GL064M90TFIR6' with x16 connection. After running flash_read_jedec_ids(), any follow CFI query command will get the data with high 8bit = 0xff, but the low 8bit is valid. And if we only read low 8bit, we'll get the 0xff too. In addition, the second follow CFI query command has no that issue. So, I read the full 16bit date and only take the valid low 8bit.
Can you please point out which specific part of the Linux MTD code you mean? And which version of the code?
The reference codes is in Linux Kernel 2.6.19.
drivers/mtd/chips/cfi_probe.c: cfi_chip_setup() calls: int num_erase_regions = cfi_read_query(map, base + (0x10 + 28)*ofs_factor); include/linux/mtd/cfi.h: cfi_read_query() calls: map_word val = map_read(map, addr); include/linux/mtd/map.h defines: #define map_read(map, ofs) inline_map_read(map, ofs) include/linux/mtd/map.h: function inline_map_read() body: static inline map_word inline_map_read(struct map_info *map, unsigned long ofs) { map_word r;
if (map_bankwidth_is_1(map)) r.x[0] = __raw_readb(map->virt + ofs); else if (map_bankwidth_is_2(map)) r.x[0] = __raw_readw(map->virt + ofs); else if (map_bankwidth_is_4(map)) r.x[0] = __raw_readl(map->virt + ofs); #if BITS_PER_LONG >= 64 else if (map_bankwidth_is_8(map)) r.x[0] = __raw_readq(map->virt + ofs); #endif else if (map_bankwidth_is_large(map)) memcpy_fromio(r.x, map->virt+ofs, map->bankwidth);
return r; } And the 'map_word' definition in include/linux/mtd/map.h is below: typedef union { unsigned long x[MAX_MAP_LONGS]; } map_word;
I have to admit that I don't really understand this. I would not be surprised that some statement like this can be found in some chip errata, but I would like to know this for certain first.
I only find one clue from 'Spinsion S29GL064M90TFIR6' specification, which is the comment 'Data bits DQ15-DQ8 are don't care. ' for 'RESET' command. And the comment has not found in some other AMD, Spinsion chips
specifications.
For me this is an indication that the problem is actually somewhere else, and while your modification might actually fix the symptoms, I doubt that it is the correct fix - or the correct problem. If your explanation was right, this should not depend on compiler versions.
This is a real weird issue. The compiler is also a factor. Chris's different compilers get the different results. In fact, the same gcc with different size codes will also get different results.
Thanks!
Best Regards, Zhang Wei

In message 46B96294322F7D458F9648B60E15112C03EEE0@zch01exm26.fsl.freescale.net you wrote:
The below is the previous email:
Wolfgang Denk wrote:
Now this is what I want to understand. What exactly is the "potential problem"?
That's the issue in the flash 'Spinsion S29GL064M90TFIR6' with x16 connection. After running flash_read_jedec_ids(), any follow CFI query command will get the data with high 8bit = 0xff, but the low 8bit is valid. And if we only read low 8bit, we'll get the 0xff too. In addition, the second follow CFI query command has no that issue. So, I read the full 16bit date and only take the valid low 8bit.
etc. etc.
I didn't see any new facts in your current posting. My position has not changed either: I don't see how your character-wise copy using memcpy() would be different from accessing the flash through a uchar pointer; also I still think that if the compiler version changes behaviour then we don't really understand what's going on here.
Maybe Tolunay or Stefan can comment now that both are back from their Xmas breaks; they both know the CFI driver much better than me.
I tend to reject your proposed patch until we really understand the problem. To me, the patch seems to be wrong.
Best regards,
Wolfgang Denk

On Thursday 04 January 2007 10:23, Wolfgang Denk wrote:
Wolfgang Denk wrote:
Now this is what I want to understand. What exactly is the "potential problem"?
That's the issue in the flash 'Spinsion S29GL064M90TFIR6' with x16 connection. After running flash_read_jedec_ids(), any follow CFI query command will get the data with high 8bit = 0xff, but the low 8bit is valid. And if we only read low 8bit, we'll get the 0xff too. In addition, the second follow CFI query command has no that issue. So, I read the full 16bit date and only take the valid low 8bit.
etc. etc.
I didn't see any new facts in your current posting. My position has not changed either: I don't see how your character-wise copy using memcpy() would be different from accessing the flash through a uchar pointer; also I still think that if the compiler version changes behaviour then we don't really understand what's going on here.
Maybe Tolunay or Stefan can comment now that both are back from their Xmas breaks; they both know the CFI driver much better than me.
What I noticed after looking at the flash access functions like flash_read_uchar() is that no access macros and even no volatile pointer access is used to read from the flash. This looks like a potential problem, that can show when using different compiler versions.
Please find attached a small patch that adds fixes this potential problem for the 3 functions flash_read_uchar/ushort/long. Please give it a try and let me know if this changed the behavior somehow.
Thanks.
Best regards, Stefan

Stefan Roese wrote:
On Thursday 04 January 2007 10:23, Wolfgang Denk wrote:
Wolfgang Denk wrote:
Now this is what I want to understand. What exactly is the "potential problem"?
That's the issue in the flash 'Spinsion S29GL064M90TFIR6' with x16 connection. After running flash_read_jedec_ids(), any follow CFI query command will get the data with high 8bit = 0xff, but the low 8bit is valid. And if we only read low 8bit, we'll get the 0xff too. In addition, the second follow CFI query command has no that issue. So, I read the full 16bit date and only take the valid low 8bit.
etc. etc.
I didn't see any new facts in your current posting. My position has not changed either: I don't see how your character-wise copy using memcpy() would be different from accessing the flash through a uchar pointer; also I still think that if the compiler version changes behaviour then we don't really understand what's going on here.
Maybe Tolunay or Stefan can comment now that both are back from their Xmas breaks; they both know the CFI driver much better than me.
What I noticed after looking at the flash access functions like flash_read_uchar() is that no access macros and even no volatile pointer access is used to read from the flash. This looks like a potential problem, that can show when using different compiler versions.
Please find attached a small patch that adds fixes this potential problem for the 3 functions flash_read_uchar/ushort/long. Please give it a try and let me know if this changed the behavior somehow.
I think this is a good idea given GCC 4.x is agressive in optimizations. I do not think converting these to use memcpy() is a good idea. I am with Wolfgang on this.
I think we should commit Stefan's patch even if it might or might not solve the problem Zhang is experiencing.
Best regards, Tolunay

Dear Tolunay,
in message 45A0C777.1000508@orkun.us you wrote:
Please find attached a small patch that adds fixes this potential problem for the 3 functions flash_read_uchar/ushort/long. Please give it a try and let me know if this changed the behavior somehow.
I think this is a good idea given GCC 4.x is agressive in optimizations.
I already discussed this internally with Stefan. I *don't* think it's a good idea. I really hate to change bits and pieces of code without really understanding why we are doing this.
In the current situation we are accessing flash memory which is supposed to be in read mode, i. e. it behaves like normal system memory. It should be no problem even if the flash memory content was cached. So why would "volatile" improve anything?
As long as I don't see (for example in the generated assembler code) how a problem might exist, and how the suggested patch fixes exactly this problem, I'd like to continue researching this problem.
I do not think converting these to use memcpy() is a good idea. I am with Wolfgang on this.
Actually I might have been wrong in my assessment here, when I stated that memcpy() performs a character-wise copy, too. The simple code from lib_generic/string.c is used only if __HAVE_ARCH_MEMCPY is undefined, and especially on PPC we define this (in include/asm-ppc/string.h). So we might use an optimized versions where it *does* make a difference.
Checking this idea I see that we are actually using the memcpy() code from lib_ppc/ppcstring.S which *does* implement some optimiation, but I then I don't think this is efffecting us here.
I think we should commit Stefan's patch even if it might or might not solve the problem Zhang is experiencing.
Please explain why you think adding volatile's here would be a good thing, andin which way it improves the code.
Best regards,
Wolfgang Denk

Wolfgang Denk wrote:
Dear Tolunay,
in message 45A0C777.1000508@orkun.us you wrote:
Please find attached a small patch that adds fixes this potential problem for the 3 functions flash_read_uchar/ushort/long. Please give it a try and let me know if this changed the behavior somehow.
I think this is a good idea given GCC 4.x is agressive in optimizations.
I already discussed this internally with Stefan. I *don't* think it's a good idea. I really hate to change bits and pieces of code without really understanding why we are doing this.
In the current situation we are accessing flash memory which is supposed to be in read mode, i. e. it behaves like normal system memory. It should be no problem even if the flash memory content was cached. So why would "volatile" improve anything?
These functions are used to read data from flash tables. I agree most tables do not change but I think (need to verify) we also use these functions to read the status registers to determine if programming is OK etc. We might have been OK since we probably access these registers once in a function context. When the function returns the compiler optimization context is gone so no state data is used. So, it might not be necessary.
Perhaps, more important issue would be if these areas were cached. In that case, every time we change the flash from read mode to special query modes, we should probably invalidate the cache. In PowerPC 405 platform our flash is uncached so this is not necessary. I do not know if any platform caches its flash areas.
As long as I don't see (for example in the generated assembler code) how a problem might exist, and how the suggested patch fixes exactly this problem, I'd like to continue researching this problem.
I do not think converting these to use memcpy() is a good idea. I am with Wolfgang on this.
Actually I might have been wrong in my assessment here, when I stated that memcpy() performs a character-wise copy, too. The simple code from lib_generic/string.c is used only if __HAVE_ARCH_MEMCPY is undefined, and especially on PPC we define this (in include/asm-ppc/string.h). So we might use an optimized versions where it *does* make a difference.
Memcopy might be OK for flash tables but I am not sure if we use the same function to access status registers. I would rather access flash using bus wide accesses instead of byte by byte. Maybe it is safe but I do not know how it behaves in various platforms and bus interface units of various processors. I would take the safe route.
Best regards, Tolunay

On 1/13/07, Tolunay Orkun listmember@orkun.us wrote:
Perhaps, more important issue would be if these areas were cached. In that case, every time we change the flash from read mode to special query modes, we should probably invalidate the cache. In PowerPC 405 platform our flash is uncached so this is not necessary. I do not know if any platform caches its flash areas.
AVR32 does cache its flash area by default, but it can be worked around by specifying an uncached virtual mapping (i.e. the physical address ORed with 0xa0000000, similar to MIPS and SH) as the flash address instead of the physical address.
However, doing cached flash accesses would probably improve the performance of things like fsload, so how about using something like the following interface when uncached access is needed?
void *flash_map(unsigned long paddr, size_t len); void flash_unmap(void *vaddr, size_t len);
On platforms where the flash is always uncached, flash_map would simply return paddr cast as a void *, and flash_unmap would do nothing. Other platforms may invalidate the range and return a virtual address which will bypass the cache (or simply disable the dcache and return the physical address.)
Actually I might have been wrong in my assessment here, when I stated that memcpy() performs a character-wise copy, too. The simple code from lib_generic/string.c is used only if __HAVE_ARCH_MEMCPY is undefined, and especially on PPC we define this (in include/asm-ppc/string.h). So we might use an optimized versions where it *does* make a difference.
Memcopy might be OK for flash tables but I am not sure if we use the same function to access status registers. I would rather access flash using bus wide accesses instead of byte by byte. Maybe it is safe but I do not know how it behaves in various platforms and bus interface units of various processors. I would take the safe route.
I agree. Please don't make any assumptions about how memcpy() accesses memory -- it might do all kinds of crazy stuff in the name of optimization, including prefetching. And it may introduce _really_ subtle bugs where adding a field to a struct may cause things to break because it causes its size to cross some implementation threshold (for example on avr32 we might switch to 32-byte load/store-multiple instructions if the size is >= 32 bytes and the start address is properly aligned. Also, I believe gcc uses some threshold to determine whether to emit __builtin_memcpy in-line or emit a call to the out-of-line memcpy implementation, which may cause entirely different things to happen from the flash chip's point of view.)
Best regards, Haavard

Hi Haavard,
On Saturday 13 January 2007 18:53, Håvard Skinnemoen wrote:
On 1/13/07, Tolunay Orkun listmember@orkun.us wrote:
Perhaps, more important issue would be if these areas were cached. In that case, every time we change the flash from read mode to special query modes, we should probably invalidate the cache. In PowerPC 405 platform our flash is uncached so this is not necessary. I do not know if any platform caches its flash areas.
AVR32 does cache its flash area by default, but it can be worked around by specifying an uncached virtual mapping (i.e. the physical address ORed with 0xa0000000, similar to MIPS and SH) as the flash address instead of the physical address.
However, doing cached flash accesses would probably improve the performance of things like fsload, so how about using something like the following interface when uncached access is needed?
void *flash_map(unsigned long paddr, size_t len); void flash_unmap(void *vaddr, size_t len);
On platforms where the flash is always uncached, flash_map would simply return paddr cast as a void *, and flash_unmap would do nothing. Other platforms may invalidate the range and return a virtual address which will bypass the cache (or simply disable the dcache and return the physical address.)
I tend to like this idea of these mapping functions. This would give us more flexibility. On some 4xx boards for example, we initialize the flash cached after powerup and switch to uncached access after relocation. We wouldn't need this change in the TLB's with these mapping functions anymore.
And you are right of course. If implementing such an extension, we should first make sure that those mapping functions will do nothing on most platform, so we stay compatible and don't break any existing designs.
If thinking longer about this, I think it would be helpful to make these mapping functions even more generic. Meaning not flash focused, but generic mapping functions with a parameter like ioremap in Linux. This way we could use these function to also map a region as cached or uncached memory if needed.
Best regards, Stefan

Hi, Wolfgang,
It's so pity that the flash got wrong 'num_erase_regions' issue is still in u-boot 1.2.0.
The byte by byte accessing is not preferred. But the flash_read_ushort() and flash_read_long() in drivers/cfi_flash.c are both implemented by byte accessing. How about it?
Can the instruction ' retval = (addr[0] << 16) | (addr[(info->portwidth)] << 24) | (addr[(2 * info->portwidth)]) | (addr[(3 * info->portwidth)] << 8);' In flash_read_long() be replaced by memcpy() function?
Thanks! Zhang Wei
Memcopy might be OK for flash tables but I am not sure if we use the same function to access status registers. I would rather access flash using bus wide accesses instead of byte by byte. Maybe it is safe but I do not know how it behaves in various platforms and bus interface units of various processors. I would take the safe route.
Best regards, Tolunay

Zhang Wei-r63237 wrote:
Hi, Wolfgang,
It's so pity that the flash got wrong 'num_erase_regions' issue is still in u-boot 1.2.0.
The byte by byte accessing is not preferred. But the flash_read_ushort() and flash_read_long() in drivers/cfi_flash.c are both implemented by byte accessing. How about it?
Can the instruction ' retval = (addr[0] << 16) | (addr[(info->portwidth)] << 24) | (addr[(2 * info->portwidth)]) | (addr[(3 * info->portwidth)] << 8);' In flash_read_long() be replaced by memcpy() function?
I really don't think we should be using mempcy(). You're not supposed to care about the internal implementation of mempcy(), so if we start using it for specific-sized reads, then we start caring about how it's implemented.
If you want to read 32 bits at one time, then just do a 32-bit read!
retval = * (ulong *) addr;
Looking at the code, I don't understand why it's so complicated. If portwidth is 2, then retval above will be a conglomeration of addr[0], addr[2], addr[4], and addr[6]. Why isn't it just reading from [0][1][2][3]??

In message 45B8D66A.6080209@freescale.com you wrote:
Can the instruction ' retval = (addr[0] << 16) | (addr[(info->portwidth)] << 24) | (addr[(2 * info->portwidth)]) | (addr[(3 * info->portwidth)] << 8);' In flash_read_long() be replaced by memcpy() function?
INHO: no, it cannot.
If you want to read 32 bits at one time, then just do a 32-bit read!
retval = * (ulong *) addr;
You assume that we want to read four contiguous bytes, which may, or may not, be the case.
Looking at the code, I don't understand why it's so complicated. If portwidth is 2, then retval above will be a conglomeration of addr[0], addr[2], addr[4], and addr[6]. Why isn't it just reading from [0][1][2][3]??
Because the data we are interested in is only available in every other byte?
Best regards,
Wolfgang Denk

Zhang Wei-r63237 wrote:
Because the data we are interested in is only available in every other byte?
Does it means if we need other byte data we can access them byte by byte? If we needn't we should not access them?
Thanks! Zhang Wei
The current code works because most CPU Bus Interface Unit will execute buswide cycles and discard the unused byte(s).
I understand that you would like to convert this code to use memcpy() and pick various bytes from local ram instead of directly from flash. Right? What exactly you are experiencing when you use current code. Could you refresh my memory?
I do not like memcpy() as it is a black box. It could be implemented at low level using string instructions, it could do various things for different cpu architectures and various sizes so it could lead to very hard to find issues.
My proposal for you to try: Do not depend on bus interface unit doing buswide reads, explicitly read buswide chunks after doing the bus-wide read into a temporary variable, pick the usable bytes from that temporary variable. This should eliminate the bus interface unit dependency of the CPU and possible board designer interfacing gimmicks.
Let us know how it goes (can you post your patch too so I can verify it).
Tolunay

The current code works because most CPU Bus Interface Unit will execute buswide cycles and discard the unused byte(s).
I understand that you would like to convert this code to use memcpy() and pick various bytes from local ram instead of directly from flash. Right? What exactly you are experiencing when you use current code. Could you refresh my memory?
The problem we experienced here is that the flash(am29lv641d) could not get the correct size during boot up(the error message like "FLASH: ....255 erase regions found, only 4 used...0kB" is printed out), could not be read correctly by flinfo, could not erased or written etc. It is because the num_erase_regions got a wrong value by flash_read_uchar(info, FLASH_OFFSET_NUM_ERASE_REGIONS)in line 1204. It happened only after the latest update of cfi_flash driver after u-boot 1.1.6 released. This update added a flash_read_jedec_ids before reading num_erase_resions. We suspect that reset command in flash_read_jedec_ids()may cause the problem. After the last reset in this function, a flash_write_cmd() follows it, then the flash_read_uchar is executed directly after. This flash_read_uchar returns a wrong number(124, 255 for num_erase_regions etc.). If this flash_read_uchar here reads some other flash info other than FLASH_OFFSET_NUM_ERASE_REGIONS by uchar, it also returns wrong value.
I do not like memcpy() as it is a black box. It could be implemented at low level using string instructions, it could do various things for different cpu architectures and various sizes so it could lead to very hard to find issues.
I agree.
My proposal for you to try: Do not depend on bus interface unit doing buswide reads, explicitly read buswide chunks after doing the bus-wide read into a temporary variable, pick the usable bytes from that temporary variable. This should eliminate the bus interface unit dependency of the CPU and possible board designer interfacing gimmicks.
Thanks a lot. :)
Let us know how it goes (can you post your patch too so I can verify it).
Zhang Wei is on vacation this week, he had generated a new patch to implement it with the similar idea last week. He will post his new patch after he is back.
Thanks.
Haiying

Wang Haiying-r54964 wrote:
The current code works because most CPU Bus Interface Unit will execute buswide cycles and discard the unused byte(s).
I understand that you would like to convert this code to use memcpy() and pick various bytes from local ram instead of directly from flash. Right? What exactly you are experiencing when you use current code. Could you refresh my memory?
The problem we experienced here is that the flash(am29lv641d) could not get the correct size during boot up(the error message like "FLASH: ....255 erase regions found, only 4 used...0kB" is printed out), could not be read correctly by flinfo, could not erased or written etc. It is because the num_erase_regions got a wrong value by flash_read_uchar(info, FLASH_OFFSET_NUM_ERASE_REGIONS)in line 1204. It happened only after the latest update of cfi_flash driver after u-boot 1.1.6 released. This update added a flash_read_jedec_ids before reading num_erase_resions. We suspect that reset command in flash_read_jedec_ids()may cause the problem. After the last reset in
Could you add some printfs and print the following variables right after the flash_detect_cfi() call in flash_get_size() function.
printf("chipwidth = %d\n", info->chipwidth); printf("portwidth = %d\n", info->portwidth); printf("interface = %d\n", info->interface); printf("vendor = %d\n", info->vendor); printf("cfi_offset= %d\n", info->cfi_offset);
And the following printf right after flash_read_jedec_ids() in flash_get_size();
printf("manufacturer_id = %d\n", info->manufacturer_id); printf("device_id = %d\n", info->device_id); printf("device_id2 = %d\n", info->device_id2);
Please collect the output and post to the list and cc to me.
Also try replacing following line in flash_detect_cfi() function:
flash_write_cmd (info, 0, 0, info->cmd_reset);
with the following two lines:
flash_write_cmd(info, 0, 0, FLASH_CMD_RESET); flash_write_cmd(info, 0, 0, AMD_CMD_RESET);
I would like to see if this change makes any difference as well.
From your description, I have the feeling that you did not have debugged the problem fully.
If you are suspecting that the flash is not in array read mode after flash_read_jedec_ids() you can easily dump the first 256 bytes of flash right after [write a simple hexdump routine].
this function, a flash_write_cmd() follows it, then the flash_read_uchar
After this command, the flash should be in cfi_read mode. Hexdump the first 256 bytes again. You should be able to spot easily if you landed in cfi query mode or not. These are crucial before you put the blame on flash_read_uchar().
is executed directly after. This flash_read_uchar returns a wrong number(124, 255 for num_erase_regions etc.). If this flash_read_uchar here reads some other flash info other than FLASH_OFFSET_NUM_ERASE_REGIONS by uchar, it also returns wrong value.
Please do the checks I've provided above and collect the logs and send it to me. If flash does not switch to cfi query mode properly the read will fetch whatever content that was available in read mode. If the flash was completely blank (all 1s) you will get 255 which is matching with the symptoms you are facing.
My proposal for you to try: Do not depend on bus interface unit doing buswide reads, explicitly read buswide chunks after doing the bus-wide read into a temporary variable, pick the usable bytes from that temporary variable. This should eliminate the bus interface unit dependency of the CPU and possible board designer interfacing gimmicks.
Thanks a lot. :)
Please do the checks before you even bother with such a patch. The problem could be elsewhere.
Let us know how it goes (can you post your patch too so I can verify it).
Zhang Wei is on vacation this week, he had generated a new patch to implement it with the similar idea last week. He will post his new patch after he is back.
Thanks.
Haiying

Could you add some printfs and print the following variables right after the flash_detect_cfi() call in flash_get_size() function.
printf("chipwidth = %d\n", info->chipwidth); printf("portwidth = %d\n", info->portwidth); printf("interface = %d\n", info->interface); printf("vendor = %d\n", info->vendor); printf("cfi_offset= %d\n", info->cfi_offset);
And the following printf right after flash_read_jedec_ids() in flash_get_size();
printf("manufacturer_id = %d\n", info->manufacturer_id); printf("device_id = %d\n", info->device_id); printf("device_id2 = %d\n", info->device_id2);
Please collect the output and post to the list and cc to me.
Here comes the first log after adding the printf above: ------------ U-Boot 1.2.0 (Jan 31 2007 - 13:12:04)
Freescale PowerPC CPU: Core: E600, Version: 0.2, (0x80040202) System: 8641D, Version: 2.0, (0x80900120) Clocks: CPU:1000 MHz, MPX: 400 MHz, DDR: 200 MHz, LBC: 50 MHz L2: Enabled Board: MPC8641HPCN PCI-EXPRESS1: Disabled I2C: ready DRAM: DDR: 256 MB FLASH: ## Unknown FLASH on Bank 1 - Size = 0x00000000 = 0 MB chipwidth = 2 portwidth = 2 interface = 1 vendor = 0 cfi_offset= 85 manufacturer_id = 1 device_id = 126 device_id2 = 4865 124 erase regions found, only 4 used 0 kB Using default environment
In: serial Out: serial Err: serial Net: eTSEC1, eTSEC2, eTSEC3, eTSEC4 Hit any key to stop autoboot: 0 --------------
Also try replacing following line in flash_detect_cfi() function:
flash_write_cmd (info, 0, 0, info->cmd_reset);
with the following two lines:
flash_write_cmd(info, 0, 0, FLASH_CMD_RESET); flash_write_cmd(info, 0, 0, AMD_CMD_RESET);
I would like to see if this change makes any difference as well.
Then the second log file after replacing the original flash_write_cmd() in flash_detect_cfi: -------------- U-Boot 1.2.0 (Jan 31 2007 - 13:43:50)
Freescale PowerPC CPU: Core: E600, Version: 0.2, (0x80040202) System: 8641D, Version: 2.0, (0x80900120) Clocks: CPU:1000 MHz, MPX: 400 MHz, DDR: 200 MHz, LBC: 50 MHz L2: Enabled Board: MPC8641HPCN PCI-EXPRESS1: Disabled I2C: ready DRAM: DDR: 256 MB FLASH: ## Unknown FLASH on Bank 1 - Size = 0x00000000 = 0 MB chipwidth = 2 portwidth = 2 interface = 1 vendor = 0 cfi_offset= 85 manufacturer_id = 1 device_id = 126 device_id2 = 4865 124 erase regions found, only 4 used 0 kB Using default environment
In: serial Out: serial Err: serial Net: eTSEC1, eTSEC2, eTSEC3, eTSEC4 Hit any key to stop autoboot: 0 => -------------- Seems no difference than the first log.
If you are suspecting that the flash is not in array read mode after flash_read_jedec_ids() you can easily dump the first 256 bytes of flash right after [write a simple hexdump routine].
this function, a flash_write_cmd() follows it, then the flash_read_uchar
After this command, the flash should be in cfi_read mode. Hexdump the first 256 bytes again. You should be able to spot easily if you landed in cfi query mode or not. These are crucial before you put the blame on flash_read_uchar().
Then I dump the first 256bytes of flash, first after flash_read_jedec_ids(), then after flash_write_cmd(), see log:
------- U-Boot 1.2.0 (Jan 31 2007 - 13:46:59)
Freescale PowerPC CPU: Core: E600, Version: 0.2, (0x80040202) System: 8641D, Version: 2.0, (0x80900120) Clocks: CPU:1000 MHz, MPX: 400 MHz, DDR: 200 MHz, LBC: 50 MHz L2: Enabled Board: MPC8641HPCN PCI-EXPRESS1: Disabled I2C: ready DRAM: DDR: 256 MB FLASH: ## Unknown FLASH on Bank 1 - Size = 0x00000000 = 0 MB chipwidth = 2 portwidth = 2 interface = 1 vendor = 0 cfi_offset= 85
Dump the first 256 byte of flash 27 05 19 56 26 6c 3d b3 44 e4 ae 15 00 2b 05 33 00 00 00 00 00 00 00 00 35 b6 3f 82 05 07 03 01 75 62 6f 6f 74 20 65 78 74 32 20 72 61 6d 64 69 73 6b 20 72 6f 6f 74 66 73 00 00 00 00 00 00 00 1f 8b 08 08 14 ae e4 44 00 03 72 6f 6f 74 66 73 2e 65 78 74 32 00 ec 9d 09 7c 15 d5 bd c7 cf dc dc dc 24 17 ac 97 a5 42 2b 68 dc 65 11 90 35 09 01 c2 22 6a 45 45 dc 00 17 92 90 00 d1 6c 26 81 62 ab 08 56 ad 5a 17 5a ad 5b 7d 16 7d f6 55 5b b5 58 ab e0 52 1b 97 56 6c 51 41 01 51 2c 82 62 b5 28 02 ae d4 2e be ef ff ce 19 e6 04 6e ca e4 35 e3 f4 d5 f3 e7 f3 e5 cc ff ce b9 f3 9b 73 ce ff 2c 73 73 ef 8c 52 d6 ac 59 fb b2 5a 2a 4b a9 1e 87 29 f5 91 a3 d4 7d b9 4a fd 95 d7 1c 63 ff c2 2e 2e 97 9b 2f aa ae ea e4 ad 8e b2 66 cd da ff 6f a3 fb ab 38 64 a7 bd 79 6a df 5d f6 5f 47 manufacturer_id = 1 device_id = 126 device_id2 = 4865
Dump the first 256 byte of flash 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 51 00 52 00 59 00 02 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 27 00 36 00 00 00 00 00 07 00 07 00 0a 00 00 00 01 00 05 00 04 00 00 00 17 00 01 00 00 00 05 00 00 00 01 00 7f 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50 00 52 00 49 00 31 00 33 00 08 00 02 00 04 00 01 00 04 00 00 00 00 00 01 00 b5 00 c5 00 05 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 8 MB Using default environment
In: serial Out: serial Err: serial Net: eTSEC1, eTSEC2, eTSEC3, eTSEC4 Hit any key to stop autoboot: 0 => ------- This time, the flash got the correct size and can be accessed. I think the second dump()function I added made it work, as it worked when I inserted a udelay after flash_write_cmd().
is executed directly after. This flash_read_uchar returns a wrong number(124, 255 for num_erase_regions etc.). If this flash_read_uchar here reads some other flash info other than FLASH_OFFSET_NUM_ERASE_REGIONS by uchar, it also returns wrong value.
Please do the checks I've provided above and collect the logs and send it to me. If flash does not switch to cfi query mode properly the read will fetch whatever content that was available in read mode. If the flash was completely blank (all 1s) you will get 255 which is matching with the symptoms you are facing.
It looks that flash does switch to cfi query mode. But from the first two logs I still got 124(0xfc) for num_erase_regions (sometimes I got 0xff)
Please shed some lights. Thanks a lot!
Haiying

Haiying Wang wrote:
U-Boot 1.2.0 (Jan 31 2007 - 13:12:04)
Freescale PowerPC CPU: Core: E600, Version: 0.2, (0x80040202) System: 8641D, Version: 2.0, (0x80900120) Clocks: CPU:1000 MHz, MPX: 400 MHz, DDR: 200 MHz, LBC: 50 MHz L2: Enabled
Pretty fast CPU you've got here :)
Board: MPC8641HPCN PCI-EXPRESS1: Disabled I2C: ready DRAM: DDR: 256 MB FLASH: ## Unknown FLASH on Bank 1 - Size = 0x00000000 = 0 MB
It looks like you inserted the debug prints in wrong place. The output below should have been after the debug stuff. Anyway...
chipwidth = 2 portwidth = 2 interface = 1 vendor = 0 cfi_offset= 85 manufacturer_id = 1 device_id = 126 device_id2 = 4865
Except for "vendor" everything looks OK here. I should have asked you to print vendor right before flash_read_jedec_ids() call. It was not yet initialized when printed. It must be correctly set as well since flash_read_jedec_ids() uses it would not have returned correct data otherwise.
You have a 16-bit flash accessible on 16-bit bus. The flash manufacturer is "AMD/Spansion" or compatible and device id is 7E1301 which is correct for your part:
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/2553...
Please note that flash_read_jedec_ids() is used for manufacturer id and device_ids and this function in turn is using flash_read_uchar() which worked just fine for you.
124 erase regions found, only 4 used
This is not correct. This is why you are blaming the flash_read_uchar() function. But since it worked correctly before the failure to read the number of erase regions is probably not due to the function.
Also try replacing following line in flash_detect_cfi() function:
flash_write_cmd (info, 0, 0, info->cmd_reset);
with the following two lines:
flash_write_cmd(info, 0, 0, FLASH_CMD_RESET); flash_write_cmd(info, 0, 0, AMD_CMD_RESET);
You demonstrated that this did not affect you because the output was the same. The problem with this line was that info->cmd_reset was not yet initialized when flash_detect_cfi() was called. This is a minor bug since the flash was not in query mode yet. I think we can safely remove this call as well or replace it both amd anf intel reset commands back to back. But anyway, back to your issue.
If you are suspecting that the flash is not in array read mode after flash_read_jedec_ids() you can easily dump the first 256 bytes of flash right after [write a simple hexdump routine].
this function, a flash_write_cmd() follows it, then the flash_read_uchar
After this command, the flash should be in cfi_read mode. Hexdump the first 256 bytes again. You should be able to spot easily if you landed in cfi query mode or not. These are crucial before you put the blame on flash_read_uchar().
Then I dump the first 256bytes of flash, first after flash_read_jedec_ids(), then after flash_write_cmd(), see log:
U-Boot 1.2.0 (Jan 31 2007 - 13:46:59)
Freescale PowerPC CPU: Core: E600, Version: 0.2, (0x80040202) System: 8641D, Version: 2.0, (0x80900120) Clocks: CPU:1000 MHz, MPX: 400 MHz, DDR: 200 MHz, LBC: 50 MHz L2: Enabled Board: MPC8641HPCN PCI-EXPRESS1: Disabled I2C: ready DRAM: DDR: 256 MB FLASH: ## Unknown FLASH on Bank 1 - Size = 0x00000000 = 0 MB chipwidth = 2 portwidth = 2 interface = 1 vendor = 0 cfi_offset= 85
Dump the first 256 byte of flash 27 05 19 56 26 6c 3d b3 44 e4 ae 15 00 2b 05 33 00 00 00 00 00 00 00 00 35 b6 3f 82 05 07 03 01 75 62 6f 6f 74 20 65 78 74 32 20 72 61 6d 64 69 73 6b 20 72 6f 6f 74 66 73 00 00 00 00 00 00 00 1f 8b 08 08 14 ae e4 44 00 03 72 6f 6f 74 66 73 2e 65 78 74 32 00 ec 9d 09 7c 15 d5 bd c7 cf dc dc dc 24 17 ac 97 a5 42 2b 68 dc 65 11 90 35 09 01 c2 22 6a 45 45 dc 00 17 92 90 00 d1 6c 26 81 62 ab 08 56 ad 5a 17 5a ad 5b 7d 16 7d f6 55 5b b5 58 ab e0 52 1b 97 56 6c 51 41 01 51 2c 82 62 b5 28 02 ae d4 2e be ef ff ce 19 e6 04 6e ca e4 35 e3 f4 d5 f3 e7 f3 e5 cc ff ce b9 f3 9b 73 ce ff 2c 73 73 ef 8c 52 d6 ac 59 fb b2 5a 2a 4b a9 1e 87 29 f5 91 a3 d4 7d b9 4a fd 95 d7 1c 63 ff c2 2e 2e 97 9b 2f aa ae ea e4 ad 8e b2 66 cd da ff 6f a3 fb ab 38 64 a7 bd 79 6a df 5d f6 5f 47 manufacturer_id = 1 device_id = 126 device_id2 = 4865
Dump the first 256 byte of flash 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 51 00 52 00 59 00 02 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 27 00 36 00 00 00 00 00 07 00 07 00 0a 00 00 00 01 00 05 00 04 00 00 00 17 00 01 00 00 00 05 00 00 00 01 00 7f 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50 00 52 00 49 00 31 00 33 00 08 00 02 00 04 00 01 00 04 00 00 00 00 00 01 00 b5 00 c5 00 05 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01
Guess what this looks like a valid CFI table now. I can definitely see Q = 0x51, R = 0x52, Y = 0x59 etc. in the correct places.
8 MB
It looks like the rest of the commands worked as well :) All without changing flash_read_uchar() functionality!
Using default environment
In: serial Out: serial Err: serial Net: eTSEC1, eTSEC2, eTSEC3, eTSEC4 Hit any key to stop autoboot: 0 =>
This time, the flash got the correct size and can be accessed. I think the second dump()function I added made it work, as it worked when I inserted a udelay after flash_write_cmd().
At this point, all data indicates that flash_read_uchar() is just fine.
I am suspecting two things:
1) Your CPU might be trying to re-order writes and reads to optimize performance. Since the command write and data read are from different addresses it could do this but the write command should really finish before we access the table data. So, we might need to add powerpc "sync" instructions in flash_write_cmd() function. Apparently CONFIG_BLACKFIN has a similar need as well. I would add a generic "sync" for the whole PowerPC family but there is not a conveninent CONFIG_POWERPC macro as far as I can see. Anyway, try adding:
asm("sync;");
statements in that function. Use Blackfin as an example.
2) Because you have a rather fast CPU it is possible that we are not allowing enough time after reset command is executed before array is in read mode. According to the datasheet of your flash part if external reset signal was asserted you could need up to 20usec before flash returns to read mode. I am not sure if this delay is needed for issued reset commands as well. We might need to add some delay after flash reset commands. Try adding udelay(100); right after flash reset command is sent. You can locate flash reset commands by searching flash_write_cmd() statements that passes FLASH_CMD_RESET, AMD_CMD_RESET or cfi->cmd_reset as the last argument.
Please shed some lights. Thanks a lot!
I hope this helps.
Best regards, Tolunay

Except for "vendor" everything looks OK here. I should have asked you to print vendor right before flash_read_jedec_ids() call. It was not yet initialized when printed. It must be correctly set as well since flash_read_jedec_ids() uses it would not have returned correct data otherwise.
My fault, vendor is 2 when I put the printf to the correct place.:-)
You have a 16-bit flash accessible on 16-bit bus. The flash manufacturer is "AMD/Spansion" or compatible and device id is 7E1301 which is correct for your part:
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/2553...
Please note that flash_read_jedec_ids() is used for manufacturer id and device_ids and this function in turn is using flash_read_uchar() which worked just fine for you.
Thanks.
124 erase regions found, only 4 used
This is not correct. This is why you are blaming the flash_read_uchar() function. But since it worked correctly before the failure to read the number of erase regions is probably not due to the function.
I agree with you that the failure is not due to the function (flash_read_uchar()). We thought before that it was better to modify this function so that flash with different port-width could be accessed safer.
You demonstrated that this did not affect you because the output was the same. The problem with this line was that info->cmd_reset was not yet initialized when flash_detect_cfi() was called. This is a minor bug since the flash was not in query mode yet. I think we can safely remove this call as well or replace it both amd anf intel reset commands back to back. But anyway, back to your issue.
Understood.:)
It looks like the rest of the commands worked as well :) All without changing flash_read_uchar() functionality!
Yes. But in our previous tests, without adding code in between flash_write_cmd() and flash_read_uchar()(for num_erase_region), just adding some code elsewhere, the flash worked sometimes and failed sometimes. Another case is that, one of our customer pointed that on their test, flash could work with ELDK 1.3.0 and failed at ELDK 1.4.0. I just suspect that the different compiler may have difference on optimization thus affected the instruction execution sequence.
At this point, all data indicates that flash_read_uchar() is just fine.
I am suspecting two things:
- Your CPU might be trying to re-order writes and reads to optimize
performance. Since the command write and data read are from different addresses it could do this but the write command should really finish before we access the table data. So, we might need to add powerpc "sync" instructions in flash_write_cmd() function. Apparently CONFIG_BLACKFIN has a similar need as well. I would add a generic "sync" for the whole PowerPC family but there is not a conveninent CONFIG_POWERPC macro as far as I can see. Anyway, try adding:
asm("sync;");
statements in that function. Use Blackfin as an example.
- Because you have a rather fast CPU it is possible that we are not
allowing enough time after reset command is executed before array is in read mode. According to the datasheet of your flash part if external reset signal was asserted you could need up to 20usec before flash returns to read mode. I am not sure if this delay is needed for issued reset commands as well. We might need to add some delay after flash reset commands. Try adding udelay(100); right after flash reset command is sent. You can locate flash reset commands by searching flash_write_cmd() statements that passes FLASH_CMD_RESET, AMD_CMD_RESET or cfi->cmd_reset as the last argument.
Your suggestions are very valuable. I did think about the flash response time for reset command, but did not figure out where and why was reasonable.I am very appreciated for your kind help on this. We will try both suggestions and do enough tests before sending out a new patch for review.:-)
I hope this helps.
Many thanks again.
Haiying

Haiying,
I may be able to test your code on my MPC8641hpcn here. It's sounding like the strategically placed sync's are the way to go, especially given the differences I saw with compilers on this bug. I'm sharing the box with another developer at the moment, but I definitely have an interest in getting this portion of the code rock solid. Please contact me off-list if you want help testing.
Thanks Tolunay, Haiying, and everyone else on this thread for working so dilligently on this! :)
Chris
On Thu, 2007-02-01 at 17:06 -0500, Haiying Wang wrote:
Your suggestions are very valuable. I did think about the flash response time for reset command, but did not figure out where and why was reasonable.I am very appreciated for your kind help on this. We will try both suggestions and do enough tests before sending out a new patch for review.:-)
I hope this helps.
Many thanks again.
Haiying

Because SYNC is already defined in include/ppc_asm.tmpl for some ppc based CPUs to use, I use DO_SYNC instead of SYNC for this patch.

In message 1171043255.3932.20.camel@udp097531uds.am.freescale.net you wrote:
Because SYNC is already defined in include/ppc_asm.tmpl for some ppc based CPUs to use, I use DO_SYNC instead of SYNC for this patch.
AFAICT that's a general definition which is not specific to "some ppc based CPUs" only. Rather than redefining it why not just use this definition? It seems to match our purposes.
Best regards,
Wolfgang Denk

SYNC is defined as " #define SYNC \ sync; \ isync " in include/ppc_asm.tmpl,
and can not be used by .c file, am I right? :-). In fact it is used by the start.S file for 85xx/83xx/8xx/4xx/5xxx/74xx_7xx.
We need to define SYNC as asm("sync;").
Haiying
On Fri, 2007-02-09 at 20:42 +0100, Wolfgang Denk wrote:
In message 1171043255.3932.20.camel@udp097531uds.am.freescale.net you wrote:
Because SYNC is already defined in include/ppc_asm.tmpl for some ppc based CPUs to use, I use DO_SYNC instead of SYNC for this patch.
AFAICT that's a general definition which is not specific to "some ppc based CPUs" only. Rather than redefining it why not just use this definition? It seems to match our purposes.
Best regards,
Wolfgang Denk

In message 1171050483.3932.31.camel@udp097531uds.am.freescale.net you wrote:
SYNC is defined as " #define SYNC \ sync; \ isync " in include/ppc_asm.tmpl,
and can not be used by .c file, am I right? :-). In fact it is used by the start.S file for 85xx/83xx/8xx/4xx/5xxx/74xx_7xx.
We need to define SYNC as asm("sync;").
Or, to be sure, ""sync;isync"
Where is the problem? Which code includes include/ppc_asm.tmpl ? Why cannot we have the same definition once for C and again for assembler?
Best regards,
Wolfgang Denk

Wolfgang Denk wrote:
In message 1171050483.3932.31.camel@udp097531uds.am.freescale.net you wrote:
SYNC is defined as " #define SYNC \ sync; \ isync " in include/ppc_asm.tmpl,
and can not be used by .c file, am I right? :-). In fact it is used by the start.S file for 85xx/83xx/8xx/4xx/5xxx/74xx_7xx.
We need to define SYNC as asm("sync;").
Or, to be sure, ""sync;isync"
OK.
Where is the problem? Which code includes include/ppc_asm.tmpl ? Why cannot we have the same definition once for C and again for assembler?
I agree, I think we can define the equivalent one in a C header file
#define SYNC asm("sync; isync;")
I am not sure if the assembler one is ever included in the C code.
Tolunay

On Saturday 10 February 2007 08:23, Tolunay Orkun wrote:
We need to define SYNC as asm("sync;").
Or, to be sure, ""sync;isync"
OK.
I would not do this. Please let a "sync" instruction _not_ do a "isync" too. There will be times when you explicitly _don't_ what this.
Where is the problem? Which code includes include/ppc_asm.tmpl ? Why cannot we have the same definition once for C and again for assembler?
I agree, I think we can define the equivalent one in a C header file
#define SYNC asm("sync; isync;")
I am not sure if the assembler one is ever included in the C code.
Why not use
#define sync() __asm__ __volatile__ ("sync" : : : "memory");
from include/asm-ppc/io.h? This seems to be exactly what we need.
Best regards, Stefan
===================================================================== DENX Software Engineering GmbH, HRB 165235 Munich, CEO: Wolfgang Denk Office: Kirchenstr. 5, D-82194 Groebenzell, Germany =====================================================================

Stefan Roese wrote:
On Saturday 10 February 2007 08:23, Tolunay Orkun wrote:
We need to define SYNC as asm("sync;").
Or, to be sure, ""sync;isync"
OK.
I would not do this. Please let a "sync" instruction _not_ do a "isync" too. There will be times when you explicitly _don't_ what this.
Could you give a specific example.
There is also "eieio" and "msync" to consider though these usually map to former two (or vice versa).
Where is the problem? Which code includes include/ppc_asm.tmpl ? Why cannot we have the same definition once for C and again for assembler?
I agree, I think we can define the equivalent one in a C header file
#define SYNC asm("sync; isync;")
I am not sure if the assembler one is ever included in the C code.
Why not use
#define sync() __asm__ __volatile__ ("sync" : : : "memory");
from include/asm-ppc/io.h? This seems to be exactly what we need.
I would rather prefer an uppercase SYNC since it is a macro but whatever style you guys choose is OK with me.
Please you and Wolfgang decide on this matter. I do not want to confuse Haiying with conflicting messages.
Tolunay

Hi Tolunay,
On Saturday 10 February 2007 08:57, Tolunay Orkun wrote:
I would not do this. Please let a "sync" instruction _not_ do a "isync" too.
There will be times when you explicitly _don't_ what this.
Could you give a specific example.
The "isync" will reload the TLB's on 440's for example and sometimes you really don't want this on a "normal" sync instruction.
There is also "eieio" and "msync" to consider though these usually map to former two (or vice versa).
Yes. But I find the name "sync" more platform generic.
Why not use
#define sync() __asm__ __volatile__ ("sync" : : : "memory");
from include/asm-ppc/io.h? This seems to be exactly what we need.
I would rather prefer an uppercase SYNC since it is a macro but whatever style you guys choose is OK with me.
Could be that on other platforms this sync is not a define but a function. But I don't have a strong preference here. Wolfgang, you decide. ;-)
Best regards, Stefan
===================================================================== DENX Software Engineering GmbH, HRB 165235 Munich, CEO: Wolfgang Denk Office: Kirchenstr. 5, D-82194 Groebenzell, Germany =====================================================================

In message 200702100907.37451.sr@denx.de you wrote:
Could be that on other platforms this sync is not a define but a function. But I don't have a strong preference here. Wolfgang, you decide. ;-)
I like sync(); better.
Best regards,
Wolfgang Denk

In message 200702100840.46587.sr@denx.de you wrote:
Or, to be sure, ""sync;isync"
I would not do this. Please let a "sync" instruction _not_ do a "isync" too. There will be times when you explicitly _don't_ what this.
Note that the current assember #define SYNC does exactly what I quoted above...
Why not use
#define sync() __asm__ __volatile__ ("sync" : : : "memory");
from include/asm-ppc/io.h? This seems to be exactly what we need.
Indeed. Let's use that!
Best regards,
Wolfgang Denk

Stefan Roese wrote:
Why not use
#define sync() __asm__ __volatile__ ("sync" : : : "memory");
I like this idea, but is that semicolon supposed to be there? Shouldn't we make it an inline function instead of a macro?

In message 45CE80A1.5050206@freescale.com you wrote:
#define sync() __asm__ __volatile__ ("sync" : : : "memory");
I like this idea, but is that semicolon supposed to be there? Shouldn't we make it an inline function instead of a macro?
No, and yes, please.
Best regards,
Wolfgang Denk

On 2/9/07, Wolfgang Denk wd@denx.de wrote:
In message 1171043255.3932.20.camel@udp097531uds.am.freescale.net you wrote:
Because SYNC is already defined in include/ppc_asm.tmpl for some ppc based CPUs to use, I use DO_SYNC instead of SYNC for this patch.
AFAICT that's a general definition which is not specific to "some ppc based CPUs" only. Rather than redefining it why not just use this definition? It seems to match our purposes.
Except that it's not written in C ;-)
Haavard

In message 1defaf580702091159s2e2fc9e7wf962d4fff0ec985c@mail.gmail.com you wrote:
Because SYNC is already defined in include/ppc_asm.tmpl for some ppc based CPUs to use, I use DO_SYNC instead of SYNC for this patch.
AFAICT that's a general definition which is not specific to "some ppc based CPUs" only. Rather than redefining it why not just use this definition? It seems to match our purposes.
Except that it's not written in C ;-)
You cannot write this in C anyway. C has no concept of such things. You will always end up withpulling some "asm" tricks.
Best regards,
Wolfgang Denk

Make sure the flash write command is fully finished and remove the unnecessary ssync operation which should be moved to it CPU header file
Signed-off-by: Haiying Wang haiying.wang@freescale.com --- drivers/cfi_flash.c | 15 +++------------ 1 files changed, 3 insertions(+), 12 deletions(-)
diff --git a/drivers/cfi_flash.c b/drivers/cfi_flash.c index 696f9a4..c686e53 100644 --- a/drivers/cfi_flash.c +++ b/drivers/cfi_flash.c @@ -931,27 +931,18 @@ static void flash_write_cmd (flash_info_t * info, flash_sect_t sect, uint offset debug ("fwc addr %p cmd %x %x 8bit x %d bit\n", addr.cp, cmd, cword.c, info->chipwidth << CFI_FLASH_SHIFT_WIDTH); *addr.cp = cword.c; -#ifdef CONFIG_BLACKFIN - asm("ssync;"); -#endif break; case FLASH_CFI_16BIT: debug ("fwc addr %p cmd %x %4.4x 16bit x %d bit\n", addr.wp, cmd, cword.w, info->chipwidth << CFI_FLASH_SHIFT_WIDTH); *addr.wp = cword.w; -#ifdef CONFIG_BLACKFIN - asm("ssync;"); -#endif break; case FLASH_CFI_32BIT: debug ("fwc addr %p cmd %x %8.8lx 32bit x %d bit\n", addr.lp, cmd, cword.l, info->chipwidth << CFI_FLASH_SHIFT_WIDTH); *addr.lp = cword.l; -#ifdef CONFIG_BLACKFIN - asm("ssync;"); -#endif break; case FLASH_CFI_64BIT: #ifdef DEBUG @@ -966,11 +957,11 @@ static void flash_write_cmd (flash_info_t * info, flash_sect_t sect, uint offset } #endif *addr.llp = cword.ll; -#ifdef CONFIG_BLACKFIN - asm("ssync;"); -#endif break; } + + /* Ensure all the instruction are finished */ + DO_SYNC; }
static void flash_unlock_seq (flash_info_t * info, flash_sect_t sect)

In message 1171043258.3932.21.camel@udp097531uds.am.freescale.net you wrote:
Make sure the flash write command is fully finished and remove the unnecessary ssync operation which should be moved to it CPU header file
...
- DO_SYNC;
Please use SYNC (instead of DO_SYNC) as discussed.
Best regards,
Wolfgang Denk

For the arches which need sync in flash_write_cmd, DO_SYNC will be defined. Otherwise, it is a dummy stub for now.
Signed-off-by: Haiying Wang haiying.wang@freescale.com --- include/asm-arm/processor.h | 2 ++ include/asm-avr32/processor.h | 2 ++ include/asm-blackfin/processor.h | 2 ++ include/asm-i386/processor.h | 3 +++ include/asm-m68k/processor.h | 2 ++ include/asm-mips/processor.h | 2 ++ include/asm-nios2/processor.h | 3 +++ include/asm-ppc/processor.h | 2 ++ 8 files changed, 18 insertions(+), 0 deletions(-)
diff --git a/include/asm-arm/processor.h b/include/asm-arm/processor.h index 445d449..5b61eee 100644 --- a/include/asm-arm/processor.h +++ b/include/asm-arm/processor.h @@ -11,6 +11,8 @@ #ifndef __ASM_ARM_PROCESSOR_H #define __ASM_ARM_PROCESSOR_H
+#define DO_SYNC /* dummy stub */ + /* * Default implementation of macro that returns current * instruction pointer ("program counter"). diff --git a/include/asm-avr32/processor.h b/include/asm-avr32/processor.h index cc59dfa..94f3432 100644 --- a/include/asm-avr32/processor.h +++ b/include/asm-avr32/processor.h @@ -22,6 +22,8 @@ #ifndef __ASM_AVR32_PROCESSOR_H #define __ASM_AVR32_PROCESSOR_H
+#define DO_SYNC /* dummy stub */ + #ifndef __ASSEMBLY__
#define current_text_addr() ({ void *pc; __asm__("mov %0,pc" : "=r"(pc)); pc; }) diff --git a/include/asm-blackfin/processor.h b/include/asm-blackfin/processor.h index 19bd720..4cd0be5 100644 --- a/include/asm-blackfin/processor.h +++ b/include/asm-blackfin/processor.h @@ -30,6 +30,8 @@ #ifndef __ASM_BLACKFIN_PROCESSOR_H #define __ASM_BLACKFIN_PROCESSOR_H
+#define DO_SYNC asm("ssync;") + /* * Default implementation of macro that returns current * instruction pointer ("program counter"). diff --git a/include/asm-i386/processor.h b/include/asm-i386/processor.h index 5dedba8..64633d2 100644 --- a/include/asm-i386/processor.h +++ b/include/asm-i386/processor.h @@ -26,4 +26,7 @@ /* Currently this header is unused in the i386 port * but some generic files #include <asm/processor.h> * so this file is a placeholder. */ + +#define DO_SYNC /* dummy stub */ + #endif diff --git a/include/asm-m68k/processor.h b/include/asm-m68k/processor.h index 3fafa6f..c49dab6 100644 --- a/include/asm-m68k/processor.h +++ b/include/asm-m68k/processor.h @@ -1,6 +1,8 @@ #ifndef __ASM_M68K_PROCESSOR_H #define __ASM_M68K_PROCESSOR_H
+#define DO_SYNC /* dummy stub */ + #include <asm/ptrace.h> #include <asm/types.h>
diff --git a/include/asm-mips/processor.h b/include/asm-mips/processor.h index 6838aee..4ad29e2 100644 --- a/include/asm-mips/processor.h +++ b/include/asm-mips/processor.h @@ -11,6 +11,8 @@ #ifndef _ASM_PROCESSOR_H #define _ASM_PROCESSOR_H
+#define DO_SYNC /* dummy stub */ + #include <linux/config.h>
#include <asm/isadep.h> diff --git a/include/asm-nios2/processor.h b/include/asm-nios2/processor.h index 68502a5..825e7b2 100644 --- a/include/asm-nios2/processor.h +++ b/include/asm-nios2/processor.h @@ -23,4 +23,7 @@
#ifndef __ASM_NIOS2_PROCESSOR_H_ #define __ASM_NIOS2_PROCESSOR_H_ + +#define DO_SYNC /* dummy stub */ + #endif /* __ASM_NIOS2_PROCESSOR_H_ */ diff --git a/include/asm-ppc/processor.h b/include/asm-ppc/processor.h index f102600..7843061 100644 --- a/include/asm-ppc/processor.h +++ b/include/asm-ppc/processor.h @@ -1,6 +1,8 @@ #ifndef __ASM_PPC_PROCESSOR_H #define __ASM_PPC_PROCESSOR_H
+#define DO_SYNC asm("sync;") + /* * Default implementation of macro that returns current * instruction pointer ("program counter").

In message 1171043259.3932.23.camel@udp097531uds.am.freescale.net you wrote:
For the arches which need sync in flash_write_cmd, DO_SYNC will be defined. Otherwise, it is a dummy stub for now.
Please don't break definition and use of this into separate patches. This is ONE change, and should be sumbitted as one commit only.
And use SYNC as discussed, please.
Best regards,
Wolfgang Denk

Wolfgang Denk wrote:
In message 1171043259.3932.23.camel@udp097531uds.am.freescale.net you wrote:
For the arches which need sync in flash_write_cmd, DO_SYNC will be defined. Otherwise, it is a dummy stub for now.
Please don't break definition and use of this into separate patches.
Actually he originally created a single patch. I asked him to come up with two patches particularly because one crosses multiple custodians and another only one.
Combined is OK for me. Actually, combined is probably easier now that we will be maintaining different trees.
This is ONE change, and should be sumbitted as one commit only.
And use SYNC as discussed, please.
Yes.
Best regards,
Wolfgang Denk

Hi, Stefan,
Thanks! But it's a pity they are no useful.
What I noticed after looking at the flash access functions like flash_read_uchar() is that no access macros and even no volatile pointer access is used to read from the flash. This looks like a potential problem, that can show when using different compiler versions.
Please find attached a small patch that adds fixes this potential problem for the 3 functions flash_read_uchar/ushort/long. Please give it a try and let me know if this changed the behavior somehow.
Agree with you! It's a potential problem. Is my patch fit for it? :D
Best Regards, Zhang Wei

I'm interested in how this discussion pans out. I have encountered this problem (with flinfo) with my MPC8641HPCN. The patch presented by Zhang Wei (and other freescale employees) does seem to allieviate the problem when I compile u-boot with ELDK 4.0.
However, some scenarios here worry me: -- with the patch, compiled with ELDK 4.0 is peachy. -- without the patch, compiled with ELDK 4.0 FAILS. -- without the patch, compiled with ELDK 3.1.1 is peachy.
At the time of my testing (early this week) I was using a git tree about 2 weeks old from jdl.com, which is what the freescale folks seem to be using at the moment. It worries me that changing a compiler version could affect this bug, but I guess I've seen stranger things.
Could this be a subtle timing problem with the processor? Could it be a timing problem with the flash? Am I nuts? :P
Merry Christmas! Chris
On Fri, 2006-12-22 at 15:11 +0100, Wolfgang Denk wrote:
In message 458BBB5D.1030005@freescale.com you wrote:
After perform flash_read_jedec_ids(), the cfi query read will get an '0xff'. From this flash's specification, the read for flash commands in 16bit port connection should perform 16bit read and ignore the high 8bit data. Although this issue maybe occurs in the special chip, perform the fully 16bit read is a safety operation.
But your patch does NOT perform a 16 bit read. It calls memcpy(); the default implementation of memcpy [see lib_generic/string.c] does this:
char *tmp = (char *) dest, *s = (char *) src;
while (count--) *tmp++ = *s++;
i. e. it performs a character copy, too, so it makes no difference compared to the original code.
The modification refers to the cfi flash drivers in Linux kernel (The cfi_read_query() function in include/linux/mtd/cfi.h file). It's more easy and clear than making the different type date read for different portwidth (such as ushort_read() for x16, ulong_read() for x32).
I understand your intentions, but your patch does not do what you think it does, so if it really fixes the problem on your system there must be another cause or effect.
Best regards,
Wolfgang Denk

In message 1166807257.26147.25.camel@kaboom.lisle.iphase.com you wrote:
However, some scenarios here worry me: -- with the patch, compiled with ELDK 4.0 is peachy. -- without the patch, compiled with ELDK 4.0 FAILS. -- without the patch, compiled with ELDK 3.1.1 is peachy.
Now *this* is interesting information.
Did you compare the generated code?
On Fri, 2006-12-22 at 15:11 +0100, Wolfgang Denk wrote:
If you want to make me a Xmas present, then please stop top posting / full quoting (the same goes to Zhang Wei); please read http://www.netmeister.org/news/learn2quote.html
Best regards,
Wolfgang Denk

Hello,
in message 1166809358.26147.32.camel@kaboom.lisle.iphase.com you wrote:
Did you compare the generated code?
Nope, but that's a good idea. I'll compile and objdump the various scenarios next year.
I just did this - this is always pretty instructive in such situ- ations. GCC-4.x optimizes much better, but will also unremorsefully punish even small sins like a forgotten "volatile".
text data bss dec hex filename 202340 12104 32056 246500 3c2e4 /tmp/ELDK-3.1.1/u-boot 185751 12016 32048 229815 381b7 /tmp/ELDK-4.0/u-boot ------
That's nearly 10% difference in code size...
But here the difference is not in this function - both compile to exactly the same code. Obviously some other functions in the CFI driver are different, though.
Merry Christmas from a very bad quoting net citizen. :P
Thanks, and Merry Christmas, too.
Best regards,
Wolfgang Denk
participants (12)
-
Chris Fester
-
Haavard Skinnemoen
-
Haiying Wang
-
Håvard Skinnemoen
-
Stefan Roese
-
Timur Tabi
-
Tolunay Orkun
-
Wang Haiying-r54964
-
wei.zhang@freescale.com
-
Wolfgang Denk
-
Zhang Wei
-
Zhang Wei-r63237