Re: [U-Boot] Question about ARM compiler optimization problem

After looking into the disassembled ARM code, I found no reason for the problem.
When compiled with the -Os option, the following C source code:
static int flash_toggle (flash_info_t * info, flash_sect_t sect,
                         uint offset, uchar cmd)
{
    void *addr;
    cfiword_t cword;
    int retval;

    addr = flash_map (info, sect, offset);
    flash_make_cmd (info, cmd, &cword);
    switch (info->portwidth) {
    case FLASH_CFI_8BIT:
        retval = flash_read8(addr) != flash_read8(addr);
        break;
    case FLASH_CFI_16BIT:
        retval = flash_read16(addr) != flash_read16(addr);
        break;
    case FLASH_CFI_32BIT:
        retval = flash_read32(addr) != flash_read32(addr);
        break;
    case FLASH_CFI_64BIT:
        retval = flash_read64(addr) != flash_read64(addr);
        break;
    default:
        retval = 0;
        break;
    }
    flash_unmap(info, sect, offset, addr);

    return retval;
}
is compiled as follows:
f0497c: e1a0100a mov r1, sl
f04980: e3a02000 mov r2, #0 ; 0x0
f04984: e1a00007 mov r0, r7
f04988: ebfffcbc bl f03c80 <flash_map>
f0498c: e3a01040 mov r1, #64 ; 0x40
f04990: e1a04000 mov r4, r0
f04994: e28d2004 add r2, sp, #4 ; 0x4
f04998: e1a00007 mov r0, r7
f0499c: ebfffcbd bl f03c98 <flash_make_cmd>
f049a0: e5d732af ldrb r3, [r7, #687]
f049a4: e2433001 sub r3, r3, #1 ; 0x1
f049a8: e3530007 cmp r3, #7 ; 0x7
f049ac: 979ff103 ldrls pc, [pc, r3, lsl #2]
f049b0: ea000022 b f04a40 <.text+0x4a40>
f049b4: 00f049d4 ldreqsbt r4, [r0], #148
f049b8: 00f049e0 rsceqs r4, r0, r0, ror #19
f049bc: 00f04a40 rsceqs r4, r0, r0, asr #20
f049c0: 00f049ec rsceqs r4, r0, ip, ror #19
f049c4: 00f04a40 rsceqs r4, r0, r0, asr #20
f049c8: 00f04a40 rsceqs r4, r0, r0, asr #20
f049cc: 00f04a40 rsceqs r4, r0, r0, asr #20
f049d0: 00f04a00 rsceqs r4, r0, r0, lsl #20
f049d4: e5d42000 ldrb r2, [r4]
f049d8: e5d43000 ldrb r3, [r4]
f049dc: ea000004 b f049f4 <.text+0x49f4>
f049e0: e1d420b0 ldrh r2, [r4]
f049e4: e1d430b0 ldrh r3, [r4]
f049e8: ea000001 b f049f4 <.text+0x49f4>
f049ec: e5942000 ldr r2, [r4]
f049f0: e5943000 ldr r3, [r4]
f049f4: e0520003 subs r0, r2, r3
f049f8: 13a00001 movne r0, #1 ; 0x1
f049fc: ea00000d b f04a38 <.text+0x4a38>
f04a00: e59f317c ldr r3, [pc, #380] ; f04b84 <.text+0x4b84>
f04a04: e1a00004 mov r0, r4
f04a08: e1a0e00f mov lr, pc
f04a0c: e1a0f003 mov pc, r3
f04a10: e59f216c ldr r2, [pc, #364] ; f04b84 <.text+0x4b84>
f04a14: e1a05000 mov r5, r0
f04a18: e1a00004 mov r0, r4
f04a1c: e1a06001 mov r6, r1
f04a20: e1a0e00f mov lr, pc
f04a24: e1a0f002 mov pc, r2
f04a28: e1550000 cmp r5, r0
f04a2c: 1affff98 bne f04894 <flash_full_status_check+0x28>
f04a30: e1560001 cmp r6, r1
f04a34: ea000000 b f04a3c <.text+0x4a3c>
f04a38: e3500000 cmp r0, #0 ; 0x0
f04a3c: 1affff94 bne f04894 <flash_full_status_check+0x28>
Addresses f049d4 through f049f0 are the inlined flash_read8/16/32 accesses. This code should work, because two sequential reads from CFI flash return different toggle-bit values while an erase or write is in progress.
But the two read operations return the same result.
Is there anything else that I have not understood?
TIA. Any advice will be appreciated.
Best regards, Choe, Hyun-ho
On Thu, 2008-11-20 at 12:34 -0600, Andrew Dyer wrote:
On Thu, Nov 20, 2008 at 11:59 AM, Choe, Hyun-ho firebird@legend.co.kr wrote:
In the cpu dir, there is arm920t/ks8695. The ks8695p is a modified version of the ks8695.
In the board section, the OpenGear cm4008 and cm41xx use the ks8695.
I would guess the problem is that the compiler is reading the hardware once and using that value both times, thinking that the location is memory, not a hardware register, where the value may change from time to time. I've always fixed this with a 'volatile' pointer, which is supposed to force the compiler to read the location every time it is accessed. Supposedly the 'linux way' is to use special accessor functions that are written to prevent this optimization from occurring.
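A minimal sketch of the 'volatile' approach described above. The names hw_read16 and toggle_seen16 are hypothetical and are not U-Boot's accessors. The volatile qualifier obliges the compiler to emit one load per call, so the two reads cannot be merged into one. Note that the disassembly quoted in this thread already shows two loads for each width, so this illustrates the technique rather than a confirmed fix for this board.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor (not a U-Boot function): reading through a
 * volatile-qualified pointer forces a real load on every call. */
static inline uint16_t hw_read16(const volatile void *addr)
{
    assert(addr != 0);
    return *(const volatile uint16_t *)addr;
}

/* Returns 1 if two back-to-back reads differ (the device is toggling),
 * 0 if they are equal. */
static int toggle_seen16(const volatile void *addr)
{
    uint16_t first = hw_read16(addr);
    uint16_t second = hw_read16(addr);

    return first != second;
}
```

On ordinary RAM the two reads are equal and toggle_seen16() returns 0. On a flash device that is busy with an erase or write, the second read would be expected to differ.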
The other problem I've seen with toggle registers is reading them on hardware that likes to do multiple reads to fill up a word of data. For example, say you have a 16-bit flash on a 32-bit bus and you read the status register as a 32-bit quantity. The hardware then reads the register twice to fill up the 32-bit request. The next time you read, you see the same data as the first time through, because the value with the toggle in it is in the other half of the first 32 bits you fetched.
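The bus-width effect described above can be reproduced with a small software model. This is only a sketch under stated assumptions: struct fake_flash, dev_read16, bus_read32 and the TOGGLE_BIT value (0x40) are all hypothetical, and the model simply flips the toggle bit on every 16-bit device cycle.

```c
#include <assert.h>
#include <stdint.h>

#define TOGGLE_BIT 0x0040u  /* assumed position of the toggle bit */

/* Hypothetical model of a 16-bit flash that is busy programming. */
struct fake_flash {
    uint16_t status;
};

/* One 16-bit device cycle: return the status, then flip the toggle bit. */
static uint16_t dev_read16(struct fake_flash *f)
{
    uint16_t v;

    assert(f != 0);
    v = f->status;
    f->status ^= TOGGLE_BIT;
    return v;
}

/* A 32-bit bus request is split into two 16-bit device cycles. */
static uint32_t bus_read32(struct fake_flash *f)
{
    uint32_t lo = dev_read16(f);
    uint32_t hi = dev_read16(f);

    return lo | (hi << 16);
}
```

In this model two consecutive dev_read16() calls differ, but two consecutive bus_read32() calls are identical: each 32-bit read consumes an even number of toggles, so the change ends up in the other half of the same word and the comparison never sees it.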