[U-Boot] Hanging in kmalloc of nand_scan_tail() function

Hi, everyone
I'm using U-Boot 2009.03. U-Boot hangs in the nand_init() function. I found that the kmalloc call in nand_scan_tail() causes U-Boot to reset.
int nand_scan_tail(struct mtd_info *mtd)
{
        int i;
        struct nand_chip *chip = mtd->priv;
        if (!(chip->options & NAND_OWN_BUFFERS)) {
                chip->buffers = kmalloc(sizeof(*chip->buffers), GFP_KERNEL);
        }
        if (!chip->buffers)
                return -ENOMEM;
        ........
U-Boot displays the following message:
NAND: data abort
pc : [<31f902b4>]    lr : [<31fa084c>]
sp : 31f5bee0  ip : 00000076  fp : 00000000
r10: 00001188  r9 : 00020000  r8 : 31f5bfdc
r7 : 00000001  r6 : 00000000  r5 : 31fa42b8  r4 : 31fa4364
r3 : 31fa052c  r2 : 00000064  r1 : 00000063  r0 : ffffffff
Flags: NzCv  IRQs off  FIQs off  Mode SVC_32
Resetting CPU ...
Where is the malloc function defined? Why does kmalloc() hang U-Boot? Is any configuration definition needed?
Thanks in advance.
Regards, J.Hwan Kim

On Fri, Sep 18, 2009 at 01:17:48PM +0900, J.Hwan.Kim wrote:
What specific source lines do 0x31f902b4 and 0x31fa0840 correspond to, and can you disassemble the former?
-Scott

Dear everyone,
I'm using u-boot-2010.09. After I download u-boot.bin to my boards (the CPU is an S3C2410), the serial output shows that the CPU takes an exception. The output is as follows:
U-Boot 2010.09 (Nov 11 2010 - 21:55:07)
U-Boot code: 33F80000 -> 33FA0BDC  BSS: -> 33FA45EC
RAM Configuration:
Bank #0: 30000000 64 MiB
NAND: data abort
pc : [<33f8fbb4>]    lr : [<33f85f70>]
sp : 33f07fac  ip : 00000000  fp : 00000000
r10: 00001298  r9 : ffffff7f  r8 : 33f4ffe0
r7 : 00000000  r6 : 33fa3b50  r5 : 33fa3c00  r4 : 33fa0274
r3 : 33f9ff54  r2 : 00000064  r1 : 00000001  r0 : cc33cc33
Flags: NzCv  IRQs off  FIQs off  Mode SVC_32
Resetting CPU ...
The pc value lies in the malloc function, and the lr value lies in nand_scan_tail (drivers/mtd/nand/nand_base.c).
I have seen that someone had this problem before. Has it been resolved? Can you give me some suggestions?

On Thu, 11 Nov 2010 23:06:01 +0800 terry gliumailenator@gmail.com wrote:
Could you look up the specific line numbers of 0x33f8fbb4 and 0x33f85f6c, and show a few lines of disassembly around those addresses?
-Scott

Dear Scott,
I have disassembled the nand_base.o file, because I know the problem happens there. Do you think it's useful for your analysis?
00001a4c <nand_scan_tail>:
    1a4c: e92d4070  push {r4, r5, r6, lr}
    1a50: e590509c  ldr r5, [r0, #156]
    1a54: e595304c  ldr r3, [r5, #76]
    1a58: e3130701  tst r3, #262144 ; 0x40000
    1a5c: e1a06000  mov r6, r0
    1a60: 1a000002  bne 1a70 <nand_scan_tail+0x24>
    1a64: e59f04ec  ldr r0, [pc, #1260] ; 1f58 <nand_scan_tail+0x50c>
    1a68: ebfffffe  bl 0 <malloc>
    1a6c: e58500dc  str r0, [r5, #220]
    1a70: e59510dc  ldr r1, [r5, #220]
    1a74: e3510000  cmp r1, #0 ; 0x0
    1a78: 03e0000b  mvneq r0, #11 ; 0xb
    1a7c: 08bd8070  popeq {r4, r5, r6, pc}
    1a80: e5963014  ldr r3, [r6, #20]
    1a84: e59520b0  ldr r2, [r5, #176]
    1a88: e0813003  add r3, r1, r3
By the way, I can't find the prototype of malloc anywhere in the project; it seems to be encapsulated in some library.
--
Best regards, Terry.
On Thu, 2010-11-11 at 13:49 -0600, Scott Wood wrote:
Could you look up the specific line numbers of 0x33f8fbb4 and 0x33f85f6c, and show a few lines of disassembly around those addresses?
-Scott

On Fri, 12 Nov 2010 20:45:18 +0800 terry gliumailenator@gmail.com wrote:
> Dear Scott, I have disassembled the nand_base.o file, because I know the problem happens there.
Why not disassemble the whole u-boot?
Then you'll get malloc as well, and the addresses will be closer to what shows up in the dump.
> Do you think it's useful for your analysis?
Can you disassemble malloc? That's where it actually crashed.
> 00001a4c <nand_scan_tail>:
>     1a4c: e92d4070  push {r4, r5, r6, lr}
>     1a50: e590509c  ldr r5, [r0, #156]
>     1a54: e595304c  ldr r3, [r5, #76]
>     1a58: e3130701  tst r3, #262144 ; 0x40000
>     1a5c: e1a06000  mov r6, r0
>     1a60: 1a000002  bne 1a70 <nand_scan_tail+0x24>
>     1a64: e59f04ec  ldr r0, [pc, #1260] ; 1f58 <nand_scan_tail+0x50c>
>     1a68: ebfffffe  bl 0 <malloc>
What's the value at PC+1260?
> By the way, I can't find the prototype of malloc anywhere in the project; it seems to be encapsulated in some library.
It's in common/malloc.c. There's weird preprocessor renaming going on, so it's called mALLOc in that file, but it shows up as malloc in the binary.
-Scott
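The renaming Scott describes can be reproduced in a few lines. This is a minimal sketch, not U-Boot's code: `demo_malloc` is a stand-in name chosen so the example does not clash with the C library, and the body simply forwards to the standard `malloc`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Illustrative assumption: "demo_malloc" plays the role of the public name. */
#define mALLOc demo_malloc

/*
 * The source text spells the function mALLOc, but the preprocessor
 * rewrites the identifier before compilation, so the emitted symbol
 * (and anything shown by a disassembler) is demo_malloc.
 */
void *mALLOc(size_t bytes)
{
    return malloc(bytes);
}

/* A caller that never mentions mALLOc still reaches the same function. */
int renamed_symbol_works(void)
{
    void *p = demo_malloc(16);
    int ok = (p != NULL);
    free(p);
    return ok;
}
```

Searching the sources for a definition of the public name finds nothing in a layout like this, which matches what terry observed; searching for the macro definition leads to the real function.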

On Fri, 2010-11-12 at 11:19 -0600, Scott Wood wrote:
> On Fri, 12 Nov 2010 20:45:18 +0800 terry gliumailenator@gmail.com wrote:
> > Dear Scott, I have disassembled the nand_base.o file, because I know the problem happens there.
> Why not disassemble the whole u-boot?
> Then you'll get malloc as well, and the addresses will be closer to what shows up in the dump.
> > Do you think it's useful for your analysis?
> Can you disassemble malloc? That's where it actually crashed.
The following is part of the result of disassembling the whole u-boot:
33f85f50 <nand_scan_tail>:
33f85f50: e92d4070  push {r4, r5, r6, lr}
33f85f54: e590509c  ldr r5, [r0, #156]
33f85f58: e595304c  ldr r3, [r5, #76]
33f85f5c: e3130701  tst r3, #262144 ; 0x40000
33f85f60: e1a06000  mov r6, r0
33f85f64: 1a000002  bne 33f85f74 <nand_scan_tail+0x24>
33f85f68: e59f04ec  ldr r0, [pc, #1260] ; 33f8645c <nand_scan_tail+0x50c> // value shown below
33f85f6c: eb0026cc  bl 33f8faa4 <malloc>
33f85f70: e58500dc  str r0, [r5, #220]
33f85f74: e59510dc  ldr r1, [r5, #220]
33f85f78: e3510000  cmp r1, #0 ; 0x0
33f85f7c: 03e0000b  mvneq r0, #11 ; 0xb
33f85f80: 08bd8070  popeq {r4, r5, r6, pc}
Value at pc + 1260:
33f8645c: 0000128e  .word 0x0000128e
The following is part of the disassembly of malloc. The full disassembly is in the attached malloc.dis file (I'm not sure which part is useful, so I attached the whole of malloc).
33f8fb84: 9a000004  bls 33f8fb9c <malloc+0xf8>
33f8fb88: e59f352c  ldr r3, [pc, #1324] ; 33f900bc <malloc+0x618>
33f8fb8c: e1520003  cmp r2, r3
33f8fb90: 91a0392a  lsrls r3, sl, #18
33f8fb94: 83a0207e  movhi r2, #126 ; 0x7e
33f8fb98: 9283207c  addls r2, r3, #124 ; 0x7c
33f8fb9c: e59f3514  ldr r3, [pc, #1300] ; 33f900b8 <malloc+0x614>
33f8fba0: e0834182  add r4, r3, r2, lsl #3
33f8fba4: e594000c  ldr r0, [r4, #12]
33f8fba8: ea000012  b 33f8fbf8 <malloc+0x154>
33f8fbac: e5903004  ldr r3, [r0, #4]
33f8fbb0: e3c33003  bic r3, r3, #3 ; 0x3 // it seems that the exception occurs here
33f8fbb4: e06a1003  rsb r1, sl, r3
33f8fbb8: e351000f  cmp r1, #15 ; 0xf
33f8fbbc: c2422001  subgt r2, r2, #1 ; 0x1
33f8fbc0: ca00000e  bgt 33f8fc00 <malloc+0x15c>
33f8fbc4: e3510000  cmp r1, #0 ; 0x0
33f8fbc8: e590c00c  ldr ip, [r0, #12]
33f8fbcc: ba000008  blt 33f8fbf4 <malloc+0x150>
33f8fbd0: e0803003  add r3, r0, r3
33f8fbd4: e5932004  ldr r2, [r3, #4]
33f8fbd8: e5901008  ldr r1, [r0, #8]
> > 00001a4c <nand_scan_tail>:
> >     1a4c: e92d4070  push {r4, r5, r6, lr}
> >     1a50: e590509c  ldr r5, [r0, #156]
> >     1a54: e595304c  ldr r3, [r5, #76]
> >     1a58: e3130701  tst r3, #262144 ; 0x40000
> >     1a5c: e1a06000  mov r6, r0
> >     1a60: 1a000002  bne 1a70 <nand_scan_tail+0x24>
> >     1a64: e59f04ec  ldr r0, [pc, #1260] ; 1f58 <nand_scan_tail+0x50c>
> >     1a68: ebfffffe  bl 0 <malloc>
> What's the value at PC+1260?
It's "1f58: 0000128e .word 0x0000128e"
> > By the way, I can't find the prototype of malloc anywhere in the project; it seems to be encapsulated in some library.
> It's in common/malloc.c. There's weird preprocessor renaming going on, so it's called mALLOc in that file, but it shows up as malloc in the binary.
Thanks very much for your careful instructions.
> -Scott
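The "PC+1260" address in the exchange above follows from how ARM-state code reads pc: the instruction sees its own address plus 8. A minimal sketch of that arithmetic, where `arm_pc_relative_addr` is a hypothetical helper name used only for illustration:

```c
#include <stdint.h>

/*
 * In ARM (not Thumb) state, an instruction that reads pc sees its own
 * address plus 8, so "ldr r0, [pc, #imm]" loads from insn_addr + 8 + imm.
 */
uint32_t arm_pc_relative_addr(uint32_t insn_addr, uint32_t imm)
{
    return insn_addr + 8u + imm;
}
```

For the `ldr r0, [pc, #1260]` at 0x33f85f68 in the linked image this gives 0x33f8645c, and for the same instruction at 0x1a64 in nand_base.o it gives 0x1f58, matching the disassembler's annotations. Both locations hold 0x0000128e, which is loaded into r0 just before the `bl` and is therefore the size passed to malloc.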

On Nov 12, 2010, at 9:43 PM, terry wrote:
> > > By the way, I can't find the prototype of malloc anywhere in the project; it seems to be encapsulated in some library.
> > It's in common/malloc.c. There's weird preprocessor renaming going on, so it's called mALLOc in that file, but it shows up as malloc in the binary.
> Thanks very much for your careful instructions.
> > -Scott
I haven't been following this thread, but I'm just debugging a malloc/NAND "corruption" issue myself. I'm going to start a new thread on the subject since it's more related to malloc. However, can you try the following and see what happens:
diff --git a/include/malloc.h b/include/malloc.h
index 3e145ad..19f0f0b 100644
--- a/include/malloc.h
+++ b/include/malloc.h
@@ -850,7 +850,7 @@ extern Void_t* sbrk();
 #endif
 
 #ifndef MORECORE_CLEARS
-#define MORECORE_CLEARS 1
+#define MORECORE_CLEARS 0
 #endif
 
 #endif /* INTERNAL_LINUX_C_LIB */

On Sat, 2010-11-13 at 10:24 -0600, Kumar Gala wrote:
> I haven't been following this thread, but just debugging a malloc/nand "corruption" issue myself. I'm going to start a new thread on the subject since its more related to malloc. However can you try the following and see what happens:
> diff --git a/include/malloc.h b/include/malloc.h
I'm sorry, but I don't understand you clearly. What do you mean by writing this? Should I compare two different versions? If so, which two versions?
> index 3e145ad..19f0f0b 100644
> --- a/include/malloc.h
> +++ b/include/malloc.h
> @@ -850,7 +850,7 @@ extern Void_t* sbrk();
>  #endif
>
>  #ifndef MORECORE_CLEARS
> -#define MORECORE_CLEARS 1
> +#define MORECORE_CLEARS 0
>  #endif
>
>  #endif /* INTERNAL_LINUX_C_LIB */
I read the malloc.h file in u-boot-2010.09. Its content is as follows:
    #ifdef INTERNAL_LINUX_C_LIB
819
820 #if __STD_C
821
822 Void_t * __default_morecore_init (ptrdiff_t);
823 Void_t *(*__morecore)(ptrdiff_t) = __default_morecore_init;
824
825 #else
826
827 Void_t * __default_morecore_init ();
828 Void_t *(*__morecore)() = __default_morecore_init;
829
830 #endif /* __STD_C */
831
832 #define MORECORE (*__morecore)
833 #define MORECORE_FAILURE 0
834 #define MORECORE_CLEARS 1
    #else /* INTERNAL_LINUX_C_LIB */
837
838 #if __STD_C
839 extern Void_t* sbrk(ptrdiff_t);
840 #else
841 extern Void_t* sbrk();
842 #endif
843
844 #ifndef MORECORE
845 #define MORECORE sbrk
846 #endif
847
848 #ifndef MORECORE_FAILURE
849 #define MORECORE_FAILURE -1
850 #endif
851
852 #ifndef MORECORE_CLEARS
853 #define MORECORE_CLEARS 1
854 #endif
855
856 #endif /* INTERNAL_LINUX_C_LIB */
Do you mean that I should change MORECORE_CLEARS from 1 to 0?

On Nov 14, 2010, at 7:18 AM, terry wrote:
> Do you mean that I should change MORECORE_CLEARS from 1 to 0?
Yes, I was asking you to modify include/malloc.h to change MORECORE_CLEARS from 1 to 0 and see if that helps your issue. You need to modify the one on line 853.
- k
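For context on what the macro controls: in Doug Lea's malloc, on which this allocator is based, MORECORE_CLEARS tells the calloc path whether memory freshly obtained from MORECORE (here sbrk) can be assumed to be zero already. The sketch below is a simplified illustration of that decision under that assumption, not the allocator's actual code; `bytes_to_zero` and its parameters are hypothetical names.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not a dlmalloc function: returns how many bytes a
 * calloc-style routine would clear explicitly for a chunk of chunk_size
 * bytes. When the chunk was carved from the old top chunk and extends
 * into memory just obtained from MORECORE, a nonzero morecore_clears
 * means the new part is trusted to be zero and only the old part is cleared.
 */
size_t bytes_to_zero(size_t chunk_size, int carved_from_old_top,
                     size_t old_top_size, int morecore_clears)
{
    if (morecore_clears && carved_from_old_top && chunk_size > old_top_size)
        return old_top_size;
    return chunk_size;
}
```

With the macro set to 0 the whole chunk is always cleared, so trying that setting is one way to check whether the platform's MORECORE really returns zeroed memory.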

On Sat, 13 Nov 2010 11:43:23 +0800 terry gliumailenator@gmail.com wrote:
> and the following is part of malloc after disassembled, you can find the detailed content of malloc in the attachment malloc.dis file(I'm not sure which part could be useful,so I attached whole malloc).
> 33f8fb84: 9a000004  bls 33f8fb9c <malloc+0xf8>
> 33f8fb88: e59f352c  ldr r3, [pc, #1324] ; 33f900bc <malloc+0x618>
> 33f8fb8c: e1520003  cmp r2, r3
> 33f8fb90: 91a0392a  lsrls r3, sl, #18
> 33f8fb94: 83a0207e  movhi r2, #126 ; 0x7e
> 33f8fb98: 9283207c  addls r2, r3, #124 ; 0x7c
> 33f8fb9c: e59f3514  ldr r3, [pc, #1300] ; 33f900b8 <malloc+0x614>
> 33f8fba0: e0834182  add r4, r3, r2, lsl #3
> 33f8fba4: e594000c  ldr r0, [r4, #12]
> 33f8fba8: ea000012  b 33f8fbf8 <malloc+0x154>
> 33f8fbac: e5903004  ldr r3, [r0, #4]
> 33f8fbb0: e3c33003  bic r3, r3, #3 ; 0x3 //it seems that exception occurs here
> 33f8fbb4: e06a1003  rsb r1, sl, r3
This is the instruction that it faulted on -- but it's not a memory access instruction. Could it be an asynchronous data abort (more like a machine check)? It's been a while since I've done ARM stuff.
/me googles "ARM exceptions"
Hmm, data aborts record PC+8 rather than PC? Who comes up with this stuff? :-P
Could you look up the line number information for 0x33f8fbac?
From the full-function disasm, I'd expect ip to equal r0 at this point -- but they don't in the dump.
-scott
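The two address adjustments used in this thread can be written down as small helpers. This is a sketch under the assumption Scott arrives at above, namely that the pc printed in the data-abort dump is the aborted instruction's address plus 8; the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Assumption taken from the thread: the pc shown in the data-abort dump
 * is the address of the aborted instruction plus 8.
 */
uint32_t data_abort_fault_addr(uint32_t reported_pc)
{
    return reported_pc - 8u;
}

/*
 * An ARM-state "bl" stores the address of the following instruction in
 * lr, so the call itself sits 4 bytes before the saved lr value.
 */
uint32_t bl_call_site(uint32_t lr)
{
    return lr - 4u;
}
```

Applied to terry's dump, a reported pc of 0x33f8fbb4 points back to 0x33f8fbac, the `ldr r3, [r0, #4]` in malloc, which is a memory access through r0 (0xcc33cc33 in the dump). An lr of 0x33f85f70 points back to the `bl malloc` at 0x33f85f6c in nand_scan_tail.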
participants (4)
- J.Hwan.Kim
- Kumar Gala
- Scott Wood
- terry