
On 3/23/22 11:07, qianfan wrote:
On 2022/3/23 17:51, Heinrich Schuchardt wrote:
On 3/23/22 10:13, qianfan wrote:
On 2022/3/23 16:02, qianfan wrote:
On 2022/3/23 15:45, qianfan wrote:
On 2022/3/23 10:28, qianfan wrote:
Hi:
I have a custom AM335X board connected to my computer via usbnet. It always reports a data abort when running 'dhcp'.
Here is the log:
U-Boot 2022.01-rc1-00183-gfa5b4e2d19-dirty (Feb 25 2022 - 15:45:02 +0800)
CPU : AM335X-GP rev 2.1
Model: WISDOM AM335X CCT
DRAM: 512 MiB
NAND: 256 MiB
MMC: OMAP SD/MMC: 0
Loading Environment from NAND... *** Warning - bad CRC, using default environment
Net: Could not get PHY for ethernet@4a100000: addr 0
eth2: ethernet@4a100000, eth3: usb_ether
Hit any key to stop autoboot: 0
=> setenv autoload no
=> dhcp
using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
MAC de:ad:be:ef:00:01
HOST MAC de:ad:be:ef:00:00
RNDIS ready
musb-hdrc: peripheral reset irq lost!
high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
USB RNDIS network up!
BOOTP broadcast 1
BOOTP broadcast 2
BOOTP broadcast 3
DHCP client bound to address 192.168.200.4 (757 ms)
data abort
pc : [<9fe9b0a2>] lr : [<9febbc3f>]
reloc pc : [<808130a2>] lr : [<80833c3f>]
sp : 9de53410 ip : 9de53578 fp : 00000001
r10: 9de5345c r9 : 9de67e80 r8 : 9febbae5
r7 : 9de72c30 r6 : 9feec710 r5 : 0000000d r4 : 00000018
r3 : 3fdd8e04 r2 : 00000002 r1 : 9feec728 r0 : 9feec700
Flags: Nzcv IRQs off FIQs on Mode SVC_32 (T)
Code: f023 0303 60ca 4403 (6091) 685a
Resetting CPU ...
resetting ...
Is there any documentation on how to debug a data abort? Or has this bug already been fixed?
Thanks
This bug is not fixed on master. I found that v2021.01 is good and v2021.04-rc2 is bad.
I also tested this on a BeagleBone Black with am335x_evm_defconfig, and it has a similar problem.
I searched for the first bad commit with 'git bisect', which reported that commit e97eb638de0dc8f6e989e20eaeb0342f103cb917 broke it. This is very strange, because that commit doesn't touch any DHCP or network code.
➜ u-boot-main git:(e97eb638de) ✗ git bisect bug
e97eb638de0dc8f6e989e20eaeb0342f103cb917 is the first bug commit
commit e97eb638de0dc8f6e989e20eaeb0342f103cb917
Author: Heinrich Schuchardt <xypron.glpk@gmx.de>
Date: Wed Jan 20 22:21:53 2021 +0100

    fs: fat: consistent error handling for flush_dir()

    Provide function description for flush_dir().
    Move all error messages for flush_dir() from the callers to the function.
    Move mapping of errors to -EIO to the function.
    Always check return value of flush_dir() (Coverity CID 316362).

    In fat_unlink() return -EIO if flush_dirty_fat_buffer() fails.

    Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>

:040000 040000 2281a449f2d134078d7faa1ee735a367b55aad7e 77d188b1c99181fd71f2167fdeee3434a09db209 M fs
184aa6504143b452132e28cd3ebecc7b941cdfa1 is the commit immediately before e97eb638de0dc8f6e989e20eaeb0342f103cb917:
* e97eb638de0dc8f6e989e20eaeb0342f103cb917 fs: fat: consistent error handling for flush_dir()
* 184aa6504143b452132e28cd3ebecc7b941cdfa1 Merge tag 'u-boot-rockchip-20210121' of https://gitlab.denx.de/u-boot/custodians/u-boot-rockchip
|\
| * 9ddc0787bd660214366e386ce689dd78299ac9d0 pci: Add Rockchip dwc based PCIe controller driver
I verified that 184aa6504143b452132e28cd3ebecc7b941cdfa1 works fine:
U-Boot 2021.01-00688-g184aa65041-dirty (Mar 23 2022 - 15:07:56 +0800)
CPU : AM335X-GP rev 2.1
Model: TI AM335x BeagleBone Black
DRAM: 512 MiB
WDT: Started with servicing (60s timeout)
NAND: 0 MiB
MMC: OMAP SD/MMC: 0, OMAP SD/MMC: 1
Loading Environment from FAT...
<ethaddr> not set. Validating first E-fuse MAC
Net: eth2: ethernet@4a100000, eth3: usb_ether
Hit any key to stop autoboot: 0
=> dhcp
ethernet@4a100000 Waiting for PHY auto negotiation to complete......... TIMEOUT !
using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
MAC de:ad:be:ef:00:01
HOST MAC de:ad:be:ef:00:00
RNDIS ready
musb-hdrc: peripheral reset irq lost!
high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
USB RNDIS network up!
BOOTP broadcast 1
BOOTP broadcast 2
BOOTP broadcast 3
DHCP client bound to address 192.168.200.157 (757 ms)
Using usb_ether device
TFTP from server 192.168.200.1; our IP address is 192.168.200.157
Filename 'u-boot.img'.
Load address: 0x82000000
Loading: #################################################################
#################################################################
#################################################################
#########################
2.5 MiB/s
done
Bytes transferred = 1123888 (112630 hex)
=>
"data abort" messages:
data abort
pc : [<9ff8196c>] lr : [<9ffa1cd7>]
reloc pc : [<8081496c>] lr : [<80834cd7>]
sp : 9df38e60 ip : 9df38fc8 fp : 00000001
r10: 9df38eac r9 : 9df4ceb0 r8 : 9ffa1b7d
r7 : 9df52fd0 r6 : 9ffdbba8 r5 : 0000000d r4 : 00000018
r3 : 3ff589e0 r2 : 9ffafa11 r1 : 9ffdbbc0 r0 : 9ffdbb00
Flags: Nzcv IRQs off FIQs on Mode SVC_32 (T)
Code: 0303 60ca 4403 6091 (685a) f042
Resetting CPU ...
According to objdump of u-boot, pc is in malloc and lr is in env_attr_walk:
    unlink(victim, bck, fwd);
80814966: 60ca        str r2, [r1, #12]
    set_inuse_bit_at_offset(victim, victim_size);
80814968: 4403        add r3, r0
    unlink(victim, bck, fwd);
8081496a: 6091        str r1, [r2, #8]
    set_inuse_bit_at_offset(victim, victim_size);
8081496c: 685a        ldr r2, [r3, #4]
8081496e: f042 0201   orr.w r2, r2, #1
80814972: 605a        str r2, [r3, #4]
r3 is 3ff589e0, which is not a valid RAM address on AM335x.
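The address check above can be reproduced with a few lines of C. This is only a sketch: it assumes DRAM starts at 0x80000000 (consistent with the load address 0x82000000 and the reloc pc values in the logs) and uses the 512 MiB size from the boot banner. The helper names in_dram() and reloc_offset() are made up for this illustration and do not exist in U-Boot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: DRAM base 0x80000000; size is the 512 MiB printed at boot. */
#define DRAM_BASE 0x80000000u
#define DRAM_SIZE (512u * 1024u * 1024u)

/* Hypothetical helper: true if addr lies inside the assumed DRAM window. */
static bool in_dram(uint32_t addr)
{
	return addr >= DRAM_BASE && (addr - DRAM_BASE) < DRAM_SIZE;
}

/*
 * Hypothetical helper: offset between a running (relocated) address and
 * the corresponding link-time address, i.e. "pc" minus "reloc pc".
 */
static uint32_t reloc_offset(uint32_t pc, uint32_t reloc_pc)
{
	return pc - reloc_pc;
}
```

With the values from the second dump, in_dram(0x3ff589e0) is false while in_dram(0x9ffdbb00) (r0) is true, and reloc_offset(0x9ff8196c, 0x8081496c) gives 0x1f76d000. The lr pair gives the same offset, which is what maps the reported pc and lr back to addresses in the objdump listing.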
I have seen crashes in common/dlmalloc.c before, after a double free() or a free() with an incorrect pointer.
The assert() statements in do_check_inuse_chunk() are meant to catch this, but assert() as defined in include/log.h does not stop the code and does not even print anything without _DEBUG=1.
You should be able to get the assert output with
#include <common.h>
#define _DEBUG 1
#include <log.h>
at the top of common/dlmalloc.c.
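To illustrate why a failing assert() can pass silently, here is a simplified sketch of an assertion macro gated by a debug flag. It is not the actual code from include/log.h; SKETCH_DEBUG, sketch_assert() and the other names are invented for this example, and the real macro may differ in detail.

```c
#include <stdio.h>

/* Hypothetical stand-in for the _DEBUG flag: 0 silences the check. */
#ifndef SKETCH_DEBUG
#define SKETCH_DEBUG 0
#endif

static int sketch_assert_reports; /* number of failures actually reported */

static void sketch_assert_fail(const char *expr, const char *file, int line)
{
	sketch_assert_reports++;
	fprintf(stderr, "%s:%d: assertion \"%s\" failed\n", file, line, expr);
}

/* Reports only if the condition is false AND the debug flag is non-zero. */
#define sketch_assert(x) \
	do { \
		if (!(x) && SKETCH_DEBUG) \
			sketch_assert_fail(#x, __FILE__, __LINE__); \
	} while (0)

/* Runs one failing check and returns how many reports it produced. */
static int sketch_failing_check(void)
{
	int before = sketch_assert_reports;
	int chunk_in_use = 0; /* pretend the heap consistency check failed */

	sketch_assert(chunk_in_use);
	return sketch_assert_reports - before;
}
```

With SKETCH_DEBUG left at 0, sketch_failing_check() returns 0: the broken condition is evaluated but nothing is printed and execution continues. Building with -DSKETCH_DEBUG=1 makes the same call report the failure.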
You should get full malloc debug output with
#define DEBUG 1
#include <common.h>
#include <log.h>
Best regards
Heinrich
Hi: I tried adding the DEBUG macro before <log.h>, but no other malloc messages were printed.
assert() checks for _DEBUG. Defining DEBUG after common.h will not define _DEBUG.
Best regards
Heinrich
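The ordering issue can be shown in a single C file. This is a sketch under the assumption that the header derives its internal flag from DEBUG at the point where it is first included, as the remark about common.h implies. The EARLY_ and LATE_ names are hypothetical, and the marked section merely stands in for the header.

```c
#define EARLY_DEBUG 1 /* defined BEFORE the stand-in header section */

/* ---- begin: stand-in for what a logging header does when included ---- */
#ifdef EARLY_DEBUG
#define EARLY_LATCHED 1
#else
#define EARLY_LATCHED 0
#endif

#ifdef LATE_DEBUG
#define LATE_LATCHED 1
#else
#define LATE_LATCHED 0
#endif
/* ---- end: stand-in header ---- */

#define LATE_DEBUG 1 /* defined AFTER: the #ifdef above already chose 0 */

/* Accessors so the latched values can be inspected at run time. */
static int early_latched(void)
{
	return EARLY_LATCHED;
}

static int late_latched(void)
{
	return LATE_LATCHED;
}
```

Here early_latched() returns 1 and late_latched() returns 0. A definition placed after the header has been processed does not change the flag that was derived from it, so the macro has to appear before the first include that pulls in the header.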