[PATCH v1] usb: xhci: Check return value of wait for TRB_TRANSFER event

xhci_wait_for_event() waiting TRB_TRANSFER event may return
NULL. Checking the return value to avoid crash.

Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
---
 drivers/usb/host/xhci-ring.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index c8260cbdf9..5f02ff0769 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -544,6 +544,8 @@ static void abort_td(struct usb_device *udev, int ep_index)
 	xhci_queue_command(ctrl, 0, udev->slot_id, ep_index, TRB_STOP_RING);

 	event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
+	if (!event)
+		return;
 	field = le32_to_cpu(event->trans_event.flags);
 	BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
 	BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
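
The pattern the patch applies can be sketched outside U-Boot as follows. This is only a minimal illustration under stated assumptions, not the driver code: struct mock_event, mock_wait_for_event() and mock_abort_td() are hypothetical stand-ins, the stub simply returns NULL when asked to simulate a missing event, and the sketch returns a status code where the real abort_td() returns void.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an event record; not the real xHCI layout. */
struct mock_event {
	uint32_t flags;
};

/*
 * Hypothetical stand-in for a wait function that can give up:
 * it returns NULL when no matching event arrives.
 */
static struct mock_event *mock_wait_for_event(int simulate_timeout)
{
	static struct mock_event ev = { .flags = 0x1 };

	return simulate_timeout ? NULL : &ev;
}

/* Returns 0 if an event was read, -1 if the wait produced no event. */
static int mock_abort_td(int simulate_timeout, uint32_t *field)
{
	struct mock_event *event = mock_wait_for_event(simulate_timeout);

	if (!event)	/* the kind of check the patch adds */
		return -1;

	*field = event->flags;	/* only reached with a valid pointer */
	return 0;
}
```

With this sketch, mock_abort_td(0, &field) returns 0 and fills in field, while mock_abort_td(1, &field) returns -1 without touching the NULL pointer.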

On 10/17/23 08:20, Minda Chen wrote:
> xhci_wait_for_event() waiting TRB_TRANSFER event may return
> NULL. Checking the return value to avoid crash.
>
> Signed-off-by: Minda Chen <minda.chen@starfivetech.com>

How did you trigger this error ? Is there a reproducer ? Details please ...

On 2023/10/17 19:20, Marek Vasut wrote:
> On 10/17/23 08:20, Minda Chen wrote:
>> xhci_wait_for_event() waiting TRB_TRANSFER event may return
>> NULL. Checking the return value to avoid crash.
>>
>> Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
>
> How did you trigger this error ? Is there a reproducer ? Details please ...

It happens while scanning a Lenovo USB 2.0 udisk; it does not reproduce 100% of the time.

This is the log:
StarFive # usb reset
resetting USB...
Bus xhci_pci: Register 5000420 NbrPorts 5
Starting the controller
USB XHCI 1.00
scanning bus xhci_pci for devices... WARN halted endpoint, queueing URB anyway.
Unexpected XHCI event TRB, skipping... (f77141f0 00000000 13000000 02008401)
Unhandled exception: Load access fault
EPC: 00000000f7f563c6 RA: 00000000f7f563c6 TVAL: 000000000000000c
EPC: 000000004024a3c6 RA: 000000004024a3c6 reloc adjusted

SP: 00000000f76f9a60 GP: 00000000f76fbdd0 TP: 0000000000000001
T0: 00000000f76fa168 T1: 00000000000000ff T2: 0000000000000016
S0: 00000000f7712fc0 S1: 00000000f76fb100 A0: 0000000000000000
A1: 0000000000000000 A2: 00000000f77145d0 A3: 00000000f7714590
A4: 0000000000000000 A5: 0000000000000020 A6: 000000000000000f
A7: 0000000000000100 S2: 0000000000000000 S3: 0000000000000000
S4: 00000000f7717050 S5: 00000000f7717050 S6: 0000000080000383
S7: 00000000f76f9dc0 S8: 00000000000000ff S9: 0000000000000001
S10: 00000000f76f9ba0 S11: 0000000000010c04 T3: 0000000000000010
T4: 0000000000000006 T5: 0000000000000080 T6: 00000000f76fa231

Code: 842a f0ef d75f 0593 0200 8522 f0ef ebdf (455c)

This is USB info and storage info
StarFive #
1: Hub, USB Revision 3.0
- U-Boot XHCI Host Controller
- Class: Hub
- PacketSize: 512 Configurations: 1
- Vendor: 0x0000 Product 0x0000 Version 1.0 Configuration: 1
- Interfaces: 1 Self Powered 0mA Interface: 0
- Alternate Setting 0, Endpoints: 1
- Class Hub
- Endpoint 1 In Interrupt MaxPacket 8 Interval 255ms

2: Hub, USB Revision 2.10
- USB2.0 Hub
- Class: Hub
- PacketSize: 64 Configurations: 1
- Vendor: 0x2109 Product 0x3431 Version 4.32 Configuration: 1
- Interfaces: 1 Self Powered Remote Wakeup 100mA Interface: 0
- Alternate Setting 0, Endpoints: 1
- Class Hub
- Endpoint 1 In Interrupt MaxPacket 1 Interval 12ms

3: Mass Storage, USB Revision 2.0
- Generic Mass Storage 31097778XB15113405
- Class: (from Interface) Mass Storage
- PacketSize: 64 Configurations: 1
- Vendor: 0x17ef Product 0x38ac Version 1.0 Configuration: 1
- Interfaces: 1 Bus Powered 200mA Interface: 0
- Alternate Setting 0, Endpoints: 2
- Class Mass Storage, Transp. SCSI, Bulk only
- Endpoint 1 Out Bulk MaxPacket 512
- Endpoint 2 In Bulk MaxPacket 512

StarFive # usb storage
Device 0: Vendor: Rev: 8.07 Prod: Lenovo SX1 64G
Type: Removable Hard Disk
Capacity: 60000.0 MB = 58.5 GB (122880000 x 512)
StarFive #

On 10/18/23 03:22, Minda Chen wrote:
> On 2023/10/17 19:20, Marek Vasut wrote:
>> On 10/17/23 08:20, Minda Chen wrote:
>>> xhci_wait_for_event() waiting TRB_TRANSFER event may return
>>> NULL. Checking the return value to avoid crash.
>>>
>>> Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
>>
>> How did you trigger this error ? Is there a reproducer ? Details please ...
>
> It happens while scanning a Lenovo USB 2.0 udisk; it does not reproduce 100% of the time.

Can you include the Linux 'lsusb -vvv' output for this device and include that information in the commit message ? (or the U-Boot info below, that works too, just please add it into the commit message, it is important for future reference).

> This is the log:
>
> StarFive # usb reset
[...]
> Unhandled exception: Load access fault
> EPC: 00000000f7f563c6 RA: 00000000f7f563c6 TVAL: 000000000000000c
> EPC: 000000004024a3c6 RA: 000000004024a3c6 reloc adjusted

Where does the crash point to in code, can you disassemble the PC pointer ? (or maybe you can use scripts/decodecode I think)

[...]
> 3: Mass Storage, USB Revision 2.0
> - Generic Mass Storage 31097778XB15113405
> - Class: (from Interface) Mass Storage
> - PacketSize: 64 Configurations: 1
> - Vendor: 0x17ef Product 0x38ac Version 1.0 Configuration: 1
> - Interfaces: 1 Bus Powered 200mA Interface: 0
> - Alternate Setting 0, Endpoints: 2
> - Class Mass Storage, Transp. SCSI, Bulk only
> - Endpoint 1 Out Bulk MaxPacket 512
> - Endpoint 2 In Bulk MaxPacket 512
>
> StarFive # usb storage
> Device 0: Vendor: Rev: 8.07 Prod: Lenovo SX1 64G
> Type: Removable Hard Disk
> Capacity: 60000.0 MB = 58.5 GB (122880000 x 512)
[...]

On 2023/10/18 10:35, Marek Vasut wrote:
> On 10/18/23 03:22, Minda Chen wrote:
[...]
>> It happens while scanning a Lenovo USB 2.0 udisk; it does not reproduce 100% of the time.
>
> Can you include the Linux 'lsusb -vvv' output for this device and include that information in the commit message ? (or the U-Boot info below, that works too, just please add it into the commit message, it is important for future reference).

OK, I will add the Linux 'lsusb -vvv' output for the udisk and the crash dump info to the commit message.

[...]
> Where does the crash point to in code, can you disassemble the PC pointer ? (or maybe you can use scripts/decodecode I think)

OK, I will add the disassembly around the EPC to the commit message.

[...]

On 10/18/23 05:46, Minda Chen wrote:
> On 2023/10/18 10:35, Marek Vasut wrote:
[...]
>> Can you include the Linux 'lsusb -vvv' output for this device and include that information in the commit message ? (or the U-Boot info below, that works too, just please add it into the commit message, it is important for future reference).
>
> OK, I will add the Linux 'lsusb -vvv' output for the udisk and the crash dump info to the commit message.

Thank you

[...]
>> Where does the crash point to in code, can you disassemble the PC pointer ? (or maybe you can use scripts/decodecode I think)
>
> OK, I will add the disassembly around the EPC to the commit message.

This part probably doesn't need to be in the commit message. I'd like to know where the crash occurred in the code.

On 2023/10/18 18:11, Marek Vasut wrote:
> On 10/18/23 05:46, Minda Chen wrote:
[...]
>>> Where does the crash point to in code, can you disassemble the PC pointer ? (or maybe you can use scripts/decodecode I think)
>>
>> OK, I will add the disassembly around the EPC to the commit message.
>
> This part probably doesn't need to be in the commit message. I'd like to know where the crash occurred in the code.

000000004024a376 <abort_td>:
{
    4024a376:   7179        addi    sp,sp,-48
    4024a378:   f406        sd      ra,40(sp)
    4024a37a:   f022        sd      s0,32(sp)
    4024a37c:   ec26        sd      s1,24(sp)
    4024a37e:   e84a        sd      s2,16(sp)
    4024a380:   e44e        sd      s3,8(sp)
    4024a382:   e052        sd      s4,0(sp)
    4024a384:   89ae        mv      s3,a1
    4024a386:   84aa        mv      s1,a0
        struct xhci_ctrl *ctrl = xhci_get_ctrl(udev);
    4024a388:   8c4fe0ef    jal     ra,4024844c <xhci_get_ctrl>
        struct xhci_ring *ring = ctrl->devs[udev->slot_id]->eps[ep_index].ring;
    4024a38c:   6785        lui     a5,0x1
    4024a38e:   94be        add     s1,s1,a5
    4024a390:   9444a603    lw      a2,-1724(s1)
    4024a394:   00198713    addi    a4,s3,1
    4024a398:   0712        slli    a4,a4,0x4
    4024a39a:   02061793    slli    a5,a2,0x20
    4024a39e:   9381        srli    a5,a5,0x20
    4024a3a0:   07c9        addi    a5,a5,18
    4024a3a2:   078e        slli    a5,a5,0x3
    4024a3a4:   97aa        add     a5,a5,a0
    4024a3a6:   679c        ld      a5,8(a5)
        xhci_queue_command(ctrl, NULL, udev->slot_id, ep_index, TRB_STOP_RING);
    4024a3a8:   2981        sext.w  s3,s3
    4024a3aa:   86ce        mv      a3,s3
        struct xhci_ring *ring = ctrl->devs[udev->slot_id]->eps[ep_index].ring;
    4024a3ac:   97ba        add     a5,a5,a4
        xhci_queue_command(ctrl, NULL, udev->slot_id, ep_index, TRB_STOP_RING);
    4024a3ae:   4581        li      a1,0
    4024a3b0:   473d        li      a4,15
        struct xhci_ring *ring = ctrl->devs[udev->slot_id]->eps[ep_index].ring;
    4024a3b2:   0087ba03    ld      s4,8(a5) # 1008 <_start-0x401feff8>
        struct xhci_ctrl *ctrl = xhci_get_ctrl(udev);
    4024a3b6:   842a        mv      s0,a0
        xhci_queue_command(ctrl, NULL, udev->slot_id, ep_index, TRB_STOP_RING);
    4024a3b8:   d75ff0ef    jal     ra,4024a12c <xhci_queue_command>
        event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
    4024a3bc:   02000593    li      a1,32
    4024a3c0:   8522        mv      a0,s0
    4024a3c2:   ebdff0ef    jal     ra,4024a27e <xhci_wait_for_event>
        field = le32_to_cpu(event->trans_event.flags);
epc-> 4024a3c6: 455c        lw      a5,12(a0)
        BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
    4024a3c8:   9444a703    lw      a4,-1724(s1)
        field = le32_to_cpu(event->trans_event.flags);
    4024a3cc:   0007891b    sext.w  s2,a5
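
Read against the register dump posted earlier in the thread (A0 = 0, TVAL = 0xc), the faulting instruction lw a5,12(a0) appears consistent with event being NULL after xhci_wait_for_event() returns: the load address would be 0 + 12 = 0xc. The sketch below only redoes that arithmetic. struct sketch_transfer_event is an assumed 16-byte layout written here for illustration, not taken from U-Boot's headers, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed layout of a transfer event: a 64-bit buffer pointer, a 32-bit
 * length/status word and a 32-bit flags word. This is a stand-in defined
 * for the sketch, not U-Boot's own definition.
 */
struct sketch_transfer_event {
	uint64_t buffer;
	uint32_t transfer_len;
	uint32_t flags;
};

/* Address that a load of ->flags touches for a given event pointer value. */
static uint64_t flags_load_address(uint64_t event_ptr)
{
	return event_ptr + offsetof(struct sketch_transfer_event, flags);
}

/* Distance between the runtime EPC and the "reloc adjusted" EPC. */
static uint64_t reloc_offset(uint64_t epc_runtime, uint64_t epc_adjusted)
{
	return epc_runtime - epc_adjusted;
}

/* Offset of an address inside a function, given the function's start. */
static uint64_t offset_in_function(uint64_t addr, uint64_t func_start)
{
	return addr - func_start;
}
```

Plugging in the values from the log and the disassembly: flags_load_address(0) gives 0xc, which matches TVAL; reloc_offset(0xf7f563c6, 0x4024a3c6) gives 0xb7d0c000; and offset_in_function(0x4024a3c6, 0x4024a376) gives 0x50, the position of the marked instruction inside abort_td.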

On 10/18/23 12:16, Minda Chen wrote:
> On 2023/10/18 18:11, Marek Vasut wrote:
[...]
>> This part probably doesn't need to be in the commit message. I'd like to know where the crash occurred in the code.
>
> 000000004024a376 <abort_td>:
[...]
>         event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
>     4024a3bc:   02000593    li      a1,32
>     4024a3c0:   8522        mv      a0,s0
>     4024a3c2:   ebdff0ef    jal     ra,4024a27e <xhci_wait_for_event>
>         field = le32_to_cpu(event->trans_event.flags);
> epc-> 4024a3c6: 455c        lw      a5,12(a0)

So the fault occurs when reading the controller register(s), do I understand it right ?

Could it be the problem is rather some clock, which are turned off after a fault ?

On 2023/10/18 18:55, Marek Vasut wrote:
> On 10/18/23 12:16, Minda Chen wrote:
[...]
>>         field = le32_to_cpu(event->trans_event.flags);
>> epc-> 4024a3c6: 455c        lw      a5,12(a0)
>
> So the fault occurs when reading the controller register(s), do I understand it right ?

I think that is right. Actually, this error occurs in the error path: the control transfer hits a TRB_TRANSFER error and jumps to the error path, sending TRB_TRANSFER again.

> Could it be the problem is rather some clock, which are turned off after a fault ?

I think not. Only this udisk reproduces the issue.

On 10/19/23 04:46, Minda Chen wrote:
[...]
>> So the fault occurs when reading the controller register(s), do I understand it right ?
>
> I think that is right. Actually, this error occurs in the error path: the control transfer hits a TRB_TRANSFER error and jumps to the error path, sending TRB_TRANSFER again.
>
>> Could it be the problem is rather some clock, which are turned off after a fault ?
>
> I think not. Only this udisk reproduces the issue.

Can you take a closer look into this ? Is there maybe some hardware debug tool which can clarify what is going on better ?

It seems weird that controller register access would trigger this kind of bus fault (it is a bus fault, right ?)

participants (2)
- Marek Vasut
- Minda Chen