[U-Boot] usb_test_unit_ready called every block read - performance

While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its ok to only call during usb_stor_scan()? We're finding this speeds up ext2load quite a bit.
Regards, Jim

Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its ok to only call during usb_stor_scan()? We're finding this speeds up ext2load quite a bit.
Could it be because the USB is actually quite slower than SCSI and the device might get congested? It seems the code was there ever since (I can't seem to find the origin of it) and we might actually want to try if this doesn't break some usb 1.x things.
Regards, Jim
Best regards, Marek Vasut

Marek,
I don't have a USB 1.x device to do that regression. Taking the call to usb_test_unit_ready out of the block reads/writes and leaving it only in get info really speed up ext2load. With the recent cache changes on top of this and the 5ms delay, we're getting fairly close to the Linux USB transfer times and are happy with the results. Can you look into integrating this?
-Jim
On Mon, Jul 30, 2012 at 6:54 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its
ok
to only call during usb_stor_scan()? We're finding this speeds up
ext2load
quite a bit.
Could it be because the USB is actually quite slower than SCSI and the device might get congested? It seems the code was there ever since (I can't seem to find the origin of it) and we might actually want to try if this doesn't break some usb 1.x things.
Regards, Jim
Best regards, Marek Vasut

Dear Jim Shimer,
Marek,
I don't have a USB 1.x device to do that regression. Taking the call to usb_test_unit_ready out of the block reads/writes and leaving it only in get info really speed up ext2load. With the recent cache changes on top of this and the 5ms delay, we're getting fairly close to the Linux USB transfer times and are happy with the results. Can you look into integrating this?
Can you send a patch that'll yank it out? I should be able to test it in a bit.
-Jim
On Mon, Jul 30, 2012 at 6:54 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its
ok
to only call during usb_stor_scan()? We're finding this speeds up
ext2load
quite a bit.
Could it be because the USB is actually quite slower than SCSI and the device might get congested? It seems the code was there ever since (I can't seem to find the origin of it) and we might actually want to try if this doesn't break some usb 1.x things.
Regards, Jim
Best regards, Marek Vasut
Best regards, Marek Vasut

Hi Marek,
Here's a patch.
-Jim
On Mon, Jul 30, 2012 at 7:41 PM, Marek Vasut marek.vasut@gmail.com wrote:
Dear Jim Shimer,
Marek,
I don't have a USB 1.x device to do that regression. Taking the call to usb_test_unit_ready out of the block reads/writes and leaving it only in get info really speed up ext2load. With the recent cache changes on top
of
this and the 5ms delay, we're getting fairly close to the Linux USB transfer times and are happy with the results. Can you look into integrating this?
Can you send a patch that'll yank it out? I should be able to test it in a bit.
-Jim
On Mon, Jul 30, 2012 at 6:54 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was
only
calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its
ok
to only call during usb_stor_scan()? We're finding this speeds up
ext2load
quite a bit.
Could it be because the USB is actually quite slower than SCSI and the device might get congested? It seems the code was there ever since (I can't
seem
to find the origin of it) and we might actually want to try if this
doesn't
break some usb 1.x things.
Regards, Jim
Best regards, Marek Vasut
Best regards, Marek Vasut

Dear Jim Shimer,
In message CACjqNxspYVs8Rsqq8Mgnsop15Js6HTdXQXX=v-cWEVBpmN_PPw@mail.gmail.com you wrote:
Here's a patch.
Please read http://www.denx.de/wiki/U-Boot/Patches
User git-format-patch and git-send-email (or makae sure yourself to stick to the mandatory format, which includes for example the "[PATCH]" marker in the subject, and submit inline, not as MIME atachment.
Finally: it appears you introduce a new config option CONFIG_USB_LEGACY_RW - this MUST be documented.
Best regards,
Wolfgang Denk

Dear Jim Shimer,
Hi Marek,
Here's a patch.
-Jim
[...]
I applied modified patch, please check u-boot-usb.git if you agree.
btw please read http://www.denx.de/wiki/U-Boot/Patches
Best regards, Marek Vasut

Marek,
That looks good to me. Jim?
Thanks, Steve
On Tue, Aug 14, 2012 at 1:50 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
Hi Marek,
Here's a patch.
-Jim
[...]
I applied modified patch, please check u-boot-usb.git if you agree.
btw please read http://www.denx.de/wiki/U-Boot/Patches
Best regards, Marek Vasut

Marek/Benoit,
Thanks so much for integrating this. I like the way you reused the flags, and simplified the code.
Regards, Jim
On Tue, Aug 14, 2012 at 1:57 PM, Steve Heckman mgi2481@motorola.com wrote:
Marek,
That looks good to me. Jim?
Thanks, Steve
On Tue, Aug 14, 2012 at 1:50 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
Hi Marek,
Here's a patch.
-Jim
[...]
I applied modified patch, please check u-boot-usb.git if you agree.
btw please read http://www.denx.de/wiki/U-Boot/Patches
Best regards, Marek Vasut

Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its ok to only call during usb_stor_scan()? We're finding this speeds up ext2load quite a bit.
Jim, did we get anywhere on this one ? Can you try with the new ext4 code in Wolfgangs' u-boot-master/ext4 branch?
Regards, Jim
Best regards, Marek Vasut

Marek,
I've been working with Jim on this.
The latest u-boot code we tried (2012.07) does seem to be much faster WRT reading ext volumes from USB media. We were still able to eke out a bit more speed by removing the Test Unit Ready calls per Read/Write. In my experience, you only call Test Unit Ready during initialization of a device (e.g., SCSI) or after device reset.
We still haven't heard a reason why it is being done for each Read and Write operation. Was it put in to work around bad hardware?
Thanks, Steve Heckman
On Sun, Aug 12, 2012 at 7:54 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its
ok
to only call during usb_stor_scan()? We're finding this speeds up
ext2load
quite a bit.
Jim, did we get anywhere on this one ? Can you try with the new ext4 code in Wolfgangs' u-boot-master/ext4 branch?
Regards, Jim
Best regards, Marek Vasut

Dear Steve Heckman,
Marek,
I've been working with Jim on this.
The latest u-boot code we tried (2012.07) does seem to be much faster WRT reading ext volumes from USB media. We were still able to eke out a bit more speed by removing the Test Unit Ready calls per Read/Write. In my experience, you only call Test Unit Ready during initialization of a device (e.g., SCSI) or after device reset.
We still haven't heard a reason why it is being done for each Read and Write operation. Was it put in to work around bad hardware?
I have zero idea, it was there ever since ... let's apply similar patch and see how it fares.
Thanks, Steve Heckman
On Sun, Aug 12, 2012 at 7:54 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its
ok
to only call during usb_stor_scan()? We're finding this speeds up
ext2load
quite a bit.
Jim, did we get anywhere on this one ? Can you try with the new ext4 code in Wolfgangs' u-boot-master/ext4 branch?
Regards, Jim
Best regards, Marek Vasut
Best regards, Marek Vasut

Hi Marek,
I looked at the ext4 branch. It looks like he has the patch to remove the usb_test_unit_ready() calls which were not needed. Actually those calls are commented out on that branch: #if 0 if (usb_test_unit_ready(srb, ss)) { printf("Device NOT ready\n Request Sense returned %02X %02X" " %02X\n", srb->sense_buf[2], srb->sense_buf[12], srb->sense_buf[13]); return 0; } #endif
In the u-boot-usb.git, this code is removed so at some point there will be a merge conflict.
Also the ext4 branch still has the mdelay(5) always being done in usb_stor_BBB_transport() line 696 which we found to be the largest performance killer.
Regards, Jim
On Sun, Aug 12, 2012 at 7:54 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its
ok
to only call during usb_stor_scan()? We're finding this speeds up
ext2load
quite a bit.
Jim, did we get anywhere on this one ? Can you try with the new ext4 code in Wolfgangs' u-boot-master/ext4 branch?
Regards, Jim
Best regards, Marek Vasut

Oh yeah, forgot about that...;)
I just downloaded the ext4 branch tarball and built it.
The test_unit_ready calls were still in there.
With or without those it took 0m 45s to load a ~150MB image.
In our original branch (2011.12), the test_unit_ready calls had more of an impact. The stock 2011.12 u-boot image took 6m 22s to load the 150MB file. Without test_unit_ready, it took 3m 15s. Without test_unit_ready and wait_ms(5) in usb_stor_BBB_transport() it took 0m 16s.
In the ext4 branch, I removed test_unit_ready and the mdelay(5) call from usb_stor_BBB_transport() function and was able to load the same file in 0m 8s.
So, removing the mdelay(5) call is the biggest improvement. Removing both is the best.
To recap:
with w/o w/o TUR TUR TUR and 5ms wait 2011.12 6:25 3:15 0:16 ext4 0:45 0:45 0:08
Note: all these time include the 3-4 seconds it takes to do the "usb start".
Regards, -Steve
On Wed, Aug 15, 2012 at 10:19 AM, Jim Shimer mgi2475@motorola.com wrote:
Hi Marek,
I looked at the ext4 branch. It looks like he has the patch to remove the usb_test_unit_ready() calls which were not needed. Actually those calls are commented out on that branch: #if 0 if (usb_test_unit_ready(srb, ss)) { printf("Device NOT ready\n Request Sense returned %02X %02X" " %02X\n", srb->sense_buf[2], srb->sense_buf[12], srb->sense_buf[13]); return 0; } #endif
In the u-boot-usb.git, this code is removed so at some point there will be a merge conflict.
Also the ext4 branch still has the mdelay(5) always being done in usb_stor_BBB_transport() line 696 which we found to be the largest performance killer.
Regards, Jim
On Sun, Aug 12, 2012 at 7:54 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being
called
every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its
ok
to only call during usb_stor_scan()? We're finding this speeds up
ext2load
quite a bit.
Jim, did we get anywhere on this one ? Can you try with the new ext4 code in Wolfgangs' u-boot-master/ext4 branch?
Regards, Jim
Best regards, Marek Vasut
-- *James H Shimer* Motorola Mobility T3-12-HH72 900 Chelmsford Street Lowell MA 08151 978-614-3550

Dear Steve Heckman,
Oh yeah, forgot about that...;)
I just downloaded the ext4 branch tarball and built it.
The test_unit_ready calls were still in there.
With or without those it took 0m 45s to load a ~150MB image.
In our original branch (2011.12), the test_unit_ready calls had more of an impact. The stock 2011.12 u-boot image took 6m 22s to load the 150MB file. Without test_unit_ready, it took 3m 15s. Without test_unit_ready and wait_ms(5) in usb_stor_BBB_transport() it took 0m 16s.
In the ext4 branch, I removed test_unit_ready and the mdelay(5) call from usb_stor_BBB_transport() function and was able to load the same file in 0m 8s.
So, removing the mdelay(5) call is the biggest improvement. Removing both is the best.
To recap:
with w/o w/o TUR TUR TUR and 5ms wait
2011.12 6:25 3:15 0:16 ext4 0:45 0:45 0:08
Note: all these time include the 3-4 seconds it takes to do the "usb start".
Coolness ! :-) Please just retest by applying those ext4 patches on top of uboot usb and see how fast it goes :)
Regards, -Steve
On Wed, Aug 15, 2012 at 10:19 AM, Jim Shimer mgi2475@motorola.com wrote:
Hi Marek,
I looked at the ext4 branch. It looks like he has the patch to remove the usb_test_unit_ready() calls which were not needed. Actually those calls are commented out on that branch: #if 0
if (usb_test_unit_ready(srb, ss)) { printf("Device NOT ready\n Request Sense returned %02X
%02X"
" %02X\n", srb->sense_buf[2], srb->sense_buf[12], srb->sense_buf[13]); return 0; }
#endif
In the u-boot-usb.git, this code is removed so at some point there will be a merge conflict.
Also the ext4 branch still has the mdelay(5) always being done in usb_stor_BBB_transport() line 696 which we found to be the largest performance killer.
Regards, Jim
On Sun, Aug 12, 2012 at 7:54 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being
called
every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its
ok
to only call during usb_stor_scan()? We're finding this speeds up
ext2load
quite a bit.
Jim, did we get anywhere on this one ? Can you try with the new ext4 code in Wolfgangs' u-boot-master/ext4 branch?
Regards, Jim
Best regards, Marek Vasut
-- *James H Shimer* Motorola Mobility T3-12-HH72 900 Chelmsford Street Lowell MA 08151 978-614-3550

Dear Jim Shimer,
Hi Marek,
I looked at the ext4 branch. It looks like he has the patch to remove the usb_test_unit_ready() calls which were not needed. Actually those calls are commented out on that branch: #if 0 if (usb_test_unit_ready(srb, ss)) { printf("Device NOT ready\n Request Sense returned %02X %02X" " %02X\n", srb->sense_buf[2], srb->sense_buf[12], srb->sense_buf[13]); return 0; } #endif
In the u-boot-usb.git, this code is removed so at some point there will be a merge conflict.
Also the ext4 branch still has the mdelay(5) always being done in usb_stor_BBB_transport() line 696 which we found to be the largest performance killer.
The ext4 branch doesn't contain a few other things indeed. Can you try applying the ext4 patches on uboot-usb and retest?
Regards, Jim
On Sun, Aug 12, 2012 at 7:54 PM, Marek Vasut marex@denx.de wrote:
Dear Jim Shimer,
While tuning ext2load, we found that usb_test_unit_ready was being called every block read. We compared the usb block storage to the scsi block storage cmd_scsi.c, and found that the scsi device was only calling its scsi_setup_test_unit_ready() during scsi_can. It appears that usb_test_unit_ready() really only needs to be called once during usb_stor_scan(), via usb_stor_get_info(). Is there a particular reason usb_test_unit_ready is called for every block read, or do you think its
ok
to only call during usb_stor_scan()? We're finding this speeds up
ext2load
quite a bit.
Jim, did we get anywhere on this one ? Can you try with the new ext4 code in Wolfgangs' u-boot-master/ext4 branch?
Regards, Jim
Best regards, Marek Vasut
Best regards, Marek Vasut
participants (5)
-
Jim Shimer
-
Marek Vasut
-
Marek Vasut
-
Steve Heckman
-
Wolfgang Denk