
On 20.07.2012 17:35, Benoît Thébaudeau wrote:
On Friday 20 July 2012 17:15:13 Stefan Herbrechtsmeier wrote:
On 20.07.2012 17:03, Benoît Thébaudeau wrote:
On Friday 20 July 2012 16:51:33 Stefan Herbrechtsmeier wrote:
On 20.07.2012 15:56, Benoît Thébaudeau wrote:
Dear Marek Vasut,
On Friday 20 July 2012 15:44:01 Marek Vasut wrote:
On Friday 20 July 2012 13:37:37 Stefan Herbrechtsmeier wrote:
On 20.07.2012 13:26, Benoît Thébaudeau wrote:

+		int xfr_bytes = min(left_length,
+				    (QT_BUFFER_CNT * 4096 -
+				     ((uint32_t)buf_ptr & 4095)) &
+				    ~4095);

Why do you align the length to 4096?

It's to guarantee that each transfer length is a multiple of the max packet length. Otherwise, early short packets are issued, which breaks the transfer and results in time-out error messages.

Early short packets? What do you mean?
During a USB transfer, all packets must have the max packet length of the pipe/endpoint, except the final one, which can be a short packet. Without the alignment I apply to xfr_bytes, short packets can occur within a transfer, because the hardware starts a new packet for each new queued qTD it handles.
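To illustrate the problem (the numbers here are made up for the example, they are not from the patch): a qTD in the middle of a transfer that is handed a length that is not a multiple of the max packet length ends in a short packet, which the device takes as the end of the whole transfer.

	/* Illustration only (values assumed): a mid-transfer qTD of 16000
	 * bytes on a high-speed bulk pipe with wMaxPacketSize = 512. */
	uint32_t qtd_len      = 16000;
	uint32_t max_pkt      = 512;
	uint32_t full_packets = qtd_len / max_pkt;	/* 31 full packets */
	uint32_t short_tail   = qtd_len % max_pkt;	/* 128-byte short packet
							 * -> early end of transfer */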
But if I am right, the max packet length is 512 for bulk and 1024 for interrupt transfers.
There are indeed different max packet lengths for different transfer types, but it does not matter, since the chosen alignment (4096) is a multiple of all these possible max packet lengths (4096 = 8 * 512 = 4 * 1024).
But that way you limit the transfer to 4 qt_buffers for unaligned transfers.
Not exactly. The 5 qt_buffers are still used for page-unaligned buffers, but they carry only 4 full pages' worth of unaligned data, since that data spans 5 aligned pages.
Sorry, I meant 4 full pages of unaligned data.
For page-aligned buffers, the 5 qt_buffers result in 5 full pages of aligned data.
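A quick sketch of the two cases (assuming QT_BUFFER_CNT = 5 and 4 KiB pages; the unaligned offset is just an example value):

	uint32_t page_off      = (uint32_t)buf_ptr & 4095;
	uint32_t bytes_per_qtd = (QT_BUFFER_CNT * 4096 - page_off) & ~4095;
	/* page_off == 0     (page-aligned) -> 5 * 4096 = 20480 bytes per qTD */
	/* page_off == 0x200 (unaligned)    -> 4 * 4096 = 16384 bytes per qTD */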
Sure.
The unaligned case could be improved a little to always use as many packets as possible per qTD, but that would over-complicate things for a negligible speed and memory gain.
In my use case (a fragmented file on USB storage) the gain would be nearly 20%. The reason is that the data are block-aligned (512) and could be aligned to 4096 with the first transfer (5 qt_buffers).
My suggestion would be to truncate xfr_bytes to a multiple of the maximum wMaxPacketSize (1024), and for qtd_count to use:
	if ((uint32_t)buffer & 1023)	/* wMaxPacketSize unaligned */
		qtd_count += DIV_ROUND_UP(((uint32_t)buffer & 4095) + length,
					  (QT_BUFFER_CNT - 1) * 4096);
	else				/* wMaxPacketSize aligned */
		qtd_count += DIV_ROUND_UP(((uint32_t)buffer & 4095) + length,
					  QT_BUFFER_CNT * 4096);
This allows 50% of the unaligned block (512) data to be transferred with the minimum number of qTDs.
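For illustration, the corresponding xfr_bytes change could look roughly like this (just a sketch of the suggestion above, assuming 1024 is the largest wMaxPacketSize to handle; not tested):

	int xfr_bytes = min(left_length,
			    (QT_BUFFER_CNT * 4096 -
			     ((uint32_t)buf_ptr & 4095)) &
			    ~1023);	/* align to max wMaxPacketSize */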