
On 9/14/20 5:50 PM, Stephen Warren wrote:
On 9/12/20 9:24 AM, Marek Vasut wrote:
On 9/11/20 9:43 PM, twarren@nvidia.com wrote:
From: Tom Warren <twarren@nvidia.com>
This fixes the XHCI driver on T210 boards (TX1, Nano). I was seeing that Set_Address wasn't completing, returning with a Context Parameter error. Examining the slot context, etc. showed that the correct info was there in RAM. Once I set 'dcache off' globally, it started working. This patch was created to force the TRB, etc. allocation to be in non-cached memory, which resulted in XHCI working on Nano/TX1 w/o the need for a global dcache disable. Thierry Reding pointed to a similar fix he'd done for the rtl8169 driver.
Sending this to the list for comment, as this should have affected other XHCI implementations on other SoCs. Note that Tegra X1 (T210) has a 64-byte cache line size (64-bit ARMv8), and I do see the flush_cache/inval_cache ARM code being called via xhci_flush_cache/xhci_inval_cache.
Is cache management on tegra210 broken ? I've seen the same non-cached workaround in the DWMAC ethernet driver.
I believe the issue with DWMAC and r8169 is related to the size/layout of the descriptors: the Ethernet adapter descriptor size is smaller than one cache line, and there isn't a way to tell the Ethernet HW to leave gaps between descriptors so that each one is aligned to a cache line. Consequently, it's impossible to perform a cache operation that applies to only a single descriptor, which in turn means that adjacent descriptors are potentially corrupted whenever a cache operation is performed. Disabling the cache is required in that case. That is, unless the HW supports linked lists of descriptors, so SW can lay them out at will. I don't recall whether either HW supports this, and even if one or both do, the driver doesn't currently do it, so disabling the cache is still the quickest way of making the HW work.
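To make the descriptor vs. cache line arithmetic concrete, here is a minimal standalone C sketch (not U-Boot code). Only the 64-byte line size comes from this thread; the helper names and the 16-byte descriptor size used in the usage note are assumptions for illustration, and the ring base is assumed to be cache-line aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Cache line size quoted in this thread for Tegra210. */
#define CACHE_LINE_SIZE 64u

/* Round an address down to the start of its cache line. */
static uintptr_t line_floor(uintptr_t addr)
{
	return addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
}

/* Round an address up to the next cache line boundary. */
static uintptr_t line_ceil(uintptr_t addr)
{
	return (addr + CACHE_LINE_SIZE - 1) & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
}

/*
 * Hypothetical helper: how many descriptors are covered by a flush or
 * invalidate of descriptor 'index', given that cache maintenance works
 * on whole lines. Assumes 'base' is cache-line aligned and that the
 * ring holds 'count' descriptors of 'desc_size' bytes, packed back to
 * back with no gaps.
 */
static size_t descs_touched(uintptr_t base, size_t desc_size,
			    size_t index, size_t count)
{
	uintptr_t start = line_floor(base + index * desc_size);
	uintptr_t end = line_ceil(base + (index + 1) * desc_size);
	size_t first = (start - base) / desc_size;
	/* Exclusive upper bound, rounded up to a whole descriptor. */
	size_t last = (end - base + desc_size - 1) / desc_size;

	if (last > count)
		last = count;

	return last - first;
}
```

With an assumed 16-byte descriptor, descs_touched(0x1000, 16, 1, 8) returns 4: operating on descriptor 1 also hits descriptors 0, 2 and 3, which is the corruption risk described above. With a 64-byte descriptor, descs_touched(0x1000, 64, 1, 8) returns 1, so each descriptor can be maintained on its own.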
I think you tested this patch: https://patchwork.ozlabs.org/project/uboot/patch/20200429191403.112487-1-mar...
I'd expect this issue to apply to any ARMv8 system, since IIRC ARMv8 specifies the cache line size (doesn't it?). If not, the issue will at least apply to any system whose cache line size is at least as large as Tegra210's.
For this XHCI case, there is some other problem, since the cache line size matches the XHCI descriptor size.
OK