
On 09/06/2020 19:47, Michael Nazzareno Trimarchi wrote:
Hi,
that looks much better now, thanks! Some suggestion below.
On Tue, Jun 9, 2020 at 6:13 AM Bin Meng bmeng.cn@gmail.com wrote:
Hi André,
On Mon, Jun 8, 2020 at 9:52 PM André Przywara andre.przywara@arm.com wrote:
On 07/06/2020 12:22, Jagan Teki wrote:
Hi,
(CC: ing Mark)
Without looking to deep, I think invalidating the cache might be the right thing to do, but the rationale or at least the wording of it seems somehow flawed:
Some architecture like ARM Cortex A53, A72 would need
Please don't mix the term "architecture" with actual implementations. And the problem is more general, I would say. This *might* be a case here where this is due to speculation, and then I would expect it to only show on an A72, for instance. I guess this is about NVMe on RK3399? Does U-Boot run on an A53 or an A72 core?
to invalidate dcache to sync the cache with the memory contents before flushing the cache to memory.
That is somewhat confusing. What does "sync" and "flush" mean here? AFAIK only "clean" and "invalidate" are clearly defined terms. The command buffer is cleaned to DRAM in nvme_submit_cmd(), so this is purely about the buffer for the *return* data, in which case I would expect this being a pure invalidation problem.
This issue looks like the case described on slide 30 and 31 here: https://elinux.org/images/6/6d/Rutland2.pdf
Thanks for the slides. This is really helpful.
The NVME here submitting the admin command using dma_addr to the memory without prior cache invalidation. This causing dma_addr is pointing to an invalid location in the memory
This does not sound right: dma_addr is always pointing to the right location in memory. The actual reason this fixes something might be that (some) cache lines of the DMA buffer were dirty and were evicted *before* the existing invalidation, but *after* the DMA transfer. That sounds unlikely in an U-Boot context, though, but would match the case described in Mark's slides.
So there might be more to this. Are we missing a barrier here? Should we use coherent buffers (memory mapped as un-cached) for the return DMA in the first place (but I don't think U-Boot provides those)?
and found the sane nvme_ctrl result.
Below example shows the nvme disk scan result improper result
=> nvme scan nvme_get_info_from_identify: nn = 544502629, vwc = 100, sn = dev_0T, mn = `�\�, fr = t_part, mdts = 105
So, invalidating the cache before submitting the admin command makes the dma_addr points to a valid location in the memory.
If this shows up already, I think we should address all the comments in the driver where it says we should do cache maintenance. And it makes me wonder how this actually works today ...
I believe the driver was only validated on x86 and NXP's PowerPC series where cache coherence is ensured between the DMA master and the system memory. I suspect it was never validated on ARM hence this patch.
I agree we should address all the comments in the driver about cache maintenance.
What about this description:
nvme: Invalidate dcache before submitting admin cmd
This patch tries to avoid eviction of dirty lines during DMA transfer. The code right now executes the following step:
- allocate the buffer
- start a DMA operation using the non-coherent DMA buffer
- invalidate cache lines associated with the buffer
- read the buffer
This can lead to reading back not valid information. CPU can still read stale data. In order to avoid the eviction of dirty lines, a new invalidation is required before starting the DMA operation. The patch just adds an invalidation before submitting the DMA command.
As I think this case is not obvious, I would make it more clear what actually happens here, for instance (replace your paragraph above):
====================== This can lead to reading back not valid information, because the cache controller could evict dirty cache lines belonging to the buffer *after* the DMA operation has started to fill the DRAM. In order to avoid this, a new invalidation is required *before* starting the DMA operation. The patch just adds an invalidation before submitting the DMA command. ======================
And your point about NXP PCIe and x86 being cache coherent is a good one: Actually the ARM SBSA specification demands the same, so systems complying with this (servers, typically) would probably also work already. But those smartphone-class SoCs are probably not coherent.
It seems like drivers for PCI devices in U-Boot seem to assume some properties based on the host controllers and systems they have been tested on, as I found other issues lately. Some drivers do cache maintenance, others don't at all.
I will try to test NVMe on some other ARM platforms as soon as I can get a hand on one.
Many thanks! Andre
The example below shows the nvme disk scan result without the following patch
=> nvme scan nvme_get_info_from_identify: nn = 544502629, vwc = 100, sn = dev_0T, mn = `�\�, fr = t_part, mdts = 105
So, invalidating the cache before submitting the admin command, fix the cpu read.
Michael
Regards, Bin
-- | Michael Nazzareno Trimarchi Amarula Solutions BV | | COO - Founder Cruquiuskade 47 | | +31(0)851119172 Amsterdam 1018 AM NL | | [`as] http://www.amarulasolutions.com |