
On Wed, Oct 16, 2019 at 06:11:23PM +0800, Bin Meng wrote:
On Mon, Oct 14, 2019 at 7:11 PM Patrick Wildt patrick@blueri.se wrote:
On an i.MX8MQ our nvme driver doesn't completely work since we are missing a few cache flushes. One is the prp list, which is an extra buffer that we need to flush before handing it to the hardware. Also the block read/write operations needs more cache flushes on this SoC.
Signed-off-by: Patrick Wildt patrick@blueri.se
drivers/nvme/nvme.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c index 2444e0270f..69d5e3eedc 100644 --- a/drivers/nvme/nvme.c +++ b/drivers/nvme/nvme.c @@ -123,6 +123,9 @@ static int nvme_setup_prps(struct nvme_dev *dev, u64 *prp2, } *prp2 = (ulong)dev->prp_pool;
flush_dcache_range((ulong)dev->prp_pool, (ulong)dev->prp_pool +
dev->prp_entry_num * sizeof(u64));
return 0;
}
@@ -717,9 +720,10 @@ static ulong nvme_blk_rw(struct udevice *udev, lbaint_t blknr, u16 lbas = 1 << (dev->max_transfer_shift - ns->lba_shift); u64 total_lbas = blkcnt;
if (!read)
flush_dcache_range((unsigned long)buffer,
(unsigned long)buffer + total_len);
flush_dcache_range((unsigned long)buffer,
(unsigned long)buffer + total_len);
Why we need this for read?
I'm no processor engineer, but I have read (and "seen") the following on ARMs. If I'm wrong. please correct me.
Since the buffer is usually allocated cache-aligned on the stack, it is very possible that this buffer was previously still used as it's supposed to be used: as stack. Thus, the caches can still be filled, and are not yet evicted/flushed to memory. Now it is possible that between the DMA access from the device and our cache invalidation, the CPU cache writes back its contents to memory, overwriting whatever the NVMe just gave us.
invalidate_dcache_range((unsigned long)buffer,
(unsigned long)buffer + total_len); c.rw.opcode = read ? nvme_cmd_read : nvme_cmd_write; c.rw.flags = 0;
@@ -755,9 +759,8 @@ static ulong nvme_blk_rw(struct udevice *udev, lbaint_t blknr, buffer += lbas << ns->lba_shift; }
if (read)
invalidate_dcache_range((unsigned long)buffer,
(unsigned long)buffer + total_len);
invalidate_dcache_range((unsigned long)buffer,
(unsigned long)buffer + total_len);
Why we need this for write?
That's a good point. After the transaction, if it was a read then we need to invalidate it, as we might have speculatively read it. On a write, we don't care about its contents. I will test it w/o this chunk and report back.
Best regards, Patrick
return (total_len - temp_len) >> desc->log2blksz;
}
Regards, Bin