[RFC PATCH] nvme: Always invalidate whole cqes[] array

At the moment nvme_read_completion_status() tries to invidate a single member of the cqes[] array, which is shady as just a single entry is not cache line aligned. The structure is dictated by hardware, and with 16 bytes is smaller than any cache line we usually deal with. Also multiple entries need to be consecutive in memory, so we can't pad them to cover a whole cache line.
As a consequence we can only always invalidate all of them - U-Boot just uses two of them anyway. This is fine, as they are only ever read by the CPU (apart from the initial zeroing), so they can't become dirty.
Make this obvious by always invalidating the whole array, regardless of the entry number we are about to read. Also blow up the allocation size to cover whole cache lines, to avoid other heap allocations to sneak in.
Signed-off-by: Andre Przywara andre.przywara@arm.com --- Hi,
this is just compile tested, and should fix the only questionable cache invalidate call in this driver. Please verify if this fixes any issues!
Cheers, Andre
drivers/nvme/nvme.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c index 5d6331ad346..c9efeff4bc9 100644 --- a/drivers/nvme/nvme.c +++ b/drivers/nvme/nvme.c @@ -22,6 +22,8 @@ #define NVME_AQ_DEPTH 2 #define NVME_SQ_SIZE(depth) (depth * sizeof(struct nvme_command)) #define NVME_CQ_SIZE(depth) (depth * sizeof(struct nvme_completion)) +#define NVME_CQ_ALLOCATION ALIGN(NVME_CQ_SIZE(NVME_Q_DEPTH), \ + ARCH_DMA_MINALIGN) #define ADMIN_TIMEOUT 60 #define IO_TIMEOUT 30 #define MAX_PRP_POOL 512 @@ -144,8 +146,14 @@ static __le16 nvme_get_cmd_id(void)
static u16 nvme_read_completion_status(struct nvme_queue *nvmeq, u16 index) { - u64 start = (ulong)&nvmeq->cqes[index]; - u64 stop = start + sizeof(struct nvme_completion); + /* + * Single CQ entries are always smaller than a cache line, so we + * can't invalidate them individually. However CQ entries are + * read only by the CPU, so it's safe to always invalidate all of them, + * as the cache line should never become dirty. + */ + ulong start = (ulong)&nvmeq->cqes[0]; + ulong stop = start + NVME_CQ_ALLOCATION;
invalidate_dcache_range(start, stop);
@@ -241,7 +249,7 @@ static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev, return NULL; memset(nvmeq, 0, sizeof(*nvmeq));
- nvmeq->cqes = (void *)memalign(4096, NVME_CQ_SIZE(depth)); + nvmeq->cqes = (void *)memalign(4096, NVME_CQ_ALLOCATION); if (!nvmeq->cqes) goto free_nvmeq; memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(depth)); @@ -339,7 +347,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid) nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride]; memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(nvmeq->q_depth)); flush_dcache_range((ulong)nvmeq->cqes, - (ulong)nvmeq->cqes + NVME_CQ_SIZE(nvmeq->q_depth)); + (ulong)nvmeq->cqes + NVME_CQ_ALLOCATION); dev->online_queues++; }

On Mon, Feb 8, 2021 at 9:32 PM Andre Przywara andre.przywara@arm.com wrote:
At the moment nvme_read_completion_status() tries to invidate a single
typo: invalidate
member of the cqes[] array, which is shady as just a single entry is not cache line aligned. The structure is dictated by hardware, and with 16 bytes is smaller than any cache line we usually deal with. Also multiple entries need to be consecutive in memory, so we can't pad them to cover a whole cache line.
As a consequence we can only always invalidate all of them - U-Boot just uses two of them anyway. This is fine, as they are only ever read by the CPU (apart from the initial zeroing), so they can't become dirty.
Make this obvious by always invalidating the whole array, regardless of the entry number we are about to read. Also blow up the allocation size to cover whole cache lines, to avoid other heap allocations to sneak in.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Hi,
this is just compile tested, and should fix the only questionable cache invalidate call in this driver. Please verify if this fixes any issues!
Cheers, Andre
drivers/nvme/nvme.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
Reviewed-by: Bin Meng bmeng.cn@gmail.com

Hi Jagan
On Tue, Feb 9, 2021 at 12:28 PM Bin Meng bmeng.cn@gmail.com wrote:
On Mon, Feb 8, 2021 at 9:32 PM Andre Przywara andre.przywara@arm.com wrote:
At the moment nvme_read_completion_status() tries to invidate a single
typo: invalidate
member of the cqes[] array, which is shady as just a single entry is not cache line aligned. The structure is dictated by hardware, and with 16 bytes is smaller than any cache line we usually deal with. Also multiple entries need to be consecutive in memory, so we can't pad them to cover a whole cache line.
As a consequence we can only always invalidate all of them - U-Boot just uses two of them anyway. This is fine, as they are only ever read by the CPU (apart from the initial zeroing), so they can't become dirty.
Make this obvious by always invalidating the whole array, regardless of the entry number we are about to read. Also blow up the allocation size to cover whole cache lines, to avoid other heap allocations to sneak in.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Hi,
this is just compile tested, and should fix the only questionable cache invalidate call in this driver. Please verify if this fixes any issues!
Cheers, Andre
drivers/nvme/nvme.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
Reviewed-by: Bin Meng bmeng.cn@gmail.com
Please test again. I don't have the hardware Reviewed-by: Michael Trimarchi michael@amarulasolutions.com

Hi Suniel,
On Mon, Feb 8, 2021 at 7:02 PM Andre Przywara andre.przywara@arm.com wrote:
At the moment nvme_read_completion_status() tries to invidate a single member of the cqes[] array, which is shady as just a single entry is not cache line aligned. The structure is dictated by hardware, and with 16 bytes is smaller than any cache line we usually deal with. Also multiple entries need to be consecutive in memory, so we can't pad them to cover a whole cache line.
As a consequence we can only always invalidate all of them - U-Boot just uses two of them anyway. This is fine, as they are only ever read by the CPU (apart from the initial zeroing), so they can't become dirty.
Make this obvious by always invalidating the whole array, regardless of the entry number we are about to read. Also blow up the allocation size to cover whole cache lines, to avoid other heap allocations to sneak in.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Hi,
this is just compile tested, and should fix the only questionable cache invalidate call in this driver. Please verify if this fixes any issues!
Can you test this on RK3399?
Jagan.

On Fri, Feb 26, 2021 at 8:12 PM Jagan Teki jagan@amarulasolutions.com wrote:
Hi Suniel,
On Mon, Feb 8, 2021 at 7:02 PM Andre Przywara andre.przywara@arm.com wrote:
At the moment nvme_read_completion_status() tries to invidate a single member of the cqes[] array, which is shady as just a single entry is not cache line aligned. The structure is dictated by hardware, and with 16 bytes is smaller than any cache line we usually deal with. Also multiple entries need to be consecutive in memory, so we can't pad them to cover a whole cache line.
As a consequence we can only always invalidate all of them - U-Boot just uses two of them anyway. This is fine, as they are only ever read by the CPU (apart from the initial zeroing), so they can't become dirty.
Make this obvious by always invalidating the whole array, regardless of the entry number we are about to read. Also blow up the allocation size to cover whole cache lines, to avoid other heap allocations to sneak in.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Hi,
this is just compile tested, and should fix the only questionable cache invalidate call in this driver. Please verify if this fixes any issues!
Can you test this on RK3399?
Hi Jagan/All
Here are the test results after applying this patch on mainline. The patch has been tested on roc-rk3399-pc an RK3399 based SBC. It looks fine, below is the log:
U-Boot TPL 2021.04-rc2-00169-g958c213e3f (Feb 26 2021 - 21:38:32) Trying to boot from BOOTROM Returning to boot ROM...
U-Boot SPL 2021.04-rc2-00169-g958c213e3f (Feb 26 2021 - 21:38:32 +0530) Loading Environment from SPIFlash... *** Warning - spi_flash_probe_bus_cs() failed, using default environment
Trying to boot from MMC1
U-Boot 2021.04-rc2-00169-g958c213e3f (Feb 26 2021 - 21:38:32 +0530)
SoC: Rockchip rk3399 Reset cause: POR Model: Firefly ROC-RK3399-PC Mezzanine Board DRAM: 3.9 GiB PMIC: RK808 MMC: mmc@fe310000: 2, mmc@fe320000: 1, sdhci@fe330000: 0 Loading Environment from SPIFlash... jedec_spi_nor flash@0: unrecognized JEDEC id bytes: ff, ff, ff *** Warning - spi_flash_probe_bus_cs() failed, using default environment
In: serial Out: serial Err: serial Model: Firefly ROC-RK3399-PC Mezzanine Board Net: Error: ethernet@fe300000 address not set. No ethernet found.
Hit any key to stop autoboot: 0 => => => pci Scanning PCI devices on bus 0 BusDevFun VendorId DeviceId Device Class Sub-Class _____________________________________________________________ 00.00.00 0x1d87 0x0100 Bridge device 0x04
=> nvme scan
=> nvme info Device 0: Vendor: 0x15b7 Rev: 21110001 Prod: 2031C2440521 Type: Hard Disk Capacity: 244198.3 MB = 238.4 GB (500118192 x 512) => nvme device
IDE device 0: Vendor: 0x15b7 Rev: 21110001 Prod: 2031C2440521 Type: Hard Disk Capacity: 244198.3 MB = 238.4 GB (500118192 x 512)
Tested-by: Suniel Mahesh sunil@amarulasolutions.com
Jagan.

On 08/02/2021 14:31, Andre Przywara wrote:
At the moment nvme_read_completion_status() tries to invidate a single member of the cqes[] array, which is shady as just a single entry is not cache line aligned. The structure is dictated by hardware, and with 16 bytes is smaller than any cache line we usually deal with. Also multiple entries need to be consecutive in memory, so we can't pad them to cover a whole cache line.
As a consequence we can only always invalidate all of them - U-Boot just uses two of them anyway. This is fine, as they are only ever read by the CPU (apart from the initial zeroing), so they can't become dirty.
Make this obvious by always invalidating the whole array, regardless of the entry number we are about to read. Also blow up the allocation size to cover whole cache lines, to avoid other heap allocations to sneak in.
Signed-off-by: Andre Przywara andre.przywara@arm.com
Hi,
this is just compile tested, and should fix the only questionable cache invalidate call in this driver. Please verify if this fixes any issues!
Cheers, Andre
drivers/nvme/nvme.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c index 5d6331ad346..c9efeff4bc9 100644 --- a/drivers/nvme/nvme.c +++ b/drivers/nvme/nvme.c @@ -22,6 +22,8 @@ #define NVME_AQ_DEPTH 2 #define NVME_SQ_SIZE(depth) (depth * sizeof(struct nvme_command)) #define NVME_CQ_SIZE(depth) (depth * sizeof(struct nvme_completion)) +#define NVME_CQ_ALLOCATION ALIGN(NVME_CQ_SIZE(NVME_Q_DEPTH), \
ARCH_DMA_MINALIGN)
#define ADMIN_TIMEOUT 60 #define IO_TIMEOUT 30 #define MAX_PRP_POOL 512 @@ -144,8 +146,14 @@ static __le16 nvme_get_cmd_id(void)
static u16 nvme_read_completion_status(struct nvme_queue *nvmeq, u16 index) {
- u64 start = (ulong)&nvmeq->cqes[index];
- u64 stop = start + sizeof(struct nvme_completion);
/*
* Single CQ entries are always smaller than a cache line, so we
* can't invalidate them individually. However CQ entries are
* read only by the CPU, so it's safe to always invalidate all of them,
* as the cache line should never become dirty.
*/
ulong start = (ulong)&nvmeq->cqes[0];
ulong stop = start + NVME_CQ_ALLOCATION;
invalidate_dcache_range(start, stop);
@@ -241,7 +249,7 @@ static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev, return NULL; memset(nvmeq, 0, sizeof(*nvmeq));
- nvmeq->cqes = (void *)memalign(4096, NVME_CQ_SIZE(depth));
- nvmeq->cqes = (void *)memalign(4096, NVME_CQ_ALLOCATION); if (!nvmeq->cqes) goto free_nvmeq; memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(depth));
@@ -339,7 +347,7 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid) nvmeq->q_db = &dev->dbs[qid * 2 * dev->db_stride]; memset((void *)nvmeq->cqes, 0, NVME_CQ_SIZE(nvmeq->q_depth)); flush_dcache_range((ulong)nvmeq->cqes,
(ulong)nvmeq->cqes + NVME_CQ_SIZE(nvmeq->q_depth));
dev->online_queues++;(ulong)nvmeq->cqes + NVME_CQ_ALLOCATION);
}
On Amlogic A311D, fixes timeout on nvme scan:
Tested-by: Neil Armstrong narmstrong@baylibre.com

On Mon, Feb 08, 2021 at 01:31:54PM +0000, Andre Przywara wrote:
At the moment nvme_read_completion_status() tries to invidate a single member of the cqes[] array, which is shady as just a single entry is not cache line aligned. The structure is dictated by hardware, and with 16 bytes is smaller than any cache line we usually deal with. Also multiple entries need to be consecutive in memory, so we can't pad them to cover a whole cache line.
As a consequence we can only always invalidate all of them - U-Boot just uses two of them anyway. This is fine, as they are only ever read by the CPU (apart from the initial zeroing), so they can't become dirty.
Make this obvious by always invalidating the whole array, regardless of the entry number we are about to read. Also blow up the allocation size to cover whole cache lines, to avoid other heap allocations to sneak in.
Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Bin Meng bmeng.cn@gmail.com Reviewed-by: Michael Trimarchi michael@amarulasolutions.com Tested-by: Neil Armstrong narmstrong@baylibre.com
Applied to u-boot/master, thanks!
participants (7)
-
Andre Przywara
-
Bin Meng
-
Jagan Teki
-
Michael Nazzareno Trimarchi
-
Neil Armstrong
-
Suniel Mahesh
-
Tom Rini