[U-Boot] [PATCH 0/2] OMAP/GPMC: speed up NAND read access

Hi,
I spent some time looking into boot times of AM335x platforms. One big improvement I made came with adding support for GPMC prefetch mode, which gave a speed-up of NAND reads of roughly factor 2. This is what patch #1 does.
Note that this is currently limited to read operations in 8-bit mode, but I believe it could be easily extended to 16-bit operations if anyone has hardware to test it on. Using the engine for write mode speed improvements should also be doable, but I didn't spend time on it yet. That can be added as a separate patch later.
Patch #2 decreases the GPMC memory map window size from 256MiB to 16MiB. Admittedly, I'm not quite sure about the reason, but a read from offset 0 in this memory area will freeze U-Boot instantly if the size is configured to 256MiB. As contents of the FIFO are accessed in a pseudo-mode from offset 0 anyway, it doesn't really matter.
What I also did to further speed up my boot was to tweak the GPMC parameters for the NAND chip on our boards, but that's not part of this patch set, and probably deserves a little more cleanup.
Test results and feedback very welcome.
Thanks, Daniel
Daniel Mack (2): mtd: OMAP: Enable GPMC prefetch mode ARM: omap-common: gpmp: decrease memory region size to 16MiB
arch/arm/cpu/armv7/omap-common/mem-common.c | 2 +- doc/README.nand | 5 ++ drivers/mtd/nand/omap_gpmc.c | 115 +++++++++++++++++++++++++++- include/linux/mtd/omap_gpmc.h | 6 +- 4 files changed, 124 insertions(+), 4 deletions(-)

Enable GPMC's prefetch feature for NAND access. This speeds up NAND read access a lot by pre-fetching contents in the background and reading them through the FIFO address.
The current implementation has two limitations:
a) it only works for 16bit b) it only supports read access
Both is easily fixable by someone who has hardware to implement it.
Note that U-Boot code uses non word-aligned buffers to read data into, and request read lengths that are not multiples of 4, so both partial buffers (head and tail) have to be addressed.
Tested on AM335x hardware.
Signed-off-by: Daniel Mack zonque@gmail.com --- doc/README.nand | 5 ++ drivers/mtd/nand/omap_gpmc.c | 115 +++++++++++++++++++++++++++++++++++++++++- include/linux/mtd/omap_gpmc.h | 6 ++- 3 files changed, 123 insertions(+), 3 deletions(-)
diff --git a/doc/README.nand b/doc/README.nand index 70cf768..6459f2a 100644 --- a/doc/README.nand +++ b/doc/README.nand @@ -292,6 +292,11 @@ Platform specific options Thus BCH16 can be supported on 4K page NAND.
+ CONFIG_NAND_OMAP_PREFETCH + On OMAP platforms that use the GPMC controller (CONFIG_NAND_OMAP_GPMC), + this options enables the code that uses the prefetch mode to speed up + read operations. + NOTE: =====
diff --git a/drivers/mtd/nand/omap_gpmc.c b/drivers/mtd/nand/omap_gpmc.c index 1acf06b..e2d57bd 100644 --- a/drivers/mtd/nand/omap_gpmc.c +++ b/drivers/mtd/nand/omap_gpmc.c @@ -446,6 +446,113 @@ static int omap_correct_data_bch(struct mtd_info *mtd, uint8_t *dat, return (err) ? err : error_count; }
+#ifdef CONFIG_NAND_OMAP_GPMC_PREFETCH + +#define PREFETCH_CONFIG1_CS_SHIFT 24 +#define PREFETCH_FIFOTHRESHOLD_MAX 0x40 +#define PREFETCH_FIFOTHRESHOLD(val) ((val) << 8) +#define PREFETCH_STATUS_COUNT(val) (val & 0x00003fff) +#define PREFETCH_STATUS_FIFO_CNT(val) ((val >> 24) & 0x7F) +#define ENABLE_PREFETCH (1 << 7) + +/** + * omap_prefetch_enable - configures and starts prefetch transfer + * @fifo_th: fifo threshold to be used for read/ write + * @count: number of bytes to be transferred + * @is_write: prefetch read(0) or write post(1) mode + */ +static int omap_prefetch_enable(int fifo_th, unsigned int count, int is_write) +{ + uint32_t val; + + if (fifo_th > PREFETCH_FIFOTHRESHOLD_MAX) + return -EINVAL; + + if (readl(&gpmc_cfg->prefetch_control)) + return -EBUSY; + + /* Set the amount of bytes to be prefetched */ + writel(count, &gpmc_cfg->prefetch_config2); + + val = (cs << PREFETCH_CONFIG1_CS_SHIFT) | (is_write & 1) | + PREFETCH_FIFOTHRESHOLD(fifo_th) | ENABLE_PREFETCH; + writel(val, &gpmc_cfg->prefetch_config1); + + /* Start the prefetch engine */ + writel(1, &gpmc_cfg->prefetch_control); + + return 0; +} + +/** + * omap_prefetch_reset - disables and stops the prefetch engine + */ +static void omap_prefetch_reset(void) +{ + writel(0, &gpmc_cfg->prefetch_control); + writel(0, &gpmc_cfg->prefetch_config1); +} + +static int __read_prefetch_aligned(struct nand_chip *chip, uint32_t *buf, int len) +{ + int ret; + uint32_t cnt; + + ret = omap_prefetch_enable(PREFETCH_FIFOTHRESHOLD_MAX, len, 0); + if (ret < 0) + return ret; + + do { + int i; + + cnt = readl(&gpmc_cfg->prefetch_status); + cnt = PREFETCH_STATUS_FIFO_CNT(cnt); + + for (i = 0; i < cnt / 4; i++) { + *buf++ = readl(CONFIG_SYS_NAND_BASE); + len -= 4; + } + } while (len); + + omap_prefetch_reset(); + + return 0; +} + +static void omap_nand_read_prefetch8(struct mtd_info *mtd, uint8_t *buf, int len) +{ + int ret; + uint32_t head, tail; + struct nand_chip *chip = mtd->priv; + + /* + * If the destination buffer is unaligned, start with reading + * the overlap byte-wise. + */ + head = ((uint32_t) buf) % 4; + if (head) { + nand_read_buf(mtd, buf, head); + buf += head; + len -= head; + } + + /* + * Only transfer multiples of 4 bytes in a pre-fetched fashion. + * If there's a residue, care for it byte-wise afterwards. + */ + tail = len % 4; + + ret = __read_prefetch_aligned(chip, (uint32_t *) buf, len - tail); + if (ret < 0) { + /* fallback in case the prefetch engine is busy */ + nand_read_buf(mtd, buf, len); + } else if (tail) { + buf += len - tail; + nand_read_buf(mtd, buf, tail); + } +} +#endif /* CONFIG_NAND_OMAP_GPMC_PREFETCH */ + /** * omap_read_page_bch - hardware ecc based page read function * @mtd: mtd info structure @@ -883,11 +990,15 @@ int board_nand_init(struct nand_chip *nand) if (err) return err;
-#ifdef CONFIG_SPL_BUILD +#ifdef CONFIG_NAND_OMAP_GPMC_PREFETCH + /* TODO: Implement for 16-bit bus width */ if (nand->options & NAND_BUSWIDTH_16) nand->read_buf = nand_read_buf16; else - nand->read_buf = nand_read_buf; + nand->read_buf = omap_nand_read_prefetch8; +#endif + +#ifdef CONFIG_SPL_BUILD nand->dev_ready = omap_spl_dev_ready; #endif
diff --git a/include/linux/mtd/omap_gpmc.h b/include/linux/mtd/omap_gpmc.h index 9a86582..6cbae45 100644 --- a/include/linux/mtd/omap_gpmc.h +++ b/include/linux/mtd/omap_gpmc.h @@ -66,7 +66,11 @@ struct gpmc { u32 status; /* 0x54 */ u8 res5[0x8]; /* 0x58 */ struct gpmc_cs cs[8]; /* 0x60, 0x90, .. */ - u8 res6[0x14]; /* 0x1E0 */ + u32 prefetch_config1; /* 0x1E0 */ + u32 prefetch_config2; /* 0x1E4 */ + u32 res6; /* 0x1E8 */ + u32 prefetch_control; /* 0x1EC */ + u32 prefetch_status; /* 0x1F0 */ u32 ecc_config; /* 0x1F4 */ u32 ecc_control; /* 0x1F8 */ u32 ecc_size_config; /* 0x1FC */

That memory area is not used except for the first location, so it doesn't matter. However, with the length configured to 256MiB, U-Boot crased when accessing contents of the map.
Signed-off-by: Daniel Mack zonque@gmail.com --- arch/arm/cpu/armv7/omap-common/mem-common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/cpu/armv7/omap-common/mem-common.c b/arch/arm/cpu/armv7/omap-common/mem-common.c index 944ef84..3595806 100644 --- a/arch/arm/cpu/armv7/omap-common/mem-common.c +++ b/arch/arm/cpu/armv7/omap-common/mem-common.c @@ -99,7 +99,7 @@ void gpmc_init(void) M_NAND_GPMC_CONFIG6, 0 }; - u32 size = GPMC_SIZE_256M; + u32 size = GPMC_SIZE_16M; u32 base = CONFIG_SYS_NAND_BASE; #elif defined(CONFIG_CMD_ONENAND) const u32 gpmc_regs[GPMC_MAX_REG] = { ONENAND_GPMC_CONFIG1,
participants (1)
-
Daniel Mack