
Hi Phil,
On Wed, 26 Jun 2013 20:25:25 +0200, Phil Sutter phil.sutter@viprinet.com wrote:
The basic idea is taken from the linux-kernel, but further optimized.
First align the buffer to 8 bytes, then use ldrd/strd to read and store in 8 byte quantities, then do the final bytes.
Tested using: 'date ; nand read.raw 0xE00000 0x0 0x10000 ; date'. Without this patch, NAND read of 132MB took 49s (~2.69MB/s). With this patch in place, reading the same amount of data was done in 27s (~4.89MB/s). So read performance is increased by ~80%!
Signed-off-by: Nico Erfurth ne@erfurth.eu
Tested-by: Phil Sutter phil.sutter@viprinet.com
Cc: Prafulla Wadaskar prafulla@marvell.com
Patch history missing.
 drivers/mtd/nand/kirkwood_nand.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
diff --git a/drivers/mtd/nand/kirkwood_nand.c b/drivers/mtd/nand/kirkwood_nand.c
index 0a99a10..85ea5d2 100644
--- a/drivers/mtd/nand/kirkwood_nand.c
+++ b/drivers/mtd/nand/kirkwood_nand.c
@@ -38,6 +38,37 @@ struct kwnandf_registers {
 static struct kwnandf_registers *nf_reg = (struct kwnandf_registers *)KW_NANDF_BASE;
+/*
+ * The basic idea is stolen from the linux kernel, but the inner loop is
+ * optimized a bit more.
+ */
+static void kw_nand_read_buf(struct mtd_info *mtd, uint8_t *buf, int len)
+{
+	struct nand_chip *chip = mtd->priv;
+	while (len && (unsigned long)buf & 7) {
+		*buf++ = readb(chip->IO_ADDR_R);
+		len--;
+	};
+	/* This loop reads and writes 64bit per round. */
+	asm volatile (
+		"1:\n"
+		" subs %0, #8\n"
+		" ldrpld r2, [%2]\n"
+		" strpld r2, [%1], #8\n"
+		" bhi 1b\n"
+		" addne %0, #8\n"
+		: "+&r" (len), "+&r" (buf)
+		: "r" (chip->IO_ADDR_R)
+		: "r2", "r3", "memory", "cc"
+	);
Are assembler instructions *really* required? IOW, can you not get enough performance simply with a cleverly written C loop?
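To make the question concrete, here is a minimal, untested-on-hardware sketch of the same three-phase copy in plain C: byte reads until the buffer is 8-byte aligned, then 8 bytes per round, then the remaining bytes. The names sim_src, sim_read8(), sim_read64() and read_buf_sketch() are hypothetical and only simulate the NAND data register on a host; in the driver the reads would go to chip->IO_ADDR_R instead. Whether the compiler actually turns the 64-bit accesses into ldrd/strd is an assumption that would have to be checked in the generated code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the NAND data register: each read hands out
 * the next bytes of a simulated page. Not part of the patch.
 */
static const uint8_t *sim_src;

static uint8_t sim_read8(void)
{
	return *sim_src++;
}

static uint64_t sim_read64(void)
{
	uint64_t v;

	memcpy(&v, sim_src, sizeof(v));
	sim_src += sizeof(v);
	return v;
}

/* Align the destination, copy 8 bytes per round, finish byte-wise. */
static void read_buf_sketch(uint8_t *buf, int len)
{
	/* Head: byte reads until buf is 8-byte aligned. */
	while (len > 0 && ((uintptr_t)buf & 7)) {
		*buf++ = sim_read8();
		len--;
	}

	/* Body: one 64-bit read and one 64-bit store per round. */
	while (len >= 8) {
		uint64_t v = sim_read64();

		memcpy(buf, &v, sizeof(v));
		buf += 8;
		len -= 8;
	}

	/* Tail: the remaining 0..7 bytes. */
	while (len-- > 0)
		*buf++ = sim_read8();
}
```

If the compiler does not emit a single 64-bit load and store for the body loop on this CPU, that would be a fair argument for keeping the inline assembly, and worth stating in the commit message.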
+	while (len--)
+		*buf++ = readb(chip->IO_ADDR_R);
+}
 /*
  * hardware specific access to control-lines/bits
  */
@@ -80,6 +111,7 @@ int board_nand_init(struct nand_chip *nand)
 	nand->ecc.mode = NAND_ECC_SOFT;
 #endif
 	nand->cmd_ctrl = kw_nand_hwcontrol;
+	nand->read_buf = kw_nand_read_buf;
 	nand->chip_delay = 40;
 	nand->select_chip = kw_nand_select_chip;
 	return 0;
Kind regards,