[U-Boot] [PATCH] EXYNOS: SPI: Minimise access to SPI FIFO level

Accessing SPI registers is slow, but access to the FIFO level register in particular seems to be extraordinarily expensive (I measure up to 600ns). Perhaps it is required to synchronise with the SPI byte output logic which might run at 1/8th of the 40MHz SPI speed (just a guess).
Reduce access to this register by filling up and emptying FIFOs more completely, rather than just one word each time around the inner loop.
Since the rxfifo level will now likely be much greater than what we read before we fill the txfifo, we only fill the txfifo halfway. This is because if the txfifo is empty, but the rxfifo has data in it, then writing too much data to the txfifo may overflow the rxfifo as data arrives.
This speeds up SPI flash reading from about 1MB/s to about 2MB/s on snow.
Signed-off-by: Simon Glass sjg@chromium.org
Signed-off-by: Rajeshwari Shinde rajeshwari.s@samsung.com
---
 drivers/spi/exynos_spi.c |   31 +++++++++++++++++--------------
 1 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/spi/exynos_spi.c b/drivers/spi/exynos_spi.c
index c19e227..7bbf9ce 100644
--- a/drivers/spi/exynos_spi.c
+++ b/drivers/spi/exynos_spi.c
@@ -260,33 +260,36 @@ static int spi_rx_tx(struct exynos_spi_slave *spi_slave, int todo,

 		/* Keep the fifos full/empty. */
 		spi_get_fifo_levels(regs, &rx_lvl, &tx_lvl);
-		if (tx_lvl < spi_slave->fifo_size && out_bytes) {
+		while (tx_lvl < spi_slave->fifo_size / 2 && out_bytes) {
 			temp = txp ? *txp++ : 0xff;
 			writel(temp, &regs->tx_data);
 			out_bytes--;
+			tx_lvl++;
 		}
 		if (rx_lvl > 0 && in_bytes) {
-			temp = readl(&regs->rx_data);
-			if (!rxp && !stopping) {
-				in_bytes--;
-			} else if (spi_slave->skip_preamble) {
-				if (temp == SPI_PREAMBLE_END_BYTE) {
-					spi_slave->skip_preamble = 0;
-					stopping = 0;
+			while (rx_lvl > 0 && in_bytes) {
+				temp = readl(&regs->rx_data);
+				if (!rxp && !stopping) {
+					in_bytes--;
+				} else if (spi_slave->skip_preamble) {
+					if (temp == SPI_PREAMBLE_END_BYTE) {
+						spi_slave->skip_preamble = 0;
+						stopping = 0;
+					}
+				} else {
+					*rxp++ = temp;
+					in_bytes--;
 				}
-			} else {
-				*rxp++ = temp;
-				in_bytes--;
+				toread--;
+				rx_lvl--;
 			}
-			toread--;
-		}
 		/*
 		 * We have run out of input data, but haven't read enough
 		 * bytes after the preamble yet. Read some more, and make
 		 * sure that we transmit dummy bytes too, to keep things
 		 * going.
 		 */
-		else if (in_bytes && !toread) {
+		} else if (in_bytes && !toread) {
 			assert(!out_bytes);
 			toread = out_bytes = in_bytes;
 			txp = NULL;
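
For illustration only, here is a small host-side model of why batching the FIFO accesses cuts down on level-register reads. It is a sketch under stated assumptions, not driver code: struct sim_spi, sim_get_fifo_levels(), sim_transfer(), FIFO_SIZE and BYTES_PER_POLL are all hypothetical names and values. The model assumes a loopback-style transfer where every byte shifted out produces one byte in the rx FIFO, and it charges each level read three byte times (600ns at 40MHz is 24 bit times, i.e. three bytes). It does not model rx FIFO overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values, chosen only for this model. */
enum {
	FIFO_SIZE = 64,		/* assumed FIFO depth in bytes */
	BYTES_PER_POLL = 3,	/* assumed cost of one level read, in byte times */
};

/* Simulated controller state (not the real register layout). */
struct sim_spi {
	unsigned int tx_lvl;		/* bytes waiting in the tx FIFO */
	unsigned int rx_lvl;		/* bytes waiting in the rx FIFO */
	unsigned int level_reads;	/* number of level-register reads */
};

/*
 * Model one level-register read. While the CPU waits for it, the hardware
 * shifts up to BYTES_PER_POLL bytes out of the tx FIFO, and each of them
 * produces one byte in the rx FIFO.
 */
static void sim_get_fifo_levels(struct sim_spi *spi, unsigned int *rx_lvl,
				unsigned int *tx_lvl)
{
	unsigned int i;

	spi->level_reads++;
	for (i = 0; i < BYTES_PER_POLL && spi->tx_lvl > 0; i++) {
		spi->tx_lvl--;
		spi->rx_lvl++;
	}
	*rx_lvl = spi->rx_lvl;
	*tx_lvl = spi->tx_lvl;
}

/*
 * Move 'todo' bytes through the simulated controller and return how many
 * level-register reads that took. With batched == 0, one word is written
 * and one word is read per poll; with batched != 0, the tx FIFO is filled
 * to half its depth and the rx FIFO is drained completely on every poll.
 */
static unsigned int sim_transfer(struct sim_spi *spi, unsigned int todo,
				 int batched)
{
	unsigned int out_bytes = todo, in_bytes = todo;
	unsigned int tx_limit = batched ? FIFO_SIZE / 2 : FIFO_SIZE;
	unsigned int rx_lvl, tx_lvl;

	assert(spi != NULL);
	spi->tx_lvl = 0;
	spi->rx_lvl = 0;
	spi->level_reads = 0;

	while (in_bytes) {
		sim_get_fifo_levels(spi, &rx_lvl, &tx_lvl);

		/* Fill the tx FIFO, tracking the level locally. */
		while (tx_lvl < tx_limit && out_bytes) {
			spi->tx_lvl++;
			tx_lvl++;
			out_bytes--;
			if (!batched)
				break;
		}

		/* Drain the rx FIFO, tracking the level locally. */
		while (rx_lvl > 0 && in_bytes) {
			spi->rx_lvl--;
			rx_lvl--;
			in_bytes--;
			if (!batched)
				break;
		}
	}

	return spi->level_reads;
}
```

In this model, calling sim_transfer(&spi, 256, 0) costs one level read per byte plus one, while sim_transfer(&spi, 256, 1) needs well under half as many. The exact ratio depends entirely on the assumed BYTES_PER_POLL and says nothing about the real hardware beyond the direction of the effect.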

On Fri, Mar 22, 2013 at 8:09 AM, Rajeshwari Shinde rajeshwari.s@samsung.com wrote:
> Accessing SPI registers is slow, but access to the FIFO level register in particular seems to be extraordinarily expensive (I measure up to 600ns). Perhaps it is required to synchronise with the SPI byte output logic which might run at 1/8th of the 40MHz SPI speed (just a guess).
> Reduce access to this register by filling up and emptying FIFOs more completely, rather than just one word each time around the inner loop.
> Since the rxfifo level will now likely be much greater than what we read before we fill the txfifo, we only fill the txfifo halfway. This is because if the txfifo is empty, but the rxfifo has data in it, then writing too much data to the txfifo may overflow the rxfifo as data arrives.
> This speeds up SPI flash reading from about 1MB/s to about 2MB/s on snow.
> Signed-off-by: Simon Glass sjg@chromium.org
> Signed-off-by: Rajeshwari Shinde rajeshwari.s@samsung.com

Acked-by: Simon Glass sjg@chromium.org