
On Sat, Jun 27, 2009 at 04:32:35PM +0900, Kyungmin Park wrote:
+/**
+ * onenand_read_burst
+ * 16 Burst read: performance is improved up to 40%.
+ */
+static void onenand_read_burst(void *dest, const void *src, size_t len)
+{
+	int count;
+	if (len % 16 != 0)
+		return;
+	count = len / 16;
+	__asm__ __volatile__(
+	"	stmdb r13!, {r0-r3,r9-r12}\n"
+	"	mov r2, %0\n"
+	"1:\n"
+	"	ldmia r1, {r9-r12}\n"
+	"	stmia r0!, {r9-r12}\n"
+	"	subs r2, r2, #0x1\n"
+	"	bne 1b\n"
+	"	ldmia r13!, {r0-r3,r9-r12}\n"::"r" (count));
+}
What is this doing that we couldn't generically make memcpy do?
Even though it looks somewhat strange, it gives some performance gain, but it is not general.
I guess that's because you're reading from the same 16 bytes each loop iteration. Perhaps repeated 16-byte calls to memcpy could be used, combined with a suitably optimized memcpy (possibly with inline asm in the arch headers for certain constant sizes).
Also, relying on r0/r1 to still contain dest/src after the compiler has had a chance to mess with things is dangerous. Better to use the asm constraints properly. I also don't see why you need to save r3.
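For illustration only, here is a rough sketch of both suggestions, not code from the patch. The name read_burst_sketch, its return codes, and the choice of r4-r7 as scratch registers are assumptions made for the example. On ARM (non-Thumb) builds it passes dest, src and count through asm operands and lists the scratch registers as clobbers, so nothing has to be saved or restored by hand. Everywhere else it falls back to repeated 16-byte memcpy calls from the same source address. The ARM path has not been tested on hardware.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (name and return codes are assumptions, not from
 * the patch): fill dest with len bytes by re-reading the same 16-byte
 * window at src. Returns 0 on success, -1 if len is not a multiple of 16.
 */
static int read_burst_sketch(void *dest, const void *src, size_t len)
{
    size_t count = len / 16;

    if (len % 16 != 0)
        return -1;
    if (count == 0)
        return 0;   /* the asm loop below assumes at least one pass */

#if defined(__arm__) && !defined(__thumb__)
    /*
     * dest and count are read-write operands, src is an input, so the
     * compiler picks their registers. r4-r7 are declared as clobbers
     * instead of being pushed and popped around the loop.
     */
    __asm__ __volatile__(
        "1:\n"
        "   ldmia %[s], {r4-r7}\n"      /* load 16 bytes, src not advanced */
        "   stmia %[d]!, {r4-r7}\n"     /* store 16 bytes, dest advanced   */
        "   subs  %[n], %[n], #1\n"
        "   bne   1b\n"
        : [d] "+r" (dest), [n] "+r" (count)
        : [s] "r" (src)
        : "r4", "r5", "r6", "r7", "cc", "memory");
#else
    /*
     * Portable stand-in using plain memory. A real device window would
     * need the platform's I/O accessors, since the compiler may assume
     * ordinary memory at src does not change between iterations.
     */
    unsigned char *d = dest;

    while (count--) {
        memcpy(d, src, 16);
        d += 16;
    }
#endif
    return 0;
}
```

Returning an error for a bad length, rather than silently returning as the patch does, is also just a choice made for this sketch.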
Is there any chance that this driver could be applicable to something that isn't ARM? Is this programming interface part of a host controller, or is it embedded in the OneNAND chip?
-Scott