
Booting off jffs2 is now a valid configuration. The failure was that jffs2_1pass.c was doing an insertion sort for each node. That's an O(N^2) operation, and N is not the current file count; it's every version of every filename that has ever existed on your flash.
I made a pathological test case on my Linux desktop (loop, block2mtd, jffs2 mount) by running bonnie on it. I rapidly generated 72000 dentries and 61000 inodes to go through. On a 2.4 GHz Athlon X2, it took over 13 minutes to load the filesystem with stock jffs2_1pass.c.
I changed it from an insertion sort to a list-based mergesort (no extra heap requirements) and it read the filesystem in 650ms.
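For illustration, here is a minimal sketch of a list-based mergesort that needs no heap. It is not the code from the patch: struct node, its key field, and the function names are made up for the example, and the real jffs2_1pass.c list nodes look different. This version is recursive, so it uses about log2(N) stack frames (roughly 17 for 72000 entries); a bottom-up variant would avoid even that.

```c
#include <stddef.h>

/* Hypothetical node type, for illustration only. */
struct node {
    struct node *next;
    unsigned int key;
};

/* Merge two already-sorted lists. Ties are taken from 'a' first,
 * which keeps the sort stable. */
static struct node *merge(struct node *a, struct node *b)
{
    struct node head;           /* dummy head, lives on the stack */
    struct node *tail = &head;

    head.next = NULL;
    while (a && b) {
        if (a->key <= b->key) {
            tail->next = a;
            a = a->next;
        } else {
            tail->next = b;
            b = b->next;
        }
        tail = tail->next;
    }
    tail->next = a ? a : b;     /* append whatever is left */
    return head.next;
}

/* Sort a singly linked list in O(N log N) by relinking the
 * existing nodes; nothing is allocated. */
static struct node *list_sort(struct node *list)
{
    struct node *slow, *fast, *second;

    if (!list || !list->next)
        return list;

    /* Find the middle with a slow and a fast pointer. */
    slow = list;
    fast = list->next;
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
    }
    second = slow->next;
    slow->next = NULL;          /* cut the list in two */

    return merge(list_sort(list), list_sort(second));
}
```

Call it as head = list_sort(head); the old head pointer is no longer the first element afterwards.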
Back on the flash chip, an ARM with a 32 MB jffs2 filesystem takes ~15 seconds JUST to read the flash with
while(offset<max) junk=*offset++;
Other hits:
the spinner: all those compares added 15 seconds to the boot time. Removed.
get_node_mem() calls - huge overhead (45 seconds, or roughly 66% of the total time)
jffs2_scan_empty() - it made any erased sectors bog slow, just to do the EXACT same thing the main loop was doing. Removed.
Added ignore cases for erase and summary blocks (from another patch) that get rid of those warnings with new kernels. It still does the right thing, by checking magic, CRC and length, then skipping the block.
The main problem is that there's no dcache on ARM, so we pay a massive penalty for function calls or even memory->register loads. Simple addition became quite expensive, hence changing offset+part->offset into offset.
This does add the assumption that a partition will be contiguous in memory. I'm not sure if that is valid on all platforms, but it is on every one I'm working on.
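A rough sketch of what that change amounts to, assuming a contiguous mapping: translate the partition offset into a pointer once, before the loop, and then only advance that pointer. struct part, count_magic() and the 4-byte scan step are invented for the example and are not the names or logic in jffs2_1pass.c. 0x1985 is the jffs2 node magic, compared here in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JFFS2_MAGIC 0x1985      /* jffs2 node magic bitmask */

/* Hypothetical partition descriptor, for illustration only. */
struct part {
    size_t offset;              /* start of the partition in the mapping */
    size_t size;                /* length in bytes */
};

/* Count 4-byte-aligned positions (relative to the partition start)
 * whose first 16 bits equal the magic. Assumes the partition is one
 * contiguous range of memory starting at flash + p->offset. */
static unsigned int count_magic(const uint8_t *flash, const struct part *p)
{
    const uint8_t *cur, *end;
    unsigned int hits = 0;

    assert(flash && p);
    cur = flash + p->offset;    /* add the partition offset once */
    end = cur + p->size;

    while ((size_t)(end - cur) >= sizeof(uint16_t)) {
        uint16_t magic;

        memcpy(&magic, cur, sizeof(magic));   /* safe unaligned read */
        if (magic == JFFS2_MAGIC)
            hits++;
        if ((size_t)(end - cur) < 4)
            break;              /* never step past the end */
        cur += 4;               /* no per-iteration offset addition */
    }
    return hits;
}
```

If a platform maps a partition in several non-adjacent pieces, the single base pointer no longer works and the scan has to go back to translating each offset.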
The mergesort license seems to be GPL compatible (no restrictions or advertising clause). If it's a problem I can rewrite it myself under GPL.