Re: [U-Boot] [RFC V2 PATCH 0/3] Add cache for block devices

21 Mar 2016


      Hi Marek,
On 03/21/2016 09:49 AM, Marek Vasut wrote:
...
On 03/21/2016 02:48 PM, Eric Nelson wrote:
...
On 03/20/2016 06:59 PM, Marek Vasut wrote:
...
On 03/21/2016 02:45 AM, Eric Nelson wrote:
...
Here's a more full-featured implementation of a cache for block
devices that uses a small linked list of cache blocks.
Why do you use linked list ? You have four entries, you can as well
use fixed array. Maybe you should implement an adaptive cache would
would use the unpopulated malloc area and hash the sector number(s)
into that area ?
I was looking for a simple implementation that would allow tweaking of
the max entries/size per entry.
We could get higher performance through hashing, but with such a
small cache, it's probably not worth extra code.
The hashing function can be a simple modulo on sector number ;-) That'd
be less code than linked lists.
I'm not seeing how.
I'm going look first at a better way to integrate than the approach
taken in patch 3.
...
...
Using an array and re-allocating on changes to the max entries variable
is feasible, but I think it would be slightly more code.
That would indeed be more code.
...
...
...
Experimentation loading a 4.5 MiB kernel from the root directory of
 a FAT filesystem shows that a single cache entry of a single
block is the only
only ... what ? This is where things started to be interesting, but
you leave us hanging :)
Oops.
... I was planning on re-wording that.
My testing showed no gain in performance (additional cache hits) past a
single entry of a single block. This was done on a small (32MiB)
partition with a small number of files (~10) and only a single
read is skipped.
I'd kinda expect that indeed.
Yeah, and the single-digit-ms improvement isn't worth much.
...
...
=> blkc c ; blkc i ; blkc 0 0 ;
changed to max of 0 entries of 0 blocks each
=> load mmc 0 10008000 /zImage
reading /zImage
4955304 bytes read in 247 ms (19.1 MiB/s)
=> blkc
block cache:
0	hits
7	misses
0	entries in cache
trace off
max blocks/entry 0
max entries 0
=> blkc c ; blkc i ; blkc 1 1 ;
changed to max of 1 entries of 1 blocks each
=> load mmc 0 10008000 /zImage
reading /zImage
4955304 bytes read in 243 ms (19.4 MiB/s)
=> blkc
block cache:
1	hits
6	misses
1	entries in cache
trace off
max blocks/entry 1
max entries 1
I don't believe that enabling the cache is worth the extra code
for this use case.
By comparison, a load of 150 MiB compressed disk image from
ext4 showed a 30x speedup with the V1 patch (single block,
single entry) from ~150s to 5s.
Without some form of cache, the 150s was long enough to make
a user (me) think something is broken.
I'm obviously loving this improvement.
Glad to hear it.