[U-Boot] [PATCH] ext4fs: Add ext4 extent tree cache

From: Evan Thompson evan.thompson@flukenetworks.com
In ext4, the file inode can store up to 4 extents. If a file requires more (due to size or fragmentation), an extent index tree is used.
Currently, u-boot reads a node from each level of the extent tree for every block read, which is very inefficient when extent tree depth is > 0.
This patch adds a cache for the extent tree. We cache the 1 most-recently-seen node at each extent tree level. The typical workload is sequential block access, so once we leave a given tree node, it will not be revisited. Therefore, it makes sense to just cache one node per tree level.
Cached blocks are lazily allocated. The typical case is extent tree depth = 0, in which case no caching is needed and no allocations will occur.
For files with extent tree depth = 1, this patch produces a ~10x improvement in read speed. For deeper extent trees, the improvement is larger. On my test device, a 3MB file which previously took 9s to read now takes 150ms.
Cache size is configurable with CONFIG_EXT4_EXTENT_CACHE_SIZE. However the default of 5 (the maximum depth of well-formed extent trees) is recommended.
Signed-off-by: Evan Thompson evan.thompson@flukenetworks.com --- doc/README.ext4 | 7 +++++ fs/ext4/ext4_common.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++---- 2 files changed, 76 insertions(+), 5 deletions(-)
diff --git a/doc/README.ext4 b/doc/README.ext4 index 8ecd21eee3..a501b92396 100644 --- a/doc/README.ext4 +++ b/doc/README.ext4 @@ -33,6 +33,13 @@ In addition, to get the write access command "ext4write", enable: which automatically selects CONFIG_EXT4_WRITE if it wasn't defined already.
+For files with extents, an ext4 extent tree cache improves performance: + + CONFIG_EXT4_EXTENT_CACHE_SIZE <#> + +The above cache size defaults to 5 if not defined. This default is +strongly recommended. 0 will turn off extent caching. + Also relevant are the generic filesystem commands, selected by:
CONFIG_CMD_FS_GENERIC diff --git a/fs/ext4/ext4_common.c b/fs/ext4/ext4_common.c index e3cc30a1e0..61aade1ced 100644 --- a/fs/ext4/ext4_common.c +++ b/fs/ext4/ext4_common.c @@ -1523,6 +1523,47 @@ void ext4fs_allocate_blocks(struct ext2_inode *file_inode,
#endif
+/* Extent tree cache caches one entry per tree level + * eg, ext_block->eh_depth is used as the index into the cache + * + * If the tree is deeper than CONFIG_EXT4_EXTENT_CACHE_SIZE (very unlikely), + * file read performance will be impacted by repeated re-reads + * of those index nodes. + */ + +#ifndef CONFIG_EXT4_EXTENT_CACHE_SIZE +#define CONFIG_EXT4_EXTENT_CACHE_SIZE 5 +#endif + +struct extent_cache_entry { + unsigned long long block; + struct ext4_extent_header *ext_block; +}; + +static struct extent_cache_entry + extent_cache[CONFIG_EXT4_EXTENT_CACHE_SIZE]; + +static void ext4fs_init_extent_block_cache(void) +{ + int i; + + for (i = 0; i < CONFIG_EXT4_EXTENT_CACHE_SIZE; i++) { + extent_cache[i].block = 0; + extent_cache[i].ext_block = NULL; + } +} + +static void ext4fs_free_extent_block_cache(void) +{ + int i; + + for (i = 0; i < CONFIG_EXT4_EXTENT_CACHE_SIZE; i++) { + extent_cache[i].block = 0; + free(extent_cache[i].ext_block); + extent_cache[i].ext_block = NULL; + } +} + static struct ext4_extent_header *ext4fs_get_extent_block (struct ext2_data *data, char *buf, struct ext4_extent_header *ext_block, @@ -1532,6 +1573,7 @@ static struct ext4_extent_header *ext4fs_get_extent_block unsigned long long block; int blksz = EXT2_BLOCK_SIZE(data); int i; + unsigned int cache_item;
while (1) { index = (struct ext4_extent_idx *)(ext_block + 1); @@ -1554,11 +1596,31 @@ static struct ext4_extent_header *ext4fs_get_extent_block block = le16_to_cpu(index[i].ei_leaf_hi); block = (block << 32) + le32_to_cpu(index[i].ei_leaf_lo);
- if (ext4fs_devread((lbaint_t)block << log2_blksz, 0, blksz, - buf)) - ext_block = (struct ext4_extent_header *)buf; - else - return NULL; + // check cache, read block from device if not found + cache_item = le16_to_cpu(ext_block->eh_depth) - 1; + if (cache_item < CONFIG_EXT4_EXTENT_CACHE_SIZE && + extent_cache[cache_item].block == block) { + ext_block = extent_cache[cache_item].ext_block; + } else { + if (ext4fs_devread((lbaint_t)block << log2_blksz, 0, + blksz, buf)) + ext_block = (struct ext4_extent_header *)buf; + else + return NULL; + // put in cache + if (cache_item < CONFIG_EXT4_EXTENT_CACHE_SIZE) { + struct extent_cache_entry *cache_entry = + &extent_cache[cache_item]; + if (!cache_entry->ext_block) + cache_entry->ext_block = zalloc(blksz); + if (!cache_entry->ext_block) { + printf("ext4 cache alloc failed\n"); + return NULL; + } + memcpy(cache_entry->ext_block, buf, blksz); + cache_entry->block = block; + } + } } }
@@ -2000,6 +2062,7 @@ void ext4fs_close(void) if (ext4fs_root != NULL) { free(ext4fs_root); ext4fs_root = NULL; + ext4fs_free_extent_block_cache(); }
ext4fs_reinit_global(); @@ -2376,6 +2439,7 @@ int ext4fs_mount(unsigned part_length)
ext4fs_root = data;
+ ext4fs_init_extent_block_cache(); return 1; fail: printf("Failed to mount ext2 filesystem...\n");

On Mon, Mar 26, 2018 at 04:05:24PM -0700, evan.g.thompson@gmail.com wrote:
From: Evan Thompson evan.thompson@flukenetworks.com
In ext4, the file inode can store up to 4 extents. If a file requires more (due to size or fragmentation), an extent index tree is used.
Currently, u-boot reads a node from each level of the extent tree for every block read, which is very inefficient when extent tree depth is > 0.
This patch adds a cache for the extent tree. We cache the 1 most-recently-seen node at each extent tree level. The typical workload is sequential block access, so once we leave a given tree node, it will not be revisited. Therefore, it makes sense to just cache one node per tree level.
Cached blocks are lazily allocated. The typical case is extent tree depth = 0, in which case no caching is needed and no allocations will occur.
For files with extent tree depth = 1, this patch produces a ~10x improvement in read speed. For deeper extent trees, the improvement is larger. On my test device, a 3MB file which previously took 9s to read now takes 150ms.
Cache size is configurable with CONFIG_EXT4_EXTENT_CACHE_SIZE. However the default of 5 (the maximum depth of well-formed extent trees) is recommended.
Signed-off-by: Evan Thompson evan.thompson@flukenetworks.com
doc/README.ext4 | 7 +++++ fs/ext4/ext4_common.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++---- 2 files changed, 76 insertions(+), 5 deletions(-)
diff --git a/doc/README.ext4 b/doc/README.ext4 index 8ecd21eee3..a501b92396 100644 --- a/doc/README.ext4 +++ b/doc/README.ext4 @@ -33,6 +33,13 @@ In addition, to get the write access command "ext4write", enable: which automatically selects CONFIG_EXT4_WRITE if it wasn't defined already.
+For files with extents, an ext4 extent tree cache improves performance:
- CONFIG_EXT4_EXTENT_CACHE_SIZE <#>
+The above cache size defaults to 5 if not defined. This default is +strongly recommended. 0 will turn off extent caching.
Also relevant are the generic filesystem commands, selected by:
CONFIG_CMD_FS_GENERIC diff --git a/fs/ext4/ext4_common.c b/fs/ext4/ext4_common.c index e3cc30a1e0..61aade1ced 100644 --- a/fs/ext4/ext4_common.c +++ b/fs/ext4/ext4_common.c @@ -1523,6 +1523,47 @@ void ext4fs_allocate_blocks(struct ext2_inode *file_inode,
#endif
+/* Extent tree cache caches one entry per tree level
- eg, ext_block->eh_depth is used as the index into the cache
- If the tree is deeper than CONFIG_EXT4_EXTENT_CACHE_SIZE (very unlikely),
- file read performance will be impacted by repeated re-reads
- of those index nodes.
- */
+#ifndef CONFIG_EXT4_EXTENT_CACHE_SIZE +#define CONFIG_EXT4_EXTENT_CACHE_SIZE 5 +#endif
This needs to be done in Kconfig. Also, is there any sort of testing we could add to test/fs/fs-test.sh for this code? Thanks!

On Thu, May 10, 2018 at 3:02 PM, Tom Rini trini@konsulko.com wrote:
On Mon, Mar 26, 2018 at 04:05:24PM -0700, evan.g.thompson@gmail.com wrote:
+#ifndef CONFIG_EXT4_EXTENT_CACHE_SIZE +#define CONFIG_EXT4_EXTENT_CACHE_SIZE 5 +#endif
This needs to be done in Kconfig.
Ok, I'll fix that.
Also, is there any sort of testing we could add to test/fs/fs-test.sh for this code? Thanks!
I can't think of a straightforward way, as this patch just provides a speedup for certain cases. The functionality of reading / writing files is unchanged.
- Evan

On Fri, May 11, 2018 at 10:14:38AM -0700, Evan Thompson wrote:
On Thu, May 10, 2018 at 3:02 PM, Tom Rini trini@konsulko.com wrote:
On Mon, Mar 26, 2018 at 04:05:24PM -0700, evan.g.thompson@gmail.com wrote:
+#ifndef CONFIG_EXT4_EXTENT_CACHE_SIZE +#define CONFIG_EXT4_EXTENT_CACHE_SIZE 5 +#endif
This needs to be done in Kconfig.
Ok, I'll fix that.
Also, is there any sort of testing we could add to test/fs/fs-test.sh for this code? Thanks!
I can't think of a straightforward way, as this patch just provides a speedup for certain cases. The functionality of reading / writing files is unchanged.
OK. And the test suite passes as well as before, yes?

On Fri, May 11, 2018 at 10:18 AM, Tom Rini trini@konsulko.com wrote:
On Fri, May 11, 2018 at 10:14:38AM -0700, Evan Thompson wrote:
On Thu, May 10, 2018 at 3:02 PM, Tom Rini trini@konsulko.com wrote:
On Mon, Mar 26, 2018 at 04:05:24PM -0700, evan.g.thompson@gmail.com wrote:
+#ifndef CONFIG_EXT4_EXTENT_CACHE_SIZE +#define CONFIG_EXT4_EXTENT_CACHE_SIZE 5 +#endif
This needs to be done in Kconfig.
Ok, I'll fix that.
Also, is there any sort of testing we could add to test/fs/fs-test.sh for this code? Thanks!
I can't think of a straightforward way, as this patch just provides a speedup for certain cases. The functionality of reading / writing files is unchanged.
OK. And the test suite passes as well as before, yes?
Yes, I'm seeing the same passes and fails as before.
- Evan

On 05/11/2018 07:53 PM, Evan Thompson wrote:
On Fri, May 11, 2018 at 10:18 AM, Tom Rini trini@konsulko.com wrote:
On Fri, May 11, 2018 at 10:14:38AM -0700, Evan Thompson wrote:
On Thu, May 10, 2018 at 3:02 PM, Tom Rini trini@konsulko.com wrote:
On Mon, Mar 26, 2018 at 04:05:24PM -0700, evan.g.thompson@gmail.com wrote:
+#ifndef CONFIG_EXT4_EXTENT_CACHE_SIZE +#define CONFIG_EXT4_EXTENT_CACHE_SIZE 5 +#endif
This needs to be done in Kconfig.
Ok, I'll fix that.
Also, is there any sort of testing we could add to test/fs/fs-test.sh for this code? Thanks!
I can't think of a straightforward way, as this patch just provides a speedup for certain cases. The functionality of reading / writing files is unchanged.
OK. And the test suite passes as well as before, yes?
Yes, I'm seeing the same passes and fails as before.
btw doesn't CONFIG_BLOCK_CACHE offer similar service, but for everyone and on block level ? I recall looking for ext4 fs speed up , but then ultimately used CONFIG_BLOCK_CACHE which did it for me.

On Sun, May 13, 2018 at 12:03 PM, Marek Vasut marek.vasut@gmail.com wrote:
btw doesn't CONFIG_BLOCK_CACHE offer similar service, but for everyone and on block level ? I recall looking for ext4 fs speed up , but then ultimately used CONFIG_BLOCK_CACHE which did it for me.
I was unaware of the existence of CONFIG_BLOCK_CACHE - yes, that will solve the same problem by caching at the block level instead of the filesystem level. So my patch is redundant, and ext4 users should just be advised to turn on CONFIG_BLOCK_CACHE.
Thanks, Evan

On 05/14/2018 05:14 PM, Evan Thompson wrote:
On Sun, May 13, 2018 at 12:03 PM, Marek Vasut marek.vasut@gmail.com wrote:
btw doesn't CONFIG_BLOCK_CACHE offer similar service, but for everyone and on block level ? I recall looking for ext4 fs speed up , but then ultimately used CONFIG_BLOCK_CACHE which did it for me.
I was unaware of the existence of CONFIG_BLOCK_CACHE - yes, that will solve the same problem by caching at the block level instead of the filesystem level. So my patch is redundant, and ext4 users should just be advised to turn on CONFIG_BLOCK_CACHE.
Maybe this block cache should be default=y (if BLK is enabled?) or we should advertise it more somehow. I ran into the exact same issue and started looking into the ext4 patches myself only to find the block cache later on.

On Mon, May 14, 2018 at 05:21:38PM +0200, Marek Vasut wrote:
On 05/14/2018 05:14 PM, Evan Thompson wrote:
On Sun, May 13, 2018 at 12:03 PM, Marek Vasut marek.vasut@gmail.com wrote:
btw doesn't CONFIG_BLOCK_CACHE offer similar service, but for everyone and on block level ? I recall looking for ext4 fs speed up , but then ultimately used CONFIG_BLOCK_CACHE which did it for me.
I was unaware of the existence of CONFIG_BLOCK_CACHE - yes, that will solve the same problem by caching at the block level instead of the filesystem level. So my patch is redundant, and ext4 users should just be advised to turn on CONFIG_BLOCK_CACHE.
Maybe this block cache should be default=y (if BLK is enabled?) or we should advertise it more somehow. I ran into the exact same issue and started looking into the ext4 patches myself only to find the block cache later on.
Agreed. I'm right now testing a patch or two to do that and see what the size cost is. Thanks all!
participants (4)
-
Evan Thompson
-
evan.g.thompson@gmail.com
-
Marek Vasut
-
Tom Rini