[U-Boot] [PATCH] fs: btrfs: fix false negatives in ROOT_ITEM search

ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" value. To perform these searches using only two items from the key, the btrfs driver uses a special "btrfs_search_tree_key_type" function.
The algorithm used by that function to transform a 3-tuple search into a 2-tuple search was subtly broken, leading to items not being found if they were the first in their tree node.
This commit fixes btrfs_search_tree_key_type to properly behave in these situations.
Signed-off-by: Pierre Bourdon delroth@gmail.com Cc: Marek Behun marek.behun@nic.cz --- fs/btrfs/ctree.h | 44 ++++++++++++++++++++++++++++++++++++++------ 1 file changed, 38 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 65c152a52f..ca44a2404d 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -292,20 +292,52 @@ btrfs_search_tree_key_type(const struct btrfs_root *root, u64 objectid, { struct btrfs_key key, *res;
+ /* + * In some cases (e.g. tree roots), we need to look for a given + * objectid and type without knowing the offset value (3rd element of a + * btrfs tree node key). We can rely on the fact that btrfs_search_tree + * returns the first element with key >= search_key, and then perform + * our own comparison between the returned element and the search key. + * + * It is tempting to use a search key with offset 0 to perform this + * "fuzzy search". This would return the first item with the (objectid, + * type) we're looking for. However, using offset 0 has the wrong + * behavior when the wanted item is the first in a leaf: since our + * search key will be lower than the wanted item, the recursive search + * will explore the wrong branch of the tree. + * + * Instead, use the largest possible offset (-1). The result of this + * search will either be: + * 1. An element with the (objectid, type) we're looking for, if it + * has offset -1 or if it is the last element in its leaf. + * 2. The first element *after* an element with the (objectid, type) + */ key.objectid = objectid; key.type = type; - key.offset = 0; + key.offset = -1;
if (btrfs_search_tree(root, &key, path)) return NULL;
- res = btrfs_path_leaf_key(path); - if (btrfs_comp_keys_type(&key, res)) { - btrfs_free_path(path); - return NULL; + /* + * Compare with the previous element first -- this is the likely case + * since the result of the search is only what we want if it had offset + * == -1 or if it was last in its leaf. + */ + if (path->slots[0] > 0) { + path->slots[0]--; + res = btrfs_path_leaf_key(path); + if (!btrfs_comp_keys_type(&key, res)) + return res; + path->slots[0]++; }
- return res; + res = btrfs_path_leaf_key(path); + if (!btrfs_comp_keys_type(&key, res)) + return res; + + btrfs_free_path(path); + return NULL; }
static inline u32 btrfs_path_item_size(struct btrfs_path *p)

On Sat, Apr 13, 2019 at 11:50 PM Pierre Bourdon delroth@gmail.com wrote:
ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" value. To perform these searches using only two items from the key, the btrfs driver uses a special "btrfs_search_tree_key_type" function.
The algorithm used by that function to transform a 3-tuple search into a 2-tuple search was subtly broken, leading to items not being found if they were the first in their tree node.
This commit fixes btrfs_search_tree_key_type to properly behave in these situations.
A more practical example in case the explanation isn't clear:
root tree node 122216448 level 1 items 6 free 115 generation 12246 owner ROOT_TREE fs uuid b516974e-a94e-469c-a9ff-d237ce96b03b chunk uuid ecfa1e4e-8832-49c1-bb0c-2f08b95586a0 key (EXTENT_TREE ROOT_ITEM 0) block 122810368 gen 12246 key (ROOT_TREE_DIR INODE_REF 6) block 122339328 gen 12246 key (344 ROOT_ITEM 9422) block 209317888 gen 12115 key (344 ROOT_BACKREF 5) block 226222080 gen 12126 key (368 ROOT_ITEM 11665) block 114966528 gen 12127 key (375 ROOT_ITEM 12127) block 122949632 gen 12246
If you look for (375 ROOT_ITEM) then using (375 ROOT_ITEM 0) as the search key will end up walking to block 114966528 (because (375 ROOT_ITEM 0) < (375 ROOT_ITEM 12127)). Using instead (375 ROOT_ITEM -1) will go to the right leaf, return the first item after (375 ROOT_ITEM 12127), and we can then walk back one slot to get the item we wanted.

On Sun, Apr 14, 2019 at 12:05 AM Pierre Bourdon delroth@gmail.com wrote:
A more practical example in case the explanation isn't clear:
root tree node 122216448 level 1 items 6 free 115 generation 12246 owner ROOT_TREE fs uuid b516974e-a94e-469c-a9ff-d237ce96b03b chunk uuid ecfa1e4e-8832-49c1-bb0c-2f08b95586a0 key (EXTENT_TREE ROOT_ITEM 0) block 122810368 gen 12246 key (ROOT_TREE_DIR INODE_REF 6) block 122339328 gen 12246 key (344 ROOT_ITEM 9422) block 209317888 gen 12115 key (344 ROOT_BACKREF 5) block 226222080 gen 12126 key (368 ROOT_ITEM 11665) block 114966528 gen 12127 key (375 ROOT_ITEM 12127) block 122949632 gen 12246
If you look for (375 ROOT_ITEM) then using (375 ROOT_ITEM 0) as the search key will end up walking to block 114966528 (because (375 ROOT_ITEM 0) < (375 ROOT_ITEM 12127)). Using instead (375 ROOT_ITEM -1) will go to the right leaf, return the first item after (375 ROOT_ITEM 12127), and we can then walk back one slot to get the item we wanted.
So turns out there's a very similar bug when iterating DIR_INDEX entries -- but that one isn't using btrfs_search_tree_key_type. It's doing its own thing with offset = 0 to enumerate multiple items of the same oid/type.
Looking back at this, I think there's a much better way to fix both issues. Currently, btrfs_search_tree will return an invalid path if it's asked to look for an item between end of leaf N and beginning of leaf N+1. Invalid, as in, accessing the element at this path reads past the end of an array... This seems like a very broken behavior given that callers have no way of knowing the path is invalid without dereferencing it.
Instead, btrfs_search_tree should be fixed so that it always returns either: 1. The first element >= searched element. 2. An error if no such element exists.
This can be done with a fairly trivial change to btrfs_search_tree: if we're about to return something that's past the end of the leaf (slot
= nritems), call btrfs_next_slot on the path. If btrfs_next_slot
works return that, otherwise return an error.
I'll send another patch which implements that instead. I've verified it fixes both the root lookup issue I was originally looking for + the DIR_INDEX issue as well. Please disregard the patch I originally sent.
-- Pierre Bourdon delroth@gmail.com Software Engineer @ Zürich, Switzerland https://delroth.net/

On Sat, Apr 13, 2019 at 11:50:49PM +0200, Pierre Bourdon wrote:
ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" value. To perform these searches using only two items from the key, the btrfs driver uses a special "btrfs_search_tree_key_type" function.
The algorithm used by that function to transform a 3-tuple search into a 2-tuple search was subtly broken, leading to items not being found if they were the first in their tree node.
This commit fixes btrfs_search_tree_key_type to properly behave in these situations.
Signed-off-by: Pierre Bourdon delroth@gmail.com Cc: Marek Behun marek.behun@nic.cz
Applied to u-boot/master, thanks!

Hi Tom,
On Sat, Apr 27, 2019 at 4:46 PM Tom Rini trini@konsulko.com wrote:
On Sat, Apr 13, 2019 at 11:50:49PM +0200, Pierre Bourdon wrote:
ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" value. To perform these searches using only two items from the key, the btrfs driver uses a special "btrfs_search_tree_key_type" function.
The algorithm used by that function to transform a 3-tuple search into a 2-tuple search was subtly broken, leading to items not being found if they were the first in their tree node.
This commit fixes btrfs_search_tree_key_type to properly behave in these situations.
Signed-off-by: Pierre Bourdon delroth@gmail.com Cc: Marek Behun marek.behun@nic.cz
Applied to u-boot/master, thanks!
Please disregard this patch as was mentioned in a followup email on the thread.
https://patchwork.ozlabs.org/patch/1085991/ implements a better way of addressing the problem and fixes a 2nd btrfs bug that was missed in this first revision.
Sorry for the mixup, Best,
-- Pierre Bourdon delroth@gmail.com Software Engineer @ Zürich, Switzerland https://delroth.net/

On Sat, Apr 27, 2019 at 04:50:18PM +0200, Pierre Bourdon wrote:
Hi Tom,
On Sat, Apr 27, 2019 at 4:46 PM Tom Rini trini@konsulko.com wrote:
On Sat, Apr 13, 2019 at 11:50:49PM +0200, Pierre Bourdon wrote:
ROOT_ITEMs in btrfs are referenced without knowing their actual "offset" value. To perform these searches using only two items from the key, the btrfs driver uses a special "btrfs_search_tree_key_type" function.
The algorithm used by that function to transform a 3-tuple search into a 2-tuple search was subtly broken, leading to items not being found if they were the first in their tree node.
This commit fixes btrfs_search_tree_key_type to properly behave in these situations.
Signed-off-by: Pierre Bourdon delroth@gmail.com Cc: Marek Behun marek.behun@nic.cz
Applied to u-boot/master, thanks!
Please disregard this patch as was mentioned in a followup email on the thread.
https://patchwork.ozlabs.org/patch/1085991/ implements a better way of addressing the problem and fixes a 2nd btrfs bug that was missed in this first revision.
Sorry for the mixup,
OK, I'll take care of things, thanks!
participants (2)
-
Pierre Bourdon
-
Tom Rini