To make use of the filtering feature, we need to revert the internal walker to a real DFS.
We will therefore just invert the whole tree by storing all of its
contents in a level-keyed vector.
This is horribly expensive in memory, this is a compromise between CPU
and memory, here is the fundamental reason for why:
When you encounter a directory, it's either a leaf or not, i.e. it
contains subdirectories or not.
To know this fact, you can:
- wait until you notice subdirectories under it, i.e. you need to store
any intermediate nodes you see in the meantime -> memory penalty.
- getdents or readdir on it to determine *NOW* its subdirectories -> CPU
penalty and I/O penalty.
This is an implementation of the first proposal, we pay memory.
In practice, we are paying O(#nb of nodes) in memory.
There's a smarter albeit much more complicated algorithm that pays only
O(\sum_i #siblings(p_i)) nodes where (p_1, ..., p_n) is the path to a leaf.
which means for:
A
/ \
B C
/ / \
D E F
We would never store D, E, F but only E, F at a given time.
But we would still store B, C no matter what.
Change-Id: I456ed1c3f0db493e018ba1182665d84bebe29c11
Reviewed-on: https://cl.tvl.fyi/c/depot/+/10567
Tested-by: BuildkiteCI
Autosubmit: raitobezarius <tvl@lahfa.xyz>
Reviewed-by: flokli <flokli@flokli.de>