trie
trie copied to clipboard
Performance sucks when you end up with tens of thousands of children under one node
Theoretically there can be tens of thousands of files under one path. But then performance of GetChild sucks as it linearly searches through all the nodes.
I had a think about improving that... but I think what's probably a better solution is to make this into a real compressed tree instead of just by the path separator. Many of the file names under a 10,000+ file folder likely have name redundancy in them, so we could perhaps just adjust the trie construction to not break on separators but rather break up existing nodes as divergence is found to eliminate the problem.