buck2 icon indicating copy to clipboard operation
buck2 copied to clipboard

Check performance of tset artifactvalues cache

Open cjhopman opened this issue 1 year ago • 3 comments

There's an interesting issue in bazel here: https://github.com/bazelbuild/bazel/discussions/21378. There's a repro for a bad perf case of their merkle tree cache here: https://github.com/DavidANeil/repro-bazel-nested-set/tree/master. That cache seems like it likely does something very similar to how we cache the ActionSharedDirectory for tset nodes, the repro might be a good example for us to check performance on.

cjhopman avatar Aug 07 '24 05:08 cjhopman

Other relevant bazel discussions in https://github.com/bazelbuild/bazel/issues/20862 and https://github.com/bazelbuild/bazel/issues/18686

cjhopman avatar Aug 07 '24 23:08 cjhopman

Confirmed that buck2 has similar poor performance here: https://github.com/cjhopman/repro-bazel-nested-set

buck1 approach is potentially more performant here. It maintains an map of tsetnode equivalent -> interned MerkleTreeNode. and then a MerkleTreeNode lazily computes and caches its encoded form only when requested. see https://github.com/facebook/buck/blob/9c7c421e49f4d92d67321f18c6d1cd90974c77c4/src/com/facebook/buck/remoteexecution/util/MerkleTreeNodeCache.java and https://github.com/facebook/buck/blob/9c7c421e49f4d92d67321f18c6d1cd90974c77c4/src/com/facebook/buck/rules/modern/builders/ModernBuildRuleRemoteExecutionHelper.java#L531

cjhopman avatar Aug 07 '24 23:08 cjhopman

That model doesn't fit great into buck and there's a question of the cost of interning. But actually I think it could be fine to just depend on sharing based on the dice value cached for the tset already for the interning part of that.

but still, it might just be too costly to do all the tree merges themselves.

if this is something like a flat directory of items, and each item is a value in one tset and then there's a densish graph between them, producing the merged directories is going to be O(n^2). lazily doing the fingerprinting just improves the constant.

cjhopman avatar Aug 07 '24 23:08 cjhopman