We should try to deduplicate dep trees between benchmarks.
A global build cache (to avoid a common workspace) would be great but probably hard.
The next best thing is likely allowing a single workspace to have multiple benchmarks, and using a single target directory for building their dependencies (e.g. the Servo crates might benefit from this).
We can also try to reuse -debug artifacts for -check builds, further reducing times (although that might be harder, since --emit=metadata produces rmetas different from --emit=metadata,link, and cargo check likely still can't use regular artifacts).
A global build cache was attempted in https://github.com/rust-lang/rustc-perf/pull/738, but proved to be a regression: roughly 1.03x the previous time for a full build.
That didn't try to adjust Cargo.locks in the benchmarks to share dependencies though.
I tested the solution in #738 a bit and confirmed that the regression was due to increased copying of the target directories.
That solution accumulates built dependency data in per-build-kind (aka profile) target directories as we build the dependencies for each benchmark before running it. It copies one of those target directories, pre-populated with the compiled dependencies encountered so far, in order to use it as the target directory for the benchmarking proper.
Those copies include a lot of unnecessary data for dependencies used by previous benchmarks but not by the current benchmark. So it looks like there's ample opportunity to cut down on the time spent copying. We could try to copy only what's necessary for the current benchmark from the target directory, or we could use some kind of overlay / copy-on-write approach.
Based on what I'm seeing locally, between adjusting the Cargo.lock files to increase sharing, and reducing the amount of copying, I think that several minutes could be shaved off of a full perf run.
Is there an easy way to generate a tree of hardlinks, instead of copying the files?
I guess we'd need to also change the owner and permissions to make sure the original files don't get modified (and at most the hardlinks themselves can be replaced with regular files).
I did a quick manual test of using hard links, and it seems like it could work. I didn't change the owner of the files, just the permissions, FWIW.
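For reference, here is a minimal sketch of what generating such a hard link tree could look like using only the Rust standard library. The function names `hardlink_tree` and `make_files_readonly` are hypothetical, not anything in rustc-perf, and the sketch assumes both trees are on the same filesystem and skips symlinks.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively mirror `src` under `dst`: directories are created fresh,
/// regular files become hard links to the originals (no data is copied).
/// Both paths must live on the same filesystem, or `fs::hard_link` fails.
fn hardlink_tree(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let from = entry.path();
        let to = dst.join(entry.file_name());
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            hardlink_tree(&from, &to)?;
        } else if file_type.is_file() {
            fs::hard_link(&from, &to)?;
        }
        // Symlinks and other special files are skipped in this sketch.
    }
    Ok(())
}

/// Mark every regular file under `root` read-only. Permissions live on the
/// inode, so this also protects the original files that share it.
fn make_files_readonly(root: &Path) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            make_files_readonly(&entry.path())?;
        } else if file_type.is_file() {
            let mut perms = entry.metadata()?.permissions();
            perms.set_readonly(true);
            fs::set_permissions(entry.path(), perms)?;
        }
    }
    Ok(())
}
```

A tool that wants to modify a read-only linked file would then have to replace the link with a new regular file, which leaves the original untouched.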
If we go with that approach, here are some issues that would need consideration:
- Need to ensure the hard link tree is on the same mount as the linked files (can't hard link across mounts).
- Need to restore original file permissions of template target tree whenever we build dependencies. Or just make them all writable. I'm not sure there's an advantage to restoring their original permissions exactly.
- Need to make any files that need to be written by cargo/rustc/etc. independent copies rather than hard links. Obviously this is brittle, as it depends on the cargo, etc. implementation. When I tested, I only needed to account for the .cargo-lock file, which cargo opens read/write. But my test was a really simple case, so I don't know if it was complete.
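The three points above could be handled roughly as follows. This is only a sketch under stated assumptions: it is Unix-only, all function names are hypothetical, and the `COPY_INSTEAD_OF_LINK` list contains just the one file I ran into, so it is probably incomplete for real benchmarks.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::{MetadataExt, PermissionsExt};
use std::path::Path;

/// File names that get a private copy instead of a hard link.
/// Assumption: `.cargo-lock` is the only entry observed so far.
const COPY_INSTEAD_OF_LINK: &[&str] = &[".cargo-lock"];

/// True when both paths are on the same device, which hard links require.
fn same_device(a: &Path, b: &Path) -> io::Result<bool> {
    Ok(fs::metadata(a)?.dev() == fs::metadata(b)?.dev())
}

/// Add the owner write bit to a single file, leaving other bits alone.
fn add_owner_write(path: &Path) -> io::Result<()> {
    let mut perms = fs::metadata(path)?.permissions();
    perms.set_mode(perms.mode() | 0o200);
    fs::set_permissions(path, perms)
}

/// Make every regular file under `root` writable by its owner again,
/// e.g. before building more dependencies into the template tree.
fn make_files_writable(root: &Path) -> io::Result<()> {
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            make_files_writable(&entry.path())?;
        } else if file_type.is_file() {
            add_owner_write(&entry.path())?;
        }
    }
    Ok(())
}

/// Hard link `from` to `to`, unless the file is one that tools write to,
/// in which case it gets an independent, writable copy.
fn link_or_copy(from: &Path, to: &Path) -> io::Result<()> {
    let must_copy = from
        .file_name()
        .and_then(|name| name.to_str())
        .map_or(false, |name| COPY_INSTEAD_OF_LINK.contains(&name));
    if must_copy {
        // `fs::copy` preserves permission bits, so re-add the write bit in
        // case the template copy was read-only.
        fs::copy(from, to)?;
        add_owner_write(to)
    } else {
        fs::hard_link(from, to)
    }
}
```

Checking `same_device` up front lets us fall back to a plain copy instead of failing partway through linking. This version makes everything owner-writable rather than restoring the exact original permissions, which is the simpler of the two options.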