bench_local IDs must be refs in rust-lang/rust
To reproduce:
./target/release/collector bench_local $(rustup +nightly which rustc) testrun
Expected: The run ID "testrun" should label the current run in the database
Actual: It's also used as a ref when checking out the rustc repo, and the whole collector run stops on this benchmark:
15 benchmarks remaining
Cloning into 'rust'...
remote: Enumerating objects: 1406657, done.
remote: Total 1406657 (delta 0), reused 0 (delta 0), pack-reused 1406657
Receiving objects: 100% (1406657/1406657), 580.28 MiB | 2.47 MiB/s, done.
Resolving deltas: 100% (1133476/1133476), done.
fatal: ambiguous argument 'testrun': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
thread 'main' panicked at 'git reset --hard successful', collector/src/execute/rustc.rs:50:5
Probably caused by these commits: https://github.com/rust-lang/rustc-perf/commit/965ad78b70514a29098ad57fb33601873716ca67 https://github.com/rust-lang/rustc-perf/commit/9eef993b9487784c79792573d33c40bcdf98528b (when the repo already exists locally)
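For context, here's a minimal sketch of what the failing path presumably looks like (the helper name and exact call sites in collector/src/execute/rustc.rs are my assumptions; only the panic message is taken from the log above):

```rust
use std::path::Path;
use std::process::Command;

// Hypothetical reconstruction: the artifact id given to `bench_local`
// is reused verbatim as a git revision.
fn checkout(checkout_dir: &Path, artifact_id: &str) {
    let status = Command::new("git")
        .current_dir(checkout_dir)
        .args(["reset", "--hard", artifact_id])
        .status()
        .expect("failed to spawn git");
    // An id like "testrun" is not a revision in rust-lang/rust, so this
    // assertion fires and aborts the entire collector run.
    assert!(status.success(), "git reset --hard successful");
}
```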
I'd suggest either having bench_local runs default to checking out origin/master, or adding a new flag to the collector to represent a ref in the rust-lang repo (a bit more involved).
If someone can verify this is an issue then I can submit a fix for it.
Yeah, this should be fixed. Probably the best thing is to attempt to check out the ref and, if that fails, just skip that benchmark - does that sound reasonable to you?
From the perspective of a local user it seems like it'd be better to attempt to check out origin/master than to skip, though I can see this might mask unintended behaviour when run for perf.rust-lang.org.
I'm looking at the collector/README section on comparing two different versions locally when thinking about this:
./target/release/collector bench_local $RUST_ORIGINAL Original
./target/release/collector bench_local $RUST_MODIFIED Modified
as I'd argue this particular benchmark is quite valuable
Thoughts?
So, my main worry, I guess, is that origin/master may not compile. But I'd be OK with trying to check out the ref and falling back to origin/master.
Makes sense - I'll see if I can ensure that a build failure just skips the current benchmark.
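A minimal sketch of that fallback-then-skip behaviour, assuming hypothetical helper names (this is not the actual collector API):

```rust
use std::path::Path;
use std::process::Command;

// Try the artifact id as a ref first, then fall back to origin/master;
// return an error instead of panicking so the caller can skip the rustc
// benchmark rather than abort the whole run.
fn checkout_or_fallback(dir: &Path, artifact_id: &str) -> Result<(), String> {
    for rev in [artifact_id, "origin/master"] {
        let status = Command::new("git")
            .current_dir(dir)
            .args(["reset", "--hard", rev])
            .status()
            .map_err(|e| format!("failed to spawn git: {e}"))?;
        if status.success() {
            return Ok(());
        }
    }
    Err(format!(
        "neither {artifact_id:?} nor origin/master checked out; skipping rustc benchmark"
    ))
}
```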
Just a heads up - I'm waiting for permission from my company to commit code to this repo before I can post a patch unfortunately. I'd expect a couple weeks up to a couple months for this.
I think there are two issues with using the local artifact ID as the rustc ref to check out and compile.
One is just that I don't expect it. As a user, I think of the ID as a perf run identifier, rather than something identifying which rustc to check out.
The other is that you can't have two separate perf runs compile the rustc identified by the same ref because the perf runs would have to have the same ID.
The only non-error-prone way to handle this that I can think of is to require the user to specify which rustc to check out via commit hash. Using a ref is error-prone because the ref can point to different commits at different times, making comparisons between perf runs that check out the same ref inaccurate.
On the other hand, requiring the user to specify a rustc commit just to do bench_local is a burden. What if we just don't run the rustc benchmark by default for bench_local unless the user asks for it and specifies the commit? Or would that be too irritating?
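As a sketch, the opt-in could look something like this (the --rustc-commit option is an assumption for illustration, not an existing collector flag):

```rust
// Hypothetical options for `bench_local`: the rustc benchmark only runs
// when the user pins an exact commit hash.
struct BenchLocalOpts {
    /// e.g. `--rustc-commit <sha>`: exact rust-lang/rust commit to build.
    rustc_commit: Option<String>,
}

fn should_run_rustc_benchmark(opts: &BenchLocalOpts) -> bool {
    // Without a pinned commit, a ref could move between runs and make
    // comparisons inaccurate, so skip the benchmark entirely.
    opts.rustc_commit.is_some()
}
```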
Also, supposing we keep doing it this way, I think there's an issue with the existing code: it does a git reset --hard <artifact-id> when it should do a git reset --hard origin/<artifact-id>.
Hm, it seems a little surprising to require the aid to be remote (and on a particular remote, too)... can you say why you would expect that?
Well, to be clear, my expectation is really that the aid wouldn't necessarily have a relation to which rustc to checkout for the reasons I mentioned above.
But I think I see where the remote/local confusion is coming from. I now get the sense that you use the rust directory under rustc-perf as a development directory, or at least have it contain local branches you want to benchmark? This may sound dumb, but I hadn't considered that the directory would be used that way, which is why the aid being a local ref hadn't even occurred to me.
I just considered the rust checkout to be under the collector's management, not something I should touch. Adding to the confusion is that the code does a git fetch origin <aid> when you specify the aid, lending more support to the idea that the aid is supposed to be remote and belong to origin.
Anyway, I still think there's probably some tweaking that could be done here to make it more intuitive, but I'm glad to understand the intent now (I hope).
I don't think the git fetch / git reset --hard etc. are particularly useful for local development; the code here is simply shared between the collector running "automatically" (e.g., on perf.r-l.o) and running locally.
Part of my concern is that in practice I wouldn't expect people to run the full suite of benchmarks locally, and the rustc benchmark on perf.r-l.o is a particularly odd one to run: it's very slow (~20 minutes, easily, for everyone -- it runs with -j1, so your hardware likely doesn't matter much) and not likely to be particularly useful. My own improvements to rustc performance have all been measured with x.py, just looking at the seconds taken for a recompile of a particular crate, for example, or measuring that with perf stat -e instructions:u x.py build compiler/rustc.
So I think your intuition is "right", but we're just talking past each other a little -- I was responding to a particular point in your comment, but I think the larger point you're making is correct.
> On the other hand, requiring the user to specify a rustc commit just to do bench_local is a burden. What if we just don't run the rustc benchmark by default for bench_local unless the user asks for it and specifies the commit? Or would that be too irritating?
I think this is likely desirable. But it's a little surprising to me to hear that folks are running bench_local without a filter today -- I've never done this personally. I usually pick a single benchmark to evaluate under cachegrind/perf record or similar, and then based on that work on optimizing it. Once I see an improvement in that single benchmark, I'll add in a few typical other benchmarks (e.g., cargo, serde, syn) and confirm they're not showing serious regressions. Then I'll throw the code up for perf.r-l.o to give me numbers across the board. So in practice I'm not encountering any pain from the benchmarking doing git fetches and resets...
It might be a good idea for us to "strongly suggest" a filter by refusing to run without one, perhaps. Most people likely don't want to wait for a full perf build, and it's pretty unnecessary too.
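As a sketch, that "strong suggestion" could just be an upfront check (the flag names and wording here are assumptions):

```rust
// Hypothetical guard: refuse a full local run unless a benchmark filter
// (or an explicit opt-in to run everything) was given.
fn validate_filter(filter: Option<&str>, run_all: bool) -> Result<(), String> {
    if filter.is_some() || run_all {
        return Ok(());
    }
    Err("no benchmark filter given; a full local run is slow and rarely needed. \
         Pass a filter (or an explicit opt-in flag) to run everything."
        .to_string())
}
```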