Compilation of mutated proc macros can hang: reinstate default build timeout?

Open geeknoid opened this issue 1 year ago • 7 comments

I'm working on a project that uses a lot of procedural macros and I'm trying to run mutation testing against the whole thing. This is unfortunately causing the compiler to enter into an infinite loop when certain mutations are applied.

Would it be possible to apply a timeout value to how long it takes to run the compiler for any given mutation, just like there is a timeout on the test execution?

Here's an example of this happening:

build    core/src/analyzers/hash_code_analyzer.rs:105:32: replace + with * in analyze_hash_codes ... **755.1s**
└             Running `C:\Users\mataille\.rustup\toolchains\stable-x86_64-pc-windows-msvc\bin\rustc.exe --crate-name scalar_keys --edition=2021 codegen\scalar_keys.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-inco`

What's happening is that my test makes use of a procedural macro. The mutation has caused the execution of this procedural macro to get into an infinite loop, as sometimes happens during mutation. The problem is that since this logic is in a procedural macro, it gets run by the compiler, and so causes the compilation of the test crate to enter this infinite loop. And then we're stuck.

I think having a timeout on how long the compiler gets to run is the only clean way out of this. For the time being, I will exclude the specific functions that are causing this from mutation testing.

geeknoid avatar Dec 13 '24 17:12 geeknoid

There are already build timeouts, in https://mutants.rs/timeouts.html?highlight=timeout#build-timeouts. Perhaps, or apparently, they're not working properly here?

If you look in mutants.out/debug.log then you should be able to see something about how the timeouts are set and enforced.

Are you using any options that might interfere with this, specifically cap-lints=true or #[allow(long_running_const_eval)]?

It would help if you can share a small tree that reproduces the same problem.

sourcefrog avatar Dec 13 '24 18:12 sourcefrog

Here's more info:

  • This is happening on this repo: https://github.com/geeknoid/frozen-collections. I haven't yet published this stuff on crates.io, but I hope to do so in a few days. Mutation testing is the last thing on my todo list.

  • The problem is triggered by the mutation performed here. The + 1 is being turned into a * 1 which prevents the loop from making any forward progress, and hence the infinite loop. The containing function is called from within the procedural macros used to build my test suite. If you remove the mutation suppression attribute on the function definition, you should see the infinite loop.

  • The options I used are shown here and are pretty straightforward.

  • The debug.log file is attached. Lots of spew about trying to terminate some process, which I guess is the thing that's not working right: debug.log

  • This is happening on a Windows 11 box.
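
To illustrate the second bullet, here is a minimal sketch of how replacing `+` with `*` stalls a loop. It is not the code from frozen-collections: `steps_to_reach` and its step budget are hypothetical names, added only so the example terminates.

```rust
/// Hypothetical helper: counts loop iterations until `i` reaches `limit`.
/// Gives up and returns `None` once `max_steps` iterations have run,
/// which stands in for "the loop never finishes".
fn steps_to_reach(limit: usize, advance: fn(usize) -> usize, max_steps: usize) -> Option<usize> {
    let mut i = 1;
    let mut steps = 0;
    while i < limit {
        if steps == max_steps {
            return None; // no forward progress within the budget
        }
        i = advance(i);
        steps += 1;
    }
    Some(steps)
}

fn main() {
    // Original form: `i + 1` moves forward on every iteration.
    println!("{:?}", steps_to_reach(10, |i| i + 1, 1_000));
    // Mutated form: `i * 1` leaves `i` unchanged, so the loop only ends
    // because of the artificial step budget.
    println!("{:?}", steps_to_reach(10, |i| i * 1, 1_000));
}
```

Without the budget, the mutated loop would spin forever, and because the function runs inside a proc macro, the compiler would spin with it.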

geeknoid avatar Dec 13 '24 20:12 geeknoid

timeouts=Timeouts { build: None, test: Some(36s) }

It seems to be not setting a timeout to kill the process, and relying on rustc's internal detection of long running const eval.

Maybe that doesn't cover proc macros?

If you set --build-timeout 60 does that help?

sourcefrog avatar Dec 13 '24 20:12 sourcefrog

Yep --build-timeout=60 did solve the problem, the specific case fails with a timeout.

So I guess the issue I had is that the default build timeout is surprisingly large and so it appeared there was no timeout.

geeknoid avatar Dec 13 '24 21:12 geeknoid

Thanks for digging into this.

We primarily rely on rustc's built-in detection of long-running const eval. But maybe this doesn't catch long-running proc macros, so we might need to set a default build timeout too.

Martin

sourcefrog avatar Dec 13 '24 22:12 sourcefrog

I think the way the long_running_const_eval lint works is by counting instructions in a kind of MIR interpreter. Given proc-macros are actually native code, the same technique couldn't be used by rustc.
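
If step counting isn't available for native code, the remaining option is a wall-clock limit on the compiler process. Here is a rough sketch of that idea using only the standard library. It is not how cargo-mutants implements its timeouts: `wait_with_timeout` is a hypothetical helper, and `sleep` (assumed to exist, as on a Unix-like system) stands in for a hung compiler.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: polls `child` until it exits or `timeout` elapses.
/// On expiry the child is killed and reaped, and `Ok(None)` is returned.
fn wait_with_timeout(child: &mut Child, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status)); // finished on its own
        }
        if Instant::now() >= deadline {
            child.kill()?; // only signals the direct child
            child.wait()?; // reap it so it does not linger
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(50));
    }
}

fn main() -> io::Result<()> {
    // Stand-in for a build that never finishes (assumes a `sleep` binary).
    let mut child = Command::new("sleep").arg("5").spawn()?;
    match wait_with_timeout(&mut child, Duration::from_millis(200))? {
        Some(status) => println!("finished: {status}"),
        None => println!("timed out and killed"),
    }
    Ok(())
}
```

One caveat: `Child::kill` only terminates the direct child. When cargo launches rustc, which in turn runs the proc macro, a real implementation would probably also need to terminate the whole process tree.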

> the default build timeout is surprisingly large

In a recent release it appears that the default for build timeouts was removed.

zaneduffield avatar Dec 22 '24 00:12 zaneduffield

> In a recent release it appears that the default for build timeouts was removed.

Yes, I took out the default because it seemed hard to calculate a reasonable default, and I thought the long_running_const_eval lint would be an adequate substitute. With the new information from this bug I think I'll put the default back.

The difficulty is that the initial clean build can be very long, much longer than we'd expect an incremental build to take. Also, especially with concurrent builds, contention for cargo's global caches can make particular builds take longer. However, the time for a clean build should give a reasonable upper bound.
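
As a sketch of that idea, a default could be derived from the measured clean-build time, scaled by a safety factor and clamped to a minimum. The function name, the multiplier, and the floor below are all assumptions for illustration, not cargo-mutants' actual defaults.

```rust
use std::time::Duration;

/// Hypothetical helper: derives a build timeout from the baseline clean
/// build time. `multiplier` leaves headroom for cache contention under
/// concurrent builds; `floor` keeps very fast builds from getting a
/// timeout that is too tight.
fn default_build_timeout(baseline_build: Duration, multiplier: f64, floor: Duration) -> Duration {
    baseline_build.mul_f64(multiplier).max(floor)
}

fn main() {
    // Illustrative numbers only.
    let floor = Duration::from_secs(20);
    let slow = default_build_timeout(Duration::from_secs(100), 2.0, floor);
    let quick = default_build_timeout(Duration::from_secs(5), 2.0, floor);
    println!("slow tree: {slow:?}, quick tree: {quick:?}");
}
```

Using the clean build as the baseline errs on the generous side, since incremental builds of a mutant should normally be much faster, but it still bounds a proc macro that never returns.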

sourcefrog avatar Dec 22 '24 17:12 sourcefrog