🐛 Bug

When running baseline tests locally, the loop tests provide a different output than the build servers. This causes there to be a lot of wasted time trying to figure out why these tests fail due to code changes, when in fact it has nothing to do with them.

To reproduce

Code snippet to reproduce

UB=1 cargo test

Stack trace/error message

https://github.com/move-language/move/runs/8222133907?check_suite_focus=true#step:6:3678

test prover unit[default]::functional/loops.move ... FAILED
Error: 
New output differs from baseline!
Call this test with env variable UPBL=1 to regenerate or remove old baseline files.
Then use your favorite changelist diff tool to verify you are good with the changes.

Or check the rudimentary diff below:

= Move prover returns: exiting with verification errors
= ... (101 lines)
=     =     at tests/sources/functional/loops.move:191: loop_with_two_back_edges_incorrect
-     =     at tests/sources/functional/loops.move:192: loop_with_two_back_edges_incorrect
-     =         y = <redacted>
-     =     at tests/sources/functional/loops.move:193: loop_with_two_back_edges_incorrect
+     =     at tests/sources/functional/loops.move:195: loop_with_two_back_edges_incorrect
+     =     at tests/sources/functional/loops.move:196: loop_with_two_back_edges_incorrect
+     =         x = <redacted>
+     =     at tests/sources/functional/loops.move:197: loop_with_two_back_edges_incorrect
=     =     at tests/sources/functional/loops.move:189: loop_with_two_back_edges_incorrect
= ... (64 lines)
=     =     at tests/sources/functional/loops.move:119: nested_loop_outer_invariant_incorrect
test prover unit[default]::functional/invariants_resources.move ... ok

Expected Behavior

The prover needs to be predictable or:

Tests need to not be this specific in baselining
These tests are changed accordingly to provide better signal

System information

Please complete the following information:

41d7e1477aab85b9c98754f010ff047709e3a0bb
1.63.0
M1 Mac Monterey 12.5

Additional context

Add any other context about the problem here.

Sep 07 '22 15:09 gregnazario

Thanks for reporting. There seems to be some new non-determinism upstream causing this, because we have already multiple measures in place to ensure output is deterministic.

You can work around this for now by adding pragma verify = false; to the spec loop_with_two_back_edges_incorrect block. Please also point via a TODO at the pragma to this issue.

Sep 07 '22 16:09 wrwg

Yeah, the test case can have two ways to fail for the inductive case. One is by assigning x, and another is assigning y. Perhaps, we need to update the test case so that its counterexample can be deterministic.

public fun loop_with_two_back_edges_incorrect(x: u64, y: u64) {
        spec {
            assume x != y;
        };
        loop {
            spec {
                invariant x != y;
            };
            if (x > y) {
                y = y + 1;
                continue
            };
            if (y > x) {
                x = x + 1;
                continue
            };
            break
        };
    }

@gregnazario can work around the issue by the way @wrwg describes.

Sep 07 '22 17:09 junkil-park

[Bug][move-prover] Undefined behavior provides issues with baseline tests

🐛 Bug

To reproduce

Expected Behavior

System information

Additional context