`SIGSEGV: invalid memory reference` in tests, but only when running them simultaneously

Open HollayHorvath opened this issue 1 year ago • 1 comments

I'm facing an issue similar to #1257, but it only fails when I run multiple tests together; when I run them one by one, everything passes. I didn't open this issue as a bug report because I don't have time right now to create a repro. Once I have one, I'll put it in the comments.

To make sure that it's not a happy little accident, I ran the tests one by one a thousand times (5x1k runs, 0 failures) and I also ran all tests together (1k runs, 232 failures), so I can confidently say that it's running the tests simultaneously that causes the failures.

I also combined my tests into a single test and ran that, which also produced no failures.

I also did a little experiment by deleting and duplicating tests. With only 3 tests the failure rate drops to 6.2%, but with 15 tests not a single run succeeded.

Another check I did was to add the `serial_test` library and annotate all tests with the `#[serial]` attribute. This causes every test except the first one to fail with the following error: called `append_to_inittab` but a Python interpreter is already running. I don't really understand this; my assumption was that one executor stops before the next one starts, but I guess I'm wrong about that. I also tried running the tests with `--jobs 1`, but that didn't help either.

Basically all tests are the same; the only difference is that each one loads a different Python file:

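```rust
use pyo3::prelude::*;
use pyo3::types::IntoPyDict;

#[test]
fn some_test() {
    let source = include_str!("tests/some_test.py");

    let data = load_some_data();

    // Registers `my_library` as a built-in module; PyO3 requires this to
    // happen before the interpreter is initialized.
    pyo3::append_to_inittab!(my_library);

    Python::with_gil(|py| {
        // Expose `data` to the script as a global.
        let globals = vec![("data", Py::new(py, data).unwrap())];

        Python::run_bound(
            py,
            source,
            Some(&globals.into_py_dict_bound(py)),
            None,
        )
    })
    .unwrap();
}
```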

As a final clue, if I run multiple tests but comment out the `pyo3::append_to_inittab!(my_library);` line in all but a single test, and cargo runs that test first, then it's always green (I didn't do the 1k runs with it, but I ran it manually ~20 times).
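The workaround in the paragraph above (register the module once, before any test touches the interpreter) can be sketched with `std::sync::Once`. This is a minimal sketch under assumptions, not a confirmed fix: `register_module` is a hypothetical stand-in for `pyo3::append_to_inittab!(my_library)`, and `ensure_registered`, `registrations`, and the counter are illustrative names that exist only so the behaviour can be checked without PyO3.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

// Counts how often the registration body actually runs (illustration only).
static REGISTRATIONS: AtomicUsize = AtomicUsize::new(0);

// Process-wide guard shared by every test in the same test binary.
static INIT: Once = Once::new();

/// Hypothetical stand-in for `pyo3::append_to_inittab!(my_library)`.
fn register_module() {
    REGISTRATIONS.fetch_add(1, Ordering::SeqCst);
}

/// Runs the registration at most once, no matter how many tests call it
/// or from how many threads.
fn ensure_registered() {
    INIT.call_once(register_module);
}

/// Number of times the registration body has run so far.
fn registrations() -> usize {
    REGISTRATIONS.load(Ordering::SeqCst)
}
```

Each test would call `ensure_registered()` as its first statement, before `Python::with_gil`. `Once` also blocks concurrent callers until the first call has finished, so no test can proceed to the interpreter while registration is still in progress.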

I cannot share the library I'm using as it's not open source, but based on these results I'm quite confident that the issue is not with the library itself. Sadly I don't have time right now to set up a minimal example with some 3rd party library.

Operating system and version: Ubuntu 22.04 (Linux Mint 21.3)

Python version: 3.10.12 (OS version)

Rust version: 1.78.0 (latest stable)

PyO3 version: originally faced the issue with 0.21, I updated to 0.22 but it's still present.

Jun 27 '24 16:06 HollayHorvath

> As a final clue, if I run multiple tests but comment out the `pyo3::append_to_inittab!(my_library);` line in all but a single test, and cargo runs that test first, then it's always green (I didn't do the 1k runs with it, but I ran it manually ~20 times).

I think this is indeed the key insight. It's quite likely there's a race when calling `append_to_inittab!` multiple times in parallel. Further investigation would be needed to confirm whether it's racing with itself (in which case PyO3 may be able to mitigate it) or whether it's a time-of-check vs time-of-use issue with the check we already have against the interpreter already being up.
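To make the time-of-check vs time-of-use possibility concrete, here is a toy model in plain Rust. It is an illustration only and does not reflect how PyO3 or CPython implement anything: `ToyRuntime`, `toy_append_to_inittab`, and `toy_start_interpreter` are hypothetical names. If the "is the interpreter running?" check and the table update were separate steps, another thread could start the interpreter in between them; holding one lock across both steps closes that window.

```rust
use std::sync::Mutex;

/// Toy stand-in for process-wide interpreter state (not PyO3 internals).
struct ToyRuntime {
    running: bool,
    inittab: Vec<&'static str>,
}

static TOY_RUNTIME: Mutex<ToyRuntime> = Mutex::new(ToyRuntime {
    running: false,
    inittab: Vec::new(),
});

/// The check and the append happen under one lock, so the interpreter
/// cannot start between the "is it running?" check and the table update.
fn toy_append_to_inittab(name: &'static str) -> Result<(), &'static str> {
    let mut rt = TOY_RUNTIME.lock().unwrap();
    if rt.running {
        return Err("interpreter is already running");
    }
    rt.inittab.push(name);
    Ok(())
}

/// Starting takes the same lock; returns how many modules were registered.
fn toy_start_interpreter() -> usize {
    let mut rt = TOY_RUNTIME.lock().unwrap();
    rt.running = true;
    rt.inittab.len()
}
```

In this model a registration attempted after start-up returns an error instead of mutating the table, which mirrors the message reported when the tests were run with `#[serial]`.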

Jul 02 '24 09:07 davidhewitt