Running tests fails with "node down: Not properly terminated", maybe execnet related?
We recently moved to Python 3.4. When we run our tests in parallel with pytest-xdist (-n), they sometimes fail with the error below. The traceback comes from faulthandler. The failures consistently occur in tests that use celery tasks and have this form:
@override_settings(task_always_eager=True)
def test_example(self):
Packages used: execnet-1.5.0 pytest-3.0.6 xdist-1.15.0
Here is the error log:
platform linux -- Python 3.4.5, pytest-3.0.6, py-1.4.32, pluggy-0.4.0 Django settings: gameserver.settings.test (from environment variable) rootdir: /home/jcazacu/Repos/main_repo/ares-game-server, inifile: tox.ini plugins: xdist-1.15.0, faulthandler-1.4.1, django-3.1.2, cov-2.5.1, celery-4.0.2 gw0 [535] / gw1 [535] / gw2 [535] / gw3 [535] scheduling tests via LoadScheduling .....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................Fatal Python error: Segmentation fault
Thread 0x00007fbd16d3d700 (most recent call first):
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/execnet/gateway_base.py", line 386 in read
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/execnet/gateway_base.py", line 418 in from_io
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/execnet/gateway_base.py", line 954 in _thread_receiver
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/execnet/gateway_base.py", line 213 in run
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/execnet/gateway_base.py", line 277 in _perform_spawn
Current thread 0x00007fbd1eed2740 (most recent call first):
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/django/db/backends/sqlite3/base.py", line 335 in execute
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/django/db/backends/utils.py", line 62 in execute
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/django/db/backends/base/base.py", line 288 in _savepoint_rollback
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/django/db/backends/base/base.py", line 328 in savepoint_rollback
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/django/db/transaction.py", line 243 in __exit__
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/django/test/testcases.py", line 1004 in _rollback_atomics
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/django/test/testcases.py", line 1066 in _fixture_teardown
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/django/test/testcases.py", line 908 in _post_teardown
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/channels/tests/base.py", line 57 in _post_teardown
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/django/test/testcases.py", line 216 in __call__
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/_pytest/unittest.py", line 157 in runtest
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/_pytest/runner.py", line 104 in pytest_runtest_call
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/_pytest/vendored_packages/pluggy.py", line 614 in execute
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/_pytest/vendored_packages/pluggy.py", line 265 in __init__
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/_pytest/vendored_packages/pluggy.py", line 248 in _wrapped_call
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/_pytest/vendored_packages/pluggy.py", line 613 in execute
File "/home/jcazacu/Repos/main_repo/ares-game-server/.tox/test34/lib/python3.4/site-packages/_pytest/vendored_packages/pluggy.py", line 334 in
Any ideas?
Currently we just moved the task tests to run serially since the failures are so unpredictable.
It's really not clear what's happening there. All that's clear is that Python itself segfaults, which is quite a feat and a real pain to debug.
OK, thanks. Do you have any suggestions as to where I can start with debugging?
Unfortunately not; I'm not familiar with the surrounding libraries in use.
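One possible starting point, offered as a rough sketch rather than a known fix: run the suspect code in a fresh interpreter with faulthandler enabled, so that a segfault only kills the child process and its traceback can be captured. The helper names `run_isolated` and `killed_by_signal` are hypothetical (they are not part of pytest or xdist), and the signal check assumes a POSIX system.

```python
import signal
import subprocess
import sys


def run_isolated(code, timeout=60.0):
    """Run `code` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stderr). Hypothetical helper for illustration.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def killed_by_signal(returncode):
    """Return the signal that killed the child, or None (POSIX only).

    On POSIX, subprocess reports death by signal N as returncode -N.
    """
    if returncode < 0:
        return signal.Signals(-returncode)
    return None
```

For example, passing a snippet that calls `pytest.main([...])` on a single suspect test file would show whether that file alone can bring the interpreter down, independent of xdist.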
[gw4] [ 96%] PASSED tests/warehouse/test_warehouse_supplier_refunds.py::TestDeleteSupplierRefund::test_delete_refund_after_delete_cell
10:32:56 [gw5] [ 96%] PASSED tests/warehouse/test_warehouse_supplier_refunds.py::TestDeleteSupplierRefund::test_delete_refund_after_good_has_become_serial
10:32:56 [gw3] node down: Not properly terminated
10:32:56 [gw3] [ 96%] FAILED tests/warehouse/test_warehouse_residue.py::TestResidueControlValidation::test_close_residue_rule_via_mask
10:32:56
10:32:56 replacing crashed worker gw3
10:32:56
10:32:56 [gw8] linux Python 3.7.6 cwd: /opt/buildagent/work/61408fc2f9cf70c9
10:32:57
10:32:57 [gw8] Python 3.7.6 (default, Mar 17 2020, 13:08:12) -- [GCC 7.5.0]
10:33:00
10:33:10 tests/warehouse/test_warehouse_residue.py::TestResidueControlValidation::test_field_validation_positive
10:33:10 [gw2] node down: Not properly terminated
10:33:10 [gw2] [ 97%] FAILED tests/warehouse/test_warehouse_residue.py::TestResidueControlValidation::test_field_validation_negative[Abc]
10:33:10
10:33:10 replacing crashed worker gw2
10:33:10
10:33:10 [gw9] linux Python 3.7.6 cwd: /opt/buildagent/work/61408fc2f9cf70c9
10:33:10
10:33:10 [gw9] Python 3.7.6 (default, Mar 17 2020, 13:08:12) -- [GCC 7.5.0]
10:33:13
10:33:16 tests/warehouse/test_warehouse_residue.py::TestResidueControlValidation::test_field_validation_negative[ ]
10:33:16 [gw7] node down: Not properly terminated
10:33:16 [gw7] [ 97%] FAILED tests/warehouse/test_warehouse_residue.py::TestCreateAndDeleteResidueControl::test_check_table_icon
10:33:16
10:33:16 replacing crashed worker gw7
10:33:16
10:33:16 [gw10] linux Python 3.7.6 cwd: /opt/buildagent/work/61408fc2f9cf70c9
10:33:17
10:33:17 [gw10] Python 3.7.6 (default, Mar 17 2020, 13:08:12) -- [GCC 7.5.0]
10:33:20
10:34:01 tests/warehouse/test_warehouse_residue.py::TestResidueControlValidation::test_field_validation_negative[0]
10:34:01 [gw0] node down: Not properly terminated
10:34:01 [gw0] [ 97%] FAILED tests/warehouse/test_warehouse_residue.py::TestResidueControlValidation::test_close_residue_rule_modal
10:34:01
10:34:01 replacing crashed worker gw0
10:34:01
10:34:01 [gw11] linux Python 3.7.6 cwd: /opt/buildagent/work/61408fc2f9cf70c9
10:34:02
10:34:02 [gw11] Python 3.7.6 (default, Mar 17 2020, 13:08:12) -- [GCC 7.5.0]
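To see which test each worker was running when it went down, a log like the one above can be scanned for a "node down" line followed by the FAILED line for the same worker. This is a minimal sketch, assuming one log entry per line and the verbose line formats shown in this log; `crashed_tests` is a hypothetical helper, not an xdist API.

```python
import re

# Patterns modelled on the verbose xdist output quoted in this thread.
NODE_DOWN = re.compile(r"\[(gw\d+)\] node down:")
FAILED = re.compile(r"\[(gw\d+)\] \[\s*\d+%\] FAILED (.+?)\s*$")


def crashed_tests(log_text):
    """Return (worker, test_id) pairs for tests reported right after a crash.

    A FAILED line only counts if its worker previously logged "node down",
    so ordinary assertion failures are ignored.
    """
    pending = set()  # workers that crashed but have no FAILED line yet
    crashes = []
    for line in log_text.splitlines():
        down = NODE_DOWN.search(line)
        if down:
            pending.add(down.group(1))
            continue
        failed = FAILED.search(line)
        if failed and failed.group(1) in pending:
            pending.discard(failed.group(1))
            crashes.append((failed.group(1), failed.group(2)))
    return crashes
```

Collecting these pairs over several CI runs may show whether the crashes cluster in one test module or fixture, which could narrow down where to look.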
I have seen this type of error occur when combining pytest-xdist with other libraries that deal with multithreaded or multiprocessing code. One common pain point is the Annoy library from Spotify, especially older versions. It does a lot of multithreaded work without giving the end user control of the threading, and pytest-xdist can interact badly with that threading in extremely mysterious and undebuggable ways, leading to a segfault and a crashed worker. Worse, this can differ between OSes or Python versions.
I am debugging an example right now where on Mac OS with Python 3.6.9, everything is fine and pytest-xdist runs my test suite (with parallel workers) without issue. If I take the same code and put it in an equivalent Ubuntu Docker image and run the tests, I get a segfault from execnet and crashed workers. But if I run the tests serially in the Docker image, all tests pass.
These issues are virtually impossible to boil down to simplified reproducible examples as well, since if I knew what all the factors were that are required to minimally reproduce it, that would probably be the solution to debugging it. It's very, very hard to use pytest-xdist in these situations.
I'm seeing a similar error (Python segfault + "node down: Not properly terminated"). It seems to fail consistently in my Ubuntu CI runs (these logs will be unavailable soon) and on my MacBook, but only on CPython 3.11 (3.11.0-beta.3 on Ubuntu and 3.11.0b3+ [f9d0240] on macOS).
nox > python run_tests.py -m 'not longrunning' --script-launch-mode=subprocess -s tests_regression
============================= test session starts ==============================
platform linux -- Python 3.11.0b3, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/runner/work/rinohtype/rinohtype, configfile: setup.cfg
plugins: assume-2.4.3, github-actions-annotate-failures-0.1.6, console-scripts-1.3.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
gw0 I / gw1 I
gw0 [88] / gw1 [88]
.................../home/runner/work/rinohtype/rinohtype/tests_regression/rst/columns_two.rst:1: (ERROR/3) Undefined substitution referenced: "problematic".
....../home/runner/work/rinohtype/rinohtype/tests_regression/rst/exceptional_style.rst:11: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
/home/runner/work/rinohtype/rinohtype/tests_regression/rst/exceptional_style.rst:18: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
/home/runner/work/rinohtype/rinohtype/tests_regression/rst/exceptional_style.rst:21: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
............../home/runner/work/rinohtype/rinohtype/tests_regression/rst/inline_markup.rst:1: (ERROR/3) Undefined substitution referenced: "problematic".
.................Fatal Python error: Segmentation fault
Thread 0x00007f92f6d6f700 (most recent call first):
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 400 in read
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 432 in from_io
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 967 in _thread_receiver
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 220 in run
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 285 in _perform_spawn
Current thread 0x00007f92f7854740 (most recent call first):
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 466 in <genexpr>
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 466 in _index
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 473 in get_rowspanned_columns
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 562 in __int__
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 575 in __iter__
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 537 in __eq__
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/style.py", line 303 in match
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/style.py", line 361 in match
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/style.py", line 622 in match
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/style.py", line 739 in find_matches
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/style.py", line 742 in find_matches
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/style.py", line 742 in find_matches
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/document.py", line 327 in get_matches
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/style.py", line 697 in _get_value_lookup
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/attribute.py", line 469 in get_value_for
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/attribute.py", line 333 in get_config_value
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/style.py", line 512 in get_style
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/util.py", line 160 in function_wrapper
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/draw.py", line 86 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 386 in draw_cell_border
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 401 in _place_rows_and_render_borders
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 166 in render_rows
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/table.py", line 176 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 295 in flow_inner
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 240 in flow
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 572 in _flow_with_next
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 581 in _flow_with_next
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 535 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 295 in flow_inner
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 240 in flow
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 572 in _flow_with_next
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 535 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 295 in flow_inner
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 240 in flow
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 572 in _flow_with_next
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 535 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 295 in flow_inner
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/flowable.py", line 240 in flow
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/layout.py", line 637 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/layout.py", line 366 in _render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/layout.py", line 306 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/layout.py", line 197 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/layout.py", line 197 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/document.py", line 178 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/template.py", line 461 in render
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/document.py", line 478 in _render_pages
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/rinoh/document.py", line 436 in render
File "/home/runner/work/rinohtype/rinohtype/tests_regression/helpers/regression.py", line 105 in render_doctree
File "/home/runner/work/rinohtype/rinohtype/tests_regression/helpers/regression.py", line 92 in _render_file
File "/home/runner/work/rinohtype/rinohtype/tests_regression/helpers/regression.py", line 59 in render_rst_file
File "/home/runner/work/rinohtype/rinohtype/tests_regression/test_rst.py", line 45 in test_rst
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/python.py", line 192 in pytest_pyfunc_call
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/python.py", line 1761 in runtest
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/runner.py", line 166 in pytest_runtest_call
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/runner.py", line 259 in <lambda>
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/runner.py", line 338 in from_call
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/runner.py", line 258 in call_runtest_hook
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/runner.py", line 219 in call_and_report
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/runner.py", line 130 in runtestprotocol
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/runner.py", line 111 in pytest_runtest_protocol
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/xdist/remote.py", line 110 in run_one_test
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/xdist/remote.py", line 91 in pytest_runtestloop
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/main.py", line 322 in _main
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/main.py", line 268 in wrap_session
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/_pytest/main.py", line 315 in pytest_cmdline_main
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_callers.py", line 39 in _multicall
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_manager.py", line 80 in _hookexec
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/pluggy/_hooks.py", line 265 in __call__
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/xdist/remote.py", line 291 in <module>
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 1084 in executetask
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 220 in run
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 285 in _perform_spawn
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 267 in integrate_as_primary_thread
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 1060 in serve
File "/home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/lib/python3.11/site-packages/execnet/gateway_base.py", line 1554 in serve
File "<string>", line 8 in <module>
File "<string>", line 1 in <module>
Extension modules: markupsafe._speedups (total: 1)
...........[gw1] node down: Not properly terminated
F
replacing crashed worker gw1
s...................
=================================== FAILURES ===================================
_________________________ tests_regression/test_rst.py _________________________
[gw1] linux -- Python 3.11.0 /home/runner/work/rinohtype/rinohtype/.nox/regression-3-11-sdist/bin/python
worker 'gw1' crashed while running 'tests_regression/test_rst.py::test_rst[png]'
=============================== warnings summary ===============================
.nox/regression-3-11-sdist/lib/python3.11/site-packages/babel/messages/catalog.py:13
.nox/regression-3-11-sdist/lib/python3.11/site-packages/babel/messages/catalog.py:13
...
> I have seen this type of error occur when combining pytest-xdist with other libraries that deal with multithreaded or multi-processing code.
All tests perform image comparison in parallel (diffpdf.py), but the crash does not occur at that stage of the test. It occurs during rinohtype's rendering, which is single-threaded.
It's always test_rst[png] that segfaults. When running without pytest-xdist (-n 0), all is well. It is possible to reproduce the issue running only the test_rst[png] test case:
git clone https://github.com/brechtm/rinohtype.git
cd rinohtype
git checkout bd4b4157
poetry install
poetry run nox -r -s "regression-3.11(wheel)" -- -k png
- pytest (7.1.2)
- pytest-xdist (2.5.0)
UPDATE: Running the tests not through nox, the issue doesn't occur!
.nox/regression-3-11-wheel/bin/python run_tests.py -m 'not longrunning' --script-launch-mode=subprocess -k png tests_regression
~Similar problem, also with xdist: https://github.com/deltachat/deltachat-core-rust/actions/runs/4194093763/jobs/7271810435~
EDIT: this turned out to be a bug in our code, not pytest: https://github.com/deltachat/deltachat-core-rust/pull/4153
I ran into the same problem in PyPOTS testing: https://github.com/WenjieDu/PyPOTS/actions/runs/4577128474/jobs/8082161160
No solution, but for others who ever come across this issue, here is what happened in my case.
When running pytest tests in parallel with xdist, my tests crash with:
[gw5] node down: Not properly terminated
[gw5] FAILED tests/test.py::test_with_botorch
replacing crashed worker gw5
This crash happens only when I have:
class A:
    def x(self):
        import botorch  # only used in this function
but not when
import botorch
class A:
    def x(self):
        ...
I also cannot reproduce it on macOS; it only happens in CI on DevOps with ubuntu-latest.
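Since the crash goes away when botorch is imported at module level, one thing that might be worth trying is importing such packages eagerly before any test runs. This is an untested assumption, not a confirmed fix. The sketch below is meant for a conftest.py; `EAGER_MODULES` and `preload` are hypothetical names, while `pytest_configure` is the standard pytest hook, which should also run in each xdist worker.

```python
import importlib

# Hypothetical list: packages with native extensions that tests import lazily.
EAGER_MODULES = ("botorch",)


def preload(names):
    """Import each module by name and return the names that loaded.

    Modules that are not installed are skipped rather than failing the run.
    """
    loaded = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            continue
        loaded.append(name)
    return loaded


def pytest_configure(config):
    # Runs once per pytest process, before test collection starts.
    preload(EAGER_MODULES)
```

If the crash disappears with this in place, that would at least suggest the problem is tied to when the import happens rather than to the test body itself.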
@basnijholt
Just curious, are you still seeing the same behavior with that construction?
Did you try playing with the number of threads, etc.?
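If the number of native threads is a suspect, one experiment is to cap the usual thread-pool environment variables before the heavy libraries are imported. This is a minimal sketch: `OMP_NUM_THREADS`, `MKL_NUM_THREADS` and `OPENBLAS_NUM_THREADS` are the common OpenMP/MKL/OpenBLAS settings, but whether the libraries in this setup honor them is an assumption, and `limit_native_threads` is a hypothetical helper.

```python
import os

THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def limit_native_threads(n=1, environ=os.environ):
    """Set thread-pool variables to `n` unless the user already set them.

    Returns a dict of the variables that were changed. This must run before
    the numeric libraries are imported, since they typically read these
    variables once at load time.
    """
    changed = {}
    for var in THREAD_ENV_VARS:
        if var not in environ:  # respect explicit user settings
            environ[var] = str(n)
            changed[var] = str(n)
    return changed
```

Calling `limit_native_threads(1)` at the very top of conftest.py and comparing crash frequency with and without it would be one way to check whether thread oversubscription across xdist workers plays a role.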