Tests fail often for different reasons
Describe the bug The tests occasionally fail when run with default settings. I have also seen the tests hang indefinitely.
ERROR traitlets:client.py:568 Error occurred while starting new kernel client for kernel 95bf912a-0d9e-419a-8ae5-3b4fd9c31f6e: Kernel died before replying to kernel_info
ERROR traitlets:client.py:568 Error occurred while starting new kernel client for kernel fc8ac83e-1017-412a-95de-475b0d83c8ba: Kernel didn't respond in 60 seconds
None = NotebookResult(nb={'cells': [{'cell_type': 'code', 'execution_count': 1, 'metadata': {'execution': {'iopub.status.busy...'949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1'}}}, 'nbformat': 4, 'nbformat_minor': 2}, error=None).error
zmq.error.ZMQError: Address already in use (addr='tcp://127.0.0.1:39979')
^ It is a very bad idea to use ports above 32k, since that range is also used for the source ports of outgoing TCP connections.
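To illustrate the "Address already in use" error above, here is a minimal Python sketch using only the standard library. It is not how the Jupyter stack picks its ports (I have not checked that); the helper names `pick_free_port` and `is_port_taken` are made up for this example. It shows that binding to port 0 lets the OS choose a free port, and that a second bind to a port that is still held fails with EADDRINUSE.

```python
import errno
import socket


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0.

    Hypothetical helper for illustration only. The port is released when
    the socket closes, so another process can grab it before the caller
    binds it again.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]


def is_port_taken(port, host="127.0.0.1"):
    """Return True if binding to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
        return False


# Hold a port open, then check that a second bind to it collides.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))
held_port = holder.getsockname()[1]
collides = is_port_taken(held_port)
holder.close()
```

Note that `pick_free_port` only narrows the window: the port is free at the moment it is returned, but nothing reserves it afterwards, so many workers starting kernels at the same time could plausibly still collide.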
To Reproduce Steps to reproduce the behavior:
- Build and test nbmake using Nix (https://github.com/NixOS/nixpkgs)
Expected behavior No test errors.
Logs https://gist.github.com/FliegendeWurst/807356cbe8f273045a167198350e3d9c
This looks like a race condition caused by running xdist with high parallelism. A workaround is to set --maxprocesses to something lower.
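One way to apply a cap without passing the flag on every run might be a conftest.py hook. This is a minimal sketch, assuming the installed pytest-xdist provides the `pytest_xdist_auto_num_workers` hook and that the tests are run with `-n auto`; `MAX_WORKERS` and its value of 4 are hypothetical, not something nbmake ships.

```python
# conftest.py (sketch)
import os

# Hypothetical cap; pick whatever value is stable on your machine.
MAX_WORKERS = 4


def pytest_xdist_auto_num_workers(config):
    """Cap the worker count pytest-xdist uses for `-n auto`.

    Assumes the hook is available in the installed pytest-xdist.
    """
    cpus = os.cpu_count() or 1
    return max(1, min(cpus, MAX_WORKERS))
```

This only lowers the likelihood of kernels colliding at startup; it does not remove the underlying race.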
This definitely feels like something best worked around. For reference, we depend on nbclient, which in turn depends on a stack of Jupyter libraries that are not the most amenable to parallelism.
https://github.com/jupyter/nbclient
What degree of parallelism are you running with?
I was running the tests with 32 jobs, but I sometimes encountered the issue with 12 jobs too.
Not sure if there's been any update on this since @FliegendeWurst raised the original issue, but I can also confirm I'm seeing this regularly even with only 4 workers.