
Tests fail often for different reasons

Open FliegendeWurst opened this issue 1 year ago • 4 comments

Describe the bug

The tests occasionally fail when run with default settings. I have also observed the tests hanging indefinitely.

ERROR traitlets:client.py:568 Error occurred while starting new kernel client for kernel 95bf912a-0d9e-419a-8ae5-3b4fd9c31f6e: Kernel died before replying to kernel_info

ERROR traitlets:client.py:568 Error occurred while starting new kernel client for kernel fc8ac83e-1017-412a-95de-475b0d83c8ba: Kernel didn't respond in 60 seconds

None = NotebookResult(nb={'cells': [{'cell_type': 'code', 'execution_count': 1, 'metadata': {'execution': {'iopub.status.busy...'949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1'}}}, 'nbformat': 4, 'nbformat_minor': 2}, error=None).error

zmq.error.ZMQError: Address already in use (addr='tcp://127.0.0.1:39979')

^ It is a very bad idea to use ports above 32k, since that range also supplies the ephemeral source ports for outgoing TCP connections (the Linux default range is 32768 to 60999).
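To illustrate the point about port ranges, here is a minimal sketch that checks whether a port falls inside the range the OS uses for outgoing connections. The function names are made up for this example, the `/proc` path only exists on Linux, and the fallback range is the usual Linux default, which is an assumption on other systems.

```python
from pathlib import Path

# Linux exposes the ephemeral (outgoing) port range in this file.
PORT_RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")
# Usual Linux default; treated as an assumption when the file is missing.
DEFAULT_RANGE = (32768, 60999)


def ephemeral_port_range():
    """Return (low, high) for the local ephemeral port range."""
    try:
        low, high = PORT_RANGE_FILE.read_text().split()
        return int(low), int(high)
    except (OSError, ValueError):
        return DEFAULT_RANGE


def in_port_range(port, port_range):
    """Return True if port lies within the inclusive (low, high) range."""
    low, high = port_range
    return low <= port <= high


# Port taken from the ZMQError in the log above.
print(in_port_range(39979, DEFAULT_RANGE))
```

A port inside that range can already be taken by an unrelated outgoing connection at the moment a kernel tries to bind it, which would be consistent with the "Address already in use" error.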

To Reproduce

Steps to reproduce the behavior:

  1. Build and test nbmake using Nix (https://github.com/NixOS/nixpkgs)

Expected behavior

No test errors.

Logs

https://gist.github.com/FliegendeWurst/807356cbe8f273045a167198350e3d9c

FliegendeWurst, Nov 27 '24 10:11

This looks like a race condition caused by running xdist with high parallelism. A workaround is to set --maxprocesses to something lower.
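A rough sketch of that workaround, assuming pytest-xdist is installed and the suite is launched from Python. The helper name `xdist_args` and the default cap of 4 are made up for illustration; `-n auto` and `--maxprocesses` are the pytest-xdist options.

```python
import sys


def xdist_args(max_workers=4):
    """Build pytest-xdist arguments that cap the number of workers.

    `-n auto` lets xdist size the pool from the CPU count, and
    `--maxprocesses` puts an upper bound on that pool.
    """
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    return ["-n", "auto", "--maxprocesses", str(max_workers)]


# Full command line; run it with subprocess.run(command) if desired.
command = [sys.executable, "-m", "pytest", *xdist_args(4)]
print(" ".join(command))
```

Lowering the cap reduces how many kernels start at the same time. Going by the later comments in this thread, that may make the failures rarer rather than eliminate them.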

mweinelt, Dec 08 '24 15:12

This definitely feels like something best worked around. For reference, we depend on nbclient, which in turn depends on a stack of Jupyter libraries that are not the most amenable to parallelism.

https://github.com/jupyter/nbclient

What degree of parallelism are you running with?

alex-treebeard, Dec 09 '24 16:12

> What degree of parallelism are you running with?

I was running the tests with 32 jobs, but I sometimes encountered the issue with 12 jobs too.

FliegendeWurst, Dec 09 '24 16:12

Not sure if there has been any update on this since @FliegendeWurst raised the original issue, but I can confirm that I am also seeing this regularly, even with only 4 workers.

sammorley-short, Nov 04 '25 14:11