
Better documentation of required open ports

Open selshowk opened this issue 4 years ago • 2 comments

When trying to deploy dask in a firewalled environment it is important to know the full set of ports that need to be accessible and from where they will be accessed (i.e. source/target/port for all connections between client, scheduler and workers).

Right now I don't think it's very visible in the docs that workers also listen on (random) ports, what tooling exists to pin those ports to fixed values, or who connects to them (the scheduler, other workers, the client). So I think it would be useful to have some docs on this specifically geared towards IT admins setting up firewall rules. An example of docs like this is:

https://kubernetes.io/docs/reference/ports-and-protocols/

  • Enumerate which ports are used in Client <-> Scheduler <-> Worker communications
  • Document best practices for configuring worker ports (currently only covered in docstrings)
    • Currently a flag has to be set manually to limit the ports used
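As a rough illustration of the kind of source/target/port listing requested above, here is a minimal Python sketch that builds such a table. Everything in it is an assumption to be checked against the actual docs: the `PortRule` type and `build_rules` helper are hypothetical, 8786 and 8787 are used as the usual scheduler defaults, the worker range `9000:9100` is an arbitrary placeholder for whatever range an admin pins workers to, and the list of who connects to whom is my reading rather than confirmed documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortRule:
    """One firewall rule: who connects to whom, on which port(s), and why."""
    source: str
    target: str
    ports: str
    purpose: str


def build_rules(scheduler_port=8786, dashboard_port=8787, worker_ports="9000:9100"):
    """Return an assumed connection matrix for a client/scheduler/worker setup.

    The port values are parameters because only the scheduler defaults are
    well known; the worker range is a placeholder chosen for this example.
    """
    return [
        PortRule("client", "scheduler", str(scheduler_port), "task submission"),
        PortRule("worker", "scheduler", str(scheduler_port), "registration and heartbeats"),
        PortRule("browser", "scheduler", str(dashboard_port), "dashboard (HTTP)"),
        PortRule("scheduler", "worker", worker_ports, "task assignment"),
        PortRule("worker", "worker", worker_ports, "data transfer between workers"),
        PortRule("client", "worker", worker_ports, "direct data transfer (assumed optional)"),
    ]


def format_table(rules):
    """Render the rules as a fixed-width plain-text table."""
    header = ("SOURCE", "TARGET", "PORTS", "PURPOSE")
    rows = [header] + [(r.source, r.target, r.ports, r.purpose) for r in rules]
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    lines = ["  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip() for row in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table(build_rules()))
```

Keeping the ports as parameters means the same table can be regenerated once the real worker range for a deployment is decided.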

selshowk, Oct 29 '21 13:10

I'm also interested in this information. Is there any update?

donatogr, Feb 01 '24 17:02

Typically the scheduler listens on 8786 (scheduler communication) and 8787 (dashboard). Each worker will listen on an ephemeral port assigned by the operating system unless one is explicitly specified. If the scheduler ports are already taken, the scheduler will also fall back to ephemeral ports.
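The "preferred port, else ephemeral" behaviour described here can be illustrated with plain sockets. This is a generic standard-library sketch, not Dask's actual implementation; `listen_with_fallback` is a hypothetical helper name, and binding to port 0 is the standard way to ask the operating system for an ephemeral port.

```python
import socket


def listen_with_fallback(preferred_port, host="127.0.0.1"):
    """Listen on preferred_port if it is free, otherwise on an OS-assigned port.

    Passing 0 as the port asks the operating system for an ephemeral port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, preferred_port))
    except OSError:
        # Preferred port is unavailable: start over with a fresh socket
        # and let the OS pick any free port.
        sock.close()
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind((host, 0))
    sock.listen()
    return sock


if __name__ == "__main__":
    first = listen_with_fallback(8786)
    second = listen_with_fallback(8786)  # 8786 is now taken, so this falls back
    print("first listener port: ", first.getsockname()[1])
    print("second listener port:", second.getsockname()[1])
    first.close()
    second.close()
```

For firewall purposes the fallback is the awkward part: a process that silently moves to an ephemeral port will not match a rule written for the fixed one, which is why pinning ports explicitly matters in locked-down environments.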

I agree it might be nice to add a documentation section that clearly outlines this.

jacobtomlinson, Feb 06 '24 16:02