[Feature Request] Add the capability to use NVLink across docker containers.
See the discussions in #1253 and NVIDIA/nccl#324: CUDA IPC cannot be established between docker containers with different GPU mounts, which hurts NCCL performance in GPU clusters.
Can nvidia-docker provide a capability that allows GPUs in different docker containers to see each other, while the applications remain isolated in some way?
cc @cheyang @WencongXiao
Hi @sjeaugey, would you mind offering a checklist of what it would take to use NVLink between two containers within the same node? This is fairly critical for us to improve overall GPU utilization on public cloud. Maybe we can also help?
I'm assuming we're talking about containers where only a single GPU is visible -- please correct me if I'm wrong. I see two rather large steps.
- CUDA. Last time I checked, CUDA did not allow you to use P2P through NVLink between two containers which didn't see each other's GPUs. Maybe there is a way to make it work, maybe I'm just plain wrong, but the first step is certainly to ensure we can use CUDA P2P between two containers through NVLink, outside of NCCL, using simple CUDA P2P semantics (see the IPC sketch after this list).
- NCCL. Currently NCCL relies on each GPU detecting what other GPUs there are on the node, and checking whether there are NVLinks between them and whether they can be used or not. That topology detection would need to change: each rank on the node would detect only its local topology information, share it with the other ranks on the node through out-of-band communication, and then build the intra-node topology from that (see the second sketch after this list). That's a pretty heavy rework of how topology detection works, and it adds an extra communication step in the middle. It's something I think we'll do at some point, because it should also improve performance, but we don't have a clear plan for that to happen yet.
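To make the first step concrete, here is a minimal sketch (not from this thread) of the "CUDA P2P between two containers, outside of NCCL" test: one process exports an IPC handle for a buffer on its GPU, and a second process (in the other container) opens it and copies through it. The `export`/`import` roles, the `ipc_handle.bin` file, and the idea of passing the handle through a shared volume are assumptions made for the example; today this is exactly the open that fails across containers with different GPU mounts.

```c
/* Hypothetical test, not an nvidia-docker or NCCL API.
 * Build with nvcc; run "./a.out export" in container A (GPU 0),
 * then "./a.out" in container B (GPU 1), with ipc_handle.bin on a shared volume. */
#include <cuda_runtime.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define CHECK(call) do {                                                     \
    cudaError_t e = (call);                                                  \
    if (e != cudaSuccess) {                                                  \
        fprintf(stderr, "%s:%d %s\n", __FILE__, __LINE__,                    \
                cudaGetErrorString(e));                                      \
        exit(1);                                                             \
    }                                                                        \
} while (0)

int main(int argc, char **argv) {
    const size_t bytes = 1 << 20;
    if (argc > 1 && strcmp(argv[1], "export") == 0) {
        /* Exporter: allocate a device buffer and publish its IPC handle. */
        float *buf;
        cudaIpcMemHandle_t handle;
        CHECK(cudaMalloc((void **)&buf, bytes));
        CHECK(cudaMemset(buf, 0x42, bytes));
        CHECK(cudaIpcGetMemHandle(&handle, buf));
        FILE *f = fopen("ipc_handle.bin", "wb");
        fwrite(&handle, sizeof(handle), 1, f);
        fclose(f);
        printf("handle exported; keep this process alive for the importer\n");
        getchar();                       /* allocation must outlive the importer's open */
        CHECK(cudaFree(buf));
    } else {
        /* Importer: open the remote buffer and copy from it device-to-device. */
        cudaIpcMemHandle_t handle;
        FILE *f = fopen("ipc_handle.bin", "rb");
        if (!f || fread(&handle, sizeof(handle), 1, f) != 1) {
            fprintf(stderr, "run the exporter first\n");
            return 1;
        }
        fclose(f);
        void *peer;
        float *local;
        CHECK(cudaMalloc((void **)&local, bytes));
        /* This is the call that currently fails across containers that
         * do not see each other's GPUs. */
        CHECK(cudaIpcOpenMemHandle(&peer, handle, cudaIpcMemLazyEnablePeerAccess));
        CHECK(cudaMemcpy(local, peer, bytes, cudaMemcpyDeviceToDevice));
        CHECK(cudaIpcCloseMemHandle(peer));
        CHECK(cudaFree(local));
        printf("cross-process copy through the IPC handle succeeded\n");
    }
    return 0;
}
```

If this test works through NVLink between the two containers, the CUDA side of the problem is solved and the remaining work is on the NCCL side.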
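For the second step, here is a hedged sketch (not NCCL code) of the "detect locally, then exchange out-of-band" pattern described above: each rank asks NVML which PCI bus IDs its single visible GPU is NVLinked to, then the ranks allgather those records so every rank can assemble the node-wide NVLink map. The `LocalNvlinkInfo` struct, the `MAX_LINKS` bound, the 64-rank cap, and the use of MPI as the out-of-band channel are all assumptions made for the example; NCCL's actual topology detection works differently.

```c
/* Illustration only: per-rank local NVLink discovery via NVML, merged
 * through an out-of-band allgather. Build with mpicc, link -lnvidia-ml. */
#include <mpi.h>
#include <nvml.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINKS 12                   /* assumed upper bound on NVLinks per GPU */

typedef struct {
    char gpuBusId[32];                 /* PCI bus id of this rank's GPU        */
    char peerBusId[MAX_LINKS][32];     /* bus ids reachable over NVLink        */
    int  nPeers;
} LocalNvlinkInfo;

int main(int argc, char **argv) {
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    LocalNvlinkInfo mine;
    memset(&mine, 0, sizeof(mine));

    /* Local detection: only the GPU visible inside this container. */
    nvmlInit();
    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);

    nvmlPciInfo_t pci;
    nvmlDeviceGetPciInfo(dev, &pci);
    strncpy(mine.gpuBusId, pci.busId, sizeof(mine.gpuBusId) - 1);

    for (unsigned int link = 0; link < MAX_LINKS; link++) {
        nvmlEnableState_t active;
        if (nvmlDeviceGetNvLinkState(dev, link, &active) != NVML_SUCCESS ||
            active != NVML_FEATURE_ENABLED)
            continue;
        nvmlPciInfo_t remote;
        if (nvmlDeviceGetNvLinkRemotePciInfo(dev, link, &remote) == NVML_SUCCESS) {
            strncpy(mine.peerBusId[mine.nPeers], remote.busId,
                    sizeof(mine.peerBusId[0]) - 1);
            mine.nPeers++;
        }
    }
    nvmlShutdown();

    /* Out-of-band exchange: every rank ends up with every rank's local view,
     * from which the intra-node NVLink topology can be rebuilt. */
    LocalNvlinkInfo all[64];           /* assumes at most 64 ranks for brevity */
    MPI_Allgather(&mine, sizeof(mine), MPI_BYTE,
                  all, sizeof(mine), MPI_BYTE, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int r = 0; r < nranks; r++)
            printf("rank %d GPU %s has %d NVLink peer(s)\n",
                   r, all[r].gpuBusId, all[r].nPeers);
    }
    MPI_Finalize();
    return 0;
}
```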