2 comments by Yunmeng
@mrwyattii Any update on #321? I have a similar scenario: two nodes, each with a single A10 GPU, and we want to serve llama-13B with parallelism across them. Is this...
I don't think simply reinstalling `nvidia-nccl-cu12` will solve the issue. Based on the code in the vLLM repository (https://github.com/vllm-project/vllm/blob/main/vllm/utils.py#L988-L1010 and https://github.com/vllm-project/vllm/blob/main/vllm/distributed/device_communicators/pynccl_wrapper.py#L186-L205), vLLM prioritizes the `nccl` library bundled with `torch`. It...
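As a quick sanity check, here is a minimal sketch (not vLLM's own code) for inspecting which NCCL copies exist in an environment. It assumes the standalone wheel installs under the usual `nvidia.nccl` module path; the point is that reinstalling `nvidia-nccl-cu12` leaves the copy bundled with `torch` untouched:

```python
"""Minimal sketch: show the NCCL version bundled with torch versus the
location of a separately installed nvidia-nccl-cu12 wheel, if any."""
import importlib.util
import torch

# NCCL version compiled into the torch wheel; this is the copy that
# the loader code linked above prefers.
print("torch's bundled NCCL version:", torch.cuda.nccl.version())

# Where the standalone nvidia-nccl-cu12 wheel lives, if installed.
# Reinstalling this package does not change what torch bundles.
try:
    spec = importlib.util.find_spec("nvidia.nccl")
except ModuleNotFoundError:
    spec = None
if spec is not None:
    print("nvidia-nccl-cu12 install path:", list(spec.submodule_search_locations))
else:
    print("nvidia-nccl-cu12 not installed as a separate wheel")
```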