ScaleLLM
CUDA graph capture may occasionally hang with multiple GPUs
It is a known issue that CUDA graph capture may occasionally hang when multiple workers are in use. Further investigation is needed.
As a workaround, CUDA graph is disabled by default for multi-GPU runs for now.
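
The default described above can be pictured as a small decision rule. The following is only a minimal sketch, not ScaleLLM's actual code: the function name `resolve_cuda_graph_enabled` and the `user_override` parameter are hypothetical, and it assumes that "by default" means an explicit user setting still takes precedence.

```python
from typing import Optional


def resolve_cuda_graph_enabled(num_gpus: int,
                               user_override: Optional[bool] = None) -> bool:
    """Decide whether to use CUDA graph (hypothetical helper, for illustration).

    num_gpus:      number of GPUs (workers) the engine will run on.
    user_override: explicit user choice; None means "use the default".
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    # Assumption: an explicit user setting wins over the default.
    if user_override is not None:
        return user_override

    # Default: enable only for a single GPU, since capture may hang
    # when multiple workers are in use.
    return num_gpus == 1
```

Under these assumptions, a single-GPU run keeps CUDA graph on, a multi-GPU run turns it off unless the user explicitly opts back in.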