
[FEATURE]: Allow CUDA build on a host without a device when TORCH_CUDA_ARCH_LIST is set

Open ccoulombe opened this issue 2 years ago • 0 comments

Describe the feature

When TORCH_CUDA_ARCH_LIST is set, the GPU build should succeed without searching for a device. Currently, on a host that does not have a device, the build fails with:

RuntimeError: No CUDA GPUs are available

This is especially important for HPC centres, where build nodes may not have a GPU device and where we build for multiple architectures.
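
To illustrate the requested behaviour, here is a minimal sketch, not ColossalAI's actual build code. The helper name `resolve_cuda_arch_list` and its `env` / `query_device` parameters are hypothetical. It assumes the variable uses the `;` or space separated format that PyTorch's extension builder accepts, and it only touches the device (via `torch.cuda.get_device_capability()`) when the variable is unset.

```python
import os


def resolve_cuda_arch_list(env=None, query_device=None):
    """Return target compute capabilities, e.g. ['7.0', '8.0+PTX'].

    Hypothetical helper: prefers TORCH_CUDA_ARCH_LIST and only queries
    a GPU when the variable is unset or empty.
    """
    env = os.environ if env is None else env
    raw = env.get("TORCH_CUDA_ARCH_LIST", "").strip()
    if raw:
        # Entries may be separated by ';' or by spaces.
        return raw.replace(";", " ").split()

    if query_device is None:
        # Fallback: ask the local device. This is the step that raises
        # "No CUDA GPUs are available" on a build node without a GPU.
        def query_device():
            import torch  # imported lazily so the env-var path needs no GPU
            return torch.cuda.get_device_capability()

    major, minor = query_device()
    return [f"{major}.{minor}"]
```

With something like this, a build node could run `TORCH_CUDA_ARCH_LIST="7.0;8.0;9.0" pip install .` without a device ever being queried.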

Thanks!!

ccoulombe, Jan 22 '24 16:01