DeepSpeed-MII
Add support for HuggingFace GPT-NeoX implementation
I'm running into a CUDA OOM error when loading this model, because the model is large and the HF pipeline lacks multi-GPU support.
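For context, here is a rough back-of-the-envelope sketch of why a single GPU may run out of memory at load time. The numbers are assumptions, not taken from the report above: a parameter count of 20 billion (the issue does not state the model size), 2 bytes per parameter for fp16, and a hypothetical 16 GiB GPU. It counts the raw weights only, so the real footprint (activations, CUDA context, temporary buffers) would be higher.

```python
import math


def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Memory needed for the raw weights alone, in GiB (no activations or buffers)."""
    return num_params * bytes_per_param / 2**30


def fits_on_one_gpu(num_params: int, bytes_per_param: int, gpu_memory_gib: float) -> bool:
    """True if the weights alone fit in a single GPU's memory."""
    return weight_memory_gib(num_params, bytes_per_param) <= gpu_memory_gib


def min_gpus_for_weights(num_params: int, bytes_per_param: int, gpu_memory_gib: float) -> int:
    """Lower bound on the number of GPUs needed if the weights were split evenly."""
    return math.ceil(weight_memory_gib(num_params, bytes_per_param) / gpu_memory_gib)


if __name__ == "__main__":
    # Assumed values for illustration only.
    assumed_params = 20_000_000_000  # assumption: 20B parameters
    fp16_bytes = 2                   # fp16 stores each parameter in 2 bytes
    assumed_gpu_gib = 16.0           # assumption: hypothetical 16 GiB GPU

    print(f"weights: {weight_memory_gib(assumed_params, fp16_bytes):.1f} GiB")
    print(f"fits on one GPU: {fits_on_one_gpu(assumed_params, fp16_bytes, assumed_gpu_gib)}")
    print(f"min GPUs (weights only): {min_gpus_for_weights(assumed_params, fp16_bytes, assumed_gpu_gib)}")
```

Under these assumptions the weights alone exceed one GPU's memory, so loading would only succeed if the model were sharded across several devices.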