DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
I am testing NVMe offloading when training a model. When I try to save a checkpoint, I am getting (full stack trace below): ``` NotImplementedError: ZeRO-3 does not yet support...
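For context, NVMe offloading with ZeRO-3 is driven by the DeepSpeed JSON config; a minimal sketch of the relevant section is below (the `nvme_path` value is a placeholder, not taken from the report above):

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": {
      "device": "nvme",
      "nvme_path": "/local_nvme"
    },
    "offload_param": {
      "device": "nvme",
      "nvme_path": "/local_nvme"
    }
  }
}
```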
Hello, I'm running Windows 10 and would like to install DeepSpeed to speed up inference of GPT-J. My system is the following: ``` Windows 10 cuda 11.6 torch 1.13.0...
Windows 10, Cuda Toolkit = 10.2, PyTorch = 8.1, VisualStudio = 2019. I'm trying to use this with the transformers library and carefully followed the instructions laid out by the...
## Problem Description requirements-sd.txt requires triton==2.0.0.dev20221005, but that version doesn't exist in the PyPI triton release history; the earliest available triton-v2 version is triton==2.0.0.dev20221030, the most similar version name...
**Describe the bug** I was trying to install DeepSpeed on Windows inside a Python virtual environment. I have been told that DeepSpeed has not been tested on Windows so far...
Hello DeepSpeed :) I am trying to use [Pipeline module](https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/runtime/pipe/engine.py) to train a pipeline parallel model on multiple nodes. I am using Slurm as the cluster scheduler, so I initialized...
I am trying 4-way sharding for "Salesforce/codegen-16B-mono" on 4 A10 chips (24 GiB each). The torch dtype is torch.half. My math told me (please double check): If the sharding is correct, then...
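The memory arithmetic behind the question can be sketched as follows; this is a rough estimate counting only the fp16 parameter weights, ignoring activations, KV cache, and CUDA context overhead:

```python
# Rough weight-memory estimate for a 16B-parameter model in fp16,
# sharded across 4 GPUs (numbers assumed from the setup above).
PARAMS = 16e9            # approximate parameter count of codegen-16B-mono
BYTES_PER_PARAM = 2      # torch.half / fp16
NUM_GPUS = 4             # 4x A10
GPU_MEM_GIB = 24         # A10 memory per chip

total_gib = PARAMS * BYTES_PER_PARAM / 2**30   # ~29.8 GiB of weights total
per_gpu_gib = total_gib / NUM_GPUS             # ~7.5 GiB per GPU if evenly sharded

print(f"total weights: {total_gib:.1f} GiB, per GPU: {per_gpu_gib:.1f} GiB")
# The per-GPU weight footprint fits well within 24 GiB, leaving headroom
# for activations, the KV cache, and allocator fragmentation.
```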
Creates a blog post for the automatic tensor parallelism feature.
Expands the unsupported-model list and adds more checks for a clean error exit.