Jeff Rasley
DeepSpeed now supports several dtypes (i.e., fp32, fp16, bf16). However, it's becoming less clear which parts of training use which dtypes, and when. For example, in...
We noticed our DeepSpeed + Accelerate unit tests are failing on torch 1.8. `torch.distributed.run` requires torch 1.9+, so this bumps your min torch version to 1.9. If you'd rather guard the...
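A version check against the 1.9 minimum stated above could be sketched as follows. This is only an illustration: `MIN_TORCH`, `version_tuple`, and `supports_distributed_run` are hypothetical names, not part of torch, DeepSpeed, or Accelerate, and the sketch takes the version as a plain string so it runs without torch installed.

```python
import re

# Minimum torch version stated in the description above.
MIN_TORCH = (1, 9)


def version_tuple(version: str) -> tuple:
    """Parse the leading 'major.minor' from a string such as '1.9.0+cu111'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_distributed_run(torch_version: str) -> bool:
    """Return True if the given version meets the 1.9 minimum (hypothetical helper)."""
    return version_tuple(torch_version) >= MIN_TORCH
```

In practice the argument would be `torch.__version__`. Comparing integer tuples rather than raw strings avoids ordering "1.10" before "1.9".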
In AML deployments the model dir is not writeable, so download the config/tokenizer to a writeable cache path.
Provide a local AML deployment option; this will use the [AML inference server](https://pypi.org/project/azureml-inference-server-http/) for the front end. We can then easily deploy an MII generated score file via: `azmlinfsrv --model_dir --entry_script...
After #25 is complete we want to expose all DS-inference configs (https://deepspeed.readthedocs.io/en/latest/inference-init.html#deepspeed.init_inference) and ZeRO inference configs in the MII config dictionary.
https://github.com/huggingface/transformers/pull/18261 introduces model arg validation, which is not compatible with how ds-inference was originally set up. We no longer need to do all of the things we previously did in an...
Checking to #2310 allows us to run our mp>1 neox tests.
- [x] add pre/post forward methods
- [x] add generate method if the wrapped module has this attribute
- [ ] add documentation for new pre/post forward calls to RTD...
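
The first two checklist items describe a common wrapper pattern, which might look roughly like the sketch below. It is a minimal pure-Python illustration under stated assumptions, not the actual implementation: `ModuleWrapper`, `_pre_forward`, `_post_forward`, and the two toy modules are hypothetical names, and the `calls` list exists only to make the hook order visible.

```python
class ModuleWrapper:
    """Hypothetical wrapper that runs hooks around the wrapped module's forward call."""

    def __init__(self, module):
        self.module = module
        self.calls = []  # records hook order, for illustration only
        # Expose generate() only when the wrapped module defines it.
        if hasattr(module, "generate"):
            self.generate = self._generate

    def _pre_forward(self, *args, **kwargs):
        # Hook point before the wrapped call; may inspect or rewrite inputs.
        self.calls.append("pre")
        return args, kwargs

    def _post_forward(self, output):
        # Hook point after the wrapped call; may inspect or rewrite the output.
        self.calls.append("post")
        return output

    def forward(self, *args, **kwargs):
        args, kwargs = self._pre_forward(*args, **kwargs)
        output = self.module(*args, **kwargs)
        return self._post_forward(output)

    __call__ = forward

    def _generate(self, *args, **kwargs):
        # Delegate straight to the wrapped module's generate().
        return self.module.generate(*args, **kwargs)


class WithGenerate:
    """Toy stand-in for a model that has a generate method."""

    def __call__(self, x):
        return x + 1

    def generate(self, x):
        return [x, x]


class WithoutGenerate:
    """Toy stand-in for a model without a generate method."""

    def __call__(self, x):
        return x * 2
```

Wrapping `WithGenerate()` yields an object with a working `generate`, while wrapping `WithoutGenerate()` yields one where `hasattr(wrapper, "generate")` is false, so callers can feature-test the wrapper the same way they would the underlying module.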