Quentin Anthony
Following the discussions in #995 and #862, this PR adds a small usage example to the ImageNet README. One note, however, is that specifying `--gpu` will trigger the print in...
@ShadenSmith Enables:
- Training/inference for most workloads on CPU systems
- DeepSpeed development on systems without GPUs (including personal machines) to save compute resources

Most code changes boil down to...
@StellaAthena @ShivanshuPurohit **Note: we will not merge this unless we decide to get rid of DeeperSpeed** This branch completely does away with DeeperSpeed, and instead is based on upstream DeepSpeed....
We should add support for muTransfer: https://github.com/microsoft/mup. It appears non-trivial, but not as difficult as MoE. We'd have to modify the model itself; https://github.com/microsoft/mup/blob/main/examples/Transformer/model.py appears especially relevant. A good workflow would...
This PR introduces MoE support modeled after the support in Megatron-DeepSpeed (https://github.com/microsoft/Megatron-DeepSpeed) This is part of the effort to add/test upstream DeepSpeed features: https://github.com/EleutherAI/gpt-neox/pull/663
Currently users need to either launch with `ds_bench` or `deepspeed`, or append the DeepSpeed repo to their `PYTHONPATH`, to avoid an import error when launching communication benchmarks. These changes both fix...
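A minimal sketch of the `PYTHONPATH` workaround mechanism (the stub module here is a stand-in; substitute the real DeepSpeed checkout path in practice):

```shell
# Create a stand-in module to demonstrate how prepending a directory
# to PYTHONPATH makes its modules importable without installation.
tmp=$(mktemp -d)
echo "answer = 42" > "$tmp/deepspeed_stub.py"

# With the directory on PYTHONPATH, the import succeeds:
PYTHONPATH="$tmp" python3 -c "import deepspeed_stub; print(deepspeed_stub.answer)"
```

The proposed fix makes this manual path manipulation unnecessary when running the benchmark scripts directly.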
After the patch in https://github.com/microsoft/DeepSpeed/pull/1400 for BigScience, the final element of the `inputs` tuple is conditional on whether its grad is null (https://github.com/microsoft/DeepSpeed/blob/v0.7.5/deepspeed/runtime/pipe/engine.py#L995). This will always fail if `elt.grad is...
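A minimal sketch of the failure mode, with stand-in names (`FakeTensor`, `grads_to_send` are hypothetical, not the engine's actual code): probing `.grad` on a trailing tuple element that is not a tensor raises `AttributeError` unless the check is guarded:

```python
class FakeTensor:
    """Stand-in for a tensor carrying an optional .grad attribute."""
    def __init__(self, grad=None):
        self.grad = grad

def grads_to_send(inputs):
    # Guard with hasattr so non-tensor trailing elements (e.g. ints)
    # don't raise AttributeError when probing .grad, then filter out
    # null grads as the engine's condition intends.
    return [elt.grad for elt in inputs
            if hasattr(elt, "grad") and elt.grad is not None]

inputs = (FakeTensor(grad=1.0), FakeTensor(), 7)  # 7 has no .grad attribute
print(grads_to_send(inputs))  # -> [1.0]
```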
We want to add Mamba to gpt-neox:
- [ ] Add basic mamba block, without kernels, from https://github.com/state-spaces/mamba/tree/main/mamba_ssm/modules to https://github.com/EleutherAI/gpt-neox/tree/main/megatron/model
- [ ] Add mamba kernels from https://github.com/state-spaces/mamba/tree/main/mamba_ssm/ops
- [...
DeepSpeed wins most of the inference benchmarks I've seen. We should test their claims on neox models. EleutherAI spends a significant amount of compute running inference, so any improvement in inference performance...
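A minimal timing-harness sketch for such a comparison (the workload here is a placeholder; swap in the actual neox and DeepSpeed inference calls when benchmarking):

```python
import time

def time_fn(fn, warmup=2, iters=10):
    """Return mean wall-clock seconds per call after warmup iterations."""
    for _ in range(warmup):
        fn()  # warm caches / JIT / kernel autotuning before measuring
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

# Placeholder workload; replace with e.g. a generate() call from each
# backend to compare per-request latency under identical inputs.
mean = time_fn(lambda: sum(range(10_000)))
print(f"{mean:.6f} s/iter")
```

For real model benchmarks, per-token latency and throughput at several batch sizes would matter more than a single mean, but the same harness structure applies.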
We need to convert the existing Docker container into a Singularity container and provide it to users.
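Singularity can build an image directly from a Docker source, so this may not require maintaining a separate definition file. A sketch, assuming a published image tag (`eleutherai/gpt-neox:latest` is a placeholder, not a confirmed tag):

```shell
# Build a Singularity image (.sif) directly from the Docker image.
singularity build gpt-neox.sif docker://eleutherai/gpt-neox:latest

# Illustrative run: --nv passes through the host NVIDIA GPU stack.
singularity exec --nv gpt-neox.sif python deepy.py train.py configs/your_config.yml
```

This matters for HPC sites that disallow Docker but support Singularity/Apptainer.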