Molly Smith issues

Results: 8 issues of Molly Smith

Fix Bloom logits mismatch

Bloom with kernel injection was showing a significant logits mismatch compared to the Hugging Face Transformers baseline, as reported in issue https://github.com/microsoft/DeepSpeed/issues/2730. Softmax input_mask is float32, not int64, and needs to be converted to...

Create tensor parallelism blog/tutorial

Creates a blog post for the automatic tensor parallelism feature.

TP unsupported models and assertions

Expands the unsupported model list and adds more checks for a clean error exit.

Remove bf16 from inference config dtype enum

Remove bf16 from the inference config dtype enum because it is not supported. Users should now see a pydantic error listing the supported types instead of a vague CUDA error. ``` pydantic.error_wrappers.ValidationError: 1 validation...

Auto TP Tutorial with T5 Example

Updates Auto Tensor Parallelism tutorial with T5 example instead of OPT, since OPT is supported with kernel injection and we would like to showcase a model that does not have...

Assert mp_size is factor of model dimensions

The number of GPUs, or mp_size, needs to be a factor of a model's hidden dimension, embedding dimension, number of attention heads, etc. Otherwise we encounter various tensor size errors...
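
The divisibility requirement described above can be illustrated with a minimal sketch. This is not DeepSpeed's actual assertion code: the helper name `check_mp_size` and the dimension names passed to it are hypothetical, chosen only to show the kind of up-front check that turns a tensor size error into a readable message.

```python
from typing import Mapping


def check_mp_size(mp_size: int, dims: Mapping[str, int]) -> None:
    """Hypothetical helper: raise ValueError unless mp_size divides every dimension.

    `dims` maps an illustrative dimension name (e.g. "hidden_size") to its value.
    """
    if mp_size < 1:
        raise ValueError(f"mp_size must be a positive integer, got {mp_size}")

    # Collect every dimension that cannot be split evenly across mp_size GPUs.
    bad = {name: value for name, value in dims.items() if value % mp_size != 0}
    if bad:
        details = ", ".join(f"{name}={value}" for name, value in bad.items())
        raise ValueError(f"mp_size={mp_size} must be a factor of: {details}")


if __name__ == "__main__":
    # Assumed example dimensions, not taken from any particular model.
    model_dims = {"hidden_size": 1024, "num_attention_heads": 16}
    check_mp_size(4, model_dims)  # passes silently
    try:
        check_mp_size(3, model_dims)
    except ValueError as err:
        print(err)
```

Running the check before any weights are sharded means a bad GPU count fails once, with the offending dimensions named, rather than later inside a tensor operation.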

RuntimeError test-wav2vec2.py

I get errors when trying to run the Hugging Face example test-wav2vec2.py. First I get missing Python package errors (datasets, jiwer). After installing the packages I see: RuntimeError: Error opening '/home/mosm/.cache/huggingface/datasets/downloads/extracted/e4488bdcc5e36bb8e49ff9b437db0cde3f99b8f604fabd9bc27b267ced1c7967/6930-75918-0000.flac': System error....

Skip autoTP if tp_size is 1

Skip auto TP if no tensor parallelism is needed / using only 1 GPU. https://github.com/microsoft/DeepSpeed/issues/3285

merge-queue