Mohamad Zamini

4 comments by Mohamad Zamini

> ```
> @microsoft-github-policy-service agree
> ```

@microsoft-github-policy-service agree

> @mzamini92 could you rebase so we can double-check that no tests are breaking with this and we can merge? Thanks!

@muellerzr Thanks for reaching out. I did it based...

With 2 H100s I can launch the code, but I get an OOM error. When I increase to more than 2 GPUs, the model is duplicated on each GPU instead of being sharded and...

Same issue, +1.

- OS: Linux
- GPU count and types (if applicable): 8 x H100 80GB
- Hugging Face Transformers/Accelerate/etc. versions: [email protected] & 4.42
- Python version: 3.10
- deepspeed: 0.15.2 & 0.14.2 &...