Molly Smith
@yingapple I'm trying to reproduce your issue. The script you provided does not run as it has syntax and other errors. Can you please verify the script you are running...
Closing since https://github.com/microsoft/DeepSpeed/pull/2507 should address this.
Hi @janglichao, can you please provide more information about your setup? Please run ds_report and share its output to give us details about your setup. System info (please complete the following information):...
Hi @katitizhou, GPT-2 is not supported for tensor parallelism without kernel injection. You can split GPT-2 across multiple GPUs by setting kernel injection to True and removing the injection policy.
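A minimal sketch of that suggestion, assuming the `deepspeed.init_inference` keyword arguments `mp_size` and `replace_with_kernel_inject` as they existed around v0.8 (newer releases may spell the tensor-parallel setting differently). The helper names `build_inference_kwargs` and `load_gpt2_tensor_parallel` are hypothetical and only used for illustration:

```python
def build_inference_kwargs(num_gpus: int) -> dict:
    """Keyword arguments for deepspeed.init_inference (hypothetical helper)."""
    return {
        "mp_size": num_gpus,                 # tensor-parallel degree (assumed v0.8-era name)
        "replace_with_kernel_inject": True,  # turn kernel injection on
        # Deliberately no "injection_policy" key, per the comment above.
    }


def load_gpt2_tensor_parallel(num_gpus: int):
    """Wrap Hugging Face GPT-2 in a DeepSpeed inference engine (sketch)."""
    # Local imports so the kwargs helper can be inspected without a GPU setup.
    import torch
    import deepspeed
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("gpt2")
    return deepspeed.init_inference(
        model, dtype=torch.half, **build_inference_kwargs(num_gpus)
    )
```

A script that calls `load_gpt2_tensor_parallel(2)` would typically be started with the DeepSpeed launcher, for example `deepspeed --num_gpus 2 script.py`, so that one process is created per GPU.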
https://github.com/microsoft/DeepSpeed/pull/2851 should fix this issue.
Hi @akamaster, I was able to recreate your issue. There was an issue with OPT injection that has been resolved in the latest DeepSpeed v0.8.0. If you use that version,...
Hi @xiejw, codegen is not supported currently because it has a fused qkv and you're right that we need a special case for it.
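To illustrate why a fused qkv weight needs a special case under tensor parallelism, here is a generic toy sketch. It assumes a simple layout where the Q, K and V rows are concatenated in that order, which is not necessarily CodeGen's actual layout, and it is not DeepSpeed's implementation. The function names are hypothetical:

```python
def naive_shard(rows, world_size, rank):
    """Split a row list into equal contiguous chunks and return this rank's chunk."""
    chunk = len(rows) // world_size
    return rows[rank * chunk:(rank + 1) * chunk]


def fused_qkv_shard(rows, world_size, rank):
    """Split Q, K and V separately, then re-fuse this rank's three slices."""
    third = len(rows) // 3
    q, k, v = rows[:third], rows[third:2 * third], rows[2 * third:]
    return (
        naive_shard(q, world_size, rank)
        + naive_shard(k, world_size, rank)
        + naive_shard(v, world_size, rank)
    )


# Toy fused weight: 4 rows each of Q, K and V (labels stand in for weight rows).
fused = (
    [f"q{i}" for i in range(4)]
    + [f"k{i}" for i in range(4)]
    + [f"v{i}" for i in range(4)]
)

print(naive_shard(fused, 2, 0))      # ['q0', 'q1', 'q2', 'q3', 'k0', 'k1']
print(fused_qkv_shard(fused, 2, 0))  # ['q0', 'q1', 'k0', 'k1', 'v0', 'v1']
```

With the naive split, rank 0 receives all of Q, part of K and none of V, so it cannot compute attention for its share of heads. The per-projection split gives each rank a matching slice of all three.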
Hi @stevensu1977, can you provide the script you used?
Hi @stevensu1977 and @gaziqbal, can you try setting kernel injection to True?
Disregard PR https://github.com/microsoft/DeepSpeed/pull/3083