Molly Smith comments

Results: 45 comments by Molly Smith

[BUG] Wrong output for batch input for opt model inference.

@yingapple I'm trying to reproduce your issue. The script you provided does not run as it has syntax and other errors. Can you please verify the script you are running...

CPU-Adam: add compile-flag to enable param-copy from CPU to GPU

Closing since https://github.com/microsoft/DeepSpeed/pull/2507 should address this.

[BUG]KeyError: 'attention_mask'

Hi @janglichao, can you please provide more information about your setup?

ds_report output: Please run ds_report to give us details about your setup.

System info (please complete the following information):...

[BUG]embedding is not splited while inference using gpt2

Hi @katitizhou, GPT2 is not supported for tensor parallelism without kernel injection. You can split gpt2 across multiple GPUs by setting kernel injection to True and removing injection policy.
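The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the script from the issue: the helper names `build_init_inference_kwargs` and `shard_gpt2` are hypothetical, and the `mp_size` keyword is assumed to match the DeepSpeed release in use (it was the tensor-parallel argument around the v0.8 releases; newer releases may expect a different configuration). The key points are that `replace_with_kernel_inject` is set to `True` and no `injection_policy` is passed.

```python
def build_init_inference_kwargs(world_size: int) -> dict:
    """Build keyword arguments for deepspeed.init_inference (hypothetical helper).

    Kernel injection is enabled and no injection_policy is included,
    following the advice in the comment above.
    """
    if world_size < 1:
        raise ValueError("world_size must be >= 1")
    return {
        "mp_size": world_size,               # assumed name of the tensor-parallel degree
        "replace_with_kernel_inject": True,  # kernel injection turned on
        # intentionally no "injection_policy" key
    }


def shard_gpt2(world_size: int):
    """Load GPT-2 and wrap it for inference across `world_size` GPUs (sketch)."""
    # Imported lazily so the helper above can be inspected without a GPU setup.
    import torch
    import deepspeed
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("gpt2")
    return deepspeed.init_inference(
        model,
        dtype=torch.half,
        **build_init_inference_kwargs(world_size),
    )
```

A script calling `shard_gpt2(2)` would typically be started with the DeepSpeed launcher, for example `deepspeed --num_gpus 2 script.py`, so that each process receives its own rank.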

[BUG] Incorrect logits on Bloom models

https://github.com/microsoft/DeepSpeed/pull/2851 should fix this issue

[BUG] Wrong logits/outputs when using HFOPTLayerPolicy on OPT model

Hi @akamaster, I was able to recreate your issue. There was an issue with OPT injection that has been resolved in the latest DeepSpeed v0.8.0. If you use that version,...

[BUG] DeepSpeed loads the whole codegen model into GPU

Hi @xiejw, codegen is not supported currently because it has a fused qkv and you're right that we need a special case for it.

[BUG] 'StableDiffusionPipeline' object has no attribute 'children'

Hi @stevensu1977 , can you provide the script you used?

[BUG] 'StableDiffusionPipeline' object has no attribute 'children'

Hi @stevensu1977 and @gaziqbal, can you try setting kernel injection to True?

[BUG] 'StableDiffusionPipeline' object has no attribute 'children'

Disregard PR https://github.com/microsoft/DeepSpeed/pull/3083