DeepSpeed
Has the default partition algorithm of pipeline parallelism been changed? I found that training has become a little slower than in previous versions
deepspeed==0.12.6. I do believe that the distribution of parameters is more reasonable now, but why has the training speed become slower?
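To make the comparison concrete, here is a minimal sketch of how the per-stage parameter load of two partitions could be compared. It is plain Python and does not call DeepSpeed. The layer sizes, the stage boundaries, and the helper names (`uniform_bounds`, `stage_loads`, `imbalance`) are all hypothetical and only illustrate the idea; the real boundaries would have to be taken from the logs of each version.

```python
from typing import List


def uniform_bounds(num_layers: int, num_stages: int) -> List[int]:
    """Contiguous split with near-equal layer counts per stage (hypothetical helper)."""
    base, extra = divmod(num_layers, num_stages)
    bounds = [0]
    for stage in range(num_stages):
        # The first `extra` stages take one additional layer.
        bounds.append(bounds[-1] + base + (1 if stage < extra else 0))
    return bounds


def stage_loads(weights: List[int], bounds: List[int]) -> List[int]:
    """Sum the per-layer weights (e.g. parameter counts) inside each stage."""
    return [sum(weights[bounds[i]:bounds[i + 1]]) for i in range(len(bounds) - 1)]


def imbalance(loads: List[int]) -> float:
    """Ratio of the heaviest stage to the mean stage load (1.0 means perfectly even)."""
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical per-layer parameter counts (arbitrary units), not measured values.
layer_params = [8, 1, 1, 1, 1, 1, 1, 2]

# Partition A: equal number of layers per stage.
bounds_a = uniform_bounds(len(layer_params), num_stages=4)
# Partition B: assumed boundaries, standing in for what a newer version might log.
bounds_b = [0, 1, 4, 7, 8]

loads_a = stage_loads(layer_params, bounds_a)
loads_b = stage_loads(layer_params, bounds_b)
print("A:", bounds_a, loads_a, imbalance(loads_a))
print("B:", bounds_b, loads_b, imbalance(loads_b))
```

Note that an even parameter count per stage does not by itself mean even compute time per stage, so measuring per-stage step time under both partitions would still be needed to explain the slowdown.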