Results: 1 issue by ziliwang

**Describe the bug** When comparing ZeRO-1 and ZeRO-2, I noticed discrepancies between the results in the DeepSpeed Flops Profiler and the training speed metrics in transformers, and the conclusions drawn...

bug
training