igor0

Results: 1 issue by igor0

**TL;DR:** The non-ZeRO ("stage 0") optimizer in DeepSpeed makes fragile assumptions about the optimizer state in the checkpoint, even when the ``finetune: true`` configuration parameter is set. A mitigating factor is...

bug