1 issue by igor0
**TL;DR:** The non-ZeRO ("stage 0") optimizer in DeepSpeed makes fragile assumptions about the optimizer state in the checkpoint, even when the ``finetune: true`` configuration parameter is set. A mitigating factor is...
bug