tarrett

Results 2 issues of tarrett

When using the DeepSeek-V2-Lite_Chat model to generate text, this error sometimes occurs. Can anybody help?

**Describe the bug** I trained the llama-7b model with ZeRO stage 3 and set stage3_gather_16bit_weights_on_model_save to true in ds_config.json, but the saved pytorch-model.bin is only 610K. It is strange that...
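For reference, the ZeRO part of the setup described in this report would look roughly like the sketch below. Only the stage number and the stage3_gather_16bit_weights_on_model_save flag come from the issue text. The variable name `ds_config` and the helper `write_config` are hypothetical, and a real ds_config.json would need further settings (batch size, optimizer, precision) that the report does not show.

```python
import json

# Minimal sketch of the ZeRO section described in the report.
# Only "stage" and "stage3_gather_16bit_weights_on_model_save" are taken
# from the issue text; every other setting is deliberately left out.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "stage3_gather_16bit_weights_on_model_save": True,
    }
}


def write_config(path="ds_config.json"):
    """Write the sketch config to disk as JSON (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(ds_config, f, indent=2)
    return path
```

Calling `write_config()` produces a file with just this fragment, which can be compared against the actual ds_config.json used in training to confirm the flag sits under zero_optimization.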

bug
training