Results: 2 comments by Wangxc2019
Got it. I'm making the changes now and looking for other places that need an FAQ hint.
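> You could try setting the language model to bf16 and changing the mixed-precision data type to bf16 as well; fp16 tends to produce NaN in some cases.

According to the author's message, the loss is no longer NaN after switching to bf16, e.g. `torch_dtype=torch.bfloat16`.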
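
For anyone wondering why fp16 is more prone to NaN than bf16, here is a minimal sketch using plain PyTorch tensors (no model is loaded). The value `70000.0` is only an illustrative number chosen to exceed the fp16 range; it is not taken from the original training run.

```python
import torch

# Illustrative value (an assumption, not from the actual run): any magnitude
# above the fp16 maximum of 65504 behaves the same way.
x = torch.tensor(70000.0)

fp16_value = x.to(torch.float16)   # exceeds the fp16 range, becomes inf
bf16_value = x.to(torch.bfloat16)  # bf16 keeps the fp32 exponent range, stays finite

fp16_overflows = bool(torch.isinf(fp16_value))
bf16_overflows = bool(torch.isinf(bf16_value))

# Once an inf appears, ordinary arithmetic such as inf - inf yields nan,
# which is one way an overflow ends up as a NaN loss.
diff = fp16_value.float() - fp16_value.float()
fp16_gives_nan = bool(torch.isnan(diff))

print("fp16 max:", torch.finfo(torch.float16).max)
print("bf16 max:", torch.finfo(torch.bfloat16).max)
print("fp16 overflows:", fp16_overflows, "| bf16 overflows:", bf16_overflows)
print("inf - inf is nan:", fp16_gives_nan)
```

bf16 trades mantissa precision for range, so it is coarser than fp16 but much less likely to overflow, which matches the suggestion quoted above.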