Peng
fixed in https://github.com/InternLM/InternLM/pull/73
We are doing this one in https://github.com/InternLM/InternLM/pull/245
@exceedzhang Does it still OOM after calling `.to(torch.bfloat16)`?
@JiaoPL
or at least they should not impact the value of global norm
@zigzagcai Could you please take a look?
@blankde @huangting4201