gongjingcs
the same
  I define a tensor of size [6, 12, 2048, 2048]. In fp32 it should consume 1207.9 MB, however line 13 shows Total Used Memory: 2511.9 MB
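For reference, the expected figure can be checked with a few lines of plain Python. This is only a sketch of the arithmetic: `tensor_bytes` is a hypothetical helper, not part of any library, and it assumes a dense tensor with 4 bytes per fp32 element and no allocator overhead.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Raw storage of a dense tensor: element count times element size.

    Hypothetical helper for illustration; ignores allocator padding and caching.
    """
    return prod(shape) * bytes_per_element


shape = (6, 12, 2048, 2048)
size = tensor_bytes(shape)  # fp32 -> 4 bytes per element

print(f"elements: {prod(shape)}")
print(f"bytes:    {size}")
print(f"MB  (10^6 bytes): {size / 1e6:.2f}")
print(f"MiB (2^20 bytes): {size / 2**20:.2f}")
```

The raw size works out to 1,207,959,552 bytes, which matches the 1207.9 figure when counted in units of 10^6 bytes. The reported 2511.9 is a little more than twice that, so the extra usage has to come from something other than this one tensor's storage. The calculation itself does not say what.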
> > @szhengac You are correct, LAMB and LARS implementations that are not aware of ZeRO will not work correctly with ZeRO. This is not a fundamental limitation of optimizer...
Same warning here. Looking forward to your reply.