CH3OH
@awaelchli I found that in `pretrain.py`, the accumulation steps are calculated from the global batch size, the number of devices, and the micro batch size. This works fine in a single-node setting, e.g....
enhancement
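
To make the calculation described in the issue concrete, here is a minimal sketch of how accumulation steps could be derived from the three quantities it mentions. The function name `gradient_accumulation_steps` and the `num_nodes` parameter are hypothetical and not taken from `pretrain.py`. The issue text is truncated, so the multi-node handling is only an assumption about the concern being raised: that the device count used in the division should be the total across all nodes.

```python
def gradient_accumulation_steps(
    global_batch_size: int,
    micro_batch_size: int,
    devices: int,
    num_nodes: int = 1,  # hypothetical parameter, not from the original script
) -> int:
    """Return how many micro batches each device accumulates per optimizer step."""
    # Assumption: `devices` counts devices per node, so the total number of
    # processes contributing to one optimizer step is devices * num_nodes.
    world_size = devices * num_nodes
    samples_per_micro_step = micro_batch_size * world_size
    if global_batch_size % samples_per_micro_step != 0:
        raise ValueError(
            "global_batch_size must be divisible by "
            "micro_batch_size * devices * num_nodes"
        )
    return global_batch_size // samples_per_micro_step


# Single node: 512 / (4 * 8) = 16 accumulation steps.
single_node = gradient_accumulation_steps(512, 4, devices=8)

# Two nodes with the same per-node device count: 512 / (4 * 16) = 8.
multi_node = gradient_accumulation_steps(512, 4, devices=8, num_nodes=2)
```

If the node count were left out of the division in the two-node case, each device would accumulate 16 micro batches instead of 8, and the effective global batch size would be twice the configured value.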