CH3OH

Results: 1 issue by CH3OH

@awaelchli I found that in `pretrain.py`, the accumulation steps are calculated from the global batch size, the number of devices, and the micro batch size. This works fine in a single-node setting, e.g....
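For readers unfamiliar with the calculation, here is a minimal sketch of the arithmetic the issue describes. It is not the actual code from `pretrain.py`; the function name `gradient_accumulation_steps` and its parameter names are hypothetical, and the divisibility checks are an added assumption.

```python
def gradient_accumulation_steps(global_batch_size: int, devices: int, micro_batch_size: int) -> int:
    """Hypothetical helper: number of micro-batches accumulated before each optimizer step.

    Assumes the global batch is split evenly across `devices`, and that each
    device processes its share in chunks of `micro_batch_size`.
    """
    if global_batch_size % devices != 0:
        raise ValueError("global_batch_size must be divisible by devices")
    # Samples each device must process per optimizer step.
    per_device_batch_size = global_batch_size // devices
    if per_device_batch_size % micro_batch_size != 0:
        raise ValueError("per-device batch size must be divisible by micro_batch_size")
    # Forward/backward passes to accumulate before stepping the optimizer.
    return per_device_batch_size // micro_batch_size


# Example: 512 samples per step, 8 devices, 4 samples per micro-batch.
steps = gradient_accumulation_steps(global_batch_size=512, devices=8, micro_batch_size=4)
print(steps)
```

Note that the result depends directly on what `devices` counts. The issue text is truncated, so whether the count in question is per node or across all nodes is not settled here.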

enhancement