
[ALBERT] Pre-training on TPU Pod

Open · ngoanpv opened this issue 6 years ago • 1 comment

Hi all, could I do pre-training on a TPU Pod v2-256 with the large/xlarge V2 config (batch size 4096, 3M steps, ...)? Is there a config for making it work?

ngoanpv avatar Nov 19 '19 03:11 ngoanpv

I also wonder about that. According to https://cloud.google.com/tpu/docs/training-on-tpu-pods?hl=ko, you should keep the per-core batch size the same, which means scaling the global batch size by the TPU count and dividing the steps by the TPU count. Should I set the config like this: batch size = 4096 * 32, steps = 3M / 32?
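The scaling above can be sketched as a quick calculation. This is just an illustration of the "keep per-core batch size fixed" rule from the linked doc, assuming a v2-256 pod has 32x the cores of a single v2-8 device; the variable names are made up for this sketch:

```python
# Hypothetical sketch of the pod-scaling rule: keep the per-core batch
# size fixed, so the global batch size grows with the core count and the
# step count shrinks proportionally. All names here are illustrative.

BASE_CORES = 8         # a single TPU v2-8 device
POD_CORES = 256        # a TPU v2-256 pod
SCALE = POD_CORES // BASE_CORES  # 256 / 8 = 32

base_batch_size = 4096        # global batch size in the large/xlarge config
base_train_steps = 3_000_000  # 3M steps

pod_batch_size = base_batch_size * SCALE     # 4096 * 32 = 131072
pod_train_steps = base_train_steps // SCALE  # 3,000,000 / 32 = 93,750

print(pod_batch_size, pod_train_steps)  # -> 131072 93750
```

Note this keeps the total number of examples seen (batch size x steps) constant; whether the learning-rate schedule also needs adjusting is a separate question the doc does not settle here.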

jwkim912 avatar Nov 22 '19 04:11 jwkim912