bert
How many epochs do we need when pretraining BERT?
I want to know how many epochs we need when pretraining BERT, but most articles about BERT only say how many steps are needed. The paper 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding' gives an approximate number of 40 epochs. But that figure is counted over words, not sentences, while each sample during pretraining is a sequence of sentences. Is the epoch not important in NLP? In CV, the epoch is important.
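For what it's worth, the paper's 40-epoch figure is just its reported step count converted through batch size and sequence length. Here is a minimal back-of-the-envelope sketch using the figures the paper reports (1,000,000 steps, batch size 256 sequences, 512 tokens per sequence, a ~3.3 billion word corpus), assuming every sequence is the full 512 tokens:

```python
# Convert pretraining steps to epochs using the BERT paper's reported figures.
steps = 1_000_000          # total optimizer updates reported in the paper
batch_size = 256           # sequences per batch
seq_len = 512              # tokens per sequence (simplifying assumption:
                           # the paper uses 128 tokens for the first 90% of steps)
corpus_tokens = 3.3e9      # approximate corpus size in words

tokens_seen = steps * batch_size * seq_len
epochs = tokens_seen / corpus_tokens
print(f"~{epochs:.0f} epochs")   # ~40 epochs, matching the paper
```

So epochs and steps describe the same thing here: an epoch is one pass over the token corpus, and papers tend to report steps because that is what the training schedule (learning-rate warmup/decay) is defined in.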
Did you find the answer?