Dupe factor in create_pretraining_data.py
The documentation for dupe_factor says: "Number of times to duplicate the input data (with different masks)." Did the original BERT pre-training actually use duplicated data? Intuitively the most obvious value would be 1, i.e. no duplication at all. Let me know whether setting dupe_factor to something like 5 or 10 is beneficial and does not make the model overfit.
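
If I read the script correctly, it loops over the corpus dupe_factor times and draws a new random mask on each pass, so the written pre-training examples contain dupe_factor differently-masked copies of each document (the flag appears to default to 10 in the repo). Below is a minimal sketch of that idea, not the original implementation: mask_tokens here is a simplified stand-in for the script's create_masked_lm_predictions and ignores whole-word masking, random-token replacement, and max_predictions_per_seq.

```python
import random

def mask_tokens(tokens, masked_lm_prob=0.15, rng=None):
    """Return a copy of `tokens` with roughly masked_lm_prob of positions
    replaced by [MASK]. Simplified: the real script also keeps some chosen
    tokens unchanged or swaps in random tokens."""
    rng = rng or random.Random()
    output = list(tokens)
    # [CLS] and [SEP] are never masked.
    candidates = [i for i, t in enumerate(tokens) if t not in ("[CLS]", "[SEP]")]
    rng.shuffle(candidates)
    num_to_mask = max(1, int(round(len(candidates) * masked_lm_prob)))
    for i in candidates[:num_to_mask]:
        output[i] = "[MASK]"
    return output

tokens = "[CLS] the quick brown fox jumps over the lazy dog [SEP]".split()
rng = random.Random(12345)
dupe_factor = 3  # illustrative; the original flag default seems to be 10

# Each duplicate is the same sentence with a different random mask,
# so the model (almost) never sees the identical masked example twice.
for _ in range(dupe_factor):
    print(" ".join(mask_tokens(tokens, rng=rng)))
```

So my question is really whether these differently-masked duplicates act as useful data augmentation for the masked-LM objective, or whether repeating the same underlying text dupe_factor times just encourages memorization.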