
Questions about discriminative_fine_tuning

Open wlhgtc opened this issue 5 years ago • 2 comments

Section 5.4.3 states: "We find that assigning a lower learning rate to the lower layer is effective to fine-tuning BERT, and an appropriate setting is ξ=0.95 and lr=2.0e-5." Comparing this with the code at https://github.com/xuyige/BERT4doc-Classification/blob/master/codes/fine-tuning/run_classifier.py#L812, it seems that you divide the BERT layers into 3 parts (4 layers per part) and set a different learning rate for each part. Some questions about this:

  1. How does the decay factor 0.95 correspond to the number 2.6 in the code?
  2. The final classification layer does not seem to be included; is there no need to set a learning rate for it?
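For reference, the grouping the question describes can be sketched as follows. This is not the repository's actual code, only a minimal illustration assuming 12 transformer layers split into 3 groups of 4, with each lower group's learning rate divided once more by the factor 2.6 mentioned above (the function name `group_lrs` is made up for this sketch):

```python
def group_lrs(base_lr, num_layers=12, group_size=4, divisor=2.6):
    """Return one learning rate per layer: the top group keeps base_lr,
    and each group below it is divided by `divisor` once more."""
    num_groups = num_layers // group_size
    lrs = []
    for layer in range(num_layers):          # layer 0 = lowest layer
        group = layer // group_size          # group 0 = lowest group
        depth_from_top = num_groups - 1 - group
        lrs.append(base_lr / (divisor ** depth_from_top))
    return lrs

lrs = group_lrs(2.0e-5)
# Lowest 4 layers get 2e-5 / 2.6**2, middle 4 get 2e-5 / 2.6,
# top 4 keep the full 2e-5.
```

In a real fine-tuning script these per-layer rates would be passed to the optimizer as separate parameter groups.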

wlhgtc commented on Jun 12 '20

Thank you for your issue!

  1. The number 2.6 was used in our early experiments; after that, we used run_classifier_discriminative.py for discriminative fine-tuning.
  2. The link to run_classifier_discriminative.py is https://github.com/xuyige/BERT4doc-Classification/blob/master/codes/fine-tuning/run_classifier_discriminative.py
  3. The classifier layer is included in run_classifier_discriminative.py.
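The per-layer scheme from Section 5.4.3 can be sketched like this. It is only an illustration, not the repository's code: it assumes lr(depth) = base_lr · ξ^depth with ξ=0.95 and base lr 2.0e-5 as quoted above, and it assumes the classifier head sits at depth 0 and therefore keeps the full base learning rate (that placement is my assumption, not confirmed by the source):

```python
base_lr = 2.0e-5
xi = 0.95          # decay factor from Section 5.4.3
num_layers = 12

# Depth 0 = classifier head (assumed), depth 1 = top transformer
# layer, ..., depth 12 = bottom transformer layer.
layer_names = ["classifier"] + [f"layer_{i}" for i in range(num_layers - 1, -1, -1)]
lrs = {name: base_lr * xi ** depth for depth, name in enumerate(layer_names)}
# lrs["classifier"] == 2.0e-5; each layer below it is 0.95x the one above.
```

Each entry would then become one optimizer parameter group in the fine-tuning script.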

xuyige commented on Jun 25 '20

Thanks for your reply, I will try it!

wlhgtc commented on Jun 28 '20