coreference-resolution
Gamma in the LR scheduler is too small
gamma should be 0.999 with step_size=1, so that the learning rate decays by 0.1% at every scheduler step, as recommended in the paper. With the current settings the learning rate is not decayed gradually; it is simply cut abruptly after 10k steps. The relevant line: https://github.com/shayneobrien/coreference-resolution/blob/f368f5a06e1d646d60c50d824235576bb8fe4198/src/coref.py#L453
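
To make the difference concrete, here is a minimal sketch of the step-decay rule that PyTorch's `StepLR` follows, `lr = base_lr * gamma ** (step // step_size)`, written in plain Python so it runs without any dependencies. The helper name `step_lr`, the base learning rate of 1e-3, and the "abrupt" configuration (step_size=10000, gamma=0.001) are assumptions chosen for illustration only; they are not copied from the repository.

```python
def step_lr(base_lr, step, step_size, gamma):
    """Learning rate after `step` scheduler steps under a step-decay rule.

    Mirrors the closed form used by torch.optim.lr_scheduler.StepLR:
    the rate is multiplied by `gamma` once every `step_size` steps.
    """
    return base_lr * gamma ** (step // step_size)


BASE_LR = 1e-3  # assumed value, for illustration only

# Proposed setting: multiply by 0.999 at every step (a 0.1% decay per step).
smooth = [step_lr(BASE_LR, s, step_size=1, gamma=0.999) for s in (0, 1, 2, 10_000)]

# Hypothetical "abrupt" setting: nothing happens until step 10_000,
# then the rate is multiplied by a very small gamma in one jump.
abrupt = [step_lr(BASE_LR, s, step_size=10_000, gamma=0.001) for s in (0, 9_999, 10_000)]

print("smooth:", smooth)
print("abrupt:", abrupt)
```

In PyTorch the proposed setting would presumably correspond to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.999)`, with `scheduler.step()` called once per training step.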