Junseong Kim
@wangwei7175878 Can you share your code for crawling and preprocessing from the issue above? Or, if possible, can you share the full corpus via a shared drive (Dropbox, Google Drive, etc.)? This...
@wangwei7175878 Very interesting. The authors said 0.01 is the default weight decay they used. What's your parameter setting? Is it the same as the default setting in our code except for weight_decay?
@wangwei7175878 Sounds great! Can you make a pull request with your AdamW implementation? I'll test it on my corpus too 👍
This issue stems from #3
@eveliao We haven't implemented the transfer learning process yet, due to issue #32. But you can just load the BERT model using torch.load(bert_path), and the model output is similar to an lstm...
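A hedged sketch of the workaround described above: reload a whole-model checkpoint with `torch.load` and use its output as features. The checkpoint path, the stand-in embedding module (saved here only so the sketch is self-contained), and the input shapes are all assumptions, not the repo's actual API; `weights_only=False` is needed on recent PyTorch to unpickle a full module.

```python
import torch

bert_path = "bert.model"  # placeholder; use your trained checkpoint path

# Stand-in for a saved BERT model so this sketch runs on its own.
torch.save(torch.nn.Embedding(100, 8), bert_path)

# Load the full module back and use it as a frozen feature extractor.
bert = torch.load(bert_path, weights_only=False)
bert.eval()
with torch.no_grad():
    tokens = torch.randint(0, 100, (1, 32))  # (batch, seq_len) token ids
    hidden = bert(tokens)                    # (batch, seq_len, hidden) features
```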
Sorry for my late update; I think your point is right too. I'll fix it ASAP.
Sorry for the late response; I think you are right. I'll fix it ASAP.
@briandw Well, I sent an email to the authors, and they told me the same thing. I agree that we can generate the PyTorch module using ONNX, but it might...
@briandw Thank you for your advice. Currently my goal is to train from scratch with a smaller model that can be trained on our GPU environment, because I want to keep...