examples
examples copied to clipboard
the learning rate of word_language_model
Hi, I have a question about the learning rate in the example "word_language_model", the init lr = 20, which seems very large, can you tell me why lr is set to equal 20? Thanks a lot! If you have some advices about improving the performance, please let me know and thanks
i also have the same question. Look like nobody answers that
me too, when i used lr=20, then the loss was very strange and crazy......
I'm happy to accept contributions for a more reasonable learning rate