Enming Yuan
https://github.com/graykode/xlnet-Pytorch/blob/cb793a1c75bdc59e3360f04ec641af726719811f/xlnet.py#L163 In your implementation, the FFN module has only one linear layer. Is this a bug?
Thanks for your wonderful work. Well-organized project! You have mentioned that you did a grid search on hyperparameters. Could you please provide the best hyperparameters for different datasets? or at...
Thanks for your educational blog post and this repo. Could you please provide your scripts to finetune the 70B model in this repo? BTW, when I run your 7B finetune...
Hi, thanks for sharing your implementations, it is really helpful and very clean to follow. I have one question about the embedding used in your model. There are 3 options...