BitNet-Transformers
Low accuracy issue
Hello. First of all, thank you for sharing the code. I have one question about your work: did you check the accuracy after training was completed? When I train for 1 epoch with the train_wikitext.sh script, the loss is about 7 and the perplexity is about 951. This is very different from the perplexity reported in the paper (17.07), so I wonder if I missed something. Thank you in advance for your reply.
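To put the gap in terms of loss, here is a minimal sketch of how the two numbers relate. It assumes the reported loss is the mean per-token cross-entropy in nats and that perplexity is its exponential, which is the usual convention; I have not confirmed that this is what train_wikitext.sh computes. The function names are hypothetical and only for illustration.

```python
import math


def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity implied by a mean per-token cross-entropy loss (in nats)."""
    return math.exp(mean_nll)


def loss_from_perplexity(ppl: float) -> float:
    """Mean per-token cross-entropy loss (in nats) implied by a perplexity."""
    return math.log(ppl)


# Values quoted in this issue.
observed_ppl = 951.0   # after 1 epoch with train_wikitext.sh
paper_ppl = 17.07      # perplexity reported in the paper

observed_loss = loss_from_perplexity(observed_ppl)
target_loss = loss_from_perplexity(paper_ppl)

print(f"loss implied by perplexity {observed_ppl}: {observed_loss:.3f}")
print(f"loss implied by perplexity {paper_ppl}: {target_loss:.3f}")
print(f"gap in loss: {observed_loss - target_loss:.3f} nats")
```

Under that assumption, a perplexity of 951 corresponds to a loss slightly below 7, which matches what I see, and reaching the paper's perplexity would require the loss to fall to roughly 2.8.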