BitNet
Repeated tokens generated from 'generate.py' running on GPU
Dear Authors,
Thanks for introducing this amazing project. When I tested the BitNet inference kernel on an RTX 3090 under Ubuntu, I followed the commands in README.md, but the output consisted of repeated tokens. For example:
Could you help me explain Python?
OfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOfOf
Could you help me check what might be wrong here? Thanks.
I am facing the same problem.
Hello! GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
Similar issue here. The model just outputs repeated nonsensical letters:
python ./run_inference.py -m ./ggml-model-i2_s.gguf -p "When people say \"I'm terrified of the future\", the primary emotion expressed is: " -n 10
When people say "I'm terrified of the future", the primary emotion expressed is: �zuônimersimerszuimersuggyimershands