Use special tokens specific to the fine-tuned adapter during decoding
During fine-tuning, special tokens may be added that are specific to the adapter. During decoding, we should use those special tokens and ensure the correct stop tokens, padding, etc. are properly honored.
Repro from @runvnc, related: #68
Model ID: https://huggingface.co/qblocks/mistral_7b_norobots/tree/main
The QLoRA repo example loads its AutoTokenizer with special tokens here:
https://github.com/artidoro/qlora/blob/7f4e95a68dc076bea9b3a413d2b512eca6d004e5/qlora.py#L347
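A minimal sketch of the expected behavior (not this project's actual implementation): load the tokenizer from the adapter repo so the special tokens added during fine-tuning are present, resize the base model's embeddings if the vocabulary grew, and pass the adapter's eos/pad tokens to `generate`. The base model ID is assumed; the adapter ID is the one referenced in this issue.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-v0.1"        # assumed base model
adapter_id = "qblocks/mistral_7b_norobots"   # adapter repo from this issue

# Tokenizer must come from the adapter repo, not the base model, so that
# fine-tuning-time special tokens (extra eos/pad tokens, etc.) are available.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

model = AutoModelForCausalLM.from_pretrained(base_id)
# If special tokens were added, the vocabulary grew; keep embeddings in sync.
if len(tokenizer) > model.get_input_embeddings().weight.shape[0]:
    model.resize_token_embeddings(len(tokenizer))

model = PeftModel.from_pretrained(model, adapter_id)

inputs = tokenizer("Write a short greeting.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    eos_token_id=tokenizer.eos_token_id,  # stop on the adapter's eos token
    pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```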
Will this be completed? I'm planning to use adapters with special tokens like the ones below:
https://huggingface.co/Dogge/llama-3-8B-instruct-Bluemoon-Freedom-lora/
https://huggingface.co/Dogge/llama-3-70B-instruct-uncensored-lora