cog-llama-template

Spaces before dots and commas

Open: python273 opened this issue 2 years ago • 1 comment

For this prompt:

User: [INST]Hello. Repeat this text exactly "find . -name hello"[/INST]
Assistant:

it outputs:

Sure! Here's the text you requested:

find. -name hello

For some reason, spaces before dots and commas are removed on the 7B and 13B models. Maybe there is a problem with the tokenizer?

https://replicate.com/replicate/llama70b-v2-chat seems to work correctly.

python273, Jul 20 '23 14:07

Thanks @python273! This may be caused by an interaction between Llama's sentencepiece tokenizer and the way we process token streams (e.g. see here).

joehoover, Jul 21 '23 14:07
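To illustrate the kind of interaction suggested above, here is a minimal toy sketch of how decoding a token stream one piece at a time can lose spaces. It does not use the real sentencepiece library or this repository's streaming code: the `decode` helper, the piece segmentation in `PIECES`, and both stream functions are hypothetical stand-ins. The only assumption borrowed from sentencepiece is that a leading space is stored inside a piece as the "▁" marker and dropped at the start of a decoded string. The toy drops every inter-word space, so it shows the general mechanism rather than reproducing the exact symptom reported here.

```python
# Toy model of sentencepiece-style pieces; NOT the real library.
SPACE = "\u2581"  # the "▁" marker that stands for a leading space

# Hypothetical segmentation of: find . -name hello
PIECES = [SPACE + "find", SPACE + ".", SPACE + "-", "name", SPACE + "hello"]


def decode(pieces):
    """Join pieces, turn markers into spaces, drop one leading space."""
    text = "".join(pieces).replace(SPACE, " ")
    return text[1:] if text.startswith(" ") else text


def naive_stream_decode(pieces):
    """Decode each piece in isolation, so every leading space is dropped."""
    return "".join(decode([piece]) for piece in pieces)


def incremental_stream_decode(pieces):
    """Decode the whole prefix each step and emit only the new suffix."""
    seen = []
    emitted = ""
    chunks = []
    for piece in pieces:
        seen.append(piece)
        text = decode(seen)
        chunks.append(text[len(emitted):])  # newly produced characters
        emitted = text
    return "".join(chunks)


if __name__ == "__main__":
    print(repr(naive_stream_decode(PIECES)))
    print(repr(incremental_stream_decode(PIECES)))
```

Under these assumptions, decoding the growing prefix and emitting only the difference keeps the space that belongs to each piece, at the cost of re-decoding the prefix on every step.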