BOS/EOS tokens
Hi,
First of all, thanks for this great package! I am running inference from Go against a Triton server that serves transformer models, and this library is a tremendous help.
One thing I couldn't figure out from the examples or the code is how to make the BPE tokenizer include the BOS and EOS tokens (i.e. `<s>` and `</s>`) in its encoded output. I verified that both tokens are present in my vocab.json, but they seem to get ignored. I tried manually adding them to the tokenizer as special tokens, and I tried wrapping my input sentence in `<s> ... </s>` by hand, but I can't get it to work. What am I missing?
Cheers!
edit: changed formatting for `<s>` so markdown doesn't eat it.
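In case it helps anyone who lands here: the workaround I ended up with was to wrap the encoded IDs manually after tokenization instead of fighting the tokenizer. A minimal sketch below; the IDs `0` and `2` are placeholders for the `<s>` and `</s>` entries, which you'd look up in your own vocab.json, and `wrapWithSpecialTokens` is just a helper name I made up.

```go
package main

import "fmt"

// wrapWithSpecialTokens prepends the BOS id and appends the EOS id to an
// already-encoded token sequence. The ids must match the "<s>" and "</s>"
// entries in the model's vocab.json.
func wrapWithSpecialTokens(ids []int, bosID, eosID int) []int {
	out := make([]int, 0, len(ids)+2)
	out = append(out, bosID)
	out = append(out, ids...)
	out = append(out, eosID)
	return out
}

func main() {
	encoded := []int{3923, 18, 774} // hypothetical BPE output for a sentence
	fmt.Println(wrapWithSpecialTokens(encoded, 0, 2))
	// → [0 3923 18 774 2]
}
```

Not elegant, but the Triton server only sees token IDs anyway, so wrapping after encoding works fine for inference.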
Bump
Closing for now as too old. If anyone still has this issue, please provide a simple example. Thanks.