machinelearning
XLM-RoBERTa tokenizer
Hey, can you add support for the XLM-RoBERTa tokenizer? It's widely used, and having it available would be very helpful. Thank you!
I'd like to second this. It would be a useful tokenizer to have, as it is used by Donut (another nice-to-have) in the Hugging Face Transformers library.
Adding this tokenizer should be easier now that SentencePiece is implemented in the Microsoft.ML.Tokenizers library.