OpusTools
Which tokenizer is used for each language?
The provided TMX files contain tokenized text, and I wonder which tokenizer is used for languages such as Thai and Chinese. Is this documented anywhere? Thanks!