easytokenizer

A high-performance text tokenizer library

Results 5 easytokenizer issues

* C code formatted with Clang-Tidy
* Python code formatted with black

* I have published a repo with Go bindings.
* https://github.com/sunhailin-Leo/easytokenizer-to-go

Excellent work! Does this package provide an API to train a tokenizer (i.e., build the vocab) from a huge corpus? Thanks!

Hello, is it possible to use the library header-only, without compiling it? Compiling is always a hassle.