easytokenizer
High-performance text tokenizer library
Results
5
easytokenizer issues
* C code formatted with Clang-Tidy
* Python code formatted with black
* I published a repo with a Golang binding.
* https://github.com/sunhailin-Leo/easytokenizer-to-go
Excellent work! Does this package provide an API to train a tokenizer (i.e., build the vocab) from a huge corpus? Thanks!
Hello, is it possible to use the library header-only, without compiling it? Compiling is always a hassle.