Bug in tokenizer for Tibetan language

Open asusdisciple opened this issue 2 years ago • 1 comments

At the moment you can't use the Tibetan language tokenizer. It fails with the error message:

TypeError: "module" object is not callable

The error is thrown here in sentence_split.py:

    elif split_algo == "bodnlp":
        logger.info(f" - Tibetan NLTK sentence splitter applied to '{lang}'")
        from botok.tokenizers import sentencetokenizer as bod_sent_tok
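
For context, this kind of error usually means that a name bound to a module is later called as if it were a function. The import above binds `bod_sent_tok` to whatever `sentencetokenizer` is; if that is a module, a later call such as `bod_sent_tok(...)` would presumably fail this way. Below is a minimal sketch that reproduces the message without botok installed. The stand-in module and its `sentence_tokenizer` function are hypothetical and only illustrate the pattern; they are not confirmed to match botok's actual API.

```python
import types

# Hypothetical stand-in for a module like botok.tokenizers.sentencetokenizer.
# The function name below is an assumption made for illustration only.
fake_module = types.ModuleType("sentencetokenizer")


def sentence_tokenizer(tokens):
    """Toy splitter: returns all tokens as one 'sentence'."""
    return [list(tokens)]


fake_module.sentence_tokenizer = sentence_tokenizer


def call_module_directly(mod, tokens):
    """Call the module object itself and return the resulting error text."""
    try:
        return mod(tokens)
    except TypeError as exc:
        return str(exc)


# Calling the module object reproduces the reported TypeError.
error_message = call_module_directly(fake_module, ["a", "b"])

# Calling a function defined inside the module works.
result = fake_module.sentence_tokenizer(["a", "b"])

print(error_message)
print(result)
```

If this is indeed the cause, the fix would be to import the callable from inside the module (or call it as an attribute of the module) rather than calling the module itself.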

asusdisciple avatar Jul 28 '23 11:07 asusdisciple

We released the new version 2.1.0 last week. Could you check again?

antoine-tran avatar Dec 05 '23 09:12 antoine-tran