charNgram2vec

How can I use this to train char-embeddings on a Hindi-Language Corpus?

Open skmalviya opened this issue 5 years ago • 1 comments

For example, a Hindi corpus such as indicnlp_corpus.

skmalviya, Jul 27 '20 07:07

Hi @skmalviya ,

Thank you so much for your interest, and I am sorry for missing your question here. This is indeed a crucial question. At that time, I was using C++ in a naive way, and handling multi-byte characters that way is not trivial. I think it is now better to implement this in Python to get better language coverage.

hassyGo, Oct 27 '23 06:10
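
For anyone landing here, a minimal sketch of the Python route suggested above. This is not code from this repository: the function names `char_ngrams` and `count_ngrams` and the n-gram range are assumptions made for illustration. It only shows that Python strings are indexed by Unicode code point, so Devanagari text is never split in the middle of a multi-byte UTF-8 sequence.

```python
from collections import Counter


def char_ngrams(word, n_min=1, n_max=3):
    """Return all character n-grams of `word` for n in [n_min, n_max].

    Hypothetical helper, not part of charNgram2vec. Slicing a Python str
    works on Unicode code points, so a multi-byte character is never cut.
    """
    return [
        word[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(word) - n + 1)
    ]


def count_ngrams(lines, n_min=1, n_max=3):
    """Count character n-grams over an iterable of whitespace-tokenized lines."""
    counts = Counter()
    for line in lines:
        for word in line.split():
            counts.update(char_ngrams(word, n_min, n_max))
    return counts


# Example with a two-code-point Hindi word.
print(char_ngrams("घर", 1, 2))
```

To run this over a corpus, open the file with `encoding="utf-8"` and pass the file object to `count_ngrams`. One caveat: a code point is not always a user-perceived character. Devanagari vowel signs and the virama are separate code points, so some n-grams will start or end on a combining mark; whether that matters for the embeddings is something to check on your own data.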