Yoshinari Fujinuma

12 comments by Yoshinari Fujinuma

@cmoen This might be of interest to you. I do not know how Lucene currently implements it, but it is doable by, e.g., having a separate function in [FST.java](https://github.com/atilika/kuromoji/blob/0b01987c6977701f01db901d738869b0275212d5/kuromoji-core/src/main/java/com/atilika/kuromoji/fst/FST.java) that returns...

I was looking into the ["Tuning Spark" document for Spark 1.2.0](https://spark.apache.org/docs/1.2.0/tuning.html), and there is a section mentioning that using serialization helps reduce memory usage in Spark. Perhaps Fujikawa-san is...

Hi, I recommend looking at the output of the "Viterbi" option available at https://www.atilika.com/en/kuromoji/ to see what's going on. It seems that for IPADic (default dictionary) there is a connection...

This is a common issue because when the tokenization model is trained, it looks at the surface feature (and POS, base form, conjugation form, etc.) rather than its reading. The...

Let's try not to confuse the "feature" used in machine learning models with the "feature" in Kuromoji. The word "feature" has a special meaning in the context...

@amyeroberts Shall I open a pull request? Have one handy.

Sounds good! +1 to follow the "keep it simple (and stupid)" principle.

Thank you so much @younesbelkada! Yes, (at least currently) NOT looking for distributed training (e.g., distributed data parallel through `torchrun`) when `load_in_8bit` (or 4bit) is turned on. Only NPP. Looking...

@chenwuchen Bumped into the same error. Solution: Use DDP or `torchrun`.

CC: @RogerYu123 @xiteng01