Guoao Wei
> The only part that might be tricky is that for some functions we need to pass more than one parameter (see for instance pca in test_indexes). That can be...
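To make that concrete, here is a minimal, hypothetical sketch (the functions and cases are only illustrative, not the repo's actual test code) of one way a parametrized index-preservation test could pass extra arguments for functions that need more than one parameter:

```python
import pandas as pd
import pytest

s = pd.Series(["first doc", "second doc"], index=[10, 11])

# Each case: (test id, function under test, extra positional args after the Series).
cases = [
    ("upper", lambda series: series.str.upper(), ()),
    ("slice", lambda series, stop: series.str.slice(0, stop), (3,)),  # needs an extra parameter
]


@pytest.mark.parametrize("name, func, extra_args", cases)
def test_index_is_preserved(name, func, extra_args):
    result = func(s, *extra_args)
    assert result.index.equals(s.index)
```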
See #130 for test handling `nan`, and #157 for new HeroTypes.
I would prefer to start with adding Chinese support for the preprocessing module. The most common Chinese NLP tools right now are probably [jieba](https://github.com/fxsjy/jieba), [HanLP](https://github.com/hankcs/HanLP), and [pkuseg](https://github.com/lancopku/PKUSeg-python). Also, spaCy has...
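To illustrate, here is a minimal sketch (assuming jieba as the segmenter; the name `tokenize_zh` is only illustrative, not a proposed API) of how jieba could back a Series-level Chinese tokenizer:

```python
import jieba
import pandas as pd


def tokenize_zh(s: pd.Series) -> pd.Series:
    """Segment each Chinese document into a list of tokens with jieba."""
    return s.apply(jieba.lcut)


s = pd.Series(["今天天气很好", "自然语言处理很有趣"])
print(tokenize_zh(s))  # each row becomes a list of segmented words
```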
@jbesomi I'm also confused about the difference. [This page](https://nlp.stanford.edu/IR-book/html/htmledition/tokenization-1.html) describes word segmentation as a step prior to tokenization, but in practice we treat them as almost equivalent. `preprocessing_zh.py` sounds good, would...
Hi @jbesomi. You are correct. I read about [this issue](https://github.com/explosion/spaCy/issues/4695); it seems `zh_core_web_sm` does include a word segmenter, trained on the OntoNotes dataset with gold segmentation. Also, as I mentioned...
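For reference, a quick way to try that segmenter (assuming the model has been installed with `python -m spacy download zh_core_web_sm`) would be something like:

```python
import spacy

# Load spaCy's small Chinese model and segment a sentence.
nlp = spacy.load("zh_core_web_sm")
doc = nlp("今天天气很好")
print([token.text for token in doc])  # word-segmented tokens
```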
Hi, @jbesomi. You've made a good point. spaCy's Chinese model was originally released from [howl-anderson/Chinese_models_for_SpaCy](https://github.com/howl-anderson/Chinese_models_for_SpaCy); however, there hasn't been any info about its performance compared to other tools....
I just found in [https://spacy.io/usage/models#chinese](https://spacy.io/usage/models#chinese) that spaCy's Chinese model is a custom `pkuseg` model trained on OntoNotes 5.0. That sounds good, but I'll still go with `jieba` first and see if we...
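For completeness, a standalone `pkuseg` check (a minimal sketch using its default model; results may differ from the custom model spaCy ships) could look like:

```python
import pkuseg

# pkuseg with its default (mixed-domain) model.
seg = pkuseg.pkuseg()
print(seg.cut("今天天气很好"))  # list of segmented words
```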
For Asian languages (Chinese, Japanese...), word segmentation is an essential step in preprocessing. We usually remove non-textual characters from the corpus so that documents look like naturally written text, and then segment the...
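A minimal sketch of that order (the regex and helper names below are only illustrative, not a proposed API):

```python
import jieba
import pandas as pd


def remove_non_textual(s: pd.Series) -> pd.Series:
    """Keep CJK characters, CJK punctuation, basic Latin letters/digits and spaces."""
    pattern = r"[^\u4e00-\u9fff\u3000-\u303fA-Za-z0-9，。！？、；： ]"
    return s.str.replace(pattern, "", regex=True)


def segment(s: pd.Series) -> pd.Series:
    """Segment each cleaned document into a list of words with jieba."""
    return s.apply(jieba.lcut)


s = pd.Series(["今天天气很好😊###", "自然语言处理！！"])
print(segment(remove_non_textual(s)))
```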
I found a problem with using a global language setting. Some functions cannot be applied to Asian languages, e.g. `remove_diacritics`, `stem`. Also, `remove_punctuation` is integrated into `remove_stopwords` after `tokenization`. When...
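To illustrate the second point, here is a hedged sketch (not texthero's implementation; the stopword list is an arbitrary subset) of how punctuation tokens can be dropped in the same pass as stopwords once the text is tokenized:

```python
import string

import pandas as pd

chinese_stopwords = {"的", "了", "很"}                 # illustrative subset only
punctuation = set(string.punctuation) | {"，", "。", "！", "？"}
stop_tokens = chinese_stopwords | punctuation


def remove_stopwords(s: pd.Series) -> pd.Series:
    """Drop stopword and punctuation tokens from already-tokenized documents."""
    return s.apply(lambda tokens: [t for t in tokens if t not in stop_tokens])


s_tokenized = pd.Series([["今天", "天气", "很", "好", "。"]])
print(remove_stopwords(s_tokenized))  # [['今天', '天气', '好']]
```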
> Hey @AlfredWGA !
>
> Apologize, what do you mean by "integrated"? (Also, remove_punctuation is _integrated_ into remove_stopwords after tokenization)
>
> I agree.
>
> To probably understand...