CarolLi
Has the garbled-text problem when reading a GB2312-encoded CSV file been resolved?
I have a large dataset that I want to parse with UCCA, but one kind of punctuation that is common in this dataset is recognized as a word...
> For plain text, TUPA uses spaCy for tokenization and punctuation identification. This is the relevant line of code: https://github.com/danielhers/ucca/blob/master/ucca/convert.py#L769 > Now, spaCy (at least with the `en_core_web_md` model) seems...
> I'm not passing it the model correctly. It expects the exact path to it, not just the folder that holds it. > Sometimes maybe miss some files or type a...
> ULTIMATE It works now, thanks!