Wang Zirui comments

Results 14 comments of


rolling stylometry

Quanteda is the best text analysis tool in the world. I really hope that Quanteda's workflow can support rolling stylometry analysis one day.

Incompatibility Issue with docnames Function in corpus and tokens dfm Objects

The tutorial is not clear: https://quanteda.io/reference/docnames.html

Incompatibility Issue with docnames Function in corpus and tokens dfm Objects

> This should still work fine. If you can supply a reproducible example using `reprex::reprex()` I can examine this further.

I found someone who ran into the same issue as me...

New style: 红楼梦学刊

https://mp.weixin.qq.com/s/wvzky3Gj5wC2ymQxyvHSqA?scene=0&subscene=90 The WeChat official account seems to serve as the official website.

1.0.0-beta.1-Internal ServerError

> We will release 1.0.0 later, please check whether this persists in the latest version.

The issue persists in the latest version.

trilingual alignment

I think this is a good topic. I recently tried to modify your repository with Windsurf, but the results were not good. Because there are currently many one-to-two bilingual corpora...

Many thanks to the author for the template, it works very well, but I don't know why my formulas fail to convert

enhancing export: exporting to OpenOffice seems to work fine, but Word does not...

PDF parsing is somewhat weak; hopefully it can borrow from Computer-aided translation (CAT) tools such as Trados, which can translate PDF files perfectly

For written translation work we often use Trados and 云译客, but both are closed source. The 云译客 at this link is also closed source: https://transpace.iol8.com/home

PDF parsing is somewhat weak; hopefully it can borrow from Computer-aided translation (CAT) tools such as Trados, which can translate PDF files perfectly

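In addition, as mentioned above, many written translation projects require CAT tools. Most of these tools are closed source and dated, but they do have advantages: consistent terminology, translation memory, and the ability to preserve PDF styling. If PDFMathTranslate's output is aimed only at casual reading, there is no need to follow them. But if it could go one step further and add the various features of CAT tools (letting users correct errors in the translated text themselves), it would move from "casual reading" to "publication-grade reading". That said, the layout problems seem to stem from a mismatch between the PDF's own text layer and the text layer that PDFMathTranslate recognizes. I don't know how good Chinese OCR is these days, though: can it draw accurate bounding boxes around the text in a PDF? (I'm not sure how to describe this.) If users had a chance to check whether the text layer recognized by PDFMathTranslate looks right before it is translated, a lot of wasted tokens could be avoided.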