
How to add support for CJK Symbols and Punctuation

Open: leotu opened this issue 2 years ago • 1 comment

These characters are used often: "「」,。、"

  • Refer to the link below: https://en.wikipedia.org/wiki/CJK_Symbols_and_Punctuation
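To make the character set concrete, here is a minimal sketch that lists the Unicode block named above (CJK Symbols and Punctuation, U+3000 to U+303F) using only the Python standard library. The helper names `cjk_symbols_and_punctuation` and `in_block` are made up for illustration. Note that a fullwidth comma (U+FF0C), if that is the comma meant, belongs to the Halfwidth and Fullwidth Forms block rather than this one.

```python
import unicodedata

# The CJK Symbols and Punctuation block spans U+3000..U+303F.
CJK_PUNCT_START = 0x3000
CJK_PUNCT_END = 0x303F


def in_block(ch):
    """True if ch lies inside the CJK Symbols and Punctuation block."""
    return CJK_PUNCT_START <= ord(ch) <= CJK_PUNCT_END


def cjk_symbols_and_punctuation():
    """Return the assigned characters of the block in code point order."""
    chars = []
    for cp in range(CJK_PUNCT_START, CJK_PUNCT_END + 1):
        ch = chr(cp)
        # Category "Cn" marks code points that are not assigned.
        if unicodedata.category(ch) != "Cn":
            chars.append(ch)
    return chars


if __name__ == "__main__":
    # Print the code point and Unicode name of the characters from this issue.
    for ch in "「」。、":
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} in_block={in_block(ch)}")
```

The list returned by `cjk_symbols_and_punctuation()` could serve as the candidate set of characters to add to a recognition character list.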

Example: [screenshot 2023-12-04 10 39 31]

[screenshot 2023-12-04_09 25 15]

  • Here is the Chinese Wikipedia version:

https://zh.wikipedia.org/wiki/Unicode#%E4%B8%AD%E6%96%87%E8%BC%B8%E5%85%A5%E6%B3%95

  • Custom recognition models

But I have no idea how to retrain my model easily after reading the links below:

https://github.com/JaidedAI/EasyOCR/blob/master/custom_model.md
https://github.com/JaidedAI/EasyOCR/tree/master/trainer

Is it possible to download the source code, append some characters to these files, and run some scripts to produce an improved model?

https://github.com/JaidedAI/EasyOCR/blob/master/easyocr/character/ch_sim_char.txt
https://github.com/JaidedAI/EasyOCR/blob/master/easyocr/character/ch_tra_char.txt
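The "append some characters" step could look roughly like the sketch below. It assumes the character files hold one character per line (possibly with a UTF-8 BOM), and `extend_char_file` is a hypothetical helper, not part of EasyOCR. It writes to a separate output file so the original list stays untouched.

```python
from pathlib import Path


def extend_char_file(src, dst, extra_chars):
    """Copy the character list in src to dst, appending any characters
    from extra_chars that are not already listed. Returns the added ones.

    Assumption: src contains one character per line.
    """
    # "utf-8-sig" transparently drops a leading BOM if the file has one.
    existing = Path(src).read_text(encoding="utf-8-sig").splitlines()
    seen = set(existing)
    added = []
    for ch in extra_chars:
        if ch not in seen:
            seen.add(ch)
            added.append(ch)
    Path(dst).write_text("\n".join(existing + added) + "\n", encoding="utf-8")
    return added


if __name__ == "__main__":
    # The file names here are placeholders for a local copy of the list.
    new_chars = extend_char_file("ch_tra_char.txt", "ch_tra_char_ext.txt", "「」。、")
    print("added:", new_chars)
```

As far as I understand, editing the list alone would not be enough: the pretrained recognition weights were fitted to the original character set, so the extended list would most likely have to be used together with retraining through the trainer linked above.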

leotu, Dec 04 '23 02:12

OCR sample result (the characters "「」,。、" all disappear):

[screenshot 2023-12-04 10 50 54]
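One way to quantify the missing punctuation is to compare the text that should have been read with the text the OCR returned. The sketch below is only an illustration: `dropped_chars` is a hypothetical helper, and the sample strings are invented rather than taken from the screenshot.

```python
from collections import Counter

# The characters reported as disappearing in this issue.
WATCHED = "「」,。、"


def dropped_chars(expected, recognized, watch=WATCHED):
    """Count how many watched characters of expected are absent from recognized."""
    exp = Counter(ch for ch in expected if ch in watch)
    rec = Counter(ch for ch in recognized if ch in watch)
    # Counter subtraction keeps only the positive differences.
    return dict(exp - rec)


if __name__ == "__main__":
    expected = "「你好」,世界。"
    recognized = "你好世界"  # punctuation lost, as in the sample result
    print(dropped_chars(expected, recognized))
```

The recognized string could come from joining the list returned by `reader.readtext(image_path, detail=0)` on an `easyocr.Reader`, with `image_path` standing in for the test image.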

leotu, Dec 04 '23 02:12