Not working
How can I improve the accuracy of the OCR? Not a single word is captured correctly from the camera.
There is a helpful section in the wiki of the tesseract repo on improving the quality of a captured image before passing it to Tesseract.
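For example, upscaling the camera frame and converting it to grayscale before handing it to Tesseract often already helps. A rough sketch of what I mean (the helper name and the scale factor are just made up for illustration):

import android.graphics.Bitmap;
import android.graphics.Canvas;
import android.graphics.ColorMatrix;
import android.graphics.ColorMatrixColorFilter;
import android.graphics.Paint;

public final class OcrPreprocessing {

    // Hypothetical helper: upscale the frame and drop the colour information.
    // Tesseract tends to do better on large, high-contrast, grayscale input.
    public static Bitmap preprocessForOcr(Bitmap source) {
        // Upscale small camera frames (factor of 2 is an arbitrary choice).
        Bitmap scaled = Bitmap.createScaledBitmap(
                source, source.getWidth() * 2, source.getHeight() * 2, true);

        // Convert to grayscale by drawing with a desaturating colour filter.
        Bitmap gray = Bitmap.createBitmap(
                scaled.getWidth(), scaled.getHeight(), Bitmap.Config.ARGB_8888);
        Canvas canvas = new Canvas(gray);
        ColorMatrix matrix = new ColorMatrix();
        matrix.setSaturation(0f);
        Paint paint = new Paint();
        paint.setColorFilter(new ColorMatrixColorFilter(matrix));
        canvas.drawBitmap(scaled, 0, 0, paint);
        return gray;
    }
}

You would then pass the result to setImage() instead of the raw camera frame.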
You may also want to consider using a whitelist or blacklist to improve results (see the sketch below the next code snippet). You can also play around with the Page Segmentation Mode; I haven't really figured out everything it does, but you can set it to things like "single line" or "single character" mode so that Tesseract knows what kind of layout it is looking for.
You can simply change the Page Segmentation Mode on an instance of the TessBaseAPI class:
TessBaseAPI tessBaseApi = new TessBaseAPI();
// PSM_AUTO is the default; PSM_SINGLE_LINE or PSM_SINGLE_CHAR can work better for constrained layouts
tessBaseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO);
The PageSegMode inner class contains constants for all the supported modes.
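As for the whitelist/blacklist mentioned above, it is set through setVariable on the same instance. A minimal sketch; the character sets here are just arbitrary examples:

// Only report characters from this set (here: digits only).
tessBaseApi.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "0123456789");
// Or the other way round: never report these characters.
tessBaseApi.setVariable(TessBaseAPI.VAR_CHAR_BLACKLIST, "!?@#$%&");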
As a last note: make sure the image you have captured is of good quality, try not to shake the camera, and so on. I hope this helps; I have only just started getting into OCR on Android and am also still trying to get better results myself.
EDIT: I forgot to mention that there are also different types of trained data:
- https://github.com/tesseract-ocr/tessdata_best
- https://github.com/tesseract-ocr/tessdata_fast
- https://github.com/tesseract-ocr/tessdata
tessdata_best gives the most accurate results. tessdata_fast is faster at runtime but trades accuracy for it. tessdata is kept backwards compatible, so you can use it with older versions of Tesseract.
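If you want to try one of these, the .traineddata file has to end up in a tessdata folder on the device before init() is called. A rough sketch, assuming you bundle eng.traineddata in your app's assets (the file names, folder names and language are just examples):

import android.content.Context;
import com.googlecode.tesseract.android.TessBaseAPI;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

public final class TessSetup {

    // Copies assets/tessdata/eng.traineddata to <filesDir>/tessdata/ on first
    // run and initialises the API. Paths are assumptions for this sketch.
    public static TessBaseAPI createApi(Context context) throws Exception {
        File tessdataDir = new File(context.getFilesDir(), "tessdata");
        if (!tessdataDir.exists()) {
            tessdataDir.mkdirs();
        }
        File trainedData = new File(tessdataDir, "eng.traineddata");
        if (!trainedData.exists()) {
            try (InputStream in = context.getAssets().open("tessdata/eng.traineddata");
                 OutputStream out = new FileOutputStream(trainedData)) {
                byte[] buffer = new byte[8192];
                int length;
                while ((length = in.read(buffer)) > 0) {
                    out.write(buffer, 0, length);
                }
            }
        }

        TessBaseAPI api = new TessBaseAPI();
        // init() expects the parent directory of the "tessdata" folder;
        // some versions want a trailing slash on the path.
        api.init(context.getFilesDir().getAbsolutePath() + "/", "eng");
        return api;
    }
}

One caveat: if I understand correctly, tessdata_best and tessdata_fast contain LSTM models for Tesseract 4, so with tess-two (Tesseract 3.05, see below) you need the files from the regular tessdata repo.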
Hi @Dennis1995, I wanted to ask which version of Tesseract this project uses. If you know, that's great; if not, please point me to where I can find out, if you have any idea.
This project uses the tess-two library, which is based on Tesseract 3.05. It doesn't use the newest version, which is Tesseract 4. I couldn't find an Android implementation that uses Tesseract 4; you could try to create one yourself using JNI and the Android NDK. Does this answer your question?