Disordered lines in text extracted by the ML Kit text extraction API (OnDeviceTextRecognizer)
Step 1: Describe your environment
- Android device: Nokia 2.2
- Android OS version: Android 9.0
- Google Play Services version: _____
- Firebase/Play Services SDK version: _____
Step 2: Describe the problem:
When an image contains pieces of text on the same line with large spaces between them, those pieces are separated into different blocks. This in turn scrambles the order of lines in the extracted text, and the order of lines is very important in our case.
Steps to reproduce:
1. Provide an image with words on the same line separated by large spaces.
2. Check the output text; the order of lines is heavily scrambled.
Observed Results: This is the raw text from the text extraction API.
Sample Input

Output
ORIGINAL L For Recioient BILL OF SUPPLY ABD CO 20/0/208 BOS0057 M.G.RAOD, Delhi, Delhi 110099 GSTIN 0TA State PAN 434 B Date 07-Delhi AAECC8220 No Reference Nio PO.78708 Customer Name ACC &CO Customar GSTIN 27A Place of Supsly Billing Addres ACC &CO Maharashtr Shipping Addres ACC&CO Maharashtra 120 27-Maharashtra Due Date 24/01/2038 Discount ( Rate / tem ( Total () tem HSN/SAC Quantity 1,809.00 KGS 1.Slag for manufacturing iron 8,678.67 171,30,30,762.64 Total () 1,30,30,762.64 1,30,30,763.00 0.36 One Crore Thirty Lakh Thirty Thousand Seven Hundred Sity Three Rupees Only Total Value Rounding off Total amount in words) For ABD Cco Authorised Signatory
Expected Results:
The expected result is line-by-line extraction. Currently the extraction is block by block. Is there a way to achieve line-by-line output?
Relevant Code:
Dependency used: 'com.google.firebase:firebase-ml-vision:20.0.0' (OnDeviceTextRecognizer)
This issue does not seem to follow the issue template. Make sure you provide all the required information.
Hi, please provide resources for resolving this; it is high priority.
@phani-artiovatic you will need to use the information returned in the TextBlock object to align these yourself:
https://firebase.google.com/docs/reference/android/com/google/firebase/ml/vision/text/FirebaseVisionText.TextBlock
I will consider this a feature request for this sample repo to show how to do this.
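For anyone landing here, one possible way to do that alignment is sketched below. It is only a rough illustration, not the sample's or the SDK's method: `LineOrdering`, `Box` and `toRows` are hypothetical names, and `Box` is a plain stand-in for the text and bounding box you would read from each recognized line (for example via `getText()` and `getBoundingBox()` on `FirebaseVisionText.Line`). The assumption is that two pieces of text belong to the same visual row when the vertical center of one falls inside the vertical span of the other.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LineOrdering {
    /** Hypothetical stand-in for one recognized line: text plus bounding-box edges in pixels. */
    static final class Box {
        final String text;
        final int left, top, right, bottom;

        Box(String text, int left, int top, int right, int bottom) {
            this.text = text;
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        double centerY() {
            return (top + bottom) / 2.0;
        }
    }

    /** Groups boxes into visual rows (top to bottom) and joins each row left to right. */
    static List<String> toRows(List<Box> boxes) {
        List<Box> sorted = new ArrayList<>(boxes);
        sorted.sort(Comparator.comparingDouble(Box::centerY));

        List<List<Box>> rows = new ArrayList<>();
        for (Box b : sorted) {
            List<Box> current = rows.isEmpty() ? null : rows.get(rows.size() - 1);
            // Assumption: same row if this box's vertical center lies within
            // the vertical span of the first box placed in the current row.
            if (current != null
                    && b.centerY() >= current.get(0).top
                    && b.centerY() <= current.get(0).bottom) {
                current.add(b);
            } else {
                List<Box> row = new ArrayList<>();
                row.add(b);
                rows.add(row);
            }
        }

        List<String> out = new ArrayList<>();
        for (List<Box> row : rows) {
            row.sort(Comparator.comparingInt((Box x) -> x.left));
            StringBuilder sb = new StringBuilder();
            for (Box b : row) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(b.text);
            }
            out.add(sb.toString());
        }
        return out;
    }
}
```

On Android you would flatten all blocks into their lines first, copy each line's text and rectangle edges into a `Box`, and then call `toRows`. Skewed or rotated scans would likely need a more tolerant row test than this one.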
@gkaldev do you think you could change the sample code to show how to extract text in a sensible order?
@samtstern @gkaldev thanks for the response. I have a dependency on this issue/feature, so it would help in a lot of ways. Requesting help to close this ASAP.
Thanks, I found a way to order lines using their bounding box info.
@phani-artiovatic glad you found a solution! Re-opening this as I think it's a good feature request for the sample.
Hi @samtstern, any updates here? The solution I found was temporary and only works on small-sized text.
Any progress here, @samtstern?
@phani-artiovatic can you clarify why the solution you implemented is dependent on text size?
I have a couple of use cases, and I have used line and word bounding box coordinates to arrange the text accordingly. The logic is specific to those use cases; I haven't generalised the ordering logic. I just wanted to check if there's better logic to generalise the ordering.
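One way such logic might be made less dependent on text size is to express every threshold relative to the height of the text rather than in fixed pixels. The sketch below is an illustration only, under that assumption: `RowJoiner`, `Word`, `joinRow` and `gapFactor` are hypothetical names, and `Word` stands in for the text and bounding box of a recognized word. It joins the words of one row left to right and inserts a tab wherever the horizontal gap is large compared to the word height, so widely spaced columns stay distinguishable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class RowJoiner {
    /** Hypothetical stand-in for one recognized word: text plus bounding-box edges in pixels. */
    static final class Word {
        final String text;
        final int left, top, right, bottom;

        Word(String text, int left, int top, int right, int bottom) {
            this.text = text;
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        int height() {
            // Guard against degenerate boxes so the threshold is never zero.
            return Math.max(1, bottom - top);
        }
    }

    /**
     * Joins the words of a single row left to right. A gap wider than
     * gapFactor times the previous word's height becomes a tab (column
     * break); any smaller gap becomes a single space.
     */
    static String joinRow(List<Word> words, double gapFactor) {
        List<Word> sorted = new ArrayList<>(words);
        sorted.sort(Comparator.comparingInt((Word w) -> w.left));

        StringBuilder sb = new StringBuilder();
        Word prev = null;
        for (Word w : sorted) {
            if (prev != null) {
                int gap = w.left - prev.right;
                sb.append(gap > gapFactor * prev.height() ? '\t' : ' ');
            }
            sb.append(w.text);
            prev = w;
        }
        return sb.toString();
    }
}
```

Because the threshold scales with the word height, the same `gapFactor` should behave similarly on small and large text. The right value is a guess that would need tuning per document type.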
Please update this issue. I also have some use cases that depend on it.
Any update on this for iOS?
I'm afraid the iOS version won't adopt such a change in the future. Closing this as obsolete, as it has been over 2 years.
Did anyone find a solution to this problem?
This should be re-opened. The tool is useless for extracting adjacent data from static fields when the text it returns is totally out of order. There are actually those of us doing professional work, not playing with toys in their parents' basement in California.