
Frame positioning for iOS is messed up

Open hurnell opened this issue 1 year ago • 7 comments

What happened?

It turns out that the frame calculation of OCR for photos on iOS is broken in @react-native-ml-kit/text-recognition.

I have found that the frame values are misinterpreted and should be remapped as follows:

  • the reported top is actually the left
  • height and width are swapped
  • the correct left is the image width minus the reported top minus the reported height
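Under that remapping, a small workaround helper could look like the sketch below (`remapIOSFrame` and `imageWidth` are my names, not part of the library; `imageWidth` is assumed to be the original image's width in pixels):

```javascript
// Hypothetical helper: remap a frame reported by
// @react-native-ml-kit/text-recognition on iOS per the observation above.
// `frame` is the reported {top, left, width, height}.
function remapIOSFrame(frame, imageWidth) {
  return {
    top: frame.left,                             // reported left is really top
    left: imageWidth - frame.top - frame.height, // width - reported top - reported height
    width: frame.height,                         // width and height are swapped
    height: frame.width,
  };
}
```

For example, `remapIOSFrame({ top: 10, left: 20, width: 100, height: 30 }, 400)` yields `{ top: 20, left: 360, width: 30, height: 100 }`.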

Version

  • @react-native-ml-kit/barcode-scanning: version
  • @react-native-ml-kit/face-detection: version
  • @react-native-ml-kit/identify-languages: version
  • @react-native-ml-kit/image-labeling: version
  • @react-native-ml-kit/text-recognition: version
  • @react-native-ml-kit/translate-text: version

Which ML Kit packages do you use?

  • [ ] @react-native-ml-kit/barcode-scanning
  • [ ] @react-native-ml-kit/face-detection
  • [ ] @react-native-ml-kit/identify-languages
  • [ ] @react-native-ml-kit/image-labeling
  • [X] @react-native-ml-kit/text-recognition
  • [ ] @react-native-ml-kit/translate-text

What platforms are you seeing this issue on?

  • [ ] Android
  • [X] iOS

System Information

RN 0.73.6
System:
  OS: macOS 14.6.1
  CPU: (8) arm64 Apple M3
  Memory: 84.88 MB / 16.00 GB
  Shell:
    version: "5.9"
    path: /bin/zsh
Binaries:
  Node:
    version: 18.20.0
    path: ~/.nvm/versions/node/v18.20.0/bin/node
  Yarn:
    version: 1.22.22
    path: /opt/homebrew/bin/yarn
  npm:
    version: 10.5.0
    path: ~/.nvm/versions/node/v18.20.0/bin/npm
  Watchman:
    version: 2024.09.02.00
    path: /opt/homebrew/bin/watchman
Managers:
  CocoaPods:
    version: 1.15.2
    path: /opt/homebrew/bin/pod
SDKs:
  iOS SDK:
    Platforms:
      - DriverKit 24.0
      - iOS 18.0
      - macOS 15.0
      - tvOS 18.0
      - visionOS 2.0
      - watchOS 11.0
  Android SDK:
    API Levels:
      - "28"
      - "31"
      - "33"
      - "34"
    Build Tools:
      - 28.0.3
      - 30.0.3
      - 31.0.0
      - 33.0.1
      - 34.0.0
    System Images:
      - android-34 | Google APIs ARM 64 v8a
      - android-34 | Google Play ARM 64 v8a
    Android NDK: 25.1.8937393
IDEs:
  Android Studio: 2024.1 AI-241.15989.150.2411.11948838
  Xcode:
    version: 16.0/16A242d
    path: /usr/bin/xcodebuild
Languages:
  Java:
    version: 17.0.12
    path: /opt/homebrew/bin/javac
  Ruby:
    version: 3.3.5
    path: /opt/homebrew/opt/ruby/bin/ruby
npmPackages:
  "@react-native-community/cli": Not Found
  react:
    installed: 18.2.0
    wanted: 18.2.0
  react-native:
    installed: 0.73.6
    wanted: 0.73.6
  react-native-macos: Not Found
npmGlobalPackages:
  "react-native": Not Found
Android:
  hermesEnabled: true
  newArchEnabled: false
iOS:
  hermesEnabled: true
  newArchEnabled: false

Steps to Reproduce

const {blocks} = await TextRecognition.recognize(`file://${photo.path}`);

Expand the blocks into lines and optionally filter them.

Use react-native-fs to copy the original image somewhere on the phone and to save the JSON-encoded blocks to a JSON file on the phone.

Transfer both files to a PC.

Use a small HTML/JS/CSS page to position the frames on top of the original image:

badDiv.style.cssText = `top: ${value.frame.top}px; left: ${value.frame.left}px; height: ${value.frame.height}px; width: ${value.frame.width}px`;
                    
goodDiv.style.cssText = `top: ${value.frame.left}px; right: ${value.frame.top}px; height: ${value.frame.width}px; width: ${value.frame.height}px`;
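The comparison page boils down to a loop like the following sketch (`overlayDivs` is a hypothetical name; it assumes the saved JSON is an array of blocks each carrying a `frame`, and that the `.bad`/`.good` CSS classes set `position: absolute` plus red/green borders):

```javascript
// Sketch of the comparison overlay: for each recognized block, emit one div
// with the frame as reported (red, .bad) and one with the properties
// switched around (green, .good), as HTML strings to drop over the image.
function overlayDivs(blocks) {
  return blocks
    .map(({ frame }) => {
      const bad = `<div class="bad" style="top: ${frame.top}px; left: ${frame.left}px; height: ${frame.height}px; width: ${frame.width}px"></div>`;
      const good = `<div class="good" style="top: ${frame.left}px; right: ${frame.top}px; height: ${frame.width}px; width: ${frame.height}px"></div>`;
      return bad + good;
    })
    .join("\n");
}
```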

[Screenshot 2024-11-05 at 08:43:01]

The red bordered blocks (badDiv) are messed up.

The green bordered blocks (goodDiv) with the properties switched around are in the correct positions.

hurnell avatar Nov 05 '24 07:11 hurnell

@hurnell The values returned are the same as the ones provided by Google's ML Kit Text Recognition iOS framework, so there's not much to change on my side. My assumption is that it's caused by the image's orientation (the image is in landscape, and it looks like ML Kit analyzed it in portrait mode). The orientation is detected automatically using UIImage.imageOrientation.

a7medev avatar Nov 28 '24 14:11 a7medev

hi @hurnell @a7medev have you guys fixed this issue?

abdymm avatar May 15 '25 11:05 abdymm

There is a problem with HEIC photos taken on an iPhone in portrait orientation: they contain EXIF information about the display orientation, but the module treats them as landscape photos and applies text recognition as if the photo were rotated 90° CCW.

This is still happening on the latest version, unfortunately, but I just fixed it with a patch:

patches/@react-native-ml-kit+text-recognition+1.5.2.patch

diff --git a/node_modules/@react-native-ml-kit/text-recognition/ios/TextRecognition.m b/node_modules/@react-native-ml-kit/text-recognition/ios/TextRecognition.m
index 87fd4ff..f237b71 100644
--- a/node_modules/@react-native-ml-kit/text-recognition/ios/TextRecognition.m
+++ b/node_modules/@react-native-ml-kit/text-recognition/ios/TextRecognition.m
@@ -79,6 +79,15 @@ - (NSDictionary*)blockToDict: (MLKTextBlock*)block {
     return dict;
 }
 
+- (UIImage*)fixImageOrientation: (UIImage*)image
+  {
+    UIGraphicsBeginImageContext(image.size);
+    [image drawAtPoint:CGPointZero];
+    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();
+    UIGraphicsEndImageContext();
+    return newImage ?: image;
+}
+
 RCT_EXPORT_METHOD(recognize: (nonnull NSString*)url
                   script:(NSString*)script
                   resolver:(RCTPromiseResolveBlock)resolve
@@ -86,7 +95,9 @@ - (NSDictionary*)blockToDict: (MLKTextBlock*)block {
 {
     NSURL *_url = [NSURL URLWithString:url];
     NSData *imageData = [NSData dataWithContentsOfURL:_url];
-    UIImage *image = [UIImage imageWithData:imageData];
+    UIImage *imageraw = [UIImage imageWithData:imageData];
+    UIImage *image = [self fixImageOrientation:imageraw];
+
     MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
     visionImage.orientation = image.imageOrientation;
  

Apply it with patch-package and create a new build.

pugson avatar Jun 02 '25 03:06 pugson

hi @pugson thanks for responding to this, may I see what the output looks like on your end?

I also fixed this by recalculating the frame, with separate logic for Android and iOS. Here's how it looks:

[Image]

It's not as precise as the Android one, but it's not that bad. May I see how it looks on your end? Just want to see if it's worth patching or not, thank you!

abdymm avatar Jun 02 '25 04:06 abdymm

@abdymm the patch is as precise as on any other photo.

The yellow areas are what was detected. Here's an example:

[Image]

pugson avatar Jun 02 '25 04:06 pugson

woah that's actually a lot better! thanks @pugson 🫶🏻

abdymm avatar Jun 02 '25 04:06 abdymm

Hello @hurnell and @pugson. Sorry for taking so long to respond here.

I see that the fix basically creates a new in-memory image from the existing UIImage to work around the orientation issue. We probably need a more efficient solution that just determines the correct orientation (or even just provides the option to set the orientation in the API). I will look into it more when I have some time.

a7medev avatar Sep 01 '25 22:09 a7medev