Frame positioning on iOS is incorrect
What happened?
It turns out the OCR frame calculation for photos on iOS is wrong in @react-native-ml-kit/text-recognition.
I have found that the frame values are misinterpreted and should be remapped as follows:
- top: the reported left value is actually the top.
- height and width: the two values are switched.
- left: should be screen width - reported top - reported height.
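A minimal sketch of that remapping in JavaScript; the helper name `fixIOSFrame` and the `imageWidth` parameter are illustrative, and it assumes the reported frame is relative to the analyzed image's width:

```js
// Hypothetical helper showing the remapping described above.
// `imageWidth` is the width of the analyzed image; this is an assumption
// about what "screen width" refers to in the description.
function fixIOSFrame(frame, imageWidth) {
  return {
    top: frame.left,                             // reported left is really the top
    left: imageWidth - frame.top - frame.height, // reported top measures from the right edge
    width: frame.height,                         // width and height are swapped
    height: frame.width,
  };
}
```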
Version
@react-native-ml-kit/barcode-scanning: version
@react-native-ml-kit/face-detection: version
@react-native-ml-kit/identify-languages: version
@react-native-ml-kit/image-labeling: version
@react-native-ml-kit/text-recognition: version
@react-native-ml-kit/translate-text: version
Which ML Kit packages do you use?
- [ ] @react-native-ml-kit/barcode-scanning
- [ ] @react-native-ml-kit/face-detection
- [ ] @react-native-ml-kit/identify-languages
- [ ] @react-native-ml-kit/image-labeling
- [X] @react-native-ml-kit/text-recognition
- [ ] @react-native-ml-kit/translate-text
What platforms are you seeing this issue on?
- [ ] Android
- [X] iOS
System Information
RN 0.73.6

```
System:
  OS: macOS 14.6.1
  CPU: (8) arm64 Apple M3
  Memory: 84.88 MB / 16.00 GB
  Shell:
    version: "5.9"
    path: /bin/zsh
Binaries:
  Node:
    version: 18.20.0
    path: ~/.nvm/versions/node/v18.20.0/bin/node
  Yarn:
    version: 1.22.22
    path: /opt/homebrew/bin/yarn
  npm:
    version: 10.5.0
    path: ~/.nvm/versions/node/v18.20.0/bin/npm
  Watchman:
    version: 2024.09.02.00
    path: /opt/homebrew/bin/watchman
Managers:
  CocoaPods:
    version: 1.15.2
    path: /opt/homebrew/bin/pod
SDKs:
  iOS SDK:
    Platforms:
      - DriverKit 24.0
      - iOS 18.0
      - macOS 15.0
      - tvOS 18.0
      - visionOS 2.0
      - watchOS 11.0
  Android SDK:
    API Levels:
      - "28"
      - "31"
      - "33"
      - "34"
    Build Tools:
      - 28.0.3
      - 30.0.3
      - 31.0.0
      - 33.0.1
      - 34.0.0
    System Images:
      - android-34 | Google APIs ARM 64 v8a
      - android-34 | Google Play ARM 64 v8a
    Android NDK: 25.1.8937393
IDEs:
  Android Studio: 2024.1 AI-241.15989.150.2411.11948838
  Xcode:
    version: 16.0/16A242d
    path: /usr/bin/xcodebuild
Languages:
  Java:
    version: 17.0.12
    path: /opt/homebrew/bin/javac
  Ruby:
    version: 3.3.5
    path: /opt/homebrew/opt/ruby/bin/ruby
npmPackages:
  "@react-native-community/cli": Not Found
  react:
    installed: 18.2.0
    wanted: 18.2.0
  react-native:
    installed: 0.73.6
    wanted: 0.73.6
  react-native-macos: Not Found
npmGlobalPackages:
  "react-native": Not Found
Android:
  hermesEnabled: true
  newArchEnabled: false
iOS:
  hermesEnabled: true
  newArchEnabled: false
```
Steps to Reproduce
```js
import TextRecognition from '@react-native-ml-kit/text-recognition';

const {blocks} = await TextRecognition.recognize(`file://${photo.path}`);
```
Expand the lines and filter the blocks somehow (or don't).
use "react-native-fs"
to copy original image to somewhere on phone and save json encoded blocks to json file on phone.
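A sketch of that step; `copyFile` and `writeFile` are real react-native-fs APIs, while the destination file names are arbitrary:

```js
import RNFS from 'react-native-fs';

// Copy the photo and dump the recognized blocks next to it,
// inside the same async flow as the recognize() call above.
const dir = RNFS.DocumentDirectoryPath;
await RNFS.copyFile(photo.path, `${dir}/original.jpg`);
await RNFS.writeFile(`${dir}/blocks.json`, JSON.stringify(blocks), 'utf8');
```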
Transfer the files to a PC.
Use a small HTML/JS/CSS file to position the frames on top of the original image:
```js
badDiv.style.cssText = `top: ${value.frame.top}px; left: ${value.frame.left}px; height: ${value.frame.height}px; width: ${value.frame.width}px`;
goodDiv.style.cssText = `top: ${value.frame.left}px; right: ${value.frame.top}px; height: ${value.frame.width}px; width: ${value.frame.height}px`;
```
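For context, a minimal sketch of the rest of such a comparison page; everything here is illustrative (the container id, the border colors, and the assumption that the original image sits at natural size inside a `position: relative` container):

```js
// Hypothetical overlay script for the comparison page.
// Assumes: <div id="container" style="position: relative"><img src="original.jpg"></div>
const container = document.getElementById('container');
const blocks = []; // paste the contents of blocks.json here
for (const value of blocks) {
  const badDiv = document.createElement('div'); // raw frame as returned on iOS
  badDiv.style.cssText = `position: absolute; border: 2px solid red; top: ${value.frame.top}px; left: ${value.frame.left}px; height: ${value.frame.height}px; width: ${value.frame.width}px`;
  const goodDiv = document.createElement('div'); // frame with the properties switched around
  goodDiv.style.cssText = `position: absolute; border: 2px solid green; top: ${value.frame.left}px; right: ${value.frame.top}px; height: ${value.frame.width}px; width: ${value.frame.height}px`;
  container.append(badDiv, goodDiv);
}
```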
The red-bordered blocks (badDiv) end up in the wrong positions.
The green-bordered blocks (goodDiv), with the properties switched around, end up in the correct positions.
@hurnell The values returned are the same as the ones provided by Google's ML Kit Text Recognition iOS framework, so there's not much to change on my side.
My assumption is that it's caused by the image's orientation (the image is in landscape, and it looks like ML Kit analyzed it in portrait mode).
The orientation is detected automatically using UIImage.imageOrientation.
Hi @hurnell @a7medev, have you fixed this issue?
There is a problem with HEIC photos taken on an iPhone in portrait orientation. They contain EXIF info about the display orientation, but the module treats them as landscape photos and applies text recognition as if the photo were rotated 90° CCW.
Still happening on the latest version unfortunately, but I just fixed it with a patch:
patches/@react-native-ml-kit+text-recognition+1.5.2.patch
```diff
diff --git a/node_modules/@react-native-ml-kit/text-recognition/ios/TextRecognition.m b/node_modules/@react-native-ml-kit/text-recognition/ios/TextRecognition.m
index 87fd4ff..f237b71 100644
--- a/node_modules/@react-native-ml-kit/text-recognition/ios/TextRecognition.m
+++ b/node_modules/@react-native-ml-kit/text-recognition/ios/TextRecognition.m
@@ -79,6 +79,15 @@ - (NSDictionary*)blockToDict: (MLKTextBlock*)block {
     return dict;
 }
 
+- (UIImage*)fixImageOrientation: (UIImage*)image
+{
+    UIGraphicsBeginImageContext(image.size);
+    [image drawAtPoint:CGPointZero];
+    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();
+    UIGraphicsEndImageContext();
+    return newImage ?: image;
+}
+
 RCT_EXPORT_METHOD(recognize: (nonnull NSString*)url
                   script:(NSString*)script
                   resolver:(RCTPromiseResolveBlock)resolve
@@ -86,7 +95,9 @@ - (NSDictionary*)blockToDict: (MLKTextBlock*)block {
 {
     NSURL *_url = [NSURL URLWithString:url];
     NSData *imageData = [NSData dataWithContentsOfURL:_url];
-    UIImage *image = [UIImage imageWithData:imageData];
+    UIImage *imageraw = [UIImage imageWithData:imageData];
+    UIImage *image = [self fixImageOrientation:imageraw];
+
     MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
     visionImage.orientation = image.imageOrientation;
```
Use it with patch-package (e.g. `npx patch-package @react-native-ml-kit/text-recognition` after editing the file in node_modules) and create a new build.
Hi @pugson, thanks for responding to this. May I see what the output looks like on your end?
I also fixed this by recalculating the frame, with separate handling for Android and iOS. It looks something like this:
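A sketch of that kind of recalculation, not the exact code; it assumes the remapping from the top of the thread, and `normalizeFrame`/`imageWidth` are illustrative names:

```js
import {Platform} from 'react-native';

// Hypothetical frame normalization: Android frames are assumed correct,
// iOS frames get the top/left/width/height remapping described earlier.
function normalizeFrame(frame, imageWidth) {
  if (Platform.OS !== 'ios') {
    return frame;
  }
  return {
    top: frame.left,
    left: imageWidth - frame.top - frame.height,
    width: frame.height,
    height: frame.width,
  };
}
```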
It's not really as precise as the Android one, but it's not that bad. May I see how it looks on your end? I just want to see whether it's worth patching or not. Thank you!
@abdymm The patch is as precise as on any other photo. Here's an example (the yellow areas are what gets detected):
Woah, that's actually a lot better! Thanks @pugson 🫶🏻
Hello @hurnell and @pugson. Sorry for taking so long to respond here.
I see that the fix basically creates a new in-memory image from the existing UIImage to overcome the orientation issue. We probably need a more efficient solution that just finds the correct orientation (or even just provides an option to set the orientation in the API). I will look into it more when I have some time.
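For illustration only, such an option might look like this from the JavaScript side; this is a hypothetical API shape, and the `orientation` option does not exist in the library today:

```js
// Hypothetical: let the caller override the auto-detected orientation
// instead of re-drawing the image on the native side.
const {blocks} = await TextRecognition.recognize(`file://${photo.path}`, {
  orientation: 'up', // assumed option name and value
});
```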