google-cloud-python document_text_detection API simply returns "Bad image data" for local image file

Environment details

OS type and version: macOS
Python version: 3.10.6
pip version: 23.0.1
google-cloud-vision version: 3.4.0

Steps to reproduce

Run the script below.

Code example

import cv2 as cv
from google.cloud import vision

# Placeholder: point this at the attached file.
path = "/path/to/attached/file.jpg"

client = vision.ImageAnnotatorClient()
# Re-encode the image as JPEG in memory and send the raw bytes.
_, buffer = cv.imencode('.jpg', cv.imread(str(path)))
image = vision.Image(content=buffer.tobytes())
response = client.document_text_detection(image=image)

# response.error reports code 3: Bad image data

When trying to OCR the attached image, I get "Bad image data" as the result. I thought it was due to the file size, but I have already received successful responses for bigger files. I have also researched the issue quite a bit but didn't find anything useful in other issues or forums. Since there is no stack trace or any other detail, I'm stumped and don't know how best to proceed. Any ideas?

Attached image: TRSTimes_Volume_5_No _1_1992-01_TRSTimes_Publications_US_0001
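Since no exception is raised, the only signal is the `error` field on the response (a status with `code` and `message`). Below is a minimal sketch of turning that field into an exception so the failure is not silently ignored. `VisionAnnotationError` and `raise_on_error` are hypothetical names introduced here for illustration, not part of google-cloud-vision, and the `SimpleNamespace` object only stands in for a real response so the snippet runs without credentials.

```python
from types import SimpleNamespace


class VisionAnnotationError(RuntimeError):
    """Hypothetical exception type; not provided by google-cloud-vision."""


def raise_on_error(response):
    """Raise if the response carries a non-empty error message, else return it."""
    error = getattr(response, "error", None)
    if error is not None and error.message:
        raise VisionAnnotationError(
            f"Vision API error {error.code}: {error.message}"
        )
    return response


# Stand-in for the response described above (assumption: code 3 plus message).
fake_response = SimpleNamespace(
    error=SimpleNamespace(code=3, message="Bad image data")
)

try:
    raise_on_error(fake_response)
except VisionAnnotationError as exc:
    caught = str(exc)
    print(caught)
```

With the real client, the same helper would wrap the call, for example `raise_on_error(client.document_text_detection(image=image))`.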

Mar 13 '23 17:03 christian-steinmeyer

I have now found out that the image's resolution was too high. After reducing the file size to below 4 MB, I could upload it to the Vision "try it now in your browser" website, where I got a more useful error message saying that the maximum image resolution is 75 megapixels. The documentation actually says:

Image size should not exceed 75M pixels (length x width) for OCR analysis. If the image size exceeds 75M pixels (length x width) , the Vision API resizes the image; otherwise, the Vision API uses the original image.

I totally misunderstood this sentence. I guess what it means is that OCR will not work with images > 75M pixels, but other features (like object detection) might work on a resized image. Perhaps you could rephrase that documentation?
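For anyone hitting the same limit, here is a rough sketch of a local pre-check, assuming the 75M-pixel figure quoted above is the effective threshold for OCR. The helper names (`exceeds_ocr_limit`, `fitted_size`) are made up for this example. It only does the arithmetic: it checks the pixel count and computes a target size that keeps the aspect ratio while staying under the limit.

```python
import math

# Limit quoted from the documentation: 75M pixels (length x width).
OCR_MAX_PIXELS = 75_000_000


def exceeds_ocr_limit(width, height, max_pixels=OCR_MAX_PIXELS):
    """Return True if width x height is above the pixel limit."""
    return width * height > max_pixels


def fitted_size(width, height, max_pixels=OCR_MAX_PIXELS):
    """Return (width, height) scaled down uniformly to fit within max_pixels.

    Images already within the limit are returned unchanged.
    """
    if not exceeds_ocr_limit(width, height, max_pixels):
        return width, height
    # Scaling both sides by f scales the area by f**2.
    factor = math.sqrt(max_pixels / (width * height))
    # Round down so the resulting area cannot exceed the limit.
    return max(1, math.floor(width * factor)), max(1, math.floor(height * factor))


print(exceeds_ocr_limit(10_000, 10_000))
print(fitted_size(15_000, 20_000))
```

With OpenCV, the dimensions would come from `img.shape[:2]` (height first, then width), and the result could be passed to `cv.resize` before encoding. Whether the downscaled scan is still legible enough for OCR depends on the image.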

Mar 14 '23 08:03 christian-steinmeyer

I also found it confusing that the Python client response only reports "Bad image data", while the website itself gives more details.

Additionally, here's an example formulation for the documentation bit:

"Image size must not exceed 75M pixels (length x width) for OCR analysis. Larger images will result in an error. For other features of the Vision API, images that exceed this limit will be resized internally first."

Mar 15 '23 11:03 christian-steinmeyer

I'm going to transfer this issue to google-cloud-python, as we're planning to move the code for google-cloud-vision there in the next 1-2 weeks.

Oct 21 '23 20:10 parthea

This seems like fundamentally a Vision API/doc issue, not a client library-specific issue, though on the client library side, we should ensure that we are showing the user all the error details provided by the service. We'll look into this and keep you updated.

Dec 08 '23 21:12 vchudnov-g