`add_page_orientation` does not handle pages without words (e.g. blank pages) (Python)

Open MattExact opened this issue 2 years ago • 0 comments

`add_page_orientation` raises `statistics.StatisticsError` on documents that contain blank pages (pages with no words).

If the input data for `statistics.mode` is empty, `StatisticsError` is raised (see the Python docs).
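A minimal reproduction of the failure mode, independent of Textract. The empty list is only a stand-in for the orientations that would be collected from a page with no words; the variable names are illustrative and not taken from the library.

```python
import statistics

# Stand-in for the per-word orientations of a blank page (nothing to collect).
blank_page_orientations = []

try:
    statistics.mode(blank_page_orientations)
    raised = False
except statistics.StatisticsError as exc:
    # statistics.mode refuses empty input instead of returning a default.
    raised = True
    print(f"StatisticsError: {exc}")
```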

This could be fixed with something along the lines of:

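# Collect one rounded orientation per word that has a polygon.
word_orientations = [
    round(__get_degree_from_polygon(w.geometry.polygon))
    for w in words
    if w.geometry and w.geometry.polygon
]
# Fall back to 0 degrees when the page has no words (e.g. a blank page).
orientation = statistics.mode(word_orientations) if word_orientations else 0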

Or some other alternative 🤷‍♂️
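One such alternative, sketched here as a standalone helper: catch the exception instead of checking for an empty list first. `mode_or_default` and its `default` parameter are hypothetical names for illustration, not part of `trp`, and whether 0 is the right fallback orientation for a blank page is an assumption.

```python
import statistics
from typing import Iterable


def mode_or_default(values: Iterable[int], default: int = 0) -> int:
    """Return the most common value, or `default` when there are no values."""
    try:
        return statistics.mode(values)
    except statistics.StatisticsError:
        # Raised for empty input, e.g. a page without any words.
        return default


print(mode_or_default([0, 90, 90]))  # page with words
print(mode_or_default([]))           # blank page
```

One caveat with this variant: on Python versions before 3.8, `statistics.mode` also raises `StatisticsError` when there is more than one mode, so the `except` branch would return the default in that case as well. The explicit emptiness check does not have that side effect.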

https://github.com/aws-samples/amazon-textract-response-parser/blob/541c07a12d603deed70699357f865d6974369c7b/src-python/trp/t_pipeline.py#L136-L150

Jul 25 '23 15:07 MattExact