ValueError: Coordinate 'right' is less than 'left'

Open atgreen opened this issue 1 year ago • 5 comments

Given this code:

import openparse

basic_doc_path = "mydoc.pdf"
parser = openparse.DocumentParser(
    table_args={
        "parsing_algorithm": "unitable",
        "min_table_confidence": 0.8,
    }
)

parsed_basic_doc = parser.parse(basic_doc_path)

for node in parsed_basic_doc.nodes:
    print(node.json())

I'm getting the following error:

  File "/home/green/git/cl-langtools/test.py", line 11, in <module>
    parsed_basic_doc = parser.parse(basic_doc_path)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/doc_parser.py", line 106, in parse
    table_elems = tables.ingest(doc, table_args_obj, verbose=self._verbose)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/tables/parse.py", line 223, in ingest
    return _ingest_with_unitable(doc, parsing_args, verbose)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/tables/parse.py", line 189, in _ingest_with_unitable
    table_str = table_img_to_html(table_img)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/tables/unitable/core.py", line 192, in table_img_to_html
    pred_cell_lst = predict_cells(image_tensor, pred_bbox, table_image)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/openparse/tables/unitable/core.py", line 160, in predict_cells
    _image_to_tensor(image.crop(bbox), size=(112, 448)) for bbox in pred_bboxes
                     ^^^^^^^^^^^^^^^^
  File "/home/green/git/cl-langtools/tools/open-parse/lib64/python3.12/site-packages/PIL/Image.py", line 1237, in crop
    raise ValueError(msg)
ValueError: Coordinate 'right' is less than 'left'

If it helps, my input document is this one: https://www.rbc.com/investor-relations/_assets-custom/pdf/ar_2023_e.pdf

atgreen, Apr 08 '24 18:04

Thanks for this great library.

I'm also getting ValueError: Coordinate 'right' is less than 'left' with this PDF and almost the same code:

import openparse

basic_doc_path = "sample.pdf"
parser = openparse.DocumentParser(
    table_args={
        "parsing_algorithm": "unitable",
        "min_table_confidence": 0.8
    },
)

parsed_doc = parser.parse(basic_doc_path)

giovannibonetti, Apr 08 '24 18:04

Any updates on this one? I'm getting the same error. I'm assuming it means a table is in an unexpected position.

KuriaMaingi, Sep 20 '24 08:09
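
For anyone reading the traceback above: the exception is raised by Pillow's `Image.crop`, which rejects a box whose `right` is less than `left`. The box comes from `pred_bboxes` in `predict_cells`, so the likely trigger is an inverted predicted cell box rather than the position of the table on the page. Below is a minimal sketch of a sanitizing step, assuming each box is a `(left, upper, right, lower)` tuple in pixel coordinates. The helper names `normalize_bbox` and `is_degenerate` are hypothetical and not part of openparse, and this is not a confirmed fix from the maintainers.

```python
import math
from typing import Sequence, Tuple

Box = Tuple[int, int, int, int]


def normalize_bbox(bbox: Sequence[float], width: int, height: int) -> Box:
    """Return a (left, upper, right, lower) box with ordered, in-bounds edges.

    Hypothetical helper: assumes the input is (x0, y0, x1, y1) in pixels.
    """
    x0, y0, x1, y1 = bbox
    # Swap inverted edges so that left <= right and upper <= lower.
    left, right = sorted((x0, x1))
    upper, lower = sorted((y0, y1))
    # Round outward to whole pixels, then clamp to the image bounds.
    left = max(0, min(math.floor(left), width))
    upper = max(0, min(math.floor(upper), height))
    right = max(0, min(math.ceil(right), width))
    lower = max(0, min(math.ceil(lower), height))
    return left, upper, right, lower


def is_degenerate(box: Box) -> bool:
    """True when the box has zero width or zero height."""
    left, upper, right, lower = box
    return right <= left or lower <= upper


if __name__ == "__main__":
    # An inverted box like the one that would trigger the ValueError.
    print(normalize_bbox((120, 10, 80, 40), width=200, height=100))
```

In the calling code this could be applied as `image.crop(normalize_bbox(bbox, *image.size))`, skipping any box for which `is_degenerate` returns true. Swapping the edges only avoids the crash. If the model predicted an inverted box, the recovered cell may still be wrong, so logging the offending boxes would help when reporting back on this issue.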