
`infer_table_structure` leads to `Failed to initialize the model`

Open spongxin opened this issue 1 year ago • 0 comments

Describe the bug

I use partition_pdf to parse a PDF file. When I set infer_table_structure=True, the following output appears:

This function will be deprecated in a future release and unstructured will simply use the DEFAULT_MODEL from unstructured_inference.model.base to set default model name
Failed to initialize the model.
Ensure that the model is correct
Review the parameters to initialize a UnstructuredTableTransformerModel obj

To Reproduce

docker run -dt --name unstructured downloads.unstructured.io/unstructured-io/unstructured:latest
docker exec -it unstructured bash

My code is as follows:

from unstructured.partition.pdf import partition_pdf
from collections import Counter

try:
    elements = partition_pdf(
        filename=filename,
        strategy='hi_res',
        infer_table_structure=True
    )
    print(Counter(type(element) for element in elements))
except Exception as e:
    print(e)
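
Printing only `e` hides where the failure originates. The following is a minimal sketch of a wrapper that keeps the full traceback, which may help show what raised `Failed to initialize the model`. `run_and_report` is a hypothetical helper, not part of unstructured, and the commented-out call simply reuses the arguments from the snippet above.

```python
import traceback
from typing import Any, Callable, Optional, Tuple


def run_and_report(
    func: Callable[..., Any], *args: Any, **kwargs: Any
) -> Tuple[Optional[Any], Optional[str]]:
    """Call func and return (result, None), or (None, traceback text) on failure."""
    try:
        return func(*args, **kwargs), None
    except Exception:
        # format_exc() keeps the exception type, message, and the file/line of each frame
        return None, traceback.format_exc()


# Hypothetical usage with the call from the report (requires unstructured to be installed):
# from unstructured.partition.pdf import partition_pdf
# elements, tb = run_and_report(
#     partition_pdf, filename=filename, strategy="hi_res", infer_table_structure=True
# )
# if tb:
#     print(tb)
```

The traceback text can then be pasted into the issue alongside the log lines above.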

Expected behavior

I expect to obtain text_as_html data for Table elements by setting infer_table_structure=True. When I set infer_table_structure=False, the program runs normally, but without that data.
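
For reference, a minimal sketch of how the expected data might be read once partitioning succeeds. It assumes Table elements report `category == "Table"` and carry their HTML in `metadata.text_as_html`. `collect_table_html` is a hypothetical helper, and the sample objects are stand-ins so the sketch runs without unstructured installed.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def collect_table_html(elements: Iterable[Any]) -> List[str]:
    """Return the text_as_html string of every Table element that has one."""
    html_tables = []
    for element in elements:
        # Assumption: Table elements expose category == "Table"
        if getattr(element, "category", None) != "Table":
            continue
        # Assumption: the HTML rendering lives in metadata.text_as_html
        html = getattr(getattr(element, "metadata", None), "text_as_html", None)
        if html:
            html_tables.append(html)
    return html_tables


# Stand-in elements; real ones would come from partition_pdf(...)
sample = [
    SimpleNamespace(category="Title", metadata=SimpleNamespace(text_as_html=None)),
    SimpleNamespace(
        category="Table",
        metadata=SimpleNamespace(text_as_html="<table><tr><td>1</td></tr></table>"),
    ),
]
print(collect_table_html(sample))
```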

Environment Info

[notebook-user@57ba27f71222 ~]$ python3 /data/unstructured-main/scripts/collect_env.py
/data/unstructured-main/scripts/collect_env.py:5: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
OS version: Linux-5.4.0-42-generic-x86_64-with-glibc2.34
Python version: 3.10.13
unstructured version: None
unstructured-inference version: 0.7.27
pytesseract version: 0.3.10
Torch version: 2.2.2
Detectron2 is not installed


[notice] A new release of pip is available: 23.2.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
PaddleOCR is not installed
Libmagic version: file-5.39
magic file from /etc/magic:/usr/share/misc/magic
LibreOffice version: LibreOffice 7.1.8.1 10(Build:1)
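
The output above reports `unstructured version: None` and a `pkg_resources` deprecation warning. The following is a minimal sketch of a similar version check using the standard library's `importlib.metadata` instead. `installed_versions` is a hypothetical helper, not part of collect_env.py, and the package names passed to it are the distribution names assumed from the report.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if not found."""
    result: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # No installed distribution metadata under this name
            result[name] = None
    return result


print(installed_versions(["unstructured", "unstructured-inference", "torch"]))
```

A `None` here would mean that no installed distribution metadata was found under that name in the current environment.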

Thank you in advance for any help with this issue!

spongxin · Apr 23 '24 07:04