`infer_table_structure=True` leads to `Failed to initialize the model`
Describe the bug
I use partition_pdf to parse a PDF file. When I set infer_table_structure=True, I get the following output:
This function will be deprecated in a future release and unstructured will simply use the DEFAULT_MODEL from unstructured_inference.model.base to set default model name
Failed to initialize the model.
Ensure that the model is correct
Review the parameters to initialize a UnstructuredTableTransformerModel obj
To Reproduce
docker run -dt --name unstructured downloads.unstructured.io/unstructured-io/unstructured:latest
docker exec -it unstructured bash
My code, run inside the container, is as follows:
from unstructured.partition.pdf import partition_pdf
from collections import Counter

try:
    elements = partition_pdf(
        filename=filename,
        strategy='hi_res',
        infer_table_structure=True
    )
    print(Counter(type(element) for element in elements))
except Exception as e:
    print(e)
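Because `print(e)` shows only the exception message, the underlying cause of the model initialization failure may be hidden. As a minimal sketch for gathering more detail, the call could be wrapped in a small helper that returns the full traceback text. The helper name `run_with_traceback` is hypothetical and not part of unstructured; it only uses the standard-library `traceback` module.

```python
import traceback


def run_with_traceback(func, *args, **kwargs):
    """Call func and return (result, None), or (None, traceback_text) if it raises.

    Hypothetical helper for debugging; not part of the unstructured API.
    """
    try:
        return func(*args, **kwargs), None
    except Exception:
        # format_exc() returns the full stack trace of the active exception.
        return None, traceback.format_exc()
```

With it, the call above would become `elements, tb = run_with_traceback(partition_pdf, filename=filename, strategy='hi_res', infer_table_structure=True)`, and `tb` could be printed or attached to the report when it is not None.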
Expected behavior
I want to obtain the text_as_html data for Table elements by setting infer_table_structure=True. When I set infer_table_structure=False, the program runs normally.
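To make the goal concrete, here is a rough sketch of how I intend to read the table HTML once partitioning succeeds. It assumes each element exposes a `category` string and a `metadata.text_as_html` attribute; the function name `table_html` is hypothetical, and the sample objects below are stand-ins so the snippet runs without unstructured installed.

```python
from types import SimpleNamespace


def table_html(elements):
    """Return the text_as_html value of every Table element that has one."""
    html = []
    for el in elements:
        # Skip anything that is not categorized as a table.
        if getattr(el, "category", None) != "Table":
            continue
        value = getattr(getattr(el, "metadata", None), "text_as_html", None)
        if value:
            html.append(value)
    return html


# Stand-in objects (assumption): they mimic only the two attributes used above.
sample = [
    SimpleNamespace(category="NarrativeText",
                    metadata=SimpleNamespace(text_as_html=None)),
    SimpleNamespace(category="Table",
                    metadata=SimpleNamespace(
                        text_as_html="<table><tr><td>1</td></tr></table>")),
]
print(table_html(sample))
```

In real use, `sample` would be replaced by the `elements` list returned from `partition_pdf`.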
Environment Info
[notebook-user@57ba27f71222 ~]$ python3 /data/unstructured-main/scripts/collect_env.py
/data/unstructured-main/scripts/collect_env.py:5: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
OS version: Linux-5.4.0-42-generic-x86_64-with-glibc2.34
Python version: 3.10.13
unstructured version: None
unstructured-inference version: 0.7.27
pytesseract version: 0.3.10
Torch version: 2.2.2
Detectron2 is not installed
[notice] A new release of pip is available: 23.2.1 -> 24.0
[notice] To update, run: pip install --upgrade pip
PaddleOCR is not installed
Libmagic version: file-5.39
magic file from /etc/magic:/usr/share/misc/magic
LibreOffice version: LibreOffice 7.1.8.1 10(Build:1)
Thank you in advance for any help with this issue!