Seems like EasyOCR is not using GPU
Question
I am running docling on Ubuntu with Nvidia GPUs. However, it's still taking a very long time (it doesn't actually finish) to parse a 300-page PDF with images. Is there anything specific I should check to debug this?
I have enabled the GPU explicitly in the pipeline options for EasyOCR.
Would be great if someone can help here. Thank you!
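For reference, here is a minimal sketch of what "enabling the GPU explicitly" typically looks like with the docling 2.x Python API. The original post does not show its configuration, so this is an assumption about the setup, not the poster's actual code. The helper name `build_converter` is hypothetical, and the import location of `AcceleratorOptions` has moved between docling releases, so the fallback import below is a guess that may need adjusting for your version.

```python
def build_converter(use_gpu: bool = True):
    """Return a docling DocumentConverter with EasyOCR enabled.

    Hypothetical helper for illustration. Imports live inside the function
    so the module can be loaded even where docling is not installed.
    """
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import EasyOcrOptions, PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    # Assumption: newer releases expose these in accelerator_options,
    # older 2.x releases in pipeline_options.
    try:
        from docling.datamodel.accelerator_options import (
            AcceleratorDevice,
            AcceleratorOptions,
        )
    except ImportError:
        from docling.datamodel.pipeline_options import (
            AcceleratorDevice,
            AcceleratorOptions,
        )

    opts = PdfPipelineOptions()
    opts.do_ocr = True
    # Ask EasyOCR itself to run on the GPU.
    opts.ocr_options = EasyOcrOptions(use_gpu=use_gpu)
    # Ask the docling models (layout, tables) to run on CUDA or CPU.
    opts.accelerator_options = AcceleratorOptions(
        device=AcceleratorDevice.CUDA if use_gpu else AcceleratorDevice.CPU
    )

    return DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=opts)}
    )
```

Usage would be along the lines of `build_converter().convert("report.pdf")`, where `report.pdf` is a placeholder file name. Requesting CUDA here only helps if the installed PyTorch build can actually see the GPU.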
@nikhildigde What is your configuration? How long is it taking? Even with GPUs, some 300-page documents can take up to 5 minutes to parse, but that is still 8-10 times faster than using the CPU.
Running on A100 (6 GPU)
Docling version: docling 2.15.1, docling-core 2.15.1, docling-ibm-models 3.2.1, docling-parse 3.1.1 ...
Python version: Python 3.11.11
How long does it take to parse a 300-page PDF?
Kind of forever... I waited almost 30 minutes. With OCR off: 130 sec. With Tesseract: 230 sec (CPU).
I observed a similar time on CPU, but on GPU, parsing with OCR and TableFormerMode.ACCURATE takes less than 5 minutes. You should verify that docling is using GPUs; the log output should contain a line like: Accelerator device: 'cuda:0'
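One quick way to check the GPU side independently of docling is to ask PyTorch directly, since EasyOCR runs on PyTorch. The sketch below is not part of docling; `cuda_report` is a hypothetical helper name. It reports whether PyTorch is installed, whether it sees CUDA, and what `nvidia-smi -L` prints if that tool is on the PATH. If the driver lists the GPUs but `cuda_available` is `False`, a CPU-only PyTorch build is a likely cause worth ruling out.

```python
import shutil
import subprocess


def cuda_report() -> dict:
    """Collect basic facts about GPU visibility from PyTorch and nvidia-smi."""
    report = {
        "torch_installed": False,
        "cuda_available": False,
        "device_count": 0,
        "device_names": [],
        "nvidia_smi": None,
    }

    # PyTorch view: this is what EasyOCR can actually use.
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None:
        report["torch_installed"] = True
        report["cuda_available"] = bool(torch.cuda.is_available())
        if report["cuda_available"]:
            report["device_count"] = torch.cuda.device_count()
            report["device_names"] = [
                torch.cuda.get_device_name(i)
                for i in range(report["device_count"])
            ]

    # Driver view: lists GPUs even when PyTorch is a CPU-only build.
    if shutil.which("nvidia-smi"):
        try:
            out = subprocess.run(
                ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=10
            )
            report["nvidia_smi"] = out.stdout.strip()
        except (subprocess.SubprocessError, OSError):
            pass

    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

To see docling's own device message, it may also help to enable INFO-level logging with `logging.basicConfig(level=logging.INFO)` before running the conversion, assuming the message is emitted at that level.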
Should I set it explicitly? I thought it did that automatically.
@nikhildigde You are running a very old version of docling (2.15.1). I would recommend upgrading to a newer version and trying again. I am fairly sure we have solved this issue in the newer versions.
@nikhildigde Please let us know if the latest version of docling works with GPU.
@nikhildigde I will close this for now. I am fairly sure that the latest versions support GPU.
Hey @PeterStaar-IBM, apologies, I didn't test it yet. I will do that in the next couple of days and update here. Thank you.