Investigate how to remove the OpenCV dependency
OpenCV is a dependency of docling-ibm-models (https://github.com/DS4SD/docling-ibm-models), where it is used to load and resize images for the TableFormer model.
Here are the places where OpenCV is called:
- tableformer/data_management/functional.py:resize()
- tableformer/data_management/tf_predictor.py:resize_img()
- tests/test_tf_predictor.py:test_tf_predictor()
Additionally, to correctly evaluate the effect of replacing OpenCV with another image library, we should measure the impact on:
- TableFormer input tensors (image loading, normalization, resizing).
- TableFormer output predictions.
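One way to quantify the impact on the input tensors is to run the same image through both pipelines and compare the resulting arrays element by element. The snippet below is only a sketch under that assumption: `compare_arrays` is a hypothetical helper, not something that exists in docling-ibm-models, and the synthetic arrays stand in for the real preprocessed images.

```python
import numpy as np


def compare_arrays(reference: np.ndarray, candidate: np.ndarray) -> dict:
    """Summarize element-wise differences between two same-shaped arrays."""
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    # Cast to float64 so uint8 inputs do not wrap around on subtraction.
    diff = np.abs(reference.astype(np.float64) - candidate.astype(np.float64))
    return {
        "max_abs_diff": float(diff.max()),
        "mean_abs_diff": float(diff.mean()),
        "fraction_changed": float(np.count_nonzero(diff) / diff.size),
    }


if __name__ == "__main__":
    # Synthetic stand-ins for "resized with OpenCV" and "resized with another library".
    rng = np.random.default_rng(0)
    a = rng.integers(0, 256, size=(448, 448, 3), dtype=np.uint8)
    b = a.copy()
    b[0, 0, 0] ^= 1  # flip one bit so the two arrays differ in exactly one element
    print(compare_arrays(a, b))
```

The same helper could be applied to the model outputs (for example predicted bounding boxes) as long as they are converted to arrays of equal shape first.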
@nikos-livathinos I ran a quick test with Pillow.
I created two functions to resize the image, one using Pillow and the other using OpenCV. Both were given the same input image as a numpy.ndarray.
I used Python's timeit module for 100 runs. Below are the performance numbers:
Avg time for OpenCV: 0.008613 seconds
Avg time for Pillow: 0.072365 seconds
Now, if we provide the image file path as the input instead of a numpy.ndarray and open the image with Pillow directly, performance gets slightly better:
Avg time for Pillow: 0.067890 seconds
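For anyone who wants to reproduce this kind of comparison, here is a rough sketch of a timeit harness. It is not the exact script behind the numbers above: the image size, the target size, the bilinear interpolation, and the function names `resize_opencv`, `resize_pillow`, and `avg_time` are all assumptions made for illustration, so absolute timings will differ.

```python
import timeit

import numpy as np


def resize_opencv(img: np.ndarray, size: tuple) -> np.ndarray:
    import cv2  # imported lazily so the script still loads without OpenCV

    # cv2.resize expects the target size as (width, height).
    return cv2.resize(img, size, interpolation=cv2.INTER_LINEAR)


def resize_pillow(img: np.ndarray, size: tuple) -> np.ndarray:
    from PIL import Image

    # Image.resize also takes (width, height); the ndarray round trip is part of the cost.
    return np.asarray(Image.fromarray(img).resize(size, Image.BILINEAR))


def avg_time(fn, runs: int = 100) -> float:
    """Average seconds per call of fn() over `runs` calls."""
    return timeit.timeit(fn, number=runs) / runs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(2000, 1500, 3), dtype=np.uint8)  # assumed input size
    size = (448, 448)  # assumed target size
    for name, fn in [("OpenCV", resize_opencv), ("Pillow", resize_pillow)]:
        try:
            print(f"Avg time for {name}: {avg_time(lambda: fn(img, size)):.6f} seconds")
        except ImportError:
            print(f"{name} is not installed, skipped")
```

Note that the Pillow function includes the conversion from and back to numpy.ndarray, which is one reason the two libraries are not compared on equal terms when the input is already an array.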
@sgonsal could you also try the same with torchvision.transforms.functional.resize?
With a torch.Tensor input, it relies on torch.nn.functional.interpolate: https://github.com/pytorch/vision/blob/cb9fdbf11f884b0501d1c23a48af258ab4acb57f/torchvision/transforms/_functional_tensor.py#L467
@pavel-denisov-fraunhofer Done.
I ran a similar test to the one in my comment above, this time with torchvision.transforms.functional.resize:
Avg time for torchvision: 0.071132 seconds
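A sketch of what a torchvision-based resize might look like for a numpy.ndarray input, in case it helps with comparing setups. This is an assumption about the shape of such a function, not the code that produced the number above; `resize_torchvision` is a hypothetical name, and bilinear interpolation with antialiasing is an arbitrary choice.

```python
import numpy as np


def resize_torchvision(img: np.ndarray, size: tuple) -> np.ndarray:
    """Resize an HxWxC uint8 image with torchvision; size is (height, width)."""
    import torch
    from torchvision.transforms import InterpolationMode
    from torchvision.transforms import functional as F

    # torchvision expects channels first (CxHxW); permute returns a view, not a copy.
    tensor = torch.from_numpy(img).permute(2, 0, 1)
    resized = F.resize(
        tensor,
        list(size),
        interpolation=InterpolationMode.BILINEAR,
        antialias=True,
    )
    # Back to HxWxC so the result is comparable with the OpenCV and Pillow outputs.
    return resized.permute(1, 2, 0).contiguous().numpy()
```

Unlike OpenCV and Pillow, torchvision takes the target size as (height, width), which is easy to get wrong when swapping libraries. The function can be timed with the same timeit approach described in the earlier comment.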