[question] Text detection evaluation is very slow with the rotation flag.
Bug description
I am testing db_resnet50 with the provided doctr/references/detection/evaluate_pytorch.py script. When I don't use "--rotation", the eval script is relatively fast (40s for FUNSD), but with that flag it takes extremely long (more than 30 minutes, and it is still running). I understand that it should be slower with "--rotation" because the script has to handle polygons, but the actual runtime is far longer than I expected. Can someone double-check this?
Code snippet to reproduce the bug
python evaluate_pytorch.py db_resnet50 --rotation --amp -b 32
Error traceback
Run without --rotation:
Namespace(arch='db_resnet50', dataset='FUNSD', batch_size=32, device=None, size=None, workers=None, rotation=False, resume=None, amp=True)
Unpacking FUNSD: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 149/149 [00:00<00:00, 2850.94it/s]
Unpacking FUNSD: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 2669.90it/s]
Test set loaded in 0.7717s (199 samples in 7 batches)
Running evaluation
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:40<00:00, 5.82s/it]
Validation loss: 0.732066 (Recall: 83.55% | Precision: 86.67% | Mean IoU: 67.83%)
Run with --rotation:
Still running xD
Environment
DocTR version: 0.8.0a0
TensorFlow version: N/A
PyTorch version: 2.1.0a0+4136153 (torchvision 0.16.0a0)
OpenCV version: 4.9.0
OS: Ubuntu 22.04.2 LTS
Python version: 3.10.6
Is CUDA available (TensorFlow): N/A
Is CUDA available (PyTorch): Yes
CUDA runtime version: 12.1.105
GPU models and configuration: GPU 0: NVIDIA A30
Nvidia driver version: 525.147.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.2
Deep Learning backend
is_tf_available: False
is_torch_available: True
This is probably a duplicate, but I am wondering if there is a solution?
I am using broadcasting, but it is still slow.
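For context, this is roughly the shape of the problem: axis-aligned box IoU can be computed with a single broadcast expression, while rotated polygons are typically compared by rasterizing each pair into masks. The snippet below is only an illustration of that cost difference, not necessarily docTR's exact metric code; the function names, mask size, and relative-coordinate convention are my own assumptions.

```python
import cv2
import numpy as np

def box_iou(boxes_1: np.ndarray, boxes_2: np.ndarray) -> np.ndarray:
    """Broadcasted IoU matrix for axis-aligned (xmin, ymin, xmax, ymax) boxes: cheap."""
    l1, t1, r1, b1 = np.split(boxes_1[:, None], 4, axis=-1)  # each (N, 1, 1)
    l2, t2, r2, b2 = np.split(boxes_2[None, :], 4, axis=-1)  # each (1, M, 1)
    inter_w = np.clip(np.minimum(r1, r2) - np.maximum(l1, l2), 0, None)
    inter_h = np.clip(np.minimum(b1, b2) - np.maximum(t1, t2), 0, None)
    inter = (inter_w * inter_h).squeeze(-1)
    union = ((r1 - l1) * (b1 - t1) + (r2 - l2) * (b2 - t2)).squeeze(-1) - inter
    return inter / np.maximum(union, 1e-8)

def mask_iou(poly_1: np.ndarray, poly_2: np.ndarray, shape=(1024, 1024)) -> float:
    """IoU of two 4-point polygons (relative coords) via rasterized masks: O(H * W) per pair."""
    mask_1, mask_2 = np.zeros(shape, np.uint8), np.zeros(shape, np.uint8)
    cv2.fillPoly(mask_1, [(poly_1 * shape[::-1]).round().astype(np.int32)], 1)
    cv2.fillPoly(mask_2, [(poly_2 * shape[::-1]).round().astype(np.int32)], 1)
    union = np.logical_or(mask_1, mask_2).sum()
    return float(np.logical_and(mask_1, mask_2).sum() / union) if union > 0 else 0.0
```

Calling something like mask_iou for every (ground truth, prediction) pair on every page allocates and scans two full-resolution masks per pair, which could explain the jump from seconds to tens of minutes.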
Hi @decadance-dance 👋,
You are right, we need to find a better/faster solution for the computation; that's a known issue. :) (Maybe refactoring to shapely could help, see the sketch below.)
Unfortunately, we are currently only 2 people working on docTR in our spare time, so it's difficult to cover everything at once 😅
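For reference, here is a minimal sketch of what a shapely-based pairwise polygon IoU could look like, assuming shapely >= 2.0 (where STRtree.query returns integer indices). The function name and overall structure are illustrative, not an existing docTR API.

```python
import numpy as np
from shapely.geometry import Polygon
from shapely.strtree import STRtree

def polygon_iou_matrix(gt_polys: np.ndarray, pred_polys: np.ndarray) -> np.ndarray:
    """Pairwise IoU between two sets of 4-point polygons of shape (N, 4, 2) and (M, 4, 2).

    Only pairs whose bounding boxes intersect (found via the STR tree) are evaluated,
    instead of computing the IoU for all N x M combinations.
    """
    gt_shapes = [Polygon(pts) for pts in gt_polys]
    pred_shapes = [Polygon(pts) for pts in pred_polys]
    iou_mat = np.zeros((len(gt_shapes), len(pred_shapes)), dtype=np.float64)

    tree = STRtree(pred_shapes)
    for i, gt in enumerate(gt_shapes):
        for j in tree.query(gt):  # indices of predictions overlapping gt's bounding box
            inter = gt.intersection(pred_shapes[j]).area
            union = gt.union(pred_shapes[j]).area
            if union > 0:
                iou_mat[i, j] = inter / union
    return iou_mat
```

Whether this actually beats the current computation would of course need benchmarking on FUNSD, but polygon intersection in shapely avoids rasterizing masks entirely.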
@felixdittrich92 got it. Maybe I'll take a look at this issue later to optimize it.
Sounds good, we are happy about every contribution 👍🏼