kenho211

9 comments by kenho211

I have the same error, but the cache directory is `iopath_cache` instead of `fvcore_cache`. ![Capture](https://user-images.githubusercontent.com/32618031/114656350-7c02ba80-9d20-11eb-8ce2-0ea0ad1d7539.PNG)

When trying to avoid duplicating text by discarding overlapping bounding boxes, a threshold of 80% is used (intersection of bounding boxes over the first bounding box in the pair). In...
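The overlap test described above can be sketched as follows. This is only an illustration, not the project's actual code: the `Box` type, the function names, and the choice of `>=` for the comparison are all assumptions made for the example. The only details taken from the comment are the 80% threshold and the fact that the intersection is divided by the area of the first box in the pair rather than by the union.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box (hypothetical type for this sketch)."""
    x0: float
    y0: float
    x1: float
    y1: float


def area(box: Box) -> float:
    # Clamp to zero so that inverted or empty boxes have no area.
    return max(0.0, box.x1 - box.x0) * max(0.0, box.y1 - box.y0)


def overlap_over_first(first: Box, second: Box) -> float:
    """Intersection area divided by the area of the FIRST box only."""
    intersection = Box(
        max(first.x0, second.x0),
        max(first.y0, second.y0),
        min(first.x1, second.x1),
        min(first.y1, second.y1),
    )
    first_area = area(first)
    return area(intersection) / first_area if first_area > 0 else 0.0


def is_duplicate(first: Box, second: Box, threshold: float = 0.8) -> bool:
    # Assumption: a pair counts as overlapping at or above the threshold.
    return overlap_over_first(first, second) >= threshold


small = Box(0, 0, 2, 2)
large = Box(0, 0, 10, 10)
print(is_duplicate(small, large))  # small box lies fully inside the large one
print(is_duplicate(large, small))  # same pair, opposite order
```

Because the ratio is taken over the first box only, the measure is not symmetric: the result for a pair can change when the order of the two boxes is swapped.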

Yes, I am using a unified annotation format (labelling all text in the same text line as one bbox). Thank you for the suggestion on modifying the post-processing params.

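I am reading through the tips in the documentation (https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/finetune.md):
1. The pretrained models provided by PP-OCR generalize well.
2. Adding a small amount of real data (detection task >= 500 images, recognition task >= 5000 images) greatly improves detection and recognition in vertical-domain scenarios.
3. During fine-tuning, adding real general-scene data can further improve model accuracy and generalization.
4. In image detection tasks, increasing the prediction scale of the image can further improve detection of smaller text regions.
5. During fine-tuning, the hyperparameters need to be adjusted appropriately (learning rate and batch size are the most important) to get better fine-tuning results.

For point 2, is "real data" (真实数据) referring to scene text...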

Same issue on Ubuntu 18.04 using the Docker image python:3.8.12

https://github.com/m-bain/whisperX/pull/795 seems to have fixed this

`from_pretrained` accepts either a Hugging Face model name or a model file path.

I encountered another error for audio without speech, not the same one as in https://github.com/SYSTRAN/faster-whisper/pull/973: ` File "/home/ubuntu/.local/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 362, in transcribe clip_timestamps = merge_segments(active_segments, vad_parameters) File "/home/ubuntu/.local/lib/python3.10/site-packages/faster_whisper/vad.py", line 315, in...

> Hello, what is the use case where it needs to be supplied as `None`?

From https://github.com/openai/whisper/pull/676, the use case of `None` should be to conduct a majority vote for...