kenho211 comments


Cannot run the model on Windows.

I have the same error, but the cache directory is `iopath_cache` instead of `fvcore_cache`. ![Capture](https://user-images.githubusercontent.com/32618031/114656350-7c02ba80-9d20-11eb-8ce2-0ea0ad1d7539.PNG)

ZeroDivisionError: float division by zero

When trying to avoid duplicating text by discarding overlapping bounding boxes, a threshold of 80% is used (intersection of bounding boxes over the first bounding box in the pair). In...
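The overlap check described above can be sketched as follows. This is only an illustration, not the project's actual code: the function names, the `(x0, y0, x1, y1)` box layout, and the strict `>` comparison are all assumptions. It also assumes the division fails because the first box has zero area, and guards against that case.

```python
from typing import Tuple

# Hypothetical box layout: (x0, y0, x1, y1) with x0 <= x1 and y0 <= y1.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def overlap_over_first(first: Box, second: Box) -> float:
    """Intersection area divided by the area of the first box.

    Returns 0.0 when the first box has zero area, instead of
    raising ZeroDivisionError.
    """
    intersection = box_area((
        max(first[0], second[0]),
        max(first[1], second[1]),
        min(first[2], second[2]),
        min(first[3], second[3]),
    ))
    first_area = box_area(first)
    if first_area == 0.0:
        return 0.0
    return intersection / first_area


def is_duplicate(first: Box, second: Box, threshold: float = 0.8) -> bool:
    """True when the overlap ratio exceeds the threshold (80% here)."""
    return overlap_over_first(first, second) > threshold
```

With this guard, a zero-width or zero-height box is never treated as a duplicate; whether it should instead be dropped outright is a separate design choice.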

Bounding box in the same line break down into smaller ones after finetuning

Yes, I am using a unified annotation format (labelling all text in the same text line as one bbox). Thank you for the suggestion on modifying the post-processing params.

Bounding box in the same line break down into smaller ones after finetuning

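I am reading through the tips in the documentation (https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/finetune.md):
1. The pretrained models provided by PP-OCR generalize well.
2. Adding a small amount of real data (>=500 images for detection, >=5000 images for recognition) greatly improves detection and recognition in domain-specific scenarios.
3. When fine-tuning, adding real general-scene data can further improve the model's accuracy and generalization.
4. In image detection tasks, increasing the prediction scale of the image can further improve detection of smaller text regions.
5. When fine-tuning, hyperparameters need to be adjusted appropriately (learning rate and batch size matter most) to get better fine-tuning results.

For point 2, is "real data" (真实数据) referring to scene text...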

Unable to install hdbscan on colab.

Same issue on Ubuntu 18.04 using the Docker image `python:3.8.12`.

TypeError: TranscriptionOptions.new() got an unexpected keyword argument 'hotwords'

https://github.com/m-bain/whisperX/pull/795 seems to have fixed this

how can I deploy with my local model (e.g. faster_whisper_large_v3)?

`from_pretrained` accepts either a Hugging Face model name or a local model file path.

Use Silero VAD in Batched Mode

Encountered another error for audio without speech, not the same one as in https://github.com/SYSTRAN/faster-whisper/pull/973:
` File "/home/ubuntu/.local/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 362, in transcribe
    clip_timestamps = merge_segments(active_segments, vad_parameters)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/faster_whisper/vad.py", line 315, in...

change language_detection_threshold type to float

> Hello, what is the use case where it needs to be supplied as `None`?

From https://github.com/openai/whisper/pull/676, the use case of `None` should be to conduct a majority vote for...