Real-time speech recognition and VAD perform poorly

Open liurongjie174 opened this issue 1 year ago • 3 comments

Notice: To help resolve issues more efficiently, please raise your issue following the template and fill in the details.

❓ Questions and Help

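I am using FunASR to recognize real-time speech. The upstream side pushes the stream to me as PCM over WebSocket (WS), with a packet size of 234, and I run recognition with the example funasr_wss_server.py. Both the VAD and the online recognition perform poorly. First, the VAD frequently returns [] as its result, which makes fun_asr_online slow. fun_asr also runs very slowly, so the real-time recognition results are pushed out very slowly.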
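To make the setup concrete, here is a minimal sketch of regrouping the small incoming packets into larger fixed-size chunks before they are forwarded for recognition. It is only an illustration, not a confirmed fix: `PcmRebuffer` and the 1920-byte chunk size are hypothetical names and values, and the audio format is not stated above. If the stream were 16 kHz, 16-bit mono PCM (an assumption), a 234-byte packet would hold about 7.3 ms of audio and 1920 bytes would hold 60 ms.

```python
from typing import List


class PcmRebuffer:
    """Accumulate small PCM packets and emit fixed-size chunks (hypothetical helper)."""

    def __init__(self, chunk_bytes: int) -> None:
        if chunk_bytes <= 0:
            raise ValueError("chunk_bytes must be positive")
        self.chunk_bytes = chunk_bytes
        self._buf = bytearray()

    def push(self, packet: bytes) -> List[bytes]:
        """Add one incoming packet and return every full chunk now available."""
        self._buf.extend(packet)
        chunks: List[bytes] = []
        while len(self._buf) >= self.chunk_bytes:
            chunks.append(bytes(self._buf[: self.chunk_bytes]))
            del self._buf[: self.chunk_bytes]
        return chunks

    def flush(self) -> bytes:
        """Return whatever is left (possibly empty) when the stream ends."""
        rest = bytes(self._buf)
        self._buf.clear()
        return rest


if __name__ == "__main__":
    # Assumed values: 234-byte packets as described above, 1920-byte output chunks.
    rebuffer = PcmRebuffer(chunk_bytes=1920)
    for _ in range(10):
        for chunk in rebuffer.push(b"\x00" * 234):
            print("forward chunk of", len(chunk), "bytes")
    print("leftover bytes at end of stream:", len(rebuffer.flush()))
```

Each chunk returned by `push` would then be sent on in place of the raw packets. Whether larger chunks actually change the VAD or online results in this setup is untested.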

Before asking:

search the issues.
search the docs.

What is your question?

Code

What have you tried?

What's your environment?

OS (e.g., Linux):
FunASR Version (e.g., 1.0.0):
ModelScope Version (e.g., 1.11.0):
PyTorch Version (e.g., 2.0.0):
How you installed funasr (pip, source):
Python version:
GPU (e.g., V100M32)
CUDA/cuDNN version (e.g., cuda11.7):
Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
Any other relevant information:

Jun 17 '24 01:06 liurongjie174