Thanks for your great work! I know the model code is based on wenet; could you tell me what pipeline is used?
When I transcribed one minute of Chinese audio, I found that the output had no punctuation, and I got this warning:
warning: The current model is English-only but the language parameter is set to 'zh'; using 'en' instead.
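That warning comes from Whisper's English-only checkpoints (model names ending in `.en`), which ignore any other `language` setting. A minimal sketch of that fallback behavior (a simplified reproduction for illustration, not Whisper's actual source):

```python
import warnings

def resolve_language(model_name: str, language: str) -> str:
    """Mimic Whisper's behavior: English-only checkpoints (names ending
    in '.en') force the language to 'en' regardless of what was requested."""
    if model_name.endswith(".en") and language not in ("en", "English"):
        warnings.warn(
            f"The current model is English-only but the language parameter "
            f"is set to '{language}'; using 'en' instead."
        )
        return "en"
    return language

# An English-only checkpoint falls back to 'en' with a warning;
# a multilingual checkpoint (e.g. 'small') keeps the requested language.
print(resolve_language("small.en", "zh"))  # -> en
print(resolve_language("small", "zh"))     # -> zh
```

So the fix for Chinese audio is to load a multilingual checkpoint (e.g. `small` rather than `small.en`) and pass `language="zh"`; the lack of punctuation is likewise typically a symptom of decoding Chinese speech with an English-only model.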
When I fine-tune following https://github.com/huggingface/distil-whisper/tree/main/training, I get: NotImplementedError: The model type whisper is not yet supported to be used with BetterTransformer.
Accelerate is a tool for multi-machine training, so why do you use it on a single GPU?
I think this is great work! But there are limitations, as noted in the introduction. Is there any better recent work based on SpeechGPT?
Thanks for your excellent work! I want to ask how the discrete tokenizer performs on ASR. Can you share your understanding? Thanks!
Thanks for the LLaMA-Omni work! For learning purposes, we tried to reproduce the training code from this paper, hoping it will be of some help to people working on end-to-end speech dialogue. Suggestions and discussion are welcome: https://github.com/wntg/LLaMA-Omni
I want to use WavTokenizer for speech AI. Does WavTokenizer support streaming inference?