Luke comments

Results: 7 comments of Luke

Problem when using

3_enhancement.wav is 16384 samples, but 3_clean.wav is 16000 samples
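One way to compare the two files despite the mismatch is to trim both to the shorter length. The sketch below assumes the extra 384 samples are trailing padding (16384 is 16 × 1024, so padding to a frame multiple is plausible, but that is a guess and not confirmed here). `align_lengths` is a hypothetical helper, and the zero-filled lists stand in for the decoded audio.

```python
def align_lengths(enhanced, clean):
    """Trim both sample sequences to the shorter length.

    Assumes the extra samples sit at the END of the longer signal
    (for example, padding to a frame multiple). This is an assumption.
    """
    n = min(len(enhanced), len(clean))
    return enhanced[:n], clean[:n]


# Placeholder signals with the lengths reported above.
enhanced = [0.0] * 16384
clean = [0.0] * 16000

enh_trimmed, clean_trimmed = align_lengths(enhanced, clean)
print(len(enh_trimmed), len(clean_trimmed))
```

If the padding turns out to be at the start or split across both ends, the slice offsets would need to change accordingly.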

Echo or noise in recorded audio

Sorry to bring this issue up again. It seems to work fine for the first few minutes, but the echo/noise gets worse after 10 to 20 minutes. Does anyone...

Echo or noise in recorded audio

Initializing the sampleRate at 16000 to bypass resampling mitigates the echo/noise that appears at around 20 minutes. However, the transcripts are delayed longer after 20 minutes.

Does it matter what the license is for the original architecture?

Thank you for your excellent work. Besides the license for the source code, what about the pre-trained model, for example "https://paddlegan.bj.bcebos.com/applications/first_order_model/vox-cpk.pdparams"? Thanks, Luke

two different audio files get cosine similarity over 0.9

With a fixed test set, I can derive a threshold. But how can I apply that threshold to an unseen dataset?
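For reference, the mechanics of applying a fixed threshold to a new pair can be sketched as below. This assumes the embeddings are plain lists of floats; `THRESHOLD = 0.9` and the `is_match` helper are hypothetical, chosen only to mirror the 0.9 figure in the title. Whether a threshold tuned on one test set transfers to unseen data is exactly the open question, and this sketch does not answer it.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


THRESHOLD = 0.9  # hypothetical value, tuned on the fixed test set


def is_match(a, b, threshold=THRESHOLD):
    """Apply the fixed threshold to a pair of embeddings."""
    return cosine_similarity(a, b) >= threshold


print(is_match([1.0, 0.0], [1.0, 0.0]))
print(is_match([1.0, 0.0], [0.0, 1.0]))
```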

test_microphone.py sometimes blocks

![image](https://github.com/user-attachments/assets/f1133166-6e88-4063-92d4-7c8508101515) It should not be related to memory. Please see this flow: the server drops the silent chunks, the client waits for a response, and so it never sends the next chunk. Thank you.
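The blocking flow can be reproduced in miniature, along with one possible mitigation: a receive timeout on the client. This is only a sketch under stated assumptions, not the project's actual code. `fake_server` and `send_all` are hypothetical stand-ins, a chunk counts as silent when all its samples are zero, and the timeout value is arbitrary.

```python
import queue


def fake_server(chunk, responses):
    """Hypothetical server: replies only to non-silent chunks."""
    if any(sample != 0 for sample in chunk):
        responses.put(f"ack:{len(chunk)}")
    # Silent chunks are dropped and no reply is ever queued.


def send_all(chunks, timeout=0.05):
    """Send every chunk, waiting at most `timeout` seconds per reply."""
    responses = queue.Queue()
    sent = 0
    replies = []
    for chunk in chunks:
        fake_server(chunk, responses)
        sent += 1
        try:
            replies.append(responses.get(timeout=timeout))
        except queue.Empty:
            # No reply in time: record the gap and keep sending
            # instead of blocking forever on the dropped chunk.
            replies.append(None)
    return sent, replies


sent, replies = send_all([[1, 2], [0, 0], [3, 4]])
print(sent, replies)
```

Without the timeout, the `get` call on the silent chunk would wait indefinitely, which matches the described hang.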

Does the open-source TTS model support both Chinese and English? Or does it support only Chinese, or only English?

Does the Cantonese TTS support mixed Chinese and English? Is there a link to the documentation? Thanks!