Luke
Luke
3_enhancement.wav is 16384 samples, but 3_clean.wav is 16000 samples
Bother to mention this issue again. Seems that it works fine in the beginning a few minutes, and it becomes worse with echo/noise after 10 to 20 minutes. Does anyone...
To initialize the sampleRate at 16000 to bypass resampling can mitigate the echo/noise around 20 minutes. However, there is longer delay of transcripts after 20 minutes.
Thank you for your excellent work. Besides the license for the source, how about the pre-trained model? like "https://paddlegan.bj.bcebos.com/applications/first_order_model/vox-cpk.pdparams" thanks, Luke
When I fix a test set, I get the threshold. But how can I apply the threshold to an unseen dataset?
 It should not relate to memory. Please see this flow. Server drop the silent chunks, and client waits for response, and never send the next chunk. Thank you.
粤语TTS支持中英文混合吗?有没有说明连接?谢谢!