huazhi1024
Thanks very much.
I trained the model from scratch on the clean VCTK dataset using the code you provided. However, I noticed that the synthesized speech sounds noticeably worse than that obtained...
Thank you very much for your response. In the table below, the left side shows the results obtained with the pre-trained encoder and decoder models you provided, while...
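For a comparison like the table above, it can help to back up listening impressions with objective scores. Below is a minimal sketch that compares the two decoded outputs against the ground truth with PESQ and STOI; it is not part of the original thread, the file names are hypothetical placeholders, and it assumes the `pesq`, `pystoi`, `soundfile`, and `librosa` packages are installed (PESQ wideband requires 16 kHz audio, so everything is resampled first).

```python
# Hedged sketch: objective comparison of pretrained vs. from-scratch decoding.
import soundfile as sf
import librosa
from pesq import pesq
from pystoi import stoi

def load_16k(path):
    """Load a waveform, fold to mono, and resample to 16 kHz for PESQ/STOI."""
    wav, sr = sf.read(path)
    if wav.ndim > 1:                       # average channels if stereo
        wav = wav.mean(axis=1)
    return librosa.resample(wav, orig_sr=sr, target_sr=16000)

ref = load_16k("p225_001_ground_truth.wav")        # hypothetical paths
for tag in ("pretrained", "from_scratch"):
    deg = load_16k(f"p225_001_{tag}.wav")
    n = min(len(ref), len(deg))                    # align lengths before scoring
    print(tag,
          "PESQ:", pesq(16000, ref[:n], deg[:n], "wb"),
          "STOI:", stoi(ref[:n], deg[:n], 16000))
```

If the from-scratch model scores clearly lower on both metrics, the gap is in the model or training setup rather than in the listening conditions.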
> @zhanghuiyu123 Hi! May I ask roughly how long it took you to train 700k iterations for one system (e.g. AudioDec_v1)?

This is the time I spent training the different models...
> Hi @zhanghuiyu123, I would recommend that you mention you "reimplement AudioDec based on the open-source repo" in your paper to avoid any concerns from the reviewers, although I think...
Yes, I will finish it by Monday. However, I am currently encountering some issues with uploading the model to GitHub.
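The upload trouble mentioned above is commonly GitHub's 100 MB per-file limit for ordinary commits; one standard workaround is attaching the checkpoint to a release, where assets may be up to 2 GB. Below is a minimal sketch using the GitHub REST API release-asset endpoint; the repo name, release ID, token, and checkpoint file name are all hypothetical placeholders, not details from this thread.

```python
# Hedged sketch: upload a large model checkpoint as a GitHub release asset.
import requests

OWNER, REPO = "huazhi1024", "audiodec-release"   # hypothetical repo
RELEASE_ID = 123456789                           # id returned when the release was created
TOKEN = "ghp_..."                                # a personal access token with repo scope

with open("checkpoint-700000steps.pkl", "rb") as f:   # hypothetical file
    payload = f.read()

resp = requests.post(
    f"https://uploads.github.com/repos/{OWNER}/{REPO}"
    f"/releases/{RELEASE_ID}/assets",
    params={"name": "checkpoint-700000steps.pkl"},
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/octet-stream",
    },
    data=payload,
)
resp.raise_for_status()
print(resp.json()["browser_download_url"])       # shareable download link
```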
Hello, I have completed the model release. The download link and usage instructions have been sent to you via email. If you have any questions, please feel free to contact me...
# Update
# 16 kHz, 2 kbps codec model
## (1) Downstream results: Codec SUPERB application evaluation
Stage 1: Run speech emotion recognition. Acc: 75.97%
Stage 2: Run speaker-related evaluation. Parsing the...
# 44.1 kHz, 7 kbps codec model
## (1) Downstream results: Codec SUPERB application evaluation
Stage 1: Run speech emotion recognition. Acc: 75.49%
Stage 2: Run speaker-related evaluation. Parsing the resyn_trial.txt for...
# 48 kHz, 7.5 kbps codec model
## (1) Downstream results: Codec SUPERB application evaluation
Stage 1: Run speech emotion recognition. Acc: 75.28%
Stage 2: Run speaker-related evaluation. Parsing the resyn_trial.txt for...
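Each update above cuts off while parsing resyn_trial.txt for the speaker-related stage, which in such evaluations typically ends with an equal error rate (EER). As background, here is a minimal sketch of how an EER is usually computed from a scored trial list; the two-column "label score" layout assumed here is hypothetical, since the actual resyn_trial.txt format is not shown in the thread.

```python
# Hedged sketch: EER from a scored verification trial list.
import numpy as np
from sklearn.metrics import roc_curve

def compute_eer(labels, scores):
    """Equal error rate: the point where false-accept rate == false-reject rate."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))   # threshold closest to FAR == FRR
    return (fpr[idx] + fnr[idx]) / 2

labels, scores = [], []
with open("resyn_trial.txt") as f:          # assumed: "<0/1 label> <score>" per line
    for line in f:
        if not line.strip():
            continue
        label, score = line.split()[:2]
        labels.append(int(label))
        scores.append(float(score))

print(f"EER: {compute_eer(np.array(labels), np.array(scores)) * 100:.2f}%")
```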