huazhi1024
Thanks very much.
I trained the model from scratch on the clean VCTK dataset using the code you provided. However, I noticed that the synthesized speech sounds noticeably worse than that obtained...
Thank you very much for your response. In the table below, the left side shows the results obtained with the pre-trained encoder and decoder models you provided, while...
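For a comparison like the table above, it can help to back up listening impressions with objective scores. Below is a minimal sketch that compares the two decoded outputs against the ground truth with PESQ and STOI; it is not part of the original thread, the file names are hypothetical placeholders, and it assumes the `pesq`, `pystoi`, `soundfile`, and `librosa` packages are installed (PESQ wideband requires 16 kHz audio, so everything is resampled first).

```python
# Hedged sketch: objective comparison of pretrained vs. from-scratch decoding.
import soundfile as sf
import librosa
from pesq import pesq
from pystoi import stoi

def load_16k(path):
    """Load a waveform, fold to mono, and resample to 16 kHz for PESQ/STOI."""
    wav, sr = sf.read(path)
    if wav.ndim > 1:                       # average channels if stereo
        wav = wav.mean(axis=1)
    return librosa.resample(wav, orig_sr=sr, target_sr=16000)

ref = load_16k("p225_001_ground_truth.wav")        # hypothetical paths
for tag in ("pretrained", "from_scratch"):
    deg = load_16k(f"p225_001_{tag}.wav")
    n = min(len(ref), len(deg))                    # align lengths before scoring
    print(tag,
          "PESQ:", pesq(16000, ref[:n], deg[:n], "wb"),
          "STOI:", stoi(ref[:n], deg[:n], 16000))
```

If the from-scratch model scores clearly lower on both metrics, the gap is in the model or training setup rather than in the listening conditions.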
> @zhanghuiyu123 Hi! May I ask roughly how long it took you to train 700k iterations for one system (e.g. AudioDec_v1)?

This is the time I spent training the different models...
> Hi @zhanghuiyu123, I would recommend that you mention you "reimplement AudioDec based on the open-source repo" in your paper to avoid any concerns from the reviewers, although I think...
Yes, I will finish it by Monday. However, I am currently encountering some issues with uploading the model to GitHub.
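The upload trouble mentioned above is commonly GitHub's 100 MB per-file limit for ordinary commits; one standard workaround is attaching the checkpoint to a release, where assets may be up to 2 GB. Below is a minimal sketch using the GitHub REST API release-asset endpoint; the repo name, release ID, token, and checkpoint file name are all hypothetical placeholders, not details from this thread.

```python
# Hedged sketch: upload a large model checkpoint as a GitHub release asset.
import requests

OWNER, REPO = "huazhi1024", "audiodec-release"   # hypothetical repo
RELEASE_ID = 123456789                           # id returned when the release was created
TOKEN = "ghp_..."                                # a personal access token with repo scope

with open("checkpoint-700000steps.pkl", "rb") as f:   # hypothetical file
    payload = f.read()

resp = requests.post(
    f"https://uploads.github.com/repos/{OWNER}/{REPO}"
    f"/releases/{RELEASE_ID}/assets",
    params={"name": "checkpoint-700000steps.pkl"},
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/octet-stream",
    },
    data=payload,
)
resp.raise_for_status()
print(resp.json()["browser_download_url"])       # shareable download link
```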
Hello, I have completed the model release. The download link and usage instructions have been sent to you via email. If you have any questions, please feel free to contact me...
# Update
# 16 kHz, 2 kbps codec model
## (1) Downstream results: Codec SUPERB application evaluation
Stage 1: Run speech emotion recognition. Acc: 75.97%
Stage 2: Run speaker-related evaluation. Parsing the...
# 44.1 kHz, 7 kbps codec model
## (1) Downstream results: Codec SUPERB application evaluation
Stage 1: Run speech emotion recognition. Acc: 75.49%
Stage 2: Run speaker-related evaluation. Parsing the resyn_trial.txt for...
# 48 kHz, 7.5 kbps codec model
## (1) Downstream results: Codec SUPERB application evaluation
Stage 1: Run speech emotion recognition. Acc: 75.28%
Stage 2: Run speaker-related evaluation. Parsing the resyn_trial.txt for...
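Each update above cuts off while parsing resyn_trial.txt for the speaker-related stage, which in such evaluations typically ends with an equal error rate (EER). As background, here is a minimal sketch of how an EER is usually computed from a scored trial list; the two-column "label score" layout assumed here is hypothetical, since the actual resyn_trial.txt format is not shown in the thread.

```python
# Hedged sketch: EER from a scored verification trial list.
import numpy as np
from sklearn.metrics import roc_curve

def compute_eer(labels, scores):
    """Equal error rate: the point where false-accept rate == false-reject rate."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))   # threshold closest to FAR == FRR
    return (fpr[idx] + fnr[idx]) / 2

labels, scores = [], []
with open("resyn_trial.txt") as f:          # assumed: "<0/1 label> <score>" per line
    for line in f:
        if not line.strip():
            continue
        label, score = line.split()[:2]
        labels.append(int(label))
        scores.append(float(score))

print(f"EER: {compute_eer(np.array(labels), np.array(scores)) * 100:.2f}%")
```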