Codec-SUPERB Results of LaDiffCodec (1.5kbps)

Scores updated:

Acc_ground_truth: 93.85%
Acc_resync_audio: 16.10%
Cos_similarity: 36.48%
ACC: 16.10%

Log results

File Name: crema_d.log Codec SUPERB objective metric evaluation on crema_d

Stage 1: Run SDR evaluation. SDR: mean score is: -0.6618466287421877

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 4.7431045

Stage 3: Run STOI. stoi:

Stage 4: Run PESQ. pesq: mean score is: 1.1791039681434632

File Name: esc50.log Codec SUPERB objective metric evaluation on esc50

Stage 1: Run SDR evaluation. SDR: mean score is: -7.735703443297681

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 3.6174948

File Name: fluent_speech_commands.log Codec SUPERB objective metric evaluation on fluent_speech_commands

Stage 1: Run SDR evaluation. SDR: mean score is: 4.330545305329152

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 2.7490408

Stage 3: Run STOI. stoi: mean score is: 0.7800622448245815

Stage 4: Run PESQ. pesq: mean score is: 1.6228661406040192

File Name: fsd50k.log Codec SUPERB objective metric evaluation on fsd50k

Stage 1: Run SDR evaluation. SDR: mean score is: -5.688258628657724

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 4.0113335

File Name: gunshot_triangulation.log Codec SUPERB objective metric evaluation on gunshot_triangulation

Stage 1: Run SDR evaluation. SDR: mean score is: -2.769766115983086

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 2.239529

File Name: libri2Mix_test.log Codec SUPERB objective metric evaluation on libri2Mix_test

Stage 1: Run SDR evaluation. SDR: mean score is: 1.2123890992883006

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 1.7746849

Stage 3: Run STOI. stoi: mean score is: 0.7529617185269315

Stage 4: Run PESQ. pesq: mean score is: 1.3319110035896302

File Name: librispeech.log Codec SUPERB objective metric evaluation on librispeech

Stage 1: Run SDR evaluation. SDR: mean score is: 4.48363052891714

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 2.1447082

Stage 3: Run STOI. stoi: mean score is: 0.8117344206829971

Stage 4: Run PESQ. pesq: mean score is: 1.7257570731639862

File Name: quesst.log Codec SUPERB objective metric evaluation on quesst

Stage 1: Run SDR evaluation. SDR: mean score is: 3.0613881509402994

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 3.3179162

Stage 3: Run STOI. stoi: mean score is: 0.7105730301462775

Stage 4: Run PESQ. pesq: mean score is: 1.4366185867786407

File Name: snips_test_valid_subset.log Codec SUPERB objective metric evaluation on snips_test_valid_subset

Stage 1: Run SDR evaluation. SDR: mean score is: 6.483090668408405

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 1.9094324

Stage 3: Run STOI. stoi: mean score is: 0.8549385395393462

Stage 4: Run PESQ. pesq: mean score is: 1.8450518810749055

File Name: vox_lingua_top10.log Codec SUPERB objective metric evaluation on vox_lingua_top10

Stage 1: Run SDR evaluation. SDR: mean score is: 2.299034565789743

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 3.4319177

Stage 3: Run STOI. stoi:

Stage 4: Run PESQ. pesq: mean score is: 1.3151621878147126

File Name: voxceleb1.log Codec SUPERB objective metric evaluation on voxceleb1

Stage 1: Run SDR evaluation. SDR: mean score is: 2.138264888912873

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 1.984266

Stage 3: Run STOI. stoi: mean score is: 0.7347235382930105

Stage 4: Run PESQ. pesq: mean score is: 1.54294668674469
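Because every stage line in the logs above follows the same pattern, stages that printed no score can be found mechanically. The following is a minimal sketch, assuming the logs look exactly as pasted in this thread; `parse_stages` and `STAGE_RE` are hypothetical helpers written for illustration and are not part of the Codec-SUPERB tooling.

```python
import re

# Matches e.g. "Stage 4: Run PESQ. pesq: mean score is: 1.17".
# The "mean score is" part is optional so that empty stages still match.
STAGE_RE = re.compile(
    r"Stage \d+: Run [^.]+\. (\w+):(?: mean score is: (-?\d+(?:\.\d+)?))?"
)


def parse_stages(text):
    """Return {metric_name: float or None} for each stage line in `text`."""
    scores = {}
    for line in text.splitlines():
        m = STAGE_RE.match(line.strip())
        if m:
            name, value = m.group(1), m.group(2)
            scores[name.lower()] = float(value) if value is not None else None
    return scores


# Stage lines copied from the crema_d log in this thread.
crema_d = """\
Stage 1: Run SDR evaluation. SDR: mean score is: -0.6618466287421877
Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 4.7431045
Stage 3: Run STOI. stoi:
Stage 4: Run PESQ. pesq: mean score is: 1.1791039681434632
"""

scores = parse_stages(crema_d)
missing = [name for name, value in scores.items() if value is None]
print("missing metrics:", missing)
```

Running the same check over each log would list the stages that produced no number, which here are the STOI stages of crema_d and vox_lingua_top10.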

Average SDR for speech datasets: 3.618218106966028
Average Mel_Loss for speech datasets: 2.313341416666667
Average STOI for speech datasets: 0.7741655820021908
Average PESQ for speech datasets: 1.5841918953259786
Average SDR for audio datasets: -5.397909395979497
Average Mel_Loss for audio datasets: 3.289452433333333
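The averages reported above can be reproduced from the per-dataset scores. The sketch below is an independent recomputation, not the benchmark's own script. It assumes the speech averages cover only the six speech datasets that report all four metrics, which is what the numbers are consistent with; crema_d and vox_lingua_top10, whose STOI scores are missing, appear to be left out. The helper `column_means` is hypothetical.

```python
from statistics import mean

# (SDR, mel_loss, STOI, PESQ) copied from the logs in this thread.
# Assumption: only datasets with all four metrics enter the speech averages.
speech = {
    "fluent_speech_commands": (4.330545305329152, 2.7490408, 0.7800622448245815, 1.6228661406040192),
    "libri2Mix_test": (1.2123890992883006, 1.7746849, 0.7529617185269315, 1.3319110035896302),
    "librispeech": (4.48363052891714, 2.1447082, 0.8117344206829971, 1.7257570731639862),
    "quesst": (3.0613881509402994, 3.3179162, 0.7105730301462775, 1.4366185867786407),
    "snips_test_valid_subset": (6.483090668408405, 1.9094324, 0.8549385395393462, 1.8450518810749055),
    "voxceleb1": (2.138264888912873, 1.984266, 0.7347235382930105, 1.54294668674469),
}

# (SDR, mel_loss) for the audio datasets; STOI and PESQ are not run on these.
audio = {
    "esc50": (-7.735703443297681, 3.6174948),
    "fsd50k": (-5.688258628657724, 4.0113335),
    "gunshot_triangulation": (-2.769766115983086, 2.239529),
}


def column_means(table):
    """Average each metric column across all datasets in `table`."""
    return [mean(column) for column in zip(*table.values())]


speech_sdr, speech_mel, speech_stoi, speech_pesq = column_means(speech)
audio_sdr, audio_mel = column_means(audio)

print("speech:", speech_sdr, speech_mel, speech_stoi, speech_pesq)
print("audio:", audio_sdr, audio_mel)
```

Averaging over all eight speech datasets instead would give a lower mean SDR of roughly 2.92, so the reported 3.618 only matches the six-dataset subset.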

Jun 20 '24 14:06 haiciyang

Thank you very much for submitting the results. Here are two reminders:

1. Some numbers are missing from your submitted GitHub issue. Could you please run git pull first and then rerun the evaluation? Please also follow the instructions at https://github.com/voidful/Codec-SUPERB/tree/SLT_Challenge; once the synthesised audio is prepared, two commands produce all the results.
2. Could you also refer to section 4.2 of the rules (https://codecsuperb.github.io/Codec-SUPERB-rule.pdf) and let us know how to run inference with your model? We will use your model to test on the hidden set.

Jun 20 '24 15:06 hbwu-ntu

Sorry for leaving this information out. To run inference with the model, please refer to https://github.com/haiciyang/LaDiffCodec (master branch).

Jun 21 '24 09:06 haiciyang