Inter-SubNet training dataset

I'm trying to retrain your Inter-SubNet model, and I have some questions about the training datasets.

I saw that your code uses the Interspeech 2020 datasets, but your paper mentions a subset of the Interspeech 2021 datasets. Which is correct?
If you used the subset of the Interspeech 2021 datasets, which data did you use: fullband or wideband? Did you use only the clean read_speech, or also the emotional speech and non-English speech? Thanks.

Aug 22 '23 09:08 yenchoupan

Our training dataset mainly consists of the wideband (16 kHz) data from the DNS Challenge at Interspeech 2021. The clean dataset used is "readbook", and the noise dataset includes the complete set of noises. Our test dataset is the referenced test set released for the DNS Challenge at Interspeech 2020.

Aug 23 '23 06:08 RookieJunChen

Did you use drop band to reach the performance reported in your paper?

Sep 12 '23 10:09 yenchoupan

The configs for my final reported results are listed in config. If your results differ slightly, I think it is most likely due to randomness in the dynamic mixing strategy. The type of GPU also has some effect on the results.
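To illustrate why dynamic mixing introduces run-to-run variation, here is a minimal sketch of the general technique. It is not the code from this repository: the function names `mix_at_snr` and `dynamic_mix` and the SNR range are assumptions made for the example. Each call draws a random noise clip, a random crop, and a random SNR, so the noisy inputs differ unless the random generator is seeded identically.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db, eps=1e-8):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Target: clean_power / (scale**2 * noise_power) == 10 ** (snr_db / 10)
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10) + eps))
    return clean + scale * noise

def dynamic_mix(clean, noises, rng, snr_range=(-5.0, 20.0)):
    """Build one noisy sample on the fly (hypothetical helper).

    `snr_range` is an assumed value, not taken from the published config.
    Every noise clip must be at least as long as `clean`.
    """
    noise = noises[rng.integers(len(noises))]               # random noise clip
    start = rng.integers(0, len(noise) - len(clean) + 1)    # random crop offset
    segment = noise[start:start + len(clean)]
    snr_db = rng.uniform(*snr_range)                        # random SNR in dB
    return mix_at_snr(clean, segment, snr_db), snr_db

if __name__ == "__main__":
    data_rng = np.random.default_rng(0)
    clean = data_rng.standard_normal(16000)                 # 1 s at 16 kHz (synthetic)
    noises = [data_rng.standard_normal(32000) for _ in range(3)]

    # Same seed gives the same mixture; a different seed gives a different one.
    a, _ = dynamic_mix(clean, noises, np.random.default_rng(42))
    b, _ = dynamic_mix(clean, noises, np.random.default_rng(42))
    c, _ = dynamic_mix(clean, noises, np.random.default_rng(43))
    print(np.array_equal(a, b), np.array_equal(a, c))
```

Fixing the seed makes the mixtures themselves repeatable, but it does not remove GPU-dependent numerical differences, so small gaps between runs can still remain.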

Sep 12 '23 10:09 RookieJunChen