DeepSpeech Create cocktail party data set

Create a cocktail party data set where the background noise for training comes only from the training data sets, the background noise for validation comes only from the validation data sets, and the background noise for test comes only from the test data sets.

Sep 12 '18 12:09 kdavis-mozilla

Howdy, I was just linked to this: https://voices18.github.io/ Could it be used to help in this regard?

Sep 19 '18 22:09 SalvorinFex

@SalvorinFex Thanks!

Currently we're planning on using the CC0 audio from freesound, along with many random samples of the training set at a lower volume, to create a cocktail party data set. But the VOiCES Corpus might also be a nice addition.
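
For illustration only, here is a rough sketch of the idea of overlaying other clips from the same split at a lower volume. This is not DeepSpeech's actual implementation: the function names, the `noise_gain` parameter, and the representation of clips as lists of floats in [-1.0, 1.0] are all assumptions made for this example.

```python
import random


def mix_at_lower_volume(speech, noise, noise_gain=0.3):
    """Overlay `noise` on `speech`, scaling the noise by `noise_gain`.

    Both inputs are lists of float samples in [-1.0, 1.0] (an assumption
    for this sketch). The noise is looped if it is shorter than the speech,
    and the result is clipped back into [-1.0, 1.0].
    """
    mixed = []
    for i, s in enumerate(speech):
        n = noise[i % len(noise)] if noise else 0.0
        value = s + noise_gain * n
        mixed.append(max(-1.0, min(1.0, value)))
    return mixed


def make_cocktail_party_split(clips, n_background=2, noise_gain=0.3, seed=0):
    """Build noisy versions of `clips` using only other clips from the same list.

    Calling this once per split (train, validation, test) keeps the
    background audio of each split disjoint from the other splits.
    """
    rng = random.Random(seed)
    result = []
    for idx, clip in enumerate(clips):
        # Candidate backgrounds: every other clip in this split only.
        others = [c for j, c in enumerate(clips) if j != idx]
        picks = rng.sample(others, min(n_background, len(others)))
        mixed = clip
        for background in picks:
            mixed = mix_at_lower_volume(mixed, background, noise_gain)
        result.append(mixed)
    return result
```

Running `make_cocktail_party_split` separately on the training, validation, and test clips means no background audio crosses split boundaries. CC0 noise files would need the same per-split partitioning before being mixed in.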

Sep 20 '18 07:09 kdavis-mozilla

Another Mozilla-associated noise reduction project created a large noise dataset that might be useful to you. It’s downloadable at the bottom of this page: https://people.xiph.org/~jm/demo/rnnoise/

May 02 '19 07:05 zaptrem

Are there plans to run reverb filters on the datasets as well? STT might struggle in this area.

Jul 20 '20 20:07 zaptrem

https://deepspeech.readthedocs.io/en/v0.7.4/TRAINING.html#augmentation

Jul 21 '20 11:07 tilmankamp

@tilmankamp Thanks! Are the pretrained models trained with reverb and these other augmentations enabled? Also, is the reverb added before or after the audio is mixed with the noise samples?

Jul 25 '20 06:07 zaptrem

Models that use such augmentations are currently being trained.

Jul 27 '20 08:07 kdavis-mozilla

@kdavis-mozilla Are the just-released 0.8 models trained with these augmentations? Or are those coming in 0.9/1.0?

Jul 30 '20 19:07 zaptrem

@zaptrem They are coming later.

Jul 31 '20 08:07 kdavis-mozilla

@kdavis-mozilla Would this noise dataset be useful for this? https://iqtlabs.github.io/voices/

Aug 26 '20 03:08 zaptrem