Titanet-L Augmentation
Is your feature request related to a problem? Please describe.
I want to train TitaNet with augmentation. However, I do not know how to prepare the rir_noise_manifest.json file for training TitaNet-L with RIR noise augmentation.
Describe the solution you'd like
Provide a code snippet on how to add augmentation for TitaNet-L training.
I added the following to the augmentor, using this snippet from the online augmentation tutorial:
rir_data_path = f'{data_dir}/dataset'
!python {NEMO_ROOT}/scripts/dataset_processing/get_openslr_rir_data.py --data_root {rir_data_path}
rir_manifest_path = os.path.join(rir_data_path, 'processed', 'rir.json')
!head -n 3 {rir_manifest_path}
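For anyone who needs to build a similar manifest by hand, here is a minimal sketch. It assumes the usual NeMo convention of one JSON object per line with `audio_filepath`, `duration`, `offset`, and `text` fields; the exact fields written by get_openslr_rir_data.py may differ, so compare against the `head` output above. The helper name `write_manifest`, the `"-"` placeholder text, and the demo file names are made up for illustration. The standard-library `wave` module only reads PCM WAV files.

```python
import json
import tempfile
import wave
from pathlib import Path


def write_manifest(wav_dir, manifest_path):
    """Write one JSON object per line for every .wav file under wav_dir."""
    entries = []
    for wav_file in sorted(Path(wav_dir).rglob("*.wav")):
        # Read the duration from the WAV header (PCM files only).
        with wave.open(str(wav_file), "rb") as w:
            duration = w.getnframes() / w.getframerate()
        entries.append({
            "audio_filepath": str(wav_file.resolve()),
            "duration": round(duration, 3),
            "offset": 0,
            "text": "-",  # assumed placeholder; noise/RIR clips have no transcript
        })
    with open(manifest_path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry) + "\n")
    return entries


# Demo: create a one-second silent 16 kHz mono file and build a manifest for it.
demo_dir = Path(tempfile.mkdtemp())
with wave.open(str(demo_dir / "example_rir.wav"), "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)       # 16-bit samples
    w.setframerate(16000)
    w.writeframes(b"\x00\x00" * 16000)

demo_manifest = demo_dir / "rir_manifest.json"
demo_entries = write_manifest(demo_dir, demo_manifest)
```

The resulting path can then be passed as `manifest_path` in the augmentor config.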
Then, to use the augmentation, I applied the following:
audio_augmentations = dict(
    speed=dict(
        sr=16000,
        prob=0.3,
        resample_type='kaiser_fast',
        min_speed_rate=0.95,
        max_speed_rate=1.05,
    ),
    noise=dict(
        manifest_path=rir_manifest_path,
        prob=0.5,
        min_snr_db=0,
        max_snr_db=15,
    ),
)
finetune_config.model.train_ds.augmentor = audio_augmentations
Is this correct? Thanks, @okuchaiev.
Yes, the code looks fine to me. But for impulse responses you should use impulse perturbation, not noise perturbation. A sample can be found here: https://github.com/NVIDIA/NeMo/blob/6442bb67275759f5ece6bd5e366966216e050cfe/examples/speaker_tasks/recognition/conf/titanet-small.yaml#L14
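As a rough sketch of that suggestion, the augmentor dict could route the RIR manifest through an `impulse` entry and keep `noise` for a separate noise manifest. This assumes the `impulse` key accepts `manifest_path` and `prob`, as in the linked titanet-small config; both paths below are hypothetical placeholders, and the probabilities are illustrative rather than recommended values.

```python
# Hypothetical paths; replace with the manifests you actually generated.
rir_manifest_path = "/path/to/processed/rir.json"
noise_manifest_path = "/path/to/noise_manifest.json"

audio_augmentations = dict(
    speed=dict(
        sr=16000,
        prob=0.3,
        resample_type="kaiser_fast",
        min_speed_rate=0.95,
        max_speed_rate=1.05,
    ),
    # Impulse perturbation: convolves the audio with a room impulse response,
    # so it takes no SNR range.
    impulse=dict(
        manifest_path=rir_manifest_path,
        prob=0.5,
    ),
    # Noise perturbation: adds a noise clip at an SNR drawn from this range.
    noise=dict(
        manifest_path=noise_manifest_path,
        prob=0.5,
        min_snr_db=0,
        max_snr_db=15,
    ),
)
```

The dict is assigned the same way as in the original post: `finetune_config.model.train_ds.augmentor = audio_augmentations`.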
@nithinraok that's what I thought. However, the TitaNet-Large config uses noise instead of impulse, and it says impulse perturbation is being used. Does that mean the RIR corpora were mistakenly used for noise perturbation instead of impulse perturbation during training? https://github.com/NVIDIA/NeMo/blob/6442bb67275759f5ece6bd5e366966216e050cfe/examples/speaker_tasks/recognition/conf/titanet-large.yaml#L14-L26
The paper statement:
(just realized you are the first author x.x) Thank you @nithinraok
I don't remember the details exactly, but as far as I remember, the RIR corpora also contain noise samples along with impulse responses. I did not add an impulse section to this config file, but one was added to the titanet-small config.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been inactive for 7 days since being marked as stale.