
Contrastive Loss: Runtime error for certain audio files when reshaping out_masked_only

piraka9011 opened this issue 3 years ago · 1 comment

Describe the bug

After a few steps of pretraining a SpeechEncDecSelfSupervisedModel, training fails with the following error:

File "/opt/conda/lib/python3.8/site-packages/nemo/collections/asr/models/ssl_models.py", line 468, in training_step
  loss_value, loss_val_dict = self.decoder_loss_step(
File "/opt/conda/lib/python3.8/site-packages/nemo/collections/asr/models/ssl_models.py", line 450, in decoder_loss_step
  current_loss_value = current_loss(
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1129, in _call_impl
  return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/nemo/core/classes/common.py", line 963, in __call__
  outputs = wrapped(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/nemo/collections/asr/losses/ssl_losses/contrastive.py", line 187, in forward
  out_masked_only = out_masked_only.reshape(bs, -1, out_masked_only.shape[-1])
RuntimeError: shape '[16, -1, 128]' is invalid for input of size 547712
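
For what it's worth, the numbers in the error message line up with the reshape assuming that every utterance in the batch contributes the same number of masked frames: 547712 elements / 128 features = 4279 masked frames in total, and 4279 is not divisible by the batch size of 16. A minimal sketch of the failing arithmetic (the variable names are mine, not NeMo's):

  import torch

  # Numbers taken from the error message above.
  bs = 16          # batch size
  feat = 128       # out_masked_only.shape[-1]
  numel = 547712   # total elements in out_masked_only

  n_masked = numel // feat   # 4279 masked frames across the whole batch
  print(n_masked % bs)       # 7 -> not evenly divisible by the batch size

  # Minimal reproduction of the failing call:
  out_masked_only = torch.zeros(n_masked, feat)
  out_masked_only.reshape(bs, -1, feat)  # RuntimeError: shape '[16, -1, 128]' is invalid for input of size 547712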

Steps/Code to reproduce bug

I used the speech_pre_training.py script with the default Conformer configuration here.

The only change I made was setting train_ds.max_duration: 25.0. I also tried the config from the Self_Supervised_Pre_Training.ipynb notebook and got the same error.
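
For reference, a sketch of the equivalent single-override invocation (Hydra override syntax; the model. prefix is my assumption about where train_ds sits in the config tree):

  python speech_pre_training.py model.train_ds.max_duration=25.0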

Expected behavior

Training should proceed normally; failing that, the error should give a more detailed explanation of why the reshape failed and which parameters to change.

Environment overview (please complete the following information)

  • Environment location: Docker (nvcr.io/nvidia/pytorch:22.05-py3)
  • Method of NeMo install: pip install 'nemo_toolkit[all]==1.10.0'
  • Docker run command: docker run --rm -it --gpus all --ipc=host --env-file .env train

piraka9011 · Jul 23 '22 01:07

Btw, the current workaround is to set loss_list.contrastive.loss.sample_from_same_utterance_only=False, but ideally this would work with it set to true.
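
If you prefer applying the workaround by editing the config rather than via a command-line override, a minimal sketch with OmegaConf (the YAML path and the model. prefix are assumptions based on the NeMo examples layout):

  from omegaconf import OmegaConf

  # Path assumed -- point this at whichever SSL config you are actually using.
  cfg = OmegaConf.load("examples/asr/conf/ssl/conformer/conformer_ssl.yaml")

  # Workaround: disable same-utterance-only negative sampling.
  cfg.model.loss_list.contrastive.loss.sample_from_same_utterance_only = False

  # Equivalent Hydra override (again assuming the model. prefix):
  #   model.loss_list.contrastive.loss.sample_from_same_utterance_only=false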

piraka9011 · Jul 23 '22 02:07

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] · Sep 28 '22 02:09

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] · Oct 06 '22 02:10