
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.

Open youssefabdelm opened this issue 3 years ago • 2 comments

I'm very confused by this error. DeepFilterNet2 runs fine on some WAV files, but for others (32-bit WAV, 2 channels, 44.1 kHz) I get this error:

2022-08-05 17:00:42 | INFO     | DF | Running on torch 1.12.1+cu116
2022-08-05 17:00:42 | INFO     | DF | Running on host 3d0695211f12
fatal: not a git repository (or any of the parent directories): .git
2022-08-05 17:00:42 | INFO     | DF | Loading model settings of DeepFilterNet2
2022-08-05 17:00:42 | INFO     | DF | Using DeepFilterNet2 model at ../opt/conda/lib/python3.7/site-packages/pretrained_models/DeepFilterNet2
2022-08-05 17:00:42 | INFO     | DF | Initializing model `deepfilternet2`
2022-08-05 17:00:46 | INFO     | DF | Found checkpoint ../opt/conda/lib/python3.7/site-packages/pretrained_models/DeepFilterNet2/checkpoints/model_96.ckpt.best with epoch 96
2022-08-05 17:00:46 | WARNING  | DF | Unexpected key: erb_comp.c
2022-08-05 17:00:46 | WARNING  | DF | Unexpected key: erb_comp.mn
2022-08-05 17:00:46 | INFO     | DF | Model loaded
2022-08-05 17:00:50 | WARNING  | DF | Audio sampling rate does not match model sampling rate (44100, 48000). Resampling...
Traceback (most recent call last):
  File "/opt/conda/bin/deepFilter", line 8, in <module>
    sys.exit(run())
  File "/opt/conda/lib/python3.7/site-packages/df/enhance.py", line 329, in run
    main(parser.parse_args())
  File "/opt/conda/lib/python3.7/site-packages/df/enhance.py", line 44, in main
    model, df_state, audio, pad=args.compensate_delay, atten_lim_db=args.atten_lim
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/df/enhance.py", line 243, in enhance
    enhanced = model(spec, erb_feat, spec_feat)[0].cpu()
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/df/deepfilternet2.py", line 432, in forward
    e0, e1, e2, e3, emb, c0, lsnr = self.enc(feat_erb, feat_spec)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/df/deepfilternet2.py", line 173, in forward
    c0 = self.df_conv0(feat_spec)  # [B, C, T, Fc]
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 179, in forward
    self.eps,
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 2439, in batch_norm
    input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.

I've tried different versions of PyTorch: torch 1.12.1+cu116, a newer one, and a cu113 build.

Any hints as to what this might be related to would be super helpful! It's confusing that files with seemingly the exact same format (32-bit WAV, 44.1 kHz) give different results: one works, the other raises this error.

I'm not even sure what to test / change. Do I convert all files beforehand to some different format? Do I modify the data in some way?

I don't know what it's referring to with 'contiguous input'

youssefabdelm avatar Aug 05 '22 17:08 youssefabdelm

Not sure why the input is not contiguous. It would help me to debug if you could send me a sample to reproduce the error.

A possible workaround could be explicitly forcing contiguous input:

diff --git a/DeepFilterNet/df/deepfilternet2.py b/DeepFilterNet/df/deepfilternet2.py
index 0e62492..c9ade40 100644
--- a/DeepFilterNet/df/deepfilternet2.py
+++ b/DeepFilterNet/df/deepfilternet2.py
@@ -428,7 +428,7 @@ class DfNet(nn.Module):
         feat_spec = feat_spec.squeeze(1).permute(0, 3, 1, 2)
 
         feat_erb = self.pad_feat(feat_erb)
-        feat_spec = self.pad_feat(feat_spec)
+        feat_spec = self.pad_feat(feat_spec).contiguous()
         e0, e1, e2, e3, emb, c0, lsnr = self.enc(feat_erb, feat_spec)
         m = self.erb_dec(emb, e3, e2, e1, e0)
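
For anyone unsure what "contiguous" means here, the following is a minimal pure-Python sketch of the idea, not PyTorch's actual implementation. It models a tensor as a shape plus strides and shows that reordering dimensions, as `permute(0, 3, 1, 2)` does in the diff above, leaves the data in place but changes the strides, so the result no longer matches a dense row-major layout. The helper names and the tensor sizes are made up for illustration. In PyTorch itself, `tensor.is_contiguous()` reports this property and `tensor.contiguous()` copies the data into a dense layout.

```python
def row_major_strides(shape):
    """Strides (in elements) of a densely packed, row-major tensor."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def permute(shape, strides, order):
    """Reorder dimensions without moving any data (hypothetical helper)."""
    return tuple(shape[i] for i in order), tuple(strides[i] for i in order)


def is_contiguous(shape, strides):
    """True if the strides match the row-major layout for this shape."""
    expected = row_major_strides(shape)
    # Dimensions of size 1 can have any stride without affecting the layout.
    return all(s == e for n, s, e in zip(shape, strides, expected) if n != 1)


# Hypothetical feature tensor laid out as [B, T, F, C]; the sizes are made up.
shape = (1, 100, 96, 2)
strides = row_major_strides(shape)

# Same dimension reordering as the permute(0, 3, 1, 2) call in the diff.
p_shape, p_strides = permute(shape, strides, (0, 3, 1, 2))

print(is_contiguous(shape, strides))      # True
print(is_contiguous(p_shape, p_strides))  # False
```

Whether the padded `feat_spec` is actually non-contiguous in the failing run is not confirmed in this thread; the sketch only illustrates what the error message is referring to.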

Rikorose avatar Aug 08 '22 07:08 Rikorose

Thank you so much! I tried this, but an error persisted (I believe it was the same one, though I can't recall for certain; it may have been a different error after that change). In the end, I found that the likely cause was that I was feeding the model a longer audio file than it could handle (as discussed in the previous issue here: https://github.com/Rikorose/DeepFilterNet/issues/121), but in stereo. If a file is in stereo and longer than approximately 30 minutes, it gives this error. Slightly shorter files (around 25 minutes, say) worked. I ended up splitting stereo audio files at this length.

What confused me was why it gave this error rather than the other 32-bit error or "CUDA out of memory".
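
A rough sketch of the splitting step, in case it helps someone else: the function below only computes the sample ranges for each chunk, and the actual reading and writing of audio is left to whatever library you use. The name `chunk_ranges` and the 25-minute default are my own assumptions based on what happened to work here; the real limit presumably depends on GPU memory and channel count.

```python
def chunk_ranges(num_samples, sample_rate, max_minutes=25.0):
    """Return (start, end) sample index pairs that cover [0, num_samples).

    Each chunk is at most `max_minutes` long. The 25-minute default is an
    assumption taken from this thread, not a documented model limit.
    """
    if num_samples < 0 or sample_rate <= 0 or max_minutes <= 0:
        raise ValueError("num_samples must be >= 0; rate and length must be > 0")
    chunk_len = max(1, int(max_minutes * 60 * sample_rate))
    return [
        (start, min(start + chunk_len, num_samples))
        for start in range(0, num_samples, chunk_len)
    ]


# Example: a 60-minute file at 44.1 kHz (samples counted per channel).
ranges = chunk_ranges(60 * 60 * 44100, 44100)
for start, end in ranges:
    print(start, end)
```

Each range can then be sliced out of the loaded audio (keeping both channels), enhanced separately, and the outputs concatenated in order. Cutting at hard boundaries may leave small differences at the seams, since each chunk is processed without the context of the previous one.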

youssefabdelm avatar Aug 16 '22 19:08 youssefabdelm

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Nov 15 '22 03:11 github-actions[bot]