[ERROR] Get target tone color error cuFFT error: CUFFT_INTERNAL_ERROR

Open Zovya opened this issue 2 years ago • 9 comments

Installed locally on Manjaro Linux with NVIDIA drivers; no problems during installation.

I get this error in the info box every time: [ERROR] Get target tone color error cuFFT error: CUFFT_INTERNAL_ERROR

Zovya avatar Jan 03 '24 22:01 Zovya

This repo was developed under Ubuntu 20.04.

Zengyi-Qin avatar Jan 04 '24 02:01 Zengyi-Qin

I'm encountering the same issue on Ubuntu 20.04 running on WSL.

Ubuntu 20.04.6 LTS (GNU/Linux 5.15.133.1-microsoft-standard-WSL2 x86_64)

ahandleman avatar Jan 04 '24 06:01 ahandleman

It appears to be an issue with Torch 1.13.1 due to its dependency on CUDA 11.7. According to the PyTorch bug thread, this error does not occur when running later versions of Torch.

I have my doubts as to whether it'll be as simple as updating Torch to latest, but I'll try it out and report back.
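
To see whether a given environment matches the combination described above, something like the following sketch may help. It assumes only that `torch.__version__` and `torch.version.cuda` report the installed build; the helper names `matches_reported_combo` and `describe_torch_build` are made up for illustration, and the check encodes nothing beyond the Torch 1.13.1 / CUDA 11.7 pairing mentioned in this thread.

```python
def matches_reported_combo(torch_version, cuda_version):
    """True only for the Torch 1.13.1 + CUDA 11.7 pairing mentioned in this thread."""
    if torch_version is None or cuda_version is None:
        return False
    base = torch_version.split("+")[0]  # drop local build tags such as "+cu117"
    return base == "1.13.1" and cuda_version.startswith("11.7")


def describe_torch_build():
    """Collect the installed Torch version, its CUDA build, and the GPU name."""
    try:
        import torch
    except ImportError:
        return {"torch": None, "cuda": None, "gpu": None,
                "matches_reported_combo": False}

    cuda = torch.version.cuda  # None on builds without CUDA support
    gpu = torch.cuda.get_device_name(0) if torch.cuda.is_available() else None
    return {
        "torch": torch.__version__,
        "cuda": cuda,
        "gpu": gpu,
        "matches_reported_combo": matches_reported_combo(torch.__version__, cuda),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

A `True` in `matches_reported_combo` only means the environment looks like the one discussed here; it does not prove the error will occur.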

xaroth8088 avatar Jan 04 '24 07:01 xaroth8088

I'm getting this too - I tracked it down to the following in mel_processing.py

spec = torch.stft(
        y,
        n_fft,
        hop_length=hop_size,
        win_length=win_size,
        window=hann_window[wnsize_dtype_device],
        center=center,
        pad_mode="reflect",
        normalized=False,
        onesided=True,
        return_complex=False,
    )

I added a bunch of extra logging to track it down; the output is below. Tired now, going to bed. Best of luck taking this to the next step.

(openvoice) jas@Hope:/mnt/d/repo/AI/audio/OpenVoice$ python openvoice_app.py
Loaded checkpoint 'checkpoints/base_speakers/EN/checkpoint.pth'
missing/unexpected keys: [] []
Loaded checkpoint 'checkpoints/base_speakers/ZH/checkpoint.pth'
missing/unexpected keys: [] []
Loaded checkpoint 'checkpoints/converter/checkpoint.pth'
missing/unexpected keys: [] []
/home/jas/anaconda3/envs/openvoice/lib/python3.9/site-packages/gradio/components/dropdown.py:103: UserWarning: The `max_choices` parameter is ignored when `multiselect` is False.
 warnings.warn(
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Detected language:en
[(0.0, 19.278375)]
after vad: dur = 19.27798185941043
Audio path: processed/demo_speaker0-0-100/wavs
Audio name: demo_speaker0-0-100
Device: cuda
VC model: <api.ToneColorConverter object at 0x7f0ed7dcddf0>
SE path: processed/demo_speaker0-0-100/se.pth
Audio Segments: ['processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg0.wav', 'processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg1.wav']
ref_wav_list: ['processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg0.wav', 'processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg1.wav']
se_save_path: processed/demo_speaker0-0-100/se.pth
device: cuda
hps: {'data': {'sampling_rate': 22050, 'filter_length': 1024, 'hop_length': 256, 'win_length': 1024, 'n_speakers': 0}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256}}
gs: []
proc fname: processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg0.wav
loaded audio: [ 0.          0.          0.         ... -0.00099443  0.01052232
 0.        ] 22050
torch.FloatTensor: tensor([ 0.0000,  0.0000,  0.0000,  ..., -0.0010,  0.0105,  0.0000])
y.to(device): tensor([ 0.0000,  0.0000,  0.0000,  ..., -0.0010,  0.0105,  0.0000],
      device='cuda:0')
y.unsqueeze(0): tensor([[ 0.0000,  0.0000,  0.0000,  ..., -0.0010,  0.0105,  0.0000]],
      device='cuda:0')
wnsize_dtype_device: 1024_torch.float32_cuda:0
wnsize_dtype_device adding
torch.nn.functional.pad
y.squeeze(1)
torch.stft(...)
Exception: cuFFT error: CUFFT_INTERNAL_ERROR
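
A stripped-down version of that call can serve as a standalone check, independent of OpenVoice. This is only a sketch: the function name `stft_smoke_test` is hypothetical, the input is one second of silence rather than real audio, the sizes are taken from the `hps` line in the log, and it uses `return_complex=True` (unlike the original call) so that it runs on both older and newer Torch releases.

```python
def stft_smoke_test(device=None, n_fft=1024, hop_size=256, win_size=1024, sr=22050):
    """Run one torch.stft call and report whether it succeeded (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        return "torch-not-installed"

    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"

    try:
        y = torch.zeros(1, sr, device=device)  # one second of silence
        window = torch.hann_window(win_size, device=device)
        spec = torch.stft(
            y,
            n_fft,
            hop_length=hop_size,
            win_length=win_size,
            window=window,
            center=True,
            pad_mode="reflect",
            normalized=False,
            onesided=True,
            return_complex=True,  # differs from the original call; see note above
        )
    except Exception as exc:  # a cuFFT failure surfaces here as a RuntimeError
        return f"failed on {device}: {exc}"
    return f"ok on {device}: shape {tuple(spec.shape)}"


if __name__ == "__main__":
    print(stft_smoke_test())
```

If this reports a failure on `cuda` but `stft_smoke_test("cpu")` succeeds, that points at the GPU FFT path rather than at the OpenVoice code itself.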

WolfieXIII avatar Jan 04 '24 07:01 WolfieXIII

@WolfieXIII : That mirrors what I found, too. 😞

Re: trying to just upgrade Torch - alas, it appears OpenVoice has a dependency on wavmark, which doesn't seem to have a version compatible with torch>2.0. So, trying to get this to work on newer cards will likely require one of the following:

  1. Wait for wavmark to create a Torch 2.x-compatible version
  2. Replace wavmark with an alternative library (and then upgrade Torch)
  3. Create a custom build of Torch 1.13.1 that depends on CUDA 11.8 or later.
  4. Your ideas here...

xaroth8088 avatar Jan 04 '24 08:01 xaroth8088

Someone has already opened a PR on wavmark to support Torch 2.1 <3

https://github.com/wavmark/wavmark/pull/6

JacopoMangiavacchi avatar Jan 04 '24 21:01 JacopoMangiavacchi

I found that it works after doing the following two things:

  1. Upgrade torch, torchaudio, and torchvision to the latest versions
  2. Uninstall the default wavmark and install this fork instead: https://github.com/violetdenim/wavmark (git clone the project, cd into its directory, then run "pip install -e .")

yctam avatar Jan 05 '24 04:01 yctam

Confirmed!

Here are some simpler instructions to tide everyone over until wavmark officially updates their package:

  1. Install OpenVoice, as per the README.md instructions
  2. pip install -U torch torchvision torchaudio git+https://github.com/violetdenim/wavmark.git
  3. Run OpenVoice, as per the README.md instructions
  4. Enjoy!
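
After step 2, it can be worth confirming which versions actually ended up in the environment. The sketch below uses only the standard library's `importlib.metadata`; the function name `installed_versions` is made up for illustration, and the package names are simply the ones from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "torchvision", "torchaudio", "wavmark")):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it inside the same environment used for OpenVoice (for example the conda env), otherwise it reports on a different set of packages.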

xaroth8088 avatar Jan 05 '24 05:01 xaroth8088

Thanks! This should be included in README.md

xiangdev avatar Jan 19 '24 06:01 xiangdev