OpenVoice [ERROR] Get target tone color error cuFFT error: CUFFT_INTERNAL

Installed locally on Manjaro Linux with NVIDIA drivers; no problems during installation.

I get this error in the info box every time: [ERROR] Get target tone color error cuFFT error: CUFFT_INTERNAL_ERROR

Jan 03 '24 22:01 Zovya

This repo was developed under Ubuntu 20.04.

Jan 04 '24 02:01 Zengyi-Qin

I'm encountering the same issue on Ubuntu 20.04 running on WSL.

Ubuntu 20.04.6 LTS (GNU/Linux 5.15.133.1-microsoft-standard-WSL2 x86_64)

Jan 04 '24 06:01 ahandleman

It appears to be an issue with Torch 1.13.1 due to its dependency on CUDA 11.7. According to the PyTorch bug thread, this error does not occur when running later versions of Torch.

I have my doubts as to whether it'll be as simple as updating Torch to latest, but I'll try it out and report back.
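
For anyone who wants to check which combination they are running, here is a minimal sketch. The helper names (parse_version, cuda_build_is_suspect, report) are made up for illustration, and the 11.8 cutoff is only an assumption based on this thread, not something confirmed by the PyTorch team.

```python
import re


def parse_version(text):
    """Turn '11.7' or '1.13.1+cu117' into a tuple of up to three integers."""
    return tuple(int(n) for n in re.findall(r"\d+", text.split("+")[0])[:3])


def cuda_build_is_suspect(cuda_version, minimum=(11, 8)):
    """True if Torch was built against a CUDA version older than `minimum`.

    The default cutoff is an assumption taken from this discussion.
    """
    if cuda_version is None:  # CPU-only build, so cuFFT is not involved
        return False
    return parse_version(cuda_version) < minimum


def report():
    """Describe the installed Torch build, or say that Torch is missing."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    cuda = torch.version.cuda  # CUDA version Torch was compiled against, or None
    flag = "suspect" if cuda_build_is_suspect(cuda) else "ok"
    return f"torch {torch.__version__}, built for CUDA {cuda}: {flag}"


if __name__ == "__main__":
    print(report())
```

Run it inside the same environment that OpenVoice uses, otherwise it reports on a different Torch install.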

Jan 04 '24 07:01 xaroth8088

I'm getting this too. I tracked it down to the following call in mel_processing.py:

spec = torch.stft(
    y,
    n_fft,
    hop_length=hop_size,
    win_length=win_size,
    window=hann_window[wnsize_dtype_device],
    center=center,
    pad_mode="reflect",
    normalized=False,
    onesided=True,
    return_complex=False,
)

I added a bunch of extra logging to track it down; the output is below. Tired now, going to bed. Good luck to whoever takes this to the next step.

(openvoice) jas@Hope:/mnt/d/repo/AI/audio/OpenVoice$ python openvoice_app.py
Loaded checkpoint 'checkpoints/base_speakers/EN/checkpoint.pth'
missing/unexpected keys: [] []
Loaded checkpoint 'checkpoints/base_speakers/ZH/checkpoint.pth'
missing/unexpected keys: [] []
Loaded checkpoint 'checkpoints/converter/checkpoint.pth'
missing/unexpected keys: [] []
/home/jas/anaconda3/envs/openvoice/lib/python3.9/site-packages/gradio/components/dropdown.py:103: UserWarning: The `max_choices` parameter is ignored when `multiselect` is False.
 warnings.warn(
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Detected language:en
[(0.0, 19.278375)]
after vad: dur = 19.27798185941043
Audio path: processed/demo_speaker0-0-100/wavs
Audio name: demo_speaker0-0-100
Device: cuda
VC model: <api.ToneColorConverter object at 0x7f0ed7dcddf0>
SE path: processed/demo_speaker0-0-100/se.pth
Audio Segments: ['processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg0.wav', 'processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg1.wav']
ref_wav_list: ['processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg0.wav', 'processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg1.wav']
se_save_path: processed/demo_speaker0-0-100/se.pth
device: cuda
hps: {'data': {'sampling_rate': 22050, 'filter_length': 1024, 'hop_length': 256, 'win_length': 1024, 'n_speakers': 0}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [8, 8, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 256}}
gs: []
proc fname: processed/demo_speaker0-0-100/wavs/demo_speaker0-0-100_seg0.wav
loaded audio: [ 0.          0.          0.         ... -0.00099443  0.01052232
 0.        ] 22050
torch.FloatTensor: tensor([ 0.0000,  0.0000,  0.0000,  ..., -0.0010,  0.0105,  0.0000])
y.to(device): tensor([ 0.0000,  0.0000,  0.0000,  ..., -0.0010,  0.0105,  0.0000],
      device='cuda:0')
y.unsqueeze(0): tensor([[ 0.0000,  0.0000,  0.0000,  ..., -0.0010,  0.0105,  0.0000]],
      device='cuda:0')
wnsize_dtype_device: 1024_torch.float32_cuda:0
wnsize_dtype_device adding
torch.nn.functional.pad
y.squeeze(1)
torch.stft(...)
Exception: cuFFT error: CUFFT_INTERNAL_ERROR
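
To reproduce this outside OpenVoice, a small standalone script along these lines should be enough. It is only a sketch: try_stft is a hypothetical helper, the sizes (22050 samples, n_fft 1024, hop 256) are copied from the hps line in the log above, center=False is my assumption since the padding appears to be done separately, and it uses return_complex=True (unlike the original call) so that it also runs on newer Torch versions.

```python
def try_stft(device=None):
    """Run one STFT on random audio and report whether it succeeded."""
    try:
        import torch
    except ImportError:
        return "torch not installed"

    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"

    # One second of noise at 22050 Hz, shaped (batch, samples).
    y = torch.randn(1, 22050, device=device)
    window = torch.hann_window(1024, device=device)

    try:
        spec = torch.stft(
            y,
            1024,
            hop_length=256,
            win_length=1024,
            window=window,
            center=False,
            normalized=False,
            onesided=True,
            return_complex=True,
        )
    except RuntimeError as exc:  # cuFFT failures surface as RuntimeError
        return f"stft failed on {device}: {exc}"
    return f"stft ok on {device}: shape {tuple(spec.shape)}"


if __name__ == "__main__":
    print(try_stft())        # GPU if available, otherwise CPU
    print(try_stft("cpu"))   # CPU path for comparison
```

If the CPU call succeeds and the GPU call fails, that points at the Torch/CUDA build rather than at OpenVoice or the audio file.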

Jan 04 '24 07:01 WolfieXIII

@WolfieXIII : That mirrors what I found, too. 😞

Re: trying to just upgrade Torch - alas, it appears OpenVoice has a dependency on wavmark, which doesn't seem to have a version compatible with torch>2.0. So, trying to get this to work on newer cards will likely require one of the following:

- Wait for wavmark to create a Torch 2.x-compatible version
- Replace wavmark with an alternative library (and then upgrade Torch)
- Create a custom build of Torch 1.13.1 that depends on CUDA 11.8 or later
- Your ideas here...

Jan 04 '24 08:01 xaroth8088

Someone already opened a PR on wavmark to support Torch 2.1 <3

https://github.com/wavmark/wavmark/pull/6

Jan 04 '24 21:01 JacopoMangiavacchi

I found that it works after doing the following two steps:

1. Upgrade torch, torchaudio, and torchvision to the latest versions.
2. Uninstall the default wavmark and install this fork instead: https://github.com/violetdenim/wavmark (git clone the project, cd into its directory, and run "pip install -e .").

Jan 05 '24 04:01 yctam

Confirmed!

Here are some simpler instructions to tide everyone over until wavmark officially updates their package:

1. Install OpenVoice, as per the README.md instructions
2. pip install -U torch torchvision torchaudio git+https://github.com/violetdenim/wavmark.git
3. Run OpenVoice, as per the README.md instructions
4. Enjoy!
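
To confirm that the upgrade actually took effect, something like the sketch below lists what is installed in the current environment. installed_versions is a hypothetical helper, and I'm assuming the fork still installs under the distribution name "wavmark".

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    packages = ["torch", "torchvision", "torchaudio", "wavmark"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

If torch still shows 1.13.1 afterwards, the pip command probably ran in a different environment than the one OpenVoice uses.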

Jan 05 '24 05:01 xaroth8088

Thanks! This should be included in README.md

Jan 19 '24 06:01 xiangdev