
Run this project on Ubuntu 24.04 with CUDA 12.

Open CyberT33N opened this issue 1 year ago • 4 comments

I cannot run this project because I get:

  • RuntimeError: Library libcublas.so.11 is not found or cannot be loaded

I use the code of demo_part3.ipynb and installed everything as explained in USAGE.md

Related to:

  • https://github.com/myshell-ai/OpenVoice/issues/225

Any ideas?
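This error typically indicates a CUDA version mismatch: the installed faster-whisper/CTranslate2 wheel was built against CUDA 11 and looks for `libcublas.so.11`, while a CUDA 12 system only ships `libcublas.so.12`. A minimal sketch to check which sonames the dynamic loader can actually resolve (the soname conventions are standard CUDA, but whether either loads depends on your machine):

```python
import ctypes

# Probe which libcublas sonames the dynamic loader can resolve.
# libcublas.so.11 ships with CUDA 11.x, libcublas.so.12 with CUDA 12.x.
for name in ("libcublas.so.11", "libcublas.so.12"):
    try:
        ctypes.CDLL(name)
        print(f"{name}: loadable")
    except OSError:
        print(f"{name}: NOT found")
```

If only `libcublas.so.12` loads, the fix is usually to install a ctranslate2 build that targets CUDA 12, or fall back to CPU as described below.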

However, I managed to get it working on CPU as a workaround, in case somebody else has the same problem.

Install:

cd ~/Projects/ai
git clone https://github.com/myshell-ai/OpenVoice
cd OpenVoice

pyenv local 3.9

python3 -m venv venv
source venv/bin/activate

pip install -e .

wget https://myshell-public-repo-host.s3.amazonaws.com/openvoice/checkpoints_v2_0417.zip
unzip checkpoints_v2_0417.zip

pip install git+https://github.com/myshell-ai/MeloTTS.git
python -m unidic download

python
# then enter in python shell
import nltk
nltk.download('averaged_perceptron_tagger')
exit()

Edit /home/userName/Projects/ai/OpenVoice/openvoice/se_extractor.py and change the WhisperModel line to:

model = WhisperModel(model_size, device="cpu", compute_type="float32")
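Hardcoding `"cpu"` works, but it permanently disables the GPU even on machines where CUDA is fine. A sketch of a less invasive edit, choosing the device from an environment variable (`WHISPER_DEVICE` is a made-up name for this example, not an OpenVoice or faster-whisper setting):

```python
import os

# Sketch: pick the faster-whisper device from an env var instead of
# hardcoding it. WHISPER_DEVICE is a hypothetical variable name.
# float32 is the safe compute type on CPU; float16 needs a GPU.
device = os.environ.get("WHISPER_DEVICE", "cpu")
compute_type = "float32" if device == "cpu" else "float16"
print(f'WhisperModel(model_size, device="{device}", compute_type="{compute_type}")')
```

With this in se_extractor.py you could run `WHISPER_DEVICE=cuda python main.py` once the CUDA libraries are fixed, without editing the file again.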

Start script:

python main.py

main.py:

import os
import torch
print(torch.__version__)              # Should be 2.1.0 or higher
print(torch.cuda.is_available())      # False on the CPU-only workaround
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # Name of your GPU

from openvoice import se_extractor
from openvoice.api import ToneColorConverter

ckpt_converter = 'checkpoints_v2/converter'
device = "cuda:0" if torch.cuda.is_available() else "cpu"
output_dir = 'outputs_v2'

tone_color_converter = ToneColorConverter(f'{ckpt_converter}/config.json', device=device)
tone_color_converter.load_ckpt(f'{ckpt_converter}/checkpoint.pth')

os.makedirs(output_dir, exist_ok=True)

reference_speaker = 'resources/example_reference.mp3' # This is the voice you want to clone
target_se, audio_name = se_extractor.get_se(reference_speaker, tone_color_converter, vad=False)

from melo.api import TTS

texts = {
    'EN_NEWEST': "Did you ever hear a folk tale about a giant turtle?",  # The newest English base speaker model
    'EN': "Did you ever hear a folk tale about a giant turtle?",
    'ES': "El resplandor del sol acaricia las olas, pintando el cielo con una paleta deslumbrante.",
    'FR': "La lueur dorée du soleil caresse les vagues, peignant le ciel d'une palette éblouissante.",
    'ZH': "在这次vacation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。",
    'JP': "彼は毎朝ジョギングをして体を健康に保っています。",
    'KR': "안녕하세요! 오늘은 날씨가 정말 좋네요.",
}


src_path = f'{output_dir}/tmp.wav'

# Speed is adjustable
speed = 1.0

for language, text in texts.items():
    model = TTS(language=language, device=device)
    speaker_ids = model.hps.data.spk2id
    
    for speaker_key in speaker_ids.keys():
        speaker_id = speaker_ids[speaker_key]
        speaker_key = speaker_key.lower().replace('_', '-')
        
        source_se = torch.load(f'checkpoints_v2/base_speakers/ses/{speaker_key}.pth', map_location=device)
        model.tts_to_file(text, speaker_id, src_path, speed=speed)
        save_path = f'{output_dir}/output_v2_{speaker_key}.wav'

        # Run the tone color converter
        encode_message = "@MyShell"
        tone_color_converter.convert(
            audio_src_path=src_path, 
            src_se=source_se, 
            tgt_se=target_se, 
            output_path=save_path,
            message=encode_message)
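For reference, the inner loop above derives each base-speaker checkpoint filename from the MeloTTS speaker key by lower-casing it and replacing underscores with hyphens. A quick illustration (the example keys here are assumptions; the real keys come from model.hps.data.spk2id):

```python
# Illustration: how speaker keys map to checkpoint filenames in the loop above.
# The example keys are assumptions; real keys come from model.hps.data.spk2id.
for key in ("EN-US", "EN_INDIA", "EN-Newest"):
    filename = f"checkpoints_v2/base_speakers/ses/{key.lower().replace('_', '-')}.pth"
    print(key, "->", filename)
```

If a .pth file is missing for one of your speaker keys, this mapping is the first place to check.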

CyberT33N avatar Nov 26 '24 23:11 CyberT33N

I have the same issue.

lukaLLM avatar Dec 11 '24 09:12 lukaLLM

Same issue, too. v1 works fine.

hongleng avatar Dec 23 '24 11:12 hongleng

Try with a docker image maybe.

https://github.com/ground-creative/openvoice-docker

ground-creative avatar Jan 02 '25 06:01 ground-creative

> Try with a docker image maybe.
>
> https://github.com/ground-creative/openvoice-docker

Did this work for anyone yet?

ER404R avatar Jan 07 '25 08:01 ER404R