Windows + python3.9 + OpenVoice v2 = not possible without CUDA?
Hi,
I followed the Windows installation guide and tried both the latest Python 3.12 and Python 3.9.12 (the guide recommends Python 3.9).
When I attempt to run the v2 example from demo_part3.ipynb, I get an error originating from the following line:
target_se, audio_name = se_extractor.get_se(reference_speaker, tone_color_converter, vad=False)
This is the error:
Traceback (most recent call last):
  File "C:\Users\user\Source\VoiceCloningTests\OpenVoice\demov2_.py", line 23, in <module>
    target_se, audio_name = se_extractor.get_se(reference_speaker, tone_color_converter, vad=False)
  File "C:\Users\user\Source\VoiceCloningTests\OpenVoice\openvoice\se_extractor.py", line 146, in get_se
    wavs_folder = split_audio_whisper(audio_path, target_dir=target_dir, audio_name=audio_name)
  File "C:\Users\user\Source\VoiceCloningTests\OpenVoice\openvoice\se_extractor.py", line 22, in split_audio_whisper
    model = WhisperModel(model_size, device="cuda", compute_type="float16")
  File "C:\Users\user\Source\VoiceCloningTests\OpenVoice\env39\lib\site-packages\faster_whisper\transcribe.py", line 128, in __init__
    self.model = ctranslate2.models.Whisper(
RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version
At the beginning of my script I follow the demo and set up my device variable the same way, so that it falls back to cpu. But when I open the se_extractor.py file from the traceback above, I see that the device is hardcoded to cuda, and I end up with the error above.
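For reference, this is roughly how the demo selects the device (a sketch; the exact line in demo_part3.ipynb may differ slightly):

import torch

# Fall back to CPU when no CUDA device is available
device = "cuda:0" if torch.cuda.is_available() else "cpu"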
My machine uses integrated Intel graphics, so AFAIK it is not CUDA-capable at all. Does this mean I cannot run OpenVoice v2 without NVIDIA graphics?
This is the code from the library, se_extractor.py, with the hardcoded cuda string that raises the issue:
def split_audio_whisper(audio_path, audio_name, target_dir='processed'):
    global model
    if model is None:
        model = WhisperModel(model_size, device="cuda", compute_type="float16")
    # ...
Hello,
I just ran into the same issue. I replaced the line in question with:
model = WhisperModel(model_size, device="cpu", compute_type="float32")
And it works. Hope it helps you too!
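By the way, if CPU inference turns out to be slow, int8 may also be worth trying; the faster-whisper README shows that combination for CPU (I haven't benchmarked it with OpenVoice myself):

model = WhisperModel(model_size, device="cpu", compute_type="int8")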
Lol, just came back to say I iterated towards the same solution and got it working. Thanks for the tip.
For all those who come to this issue looking for the same fix, here's how I iterated towards the solution:
- I looked through the files raising the error. In the code above, you can see it comes from WhisperModel.
- I located WhisperModel in my environment (installed by pip into the virtualenv) in venv/Lib/site-packages/faster_whisper. I know it's in the faster_whisper module, because WhisperModel is imported from this module at the very beginning of the example (a quicker way to locate it is sketched right after this list).
- The error came from transcribe.py, so that's the file I opened to look for the class definition.
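The sketch mentioned above: rather than hunting through site-packages by hand, Python itself can tell you where a pip-installed module lives (a general trick, nothing OpenVoice-specific):

import faster_whisper

# Prints where pip installed the package,
# e.g. ...\venv\Lib\site-packages\faster_whisper\__init__.py
print(faster_whisper.__file__)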
Now, there are two parameters of interest in that constructor: device and compute_type. When I previously tried hardcoding cpu for device alone, I was told that float16 is unsupported. So my line of thinking was to look up which other types are supported and try their combinations.
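A minimal probe along those lines, assuming unsupported types raise an exception at init (as they did for me) and using tiny to keep the model download small:

from faster_whisper import WhisperModel

# Try compute types until one loads on CPU
for ct in ("float16", "int8", "float32"):
    try:
        WhisperModel("tiny", device="cpu", compute_type=ct)
        print(ct, "-> OK")
    except (ValueError, RuntimeError) as e:
        print(ct, "->", e)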
Line 91 of transcribe.py contains a really long docstring, out of which I will take the most important parts:
"""
Initializes the Whisper model.
Args:
[...]
device: Device to use for computation ("cpu", "cuda", "auto").
compute_type: Type to use for computation.
See https://opennmt.net/CTranslate2/quantization.html.
[...]
"""
The quantization link provides a reference table of implicit type conversions on load, where I could look up what float16 falls back to on CPU for my architecture (Intel, x64: it is float32).
I changed the corresponding line to hardcode cpu for device and float32 for compute_type, and got a result on the output.
For your convenience, the reference table of implicit conversions is at the quantization link above.
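And if you want the patched file to keep working on CUDA machines too, a device-aware variant is a small step further (a sketch, not tested on a CUDA box myself; torch is already an OpenVoice dependency):

import torch
from faster_whisper import WhisperModel

model_size = "medium"  # same value se_extractor.py defines at module level

# Use the GPU with float16 when available, otherwise fall back to CPU with float32
device = "cuda" if torch.cuda.is_available() else "cpu"
compute_type = "float16" if device == "cuda" else "float32"
model = WhisperModel(model_size, device=device, compute_type=compute_type)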
Haha, you went about it much more professionally than I did :) Happy we both found a solution.
Hello, I have the same error even with the fix: