Retrieval-based-Voice-Conversion-WebUI: Error when using a trained model to convert audio files (ffmpeg-based)

After training I can use my model just fine and convert many voice files, but sometimes it gives me an error when doing so:

Traceback (most recent call last):
  File "L:\RCV_voiceclone\my_utils.py", line 14, in load_audio
    ffmpeg.input(file, threads=0)
  File "L:\RCV_voiceclone\runtime\lib\site-packages\ffmpeg\_run.py", line 325, in run
    raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "L:\RCV_voiceclone\infer-web.py", line 161, in vc_single
    audio = load_audio(input_audio_path, 16000)
  File "L:\RCV_voiceclone\my_utils.py", line 19, in load_audio
    raise RuntimeError(f"Failed to load audio: {e}")
RuntimeError: Failed to load audio: ffmpeg error (see stderr output for detail)

Traceback (most recent call last):
  File "L:\RCV_voiceclone\runtime\lib\site-packages\gradio\routes.py", line 321, in run_predict
    output = await app.blocks.process_api(
  File "L:\RCV_voiceclone\runtime\lib\site-packages\gradio\blocks.py", line 1007, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "L:\RCV_voiceclone\runtime\lib\site-packages\gradio\blocks.py", line 953, in postprocess_data
    prediction_value = block.postprocess(prediction_value)
  File "L:\RCV_voiceclone\runtime\lib\site-packages\gradio\components.py", line 2076, in postprocess
    processing_utils.audio_to_file(sample_rate, data, file.name)
  File "L:\RCV_voiceclone\runtime\lib\site-packages\gradio\processing_utils.py", line 206, in audio_to_file
    data = convert_to_16_bit_wav(data)
  File "L:\RCV_voiceclone\runtime\lib\site-packages\gradio\processing_utils.py", line 219, in convert_to_16_bit_wav
    if data.dtype in [np.float64, np.float32, np.float16]:
AttributeError: 'NoneType' object has no attribute 'dtype'

It is not just one file. I have opened the files that tend to fail, edited them, and resaved them in multiple formats, including formats matching the files that work. Which files produce this error and which convert properly feels random.
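The traceback only says "see stderr output for detail", so the actual ffmpeg complaint is hidden. One way to surface it is to run ffmpeg directly on a failing file and read what it prints. The following is a minimal sketch, assuming an `ffmpeg` executable is on PATH; the helper names `build_check_command` and `ffmpeg_stderr` are hypothetical and not part of the WebUI.

```python
import shutil
import subprocess


def build_check_command(path, ffmpeg_bin="ffmpeg"):
    """Build an ffmpeg command that decodes `path` and discards the output.

    Passing the arguments as a list (no shell) means a path containing
    spaces is handed to ffmpeg as a single argument.
    """
    return [ffmpeg_bin, "-v", "error", "-i", path, "-f", "null", "-"]


def ffmpeg_stderr(path, ffmpeg_bin="ffmpeg"):
    """Run the check command and return whatever ffmpeg wrote to stderr.

    Assumption: `ffmpeg_bin` is resolvable on PATH; otherwise a
    FileNotFoundError is raised before anything is run.
    """
    if shutil.which(ffmpeg_bin) is None:
        raise FileNotFoundError(f"{ffmpeg_bin} not found on PATH")
    result = subprocess.run(
        build_check_command(path, ffmpeg_bin),
        capture_output=True,
        text=True,
        errors="replace",  # do not crash on undecodable bytes in the log
    )
    return result.stderr


if __name__ == "__main__":
    import sys

    # Usage: python check_audio.py "path to failing file.wav"
    print(ffmpeg_stderr(sys.argv[1]) or "ffmpeg reported no errors")
```

Comparing the output for a file that converts with the output for one that fails should show whether ffmpeg is rejecting the path or the audio data itself.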

Jun 04 '23 19:06 hobolyra

I got the exact same error when I tried using pitch on a model that had pitch unselected. Set "Protect the artifact of voiceless consonant and breath" to 0.5 and see if the error goes away. This error can also happen when there is a space in the audio file's name.
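To check the file name point quickly, something like the sketch below could be run on the path before it is pasted into the input box. It is only an illustration: `clean_audio_path` and `has_space_in_name` are hypothetical helpers, and the idea that stray quotes or trailing whitespace around a pasted path also matter is an assumption, not something confirmed in this thread.

```python
import os


def clean_audio_path(raw):
    """Strip surrounding whitespace and quote characters from a pasted path.

    Assumption: the path may have been copied with wrapping quotes or a
    trailing newline. Spaces inside the path are left untouched.
    """
    return raw.strip().strip('"').strip("'").strip()


def has_space_in_name(path):
    """Return True if the file name itself (not its folders) contains a space."""
    return " " in os.path.basename(path)


if __name__ == "__main__":
    pasted = '  "C:/voices/my take.wav"\n'
    path = clean_audio_path(pasted)
    if has_space_in_name(path):
        print(f"Consider renaming the file, its name contains a space: {path}")
```

If renaming a failing file so that its name has no spaces makes the conversion succeed, that points to the path rather than the audio content.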

Jun 05 '23 08:06 kazuviking2

This issue was closed because it has been inactive for 15 days since being marked as stale.

Apr 28 '24 04:04 github-actions[bot]