VITS code fails to run on Python 3.10.12

Hi, I managed to run VITS on Python 3.8.12, but after upgrading to Python 3.10.12 I got the following error:

Traceback (most recent call last):
  File "/content/drive/MyDrive/vits/train.py", line 290, in <module>
    main()
  File "/content/drive/MyDrive/vits/train.py", line 50, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 239, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 197, in start_processes
    while not context.join():
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/content/drive/MyDrive/vits/train.py", line 117, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler, [train_loader, eval_loader], logger, [writer, writer_eval])
  File "/content/drive/MyDrive/vits/train.py", line 137, in train_and_evaluate
    for batch_idx, (x, x_lengths, spec, spec_lengths, y, y_lengths) in enumerate(train_loader):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 644, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "/content/drive/MyDrive/vits/data_utils.py", line 114, in __call__
    torch.LongTensor([x[1].size(1) for x in batch]),
  File "/content/drive/MyDrive/vits/data_utils.py", line 114, in <listcomp>
    torch.LongTensor([x[1].size(1) for x in batch]),
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

It appears that under Python 3.10.12 the input tensor has the wrong shape, but this doesn't happen under Python 3.8.

May I ask why this is happening? Thanks.

Sep 24 '23 23:09 CKAbundant

You can try running the code with Python 3.7. Python 3.7 works for me.

Oct 10 '23 03:10 ghost

Hi there

I am facing the same issue on Google Colab. I've switched the Python version to 3.7, but I get the same error.

Traceback (most recent call last):
  File "train.py", line 290, in <module>
    main()
  File "train.py", line 50, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "/usr/local/envs/vits_tts/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/usr/local/envs/vits_tts/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/usr/local/envs/vits_tts/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/usr/local/envs/vits_tts/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/content/vits/train.py", line 62, in run
    dist.init_process_group(backend='nccl', init_method='env://', world_size=n_gpus, rank=rank)
  File "/usr/local/envs/vits_tts/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 754, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/usr/local/envs/vits_tts/lib/python3.7/site-packages/torch/distributed/rendezvous.py", line 246, in _env_rendezvous_handler
    store = _create_c10d_store(master_addr, master_port, rank, world_size, timeout)
  File "/usr/local/envs/vits_tts/lib/python3.7/site-packages/torch/distributed/rendezvous.py", line 169, in _create_c10d_store
    raise ValueError(f"port must have value from 0 to 65535 but was {port}.")
ValueError: port must have value from 0 to 65535 but was 80000.

Sep 17 '24 13:09 Ayushi113

OK, you can fix it by changing the port from 80000 to another number within the valid range (0 to 65535).
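
A minimal sketch of that fix, assuming the port comes from an `os.environ['MASTER_PORT'] = '80000'` style assignment in train.py. The traceback only shows `init_method='env://'`, which reads `MASTER_PORT` from the environment, so the exact line is an assumption. The helper names `pick_free_port` and `set_master_port` are made up for illustration and are not part of VITS or PyTorch:

```python
import os
import socket


def pick_free_port() -> int:
    """Ask the OS for an unused TCP port on localhost (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
        return s.getsockname()[1]


def set_master_port(port=None) -> str:
    """Validate the port and export it as MASTER_PORT (hypothetical helper).

    If no port is given, a free one is picked automatically.
    """
    if port is None:
        port = pick_free_port()
    # TCP ports are 16-bit, so anything above 65535 (such as 80000) is invalid.
    if not 0 < port <= 65535:
        raise ValueError(f"port must be between 1 and 65535, got {port}")
    os.environ["MASTER_PORT"] = str(port)
    return os.environ["MASTER_PORT"]
```

Calling `set_master_port(8000)` (or `set_master_port()` to let the OS pick) before `mp.spawn(...)` should be enough for `dist.init_process_group(..., init_method='env://')` to get a valid port.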

Nov 18 '25 06:11 zzhdbw