FreeVC issues

Update requirements.txt

1

The pinned versions of torch and torchvision were incompatible. Torch 1.10.0 requires torchvision 0.11.1, not 0.9.0.

siddy819
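The version mismatch described in this issue can be caught before installation with a small check. The following is a minimal sketch, not part of FreeVC: `KNOWN_PAIRS`, `expected_torchvision`, and `is_compatible` are hypothetical names, the table is deliberately partial, and only the 1.10.0 entry comes from the issue text (the 1.8.0 entry is an assumption taken from the PyTorch release compatibility table).

```python
# Partial torch -> torchvision pairing table (hypothetical helper, not from FreeVC).
KNOWN_PAIRS = {
    "1.10.0": "0.11.1",  # stated in the issue above
    "1.8.0": "0.9.0",    # assumption: from the PyTorch release compatibility table
}


def _strip_local(version):
    """Drop a local build suffix such as '+cu113' from a version string."""
    return version.split("+")[0]


def expected_torchvision(torch_version):
    """Return the paired torchvision version, or None if the pair is not in the table."""
    return KNOWN_PAIRS.get(_strip_local(torch_version))


def is_compatible(torch_version, torchvision_version):
    """Return True/False for known torch versions, None when the table has no entry."""
    expected = expected_torchvision(torch_version)
    if expected is None:
        return None
    return _strip_local(torchvision_version) == expected


if __name__ == "__main__":
    # The combination reported as broken in the issue.
    print(is_compatible("1.10.0", "0.9.0"))
    # The combination the issue proposes instead.
    print(is_compatible("1.10.0", "0.11.1"))
```

Returning `None` for unknown versions keeps the sketch from claiming anything about pairs it does not list.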

Unseen Male to Male results in Female output

1

Hi, I really like this great project. While testing it with an English male sample as the source and a Malayalam male sample as the target, the output sounds like a female voice,...

bharaniyv

Poor results with: voice_conversion_models--multilingual--vctk--freevc24.zip CoquiTTS

6

At first I was somewhat impressed, using a male voice as source and a female voice as target that was 30 seconds long and noise-cleaned by AI. It pretty much...

ballerburg9005

Some observations after 690k steps

33

For unseen F to seen M conversion, the resulting pitch is very close to that of the source speaker, especially if the source pitch is much higher than the seen M pitch....

skol101

Why is the speaker embedding g used to condition the Posterior Encoder and the Decoder?

I am confused why the speaker embedding `g` is used to condition multiple model components (_Posterior Encoder, Decoder, Flow_) as opposed to just _Flow_. From the model diagram in **Fig....

st-vincent1

training a model with 44.1k data

Are there any tips to consider when training a model with 44.1k data? Additionally, does increasing the sampling rate of training data contribute to improved model performance? Now my training...

clumsyroot

target pitch issue after training (not appearing if using the pretrained checkpoint)

1

Hi, I'm running trainings with and without the pretrained checkpoint (VCTK) as the initial state. However, in both cases the target pitch is affected by the input pitch (e.g. from...

fervillamar

How to start inference example?

1

When I do: # inference with FreeVC `CUDA_VISIBLE_DEVICES=0 python convert.py --hpfile logs/freevc.json --ptfile checkpoints/freevc.pth --txtpath convert.txt --outdir outputs/freevc` How do I get the freevc.json and freevc.pth checkpoint if I did...

asusdisciple

poor performance on seen-to-unseen task while finetuning on Hindi language

2

Hello! I'm delighted to come across this remarkable project, and thanks for sharing it as an open-source project. Currently, my focus lies on fine-tuning the freevc-s model using pretrained checkpoints...

rgenai

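Question about training

While testing speaker similarity, I found that the average similarity measured on the training set is very close to the one measured on LibriTTS train-clean-100. Is that because the provided pt file has already been fine-tuned on LibriTTS? Or is my method of measuring speaker similarity unsuitable? I use the project's bundled pretrained speaker encoder to extract embedding vectors and compute the cosine similarity between the converted speech and the reference audio.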

Aydous
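The similarity measurement described in this issue comes down to a cosine similarity between two embedding vectors, averaged over utterance pairs. The following is a minimal sketch under stated assumptions: the embeddings are plain lists of floats that have already been extracted (the speaker encoder call itself is not shown), and `cosine_similarity` and `mean_similarity` are hypothetical helper names, not functions from the FreeVC repository.

```python
import math


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_a * norm_b, eps)


def mean_similarity(converted_embs, reference_embs):
    """Average cosine similarity over paired (converted, reference) embeddings."""
    if len(converted_embs) != len(reference_embs) or not converted_embs:
        raise ValueError("need the same non-zero number of embeddings on both sides")
    scores = [cosine_similarity(c, r) for c, r in zip(converted_embs, reference_embs)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real speaker embeddings.
    converted = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]]
    reference = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    print(mean_similarity(converted, reference))
```

Comparing the mean score on training speakers against the mean score on held-out speakers, computed the same way, is one way to see whether the two sets really behave alike.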
