ppg-vc
PPG-Based Voice Conversion
I've noticed that the quality of the VCTK dataset is extremely high. Will a difference in quality between VCTK and my custom dataset influence the result? Thanks.
Bumps [numpy](https://github.com/numpy/numpy) from 1.19.2 to 1.22.0. Release notes Sourced from numpy's releases. v1.22.0 NumPy 1.22.0 Release Notes NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...
What are /home/shaunxliu/data/vctk/fidlists/train_fidlist.new, /home/shaunxliu/data/vctk/fidlists/dev_fidlist.new, and /home/shaunxliu/data/vctk/fidlists/eval_fidlist.txt? How can I get these kinds of files for my own dataset?
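One possible way to build such lists for a custom dataset is sketched below. It assumes, and this is not confirmed in the thread, that a fidlist is a plain text file with one utterance ID per line, where the ID is the audio filename without its extension. The helper names `split_fids` and `write_fidlists`, the 90/5/5 split, and the output filenames are all hypothetical choices for illustration.

```python
import random
from pathlib import Path


def split_fids(fids, dev_frac=0.05, eval_frac=0.05, seed=0):
    """Shuffle utterance IDs reproducibly and split into train/dev/eval."""
    fids = sorted(fids)
    random.Random(seed).shuffle(fids)
    n_dev = max(1, int(len(fids) * dev_frac))
    n_eval = max(1, int(len(fids) * eval_frac))
    dev = fids[:n_dev]
    evl = fids[n_dev:n_dev + n_eval]
    train = fids[n_dev + n_eval:]
    return train, dev, evl


def write_fidlists(wav_dir, out_dir):
    """Collect *.wav stems under wav_dir and write three list files.

    Assumption: one utterance ID per line, ID = filename without extension.
    """
    fids = [p.stem for p in Path(wav_dir).rglob("*.wav")]
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = ("train_fidlist.txt", "dev_fidlist.txt", "eval_fidlist.txt")
    for name, subset in zip(names, split_fids(fids)):
        (out / name).write_text("\n".join(subset) + "\n")
```

Calling `write_fidlists("my_dataset/wavs", "my_dataset/fidlists")` would then produce three files whose paths can be used in place of the ones above, provided the ID format matches what the training scripts expect.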
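Hello, and thank you for your work. I used your pretrained model to obtain the mel spectrogram of an audio clip, but I noticed that it seems to contain values below zero, which looks different from the mel spectrograms I have seen before (extracted with the librosa library). If I want to use it to go on and extract MFCC features, is that feasible? If so, could you give some brief guidance? I noticed that your code lets me specify min_mel and max_mel to adjust the value range, but after setting min_mel to a value greater than zero, I don't know how to choose max_mel, and I can't be sure whether the result is a standard MFCC feature. Thank you very much for reading.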
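For context, MFCCs are conventionally obtained by applying a type-II DCT to log-scaled mel band energies, and log-scaled energies below 1 are negative. The sketch below shows that step only, assuming the negative values come from a log scale (the thread does not confirm how this repository scales its mel features). If the features were additionally rescaled with min_mel and max_mel, that rescaling would have to be undone first. The function names are hypothetical.

```python
import math


def dct2_ortho(frame):
    """Orthonormal type-II DCT of one frame (a list of log-mel values)."""
    n = len(frame)
    out = []
    for k in range(n):
        s = sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(frame))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def mfcc_from_log_mel(log_mel, n_mfcc=13):
    """Keep the first n_mfcc DCT coefficients of every frame.

    log_mel: list of frames, each a list of log-mel band energies
    (negative values are fine on a log scale).
    """
    return [dct2_ortho(frame)[:n_mfcc] for frame in log_mel]
```

The resulting coefficients would only match another toolkit's MFCCs if the mel filterbank settings and the log scaling (natural log, log10, or dB) are the same on both sides.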
Thank you for sharing. There's one thing I don't understand: why does the one-quarter subsampling have to be changed or removed? What harm does it do to the VC task...
Thank you for the questions. For Q1: I adapted espnet a lot; it seems that espnet asr models always downsample the encoder input along the temporal axis more than 4x...
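To make the effect of temporal subsampling concrete, here is a small sketch. It assumes a front end made of two unpadded convolutions with kernel size 3 and stride 2, a common way to get roughly 4x downsampling; the actual layers used in this repository are not shown in the thread, and the function names are hypothetical.

```python
def conv_out_len(t, kernel=3, stride=2):
    """Output length of an unpadded 1-D convolution along time."""
    return (t - kernel) // stride + 1


def subsampled_len(t, n_layers=2):
    """Frames left after n_layers stacked stride-2 convolutions."""
    for _ in range(n_layers):
        t = conv_out_len(t)
    return t


print(subsampled_len(100))  # 24
```

Under these assumptions, 100 input frames become 24 encoder frames, so the encoder output no longer lines up one-to-one with the mel frames of the same utterance.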
I notice that in utterance_mvn(), if norm_means is True, it looks like std.sqrt() is performed twice. Is there any explanation?
```
if norm_means:
    x -= mean
    if norm_vars:
        var = x.pow(2).sum(dim=1, keepdim=True) /...
```
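For comparison, textbook utterance-level mean-variance normalization divides by the standard deviation (the square root of the variance) exactly once. The following is a reference sketch in plain Python, not the repository's implementation. The name `utterance_mvn_reference` and the use of nested lists instead of tensors are illustrative assumptions, and whether the extra square root in the repository is intentional is not answered here.

```python
import math


def utterance_mvn_reference(frames, norm_means=True, norm_vars=True, eps=1e-20):
    """Normalize each feature dimension over the time axis of one utterance.

    frames: list of T frames, each a list of D floats.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / n
        stds.append(max(math.sqrt(var), eps))  # clamp to avoid division by zero
    out = []
    for f in frames:
        row = []
        for d in range(dims):
            v = f[d] - means[d] if norm_means else f[d]
            if norm_vars:
                v /= stds[d]  # divide by the standard deviation exactly once
            row.append(v)
        out.append(row)
    return out
```

With both flags enabled, every feature dimension of the output has zero mean and unit variance over the utterance, which gives a simple check against any other implementation.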
Does this project support cross-lingual voice conversion? If yes, what changes need to be made? Thanks.
Great work. It shows extraordinary results in the zero-shot condition. Have you tested your model on Mandarin datasets? If I want to try Mandarin datasets, which module i...