FAcodec issues

Does 'uv' stand for 'unvoiced'?

1

That is, does it indicate whether a given frame is voiced, computed by checking whether f0 exceeds some threshold?
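To make the interpretation in this question concrete, here is a minimal sketch of a per-frame unvoiced flag derived from an f0 track by thresholding. It only illustrates what the question describes; the function name `unvoiced_flags`, the 10 Hz threshold, and the convention that 1 means unvoiced are assumptions for illustration, not confirmed behavior of the FAcodec code.

```python
from typing import List, Sequence


def unvoiced_flags(f0_hz: Sequence[float], threshold_hz: float = 0.0) -> List[int]:
    """Return 1 for frames treated as unvoiced (f0 <= threshold), else 0.

    Hypothetical helper: the name, threshold, and 0/1 convention are
    assumptions for illustration only.
    """
    return [0 if f0 > threshold_hz else 1 for f0 in f0_hz]


# Toy f0 track in Hz; pitch extractors commonly report 0 for unvoiced frames.
f0_track = [0.0, 0.0, 121.5, 118.2, 0.0, 95.0]
flags = unvoiced_flags(f0_track, threshold_hz=10.0)
print(flags)
```

Frames whose f0 is at or below the threshold are flagged as unvoiced; everything else is treated as voiced.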

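I. Is my workflow correct?
1. Modify meldataset.py into my own dataloader, use the VCTK dataset with pseudo-labels generated by wav2vec, and run train.py to produce several ckpt files.
2. Use the last ckpt from that run as the pretrained model and train with train_redecoder.py. (One question here: is it fine for train_redecoder.py to use the same dataset as train.py?)
3. Use the ckpt from train.py together with the ckpt from train_redecoder.py in reconstruct_redecoder.py to perform timbre conversion.
II. Do the ckpt files produced by train.py and train_redecoder.py have the same structure and parameters as the pretrained bin model you provide?
Thanks for your answer!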

rainbowjack

Audio format in dataset files

2

Thanks for your great work on implementing FACodec! I found that the data file at https://github.com/Plachtaa/FAcodec/blob/master/data/val.txt has some labels, such as speaker id and phonemes. How can I get these labels? Will these...

r666ay

How many steps would be enough if I train this model from scratch?

8

Hi! Nice work! Could you share how many steps would be sufficient to train a new model? I'm trying to train a 16k FAcodec. The results reconstructed by ckpt 130,000...

lixuyuan102

What do the loss curves look like during your successful training?

5

Hello, I've attempted to train FAcodec using my own dataset. However, whether I start from scratch or fine-tune your provided checkpoint, the reconstructed audio clips are just noise. I fine-tuned...

YuXiangLin1234

The high-frequency content is completely missing from the reconstructed audio

6

qgzang

Hello, I'd like to ask how to run inference with the PTH file produced by train, and about the idea of disentangling without any labels

2

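The project's reconstruct and redecoder reconstruct scripts seem to work only with the pretrained files, i.e. the bin files. Can the pth file produced by train be used for inference? Also, in which file is the method for training disentangled audio attributes without any labels implemented? Thanks for your answer.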

rainbowjack

Hello, I have a question about checkpoints

2

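I noticed that the pretrained checkpoints you provide all seem to be weights-only bin files, while the checkpoints produced by the repository's train script are in pth format; the sizes alone differ by 2.5 GB. Since I can reach neither HF nor HFmirror, I decided to try my own trained checkpoint first: I renamed it to pytorch_model.bin and placed it in checkpoints together with the config. I then found that the trained model cannot be used for audio reconstruction, because during reconstruct the model's keys are:
dict_keys(['encoder', 'quantizer', 'decoder', 'discriminator', 'fa_predictors'])
while the checkpoint's keys are:
Keys in ckpt_params: dict_keys(['net', 'optimizer', 'scheduler', 'iters', 'epoch'])
Is this by design, or am I using it incorrectly? Finally, how do you disentangle the timbre, content, and pitch of an audio clip without adding any labels or annotations? Which function in which file does this? Thanks a lot for your answer.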

rainbowjack
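The key mismatch reported in the checkpoint issue above suggests that the training checkpoint wraps the model weights under 'net' next to optimizer and scheduler state. A minimal sketch of unwrapping it follows. It assumes (unconfirmed by the thread) that `ckpt['net']` is a dict keyed by the five module names the model expects; the helper name `extract_inference_weights` is hypothetical, and a toy dict stands in for a real checkpoint.

```python
from typing import Any, Dict

# Module names taken from the model keys quoted in the issue.
EXPECTED_MODULES = ("encoder", "quantizer", "decoder", "discriminator", "fa_predictors")


def extract_inference_weights(train_ckpt: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the per-module weights out of a training checkpoint.

    Assumption: train_ckpt['net'] maps each module name to its state dict.
    Optimizer, scheduler, and step counters are dropped.
    """
    if "net" not in train_ckpt:
        raise KeyError("expected a training checkpoint with a 'net' entry")
    net = train_ckpt["net"]
    missing = [name for name in EXPECTED_MODULES if name not in net]
    if missing:
        raise KeyError(f"'net' is missing expected modules: {missing}")
    return {name: net[name] for name in EXPECTED_MODULES}


# Toy stand-in for a loaded training checkpoint (no real tensors).
toy_ckpt = {
    "net": {name: {f"{name}.weight": [0.0]} for name in EXPECTED_MODULES},
    "optimizer": {},
    "scheduler": {},
    "iters": 1000,
    "epoch": 3,
}
weights = extract_inference_weights(toy_ckpt)
print(sorted(weights))
```

With PyTorch, the input dict would typically come from `torch.load(path, map_location="cpu")` and the result could be written back with `torch.save`. Whether the repository's reconstruct scripts accept such a file is untested here. Dropping the optimizer state may also account for part of the size difference mentioned in the issue, though that is a guess.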

Does the prosody codes[0] work?

3

I tried testing the code specifically for prosody, but the prosody seemed to be tied to codes[1] along with the content?

dalazymodder

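Does 'uv' stand for 'unvoiced'?

Can the decoder support streaming output?

Some questions about the training and inference workflow

Audio format in dataset files

How many steps would be enough if I train this model from scratch?

What do the loss curves look like during your successful training?

The high-frequency content is completely missing from the reconstructed audio

Hello, I'd like to ask how to run inference with the PTH file produced by train, and about the idea of disentangling without any labels

Hello, I have a question about checkpoints

Does the prosody codes[0] work?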
