AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
I want to reproduce the results in this paper "HIFI-CODEC: GROUP-RESIDUAL VECTOR QUANTIZATION FOR HIGH FIDELITY AUDIO CODEC". However, the description is quite confusing. The paper only says that the...
Why normalize during inference (`wav = normalize(wav) * 0.95`)? I didn't see the same operation applied to the training data. Is this step necessary?
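For reference, a minimal sketch of what such a step could look like before encoding; the `normalize` helper and the interpretation as peak normalization with 0.95 headroom are assumptions, not the repo's actual implementation:

```python
import numpy as np

def normalize(wav: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    # Peak-normalize to roughly [-1, 1]; eps guards against division by zero on silence.
    return wav / (np.abs(wav).max() + eps)

# Scale to 0.95 of full scale to leave a little headroom before encoding.
wav = np.random.randn(24000).astype(np.float32)  # placeholder: ~1 s of audio at 24 kHz
wav = normalize(wav) * 0.95
print(np.abs(wav).max())  # close to 0.95
```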
Hi, thanks for the work. I read the paper but still couldn't find information about the decoding speed of HiFiCodec. I mean, how long would it take on CPU...
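For anyone wanting to measure this themselves, a minimal sketch of timing a decoder forward pass on CPU; the decoder below is a stand-in module, not HiFiCodec's real architecture, and all shapes are assumptions for illustration only:

```python
import time
import torch
import torch.nn as nn

# Stand-in decoder: a single transposed convolution with a 240-sample hop
# (mirroring the 24k_240d recipe name), NOT the actual HiFiCodec decoder.
decoder = nn.ConvTranspose1d(128, 1, kernel_size=480, stride=240)
codes = torch.randn(1, 128, 100)  # roughly 1 s worth of latent frames (assumed shape)

decoder.eval()
with torch.inference_mode():
    start = time.perf_counter()
    wav = decoder(codes)
    elapsed = time.perf_counter() - start
print(f"decoded {wav.shape[-1]} samples in {elapsed * 1000:.1f} ms on CPU")
```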
The model does not converge when I use HiFi-Codec to train the NAR stage of VALL-E. The data I used is a 5000-hour Chinese dataset. How can...
There is no LICENSE file. What is the license for this project and the pretrained models?
I am currently studying this model's training pipeline, but the paper says training to convergence takes eight GPUs for over a month, so I certainly cannot train it well in a short time. I would like to know whether a trained **discriminator** model is available. Thank you.
When running egs/SoundStream_24k_240d/main3_ddp.py, once execution reaches line 9, which imports the custom module academicodec/models/encodec/distributed/launch.py, launch.py fails at its line 5 with a module-not-found error. The fix is simply to rewrite line 5 of launch.py as `from . import distributed as dist_fn`.
Since the input and output arguments coming out of argparse are already pathlib.Path objects, there is no need to bring in os operations. Combining this with the official encodec code, I made the following changes.
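As an illustration (not the exact diff from that issue), a minimal sketch of how argparse can hand back pathlib.Path objects directly; the `--input`/`--output` flag names are assumptions:

```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser()
# type=Path makes argparse return pathlib.Path objects directly,
# so os.path-style string handling is unnecessary.
parser.add_argument("--input", type=Path, required=True)
parser.add_argument("--output", type=Path, required=True)
args = parser.parse_args(["--input", "in.wav", "--output", "out/recon.wav"])

args.output.parent.mkdir(parents=True, exist_ok=True)  # replaces os.makedirs
print(args.input.suffix, args.output.resolve())
```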
Saving in a format like the one below works; otherwise the model checkpoint saved on a single machine will run into index-related problems. I will submit a fixed version later.

```python
if epoch % config.common.save_interval == 0:
    model_to_save = model.module if config.distributed.data_parallel else model
    disc_model_to_save = disc_model.module if config.distributed.data_parallel else disc_model
    if not config.distributed.data_parallel or dist.get_rank() == 0:
        save_master_checkpoint(epoch,...
```
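For context, a minimal sketch of why unwrapping `.module` before saving matters: a state dict taken from a DistributedDataParallel wrapper prefixes every key with `module.`, which an unwrapped single-process model cannot load directly (the `nn.Linear` here is a stand-in, not the codec model):

```python
import torch.nn as nn

# Simulate a checkpoint saved from a DDP-wrapped model: every key gets a "module." prefix.
model = nn.Linear(4, 4)  # stand-in for the codec model
wrapped_state = {"module." + k: v for k, v in model.state_dict().items()}

result = model.load_state_dict(wrapped_state, strict=False)
print(result.missing_keys)  # ['weight', 'bias'] -- nothing matched the prefixed keys

# Stripping the prefix (or saving model.module's state dict in the first place) fixes it.
clean_state = {k.removeprefix("module."): v for k, v in wrapped_state.items()}
model.load_state_dict(clean_state)
```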
Uncommenting the OMP_NUM_THREADS line in launch.py can speed up training and also improve GPU utilization, because by default all cores are used (on machines with many cores, such as A100 hosts) and interaction across many cores can be costly. If 1 feels too small, the value can instead be set in front of train.sh (e.g. 8), as sketched below. Training on LibriTTS has not been tested. https://github.com/yangdongchao/AcademiCodec/blob/a496082fc2f7a324abb37fc3355487798dad2084/academicodec/models/encodec/distributed/launch.py#L34 Also see https://github.com/yangdongchao/SoundStorm/pull/34 (not yet verified in this repository).
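A minimal sketch of pinning the thread count from Python, equivalent to running `OMP_NUM_THREADS=8 bash train.sh`; the value 8 is only an illustrative example, not a tuned or tested setting:

```python
import os

# Cap the OpenMP thread count per worker process before torch/numpy are imported,
# so each DDP process does not spawn one thread per physical core on many-core hosts.
os.environ.setdefault("OMP_NUM_THREADS", "8")  # 8 is an example value, not a recommendation
```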