ParallelWaveGAN

discrete token for audio resynthesis

Open South-Twilight opened this issue 2 years ago • 0 comments

Here is the PR for audio resynthesis from discrete tokens:

  1. We extend hubert_voc1 to token_voc1 so that it can handle tokens from more models;
  2. We add f0 for training and inference, since we found poor pronunciation in singing without it;
  3. We add multi-stream methods, including residual clustering and weighted sum;
  4. Using the embedding features of the models is also allowed.
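To illustrate item 3, here is a minimal NumPy sketch of what residual clustering and a weighted sum over streams could look like. It is not the code from the PR: the function names (`kmeans`, `residual_cluster`, `reconstruct`, `weighted_sum`), the plain k-means quantizer, and the softmax weighting are all assumptions made for illustration only.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (hypothetical helper). Returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    # Final assignment against the updated centroids.
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return centroids, dists.argmin(1)


def residual_cluster(features, k, n_streams):
    """Quantize frames in stages: each stream clusters what the previous left over.

    features: (T, D) frame-level features (e.g. from a self-supervised model).
    Returns a list of (k, D) codebooks and a (T, n_streams) token array.
    """
    residual = features.copy()
    codebooks, tokens = [], []
    for s in range(n_streams):
        centroids, labels = kmeans(residual, k, seed=s)
        codebooks.append(centroids)
        tokens.append(labels)
        residual = residual - centroids[labels]  # pass the error to the next stream
    return codebooks, np.stack(tokens, axis=1)


def reconstruct(codebooks, tokens, n_streams):
    """Sum the centroids selected by the first n_streams token streams."""
    return sum(codebooks[s][tokens[:, s]] for s in range(n_streams))


def weighted_sum(stream_embeddings, logits):
    """Combine (n_streams, T, D) embeddings with softmax-normalized weights."""
    w = np.exp(logits - logits.max())
    w = w / w.sum()
    return np.tensordot(w, stream_embeddings, axes=1)  # -> (T, D)
```

In this sketch each frame ends up with one token per stream, and a vocoder could consume either the summed centroids or a weighted combination of per-stream embeddings; in a real model the weights would typically be learned rather than fixed.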

The following models have been validated in the opencpop recipe: HuBERT, XLS-R, WavLM, MERT, Encodec.

South-Twilight, Feb 02 '24 13:02