FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Hey, I'm getting the following error when I try to run preprocess.py. I know this issue has been addressed already, however I've tried the suggested solution to check my path...
Multispeaker
First I wanna say that I love this repo, it performs excellently with many of my small (down to 80 seconds) and noisy datasets, having good transfer learning. Multispeaker support...
If max_wav_value is 32768, running `wav = wav / max(abs(wav)) * max_wav_value` in preprocess_align.py may cause data overflow and introduce instant noise in the training wav,...
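A minimal sketch of the overflow described in this issue, assuming the scaled samples are later stored as signed 16-bit integers (range -32768 to 32767). The helper names `peak_normalize` and `out_of_int16_range` are hypothetical and not part of the repository; plain Python lists stand in for the waveform array.

```python
# Signed 16-bit PCM can only hold values in [-32768, 32767].
INT16_MIN, INT16_MAX = -32768, 32767

def peak_normalize(samples, max_wav_value):
    """Scale samples so the largest magnitude equals max_wav_value."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silent input: nothing to scale
    return [s / peak * max_wav_value for s in samples]

def out_of_int16_range(samples):
    """Return the scaled samples that would not fit in int16."""
    return [s for s in samples if not (INT16_MIN <= round(s) <= INT16_MAX)]

# Toy waveform whose largest-magnitude sample is positive.
wav = [0.0, 0.25, -0.5, 0.8]

scaled_32768 = peak_normalize(wav, 32768.0)
scaled_32767 = peak_normalize(wav, 32767.0)

# With 32768 the positive peak lands one step above INT16_MAX.
print(out_of_int16_range(scaled_32768))
# With 32767 every sample fits.
print(out_of_int16_range(scaled_32767))
```

In this sketch only a positive peak overflows, since -32768 is still representable. Scaling by 32767, or clipping before the integer cast, would be one way to avoid the problem, though the repository may handle it differently.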
How to create dataset?
Hi all, I trained the model on a custom dataset for the Sinhala language, using around 14 hours of data, for 100000 steps. After synthesizing, the output wav file gives a weird...
I can't find it on the [official website](http://challenge.ai.iqiyi.com/detail?raceId=5fb2688224954e0b48431fe0). Could you please share the source of the M2VOC dataset? Thanks a lot!
Hi, thanks for this great implementation. The **sampling rate** of my data is **44.1 kHz**, so what changes are required in the config file parameters to train the synthesizer?...
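One common approach to a question like this, sketched under assumptions: scale the STFT frame parameters by the sampling-rate ratio so each frame covers the same duration in milliseconds. The baseline values below (22050 Hz, `filter_length` 1024, `hop_length` 256, `win_length` 1024) and the key names are assumptions for illustration, not confirmed by this thread, and `scale_frame_params` is a hypothetical helper.

```python
def scale_frame_params(base, target_sr):
    """Scale frame sizes so their duration in seconds stays the same."""
    ratio = target_sr / base["sampling_rate"]
    scaled = {"sampling_rate": target_sr}
    for key in ("filter_length", "hop_length", "win_length"):
        scaled[key] = round(base[key] * ratio)
    return scaled

# Assumed baseline configuration (hypothetical values, see lead-in).
base = {
    "sampling_rate": 22050,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
}

scaled = scale_frame_params(base, 44100)
print(scaled)
```

A vocoder trained at the original sampling rate would most likely not match the rescaled mel-spectrograms, so a vocoder trained at 44.1 kHz would probably be needed as well. Downsampling the data to the baseline rate is the simpler alternative.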
I ran it on a PC (CPU only) and on Google Colab (Tesla 1000), and got the same error message: Traceback (most recent call last): File "synthesize.py", line 188, in model = get_model(args,...
How can I get dynamic input when converting a torch model to an ONNX model? I pass the input with dynamic_axes, but the output at inference is not dynamic. ### My code ```...
Thanks for the good work. While reading the code, one question keeps me from understanding it fully: why can the pitch_predictor predict pitch at different resolutions? I see...