FastSpeech2 issues

StandardScaler Error when running preprocess.py

Hey, I'm getting the following error when I try to run preprocess.py. I know this issue has been addressed already, however I've tried the suggested solution to check my path...

jay99de

Multispeaker

12

First I wanna say that I love this repo, it performs excellently with many of my small (down to 80 seconds) and noisy datasets, having good transfer learning. Multispeaker support...

ZDisket

The max_wav_value in preprocess.yaml should be smaller than 32768.0

4

If max_wav_value is 32768, running `wav = wav / max(abs(wav)) * max_wav_value` in preprocess_align.py may cause data overflow and introduce sudden bursts of noise into the training wav,...

Georgehappy1
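
The overflow described in the max_wav_value report can be reproduced with a small sketch. It assumes the scaled waveform is later cast to signed 16-bit integers before being written to disk, which the truncated preview does not show; the helper names `to_int16` and `scale_peak` are hypothetical and are not taken from the repository. The wrap is emulated with `ctypes` so the example needs only the standard library.

```python
import ctypes

def to_int16(sample: float) -> int:
    """Truncate to an integer and wrap it into the signed 16-bit range
    (two's complement), emulating an unchecked cast to int16."""
    return ctypes.c_int16(int(sample)).value

def scale_peak(samples, max_wav_value):
    """Peak-normalize, then scale so the loudest sample reaches max_wav_value.
    Mirrors the quoted line: wav / max(abs(wav)) * max_wav_value."""
    peak = max(abs(s) for s in samples)
    return [s / peak * max_wav_value for s in samples]

# Toy waveform whose peak is positive (assumption for illustration).
samples = [0.0, 0.25, -0.5, 1.0]

# With 32768.0 the positive peak lands one step past the int16 maximum (32767)
# and wraps around to the most negative value.
wrapped = [to_int16(s) for s in scale_peak(samples, 32768.0)]

# With 32767.0 every sample stays inside the representable range.
safe = [to_int16(s) for s in scale_peak(samples, 32767.0)]

print(wrapped[-1], safe[-1])
```

A full-scale positive sample flipping to full-scale negative is heard as a click, which would be consistent with the noise the report describes.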

Synthesizing outputs a weird sound

Hi all, I trained the model on a custom dataset for the Sinhala language, using around 14 hours of data, for 100000 steps. After synthesizing, the output wav file gives a weird...

DanojaDias

Where is M2VOC dataset?

I can't find it on the [official website](http://challenge.ai.iqiyi.com/detail?raceId=5fb2688224954e0b48431fe0). Could you please share the source of the M2VOC dataset? This is my email: [email protected]. Thanks a lot!

AmorFati-coder

New parameters for sampling_rate=44.1 kHz

1

Hi, thanks for this great implementation. The **sampling rate** of my data is **44.1 kHz**, so what changes are required in the parameters of the config files to train the synthesizer?...

Adibian

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False.

7

I ran it on a PC (CPU only) and on Google Colab (Tesla 1000), and got the same error message: Traceback (most recent call last): File "synthesize.py", line 188, in model = get_model(args,...

dsyrock
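
For the deserialization error above, a common workaround is to pass `map_location` to `torch.load` so that tensors saved from a GPU are remapped to whatever device is available. The sketch below is only an illustration: the function name `load_checkpoint_on_available_device`, the file name, and the checkpoint layout are hypothetical, and the repository's own `get_model` may load checkpoints differently.

```python
import os
import tempfile

import torch

def load_checkpoint_on_available_device(path):
    """Load a checkpoint, remapping its tensors to CUDA if present, else CPU."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # map_location tells torch.load where to place the stored tensors,
    # regardless of the device they were saved from.
    return torch.load(path, map_location=device), device

# Self-contained demo: save a tiny stand-in checkpoint, then load it back.
with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = os.path.join(tmp, "demo.pth.tar")  # hypothetical file name
    torch.save({"model": {"weight": torch.ones(2, 3)}}, ckpt_path)
    ckpt, device = load_checkpoint_on_available_device(ckpt_path)

loaded_weight = ckpt["model"]["weight"]
print(loaded_weight.device)
```

On a CPU-only machine this places every tensor on the CPU instead of raising the RuntimeError; whether it resolves the Colab case reported in the issue depends on details the truncated traceback does not show.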

How to convert FastSpeech2 to ONNX with dynamic input and output?

4

How can I get dynamic input when converting a torch model to an ONNX model? I pass the input with dynamic_axes, but the output at inference is not dynamic. ### My code ```...

Tian14267

About pitch_predictor of different resolutions

1

Thanks for the good work. While reading the code, one question keeps me from understanding it fully: why can the pitch_predictor predict pitch at different resolutions? I see...

JohnHerry
