
Weird noise introduced by `prepare_align.py`

Open hellolzc opened this issue 4 years ago • 1 comments

After preprocessing the data with prepare_align.py, a strange noise appears in the output audio files. The following figure shows the spectrum of the noise in the audio file in the demo directory.

[Image: spectrum of the noise in the audio file]

hellolzc, Jun 09 '21 12:06

The problem was solved by using soundfile instead of scipy.io.wavfile to save the audio.

Modify preprocessor/ljspeech.py as follows:

# from scipy.io import wavfile
import soundfile as sf

# ...

                wav = wav / max(abs(wav))
                sf.write(
                    os.path.join(out_dir, speaker, "{}.wav".format(base_name)),
                    wav,
                    sampling_rate,
                    subtype='PCM_16'
                )
                # wav = wav / max(abs(wav)) * max_wav_value
                # wavfile.write(
                #     os.path.join(out_dir, speaker, "{}.wav".format(base_name)),
                #     sampling_rate,
                #     wav.astype(np.int16),
                # )
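
One plausible explanation for the noise, not confirmed in this thread: after peak normalization the loudest sample is exactly 1.0 or -1.0, and multiplying by max_wav_value can produce a value that int16 cannot hold. The sketch below assumes max_wav_value is 32768.0 (check your preprocess config), and the helper names peak_normalize, exceeds_int16 and to_int16_safe are hypothetical, made up for illustration. It shows how to detect the out-of-range case and how to convert safely if you prefer to keep scipy.io.wavfile.

```python
import numpy as np

INT16_MAX = np.iinfo(np.int16).max  # 32767
INT16_MIN = np.iinfo(np.int16).min  # -32768


def peak_normalize(wav):
    """Scale a float waveform so its largest absolute sample is 1.0."""
    wav = np.asarray(wav, dtype=np.float64)
    peak = np.max(np.abs(wav))
    return wav if peak == 0 else wav / peak


def exceeds_int16(wav, max_wav_value=32768.0):
    """Return True if scaling by max_wav_value leaves the int16 range.

    max_wav_value=32768.0 is an assumed default, not taken from this thread.
    """
    scaled = peak_normalize(wav) * max_wav_value
    return bool(scaled.max() > INT16_MAX or scaled.min() < INT16_MIN)


def to_int16_safe(wav):
    """Scale by 32767 and clip before casting, so no sample can overflow."""
    scaled = np.round(peak_normalize(wav) * INT16_MAX)
    return np.clip(scaled, INT16_MIN, INT16_MAX).astype(np.int16)


# Loudest sample is positive: 1.0 * 32768 does not fit in int16.
positive_peak = np.array([0.0, 0.25, 0.5, -0.3])
# Loudest sample is negative: -1.0 * 32768 still fits in int16.
negative_peak = np.array([-0.5, 0.2])

print(exceeds_int16(positive_peak), exceeds_int16(negative_peak))
print(to_int16_safe(positive_peak))
```

Under this assumption, only files whose loudest sample is positive would be affected, which may explain why the noise does not show up in every file. Passing the float array to sf.write with subtype='PCM_16', as in the fix above, avoids the manual cast altogether.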

hellolzc, Jun 09 '21 12:06