Something about extract_mel_spectrogram_for_tts in datapipeline.py

Open RobinWitch opened this issue 2 years ago • 0 comments

If I use librosa.feature.melspectrogram to extract a mel spectrogram from the raw wav, the result is quite different from what extract_mel_spectrogram_for_tts in datapipeline.py produces.

For example, using the same settings and the same wav data:

mel_spec1 = extract_mel_spectrogram_for_tts( ... )
mel_spec2 = librosa.feature.melspectrogram( ... )

The results for one random example:

mel_spec1.max=0.08053161389832361
mel_spec1.min=0.0
mel_spec1.mean=0.02656768943313912
mel_spec1.std=0.021109905037463596

mel_spec2.max=209.5188483174194
mel_spec2.min=0.0
mel_spec2.mean=0.25262605107753267
mel_spec2.std=2.8514387417932685
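
For anyone who wants to reproduce the comparison, the statistics can be gathered with a small helper along these lines. This is only a sketch: `summarize` and `print_stats` are hypothetical names introduced here, not part of datapipeline.py or librosa, and the only assumption is that each spectrogram can be converted to a NumPy array.

```python
import numpy as np


def summarize(mel):
    """Return max, min, mean and std over every element of a spectrogram."""
    # Hypothetical helper: works for any array-like of numbers.
    mel = np.asarray(mel, dtype=np.float64)
    return {
        "max": float(mel.max()),
        "min": float(mel.min()),
        "mean": float(mel.mean()),
        "std": float(mel.std()),  # population std (ddof=0), NumPy's default
    }


def print_stats(name, mel):
    """Print the statistics in the same name.key=value form used in this issue."""
    for key, value in summarize(mel).items():
        print(f"{name}.{key}={value}")


# Usage, assuming mel_spec1 and mel_spec2 were computed as shown earlier:
# print_stats("mel_spec1", mel_spec1)
# print_stats("mel_spec2", mel_spec2)
```

Printing both spectrograms through the same helper ensures the numbers are computed identically, so any gap comes from the extraction functions rather than from how the statistics were taken.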

Could someone help explain the reason for this difference?

Aug 21 '23 15:08 RobinWitch