AttributeError: 'StandardScaler' object has no attribute 'mean_'
Hi, after running python3 preprocess.py config/LibriTTS/preprocess.yaml, I get the following error:
line 96, in build_from_path
pitch_mean = pitch_scaler.mean_[0]
AttributeError: 'StandardScaler' object has no attribute 'mean_'
I have searched Stack Overflow and tried to debug this, but without success.
Can anyone please help?
@ming024 when you trained on the LibriTTS dataset, did you set the sampling rate to 22050 Hz or keep it at 24000 Hz?
You should make sure that self.process_utterance(speaker, basename) actually worked. If it did not, neither of the *_scaler objects ever called partial_fit().
@phamlehuy53 hi, thank you for the response. How can I ensure that self.process_utterance(speaker, basename) works?
Take a look at this:
if os.path.exists(tg_path):
    ret = self.process_utterance(speaker, basename)
    if ret is None:
        continue
    else:
        info, pitch, energy, n = ret
    out.append(info)
if len(pitch) > 0:
    pitch_scaler.partial_fit(pitch.reshape((-1, 1)))
if len(energy) > 0:
    energy_scaler.partial_fit(energy.reshape((-1, 1)))
If process_utterance returns None for all of your data (my guess is that you set the wrong paths), the statement pitch_scaler.partial_fit(pitch.reshape((-1, 1))) is never reached. So the *_scaler objects are never fitted and therefore have no mean_ attribute.
You can check it here
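To make this failure easier to diagnose, a small guard like the one below could be called before the statistics are read. This is only a sketch: require_fitted is a hypothetical helper, not part of the repository. It relies on the fact that scikit-learn's StandardScaler creates mean_ and scale_ only after fit() or partial_fit() has run at least once.

```python
def require_fitted(scaler, name):
    """Return (mean, std) of a fitted scaler, or fail with a clear message.

    `require_fitted` is a hypothetical helper for illustration only.
    """
    # mean_ and scale_ exist only after fit() or partial_fit() has been called.
    if not hasattr(scaler, "mean_"):
        raise RuntimeError(
            f"{name} scaler was never fitted: no utterance reached partial_fit(). "
            "Check the dataset paths and that the TextGrid files exist."
        )
    return scaler.mean_[0], scaler.scale_[0]
```

In build_from_path, the line pitch_mean = pitch_scaler.mean_[0] could then become pitch_mean, pitch_std = require_fitted(pitch_scaler, "pitch"), assuming the standard deviation is read from scale_ at the same place.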
@phamlehuy53 Hey! process_utterance does return a result for me (the files are processed), but the "no attribute 'mean_'" error still persists ;/
The sampling rate is 24000 Hz for this dataset. Do you reckon this rate could cause the error?
I got the same issue when running "python3 preprocess.py config/AISHELL3/preprocess.yaml" on my updated AISHELL3 dataset (one more speaker was added).
My guess is that you need to run prepare_align.py first, so that the relevant *.wav and *.lab files are generated in the raw_data/LibriTTS directory.
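A quick way to verify this is to count the files that are actually there. The sketch below assumes one subdirectory per speaker under raw_data/LibriTTS (that layout is an assumption, adjust it if yours differs), and count_corpus_files is a hypothetical helper name.

```python
import os


def count_corpus_files(corpus_dir):
    """Return {speaker: (n_wav, n_lab)} for each speaker subdirectory.

    Assumes the layout <corpus_dir>/<speaker>/<basename>.wav|.lab.
    """
    counts = {}
    for speaker in sorted(os.listdir(corpus_dir)):
        speaker_dir = os.path.join(corpus_dir, speaker)
        if not os.path.isdir(speaker_dir):
            continue  # skip stray files at the top level
        names = os.listdir(speaker_dir)
        n_wav = sum(name.endswith(".wav") for name in names)
        n_lab = sum(name.endswith(".lab") for name in names)
        counts[speaker] = (n_wav, n_lab)
    return counts


# Example usage (path taken from the comment above):
# for speaker, (n_wav, n_lab) in count_corpus_files("raw_data/LibriTTS").items():
#     print(speaker, n_wav, n_lab)
```

An empty result, or speakers whose two counts differ, would suggest that prepare_align.py did not run or did not finish.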
In my case, working with the AISHELL3 dataset, I ran into the same problem. I found that the MFA output TextGrid has two sections; one of them, "phones", runs into the "start >= end" condition. So I changed line 165 in preprocessor/preprocessor.py to textgrid.get_tier_by_name("words"), and it works. Hope this helps you guys.
If it runs successfully after that change, check whether the phones section of your TextGrid file consists entirely of spn, which may indicate a problem with the official MFA phonetic dictionary. After I switched to "words", this step passed, but there were still problems later on. I checked and found that my TextGrid files had been generated incorrectly.
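The spn check can be scripted once the phone labels have been read from the TextGrid. A minimal sketch follows; spn_fraction is a hypothetical helper, and how you obtain the labels depends on your TextGrid reader (the comment shows one possibility that assumes the tgt package).

```python
def spn_fraction(labels):
    """Fraction of non-empty phone labels that are exactly 'spn'."""
    spoken = [label.strip() for label in labels if label.strip()]
    if not spoken:
        return 0.0  # nothing to judge: the tier is empty or only silence
    return sum(label == "spn" for label in spoken) / len(spoken)


# Possible way to get the labels, assuming the tgt package is installed:
#   import tgt
#   tier = tgt.io.read_textgrid(tg_path).get_tier_by_name("phones")
#   labels = [interval.text for interval in tier.intervals]
#   print(spn_fraction(labels))
```

A value close to 1.0 across many files would point to the dictionary problem described above rather than to the preprocessing code.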
For reference only: https://github.com/ming024/FastSpeech2/issues/188