FastSpeech2

AttributeError: 'StandardScaler' object has no attribute 'mean_'

Open wanwen1405 opened this issue 4 years ago • 10 comments

Hi, after running python3 preprocess.py config/LibriTTS/preprocess.yaml, I get this error:

line 96, in build_from_path
    pitch_mean = pitch_scaler.mean_[0]
AttributeError: 'StandardScaler' object has no attribute 'mean_'

I searched Stack Overflow and tried to debug it, but without success.

Can anyone please help?

wanwen1405 avatar Aug 11 '21 07:08 wanwen1405

@ming024 when you trained on the LibriTTS dataset, did you set the sampling rate to 22050 Hz or keep it at 24000 Hz?

wanwen1405 avatar Aug 11 '21 09:08 wanwen1405

You should make sure that self.process_utterance(speaker, basename) actually worked. If it did not, neither of the '_scaler' objects ever called partial_fit().
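The point above can be turned into an explicit guard. This is a minimal sketch, assuming the scaler is a scikit-learn StandardScaler (or any object that only gains mean_ and scale_ after a fit). The function name scaler_stats is hypothetical and is not part of the repository.

```python
def scaler_stats(scaler, name="pitch"):
    """Return (mean, std) from a fitted scaler, or fail with a clear message.

    A scikit-learn StandardScaler only has `mean_` and `scale_` after
    fit() or partial_fit() has been called at least once.
    """
    if not hasattr(scaler, "mean_"):
        # Hypothetical, more descriptive replacement for the bare AttributeError
        raise RuntimeError(
            f"{name} scaler was never fitted: no utterance reached partial_fit(). "
            "Check that process_utterance() returns data for at least one file."
        )
    return float(scaler.mean_[0]), float(scaler.scale_[0])
```

Calling something like scaler_stats(pitch_scaler, "pitch") in place of reading pitch_scaler.mean_[0] directly would make the failure point at the real cause rather than at the missing attribute.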

huypl53 avatar Aug 13 '21 07:08 huypl53

@phamlehuy53 hi, thank you for the response. How can I ensure that self.process_utterance(speaker, basename) works?

wanwen1405 avatar Aug 13 '21 10:08 wanwen1405

Take a look at this:

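# Excerpt from the per-file loop in build_from_path
if os.path.exists(tg_path):
    ret = self.process_utterance(speaker, basename)
    if ret is None:
        # Utterance was skipped, so nothing is accumulated for this file
        continue
    else:
        info, pitch, energy, n = ret
    out.append(info)

# The scalers are only updated when an utterance produced values
if len(pitch) > 0:
    pitch_scaler.partial_fit(pitch.reshape((-1, 1)))
if len(energy) > 0:
    energy_scaler.partial_fit(energy.reshape((-1, 1)))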

If process_utterance returns None for all of your data (my guess is that you set the wrong paths), the statement pitch_scaler.partial_fit(pitch.reshape((-1, 1))) is never reached. The *_scaler objects are then never fitted, so they have no mean_ attribute.

You can check it here
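One way to test the wrong-paths guess is to count how many .wav files have a matching TextGrid file. This is only a sketch: the directory layout in the docstring is an assumption based on the tg_path check in the excerpt above, and count_textgrid_matches is a hypothetical helper, not a function from the repository.

```python
import os


def count_textgrid_matches(in_dir, out_dir):
    """Count .wav files that do / do not have a matching TextGrid file.

    Assumed layout (adjust to your own setup):
      in_dir/<speaker>/<basename>.wav
      out_dir/TextGrid/<speaker>/<basename>.TextGrid
    """
    found, missing = 0, []
    for speaker in sorted(os.listdir(in_dir)):
        speaker_dir = os.path.join(in_dir, speaker)
        if not os.path.isdir(speaker_dir):
            continue  # ignore stray files next to the speaker folders
        for wav_name in sorted(os.listdir(speaker_dir)):
            if not wav_name.endswith(".wav"):
                continue
            basename = os.path.splitext(wav_name)[0]
            tg_path = os.path.join(
                out_dir, "TextGrid", speaker, basename + ".TextGrid"
            )
            if os.path.exists(tg_path):
                found += 1
            else:
                missing.append(tg_path)
    return found, missing
```

If found comes back as 0, process_utterance is never called for any file, so the paths in the preprocessing config are the first thing to rule out.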

huypl53 avatar Aug 14 '21 01:08 huypl53

@phamlehuy53 Hey! process_utterance does return a result for me (the files are processed), but the missing mean_ attribute error still persists ;/

The sampling rate is 24000 Hz for this dataset. Do you think this rate could be causing the error?

wanwen1405 avatar Aug 16 '21 03:08 wanwen1405

I got the same issue when I ran "python3 preprocess.py config/AISHELL3/preprocess.yaml" on my updated AISHELL3 dataset (one more speaker was added).

everschen avatar May 17 '22 03:05 everschen

My guess is that you should run prepare_align.py first to generate the relevant *.wav and *.lab files in the raw_data/LibriTTS directory.

Aiden-song avatar May 20 '22 07:05 Aiden-song

In my case, working with the AISHELL3 dataset, I ran into the same problem. I found that the MFA output TextGrid has two tiers, and the "phones" tier runs into the "start >= end" condition. I changed line 165 in preprocessor/preprocessor.py to textgrid.get_tier_by_name("words"), and it works. Hope this helps.

huiofficial avatar Nov 04 '22 08:11 huiofficial

If preprocessing runs successfully after switching to the "words" tier, check whether the phones in your TextGrid files are all spn, which may point to a problem with the official MFA pronunciation dictionary. After I changed to "words", this step passed, but there were still problems later. I checked and found that my TextGrid generation was at fault.

For reference only: https://github.com/ming024/FastSpeech2/issues/188
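To illustrate why an all-spn "phones" tier can trigger the "start >= end" condition, here is a simplified sketch. It is not the repository's code. It assumes the preprocessor ignores silence labels (sil, sp, spn) at both ends of the tier and skips the utterance when the remaining span is empty. The names trimmed_span and is_usable are hypothetical.

```python
SILENCE_LABELS = ("sil", "sp", "spn")


def trimmed_span(intervals, silence=SILENCE_LABELS):
    """Return (start, end) of the speech span once silence is ignored.

    `intervals` is a list of (start_time, end_time, label) tuples, for
    example read from the "phones" tier of a TextGrid. If every label is
    a silence label, the result stays at (0.0, 0.0), so start >= end.
    """
    start, end = 0.0, 0.0
    seen_speech = False
    for s, e, label in intervals:
        if label in silence:
            continue
        if not seen_speech:
            start = s  # first non-silence phone marks the start
            seen_speech = True
        end = e  # last non-silence phone marks the end
    return start, end


def is_usable(intervals):
    """True when the trimmed span is non-empty (start < end)."""
    start, end = trimmed_span(intervals)
    return start < end
```

Under these assumptions, a tier such as [(0.0, 0.5, "spn"), (0.5, 1.2, "spn")] is not usable, so every such utterance would be skipped and the scalers would never be fitted.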

tuntun990606 avatar Feb 11 '23 08:02 tuntun990606