kaggle_speech_recognition

AttributeError: 'NoneType' object has no attribute 'groups'

Open 20206666 opened this issue 5 years ago • 8 comments

After installing the project, I ran into an error that I cannot solve when running train.py. The full output is as follows:

(test) C:\Users\USER\Desktop\kaggle_speech_recognition-master>python train.py
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
2021-01-19 21:29:54.173337: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
DEBUG:tensorflow:fold_idx=1, MODEL_FLAGS=Namespace(dropout_keep_prob=0.5, file_chars='map_chars.txt', file_words='map_words.txt', frame_size_ms=30.0, frame_stride_ms=10.0, l2_scale=0, lr_drate=0.3, lr_init=0.0002, num_key_words=10, num_mel_bins=46, pad_ms=140, sample_rate=16000, target_duration_ms=1140)
DEBUG:tensorflow:num_words=30, num_chars=28, max_label_length=4
C:\Users\USER\Desktop\kaggle_speech_recognition-master\util_data.py:326: WavFileWarning: Chunk (non-data) not understood, skipping it.
  _, audio = scipy.io.wavfile.read(wav_file)
Traceback (most recent call last):
  File "train.py", line 191, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 43, in main
    FLAGS.bg_noise_prob, FLAGS.bg_nsr)
  File "C:\Users\USER\Desktop\kaggle_speech_recognition-master\audio_dataset.py", line 66, in load_datasets
    num_folds, fold_idx)
  File "C:\Users\USER\Desktop\kaggle_speech_recognition-master\util_data.py", line 123, in split_datasets
    divider = set_divider(data_dir, key_words, num_folds)
  File "C:\Users\USER\Desktop\kaggle_speech_recognition-master\util_data.py", line 59, in set_divider
    speaker = reg.search(wav).groups()[0].lower()
AttributeError: 'NoneType' object has no attribute 'groups'

20206666, Jan 19 '21 13:01

Hi, the error at speaker = reg.search(wav).groups()[0].lower() (AttributeError: 'NoneType' object has no attribute 'groups') means that reg.search(wav) returned None, i.e. the regex pattern the code looks for was not found in your wav file name.

kezakool, Jan 20 '21 15:01
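
To make the failure easier to diagnose, the regex lookup can be guarded so that it reports the offending file name instead of crashing on None. The following is only a minimal sketch: the pattern SPEAKER_RE and the helper speaker_id are hypothetical, since the actual pattern used in util_data.py is not shown in this thread. It assumes file names of the form <speaker-hash>_nohash_<n>.wav.

```python
import re

# Hypothetical pattern (assumption): the real regex in util_data.py is not
# shown in this thread. It does not depend on "/" or "\" path separators.
SPEAKER_RE = re.compile(r"([0-9a-fA-F]+)_nohash_\d+\.wav$")


def speaker_id(wav_path):
    """Return the lower-cased speaker id, or raise an error naming the file."""
    match = SPEAKER_RE.search(wav_path)
    if match is None:
        # re.search returns None when nothing matches, which is what
        # causes the AttributeError when .groups() is called directly.
        raise ValueError(f"no speaker id found in file name: {wav_path!r}")
    return match.groups()[0].lower()
```

Calling speaker_id on each path before the split would show which file fails to match, which is usually enough to tell whether the problem is the file naming or the path format.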

@kezakool How should I adjust the wav files?

20206666, Jan 21 '21 10:01

If you don't want to modify the code, check that the dataset folder in your config points to the right place and that all your dataset files follow the same structure as the Google Speech Commands dataset. Hope it helps.

kezakool, Jan 21 '21 10:01
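
One way to run the check suggested above is to scan the dataset folder and list every wav file whose name does not follow the expected convention. This is a rough sketch under assumptions: the naming rule <speaker-hash>_nohash_<n>.wav and the helper find_unmatched_wavs are illustrative and not taken from the repository, and the folder passed in has to be whatever your config actually uses.

```python
import os
import re

# Assumed naming convention (not confirmed by the repository):
# <speaker-hash>_nohash_<n>.wav
NAME_RE = re.compile(r"^[0-9a-f]+_nohash_\d+\.wav$", re.IGNORECASE)


def find_unmatched_wavs(data_dir):
    """Return sorted paths of .wav files under data_dir that break NAME_RE."""
    unmatched = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.lower().endswith(".wav") and not NAME_RE.match(name):
                unmatched.append(os.path.join(root, name))
    return sorted(unmatched)
```

If the returned list is empty, the file names are probably fine and the mismatch is more likely in how the paths are built or matched, for example the path separator on Windows.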

@kezakool I just ran download_dataset.sh and then python train.py. How can I adjust it? Thank you.

20206666, Jan 21 '21 11:01

@20206666 What does the output look like when you run download_dataset.sh?

huschen, Jan 24 '21 13:01

@huschen When I run download_dataset.sh, two more folders appear: dataset and run.

20206666, Jan 25 '21 09:01

Are there any wav files downloaded in the dataset folder?

huschen, Jan 25 '21 20:01

@huschen Yes, it downloaded a lot of wav files.

20206666, Jan 26 '21 11:01