kaggle_speech_recognition AttributeError: 'NoneType' object has no attribute 'groups'

After installing the package, I ran into an error I cannot solve. The output is as follows:

(test) C:\Users\USER\Desktop\kaggle_speech_recognition-master>python train.py
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\framework\dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
2021-01-19 21:29:54.173337: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
DEBUG:tensorflow:fold_idx=1, MODEL_FLAGS=Namespace(dropout_keep_prob=0.5, file_chars='map_chars.txt', file_words='map_words.txt', frame_size_ms=30.0, frame_stride_ms=10.0, l2_scale=0, lr_drate=0.3, lr_init=0.0002, num_key_words=10, num_mel_bins=46, pad_ms=140, sample_rate=16000, target_duration_ms=1140)
DEBUG:tensorflow:num_words=30, num_chars=28, max_label_length=4
C:\Users\USER\Desktop\kaggle_speech_recognition-master\util_data.py:326: WavFileWarning: Chunk (non-data) not understood, skipping it.
  _, audio = scipy.io.wavfile.read(wav_file)
Traceback (most recent call last):
  File "train.py", line 191, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "C:\Users\USER.conda\envs\test\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 43, in main
    FLAGS.bg_noise_prob, FLAGS.bg_nsr)
  File "C:\Users\USER\Desktop\kaggle_speech_recognition-master\audio_dataset.py", line 66, in load_datasets
    num_folds, fold_idx)
  File "C:\Users\USER\Desktop\kaggle_speech_recognition-master\util_data.py", line 123, in split_datasets
    divider = set_divider(data_dir, key_words, num_folds)
  File "C:\Users\USER\Desktop\kaggle_speech_recognition-master\util_data.py", line 59, in set_divider
    speaker = reg.search(wav).groups()[0].lower()
AttributeError: 'NoneType' object has no attribute 'groups'

Jan 19 '21 13:01 20206666

Hi. The error raised at speaker = reg.search(wav).groups()[0].lower(), AttributeError: 'NoneType' object has no attribute 'groups', means that reg.search(wav) returned None: the regex pattern the code is looking for was not found in one of your wav file names.
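To illustrate the point in this reply: re.search returns None when nothing matches, so calling .groups() on the result fails with exactly this AttributeError. Below is a minimal sketch of a guarded lookup that reports the offending file instead. The pattern and the helper name speaker_id are assumptions made for illustration (the real regex lives in util_data.py and may differ); the sketch only assumes file names of the form <speaker>_nohash_<n>.wav, as used by the Google Speech Commands dataset.

```python
import ntpath
import re

# Hypothetical pattern, NOT the one from util_data.py: it assumes file
# names such as "0a7c2a8d_nohash_0.wav".
SPEAKER_RE = re.compile(r"^([0-9a-zA-Z]+)_nohash_")


def speaker_id(wav_path):
    """Return the lower-cased speaker id, or raise a descriptive error."""
    # ntpath.basename splits on both "/" and "\", so Windows-style and
    # POSIX-style paths are handled the same way.
    name = ntpath.basename(wav_path)
    match = SPEAKER_RE.match(name)
    if match is None:
        # Without this check, match.groups() would raise
        # AttributeError: 'NoneType' object has no attribute 'groups'.
        raise ValueError("no speaker id found in file name: %r" % wav_path)
    return match.group(1).lower()
```

With a check like this, the traceback names the file that does not fit the expected naming scheme, which is usually the quickest way to find out what is wrong with the dataset folder.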

Jan 20 '21 15:01 kezakool

@kezakool
How should I adjust the wav files?

Jan 21 '21 10:01 20206666

If you don't want to modify the code, check that the dataset folder in your config is set correctly and that all your dataset files follow the same structure as the Google Speech Commands dataset. Hope it helps.
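One way to run the check suggested here is to walk the dataset folder and list every wav file whose name does not follow the expected scheme. This is only a sketch under assumptions: the naming pattern (<speaker>_nohash_<n>.wav) and the function name find_unmatched_wavs are illustrative and are not taken from the repository.

```python
import os
import re

# Assumed naming scheme, e.g. "0a7c2a8d_nohash_0.wav"; adjust it to
# whatever pattern util_data.py actually uses.
NAME_RE = re.compile(r"^[0-9a-zA-Z]+_nohash_\d+\.wav$")


def find_unmatched_wavs(data_dir, pattern=NAME_RE):
    """Return a sorted list of .wav paths whose file name does not match."""
    unmatched = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            # Only wav files are of interest; other files are ignored.
            if name.lower().endswith(".wav") and not pattern.search(name):
                unmatched.append(os.path.join(root, name))
    return sorted(unmatched)
```

Calling find_unmatched_wavs on the dataset folder and printing the result shows which files would make the speaker lookup fail. Background-noise recordings, if they sit in the same tree, are expected to appear in this list because they are not named after a speaker.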

Jan 21 '21 10:01 kezakool

@kezakool I just ran download_dataset.sh and then python train.py. How can I adjust this? Thank you.

Jan 21 '21 11:01 20206666

@20206666 What does the output look like when you run download_dataset.sh?

Jan 24 '21 13:01 huschen

@huschen When I run download_dataset.sh, two new folders appear: dataset and run.

Jan 25 '21 09:01 20206666

Are there any wav files downloaded in the dataset folder?

Jan 25 '21 20:01 huschen

@huschen Yes, a lot of wav files were downloaded.

Jan 26 '21 11:01 20206666