
KeyError: 'scale_invariant_signal_noise_ratio' in test_epoch function

Open YuanxinGuo opened this issue 1 year ago • 4 comments

I'm encountering an issue while training the Waveformer model. When I run the following command:

python -W ignore -m src.training.train /home/swufe1/project/Waveformer/experiments/dcc_tf_ckpt_E256_10_D256_1 --use_cuda

I receive the following error message:

2024-10-31 10:27:40.500632: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
Imported the model from 'src.training.dcc_tf'.
Loading train dataset: fg_dir=data/FSDSoundScapes/FSDKaggle2018/train bg_dir=data/FSDSoundScapes/TAU-acoustic-sounds/TAU-urban-acoustic-scenes-2019-development
Loaded train dataset at data/FSDSoundScapes containing 50000 elements
Loading val dataset: fg_dir=data/FSDSoundScapes/FSDKaggle2018/val bg_dir=data/FSDSoundScapes/TAU-acoustic-sounds/TAU-urban-acoustic-scenes-2019-development
Loaded test dataset at data/FSDSoundScapes containing 5000 elements
Using CUDA devices: [0, 1, 2, 3]
Using data parallel model
Initializing optimizer with {'lr': 0.0005, 'weight_decay': 0.0}
Learning rates initialized to: {group 0: params=1.61230M lr=5.0E-04}
Initialized LR scheduler with params: fix_lr_epochs=50 {'mode': 'max', 'factor': 0.1, 'patience': 5, 'min_lr': 5e-06, 'threshold': 0.1, 'threshold_mode': 'abs'}
Epoch 0:
Train: 100%|███████████████████████████████████| 3125/3125 [1:19:01<00:00, 1.52s/it, loss=-0.93717]
Train: _signal_noise_ratio=7.7241 _scale_invariant_signal_noise_ratio=2.0061 loss=-0.7130
Test: 0%| | 0/79 [00:49<?, ?it/s]
Traceback (most recent call last):
  File "/home/swufe1/project/Waveformer/src/training/train.py", line 200, in train
    curr_test_metrics = test_epoch(model, device, val_loader,
  File "/home/swufe1/project/Waveformer/src/training/eval.py", line 75, in test_epoch
    tensorboard_add_metrics(
  File "/home/swufe1/project/Waveformer/src/training/synthetic_dataset.py", line 162, in tensorboard_add_metrics
    vals = np.asarray(metrics['scale_invariant_signal_noise_ratio'])
KeyError: 'scale_invariant_signal_noise_ratio'

Could you please provide any suggestions on how to resolve this issue? Thank you very much!

YuanxinGuo, Oct 31 '24 06:10

Hello, I encountered the same issue while running the code. May I ask if you have solved this problem? Thank you very much.

000WSW, Oct 31 '24 09:10

Sorry, I haven't resolved this issue yet. I modified line 162 of synthetic_dataset.py to:

    vals = np.asarray(metrics['_scale_invariant_signal_noise_ratio'])

The program successfully trains for one epoch, but during the second epoch, it still reports the error:

Test: 100%|█████████████████████████████████████████████████████████| 79/79 [08:12<00:00, 6.23s/it]
Test: _signal_noise_ratio=8.0362 _scale_invariant_signal_noise_ratio=3.4833 loss=-1.2454 runtime=82.2936
Traceback (most recent call last):
  File "/home/swufe1/project/Waveformer/src/training/train.py", line 229, in train
    if max(val_metrics[base_metric]) == val_metrics[base_metric][-1]:
KeyError: 'scale_invariant_signal_noise_ratio'
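Both tracebacks look the metric up under a name without the leading underscore, while the logged metrics carry one (`_scale_invariant_signal_noise_ratio`). Below is a minimal sketch of a lookup that accepts either spelling, so every lookup site could share one rule instead of being patched one at a time. `resolve_metric_key` is a hypothetical helper and not part of the Waveformer code; the sample values are copied from the log in this thread, and the dict-of-lists shape is inferred from the `max(val_metrics[base_metric])` line in the traceback.

```python
def resolve_metric_key(metrics, name):
    """Return the key in `metrics` matching `name`, ignoring leading underscores.

    Hypothetical helper for illustration; not part of the Waveformer repository.
    """
    if name in metrics:
        return name
    wanted = name.lstrip("_")
    for key in metrics:
        if key.lstrip("_") == wanted:
            return key
    raise KeyError(f"{name!r} not found; available keys: {sorted(metrics)}")


# Validation metrics shaped like the values printed in the log (one entry per epoch).
val_metrics = {
    "_signal_noise_ratio": [8.0362],
    "_scale_invariant_signal_noise_ratio": [3.4833],
    "loss": [-1.2454],
}

key = resolve_metric_key(val_metrics, "scale_invariant_signal_noise_ratio")
print(key, val_metrics[key][-1])
```

This only works around the naming mismatch. Why the metric names gain a leading underscore in this environment is not established in this thread.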

YuanxinGuo, Oct 31 '24 11:10

Try setting "base_metric": "_scale_invariant_signal_noise_ratio" in the config.json file in your experiment directory.
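The idea is that `base_metric` has to match a key the evaluation code actually produces. Here is a rough sketch of a check that could run before training and fail with a readable message instead of a bare KeyError an hour in. It assumes `base_metric` is a top-level entry in config.json (the file itself is not shown in this thread), and `check_base_metric` is a hypothetical helper, not something the repository provides.

```python
import json


def check_base_metric(config_text, metric_keys):
    """Return base_metric from a JSON config, or raise if it is not a known metric key.

    Assumes "base_metric" is a top-level entry of the config (an assumption).
    """
    config = json.loads(config_text)
    base_metric = config["base_metric"]
    if base_metric not in metric_keys:
        raise ValueError(
            f"base_metric={base_metric!r} is not among the logged metrics: "
            f"{sorted(metric_keys)}"
        )
    return base_metric


# Metric names exactly as they appear in the training log of this thread.
metric_keys = ["_signal_noise_ratio", "_scale_invariant_signal_noise_ratio", "loss"]

config_text = '{"base_metric": "_scale_invariant_signal_noise_ratio"}'
print(check_base_metric(config_text, metric_keys))
```

In practice `config_text` would be read from the config.json inside the experiment directory passed to the training command.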

000WSW, Nov 07 '24 07:11

Unfortunately, I still get the same error after making the change you suggested. Thanks again for your answer!

YuanxinGuo, Nov 07 '24 10:11