wespeaker Is the evaluation metric for training not loss or accuracy, but rather EER?

Hi, WeSpeaker team. I followed your configuration and got the loss and accuracy shown below, but the accuracy seems too low to me. I searched the issues and found a similar report: https://github.com/wenet-e2e/wespeaker/issues/165

According to that issue, the most important thing to check is EER, and loss and accuracy matter less during training. Is that right? I would like to know what loss and accuracy you see at epoch 150 for ResNet34 in your own training runs. Could you also look at my training log and tell me whether training is proceeding normally? Thank you!

My training setup is as follows. Dataset: VoxCeleb. Model: ResNet34.

    [ INFO : 2024-12-24 21:51:06,518 ] - | 149| 1000| 0.01728| 0.2| 1.4961| 71.647|
    [ INFO : 2024-12-24 21:51:06,528 ] - | 149| 1000| 0.01728| 0.2| 1.4996| 71.495|
    [ INFO : 2024-12-24 21:51:06,536 ] - | 149| 1000| 0.01728| 0.2| 1.4984| 71.522|
    [ INFO : 2024-12-24 21:51:06,541 ] - | 149| 1000| 0.01728| 0.2| 1.5064| 71.708|
    [ INFO : 2024-12-24 21:51:06,549 ] - | 149| 1000| 0.01728| 0.2| 1.4848| 71.882|
    [ INFO : 2024-12-24 21:51:47,574 ] - | 149| 1066| 0.017247| 0.2| 1.5084| 71.395|
    [ INFO : 2024-12-24 21:51:47,630 ] - | 149| 1066| 0.017247| 0.2| 1.4973| 71.537|
    [ INFO : 2024-12-24 21:51:47,815 ] - | 149| 1066| 0.017247| 0.2| 1.4936| 71.602|
    [ INFO : 2024-12-24 21:51:47,855 ] - | 149| 1066| 0.017247| 0.2| 1.4988| 71.583|
    [ INFO : 2024-12-24 21:51:47,857 ] - | 149| 1066| 0.017247| 0.2| 1.4966| 71.579|
    [ INFO : 2024-12-24 21:51:47,860 ] - | 149| 1066| 0.017247| 0.2| 1.4868| 71.852|
    [ INFO : 2024-12-24 21:51:47,862 ] - | 149| 1066| 0.017247| 0.2| 1.5065| 71.697|
    [ INFO : 2024-12-24 21:51:47,884 ] - | 149| 1066| 0.017247| 0.2| 1.4921| 71.735|
    [ INFO : 2024-12-24 21:53:27,802 ] - | 150| 100| 0.017198| 0.2| 1.5211| 71.57|
    [ INFO : 2024-12-24 21:53:27,803 ] - | 150| 100| 0.017198| 0.2| 1.4884| 71.703|
    [ INFO : 2024-12-24 21:53:27,813 ] - | 150| 100| 0.017198| 0.2| 1.4869| 71.93|
    [ INFO : 2024-12-24 21:53:27,826 ] - | 150| 100| 0.017198| 0.2| 1.4878| 71.688|
    [ INFO : 2024-12-24 21:53:27,834 ] - | 150| 100| 0.017198| 0.2| 1.4782| 72.234|
    [ INFO : 2024-12-24 21:53:27,846 ] - | 150| 100| 0.017198| 0.2| 1.4912| 71.219|
    [ INFO : 2024-12-24 21:53:27,851 ] - | 150| 100| 0.017198| 0.2| 1.4765| 71.727|

And here is my 'config.yaml':

    data_type: raw
    dataloader_args:
      batch_size: 128
      drop_last: true
      num_workers: 16
      pin_memory: false
      prefetch_factor: 8
    dataset_args:
      aug_prob: 0.6
      fbank_args:
        dither: 1.0
        frame_length: 25
        frame_shift: 10
        num_mel_bins: 80
      filter: true
      filter_args:
        max_num_frames: 800
        min_num_frames: 100
      num_frms: 200
      resample_rate: 16000
      sample_num_per_epoch: 0
      shuffle: true
      shuffle_args:
        shuffle_size: 2500
      spec_aug: false
      spec_aug_args:
        max_f: 8
        max_t: 10
        num_f_mask: 1
        num_t_mask: 1
        prob: 0.6
      speed_perturb: true
    enable_amp: false
    exp_dir: RESNET-TSTP-emb256-fbank80-num_frms200-aug0.6-spTrue-saFalse-ArcMargin-SGD-epoch150_20241223
    gpus:
    - 0
    - 1
    - 2
    - 3
    - 4
    - 5
    - 6
    - 7
    log_batch_interval: 100
    loss: CrossEntropyLoss
    loss_args: {}
    margin_scheduler: MarginScheduler
    margin_update:
      epoch_iter: 1066
      final_margin: 0.2
      fix_start_epoch: 40
      increase_start_epoch: 20
      increase_type: exp
      initial_margin: 0.0
      update_margin: true
    model: ResNet34
    model_args:
      embed_dim: 256
      feat_dim: 80
      pooling_func: TSTP
      two_emb_layer: false
    model_init: null
    noise_data: data/musan/lmdb
    num_avg: 10
    num_epochs: 250
    optimizer: SGD
    optimizer_args:
      lr: 0.1
      momentum: 0.9
      nesterov: true
      weight_decay: 0.0001
    projection_args:
      do_lm: false
      easy_margin: false
      embed_dim: 256
      num_class: 17982
      project_type: arc_margin
      scale: 32.0
    reverb_data: data/rirs/lmdb
    save_epoch_interval: 5
    scheduler: ExponentialDecrease
    scheduler_args:
      epoch_iter: 1066
      final_lr: 5.0e-05
      initial_lr: 0.1
      num_epochs: 250
      scale_ratio: 16.0
      warm_from_zero: true
      warm_up_epoch: 6
    seed: 42
    train_data: data/vox2_dev/raw.list
    train_label: data/vox2_dev/utt2spk

Jan 03 '25 05:01 NathanJHLee
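
For readers who want to check whether a run like this is behaving normally, one quick sanity test is to compare the learning rate in the log (apparently the third numeric column) against the schedule in the config. The sketch below assumes that `ExponentialDecrease` decays exponentially from `initial_lr` to `final_lr` over `num_epochs * epoch_iter` iterations and multiplies the result by `scale_ratio`. That reading is an assumption, not something taken from the WeSpeaker source, and the function name `expected_lr` is hypothetical. Warm-up is ignored because it only covers the first 6 epochs.

```python
import math


def expected_lr(epoch, batch, *, initial_lr=0.1, final_lr=5.0e-05,
                scale_ratio=16.0, num_epochs=250, epoch_iter=1066):
    """Assumed schedule: scaled exponential decay from initial_lr to final_lr.

    Default values are copied from scheduler_args in the posted config.
    Epochs are assumed to be numbered from 1, as in the log.
    """
    current_iter = (epoch - 1) * epoch_iter + batch
    max_iter = num_epochs * epoch_iter
    decay = math.exp(current_iter / max_iter * math.log(final_lr / initial_lr))
    return scale_ratio * initial_lr * decay


# (epoch, batch) -> learning rate as printed in the posted log
logged = {(149, 1000): 0.01728, (149, 1066): 0.017247, (150, 100): 0.017198}

for (epoch, batch), lr in logged.items():
    print(f"epoch {epoch} batch {batch}: logged {lr}, "
          f"expected {expected_lr(epoch, batch):.6f}")
```

Under these assumptions the computed values agree with the logged ones to roughly four significant digits, which suggests the scheduler is following the posted config.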

Since this is ArcMargin loss with margins, the current loss behavior is expected. If you switch to the standard softmax criterion, you can achieve significantly higher accuracy more easily.

Jan 04 '25 09:01 wsstriving
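
To illustrate why a margin lowers training accuracy, here is a minimal sketch of an additive angular margin applied to the target-class logit, using `scale: 32.0` and `final_margin: 0.2` from the config above. It assumes the logged accuracy is computed from the margin-penalised logits, and it omits the edge-case handling a real ArcMargin layer needs when the angle plus the margin exceeds pi. The function names and the toy cosine values are hypothetical.

```python
import math


def target_logit(cos_theta, margin, scale=32.0):
    """Scaled target-class logit with an additive angular margin."""
    # Clamp to the valid acos domain to guard against rounding error.
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return scale * math.cos(theta + margin)


def counted_correct(target_cos, best_other_cos, margin, scale=32.0):
    """True if the penalised target logit still beats the best other class."""
    return target_logit(target_cos, margin, scale) > scale * best_other_cos


# Toy example: the embedding is closer to its own class (cosine 0.6)
# than to the nearest other class (cosine 0.5).
target_cos, best_other_cos = 0.6, 0.5
for margin in (0.0, 0.2):
    print(f"margin={margin}: "
          f"target logit={target_logit(target_cos, margin):.2f}, "
          f"other logit={32.0 * best_other_cos:.2f}, "
          f"counted correct={counted_correct(target_cos, best_other_cos, margin)}")
```

The same embedding is counted as correct without the margin and as wrong with a margin of 0.2, even though nothing about the embedding changed. This is why the training accuracy and loss under ArcMargin are not directly comparable to those of a plain softmax run.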

Thank you for your explanation. I have one more question. If I use ArcMargin for projection during training, is there a way to determine the optimal number of epochs during Stage 3? Or is the only option to check the EER by evaluating the saved epoch models?

Jan 05 '25 23:01 NathanJHLee
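
If checkpoints are compared by EER, the comparison itself is simple once each checkpoint has produced trial scores. The following is a minimal sketch, not WeSpeaker's scoring code: `compute_eer` is a hypothetical helper that sweeps the decision threshold over sorted scores and returns the point where the false-accept and false-reject rates are closest. The trial scores are toy values for illustration only, and tied scores are not treated specially.

```python
def compute_eer(scores, labels):
    """Approximate equal error rate; labels are 1 (target) or 0 (non-target)."""
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")

    # Accept trials one at a time, from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    accepted_targets = 0
    accepted_nontargets = 0
    # Starting point: nothing accepted, so FAR = 0 and FRR = 1.
    best_gap, eer = 1.0, 0.5
    for _, label in ranked:
        if label == 1:
            accepted_targets += 1
        else:
            accepted_nontargets += 1
        far = accepted_nontargets / n_nontarget    # false-accept rate
        frr = 1.0 - accepted_targets / n_target    # false-reject rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Toy trial scores for two hypothetical checkpoints (not real results).
toy_trials = {
    "ckpt_a": ([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]),
    "ckpt_b": ([0.9, 0.6, 0.7, 0.2], [1, 1, 0, 0]),
}
eer_by_ckpt = {name: compute_eer(s, l) for name, (s, l) in toy_trials.items()}
best_ckpt = min(eer_by_ckpt, key=eer_by_ckpt.get)
print(eer_by_ckpt, "best:", best_ckpt)
```

In practice the scores would come from scoring each saved model on the same trial list, so that the EERs are comparable across epochs.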