ml-cvnets: The accuracy in the two figures does not match

Hi, thanks for the great work. I have a question: the accuracy of the standard training method for MobileViT-S seems to differ between (b) and (c) in Figure 9 of the paper. The top-1 accuracy in (b) appears to be about 77%, while the top-1 accuracy in (c) appears to be around 78%.

Can anyone clarify this? Thanks!

Feb 15 '22 08:02 cuicheng01

Thanks for your question. The validation error plot in (b) is without EMA while the validation accuracy in (c) is with EMA.
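For readers unfamiliar with EMA (exponential moving average of the model weights), here is a minimal sketch of the general idea. It is not the ml-cvnets implementation; the function name `ema_update`, the dictionary layout, and the `decay` value are assumptions chosen for illustration.

```python
def ema_update(ema_weights, model_weights, decay=0.999):
    """Blend the current model weights into the running average, in place.

    `decay` is a hypothetical default; real training setups tune it.
    """
    for name, value in model_weights.items():
        ema_weights[name] = decay * ema_weights[name] + (1.0 - decay) * value
    return ema_weights


# Toy example with a single scalar "weight" that changes every step.
weights = {"w": 0.0}
ema = dict(weights)  # the EMA copy starts from the initial weights
for step in range(1, 4):
    weights["w"] = float(step)           # stand-in for an optimizer update
    ema_update(ema, weights, decay=0.5)  # large step size so the effect is visible
```

Validation "with EMA" then evaluates the averaged copy (`ema`) instead of the raw weights (`weights`).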

Feb 15 '22 14:02 sacmehta

Thanks a lot for the reply. It is surprising that the EMA strategy can improve accuracy by so much. In my previous experiments, the gain was typically between 0 and 0.2%. By the way, is the top-1 accuracy of the standard training method only 77% without the EMA strategy?

Feb 16 '22 02:02 cuicheng01

Our experiments also suggest that EMA improves performance by 0.2-0.3%.

The top-1 accuracy without EMA is about 78%. The difference we observe in the graphs is because the performance in part (c) is measured after training, on a single GPU, ensuring that batches are neither truncated nor padded, while the data for the plots in (b) are measured during training. For (b), depending on the batch size, there could be cases where the dataset size is not a multiple of the batch size. For example, ImageNet has 50k validation images. If we use a batch size of 1024 for computing validation statistics during training, then a few batches will be either truncated or padded. So the training/validation curves could be a bit noisy.
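The arithmetic in the ImageNet example can be checked with a short sketch. The helper name `batch_layout` is hypothetical, and the sketch assumes one global batch of 1024 on one process; it does not model how ml-cvnets shards validation data across GPUs.

```python
def batch_layout(num_samples, batch_size):
    """Return (full_batches, leftover_samples, padding_needed)."""
    full_batches, leftover = divmod(num_samples, batch_size)
    # Padding needed to fill the last batch; zero when the sizes divide evenly.
    padding = (batch_size - leftover) % batch_size
    return full_batches, leftover, padding


full, leftover, padding = batch_layout(50_000, 1024)
print(full, leftover, padding)  # 48 848 176

# Share of the validation set sitting in the incomplete last batch.
print(f"{leftover / 50_000:.2%}")
```

Under these assumptions, truncating drops the 848 leftover images from the statistics, while padding counts 176 extra (repeated) samples, so either choice can shift the value reported during training.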

Hope this helps.

Feb 16 '22 03:02 sacmehta

OK! Thanks for your reply. However, I still have some questions. Following your explanation, I reproduced the process. Although there are some differences in accuracy, they are not as large as 1%. May I ask whether you could provide the training log here?

I look forward to your reply.

Feb 21 '22 08:02 cuicheng01

Hi @cuicheng01,

Extremely sorry for the late response. I did not realize that I had not responded to your comment.

Here are the training/validation/validation + EMA loss curves. Hope this helps.

Train loss vs. Epoch: Train_Loss

Val loss vs. Epoch: Val_Loss

Val loss w/ EMA vs. Epoch: Val_EMA_Loss

Jun 28 '22 04:06 sacmehta

Closing this issue because of no activity.

Oct 30 '22 02:10 sacmehta