Results 2 comments of Myungseo Song

The dashed lines indicate the accuracies of the top-5 prediction results from the classifier, i.e. whether the ground truth is one of the 5 classes with the largest scores. For...
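To make the top-5 criterion concrete, here is a minimal sketch in plain Python. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not taken from the original code base; the sketch only shows the general "ground truth among the k largest scores" rule described above.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int = 5) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes.

    `scores[i][c]` is the classifier score of class `c` for sample `i`
    (hypothetical layout, chosen for illustration).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy example with 6 classes and 2 samples, both labeled as class 0.
toy_scores = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],  # class 0 has the lowest score: top-5 miss
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],  # class 0 has the highest score: top-5 hit
]
toy_labels = [0, 0]
toy_top5 = top_k_accuracy(toy_scores, toy_labels, k=5)
```

With `k=1` the same function gives the usual top-1 accuracy, so both curves in such a plot can come from one routine.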

In my case, when training the model from scratch with a learning rate of 1e-4, NaN outputs appeared occasionally. I tried to resolve this training instability, e.g., by improving...
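One generic way to keep occasional NaN values from corrupting the weights is to skip any update whose loss or gradients are non-finite. The sketch below is only an illustration of that common guard and is not the fix described in this comment; `guarded_sgd_step` and its arguments are hypothetical names, and plain Python lists stand in for real model parameters.

```python
import math
from typing import List, Sequence, Tuple


def guarded_sgd_step(params: Sequence[float],
                     grads: Sequence[float],
                     loss: float,
                     lr: float = 1e-4) -> Tuple[List[float], bool]:
    """Apply one plain SGD step, or skip it if the loss or any gradient is NaN/inf.

    Returns the (possibly unchanged) parameters and a flag telling whether
    the update was applied.
    """
    finite = math.isfinite(loss) and all(math.isfinite(g) for g in grads)
    if not finite:
        # Leave the parameters untouched so a single bad batch cannot poison them.
        return list(params), False
    return [p - lr * g for p, g in zip(params, grads)], True


# A normal step is applied; a step with a NaN loss is skipped.
ok_params, ok_applied = guarded_sgd_step([1.0], [2.0], loss=0.5, lr=0.1)
bad_params, bad_applied = guarded_sgd_step([1.0], [2.0], loss=float("nan"), lr=0.1)
```

Counting how often the flag comes back `False` also gives a rough measure of how frequent the instability is.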