achen46
In this setup, we are using the exact same seeds as set by the authors [here](https://github.com/microsoft/Swin-Transformer/blob/main/main.py#L323). So why can't we achieve anything close to the 83.20 claimed in the paper...
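For context, the fixed seeding referenced above usually amounts to something like the following generic PyTorch sketch (the function name and rank offset are illustrative, not the repo's exact code):

```python
import random
import numpy as np
import torch

def set_seed(seed: int, rank: int = 0) -> None:
    # Offset by the distributed rank so each worker draws a distinct
    # stream while the overall run stays reproducible (generic sketch).
    seed = seed + rank
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(0)
a = torch.rand(3)
set_seed(0)
b = torch.rand(3)
assert torch.equal(a, b)  # identical seeds give identical draws
```

Note that identical seeds alone do not guarantee identical final accuracy: non-deterministic CUDA kernels, different GPU counts, and different batch sizes can all shift the result.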
Adding others for visibility @ancientmooner @caoyue10
Hi @keyu-tian Thank you so much for providing the pointer! As mentioned, I am interested in a (shell) script that I can run to start training the model and...
@yukang2017 this is great and a much-needed feature. I tried your modifications to resume from a checkpoint. The loss at the beginning was around 0.21 and...
@yukang2017 I observe the exact same issue you mentioned: the loss goes back up. I wonder if this may be due to the EMA weights?
@NathanYanJing I believe so, as the saved model is quite large (~9 GB). But it could also be that we overwrite them again, hence the loss going back...
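For anyone hitting this: the loss jump on resume is consistent with the EMA shadow weights not being restored from the checkpoint. A minimal sketch of saving and restoring them alongside the raw weights (the `ModelEma` class and checkpoint keys here are illustrative, not the repo's actual code):

```python
import copy
import torch
import torch.nn as nn

class ModelEma:
    """Minimal exponential-moving-average wrapper (illustrative)."""
    def __init__(self, model: nn.Module, decay: float = 0.9999):
        self.ema = copy.deepcopy(model).eval()
        self.decay = decay
        for p in self.ema.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model: nn.Module) -> None:
        # ema <- decay * ema + (1 - decay) * current weights
        for ema_p, p in zip(self.ema.parameters(), model.parameters()):
            ema_p.mul_(self.decay).add_(p, alpha=1.0 - self.decay)

model = nn.Linear(4, 2)
ema = ModelEma(model, decay=0.99)

# Save BOTH the raw and the EMA weights in the checkpoint...
ckpt = {"model": model.state_dict(), "ema": ema.ema.state_dict()}
torch.save(ckpt, "ckpt.pth")

# ...and restore both on resume. Restoring only `model` leaves the EMA
# copy freshly re-initialized, which can make the loss/metrics jump.
ckpt = torch.load("ckpt.pth")
model.load_state_dict(ckpt["model"])
ema.ema.load_state_dict(ckpt["ema"])
```

Saving both copies also explains the large checkpoint size: it roughly doubles the stored parameters.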
> @achen46 I've encountered the exact same issue you described earlier, and as a result, I've created a new pull request #36. I've tested it on my end, and...