depthwise

Results: 2 issues by depthwise

I'm observing sensitivity to LR restarts in a typical SGDR schedule with cosine annealing, as in Loshchilov & Hutter. RAdam still seems to be doing better than AdamW so far,...

question
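
For readers unfamiliar with the schedule that issue refers to, here is a minimal sketch of cosine annealing with warm restarts in plain Python. It is not taken from the issue itself: the function name `sgdr_lr` and the parameters `eta_max`, `eta_min`, `t_0`, and `t_mult` are illustrative assumptions, and the default values are arbitrary.

```python
import math


def sgdr_lr(step, eta_max=0.1, eta_min=0.0, t_0=10, t_mult=2):
    """Learning rate at `step` under cosine annealing with warm restarts.

    All names and defaults are hypothetical, chosen for illustration only.
    `t_0` is the length of the first cycle (in steps) and `t_mult` is the
    integer factor by which each subsequent cycle grows.
    """
    t_i = t_0      # length of the current cycle
    t_cur = step   # steps elapsed since the last restart
    # Walk forward through completed cycles until `step` falls inside one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    # Cosine decay from eta_max down towards eta_min within the cycle.
    cosine = 1 + math.cos(math.pi * t_cur / t_i)
    return eta_min + 0.5 * (eta_max - eta_min) * cosine


if __name__ == "__main__":
    # The rate jumps back to eta_max at steps 0, 10, 30, ... (the restarts).
    for s in (0, 5, 9, 10, 20, 29, 30):
        print(s, round(sgdr_lr(s), 5))
```

The abrupt jump back to `eta_max` at each restart is the point where an optimizer's sensitivity to restarts would be expected to show up.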

Thank you for developing such a useful service. As a practitioner I care disproportionately about the _peak_ metrics in any given run, i.e. max mAP50 for object detection (and min...

feature_request