Dora

1 comment on Dora

@senthurRam33 do you mind sharing which hyperparams you used, specifically the batch size and LR? I trained for 80K steps and the performance dropped from 56% to around 51% after...