Bryan Wong
Hi @ramprs21 @Richarizardd, for the hierarchical pretraining (2nd stage), will the training time be much faster than in the 1st stage, since one region can now be converted into [256, 384]...
May I know how to properly set the unlabeled batch size coefficient (mu) and eval_steps?
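To clarify what I mean, here is a minimal sketch of how I currently understand these two settings in a FixMatch-style loop; the numbers and variable names are my own assumptions, not values from the repository:

```python
# Hypothetical sketch (names and numbers are my own, not from the repo):
# how mu and eval_steps usually interact in a FixMatch-style training loop.
labeled_batch_size = 64                            # assumed labeled batch size
mu = 7                                             # unlabeled-to-labeled ratio per step
unlabeled_batch_size = mu * labeled_batch_size     # 448 unlabeled images per step

num_labeled = 4000                                 # assumed labeled-set size
steps_per_epoch = num_labeled // labeled_batch_size
eval_steps = steps_per_epoch                       # e.g. evaluate once per labeled epoch

print(unlabeled_batch_size, eval_steps)            # 448 62
```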
Hi @binli123, could you please let me know which parts of the model I should initialize, and with which of the various initialization methods in PyTorch? Additionally, do you have an example? I'm currently working on implementing...
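To make the question concrete, here is a minimal sketch of the kind of initialization I have in mind, assuming a model with Linear, Conv2d, and norm layers; which layers to touch and which schemes to use are my assumptions, not the repo's:

```python
import torch.nn as nn

def init_weights(m):
    # Sketch: common PyTorch init schemes; the layer/scheme pairing is my assumption.
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")
    elif isinstance(m, (nn.BatchNorm2d, nn.LayerNorm)):
        nn.init.ones_(m.weight)
        nn.init.zeros_(m.bias)

# model.apply(init_weights)  # recursively applies to every submodule
```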
Hi @yanjk3, thank you very much for the answers; I really appreciate it. It makes more sense now that I know the authors made a slight modification to the original...
Hi @yanjk3, when I use eval_knn.py from the original DINO to evaluate SelfPatch, it reports: size mismatch for pos_embed: copying a param with shape torch.Size([1, 196, 384]) from checkpoint, the shape...
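In case it helps, here is a rough sketch of the workaround I am considering, assuming the 196-vs-197 gap comes from a missing [CLS] position embedding in the SelfPatch checkpoint (this is my guess, not something confirmed by the authors; the checkpoint path and key are placeholders):

```python
import torch

# Assumed checkpoint path and key layout; adjust to the actual checkpoint structure.
state_dict = torch.load("selfpatch_checkpoint.pth", map_location="cpu")
pos_embed = state_dict["pos_embed"]              # [1, 196, 384] in the checkpoint

if pos_embed.shape[1] == 196:                    # no [CLS] position embedding saved
    cls_pos = torch.zeros(1, 1, pos_embed.shape[2], dtype=pos_embed.dtype)
    state_dict["pos_embed"] = torch.cat([cls_pos, pos_embed], dim=1)  # -> [1, 197, 384]

# model.load_state_dict(state_dict, strict=False)
```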
Hi @yanjk3, thank you for your answers. Could you show how I can apply global average pooling to the last transformer blocks?
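A rough sketch of the kind of pooling I have in mind, assuming a DINO-style ViT that exposes get_intermediate_layers (as in the original DINO repo); the exact feature layout is my assumption:

```python
import torch

# Sketch: concatenate the [CLS] tokens from the last n blocks with the global
# average of the patch tokens from the final block (similar in spirit to what
# DINO's eval_linear.py does with --avgpool_patchtokens).
n_last_blocks = 4
intermediate = model.get_intermediate_layers(images, n_last_blocks)  # list of [B, 1+N, D]

cls_tokens = torch.cat([x[:, 0] for x in intermediate], dim=-1)      # [B, n*D]
avg_patch = torch.mean(intermediate[-1][:, 1:], dim=1)               # [B, D] global avg pool
features = torch.cat([cls_tokens, avg_patch], dim=-1)
```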
Hi @yanjk3, I already took your advice, but the accuracy appears to be about 3% lower than the original DINO under the same eval_knn.py settings....
Hi @yanjk3, sorry, I don't quite follow. What do you mean by copying the SelfPatch ViT to the DINO ViT?
Hi @yanjk3, do you mean adding everything you previously suggested to the DINO ViT model code (vision_transformer.py)?
Hi, could you kindly provide the full code for batch inference, given a batch of images and a question? Thank you!
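To clarify what I am asking for, here is the kind of batched call I have in mind, sketched with a BLIP VQA model from Hugging Face transformers; the model name, image paths, and preprocessing are my assumptions, not necessarily what this repo uses:

```python
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

# Assumed model; swap in the checkpoint this repo actually provides.
processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base").eval()

images = [Image.open(p).convert("RGB") for p in ["img1.jpg", "img2.jpg"]]  # example paths
question = "What is shown in the image?"

# Repeat the same question for every image and pad the text to a common length.
inputs = processor(images=images, text=[question] * len(images),
                   return_tensors="pt", padding=True)

with torch.no_grad():
    out = model.generate(**inputs)

answers = processor.batch_decode(out, skip_special_tokens=True)
print(answers)
```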