Youwei Liang

Results 26 comments of Youwei Liang

@yaox12 Hi, I notice you actually did not use mixed precision in the BYOL training since `opt_level` is `"O0"` in your [train_config.yaml](https://github.com/yaox12/BYOL-PyTorch/blob/master/config/train_config.yaml). Have you tried using `opt_level: "O1"`? Would `opt_level:...

Thanks. I have just tried using `torch.cuda.amp` (PyTorch 1.6) for mixed precision training and also observed similar computation time with/without `amp`. I tend to agree with you on the...
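For reference, here is a minimal sketch of the usual `torch.cuda.amp` training-step pattern (not code from the BYOL repo; the model, optimizer, and sizes are illustrative, and `autocast`/`GradScaler` become no-ops when CUDA is unavailable):

```python
import torch

# Toy model and optimizer -- stand-ins for the actual BYOL networks.
model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

use_amp = torch.cuda.is_available()  # amp only takes effect on GPU
device = "cuda" if use_amp else "cpu"
model = model.to(device)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for _ in range(2):  # a couple of illustrative steps
    x = torch.randn(32, 128, device=device)
    target = torch.randint(0, 10, (32,), device=device)
    optimizer.zero_grad()
    with torch.cuda.amp.autocast(enabled=use_amp):
        # Forward pass runs in fp16 where safe when amp is enabled.
        loss = torch.nn.functional.cross_entropy(model(x), target)
    scaler.scale(loss).backward()  # scale to avoid fp16 gradient underflow
    scaler.step(optimizer)
    scaler.update()
```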

Hi, thanks for your interest in our work! Yes, compared to vanilla ViTs, EViT has the `gather` and `topk` operations that require additional GPU kernel launch, whose computational overhead would...
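To make the extra kernel launches concrete, here is a hedged sketch of the `topk`-then-`gather` token-selection step (tensor sizes and the attention stand-in are illustrative, not the repo's actual code):

```python
import torch

B, N, C = 2, 197, 384  # batch, tokens (1 CLS + 196 patches), embed dim
x = torch.randn(B, N, C)
cls_attn = torch.rand(B, N - 1)  # stand-in for CLS attention over patch tokens
k = int(0.7 * (N - 1))           # keep_rate = 0.7, an example value

_, idx = torch.topk(cls_attn, k, dim=1)            # extra kernel launch #1
idx_expand = idx.unsqueeze(-1).expand(-1, -1, C)
patches = torch.gather(x[:, 1:], 1, idx_expand)    # extra kernel launch #2
x_kept = torch.cat([x[:, :1], patches], dim=1)     # CLS token + kept tokens
```

A vanilla ViT block has neither of these ops, so their launch overhead shows up only in EViT's profile.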

The overhead may be caused by the `complement_idx` in `helpers.py`. I will check it soon. For your use case in video streams, can't the video be viewed as a series...
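For readers unfamiliar with `complement_idx`, the sketch below is a hypothetical re-implementation of the idea (computing the indices *not* selected by `topk`, e.g. to form the fused token); the signature and internals are assumptions, not a copy of `helpers.py`:

```python
import torch

def complement_idx(idx, dim_size):
    """Return, per row, the indices in [0, dim_size) absent from `idx`."""
    n_rows, k = idx.shape
    full = torch.arange(dim_size).unsqueeze(0).expand(n_rows, -1)
    mask = torch.ones(n_rows, dim_size, dtype=torch.bool)
    mask.scatter_(1, idx, False)            # mark the selected indices
    return full[mask].reshape(n_rows, dim_size - k)
```

The boolean-mask scatter plus reshape is itself a few extra kernel launches, which is consistent with the suspected overhead.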

Hi, you can add `print(e)` to see the error message.
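That is, inside the `except` block, print the caught exception instead of swallowing it. A minimal illustration (the failing function is hypothetical):

```python
def load_weights(path):
    # hypothetical stand-in for the operation that was failing silently
    raise FileNotFoundError(f"no checkpoint at {path}")

try:
    load_weights("model.pth")
except Exception as e:
    print(e)  # surfaces the error message for debugging
```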

Hi, thanks for the questions. In Table 7, DeiT is included not to compare it against EViT to show which is better, but for easy reference. We would update...

The `tokens` parameter you mentioned is for an external caller to control the number of remaining tokens. Normally, we use the `keep_rate` parameter to control the number of tokens; it is computed in the lines below. https://github.com/youweiliang/evit/blob/cc1993ddbd49bf3bf84aa39a7488dfdad95ad50a/evit.py#L209-L211
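As a sketch of that computation (assuming the standard ceiling formulation; the token count 197 corresponds to ViT-S/16 at 224px and is an illustrative choice):

```python
import math

N = 197          # total tokens: 1 CLS + 196 patch tokens
keep_rate = 0.7  # example keep rate for one pruning layer

# Number of patch tokens left after pruning; the CLS token is always kept.
left_tokens = math.ceil(keep_rate * (N - 1))
```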

Hi, for EViT-LVViT-S: 1. only the tokens that are finally kept need to be distilled; 2. the fused token does not need distillation. I will upload the EViT-LVViT code this week. Thanks!

No, the resulting difference would be negligible.

Thanks for your question. No, it is for only three layers. The `keep_rate` you mentioned is for controlling the keep rate during training and is different from the keep rate...