Youwei Liang

Results 26 comments of Youwei Liang

@yaox12 Hi, I notice you actually did not use mixed precision in the BYOL training since `opt_level` is `"O0"` in your [train_config.yaml](https://github.com/yaox12/BYOL-PyTorch/blob/master/config/train_config.yaml). Have you tried using `opt_level: "O1"`? Would `opt_level:...

Thanks. I have just tried using `torch.cuda.amp` (PyTorch 1.6) for mixed precision training and also observed similar computation time with/without `amp`. I tend to agree with you on the...
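For reference, here is a minimal sketch of the usual `torch.cuda.amp` training-step pattern (not code from the BYOL repo; the model, optimizer, and sizes are illustrative, and `autocast`/`GradScaler` become no-ops when CUDA is unavailable):

```python
import torch

# Toy model and optimizer -- stand-ins for the actual BYOL networks.
model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

use_amp = torch.cuda.is_available()  # amp only takes effect on GPU
device = "cuda" if use_amp else "cpu"
model = model.to(device)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for _ in range(2):  # a couple of illustrative steps
    x = torch.randn(32, 128, device=device)
    target = torch.randint(0, 10, (32,), device=device)
    optimizer.zero_grad()
    with torch.cuda.amp.autocast(enabled=use_amp):
        # Forward pass runs in fp16 where safe when amp is enabled.
        loss = torch.nn.functional.cross_entropy(model(x), target)
    scaler.scale(loss).backward()  # scale to avoid fp16 gradient underflow
    scaler.step(optimizer)
    scaler.update()
```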

Hi, thanks for your interest in our work! Yes, compared to vanilla ViTs, EViT has the `gather` and `topk` operations that require additional GPU kernel launch, whose computational overhead would...
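To make the extra kernel launches concrete, here is a hedged sketch of the `topk`-then-`gather` token-selection step (tensor sizes and the attention stand-in are illustrative, not the repo's actual code):

```python
import torch

B, N, C = 2, 197, 384  # batch, tokens (1 CLS + 196 patches), embed dim
x = torch.randn(B, N, C)
cls_attn = torch.rand(B, N - 1)  # stand-in for CLS attention over patch tokens
k = int(0.7 * (N - 1))           # keep_rate = 0.7, an example value

_, idx = torch.topk(cls_attn, k, dim=1)            # extra kernel launch #1
idx_expand = idx.unsqueeze(-1).expand(-1, -1, C)
patches = torch.gather(x[:, 1:], 1, idx_expand)    # extra kernel launch #2
x_kept = torch.cat([x[:, :1], patches], dim=1)     # CLS token + kept tokens
```

A vanilla ViT block has neither of these ops, so their launch overhead shows up only in EViT's profile.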

The overhead may be caused by the `complement_idx` in `helpers.py`. I will check it soon. For your use case in video streams, can't the video be viewed as a series...
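For readers unfamiliar with `complement_idx`, the sketch below is a hypothetical re-implementation of the idea (computing the indices *not* selected by `topk`, e.g. to form the fused token); the signature and internals are assumptions, not a copy of `helpers.py`:

```python
import torch

def complement_idx(idx, dim_size):
    """Return, per row, the indices in [0, dim_size) absent from `idx`."""
    n_rows, k = idx.shape
    full = torch.arange(dim_size).unsqueeze(0).expand(n_rows, -1)
    mask = torch.ones(n_rows, dim_size, dtype=torch.bool)
    mask.scatter_(1, idx, False)            # mark the selected indices
    return full[mask].reshape(n_rows, dim_size - k)
```

The boolean-mask scatter plus reshape is itself a few extra kernel launches, which is consistent with the suspected overhead.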

Hi, you can add `print(e)` to see the error message.
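That is, inside the `except` block, print the caught exception instead of swallowing it. A minimal illustration (the failing function is hypothetical):

```python
def load_weights(path):
    # hypothetical stand-in for the operation that was failing silently
    raise FileNotFoundError(f"no checkpoint at {path}")

try:
    load_weights("model.pth")
except Exception as e:
    print(e)  # surfaces the error message for debugging
```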

Hi, thanks for the questions. In Table 7, DeiT is included not to compare it against EViT to show which is better, but for easy reference. We would update...

The `tokens` parameter you mentioned is for an external caller to control the number of remaining tokens. Normally, we use the `keep_rate` parameter to control the number of tokens; it is computed in the lines below. https://github.com/youweiliang/evit/blob/cc1993ddbd49bf3bf84aa39a7488dfdad95ad50a/evit.py#L209-L211
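As a sketch of that computation (assuming the standard ceiling formulation; the token count 197 corresponds to ViT-S/16 at 224px and is an illustrative choice):

```python
import math

N = 197          # total tokens: 1 CLS + 196 patch tokens
keep_rate = 0.7  # example keep rate for one pruning layer

# Number of patch tokens left after pruning; the CLS token is always kept.
left_tokens = math.ceil(keep_rate * (N - 1))
```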

Hi, for EViT-LVViT-S: 1. only the tokens that are finally kept need to be distilled; 2. the fused token does not need distillation. I will upload the EViT-LVViT code this week. Thanks!

No, the resulting difference would be negligible.

Thanks for your question. No, it is for only three layers. The `keep_rate` you mentioned is for controlling the keep rate during training and is different from the keep rate...