CvT

This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

Results 22 CvT issues

Hello, thanks for the great work. How do I calculate the FLOPs of the model? I noticed that you report the FLOPs of transformer-based models, but I only found...
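For readers with the same question, here is a minimal sketch of how multiply-accumulate operations (MACs) are commonly counted by hand for the two layer types that dominate a convolutional transformer. It is not the repository's own counting code; the function names `conv2d_macs` and `linear_macs` are hypothetical, and the example layer shapes are illustrative assumptions. FLOPs are often reported as roughly twice the MAC count, but conventions differ between papers.

```python
# Hand-counting MACs for convolution and linear layers (illustrative sketch).
# Function names and example shapes are assumptions, not taken from the CvT code.

def conv2d_macs(in_channels, out_channels, kernel_size, out_hw, groups=1):
    """MACs of a 2D convolution, ignoring the bias term.

    Each output element needs (in_channels / groups) * kh * kw multiply-adds,
    and there are out_h * out_w * out_channels output elements.
    """
    kh, kw = kernel_size
    out_h, out_w = out_hw
    return out_h * out_w * out_channels * (in_channels // groups) * kh * kw


def linear_macs(num_tokens, in_features, out_features):
    """MACs of a linear layer applied independently to every token."""
    return num_tokens * in_features * out_features


if __name__ == "__main__":
    # Assumed example: a 7x7 convolution from 3 to 64 channels with a 56x56 output map.
    print(conv2d_macs(3, 64, (7, 7), (56, 56)))
    # Assumed example: a 64 -> 64 projection applied to 56 * 56 tokens.
    print(linear_macs(56 * 56, 64, 64))
```

Summing these counts over every layer, plus the attention matrix products, gives a total that can be compared against a reported figure.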

'rearrage' -> 'rearrange' in name of sequential projection layer

Is there a model SPEC suitable for CIFAR-10 or CIFAR-100? Thank you very much!

Bumps [numpy](https://github.com/numpy/numpy) from 1.19.3 to 1.22.0. Release notes Sourced from numpy's releases. v1.22.0 NumPy 1.22.0 Release Notes NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...

dependencies

Hi all, thanks for providing the code. I came across a mismatch between the code and the theory you proposed for the transformer block. The paper says "Instead, we propose...

Hello, when I train on ImageNet-1k, the test accuracy does not seem to change at all and stays at around 0.1. What could be causing this?

Good job! My question is: why use different class tokens for each stage when **only the final class token is used for classification**? https://github.com/microsoft/CvT/blob/34d1af94c95442b19fb9470e0c9dd5ee11be2024/lib/models/cls_cvt.py#L607

I want to know why there isn't any to the function get_cls_model and compute_macs

Hi, thanks for sharing the code. I am using a dataset that can be converted to images of size 750 × 184, and I was wondering what I should change in this...
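As a rough aid for non-square inputs like this one, the sketch below traces how the token grid shrinks through three convolutional embedding stages using the standard convolution output-size formula. The kernel, stride, and padding values (7/4/2, then 3/2/1 twice) are assumptions based on the configuration described in the CvT paper, and the helper names `conv_out_size` and `token_grids` are hypothetical. Check the values against the model SPEC actually in use.

```python
# Token grid size per stage for an arbitrary input size (illustrative sketch).
# ASSUMED_STAGES holds (kernel, stride, padding) per stage; these values are
# assumptions and should be verified against the model configuration in use.

ASSUMED_STAGES = [(7, 4, 2), (3, 2, 1), (3, 2, 1)]


def conv_out_size(size, kernel, stride, padding):
    """Standard convolution output length: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def token_grids(height, width, stages=ASSUMED_STAGES):
    """Return the (height, width) of the token grid after each stage."""
    grids = []
    for kernel, stride, padding in stages:
        height = conv_out_size(height, kernel, stride, padding)
        width = conv_out_size(width, kernel, stride, padding)
        grids.append((height, width))
    return grids


if __name__ == "__main__":
    for stage, (h, w) in enumerate(token_grids(750, 184), start=1):
        print(f"stage {stage}: {h} x {w} grid, {h * w} tokens")
```

Under these assumptions a 224 × 224 input yields grids of 56 × 56, 28 × 28, and 14 × 14, while a 750 × 184 input yields 187 × 46, 94 × 23, and 47 × 12, so any component that hard-codes a square grid would need adjusting.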

Hi, thanks for your work. Could you release the 22k model?