CLIPSelf
[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
Can you provide a training log? I am curious about the wall-clock time.
Hello, I get an error when running: `FViT: EvaCLIPViT: Model config for EVA02-CLIP-B-16 not found`. Another question: installing xformers says it requires torch 2.2.0, which seems to conflict with the installed mmcv and torch. How can I solve this? Thanks!
Hi, thanks for your great work. When I try to reproduce the results with `bash scripts/train_clipself_coco_image_patches_eva_vitl14.sh`, I run into the error `ValueError: assignment destination is read-only`,...
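For reference, this NumPy error typically appears when an array is a view onto read-only memory (for example one created with `np.frombuffer`) and the code writes into it in place. A minimal sketch of the failure mode and the usual `.copy()` fix, independent of the CLIPSelf code itself:

```python
import numpy as np

buf = bytes(12)  # an immutable bytes buffer

# A view onto read-only memory: in-place writes are rejected.
arr = np.frombuffer(buf, dtype=np.uint8)
try:
    arr[0] = 1
except ValueError as e:
    print(e)  # "assignment destination is read-only"

# An owning copy is writable, so the assignment succeeds.
arr = np.frombuffer(buf, dtype=np.uint8).copy()
arr[0] = 1
```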
In this [Drive](https://drive.google.com/drive/folders/11zG4nJffm0MbvA0Ph19p6jvJFj6VwRAH), is it intentional or a mistake that coco_proposals.json and coco_pseudo_4764.json are completely identical?
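A quick way to verify this locally is to hash both downloads; the sketch below uses placeholder paths for wherever the Drive files were saved:

```python
import hashlib

def file_md5(path):
    """Hash a file in chunks so large annotation files fit in memory."""
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    return h.hexdigest()

# Placeholder paths for the two files from the Drive folder.
print(file_md5('coco_proposals.json') == file_md5('coco_pseudo_4764.json'))
```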
RuntimeError: Pretrained weights (checkpoints) not found for model EVA02-CLIP-B-16. Available pretrained tags: ['eva', 'eva02', 'eva_clip', 'eva02_clip'].
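As a hedged diagnostic (not the authors' fix): `open_clip.list_models()` shows which model configs the installed package actually registers. The EVA-CLIP fork bundled with CLIPSelf ships the EVA02 config JSONs, while a plain upstream open_clip install may not, which would explain both this error and the "Model config not found" one above:

```python
import open_clip

# Print the registered model names containing "EVA"; if this comes back empty,
# the installed open_clip does not know about EVA02-CLIP-B-16 and the
# EVA-CLIP fork from the CLIPSelf repo should be installed instead.
print([m for m in open_clip.list_models() if 'EVA' in m.upper()])
```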
Hi, thank you for your great work. I have two questions. Firstly, is it possible to share the config files for training F-ViT models from **OpenAI-CLIP** rather than EVA-CLIP?...
How does this fine-tuned model perform on zero-shot classification and zero-shot cross-modal retrieval?
First of all, thank you very much for publishing code that accompanies your paper. I was wondering if you would be willing to share the training script as well as...
Can this be trained on a V100 32 GB or an A100 40 GB? The paper mentions A100 but doesn't say whether it is the 40 GB or the 80 GB variant.
Hi, thanks for releasing the code of CLIPSelf, a very nice work! I hit this error: `main.py: error: argument --dataset-type: invalid choice: 'grid_distill' (choose from 'webdataset', 'csv', 'synthetic', 'auto')`...
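This message comes from argparse's `choices` validation: upstream open_clip restricts `--dataset-type` to the four listed values, and the CLIPSelf fork presumably extends that list with `'grid_distill'`, so hitting this error suggests the upstream `params.py` is being imported. A minimal sketch of the mechanism (the argument definition here is illustrative, not the repository's exact code):

```python
import argparse

parser = argparse.ArgumentParser()
# With 'grid_distill' absent from `choices`, argparse raises exactly
# "invalid choice: 'grid_distill'"; adding it makes parsing succeed.
parser.add_argument('--dataset-type',
                    choices=['webdataset', 'csv', 'synthetic', 'auto', 'grid_distill'],
                    default='auto')

args = parser.parse_args(['--dataset-type', 'grid_distill'])
print(args.dataset_type)  # 'grid_distill' once the choice is registered
```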