CLIP
Reproduce zero-shot results on the EuroSAT dataset.
Thank you for your work on CLIP!
I was trying to reproduce the zero-shot prediction results listed in Table 11 of the paper. Although I can reproduce most of the results in Table 11, I found large gaps on the EuroSAT dataset.
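For reference, here is a minimal sketch of the zero-shot accuracy computation being discussed, assuming the image and per-class text features have already been extracted. The toy vectors and the function names (`zero_shot_predict`, `zero_shot_accuracy`) are hypothetical stand-ins, not the actual evaluation code; in a real run the text feature for each class would typically be built from the prompts for that class.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_predict(image_feature, class_text_features):
    """Return the index of the class whose text feature is most similar to the image."""
    img = l2_normalize(image_feature)
    scores = []
    for text_feature in class_text_features:
        txt = l2_normalize(text_feature)
        scores.append(sum(a * b for a, b in zip(img, txt)))
    return max(range(len(scores)), key=scores.__getitem__)


def zero_shot_accuracy(image_features, labels, class_text_features):
    """Top-1 accuracy in percent. Labels must index class_text_features in the same order."""
    correct = sum(
        zero_shot_predict(feature, class_text_features) == label
        for feature, label in zip(image_features, labels)
    )
    return 100.0 * correct / len(labels)


# Toy 2-D features (assumption: stand-ins for real encoder outputs).
class_text_features = [[1.0, 0.0], [0.0, 1.0]]
image_features = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.7, 0.3]]
labels = [0, 1, 1, 0]

accuracy = zero_shot_accuracy(image_features, labels, class_text_features)
print(accuracy)
```

Note that the labels are only meaningful if the class order of the text features matches the label indices of the dataset, which is why the category order matters here.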
We have tried:
- Loading the CLIP model with and without JIT
- Using different image preprocessing, i.e. with and without center crop
- Confirming that the order of categories in promts.py is consistent with the dataset
But we still cannot reproduce the numbers reported in Table 11. Any hints would be greatly appreciated, thank you!
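The category-order check can be sketched roughly as follows. This assumes the dataset is loaded ImageFolder-style, so label indices follow the alphabetically sorted folder names, and that the EuroSAT folders are named as listed. The prompt class names and the `folder_to_prompt` mapping are illustrative placeholders, not the actual contents of promts.py.

```python
# Assumption: EuroSAT class folders, indexed in sorted order as an
# ImageFolder-style loader would do.
dataset_classes = sorted([
    "AnnualCrop", "Forest", "HerbaceousVegetation", "Highway", "Industrial",
    "Pasture", "PermanentCrop", "Residential", "River", "SeaLake",
])

# Hypothetical mapping from folder name to the class name used in the prompts.
folder_to_prompt = {
    "AnnualCrop": "annual crop land",
    "Forest": "forest",
    "HerbaceousVegetation": "herbaceous vegetation",
    "Highway": "highway",
    "Industrial": "industrial buildings",
    "Pasture": "pasture land",
    "PermanentCrop": "permanent crop land",
    "Residential": "residential buildings",
    "River": "river",
    "SeaLake": "sea or lake",
}


def reorder_prompts(dataset_classes, folder_to_prompt):
    """Return prompt class names in the dataset's label-index order."""
    missing = [c for c in dataset_classes if c not in folder_to_prompt]
    if missing:
        raise KeyError(f"no prompt name for classes: {missing}")
    return [folder_to_prompt[c] for c in dataset_classes]


def order_matches(prompt_classes, dataset_classes, folder_to_prompt):
    """True if the prompt list is already aligned with the dataset labels."""
    return prompt_classes == reorder_prompts(dataset_classes, folder_to_prompt)


aligned = reorder_prompts(dataset_classes, folder_to_prompt)
# A prompt list written in some other order (hypothetical) fails the check.
misaligned = aligned[1:] + aligned[:1]

print(order_matches(aligned, dataset_classes, folder_to_prompt))
print(order_matches(misaligned, dataset_classes, folder_to_prompt))
```

If the check fails, using `reorder_prompts` to build the classifier weights is one way to make the order explicit instead of relying on the order in the prompt file.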
| Model (EuroSAT) | CLIP (paper) | Ours | Delta |
|---|---|---|---|
| ResNet50 | 41.1 | 41.3 | 0.2 |
| ResNet101 | 33.1 | 31.0 | -2.1 |
| RN50x4 | 35.0 | 32.7 | -2.3 |
| RN50x16 | 40.3 | 42.0 | 1.7 |
| ViT-B/16 | 54.1 | 54.6 | 0.5 |
| ViT-B/32 | 49.4 | 44.8 | -4.6 |
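The Delta values above are our number minus the paper's number. A small sketch that recomputes them from the table and picks out the model with the largest shortfall (the variable names are only for illustration):

```python
# (paper, ours) EuroSAT zero-shot accuracy per model, copied from the table above.
results = {
    "ResNet50": (41.1, 41.3),
    "ResNet101": (33.1, 31.0),
    "RN50x4": (35.0, 32.7),
    "RN50x16": (40.3, 42.0),
    "ViT-B/16": (54.1, 54.6),
    "ViT-B/32": (49.4, 44.8),
}

# Delta = ours - paper, rounded to one decimal like the reported numbers.
deltas = {model: round(ours - paper, 1) for model, (paper, ours) in results.items()}

# Model where our result falls furthest below the paper.
largest_shortfall = min(deltas, key=deltas.get)

for model, delta in deltas.items():
    print(f"{model}: {delta:+.1f}")
print("largest shortfall:", largest_shortfall)
```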
JIT applied or not when loading the CLIP model
| Model (EuroSAT) | w/ JIT | w/o JIT | Delta |
|---|---|---|---|
| ResNet50 | 41.3 | 41.3 | 0 |
| ResNet101 | 31.0 | 31.0 | 0 |
| RN50x4 | 32.6 | 32.7 | -0.1 |
| RN50x16 | 42.2 | 42.0 | 0.2 |
| ViT-B/16 | 54.6 | 54.6 | 0 |
| ViT-B/32 | 44.8 | 44.8 | 0 |
Image Preprocessing: Center Crop vs. No Center Crop
| Model (EuroSAT) | w/ Crop | w/o Crop | Delta |
|---|---|---|---|
| ResNet50 | 41.3 | 41.3 | 0 |
| ResNet101 | 31.0 | 31.0 | 0 |
| RN50x4 | 32.7 | 32.7 | 0 |
| RN50x16 | 42.0 | 42.0 | 0 |
| ViT-B/16 | 54.6 | 54.6 | 0 |
| ViT-B/32 | 44.8 | 44.8 | 0 |
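One possible reason the crop setting makes no difference: if the EuroSAT RGB images are square (64x64, which is an assumption here) and the preprocessing resizes the shorter side to the model input size before center-cropping, the crop covers the whole image. The sketch below only does the box arithmetic; `resize_then_center_crop_box` is a hypothetical helper and its rounding may differ slightly from a real transform implementation.

```python
def resize_then_center_crop_box(width, height, n_px):
    """Resize so the shorter side equals n_px, then take a centered n_px x n_px crop.

    Returns the resized (width, height) and the crop box (left, top, right, bottom)
    in resized-image coordinates.
    """
    scale = n_px / min(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    left = (new_w - n_px) // 2
    top = (new_h - n_px) // 2
    return (new_w, new_h), (left, top, left + n_px, top + n_px)


def crop_keeps_everything(width, height, n_px):
    """True if the center crop does not discard any part of the resized image."""
    (new_w, new_h), box = resize_then_center_crop_box(width, height, n_px)
    return box == (0, 0, new_w, new_h)


# Assumed EuroSAT image size: square, so the crop is a no-op.
print(resize_then_center_crop_box(64, 64, 224))
print(crop_keeps_everything(64, 64, 224))

# A non-square image, for contrast: the crop drops the left and right margins.
print(crop_keeps_everything(640, 480, 224))
```

Under these assumptions the w/ Crop and w/o Crop pipelines see the same pixels, so identical accuracy would be expected rather than informative.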
Order of categories
| Dataset | Order Inconsistent | Order Fixed |
|---|---|---|
| EuroSAT | Y | Y |
Hi @xcpeng, can you share the code you used to get these numbers? With my code I only get 4.42 with B/32, while the other datasets give the same performance as Table 11.