img_norm_cfg and img_scale
In configs/soft_teacher/base.py, the following statistics are used for img_norm_cfg:
img_norm_cfg = dict(mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
However, coco_instance.py in mmdet has the following img_norm_cfg:
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375].
Could you please clarify this discrepancy?
Also, I'm wondering why multiple img_scale values are used, img_scale=[(1333, 400), (1333, 1200)], and how this img_scale relates to the pretraining img_scale of the backbone. For example, I've trained a Swin Transformer on 224x224 images. I would assume the backbone receives 224x224 inputs, so what exactly does img_scale do?
-
img_norm_cfg is used to normalize the image. In our config we use mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False because we use pretrained weights from Caffe. mmdetection, however, uses pretrained weights from torchvision by default, and these two pretrained models were trained with different statistics. -
img_scale doesn't have to match the pretraining img_scale. Even though your pretrained model was trained on 224x224 images, it is still better to train detection at a larger scale, because it is hard to locate objects in a small image.
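To make the img_norm_cfg difference concrete, here is a small NumPy sketch of the two normalization schemes (this is illustrative, not the actual mmdet Normalize transform; the pixel value is made up so that both outputs are exactly zero):

```python
import numpy as np

# A hypothetical BGR pixel, as loaded by OpenCV (mmdet loads images in BGR).
bgr_pixel = np.array([103.530, 116.280, 123.675], dtype=np.float32)

# Caffe-style stats (SoftTeacher's config): subtract the mean only,
# std stays 1.0, and the image remains in BGR order (to_rgb=False).
caffe_mean = np.array([103.530, 116.280, 123.675], dtype=np.float32)
caffe_std = np.array([1.0, 1.0, 1.0], dtype=np.float32)
caffe_out = (bgr_pixel - caffe_mean) / caffe_std

# Torchvision-style stats (mmdetection's default): convert BGR to RGB first
# (to_rgb=True), then apply ImageNet mean/std scaled to the [0, 255] range.
rgb_pixel = bgr_pixel[::-1]
torch_mean = np.array([123.675, 116.28, 103.53], dtype=np.float32)
torch_std = np.array([58.395, 57.12, 57.375], dtype=np.float32)
torch_out = (rgb_pixel - torch_mean) / torch_std
```

Note that the two mean vectors are the same ImageNet values, just in BGR vs RGB order; the real difference is the std (mean subtraction only for Caffe weights vs full standardization for torchvision weights).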