When I use my own model, the predicted keypoints are wrong.
I trained ViTPose on the Halpe dataset and then used the resulting model to visualise a video, but the keypoints are wrong; some of them even fall outside the YOLO detection box.
Hello! Which code did you use for training? Are you able to run inference successfully with such code? Could you share the inference code and skeleton you are using?
Thanks for your reply! I used the official code from https://github.com/ViTAE-Transformer/ViTPose for training. I can successfully run your inference code with the other official models.
For training, my model settings are:
model = dict(
    type='TopDown',
    pretrained=None,
    backbone=dict(
        type='ViT',
        img_size=(256, 192),
        patch_size=16,
        embed_dim=768,
        depth=12,
        num_heads=12,
        ratio=1,
        use_checkpoint=False,
        mlp_ratio=4,
        qkv_bias=True,
        drop_path_rate=0.3,
    ),
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=768,
        num_deconv_layers=0,
        num_deconv_filters=[],
        num_deconv_kernels=[],
        upsample=4,
        extra=dict(final_conv_kernel=3, ),
        out_channels=channel_cfg['num_output_channels'],
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=False,
        target_type=target_type,
        modulate_kernel=11,
        use_udp=True))
So I changed model_base in ViTPose_common.py to:
model_base = dict(
    type='TopDown',
    pretrained=None,
    backbone=dict(
        type='ViT',
        img_size=(256, 192),
        patch_size=16,
        embed_dim=768,
        depth=12,
        num_heads=12,
        ratio=1,
        use_checkpoint=False,
        mlp_ratio=4,
        qkv_bias=True,
        drop_path_rate=0.3,
    ),
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=768,
        num_deconv_layers=0,
        num_deconv_filters=[],
        num_deconv_kernels=[],
        extra=dict(final_conv_kernel=3, ),
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=False,
        target_type=target_type,
        modulate_kernel=11,
        use_udp=True))
But it does not work.
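As a rough sanity check on the two configs, the heatmap resolution each one implies can be worked out by hand. The sketch below assumes the backbone emits one feature per 16x16 patch (ratio=1) and that the head only rescales by `upsample` (no deconv layers); the helper name `heatmap_size` is hypothetical and not part of ViTPose.

```python
def heatmap_size(img_size, patch_size, upsample):
    """Return (height, width) of the predicted heatmap.

    Assumes one backbone feature per patch and a head that only
    rescales by `upsample` (no deconv layers), as in the configs above.
    """
    h, w = img_size
    feat_h, feat_w = h // patch_size, w // patch_size
    # upsample=0 (or missing) is treated as "no rescaling".
    scale = upsample if upsample > 0 else 1
    return feat_h * scale, feat_w * scale


# Training config: upsample=4 in the keypoint head.
with_upsample = heatmap_size((256, 192), 16, 4)
# Inference config: no upsample entry, treated here as 0.
without_upsample = heatmap_size((256, 192), 16, 0)
print(with_upsample, without_upsample)
```

Under these assumptions the two configs produce heatmaps that differ by a factor of 4 per side, so a mismatch in `upsample` between training and inference is one plausible place to look when keypoints land far from the person.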
I solved this problem:
I added upsample=4 to the keypoint_head of the second model settings (model_base), and in topdown_heatmap_simple_head.py I changed
if not isinstance(inputs, list):
    if self.upsample > 0:
        raise NotImplementedError
    return inputs
to
if not isinstance(inputs, list):
    if self.upsample > 0:
        inputs = F.interpolate(inputs, scale_factor=self.upsample, mode='bilinear', align_corners=False)
    return inputs
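For anyone who wants to try this change in isolation, here is a minimal sketch, assuming PyTorch is installed. `transform_inputs` is a hypothetical standalone stand-in for the single-tensor branch of the head, not the actual ViTPose method, and the 16x12 feature map size is an assumption based on a 256x192 input with 16x16 patches.

```python
import torch
import torch.nn.functional as F


def transform_inputs(inputs, upsample):
    """Hypothetical stand-in for the single-tensor branch of the head."""
    if not isinstance(inputs, list):
        if upsample > 0:
            # Bilinear upsampling of the backbone feature map.
            inputs = F.interpolate(
                inputs, scale_factor=upsample,
                mode='bilinear', align_corners=False)
        return inputs
    # The multi-input case is left out of this sketch.
    raise NotImplementedError


# Dummy backbone output: batch of 1, 768 channels, 16x12 patches (assumed).
features = torch.randn(1, 768, 16, 12)
upsampled = transform_inputs(features, upsample=4)
out_shape = tuple(upsampled.shape)
print(out_shape)
```

With upsample=0 the function returns the input unchanged, so configs that do not use upsampling are unaffected by the change.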
Cool! I'm currently on holiday but will check the code change when I come back.
Have a nice day!