Taylor-X76
Here is the specific correction: https://github.com/HRNet/HRNet-Image-Classification/issues/28
https://github.com/HRNet/HRNet-Image-Classification/blob/8f158719e821836e21e6cba99a3241a12a13bc41/lib/models/cls_hrnet.py#L459~L473 If the stages use different block types instead of the default bottleneck-basic-basic-basic from the original yaml file, a channel mismatch error occurs, as shown in the figure below...
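One way such a mismatch can arise, sketched under assumptions: ResNet-style blocks usually multiply their configured width by an expansion factor (4 for a bottleneck block, 1 for a basic block), so code sized for one block type sees different channel counts when another is swapped in. The `EXPANSION` table, the helper names, and the widths `[18, 36]` below are hypothetical and are not read from `cls_hrnet.py`.

```python
# Assumed expansion factors following the common ResNet convention;
# these are illustrative values, not taken from cls_hrnet.py.
EXPANSION = {"BASIC": 1, "BOTTLENECK": 4}


def stage_out_channels(block, num_channels):
    """Channels each branch emits: configured width times the block's expansion."""
    factor = EXPANSION[block]
    return [width * factor for width in num_channels]


def find_mismatches(block, num_channels, expected):
    """Return (branch_index, produced, expected) for every branch that disagrees."""
    produced = stage_out_channels(block, num_channels)
    return [
        (i, p, e)
        for i, (p, e) in enumerate(zip(produced, expected))
        if p != e
    ]


widths = [18, 36]  # hypothetical per-branch widths
# Suppose the next layer was sized on the assumption of basic blocks.
expected = stage_out_channels("BASIC", widths)

basic_mismatches = find_mismatches("BASIC", widths, expected)
bottleneck_mismatches = find_mismatches("BOTTLENECK", widths, expected)

print(basic_mismatches)
print(bottleneck_mismatches)
```

If this is what is happening, the following layer has to be built from the block's own expansion factor rather than from a width that only holds for the default block order.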
In R2Plus1D_model.py, line 200: https://github.com/jfzhang95/pytorch-video-recognition/blob/ca37de9f69a961f22a821c157e9ccf47a601904d/network/R2Plus1D_model.py#L200 The convolution there is actually 3 × 7 × 7 with padding=(1, 3, 3), not 1 × 7 × 7!
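A quick way to see why padding=(1, 3, 3) goes with a 3 × 7 × 7 kernel is the standard output-size formula for a convolution without dilation, `out = (in + 2 * padding - kernel) // stride + 1`. The sketch below is plain Python and assumes stride 1 on every axis and a hypothetical 16 × 112 × 112 clip; neither value is taken from the repository.

```python
def conv_out_size(size, kernel, padding, stride=1):
    """Output length along one axis for a convolution with dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1


def conv3d_out_shape(shape, kernel, padding, stride=(1, 1, 1)):
    """Apply the per-axis formula to a (frames, height, width) shape."""
    return tuple(
        conv_out_size(s, k, p, st)
        for s, k, p, st in zip(shape, kernel, padding, stride)
    )


clip = (16, 112, 112)  # hypothetical (frames, height, width)
padding = (1, 3, 3)    # padding quoted in the comment above

shape_3x7x7 = conv3d_out_shape(clip, (3, 7, 7), padding)
shape_1x7x7 = conv3d_out_shape(clip, (1, 7, 7), padding)

print(shape_3x7x7)
print(shape_1x7x7)
```

With the 3 × 7 × 7 kernel this padding keeps the clip shape unchanged, whereas a 1 × 7 × 7 kernel with the same padding would add two frames on the temporal axis, so the padding value itself points to a temporal kernel size of 3.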