What is the performance in comparison with the original implementation?
Great implementation. Could you provide reproduction results that can be used for comparison against the original Caffe2 implementation? Thanks
I think the first conv should be Conv2d. Am I right? The correct version would look like this:
self.spatial_conv = nn.Conv2d(in_channels, intermed_channels, kernel_size=3,
                              stride=1, padding=1, bias=bias)
self.bn = nn.BatchNorm2d(intermed_channels)
self.relu = nn.ReLU()
self.temporal_conv = nn.Conv3d(intermed_channels, out_channels, temporal_kernel_size,
                               stride=temporal_stride, padding=temporal_padding, bias=bias)
I think it is okay as is. It should be kept as Conv3d, but it effectively behaves like Conv2d because the temporal kernel size is 1.
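For reference, here's a minimal sketch of the factorized (2+1)D convolution as I understand it from the paper: both layers stay `nn.Conv3d`, with the spatial one using a (1, 3, 3) kernel so it acts frame-wise like a 2D conv. The class name and channel counts here are illustrative, not the repo's exact code:

```python
import torch
import torch.nn as nn

# Sketch of the (2+1)D factorization: both convolutions are nn.Conv3d,
# but the spatial one uses a (1, 3, 3) kernel, so along the temporal axis
# it behaves exactly like a 2D convolution applied frame by frame.
class SpatioTemporalConvSketch(nn.Module):
    def __init__(self, in_channels, intermed_channels, out_channels, bias=False):
        super().__init__()
        # spatial conv: kernel size 1 in time, 3x3 in space
        self.spatial_conv = nn.Conv3d(in_channels, intermed_channels,
                                      kernel_size=(1, 3, 3),
                                      stride=1, padding=(0, 1, 1), bias=bias)
        self.bn = nn.BatchNorm3d(intermed_channels)
        self.relu = nn.ReLU()
        # temporal conv: kernel size 3 in time, 1x1 in space
        self.temporal_conv = nn.Conv3d(intermed_channels, out_channels,
                                       kernel_size=(3, 1, 1),
                                       stride=1, padding=(1, 0, 0), bias=bias)

    def forward(self, x):  # x: (N, C, T, H, W)
        return self.temporal_conv(self.relu(self.bn(self.spatial_conv(x))))

x = torch.randn(2, 3, 8, 16, 16)
out = SpatioTemporalConvSketch(3, 45, 64)(x)
print(out.shape)  # torch.Size([2, 64, 8, 16, 16])
```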
self.conv3 = SpatioTemporalResLayer(64, 128, 3, layer_sizes[1], block_type=block_type, downsample=True)
Why is downsample=True here? The input has 64 channels and the output has 128; I can't understand it. Can you help me? Thanks! @irhum
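Not the author, but my understanding is that downsample here refers to halving the spatial/temporal resolution with stride 2 in the layer's first block, while a 1x1x1 conv on the shortcut projects 64 channels to 128 so the residual addition still works. A hedged sketch of the idea (names and layer shapes are illustrative, not the repo's exact code):

```python
import torch
import torch.nn as nn

# Why downsample=True when channels go 64 -> 128: the residual shortcut must
# match the main branch in both channel count and resolution, so the first
# block halves T, H, W with stride 2 and a 1x1x1 conv reprojects the shortcut.
class DownsampleBlockSketch(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # main branch: stride-2 conv halves T, H, W and changes channel count
        self.conv = nn.Conv3d(in_channels, out_channels, kernel_size=3,
                              stride=2, padding=1, bias=False)
        self.bn = nn.BatchNorm3d(out_channels)
        # shortcut: 1x1x1 conv with stride 2 so shapes match for the addition
        self.downsample = nn.Sequential(
            nn.Conv3d(in_channels, out_channels, kernel_size=1,
                      stride=2, bias=False),
            nn.BatchNorm3d(out_channels),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)) + self.downsample(x))

x = torch.randn(1, 64, 8, 56, 56)
print(DownsampleBlockSketch(64, 128)(x).shape)  # torch.Size([1, 128, 4, 28, 28])
```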
My finding is that R(2+1)D is actually slower than C3D with fp16; with fp32, R(2+1)D is faster.
PyTorch 1.3, CUDA 10.2, cuDNN 7.6.5
I think the newer cuDNN is quite efficient at performing 3D convolution on fp16 inputs.
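For anyone who wants to check this on their own setup, here's a rough timing sketch (layer sizes are arbitrary assumptions, not the repo's benchmark) comparing fp32 and fp16 Conv3d throughput under cuDNN:

```python
import time
import torch
import torch.nn as nn

# Rough timing sketch, not a rigorous benchmark; requires a CUDA GPU.
def time_conv3d(dtype, iters=50):
    conv = nn.Conv3d(64, 64, kernel_size=3, padding=1).cuda().to(dtype)
    x = torch.randn(8, 64, 16, 56, 56, device="cuda", dtype=dtype)
    torch.backends.cudnn.benchmark = True  # let cuDNN pick the fastest algo
    for _ in range(5):                     # warm-up so algo search isn't timed
        conv(x)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        conv(x)
    torch.cuda.synchronize()
    return (time.time() - start) / iters

if torch.cuda.is_available():
    print(f"fp32: {time_conv3d(torch.float32) * 1e3:.2f} ms/iter")
    print(f"fp16: {time_conv3d(torch.float16) * 1e3:.2f} ms/iter")
```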