Siyi Chen comments


Results: 4 comments of Siyi Chen

RGBDiff

Thank you! Downsampling by n means selecting 1 frame out of every n frames. Do you first select those frames and then calculate RGBDiff?
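
To make the question concrete, here is a minimal NumPy sketch of the two possible orderings. The helper names `downsample` and `rgb_diff` and the toy clip are hypothetical, made up for illustration; this is not the authors' implementation, and which ordering they use is exactly what the comment asks.

```python
import numpy as np

def downsample(frames: np.ndarray, n: int) -> np.ndarray:
    """Keep 1 frame out of every n; frames has shape [T, H, W, C]."""
    return frames[::n]

def rgb_diff(frames: np.ndarray) -> np.ndarray:
    """Difference between consecutive frames, computed in a signed dtype
    so that negative differences do not wrap around in uint8."""
    frames = frames.astype(np.int16)
    return frames[1:] - frames[:-1]

# Toy clip (an assumption for illustration): 8 frames of 2x2 RGB,
# where every pixel value equals the frame index.
clip = np.stack([np.full((2, 2, 3), t, dtype=np.uint8) for t in range(8)])

# Order A: select frames first, then difference the selected frames.
select_then_diff = rgb_diff(downsample(clip, n=2))

# Order B: difference adjacent original frames, then downsample the differences.
diff_then_select = downsample(rgb_diff(clip), n=2)

print(select_then_diff.shape, diff_then_select.shape)
```

In this toy clip, order A produces differences that span n original frames (larger motion per difference), while order B keeps differences between adjacent original frames, so the two orderings give different inputs to the encoder.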

RGBDiff

Thank you! How about the result obtained in Table 12? Do you also do self-supervised training on two encoders for RGB and RGBDiff, and average the similarity of two...

Table 6 Question

Thank you! Also, I am curious: have you tried training on the full K400 dataset? Would that help or harm the model? > On Sep 13, 2023, at 4:35 AM,...

The code implementation does not match the description in the paper.

What's more, I think the author's implementation of `encode_image_mini` may always fuse all text tokens with image tokens - this is a problem during training, since the "answer" text part...