Questions about charades_video_tsn.py
Thank you for sharing the code. It helps a lot, but I have some confusion about the data loading procedure in charades_video_tsn.py. Here is the code:
```python
n = self.data['datas'][index]['n']
if shift is None:
    shift = np.random.randint(n - self.train_gap - 2)
else:
    shift = int(shift * (n - self.train_gap - 2))
ss = [shift] + [np.random.randint(n - self.train_gap - 2)
                for _ in range(self.segments - 1)]
```
This way, ss is a list of indices of the frames to be loaded. But the values in ss don't appear to be regularly spaced, which seems inconsistent with TSN.
Did I miss something? By the way, what's the difference among charades_tsn.py, charades_video.py and charades_video_xx.py?
Looking forward to your reply.
The TSN part of the codebase was very experimental, so feel free to submit a pull request if you get good results.
I believe this is a version of TSN that picks a "center segment" and then randomly samples points before and after that segment (the sampled offsets are sorted here: https://github.com/gsig/PyVideoResearch/blob/46307b1a03ce670696297e2154ddee6f4e6b0b8a/datasets/charades_video_tsn.py#L25)
The original TSN code chooses something like 3 equally spaced segments, but we were extending it to larger videos, so we introduced some random sampling.
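Roughly, the difference can be sketched like this (a simplified illustration, not the repo's actual code; `n`, `train_gap`, and `segments` follow the snippet quoted above):

```python
import numpy as np

def tsn_uniform_starts(n, train_gap, segments):
    """Classic TSN: split the usable range into equal segments, jitter within each."""
    valid = n - train_gap - 2
    edges = np.linspace(0, valid, segments + 1).astype(int)
    return [int(np.random.randint(lo, max(hi, lo + 1)))
            for lo, hi in zip(edges[:-1], edges[1:])]

def center_plus_random_starts(n, train_gap, segments, shift=None):
    """This variant: one 'center' start plus (segments - 1) random starts, then sort."""
    valid = n - train_gap - 2
    shift = np.random.randint(valid) if shift is None else int(shift * valid)
    ss = [shift] + [np.random.randint(valid) for _ in range(segments - 1)]
    return sorted(int(s) for s in ss)

print(tsn_uniform_starts(300, 64, 3))         # roughly evenly spaced starts
print(center_plus_random_starts(300, 64, 3))  # irregularly spaced starts
```

So the irregular spacing you noticed in ss is expected with this scheme.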
charades_tsn.py just borrows code from the video version, but returns individual frames instead of video clips. "Video clip" is the term I use for a stack of video frames.
charades_video.py returns a video clip instead of a single frame.
The charades_video_xx.py files are different versions of the charades_video.py dataloader, with different data augmentations, number of clips sampled at test time, etc.
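To make the frame vs. clip distinction concrete, the item shapes look roughly like this (dimension order is an assumption for illustration; the repo's transforms may arrange it differently):

```python
import torch

# Assumed shapes for illustration only; exact dimension order may differ in the repo.
frame = torch.zeros(3, 224, 224)     # charades_tsn.py-style item: a single RGB frame (C, H, W)
clip = torch.zeros(64, 3, 224, 224)  # charades_video*.py-style item: a "clip", i.e. a stack
                                     # of train_gap consecutive frames (T, C, H, W)
print(frame.shape, clip.shape)
```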
Hope that helps, Gunnar
Thanks for your patient explanation. I will try some experiments and leave a comment on this issue if I get any new findings.