Sk_Kim
Could you upload the JSON param file?
Hi, author. How can I visualize the attention maps for your results? 1) Using the encoder (ViT)? 2) Using the decoder (ViT)? Given input x -> y = encoder(x) -> decoder(y)....
What is the CUDA version for MSDA?
Thank you for the nice work. I would like to clarify my understanding of how ViCLIP is trained in this paper. If the vision transformer is not pre-trained with a method such as MAE, then it...
1. For pre-training the FM, how many frames are used (150 frames for each batch)? 2. For the downstream task, I see that you used 224 × 224 with 1 frame for segmentation....
Hi, how can I create a real-time US probe navigator like the one in navig.gif? Thanks
I followed the code, and the data processing looks very time-consuming. Is there a better approach?
Hi, I wonder whether all videos need to have the same length during training. At inference, how can you infer if the video is longer or shorter...