Sam Motamed

9 comments by Sam Motamed

The "s" in "requirements" is missing in your script.

Same question here! I'm confused about the discrepancy between the paper and the code.

Thanks @ZhangYuanhan-AI for the answer on this. So for the 7B model, only 10 frames are given to the model with full tokens (no pooling) per frame?

One last question, @ZhangYuanhan-AI: the 7B-Qwen model uses 10 frames during training, correct? Or does it sample a varying number of frames based on fps?

They replied in another thread. They don't use slow-fast pooling on the 7B model, only on the 72B.

It looks like they have commented this out in the code, so I am also wondering whether the released models have actually used the slow-fast features as described in the...