Akanksha Bindal (2 comments)
Thanks for that explanation. Is the context window for the video and audio frames determined by the kernel sizes of the first audio and video conv layers? For example, if we...
Changing the audio kernel size here messes up the dimensions of the model. How did you account for the context window size and M inside the model?
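To illustrate the dimension problem, here is a minimal sketch using the standard output-length formula for a 1-D convolution. It is only a guess at the cause: the function name `conv1d_out_len` and the frame count of 20 are hypothetical and not taken from the model's code. It shows that an unpadded conv layer shortens its output as the kernel grows, which would change the input size of every later layer.

```python
def conv1d_out_len(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution (standard formula)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Hypothetical input length; not a value from the model under discussion.
frames = 20

for k in (3, 5, 7):
    unpadded = conv1d_out_len(frames, k)
    # "Same" padding for odd kernels keeps the length fixed at stride 1.
    padded = conv1d_out_len(frames, k, padding=(k - 1) // 2)
    print(f"kernel={k}: unpadded={unpadded}, same-padded={padded}")
```

If the first audio layer is unpadded, any layer after it that expects a fixed length (for example a flatten followed by a linear layer) would need to be resized whenever the kernel size changes.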