dual_encoding
Can this work handle the video-to-text task?
Great work. As I understand it, the model takes a sentence and finds the best-matching video among several candidate videos. Can it also do the video-to-text task, that is, take a video and find the best-matching text among several candidate texts?
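To make the two directions concrete, here is a minimal sketch of what I mean. It assumes videos and sentences have already been embedded into a shared space and are compared by cosine similarity; the vectors are made-up toy values and `best_match` is a hypothetical helper, not anything from dual_encoding.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(query, candidates):
    """Return the index of the candidate most similar to the query."""
    scores = [cosine(query, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy embeddings (assumed values for illustration only).
video_embs = [[1.0, 0.0], [0.0, 1.0]]
text_embs = [[0.1, 0.9], [0.8, 0.2]]

# Text-to-video: a sentence is the query, videos are the candidates.
text_to_video = best_match(text_embs[0], video_embs)

# Video-to-text: a video is the query, sentences are the candidates.
video_to_text = best_match(video_embs[0], text_embs)
```

If both modalities really do share one embedding space, the second direction would only swap the roles of query and candidates, which is why I am asking whether it is supported.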
Looking forward to your reply.