Meta-Transformer for Unified Multimodal Learning

Results: 5 MetaTransformer issues

In the sample code provided, features are concatenated before being processed by the encoder: features = torch.concat([video_tokenizer(video), audio_tokenizer(audio), time_series_tokenizer(time_data)], dim=1). However, when I ran tokenizers for different modalities, the tokenized shape...
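For context on the shape question above: concatenating along dim=1 only works when every tensor has the same rank and the same size on all other axes. The following is a minimal pure-Python sketch of that rule, not code from the MetaTransformer repository. The helper name concat_shape and the (batch, tokens, embed_dim) shapes are assumptions chosen for illustration.

```python
from typing import Sequence, Tuple


def concat_shape(shapes: Sequence[Tuple[int, ...]], dim: int = 1) -> Tuple[int, ...]:
    """Return the shape obtained by concatenating tensors of `shapes` along `dim`.

    Raises ValueError if the shapes differ in rank or on any axis other than `dim`.
    (Hypothetical helper for illustration; it mirrors the usual concatenation rule.)
    """
    if not shapes:
        raise ValueError("need at least one shape")
    rank = len(shapes[0])
    if not 0 <= dim < rank:
        raise ValueError(f"dim {dim} is out of range for rank {rank}")
    total = 0
    for shape in shapes:
        if len(shape) != rank:
            raise ValueError(f"rank mismatch: {shape} vs {shapes[0]}")
        for axis in range(rank):
            # Every axis except the concatenation axis must agree.
            if axis != dim and shape[axis] != shapes[0][axis]:
                raise ValueError(f"axis {axis} mismatch: {shape} vs {shapes[0]}")
        total += shape[dim]
    result = list(shapes[0])
    result[dim] = total
    return tuple(result)


# Made-up token shapes (batch, tokens, embed_dim); not measured from the real tokenizers.
video_tokens = (2, 1568, 768)
audio_tokens = (2, 512, 768)
time_series_tokens = (2, 96, 768)
combined = concat_shape([video_tokens, audio_tokens, time_series_tokens], dim=1)
```

If one tokenizer returned a different embedding size or a different number of axes, this check would raise, which is the kind of mismatch the issue appears to describe.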

First of all, congratulations on your work! I opened this issue to ask whether you could upload the Data2Seq pre-trained weights; they could be very useful for many researchers. Thanks...

Hi, thanks for your outstanding work! I am trying to use Meta-Transformer for image classification. I noticed that in the paper, you wrote "On image classification, with the help...

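Thank you very much for your outstanding work. I am new to this area of research, and reading your paper was very inspiring. However, I ran into a problem when using the X-ray code: after installing some libraries, it reports that models cannot be found. What is the cause?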

https://github.com/invictus717/MetaTransformer/blob/b08a2bee6dae578bbbedd124859bfe4201181681/Data2Seq/Data2Seq.py#L52 In the code, embeddings is a list containing input_ids and attention_mask, which causes an error in the function zero_padding.
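One possible reading of this report is that a tokenizer output carrying both input_ids and attention_mask is passed directly to a padding routine that expects a flat sequence of token ids. The sketch below illustrates that situation under stated assumptions: zero_pad and extract_input_ids are hypothetical helpers written for this example, not the repository's zero_padding function, and the dict-shaped tokenizer output is assumed.

```python
from collections.abc import Mapping
from typing import List, Sequence


def zero_pad(ids: Sequence[int], length: int) -> List[int]:
    """Right-pad with zeros (or truncate) so the result has exactly `length` items."""
    ids = list(ids)[:length]
    return ids + [0] * (length - len(ids))


def extract_input_ids(encoded) -> Sequence[int]:
    """Return the token ids whether `encoded` is a mapping or already a flat sequence."""
    if isinstance(encoded, Mapping):
        # Assumed layout: {"input_ids": [...], "attention_mask": [...]}
        return encoded["input_ids"]
    return encoded


# Made-up tokenizer output for illustration.
encoded = {"input_ids": [101, 7592, 102], "attention_mask": [1, 1, 1]}
padded = zero_pad(extract_input_ids(encoded), length=6)
```

Selecting the token ids before padding keeps the padding routine working on a single flat sequence; whether this matches the actual fix needed in Data2Seq.py is not confirmed by the issue text.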