Dragon2938734

Results 3 issues of Dragon2938734

Excuse me, I wonder whether the image_embedding in the code "self.image_embedding = nn.Embedding(1, cfg.DECODER.d_model * 2)" is meant to embed different views? Thank you.
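
For reference, here is a minimal sketch of what that constructor call creates in plain PyTorch. The value d_model = 256 is a made-up stand-in for cfg.DECODER.d_model, and nothing here is taken from the repository itself. With num_embeddings=1 the table holds a single learned vector, so the layer on its own has no per-view index to look up. How the repository uses that vector afterwards is not shown here.

```python
import torch
import torch.nn as nn

d_model = 256  # hypothetical stand-in for cfg.DECODER.d_model

# Same constructor call as in the question, with the stand-in width.
image_embedding = nn.Embedding(1, d_model * 2)

# The lookup table has exactly one row, so 0 is the only valid index.
weight_shape = tuple(image_embedding.weight.shape)
row = image_embedding(torch.tensor([0]))
row_shape = tuple(row.shape)

print(weight_shape, row_shape)
```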

How do you get affine_trans in the code? For example: affine_trans = torch.stack([m['affine_trans'] for m in meta], dim=1)
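
As a rough illustration of what the quoted line does, the sketch below builds a stand-in meta list and runs the same torch.stack call. The layout is an assumption: one dict per view, each holding a batch of 2x3 affine matrices, with zeros as placeholder values. Where the real 'affine_trans' entries are computed (presumably in the dataset code) is not shown in the question and is not reproduced here.

```python
import torch

batch_size, num_views = 2, 4  # hypothetical sizes, for illustration only

# Stand-in meta: one dict per view, each with a (batch, 2, 3) tensor.
meta = [{'affine_trans': torch.zeros(batch_size, 2, 3)} for _ in range(num_views)]

# Same call as in the question: stack the per-view tensors along a new dim 1.
affine_trans = torch.stack([m['affine_trans'] for m in meta], dim=1)
stacked_shape = tuple(affine_trans.shape)

print(stacked_shape)  # (batch, views, 2, 3) under the assumed layout
```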

Could anyone recommend similar curated resource repositories for multimodal LLMs (MLLMs)?