Dragon2938734
Results: 3 issues of Dragon2938734
Excuse me, I wonder whether the image_embedding in the code "self.image_embedding = nn.Embedding(1, cfg.DECODER.d_model * 2)" is meant to embed different views? Thank you.
How do you get affine_trans in the code? For example: affine_trans = torch.stack([m['affine_trans'] for m in meta], dim=1)
Could anyone recommend a resource repository that summarizes multimodal LLMs (MLLMs) or similar work?