Text4Vis
【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
Thank you for your impressive work. Could you provide the pretrained model trained without text on HMDB, as shown in Table 6? Thank you very much. Kind regards,
Hello! Your project is very interesting. I would like to adapt it into a regression task on my own dataset. Is such a modification possible? If so, which parts should be changed, and how?
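Not speaking for the authors, but a common starting point for this kind of modification is to keep the frozen video encoder and swap the similarity-based classifier for a regression head trained with MSE. The sketch below is illustrative only; `RegressionHead`, the feature dimension, and the stand-in tensors are assumptions, not part of this repository.

```python
# A minimal sketch (not the authors' code) of replacing the class-embedding
# classifier with a regression head. `features` stands in for the output of
# the frozen video encoder; all names here are hypothetical.
import torch
import torch.nn as nn

class RegressionHead(nn.Module):
    def __init__(self, feat_dim: int = 512):
        super().__init__()
        # Map the encoder feature to a single continuous output
        # instead of per-class similarity logits.
        self.head = nn.Linear(feat_dim, 1)

    def forward(self, video_features: torch.Tensor) -> torch.Tensor:
        return self.head(video_features).squeeze(-1)

# Training step: MSE loss replaces cross-entropy over class logits.
head = RegressionHead(feat_dim=512)
criterion = nn.MSELoss()
features = torch.randn(8, 512)   # stand-in for encoder output
targets = torch.randn(8)         # continuous regression labels
loss = criterion(head(features), targets)
loss.backward()
```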
Hello, I am very interested in your work and have two questions.
1. How is the classifier obtained (i.e., the training procedure)? Specifically, how is the lda_0.1.pt file produced by transferring visual statistic knowledge (LDA), and how are the classes_features obtained by transferring textual semantic knowledge?
2. The related .pt files, distilbert-base-k400.pt and lda_0.1.pt, are not provided.
Looking forward to your reply.
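For reference, here is a plausible sketch of the textual-semantic-knowledge side, as commonly done with CLIP: encode each class name with the frozen text encoder and keep the normalized embeddings as classifier weights. This assumes the OpenAI `clip` package; the prompt template and output file name are hypothetical, and this is not the authors' exact pipeline (in particular, it does not cover the LDA-based visual statistic classifier).

```python
# Illustrative sketch: build class embeddings ("classes_features") from
# class names using a frozen CLIP text encoder. Assumes `pip install clip`
# (https://github.com/openai/CLIP); names and prompts are assumptions.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/16", device=device)

class_names = ["brush hair", "cartwheel", "catch"]  # e.g. HMDB51 classes
prompts = clip.tokenize(
    [f"a video of a person {c}" for c in class_names]
).to(device)

with torch.no_grad():
    classes_features = model.encode_text(prompts)
    # L2-normalize so similarities against visual features are cosine similarities.
    classes_features = classes_features / classes_features.norm(dim=-1, keepdim=True)

torch.save(classes_features.cpu(), "classes_features.pt")  # hypothetical file name
```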
CoOp
May I ask how CoOp is implemented in the paper? Is there a tutorial available?
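For readers unfamiliar with it, CoOp (Context Optimization) replaces hand-written prompt text with learnable context vectors that are prepended to each class name's token embeddings before the frozen CLIP text encoder. Below is a minimal, simplified sketch of that idea, not the implementation used in this paper; all dimensions and names are assumptions.

```python
# A CoOp-style prompt learner sketch (illustrative, simplified): only the
# context vectors receive gradients, while the CLIP encoders stay frozen.
import torch
import torch.nn as nn

class PromptLearner(nn.Module):
    def __init__(self, n_ctx: int = 16, ctx_dim: int = 512, n_classes: int = 51):
        super().__init__()
        # Shared learnable context tokens, small random initialization.
        self.ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        self.n_classes = n_classes

    def forward(self, class_token_embeddings: torch.Tensor) -> torch.Tensor:
        # class_token_embeddings: (n_classes, n_name_tokens, ctx_dim),
        # the frozen token embeddings of each class name.
        ctx = self.ctx.unsqueeze(0).expand(self.n_classes, -1, -1)
        # Prompt = [learned context tokens][class name tokens].
        return torch.cat([ctx, class_token_embeddings], dim=1)

learner = PromptLearner()
name_emb = torch.randn(51, 4, 512)  # stand-in for frozen token embeddings
prompts = learner(name_emb)         # (51, 16 + 4, 512) -> frozen text encoder
```

In full CoOp the learned context sits between the start-of-sequence token and the class tokens inside CLIP's token sequence; the sketch above omits that bookkeeping for brevity.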
Hello, I read your paper and found it very inspiring and well written. I have two questions.
1. I have successfully reproduced the code. Using the ViT-L/14 pretrained model on two 4090 GPUs, I got top-1: 95.3% / top-5: 99.2%, which may still fall short of your results.
2. When fusing the visual and textual features, you use CLIP's default cosine-similarity computation, but I don't quite follow the code; it doesn't seem to match the pseudocode in the original CLIP paper. Could you explain what `logit_scale` is for, why it is needed, and why it is initialized this way?

```python
self.logit_scale = nn.Parameter(torch.ones([]) * np.log(1 / 0.07))
logit_scale = self.logit_scale.exp()
logits = logit_scale * image_emb @ text_emb.t()
```
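For context: `logit_scale` is the learnable temperature from CLIP's contrastive loss, stored in log space so that `exp()` always yields a positive scale; initializing it at `log(1/0.07)` reproduces the temperature τ = 0.07 from the CLIP paper, so `logit_scale.exp() * image_emb @ text_emb.t()` matches the paper's pseudocode `logits = np.dot(I_e, T_e.T) * np.exp(t)`. A self-contained illustration follows; the random embeddings and dimension are assumptions.

```python
# Why the temperature matters: cosine similarities lie in [-1, 1], which is
# too flat for a useful softmax, so CLIP scales them by a learnable factor.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

logit_scale = nn.Parameter(torch.ones([]) * np.log(1 / 0.07))  # exp() ~= 14.29

image_emb = F.normalize(torch.randn(4, 512), dim=-1)  # unit-norm embeddings, so
text_emb = F.normalize(torch.randn(4, 512), dim=-1)   # emb @ emb.T is cosine sim

cosine = image_emb @ text_emb.t()        # values in [-1, 1]
logits = logit_scale.exp() * cosine      # scaled logits, as in CLIP's pseudocode
probs = logits.softmax(dim=-1)           # sharper, trainable distribution
```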
Which code should I run to reproduce the published result? Also, I noticed that `train_nce.py` is quite similar to the code for [BIKE](https://github.com/whwu95). It would be helpful if you could...
While the GitHub links are available, all OneDrive links have expired. Training on HMDB51 and UCF101 requires the pre-trained ViT-L models, which are therefore inaccessible. Please extend...