LoveCV
Results: 2 issues of LoveCV
Hi! I am confused about the training of multi-size images. According to the paper: "In other words, during training we implement the varying-input-size SPP-net by two fixed-size networks that..."
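For readers following the quoted passage, here is a minimal sketch of why two fixed-size networks can share one set of fully connected weights. It assumes the window and stride rule usually attributed to the SPP-net paper (window = ceil(a/n), stride = floor(a/n) for an a x a feature map and an n x n pyramid level). The feature-map sizes 13 and 10 (for 224 and 180 inputs), the 3-level pyramid, the 256 channels, and the helper names are all illustrative assumptions, not taken from this issue.

```python
import math

def spp_pool_params(a, n):
    """Return (window, stride, output_size) for pooling an a x a map into n x n bins.

    Assumed rule: window = ceil(a / n), stride = floor(a / n).
    """
    win = math.ceil(a / n)
    stride = a // n
    out = (a - win) // stride + 1  # standard pooling output size, no padding
    return win, stride, out

def spp_output_length(a, levels, channels):
    """Length of the concatenated SPP vector for an a x a map with the given levels."""
    total = 0
    for n in levels:
        _, _, out = spp_pool_params(a, n)
        total += channels * out * out
    return total

levels = (3, 2, 1)   # hypothetical 3-level pyramid
channels = 256       # hypothetical number of conv feature channels

len_224 = spp_output_length(13, levels, channels)  # assumed map size for 224 input
len_180 = spp_output_length(10, levels, channels)  # assumed map size for 180 input
print(len_224, len_180)
```

Under these assumptions both input sizes produce a vector of the same length, so the two networks differ only in their pooling windows and strides and can share every learned parameter. The rule is not guaranteed to give exactly n x n bins for every combination of a and n, so it is worth checking the output size for any other pyramid level.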
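Hello! I have a few questions: 1) In Section 4.3 of the paper, cross-attention (CA) is first used to exchange information between modalities, and the codebook is then updated using Equation 7. Looking at the code, however, it seems that the modality features do not go through CA? See the following code: `# video self.ema_count = self.decay * self.ema_count + (1 - self.decay) * torch.sum(v_encodings, dim=0) n = torch.sum(self.ema_count) self.ema_count = (self.ema_count + self.epsilon) / (n + M *...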