Hongwei Niu issues

Results: 18 issues of Hongwei Niu

[Question] Recognition Error

### Question Hello, I found that it, like CLIP, focuses on a very coarse granularity. For example, an image cropped from the GT box of the COCO dataset is recognized...

[F-VLM] K-Means Clustering of Frozen Features

Can you share some implementation details behind the 'K-Means Clustering of Frozen Features' result?

[F-VLM] CLIP Recognition Bias

Hello, I've been working on a project involving object detection, and I encountered a specific issue I'd like to discuss. My approach involves using RoIAlign to extract regional features from...

Question about y in the code

Hello, I would like to know what y represents at line 146 of torch_vertex.py and what it is used for.

```
if self.r > 1:
    y = F.avg_pool2d(x, kernel_size=self.r, stride=self.r)  # [B, out_dim, H/r, W/r]
    y = y.reshape(B, C, -1, 1).contiguous()  # [B, out_dim, H/r*W/r, 1]
```

What do Pair, L, T stand for in the code?

Hi, I'm a beginner and would like to ask a question. What do Pair, L, T stand for in the code? What do they mean? ``` # Pair x L...

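About MultiModalDataset

Hello, author. How do you handle synonyms? ODISE and FC-CLIP apply a max ensemble at prediction time, but that operation is very time-consuming and hurts FPS. For example, num_templates stores the number of synonyms for each class, and the max is taken over all synonym templates of each class, so the resulting final_pred_logits has shape [B, N, num_classes].

```
cur_idx = 0
for num_t in num_templates:
    final_pred_logits.append(pred_logits[:, :, cur_idx: cur_idx + num_t].max(-1).values)
    cur_idx += num_t
```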
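For readers following the max-ensemble question above, here is a minimal sketch of the same per-class max written without a Python loop. It is not taken from ODISE, FC-CLIP, or the repository being asked about. It uses NumPy arrays as a stand-in for the torch tensors in the snippet, and the function names, synonym counts, and random logits are all hypothetical. Whether it actually improves FPS in a given pipeline would need to be measured.

```python
import numpy as np


def max_ensemble_loop(pred_logits, num_templates):
    """Reference version: same logic as the loop quoted in the question."""
    out = []
    cur_idx = 0
    for num_t in num_templates:
        # Max over the templates (synonyms) that belong to one class.
        out.append(pred_logits[:, :, cur_idx:cur_idx + num_t].max(-1))
        cur_idx += num_t
    return np.stack(out, axis=-1)  # [B, N, num_classes]


def max_ensemble_reduceat(pred_logits, num_templates):
    """Segment-wise max in one call; assumes every count is >= 1."""
    counts = np.asarray(num_templates)
    # Start offset of each class's block of templates along the last axis.
    starts = np.concatenate(([0], np.cumsum(counts)[:-1]))
    return np.maximum.reduceat(pred_logits, starts, axis=-1)


rng = np.random.default_rng(0)
num_templates = [2, 1, 3]  # hypothetical synonym counts for 3 classes
pred_logits = rng.normal(size=(2, 4, sum(num_templates)))  # [B, N, total templates]

final_pred_logits = max_ensemble_reduceat(pred_logits, num_templates)
```

Both functions return an array of shape [B, N, num_classes]. The loop version is kept only as a reference to check the vectorized one against.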

Question about y in the code

CLIP Recognition Error

AttributeError: module 'signal' has no attribute 'SIGUSR1'

Questions about input pred label file

Some questions about QSAttn.