Yoav issues

Results: 5 issues of Yoav

very low mAP on coco val2014 when training from scratch

The code is elegant and concise, but the training performance on coco val2014 is poor. The mAP is only 0.00912 after 24 epochs when I train the model from scratch.

the meaning of the mask in "attn.data.masked_fill_(mask.data, -float('inf'))" in the forward function of the class GlobalAttentionGeneral

In the forward function of the class GlobalAttentionGeneral, there is some code that I think may be wrong. Assume that batch_size is 20, words_num is 18 and embedding_dim is 256, so...
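For readers unsure what filling attention scores with `-float('inf')` does, here is a minimal pure-Python sketch. It assumes the mask marks positions that should receive no attention (for example, padding slots), which the report does not confirm. The function name and the sample values are hypothetical and are not taken from the repository.

```python
import math

def masked_softmax(scores, mask):
    """Softmax over one row of scores; positions where mask is True get weight 0.

    Assumes at least one position is unmasked.
    """
    # Same effect as masked_fill_(mask, -inf): masked scores become -inf.
    filled = [-math.inf if m else s for s, m in zip(scores, mask)]
    peak = max(filled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in filled]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical row: three real positions followed by two masked ones.
scores = [2.0, 1.0, 0.5, 0.3, 0.3]
mask = [False, False, False, True, True]
weights = masked_softmax(scores, mask)
```

Because `exp(-inf)` is zero, the masked positions drop out of the normalization and the remaining weights still sum to one.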

so many bugs in the code!!!

I ran this code with torch 1.6.0 and CUDA 10.2. I have successfully trained the Headhunter, and now I am trying to get the tracking results. After I fixed a lot...

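Bug encountered when running the first provided image-generation example

Running the image-generation example from the README raises an error. Error code: 400. Error message: The input messages exceed the maximum context length (58000 tokens) after keeping only the system message (if exists) and the latest one user message...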

Possible bug in the use of the ZoomInSubfigure tool

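Hello author, thank you for your open-source work! I found a problem: when using the ZoomInSubfigure tool, an item loaded from the .parquet file contains two images, but the two images are identical. The second image is not a cropped sub-region of the first. Then, when the data is processed in the collate_fn function, convert_example does the following:

```
if "You are a visual assistant capable of generating and solving steps" in item['value']:
    content.append({'type':'text', 'text':item['value'].split("\n\nQuestion: ")[-1]})
```

This removes the `` marker in front of the question, while the later `OBSERVATION:\nZoomInSubfigure model outputs:` keeps the `` marker, so the sample finally assembled by `processor.apply_chat_template` looks like this: `....user\nWhat is the label of the...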
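The effect of the quoted `split` call can be reproduced in isolation. This is only a sketch: `MARKER` is a hypothetical placeholder, since the report does not show the real marker, and the sample strings and the helper name `extract_question` are invented for illustration.

```python
MARKER = "<TAG>"  # hypothetical placeholder; the real marker is not shown in the report

def extract_question(value: str) -> str:
    """Mirror the quoted snippet: keep only the text after the last question delimiter."""
    return value.split("\n\nQuestion: ")[-1]

# Invented sample turns, assuming the marker sits before the question delimiter.
system_turn = (
    "You are a visual assistant capable of generating and solving steps. "
    + MARKER
    + "\n\nQuestion: What is in the picture?"
)
observation_turn = "OBSERVATION:\nZoomInSubfigure model outputs: " + MARKER

question = extract_question(system_turn)
```

Under these assumptions the marker is dropped from the question turn but survives in the observation turn, which would explain a mismatch between the marker count and the number of images in the assembled sample.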
