Yoav
The code is elegant and concise, but training performance on COCO val2014 is poor: the mAP is only 0.00912 after 24 epochs when I train the model from scratch.
In the forward function of the class GlobalAttentionGeneral, there is some code that I think may be wrong. Assume that batch_size is 20, words_num is 18 and embedding_dim is 256, so...
I ran this code with torch 1.6.0 and CUDA 10.2. I have successfully trained Headhunter, and now I am trying to get the tracking results. After I fixed a lot...
Running the image-generation example from the README fails with an error: Error code: 400. Error message: The input messages exceed the maximum context length (58000 tokens) after keeping only the system message (if exists) and the latest one user message...
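Hello, and thank you for your open-source work! I found a problem: when using the ZoomInSubfigure tool, each item loaded from the .parquet file contains two images, but the two images are identical; the second image is not a cropped sub-region of the first. Then, when the data is processed in the collate_fn function, convert_example runs ``` if "You are a visual assistant capable of generating and solving steps" in item['value']: content.append({'type':'text', 'text':item['value'].split("\n\nQuestion: ")[-1]}) ``` which strips the `` marker in front of the question, while the later `OBSERVATION:\nZoomInSubfigure model outputs:` still keeps the `` marker. As a result, the sample assembled by `processor.apply_chat_template` ends up looking like this: `....user\nWhat is the label of the...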