
LLaVA v1.5 7b very low performance on TextVQA

Open wufeim opened this issue 1 year ago • 3 comments

Dear authors,

Thanks for sharing this great codebase.

I evaluated the LLaVA-v1.5-7b model (`llava_v1.5_7b`) on TextVQA_VAL and got an accuracy of only about 21.88, far below the 58.2 reported in the paper. I understand that the codebase does not aim to reproduce the paper's results exactly, but this gap seems too large to be explained by that.

Is there a known reason for this, and is there a quick fix?

Thanks very much!

wufeim avatar Oct 28 '24 07:10 wufeim

Same here. Did you solve it?

ZI-MA avatar Feb 28 '25 02:02 ZI-MA

Same here. Did you solve it?

No, I haven't figured it out. 🥲

wufeim avatar Feb 28 '25 05:02 wufeim

I've also got the same problem.

kydxh avatar Mar 16 '25 12:03 kydxh