VLMEvalKit

ChartQA evaluation


Hi,

I have a question regarding the test results for ChartQA. In the paper, there's a table showing the data split:

ChartQA-H:

  • Training: 3,699 charts, 7,398 questions
  • Validation: 480 charts, 960 questions
  • Test: 625 charts, 1,250 questions

ChartQA-M:

  • Training: 15,474 charts, 20,901 questions
  • Validation: 680 charts, 680 questions
  • Test: 987 charts, 1,250 questions

I'm wondering whether the reported test results in the paper are based on just the ChartQA-H (human-authored) test set, or on the ChartQA-H and ChartQA-M (machine-generated) test sets combined?

Thank you for clarifying!

mactavish91 avatar Aug 01 '24 09:08 mactavish91

The ChartQA_TEST benchmark in VLMEvalKit includes both the ChartQA-H and ChartQA-M test sets, for a total of 2,500 questions (1,250 from each).

junming-yang avatar Aug 02 '24 07:08 junming-yang
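
For anyone who wants to verify this locally, here is a minimal sketch that inspects the ChartQA_TEST TSV VLMEvalKit downloads. The `~/LMUData` default location, the `ChartQA_TEST.tsv` file name, and the `split` column distinguishing human-authored from machine-generated questions are assumptions; check the header of your local copy.

```python
# Minimal sketch: count questions in the ChartQA_TEST TSV used by VLMEvalKit.
# Paths and column names below are assumptions, not confirmed by the thread.
import os
import pandas as pd

tsv_path = os.path.expanduser("~/LMUData/ChartQA_TEST.tsv")  # assumed default data dir
df = pd.read_csv(tsv_path, sep="\t")

print(len(df))  # expected: 2500 (1,250 ChartQA-H + 1,250 ChartQA-M)
if "split" in df.columns:  # assumed column name
    print(df["split"].value_counts())
```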

@mactavish91 check out this inference example; it could help you too: https://github.com/moured/qwen-vl2.5-chartqa

moured avatar May 05 '25 07:05 moured
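
As an alternative to that standalone repo, a minimal sketch of launching the same evaluation through VLMEvalKit's own CLI is below. The `--data` and `--model` flags follow the repo's documented `run.py` usage; the model name `Qwen2-VL-7B-Instruct` is an assumption, so check `vlmeval/config.py` in your checkout for the registered names.

```python
# Minimal sketch: run ChartQA_TEST with VLMEvalKit's CLI from the repo root.
# The model name is an assumption; see vlmeval/config.py for registered models.
import subprocess

subprocess.run(
    ["python", "run.py", "--data", "ChartQA_TEST", "--model", "Qwen2-VL-7B-Instruct"],
    check=True,
)
```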