ChartQA evaluation
Hi,
I have a question regarding the test results for ChartQA. In the paper, there's a table showing the data split:
| Split | ChartQA-H (charts / questions) | ChartQA-M (charts / questions) |
|---|---|---|
| Training | 3,699 / 7,398 | 15,474 / 20,901 |
| Validation | 480 / 960 | 680 / 680 |
| Test | 625 / 1,250 | 987 / 1,250 |
I'm wondering whether the test results reported in the paper are based only on the ChartQA-H (human-authored) test set, or on the ChartQA-H and ChartQA-M (machine-generated) test sets combined?
Thank you for clarifying!
The ChartQA_TEST benchmark includes both the ChartQA-H and ChartQA-M test sets, for a total of 2,500 questions (1,250 from each).
@mactavish91 check out this inference example; it might help you too: https://github.com/moured/qwen-vl2.5-chartqa