ChartQA evaluation
Hi,
I have a question regarding the test results for ChartQA. In the paper, there's a table showing the data split:
| Split | ChartQA-H (charts / questions) | ChartQA-M (charts / questions) |
|---|---|---|
| Training | 3,699 / 7,398 | 15,474 / 20,901 |
| Validation | 480 / 960 | 680 / 680 |
| Test | 625 / 1,250 | 987 / 1,250 |
I'm wondering whether the test results reported in the paper are based only on the ChartQA-H (human-authored) test set, or on the ChartQA-H and ChartQA-M (machine-generated) test sets combined?
Thank you for clarifying!
The ChartQA_TEST benchmark includes both the ChartQA-H and ChartQA-M test sets, for a total of 2,500 questions (1,250 from each).
@mactavish91 check out this inference example; it might help you too: https://github.com/moured/qwen-vl2.5-chartqa