mllm-evaluation topic

EASI

59
Stars
4
Forks
59
Watchers

Holistic Evaluation of Multimodal LLMs on Spatial Intelligence

EgoThink

63
Stars
4
Forks
63
Watchers

[CVPR'24 Highlight] The official code and data for the paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models"

FinMME

61
Stars
2
Forks
61
Watchers

[ACL 2025] FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation

EgoTextVQA

44
Stars
1
Forks
44
Watchers

[CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering

GAGE

27
Stars
4
Forks
27
Watchers

General AI evaluation and Gauge Engine. A unified evaluation engine for LLMs, MLLMs, audio, and diffusion models.

EIBench

25
Stars
0
Forks
25
Watchers

Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models

core-knowledge

20
Stars
1
Forks
20
Watchers

Official codebase for the ICML 2025 paper "Core Knowledge Deficits in Multi-Modal Language Models"

General-Level

19
Stars
3
Forks
19
Watchers

On Path to Multimodal Generalist: General-Level and General-Bench

VidEgoThink

15
Stars
0
Forks
15
Watchers

The official code and data for the paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI"