mllm-evaluation topics

[CVPR'24 Highlight] The official code and data for the paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models"

AdaCheng

egocentric-vision

mllm-evaluation

FinMME

61

Stars

2

Forks

61

Watchers

[ACL 2025] FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation

luo-junyu

mllm-evaluation

mllm-reasoning

EgoTextVQA

44

Stars

1

Forks

44

Watchers

[CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering

zhousheng97

egocentric-qa-assistance

mllm-evaluation

scene-text-videoqa

scene-text-vqa

GAGE

27

Stars

4

Forks

27

Watchers

General AI evaluation and Gauge Engine. A unified evaluation engine for LLMs, MLLMs, audio, and diffusion models.

HiThink-Research

agent

game-arena

llm

llm-evaluation

EIBench

25

Stars

0

Forks

25

Watchers

Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models

Lum1104

chain-of-thought-reasoning

emotion-analysis

emotion-reasoning

mllm-evaluation

core-knowledge

20

Stars

1

Forks

20

Watchers

Official codebase for the ICML 2025 paper "Core Knowledge Deficits in Multi-Modal Language Models"

grow-ai-like-a-child

core-knowledge

large-language-model

mllm-evaluation

multi-modal-large-language-model

General-Level

19

Stars

3

Forks

19

Watchers

On Path to Multimodal Generalist: General-Level and General-Bench

path2generalist

benchmark

llm

llm-evaluation

mllm

VidEgoThink

15

Stars

0

Forks

15

Watchers

The official code and data for the paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI"

AdaCheng

egocentric-videos

mllm-evaluation