MLVU issues

summary and sub scene task

1

Are data/8_sub_scene.json and data/9_summary.json in this repo the dev set or the test set used in the leaderboard?

What is the difference between sy1998/MLVU_dev and MLVU/MVLU?

8

Hello, thank you for sharing. I have a question. Why is the dataset given in this repository [MLVU-Dev](https://huggingface.co/datasets/MLVU/MVLU) different from the dataset used by lmms-eval ([sy1998/MLVU_dev](https://huggingface.co/datasets/sy1998/MLVU_dev/tree/main))? Is there any difference...

Cola-any

Request to add our VideoChat-Flash to the leaderboard

4

Thank you for your outstanding work. We noticed that you haven't added our model [VideoChat-Flash](https://github.com/OpenGVLab/VideoChat-Flash) to the leaderboard, which achieved a performance of 74.7 with a 7B scale. We sincerely...

leexinhao

raw evaluations of proprietary models

Hi, will you open-source, or can you share, the raw evaluations for proprietary models? Best, Orr

orrzohar

How should the 'Input' column in the leaderboard be understood?

1

Hello, in the evaluation leaderboard, the value of the 'Input' column is usually 'n frm', for example '16 frm'. How should I understand this value? Are 16 frames sampled from the entire...

jchsun1

How to submit a results file to your MLVU online evaluation system?

2

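When I open http://analysis.a1.luyouxia.net:23226/, it reports: "When using a TCP mapping for HTTP access, please use the assigned domain name plus port; access through other domain names is not supported. (Invalid host header: analysis.a1.luyouxia.net)"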

Leon1207

GPT-4o API frame upload limit issue

3

Hello, in the experiment section, can GPT-4o handle uploading 120 frames at once? Why can I only upload up to 50 frames when I call the API?

wang2q

Request to add our LVAgent to the leaderboard

3

Thank you for your outstanding work. We noticed that you haven't added our agent-based method [LVAgent](https://github.com/64327069/LVAgent) to the leaderboard, which achieved a performance of 83.9 with two 72B models and...

64327069

How can I evaluate MLVU test G-AVG?

5

I want to compute G-AVG on the MLVU test set. How can I evaluate it?

yunzhuzhang0918

question about the test_res.json

Hi author, thank you for sharing such great work! I have just one simple question. **Can you tell me which model was used to generate the test_res.json in the official...

LimGeunTaekk
