mllms topic
GlobalCom2
[AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models
TVC
[ACL 2025] The code repository for "Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning" in PyTorch.
Awesome-Reasoning-MLLM
Awesome Reasoning in MLLMs: Papers and Projects about learning to reason with MLLMs, including Chain-of-Thought (CoT), OpenAI o1, and DeepSeek-R1
OS-Agent-Survey
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
Active-o3
ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
G2VLM
G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
General-Level
On Path to Multimodal Generalist: General-Level and General-Bench
InternNav
InternRobotics' open platform for building generalized navigation foundation models.
Olympus
[CVPR 2025 Highlight] Official code for "Olympus: A Universal Task Router for Computer Vision Tasks"
MedTrinity-25M
[ICLR 2025] This is the official repository of our paper "MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine"