multimodel-large-language-model topic

List multimodel-large-language-model repositories

echoOLlama

124
Stars
4
Forks
124
Watchers

🦙 echoOLlama: A real-time voice AI platform powered by local LLMs. Features WebSocket streaming, voice interactions, and OpenAI API compatibility. Built with FastAPI, Redis, and PostgreSQL. Perfect f...

TVC

144
Stars
0
Forks
144
Watchers

[ACL 2025] The code repository for "Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning" in PyTorch.

RoboBrain2.0

742
Stars
63
Forks
742
Watchers

RoboBrain 2.0: Advanced version of RoboBrain. See Better. Think Harder. Do Smarter. 🎉🎉🎉

UI-Venus

606
Stars
35
Forks
606
Watchers

UI-Venus is a native UI agent designed to perform precise GUI element grounding and effective navigation using only screenshots as input.

Seg-Zero

582
Stars
28
Forks
582
Watchers

Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"

Robust-R1

383
Stars
6
Forks
383
Watchers

🔥🔥🔥[AAAI 2026 Oral] Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

Basic-Visual-Language-Model

47
Stars
8
Forks
47
Watchers

Build a simple, basic multimodal large model from scratch. 🤖