qwen-vl topic

List qwen-vl repositories

PaddleMIX

710
Stars
222
Forks
710
Watchers

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrained models and a diffusion model toolbox. Equipped with high per...

awesome-vlm-architectures

393
Stars
22
Forks
Watchers

Famous Vision Language Models and Their Architectures

webmarker

50
Stars
3
Forks
50
Watchers

Mark web pages for use with vision-language models

lmms-finetune

357
Stars
41
Forks
357
Watchers

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.

Vision-Language-Models-Overview

444
Stars
22
Forks
444
Watchers

A frontier collection and survey of vision-language model papers and model GitHub repositories. Continuously updated.

Vision-SR1

142
Stars
18
Forks
142
Watchers

Reinforcement Learning of Vision Language Models with Self Visual Perception Reward