
Feature Request: Qwen3-Omni-30B-A3B support

Open teee9 opened this issue 4 months ago • 12 comments

Prerequisites

  • [x] I am running the latest code. Mention the version if possible as well.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Qwen has released three 30B-A3B omni models:

https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct
https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Thinking
https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Captioner

Motivation

New SOTA omni models.

Possible Implementation

https://github.com/huggingface/transformers/pull/41025
https://github.com/huggingface/transformers/pull/41045

teee9 avatar Sep 22 '25 23:09 teee9

It seems to have been forgotten by the developers.

eli0wang6 avatar Sep 26 '25 02:09 eli0wang6

> It seems to have been forgotten by the developers.

Not forgotten 🙂 Audio is just more complex; VL is prioritized first, so audio support will likely follow later.

ServeurpersoCom avatar Sep 26 '25 04:09 ServeurpersoCom

Facing this issue: `ValueError: Unrecognized configuration class <class 'transformers.models.qwen3_omni_moe.configuration_qwen3_omni_moe.Qwen3OmniMoeConfig'> for this kind of AutoModel: AutoModelForCausalLM`. Any help, please?

HaithemH avatar Sep 29 '25 14:09 HaithemH
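That `ValueError` comes from `transformers`, not llama.cpp: the `AutoModelForCausalLM` mapping has no entry for `Qwen3OmniMoeConfig`, because the omni model is not a plain causal LM. A minimal sketch of the workaround, assuming the dedicated classes added in the transformers PRs linked above (`Qwen3OmniMoeForConditionalGeneration`, `Qwen3OmniMoeProcessor`) and a transformers build recent enough to include them:

```python
MODEL_ID = "Qwen/Qwen3-Omni-30B-A3B-Instruct"

def load_omni(model_id: str = MODEL_ID):
    """Load the omni model via its dedicated classes instead of AutoModelForCausalLM.

    AutoModelForCausalLM raises the ValueError quoted above because it has no
    mapping for Qwen3OmniMoeConfig; the model-specific classes do not go
    through that mapping. Imports are deferred so this sketch only requires
    transformers when actually called.
    """
    from transformers import (
        Qwen3OmniMoeForConditionalGeneration,
        Qwen3OmniMoeProcessor,
    )

    model = Qwen3OmniMoeForConditionalGeneration.from_pretrained(
        model_id,
        dtype="auto",       # pick the checkpoint's native dtype
        device_map="auto",  # shard across available devices
    )
    processor = Qwen3OmniMoeProcessor.from_pretrained(model_id)
    return model, processor
```

The exact class names here are taken from the linked transformers PRs and the model card; verify them against your installed transformers version, since this support landed recently.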

@HaithemH Qwen3-VL and Qwen3-Omni aren't supported by llama.cpp yet.

CarlGao4 avatar Oct 04 '25 02:10 CarlGao4

> It seems to have been forgotten by the developers.

> Not forgotten 🙂 Audio is just more complex; VL is prioritized first, so audio support will likely follow later.

Could you tell us the plan for supporting Qwen3-Omni?

CarlHuangNuc avatar Oct 14 '25 00:10 CarlHuangNuc

Same request.

sxch775-work avatar Oct 22 '25 01:10 sxch775-work

Hoping for it.

CarlHuangNuc avatar Oct 22 '25 01:10 CarlHuangNuc

Yeah, it seems like more and more omni-modal models are getting released. It would be amazing if we had support for those in llama.cpp, though I know that's very complicated /:

wsbagnsv1 avatar Oct 27 '25 23:10 wsbagnsv1

Qwen3-VL is supported now!

CarlGao4 avatar Nov 01 '25 04:11 CarlGao4

What about the plan to support Omni?

CarlHuangNuc avatar Nov 05 '25 00:11 CarlHuangNuc

I'm curious about which one is more difficult to implement: Qwen3-Omni or Qwen3-Next?

calvin2021y avatar Nov 14 '25 15:11 calvin2021y

hope this hasn’t been forgotten

riunxaio avatar Nov 15 '25 15:11 riunxaio

There is an MLX implementation now: https://github.com/Blaizzy/mlx-vlm/pull/598. Audio generation speed is acceptable as long as the input does not include images.

demo repo: https://github.com/hellopahe/joi

hellopahe avatar Nov 26 '25 16:11 hellopahe

looking forward to audio generation

Busboy3129 avatar Nov 30 '25 17:11 Busboy3129