llama.cpp Feature Request: Qwen3-Omni-30B-A3B support

Prerequisites

[x] I am running the latest code. Mention the version if possible as well.
[x] I carefully followed the README.md.
[x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
[x] I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Qwen has released three Qwen3-Omni-30B-A3B models:
https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct
https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Thinking
https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Captioner

Motivation

new SOTA omni models

Possible Implementation

https://github.com/huggingface/transformers/pull/41025
https://github.com/huggingface/transformers/pull/41045

Sep 22 '25 23:09 teee9

It seems to have been forgotten by the developers.

Sep 26 '25 02:09 eli0wang6

It seems to have been forgotten by the developers.

Not forgotten 🙂 Audio is just more complex. VL is prioritized first, so audio support will likely follow later.

Sep 26 '25 04:09 ServeurpersoCom

I'm facing this issue: ValueError: Unrecognized configuration class <class 'transformers.models.qwen3_omni_moe.configuration_qwen3_omni_moe.Qwen3OmniMoeConfig'> for this kind of AutoModel: AutoModelForCausalLM. Any help, please?

Sep 29 '25 14:09 HaithemH

@HaithemH Qwen3-VL and Qwen3-Omni aren't supported by llama.cpp yet.

Oct 04 '25 02:10 CarlGao4

It seems to have been forgotten by the developers.

Not forgotten 🙂 Audio is just more complex. VL is prioritized first, so audio support will likely follow later.

Could you share the plan for supporting Qwen3-Omni?

Oct 14 '25 00:10 CarlHuangNuc

same request

Oct 22 '25 01:10 sxch775-work

Hoping for this too.

Oct 22 '25 01:10 CarlHuangNuc

Yeah, it seems like more and more omni-modal models are getting released. It would be amazing if we had support for those in llama.cpp, though I know that's very complicated /:

Oct 27 '25 23:10 wsbagnsv1

Qwen3-VL has been supported now!

Nov 01 '25 04:11 CarlGao4

What about the plan for supporting Omni?

Nov 05 '25 00:11 CarlHuangNuc

I'm curious about which one is more difficult to implement: Qwen3-Omni or Qwen3-Next?

Nov 14 '25 15:11 calvin2021y

hope this hasn’t been forgotten

Nov 15 '25 15:11 riunxaio

We have an MLX implementation now: https://github.com/Blaizzy/mlx-vlm/pull/598 Audio generation speed is acceptable as long as the input does not include images.

demo repo: https://github.com/hellopahe/joi

Nov 26 '25 16:11 hellopahe

looking forward to audio generation

Nov 30 '25 17:11 Busboy3129