Feature Request: Support GLM-4.1V-9B-Thinking
Prerequisites
- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the README.md.
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Support GLM-4.1V-9B-Thinking
Motivation
It's a SOTA open-source thinking VLM.
Possible Implementation
No response
Any updates?
Yeah any updates?
You can already test my solution; no need to recompile anything (it works like the old GLM-4). You just need to use the converter to create a GGUF file for GLM-4.1V-9B-Thinking.
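As a sketch of the conversion step, assuming a local llama.cpp checkout and the Hugging Face weights downloaded locally (the paths and output filename below are placeholders, not files that ship with the repo):

```shell
# From the root of a llama.cpp checkout.
# Convert the downloaded HF model directory to a GGUF file (text-only for now).
python convert_hf_to_gguf.py /path/to/GLM-4.1V-9B-Thinking \
    --outfile glm-4.1v-9b-thinking-f16.gguf \
    --outtype f16
```

The resulting GGUF can then be quantized with `llama-quantize` or loaded directly with `llama-cli` / `llama-server`.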
It looks like https://github.com/ggml-org/llama.cpp/pull/14823 only supports text, so we should leave this issue open until vision is supported?
I can see text works okay, but I'm still waiting for vision support. Could this please be implemented? This model is really good at vision.
Same, it would be nice to have. I tried it on a demo and it can scan a whole page and give an overall really good result. Support for it would be really nice.
Yes, it's a very good model for work. I tried it on a demo site and got very accurate responses for vision tasks, so please support it as a vision model on llama-server.
I tried many steps but failed: it needs an --mmproj projector file to run on llama-server, and the provided GGUF supports text only, no vision or audio.
https://www.modelscope.cn/models/unsloth/GLM-4.1V-9B-Thinking-GGUF/summary
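For reference, once an mmproj file exists for this model, the vision path on llama-server would generally be invoked like this (the GGUF filenames below are hypothetical placeholders; no mmproj for GLM-4.1V has been published yet):

```shell
# Hypothetical invocation: the --mmproj projector GGUF for GLM-4.1V
# does not exist yet, which is exactly what this issue is asking for.
llama-server \
    -m GLM-4.1V-9B-Thinking-Q4_K_M.gguf \
    --mmproj mmproj-GLM-4.1V-9B-Thinking-f16.gguf \
    --port 8080
```

This mirrors how other vision models (e.g. the GLM-4V family) are served: the language model GGUF and the vision projector GGUF are separate files, and `--mmproj` wires the projector in.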
The vision is what makes this model interesting, is there some unknown problem converting it?
Now that GLM-4.5V is released, it would be great to have both supported.
Yeah it would make sense
Have there been any recent advancements?
Is there an mmproj for GLM-4.1V yet?
This issue was closed because it has been inactive for 14 days since being marked as stale.