Chris Scott comments


[FIX] Add hidden_size attribute to FusedMoE to bypass the deployment error of Qwen3-30B-A3B-AWQ

I am also interested in this PR. I wanted to deploy the 235B model in AWQ.

[Bug]: `size_k must divisible by BLOCK_SIZE_K` error when using tensor parallelism with AWQ-quantized MoE models

I have the same issue with https://huggingface.co/Qwen/Qwen3-235B-A22B-GPTQ-Int4 using TP=4.

Add second openai compatible provider

I am also interested in this because I have several local machines, and sometimes a single machine, running multiple vLLM endpoints with different models.

Fix: IOS Input Zoom / Support PWA Home Screen App

Is there anything you want changed before this can be merged?

Fix: IOS Input Zoom / Support PWA Home Screen App

> Thank you very much! Thank you for the awesome project.

[New Model]: support Qwen3-235B-A22B-GPTQ-Int4

Same problem on 4 Ada A6000s.

[New Model]: support Qwen3-235B-A22B-GPTQ-Int4

I have this working with the latest 0.9. The GPTQ

[New Model]: support Qwen3-235B-A22B-GPTQ-Int4

> [@getfit-us](https://github.com/getfit-us) does it work with 0.8.5.post1?

I am not sure. I do know that if you install the latest version, it seems to work. Although, I believe the tool calling is...

Autocomplete not working

#800. I know it works for me in the latest version.