ZX-ModelCloud issues

Results: 5 issues of ZX-ModelCloud

failed to dispatch head_dim 96

```
env CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=10 python -m sglang.launch_server --model-path vonjack/Phi-3-mini-4k-instruct-LLaMAfied --port 30000
```

When loading [vonjack/Phi-3-mini-4k-instruct-LLaMAfied](https://huggingface.co/vonjack/Phi-3-mini-4k-instruct-LLaMAfied) using **sglang**, the following error occurs.

```
server_args=ServerArgs(model_path='vonjack/Phi-3-mini-4k-instruct-LLaMAfied', tokenizer_path='vonjack/Phi-3-mini-4k-instruct-LLaMAfied', tokenizer_mode='auto', skip_tokenizer_init=False, load_format='auto', dtype='auto', trust_remote_code=False, context_length=None,...
```

add a100_qlinear.py

[MODEL] Intern vl2 support

[MODEL] support minicpm-o 2.6

Fully deprecate AutoGPTQ for GPT-QModel

# What does this PR do?

Remove AutoGPTQ clutter and AutoGPTQ-related configs that are not worth keeping for backward compatibility. See [transformers PR#41567](https://github.com/huggingface/transformers/pull/41567) and [peft PR#2932](https://github.com/huggingface/peft/pull/2932).

## Before submitting

- [...