ZX-ModelCloud

Results 5 issues of ZX-ModelCloud

```
env CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=10 python -m sglang.launch_server --model-path vonjack/Phi-3-mini-4k-instruct-LLaMAfied --port 30000
```

When loading [vonjack/Phi-3-mini-4k-instruct-LLaMAfied](https://huggingface.co/vonjack/Phi-3-mini-4k-instruct-LLaMAfied) using **sglang**, the following error occurs.

```
server_args=ServerArgs(model_path='vonjack/Phi-3-mini-4k-instruct-LLaMAfied', tokenizer_path='vonjack/Phi-3-mini-4k-instruct-LLaMAfied', tokenizer_mode='auto', skip_tokenizer_init=False, load_format='auto', dtype='auto', trust_remote_code=False, context_length=None,...
```

# What does this PR do?

Remove AutoGPTQ clutter and AutoGPTQ-related configs for which backward compatibility is not worth maintaining. See [transformers PR#41567](https://github.com/huggingface/transformers/pull/41567) and [peft PR#2932](https://github.com/huggingface/peft/pull/2932).

## Before submitting

- [...