风云再起 issues

Results: 2 issues by 风云再起

[Bug] internlm2-chat-20b sometimes explicitly outputs <|im_end|> in its replies and fails to stop generating

### Describe the bug

The problem occurs when running [web_demo.py](https://github.com/InternLM/InternLM/blob/main/chat/web_demo.py) on a server with the A100 (1/4) * 2 configuration at https://studio.intern-ai.org.cn/

The token is output explicitly: ![image](https://github.com/InternLM/InternLM/assets/3361335/41deaf1c-d55f-4b2e-b4c8-34b79ffbac75)

The token is output explicitly and generation does not stop: ![image](https://github.com/InternLM/InternLM/assets/3361335/4703cab4-4dba-4cb5-8d1a-2ec3e163d1fa)

### Environment

| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 | Name: torch Version:...
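Until the root cause is identified, the symptom in this report can be masked on the client side by cutting the reply at the first `<|im_end|>` marker. The sketch below is only a post-processing workaround in plain Python, not the fix used in web_demo.py. The helper names `truncate_at_stop` and `stream_until_stop` are hypothetical, and it assumes the demo exposes the reply as an iterator of text chunks.

```python
# Hypothetical client-side workaround: hide "<|im_end|>" and everything after it.
STOP_MARKERS = ("<|im_end|>",)


def truncate_at_stop(text, stop_markers=STOP_MARKERS):
    """Return text up to (not including) the earliest stop marker."""
    cut = len(text)
    for marker in stop_markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def stream_until_stop(chunks, stop_markers=STOP_MARKERS):
    """Yield text from an iterator of chunks, ending at the first stop marker.

    Assumes stop_markers is non-empty. The last few characters are held back
    so that a marker split across two chunks is still detected.
    """
    hold = max(len(m) for m in stop_markers) - 1
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        clean = truncate_at_stop(buffer, stop_markers)
        if len(clean) < len(buffer):
            # A marker was found: emit what precedes it and stop consuming.
            if clean:
                yield clean
            return
        safe = len(buffer) - hold
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]
    if buffer:
        yield buffer


if __name__ == "__main__":
    pieces = ["Hello<|im", "_end|>text that should be hidden"]
    print("".join(stream_until_stop(pieces)))
```

Because the generator returns as soon as a marker appears, the caller stops pulling chunks from the model, which also hides the runaway output. It does not address why the model emits the token as visible text in the first place.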

[Feature] Suggest training a GPTQ-4bit quantized model of internlm2-chat-7b and supporting deployment with llmdeploy

### Motivation

llmdeploy supports deploying GPTQ-quantized models on V100 GPUs.

### Related resources

Qwen/Qwen1.5-72B-Chat-GPTQ-Int4
TheBloke/Llama-2-7B-Chat-GPTQ

Both Llama and Qwen already provide GPTQ models.

### Additional context

_No response_