dwq370 comments

Results: 19 comments by dwq370

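After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?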

With the original model, the process loads the checkpoint: ![image](https://user-images.githubusercontent.com/131581396/233882283-025035e5-e4bb-42b9-99cb-d25628f9fd20.png) With the fine-tuned model, it warns that some parameters could not be loaded from the checkpoint: ![image](https://user-images.githubusercontent.com/131581396/233882699-01613f95-948f-4fee-a939-05c339d1a44f.png) The problem is probably here, but I don't know how to fix it.

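After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?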

I followed the approach in https://www.heywhale.com/mw/project/6436d82948f7da1fee2be59e ![image](https://user-images.githubusercontent.com/131581396/234478081-9c0c704c-5f55-4740-a01d-b9f4002b14c3.png) Running it produced a new error: ![image](https://user-images.githubusercontent.com/131581396/234478170-7f338588-8b1d-4acb-86bb-dae96b4921dd.png) `AttributeError: 'ChatGLMModel' object has no attribute 'prefix_encoder'`

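After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?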

Problem solved. ![image](https://user-images.githubusercontent.com/131581396/234500002-1b81afed-47aa-4791-a662-98acea304722.png) Add the following:
`config = AutoConfig.from_pretrained("./ptuning/THUDM/chatglm-6b", trust_remote_code=True)`
`config.pre_seq_len = 64`
`model = AutoModel.from_pretrained("./ptuning/THUDM/chatglm-6b", config=config, trust_remote_code=True)`
As long as `pre_seq_len` matches the value used during training, it runs.

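After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?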

On line 3, change `model = AutoModel.from_pretrained(CheckPoint_Path, config=config, trust_remote_code=True)` to `model = AutoModel.from_pretrained("/root/autodl-tmp/modal/chatglm-6b", config=config, trust_remote_code=True)` and it works.

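After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?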

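My guess is that with the prefixEncoder training method you need to load the base model first and then load the prefixEncoder. If you want to load the fine-tuned checkpoint directly, try removing the `pre_seq_len` parameter during training; if compute resources are insufficient, you may need to use deepspeed for fine-tuning. ![image](https://user-images.githubusercontent.com/131581396/236716200-2b911b08-d22f-442e-870c-5f64a8a42066.png) deepspeed always fails on my machine, so I can only use the prefixEncoder method.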
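The "base model first, then prefix encoder" loading order described above might look roughly like the sketch below. It is not taken from the thread: the checkpoint file name `pytorch_model.bin`, the key prefix `transformer.prefix_encoder.`, and the helper names `extract_prefix_state_dict` and `load_ptuned_model` are all assumptions for illustration and should be checked against your own checkpoint.

```python
import os

# Assumed key prefix under which the fine-tuned checkpoint stores prefix-encoder weights.
PREFIX = "transformer.prefix_encoder."


def extract_prefix_state_dict(state_dict, prefix=PREFIX):
    """Keep only the prefix-encoder entries and strip the leading module path."""
    return {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}


def load_ptuned_model(base_path, checkpoint_path, pre_seq_len):
    """Hypothetical helper: load the base model, then overlay the prefix encoder."""
    # Heavy imports stay local so the helper above works without torch installed.
    import torch
    from transformers import AutoConfig, AutoModel

    config = AutoConfig.from_pretrained(base_path, trust_remote_code=True)
    config.pre_seq_len = pre_seq_len  # must match the value used during training
    model = AutoModel.from_pretrained(base_path, config=config, trust_remote_code=True)

    # Assumption: the P-tuning checkpoint is a single pytorch_model.bin file.
    state = torch.load(os.path.join(checkpoint_path, "pytorch_model.bin"), map_location="cpu")
    model.transformer.prefix_encoder.load_state_dict(extract_prefix_state_dict(state))
    return model
```

Setting `pre_seq_len` on the config before building the model is what creates the `prefix_encoder` attribute, which would explain the `AttributeError` seen when it is left unset.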


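After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?

After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?

After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?

After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?

After fine-tuning the INT4 model on my own data and deploying it with web_demo, a question waited 190 s in the queue and returned no result. Can anyone explain what causes this?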

RTX 3090 GPU, CUDA 11.1: INT4 inference on a single card fails with an error

[tensorrt-llm backend] A question about launch_triton_server.py

[tensorrt-llm backend] A question about launch_triton_server.py

[Usage]: tensor-parallel-size=2, the program just kept hanging

[Usage]: tensor-parallel-size=2, the program just kept hanging