
chatglm3-6b\modeling_chatglm.py", line 413, in forward: "cache_k, cache_v = kv_cache" raises ValueError: too many values to unpack (expected 2)

armstrong1972 opened this issue 1 year ago • 3 comments

System Info / 系統信息

Downloaded the "ZhipuAI/chatglm3-6b" model from https://modelscope.cn/models/ZhipuAI/ChatGLM-6B .

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
model = model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)

Popup error:

File "D:\AI\Txt2Dialog\Codes\d.py", line 16, in
    response, history = model.chat(tokenizer, "你好", history=[])
... ...
File "C:\Users\XXXX.cache\huggingface\modules\transformers_modules\chatglm3-6b\modeling_chatglm.py", line 413, in forward
    cache_k, cache_v = kv_cache
ValueError: too many values to unpack (expected 2)
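For context, the failing line unpacks the per-layer cache into exactly two tensors, so anything other than a 2-element cache breaks it. A minimal sketch of the failure mode (the `forward_step` helper and the 3-element tuple are purely illustrative, not the actual cache layout a newer transformers passes):

```python
# Hypothetical illustration of the unpack in modeling_chatglm.py's forward():
# the model expects kv_cache to be a 2-tuple of (key, value) tensors.
def forward_step(kv_cache):
    cache_k, cache_v = kv_cache  # raises ValueError if kv_cache has != 2 elements
    return cache_k, cache_v

forward_step(("key", "value"))  # works: exactly two elements

try:
    # simulates an incompatible cache object with extra elements
    forward_step(("key", "value", "extra"))
except ValueError as e:
    print(e)  # too many values to unpack (expected 2)
```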

Who can help? / 谁可以帮助到您?

@Btlmd

Information / 问题信息

  • [ ] The official example scripts / 官方的示例脚本
  • [X] My own modified scripts / 我自己修改的脚本和任务

Reproduction / 复现过程

from transformers import AutoTokenizer, AutoModel

model_id = "ZhipuAI/chatglm3-6b"
models_dir = 'D:/AI/_PyTorch/models/modelscope'
model_path = models_dir + "/" + model_id

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
model = model.eval()

Demo

response, history = model.chat(tokenizer, "你好", history=[])
print(response)
#response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
#print(response)

Expected behavior / 期待表现

Please provide a solution to fix this bug.

armstrong1972 avatar Jul 26 '24 12:07 armstrong1972

This is an issue with the new version of transformers; you can fix it by pinning:

pip install transformers==4.41.2
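If pinning in a requirements file is not an option, a runtime guard can fail fast with a clear message before the model is loaded. A minimal sketch, assuming 4.41.2 is the last known-good version per the workaround above (the helper names here are ours, not part of transformers):

```python
from importlib.metadata import version, PackageNotFoundError

def version_tuple(v: str) -> tuple:
    # "4.41.2" -> (4, 41, 2); ignores components beyond the third
    return tuple(int(part) for part in v.split(".")[:3])

def check_transformers(max_ok: str = "4.41.2") -> None:
    """Raise early if the installed transformers is too new for ChatGLM3."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        raise RuntimeError("transformers is not installed")
    if version_tuple(installed) > version_tuple(max_ok):
        raise RuntimeError(
            f"transformers {installed} is newer than {max_ok}; ChatGLM3's "
            "modeling code may fail with 'too many values to unpack'. "
            "Pin it with: pip install transformers==4.41.2"
        )
```

Calling `check_transformers()` before `AutoModel.from_pretrained(...)` turns the obscure unpack error into an actionable message.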

armstrong1972 avatar Jul 26 '24 12:07 armstrong1972

Has this problem been solved?

394988736 avatar Aug 01 '24 02:08 394988736

When will this be made compatible?

T-Atlas avatar Aug 01 '24 06:08 T-Atlas

Just use GLM-4 instead; this codebase is probably no longer maintained.

zRzRzRzRzRzRzR avatar Sep 04 '24 15:09 zRzRzRzRzRzRzR