[BUG] chat.py errors out when using the MiniCPM-o 2.6 int4 model
Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
Is there an existing answer for this in FAQ?
- [X] I have searched FAQ
Current Behavior
I downloaded the MiniCPM-o 2.6 int4 model and set the model_path variable in chat.py's entry function to its local path. Running the code produced the following error:
```
(MiniCPMo) hygx@hygx:~/code/MiniCPM-o$ cd /home/hygx/code/MiniCPM-o ; /usr/bin/env /home/hygx/anaconda3/envs/MiniCPMo/bin/python /home/hygx/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher 54997 -- /home/hygx/code/MiniCPM-o/chat.py
`low_cpu_mem_usage` was None, now set to True since model is quantized.
Traceback (most recent call last):
File "/home/hygx/anaconda3/envs/MiniCPMo/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/hygx/anaconda3/envs/MiniCPMo/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/hygx/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 71, in <module>
cli.main()
File "/home/hygx/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 501, in main
run()
File "/home/hygx/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 351, in run_file
runpy.run_path(target, run_name="__main__")
File "/home/hygx/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 310, in run_path
return _run_module_code(code, init_globals, run_name, pkg_name=pkg_name, script_name=fname)
File "/home/hygx/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 127, in _run_module_code
_run_code(code, mod_globals, init_globals, mod_name, mod_spec, pkg_name, script_name)
File "/home/hygx/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 118, in _run_code
exec(code, run_globals)
File "/home/hygx/code/MiniCPM-o/chat.py", line 282, in <module>
chat_model = MiniCPMVChat(model_path)
File "/home/hygx/code/MiniCPM-o/chat.py", line 269, in __init__
self.model = MiniCPMV(model_path)
File "/home/hygx/code/MiniCPM-o/chat.py", line 142, in __init__
self.model = AutoModel.from_pretrained(model_path, trust_remote_code=True).to(dtype=torch.bfloat16)
File "/home/hygx/anaconda3/envs/MiniCPMo/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 559, in from_pretrained
return model_class.from_pretrained(
File "/home/hygx/anaconda3/envs/MiniCPMo/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3738, in from_pretrained
if metadata.get("format") == "pt":
AttributeError: 'NoneType' object has no attribute 'get'
```
Is there a problem with how I'm calling it, or is something else wrong? How can I fix this?
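As a quick diagnostic, here is a minimal sketch (the path is an assumption; adjust it to the local model directory) that checks whether the checkpoint actually carries the header metadata that `from_pretrained` reads:

```python
from safetensors import safe_open

# Hypothetical path to the int4 checkpoint; adjust as needed.
path = "/path/to/MiniCPM-o-2_6-int4/model.safetensors"

with safe_open(path, framework="pt") as f:
    # transformers expects header metadata such as {"format": "pt"};
    # if this prints None, metadata.get("format") raises the
    # AttributeError shown in the traceback above.
    print(f.metadata())
```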
Expected Behavior
To be able to chat with the model normally.
Steps To Reproduce
No response
Environment
- OS: Windows 11 with WSL2
- Python: 3.10
- Transformers: 4.44.2
- PyTorch: 2.2.0
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 12.1
Anything else?
No response
Regarding `self.model = AutoModel.from_pretrained(model_path, trust_remote_code=True).to(dtype=torch.bfloat16)`: should an int4 model really be cast to bfloat16?
Same issue here.
Same issue here.
This is most likely caused by missing metadata in the int4 version's model.safetensors file.
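If missing metadata is indeed the cause, one possible workaround is to re-save the checkpoint with the metadata transformers checks for. This is a sketch, not a verified fix; back up the original file first, and note the path is hypothetical:

```python
from safetensors.torch import load_file, save_file

path = "/path/to/MiniCPM-o-2_6-int4/model.safetensors"  # hypothetical path

# Rewrite the file with {"format": "pt"}, the metadata key that
# modeling_utils checks before loading safetensors weights.
state_dict = load_file(path)
save_file(state_dict, path, metadata={"format": "pt"})
```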
Hi, you can follow the steps in this README to install AutoGPTQ and perform int4 quantized inference.
int4 model doesn't contain metadata
> Hi, you can follow the steps in this README to install AutoGPTQ and perform int4 quantized inference.
It does NOT work even after installing AutoGPTQ; the same error occurs, as shown below.
```
Installed /media/harr/King/MiniCPM-o/AutoGPTQ
Successfully installed auto_gptq
(minicpmo) harr@harr-Kuangshi16-Super-Series-GM6IX8X:/media/harr/King/MiniCPM-o$ python web_demos/minicpm-o_2.6/model_server.py --model openbmb/MiniCPM-o-2_6-int4
WARNING - AutoGPTQ has stopped development. Please transition to GPTQModel: https://github.com/ModelCloud/GPTQModel
GPTQModel has been merged into Transformers/Optimum and full deprecation of AutoGPTQ within HF frameworks is planned in the near-future.
`low_cpu_mem_usage` was None, now set to True since model is quantized.
Traceback (most recent call last):
  File "/media/harr/King/MiniCPM-o/web_demos/minicpm-o_2.6/model_server.py", line 601, in <module>
```
If you want to use the int4 version for the web demo, you should change the model initialization to `AutoGPTQForCausalLM.from_quantized`:

```python
import torch
from auto_gptq import AutoGPTQForCausalLM

model = AutoGPTQForCausalLM.from_quantized(
    'openbmb/MiniCPM-o-2_6-int4',
    torch_dtype=torch.bfloat16,
    device="cuda:0",
    trust_remote_code=True,
    disable_exllama=True,
    disable_exllamav2=True,
)
```
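Once the model loads, a hedged usage sketch follows; the `chat` interface is provided by the model's remote code, so the exact parameters here are assumed from the MiniCPM-V examples rather than verified against this repo:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    'openbmb/MiniCPM-o-2_6-int4', trust_remote_code=True
)

msgs = [{'role': 'user', 'content': 'Hello'}]

# chat() comes from the repo's trust_remote_code implementation;
# image=None sends a text-only turn in the MiniCPM-V style API.
res = model.chat(image=None, msgs=msgs, tokenizer=tokenizer)
print(res)
```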