Qwen1.5-1.8B-Chat conversion to gguf causes an error
I need to run inference with Qwen/Qwen1.5-1.8B-Chat on a CPU, and the plan is to use llama.cpp. First, the model needs to be converted to GGUF.
model link: https://huggingface.co/Qwen/Qwen1.5-1.8B-Chat
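For context, the intended pipeline looks roughly like this. A minimal sketch, assuming llama.cpp is already built on Windows; the binary names and output file names are illustrative and vary by build:

```
REM 1) convert the Hugging Face checkpoint to GGUF (this is the step that errors below)
python convert.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat

REM 2) optionally quantize the converted f16 model for smaller size on CPU
quantize.exe ggml-model-f16.gguf qwen1_5-1_8b-chat-q4_0.gguf q4_0

REM 3) run CPU inference with the main binary
main.exe -m qwen1_5-1_8b-chat-q4_0.gguf -n 64 -p "Hello"
```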
Error output:

```
C:\Users\THINK\Downloads\llama.cpp-master>python convert.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat
Loading model file C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat\model.safetensors
params = Params(n_vocab=151936, n_embd=2048, n_layer=24, n_ctx=32768, n_ff=5504, n_head=16, n_head_kv=16, n_experts=None, n_experts_used=None, f_norm_eps=1e-06, rope_scaling_type=None, f_rope_freq_base=1000000.0, f_rope_scale=None, n_orig_ctx=None, rope_finetuned=None, ftype=None, path_model=WindowsPath('C:/Users/THINK/Downloads/Qwen1.5-1.8B-Chat'))
Found vocab files: {'tokenizer.model': None, 'vocab.json': WindowsPath('C:/Users/THINK/Downloads/Qwen1.5-1.8B-Chat/vocab.json'), 'tokenizer.json': WindowsPath('C:/Users/THINK/Downloads/Qwen1.5-1.8B-Chat/tokenizer.json')}
Loading vocab file 'C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat\vocab.json', type 'spm'
Traceback (most recent call last):
  File "C:\Users\THINK\Downloads\llama.cpp-master\convert.py", line 1483, in <module>
    main()
  File "C:\Users\THINK\Downloads\llama.cpp-master\convert.py", line 1451, in main
    vocab, special_vocab = vocab_factory.load_vocab(args.vocab_type, model_parent_path)
  File "C:\Users\THINK\Downloads\llama.cpp-master\convert.py", line 1336, in load_vocab
    vocab = SentencePieceVocab(
  File "C:\Users\THINK\Downloads\llama.cpp-master\convert.py", line 394, in __init__
    self.sentencepiece_tokenizer = SentencePieceProcessor(str(fname_tokenizer))
  File "C:\Users\THINK\AppData\Local\Programs\Python\Python311\Lib\site-packages\sentencepiece\__init__.py", line 447, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "C:\Users\THINK\AppData\Local\Programs\Python\Python311\Lib\site-packages\sentencepiece\__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "C:\Users\THINK\AppData\Local\Programs\Python\Python311\Lib\site-packages\sentencepiece\__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: D:\a\sentencepiece\sentencepiece\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
```
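The likely root cause, reading the log: Qwen1.5 ships a GPT-2-style BPE tokenizer (tokenizer.json, vocab.json, merges.txt) and no SentencePiece tokenizer.model, yet convert.py falls back to loading vocab.json as type 'spm', which then fails to parse as a SentencePiece protobuf. A quick check of the downloaded files makes this visible:

```
REM list the checkpoint contents; for Qwen1.5 you should see tokenizer.json,
REM vocab.json and merges.txt, but no tokenizer.model (SentencePiece)
dir C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat
```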
You should download the "qwen1_5-1_8b-chat-q8_0.gguf" model from https://huggingface.co/Qwen
> You should download the "qwen1_5-1_8b-chat-q8_0.gguf" model from https://huggingface.co/Qwen

Thanks, but I need to fine-tune the model on my own data. How do I convert my fine-tuned model to gguf?
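A sketch of the usual route for a fine-tuned checkpoint, using the convert-hf-to-gguf.py script suggested later in this thread (the path and output name here are hypothetical): save the fine-tuned model as a full Hugging Face checkpoint, merging any LoRA/PEFT adapters into the base weights first, and then point the converter at that directory.

```
REM hypothetical path to a fine-tuned checkpoint saved with save_pretrained();
REM LoRA/PEFT adapters must be merged into the base weights before converting
python convert-hf-to-gguf.py C:\path\to\qwen1.5-1.8b-finetuned --outfile qwen1.5-1.8b-finetuned-f16.gguf
```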
same error
same
Same issue here; I am trying to fine-tune.
try this python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat
> try this python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat

Running this gives no error, but the inference results are wrong.
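One common cause of wrong output from a correctly converted chat model is the prompt format: Qwen1.5 chat models expect ChatML markers. A hedged example of a properly formatted test prompt; main.exe and its -e escape flag come from a llama.cpp build of that era, so adjust to your build:

```
REM Qwen1.5-Chat expects ChatML formatting; -e makes main.exe expand the \n escapes
main.exe -m ggml-model-f16.gguf -n 128 -e ^
  -p "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n"
```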
> try this python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat

The conversion works, but there is serious quality degradation compared to running the model with transformers:
[screenshot: gguf output]
[screenshot: transformers output]
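To narrow down where the degradation comes from, one option is to convert at full f16 precision and compare that against transformers before quantizing, so conversion bugs and quantization loss are not conflated. A sketch; --outtype is convert-hf-to-gguf.py's flag, and the output file name below is illustrative:

```
REM convert at f16 so any quality gap vs. transformers is not caused by quantization
python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat --outtype f16
```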
@anaivebird You should try converting it on WSL2. I got it working there, but it still suffers from degradation.
I can't even get convert.py or convert-hf to work...
Do two things:

1. `pip install sentencepiece -U`
2. use `convert-hf-to-gguf.py` instead of `convert.py`
This worked for me, and the resulting GGUF was able to run inference.
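Putting the fix together with a quick smoke test. A sketch: ggml-model-f16.gguf is assumed to be the converter's default output name inside the model folder, so adjust the path if you passed --outfile:

```
pip install -U sentencepiece
python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat
REM load the GGUF and generate a few tokens to confirm the conversion
main.exe -m C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat\ggml-model-f16.gguf -n 32 -p "Hello"
```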
This issue was closed because it has been inactive for 14 days since being marked as stale.