Qwen1.5-1.8B-Chat conversion to gguf causes an error
I need to run inference with Qwen/Qwen1.5-1.8B-Chat on a CPU, and the plan is to use llama.cpp. First, the model needs to be converted to GGUF.
model link: https://huggingface.co/Qwen/Qwen1.5-1.8B-Chat
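For context, the intended pipeline looks roughly like this. A minimal sketch, assuming llama.cpp is already built on Windows; the binary names and output file names are illustrative and vary by build:

```
REM 1) convert the Hugging Face checkpoint to GGUF (this is the step that errors below)
python convert.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat

REM 2) optionally quantize the converted f16 model for smaller size on CPU
quantize.exe ggml-model-f16.gguf qwen1_5-1_8b-chat-q4_0.gguf q4_0

REM 3) run CPU inference with the main binary
main.exe -m qwen1_5-1_8b-chat-q4_0.gguf -n 64 -p "Hello"
```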
Error output:

```
C:\Users\THINK\Downloads\llama.cpp-master>python convert.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat
Loading model file C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat\model.safetensors
params = Params(n_vocab=151936, n_embd=2048, n_layer=24, n_ctx=32768, n_ff=5504, n_head=16, n_head_kv=16, n_experts=None, n_experts_used=None, f_norm_eps=1e-06, rope_scaling_type=None, f_rope_freq_base=1000000.0, f_rope_scale=None, n_orig_ctx=None, rope_finetuned=None, ftype=None, path_model=WindowsPath('C:/Users/THINK/Downloads/Qwen1.5-1.8B-Chat'))
Found vocab files: {'tokenizer.model': None, 'vocab.json': WindowsPath('C:/Users/THINK/Downloads/Qwen1.5-1.8B-Chat/vocab.json'), 'tokenizer.json': WindowsPath('C:/Users/THINK/Downloads/Qwen1.5-1.8B-Chat/tokenizer.json')}
Loading vocab file 'C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat\vocab.json', type 'spm'
Traceback (most recent call last):
  File "C:\Users\THINK\Downloads\llama.cpp-master\convert.py", line 1483, in <module>
    main()
  File "C:\Users\THINK\Downloads\llama.cpp-master\convert.py", line 1451, in main
    vocab, special_vocab = vocab_factory.load_vocab(args.vocab_type, model_parent_path)
  File "C:\Users\THINK\Downloads\llama.cpp-master\convert.py", line 1336, in load_vocab
    vocab = SentencePieceVocab(
  File "C:\Users\THINK\Downloads\llama.cpp-master\convert.py", line 394, in __init__
    self.sentencepiece_tokenizer = SentencePieceProcessor(str(fname_tokenizer))
  File "C:\Users\THINK\AppData\Local\Programs\Python\Python311\Lib\site-packages\sentencepiece\__init__.py", line 447, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "C:\Users\THINK\AppData\Local\Programs\Python\Python311\Lib\site-packages\sentencepiece\__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "C:\Users\THINK\AppData\Local\Programs\Python\Python311\Lib\site-packages\sentencepiece\__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: D:\a\sentencepiece\sentencepiece\src\sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
```
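The likely root cause, reading the log: Qwen1.5 ships a GPT-2-style BPE tokenizer (tokenizer.json, vocab.json, merges.txt) and no SentencePiece tokenizer.model, yet convert.py falls back to loading vocab.json as type 'spm', which then fails to parse as a SentencePiece protobuf. A quick check of the downloaded files makes this visible:

```
REM list the checkpoint contents; for Qwen1.5 you should see tokenizer.json,
REM vocab.json and merges.txt, but no tokenizer.model (SentencePiece)
dir C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat
```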
You should download the "qwen1_5-1_8b-chat-q8_0.gguf" model from https://huggingface.co/Qwen
> You should download the "qwen1_5-1_8b-chat-q8_0.gguf" model from https://huggingface.co/Qwen

Thanks, but I need to fine-tune the model on my own data. How do I convert my fine-tuned model to gguf?
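A sketch of the usual route for a fine-tuned checkpoint, using the convert-hf-to-gguf.py script suggested later in this thread (the path and output name here are hypothetical): save the fine-tuned model as a full Hugging Face checkpoint, merging any LoRA/PEFT adapters into the base weights first, and then point the converter at that directory.

```
REM hypothetical path to a fine-tuned checkpoint saved with save_pretrained();
REM LoRA/PEFT adapters must be merged into the base weights before converting
python convert-hf-to-gguf.py C:\path\to\qwen1.5-1.8b-finetuned --outfile qwen1.5-1.8b-finetuned-f16.gguf
```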
same error
same
Same issue here; I am trying to fine-tune.
try this python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat
> try this python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat

Running this gives no error, but the inference results are wrong.
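One common cause of wrong output from a correctly converted chat model is the prompt format: Qwen1.5 chat models expect ChatML markers. A hedged example of a properly formatted test prompt; main.exe and its -e escape flag come from a llama.cpp build of that era, so adjust to your build:

```
REM Qwen1.5-Chat expects ChatML formatting; -e makes main.exe expand the \n escapes
main.exe -m ggml-model-f16.gguf -n 128 -e ^
  -p "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n"
```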
> try this python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat

The conversion works, but there is serious quality degradation compared to running the model with transformers:
[screenshot: gguf output]
[screenshot: transformers output]
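To narrow down where the degradation comes from, one option is to convert at full f16 precision and compare that against transformers before quantizing, so conversion bugs and quantization loss are not conflated. A sketch; --outtype is convert-hf-to-gguf.py's flag, and the output file name below is illustrative:

```
REM convert at f16 so any quality gap vs. transformers is not caused by quantization
python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat --outtype f16
```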
@anaivebird You should try converting it on WSL2. I got it working there, but it still suffers from degradation.
I can't even get convert.py or convert-hf to work...
Do two things:

1. `pip install sentencepiece -U`
2. use `convert-hf-to-gguf.py` instead of `convert.py`
This worked for me, and the resulting GGUF was able to run inference.
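Putting the fix together with a quick smoke test. A sketch: ggml-model-f16.gguf is assumed to be the converter's default output name inside the model folder, so adjust the path if you passed --outfile:

```
pip install -U sentencepiece
python convert-hf-to-gguf.py C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat
REM load the GGUF and generate a few tokens to confirm the conversion
main.exe -m C:\Users\THINK\Downloads\Qwen1.5-1.8B-Chat\ggml-model-f16.gguf -n 32 -p "Hello"
```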
This issue was closed because it has been inactive for 14 days since being marked as stale.