
Converting GGML Q4_0 back to Torch checkpoint for HuggingFace/Pytorch consumption/training/finetuning

Open ductai199x opened this issue 2 years ago • 3 comments

Hi everyone, I hacked together a Python script to convert a model saved as GGML Q4_0 files back to a PyTorch checkpoint for further consumption/training/finetuning using HuggingFace's Transformers package and/or PyTorch/PyTorch Lightning. If there is interest in this, please comment or drop a like. I will post the code or create a pull request if people need this.
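The core of such a conversion is dequantizing each Q4_0 block back to float tensors. As a rough sketch (not the author's script): in the GGML format of that era, a Q4_0 block held a float32 scale `d` followed by 16 bytes packing 32 four-bit quants, with each weight recovered as `(q - 8) * d`. The pairwise nibble layout below is my assumption based on the early-2023 `ggml` code; later versions changed both the layout and the scale type.

```python
import numpy as np

QK = 32  # Q4_0 block size: 32 weights per block (early-2023 ggml)

def dequantize_q4_0_block(block: bytes) -> np.ndarray:
    """Dequantize one Q4_0 block: a float32 scale followed by
    16 bytes holding 32 packed 4-bit quants (assumed layout)."""
    d = np.frombuffer(block[:4], dtype=np.float32)[0]   # per-block scale
    qs = np.frombuffer(block[4:20], dtype=np.uint8)     # 16 packed bytes
    lo = (qs & 0x0F).astype(np.int8) - 8                # low nibbles -> [-8, 7]
    hi = (qs >> 4).astype(np.int8) - 8                  # high nibbles -> [-8, 7]
    # Assumption: nibbles are interleaved pairwise (element 2k in the low
    # nibble, element 2k+1 in the high nibble of byte k).
    out = np.empty(QK, dtype=np.float32)
    out[0::2] = lo * d
    out[1::2] = hi * d
    return out
```

A full converter would iterate over every tensor in the GGML file, dequantize block by block, reshape to the tensor's dimensions, and save the resulting `state_dict` with `torch.save`.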

ductai199x avatar Mar 21 '23 15:03 ductai199x

Can you put it on GitHub?

karcheng avatar Mar 21 '23 16:03 karcheng

> Hi everyone, I hacked together a Python script to convert a model saved as GGML Q4_0 files back to a PyTorch checkpoint for further consumption/training/finetuning using HuggingFace's Transformers package and/or PyTorch/PyTorch Lightning. If there is interest in this, please comment or drop a like. I will post the code or create a pull request if people need this.

I started programming the same thing today. I want to use Q4 models with PyTorch for inference tests on my GPU. It would be amazing to have this script.

PriNova avatar Mar 21 '23 16:03 PriNova

Hey guys, here's a PR I made to do this: https://github.com/ggerganov/llama.cpp/pull/403. Please check it out, and if you have any questions, don't hesitate to ask here.

ductai199x avatar Mar 22 '23 17:03 ductai199x

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Apr 10 '24 01:04 github-actions[bot]