Add support for Gemma models
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- [x] I carefully followed the README.md.
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.
# Feature Description
I wanted to convert the newly released Gemma model from HuggingFace to the GGUF format using the `convert-hf-to-gguf.py` script, and immediately ran into this error: `NotImplementedError: Architecture "GemmaForCausalLM" not supported!`
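For reference, this is roughly the invocation I used (the local model path is just a placeholder for wherever the HuggingFace checkpoint was downloaded):

```sh
# Placeholder path: wherever the Gemma HF snapshot lives locally
python convert-hf-to-gguf.py /path/to/gemma-7b --outfile gemma-7b.gguf --outtype f16
```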
I'm not sure what the best workaround for this is; I just want to be able to use the Gemma models with llama.cpp.
# Motivation
Gemma models are the latest open models from Google, and being able to build applications with and benchmark these models using llama.cpp would be extremely valuable for developing and debugging apps.
# Possible Implementation
I think implementing support for the `GemmaForCausalLM` architecture, as the error message suggests, would do it; a rough sketch of where that might hook into the converter is below. I am also actively looking for resources that could help with a workaround in the meantime, and I will update this issue as soon as I find something useful and relevant.
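As a starting point, here is a minimal sketch of what a converter entry might look like, assuming the `Model` subclass pattern that `convert-hf-to-gguf.py` uses for the architectures it already supports. The class name `GemmaModel` and the `gguf.MODEL_ARCH.GEMMA` constant are my assumptions, not existing code:

```python
# Hypothetical sketch only -- GemmaModel and gguf.MODEL_ARCH.GEMMA do not exist
# yet; this follows the pattern of the architectures the script already handles
# (a Model subclass that writes GGUF metadata read from the HF config.json).

import gguf  # llama.cpp's gguf-py package


class GemmaModel(Model):  # Model is the base class in convert-hf-to-gguf.py
    model_arch = gguf.MODEL_ARCH.GEMMA  # would need a new enum value in gguf-py

    def set_gguf_parameters(self):
        # Copy the Gemma hyperparameters from config.json into the GGUF header.
        hparams = self.hparams
        self.gguf_writer.add_context_length(hparams["max_position_embeddings"])
        self.gguf_writer.add_embedding_length(hparams["hidden_size"])
        self.gguf_writer.add_block_count(hparams["num_hidden_layers"])
        self.gguf_writer.add_feed_forward_length(hparams["intermediate_size"])
        self.gguf_writer.add_head_count(hparams["num_attention_heads"])
        self.gguf_writer.add_head_count_kv(hparams["num_key_value_heads"])
        self.gguf_writer.add_layer_norm_rms_eps(hparams["rms_norm_eps"])
```

Beyond the converter, the HF tensor names would presumably also have to be mapped to GGUF tensor names in gguf-py, and llama.cpp itself would need a matching compute-graph implementation for the new architecture, so this is only the first piece.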