Add support for Gemma models
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- [x] I carefully followed the README.md.
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.
# Feature Description
I wanted to convert the newly released Gemma model from HuggingFace to the GGUF format using the `convert-hf-to-gguf.py` script, and immediately ran into this error: `NotImplementedError: Architecture "GemmaForCausalLM" not supported!`
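For reference, this is roughly the invocation I used (the local model path is just a placeholder for wherever the HuggingFace checkpoint was downloaded):

```sh
# Placeholder path: wherever the Gemma HF snapshot lives locally
python convert-hf-to-gguf.py /path/to/gemma-7b --outfile gemma-7b.gguf --outtype f16
```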
I'm not sure what the best workaround for this is; I just want to be able to use the Gemma models with llama.cpp.
# Motivation
Gemma models are the latest open models from Google, and being able to build applications with and benchmark these models using llama.cpp would be extremely valuable for developing and debugging apps.
# Possible Implementation
I think implementing support for the `GemmaForCausalLM` architecture, as the error message suggests, would do it; a rough sketch of where that might hook into the converter is below. I am also actively looking for resources that could help with a workaround in the meantime, and I will update this issue as soon as I find something useful and relevant.
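As a starting point, here is a minimal sketch of what a converter entry might look like, assuming the `Model` subclass pattern that `convert-hf-to-gguf.py` uses for the architectures it already supports. The class name `GemmaModel` and the `gguf.MODEL_ARCH.GEMMA` constant are my assumptions, not existing code:

```python
# Hypothetical sketch only -- GemmaModel and gguf.MODEL_ARCH.GEMMA do not exist
# yet; this follows the pattern of the architectures the script already handles
# (a Model subclass that writes GGUF metadata read from the HF config.json).

import gguf  # llama.cpp's gguf-py package


class GemmaModel(Model):  # Model is the base class in convert-hf-to-gguf.py
    model_arch = gguf.MODEL_ARCH.GEMMA  # would need a new enum value in gguf-py

    def set_gguf_parameters(self):
        # Copy the Gemma hyperparameters from config.json into the GGUF header.
        hparams = self.hparams
        self.gguf_writer.add_context_length(hparams["max_position_embeddings"])
        self.gguf_writer.add_embedding_length(hparams["hidden_size"])
        self.gguf_writer.add_block_count(hparams["num_hidden_layers"])
        self.gguf_writer.add_feed_forward_length(hparams["intermediate_size"])
        self.gguf_writer.add_head_count(hparams["num_attention_heads"])
        self.gguf_writer.add_head_count_kv(hparams["num_key_value_heads"])
        self.gguf_writer.add_layer_norm_rms_eps(hparams["rms_norm_eps"])
```

Beyond the converter, the HF tensor names would presumably also have to be mapped to GGUF tensor names in gguf-py, and llama.cpp itself would need a matching compute-graph implementation for the new architecture, so this is only the first piece.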