galai
Is it possible to run the "base" or "mini" versions on CPU?
Hi,
First of all, thanks for the amazing work! I was wondering whether it is possible to run the smaller versions of the model on CPU; at the moment it seems it is not.
Judging by the size of mini and base (comparable to T5), running them on CPU should be feasible, but the code seems to assume a GPU is available.
I think the best way to run the model on CPU is to use the Hugging Face `transformers` API:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-1.3b")
model = AutoModelForCausalLM.from_pretrained("facebook/galactica-1.3b")
```
You can find an example of the code for a pipeline here: https://huggingface.co/spaces/morenolq/galactica-base/blob/main/app.py
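As a minimal sketch, the loading snippet can be extended into a full CPU generation run. This assumes the `facebook/galactica-125m` checkpoint (the "mini" model) is available on the Hub; the prompt is just illustrative:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# "mini" checkpoint (~125M parameters); loading without .to("cuda")
# keeps the weights on CPU, which is the default device.
tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/galactica-125m")

prompt = "The Transformer architecture"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding entirely on CPU; no GPU is touched anywhere.
with torch.no_grad():
    output_ids = model.generate(inputs.input_ids, max_new_tokens=20)

text = tokenizer.decode(output_ids[0])
print(text)
```

The mini model is small enough that this runs in ordinary RAM; the same pattern should work for base (1.3B), just with more memory and slower generation.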