DewEfresh
> Could you share changes to main.py, please?

```python
model_path = "./models/models--mustafaaljadery--gemma-2B-10M"
# tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="./models")
model = GemmaForCausalLM.from_pretrained(
    # model_path,
    model_name,
    cache_dir="./models",
    torch_dtype=torch.bfloat16,
)
```
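For reference, a minimal sketch of what the snippet above assumes but doesn't show; the `model_name` value and the note on `GemmaForCausalLM` are guesses, not confirmed from main.py:

```python
# Assumed context for the snippet above (not shown in the original comment).
import torch
from transformers import AutoTokenizer

# Guessed from the cache folder name "models--mustafaaljadery--gemma-2B-10M";
# treat the exact repo id as an assumption.
model_name = "mustafaaljadery/gemma-2B-10M"

# GemmaForCausalLM is assumed to be the repo's own modified class that main.py
# already defines or imports, not the stock transformers implementation.
```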
I can't get past `GemmaModel.forward() got an unexpected keyword argument 'cache_position'`.
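That error usually shows up when the installed transformers release is newer than the one the repo's custom Gemma code was written against, since recent versions pass a `cache_position` kwarg during generation. Matching the transformers version pinned in the repo's requirements is probably the cleaner fix; failing that, here is a rough sketch (an assumption, not a confirmed fix) that drops the kwarg before it reaches the custom forward(), assuming the inner GemmaModel sits at `model.model` as in the stock causal-LM classes:

```python
import functools

def ignore_kwargs(module, *names):
    """Monkey-patch a module instance's forward() to silently drop given kwargs."""
    original = module.forward

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        for name in names:
            kwargs.pop(name, None)  # discard kwargs the old forward() doesn't accept
        return original(*args, **kwargs)

    module.forward = wrapper
    return module

# "cache_position" comes straight from the error message; model.model is assumed
# to be the custom GemmaModel instance inside the GemmaForCausalLM loaded above.
ignore_kwargs(model.model, "cache_position")
```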
You may want to check out https://github.com/IST-DASLab/qmoe. They created some custom CUDA functions for sub-1-bit weights.
https://colab.research.google.com/drive/1nvzhy_PCBZ_r6dlvQv3GfweJsGlZHrNJ?usp=sharing