[BUG] Load StarCoder2 AWQ using Transformers
When loading starcoder2-3b-AWQ with Transformers, I get a confusing warning and the model does not generate proper output:
```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "TechxGenus/starcoder2-3b-AWQ",
    torch_dtype=torch.float16,
    device_map="auto",
)
```

I get:
```
Some weights of the model checkpoint at TechxGenus/starcoder2-3b-AWQ were not used when initializing Starcoder2ForCausalLM: ['model.layers.0.mlp.act.scales', 'model.layers.1.mlp.act.scales', 'model.layers.10.mlp.act.scales', 'model.layers.11.mlp.act.scales', 'model.layers.12.mlp.act.scales', 'model.layers.13.mlp.act.scales', 'model.layers.14.mlp.act.scales', 'model.layers.15.mlp.act.scales', 'model.layers.16.mlp.act.scales', 'model.layers.17.mlp.act.scales', 'model.layers.18.mlp.act.scales', 'model.layers.19.mlp.act.scales', 'model.layers.2.mlp.act.scales', 'model.layers.20.mlp.act.scales', 'model.layers.21.mlp.act.scales', 'model.layers.22.mlp.act.scales', 'model.layers.23.mlp.act.scales', 'model.layers.24.mlp.act.scales', 'model.layers.25.mlp.act.scales', 'model.layers.26.mlp.act.scales', 'model.layers.27.mlp.act.scales', 'model.layers.28.mlp.act.scales', 'model.layers.29.mlp.act.scales', 'model.layers.3.mlp.act.scales', 'model.layers.4.mlp.act.scales', 'model.layers.5.mlp.act.scales', 'model.layers.6.mlp.act.scales', 'model.layers.7.mlp.act.scales', 'model.layers.8.mlp.act.scales', 'model.layers.9.mlp.act.scales']
```
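For what it's worth, every dropped key follows a single pattern: a per-layer `mlp.act.scales` tensor for layers 0 through 29. These look like the AWQ activation scales stored in the checkpoint, which `Starcoder2ForCausalLM` apparently has no matching parameter for. A quick sketch that reconstructs and checks the pattern (the key names are taken from the warning above, not from any library API):

```python
import re

# The 30 keys reported as unused in the warning above, one per decoder layer.
unused = [f"model.layers.{i}.mlp.act.scales" for i in range(30)]

# Confirm they all match the per-layer AWQ activation-scale pattern
# and cover every layer exactly once.
pattern = re.compile(r"model\.layers\.(\d+)\.mlp\.act\.scales")
layers = sorted(int(pattern.fullmatch(k).group(1)) for k in unused)

assert layers == list(range(30))  # all 30 layers, nothing else
print(f"{len(unused)} unused keys, all of the form model.layers.<i>.mlp.act.scales")
```

So the warning is not random corruption: one specific tensor type per layer is being discarded at load time.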
And the model doesn't generate proper output. Loading the same checkpoint with `AutoAWQForCausalLM` or `vllm.LLM` raises no warning, and both work well.
Transformers has not implemented loading for all AWQ model variants. I would urge you to raise this issue on the transformers repository.