AutoAWQ
prepare_inputs_for_generation and position_embeddings
This is not really an issue with AutoAWQ so much as with transformers: the `prepare_inputs_for_generation` functions are not being updated to include `position_embeddings`, which models now need for decoder layer inference (i.e. it must be put in `module_kwargs`). The main reason this hasn't been flagged yet is that transformers keeps pushing back the point at which it becomes mandatory (`modeling_llama.py` in transformers v4.47.1 still reads `position_embeddings: Optional[Tuple[torch.Tensor, torch.Tensor]] = None,  # will become mandatory in v4.45`), but eventually it will become a problem, and it already breaks things with some models that were added more recently.
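
For reference, here is a minimal workaround sketch that injects `position_embeddings` into the captured kwargs before layer inference. It assumes a Llama-style model whose rotary embedding lives at `model.model.rotary_emb` and returns a `(cos, sin)` tuple, as in recent transformers releases; the helper name `ensure_position_embeddings` and the exact `module_kwargs` plumbing are illustrative, not AutoAWQ's actual API:

```python
import torch

def ensure_position_embeddings(model, module_kwargs, hidden_states):
    """Add position_embeddings to module_kwargs when the decoder layers
    expect them but the captured kwargs lack them.

    Assumes a Llama-style model whose rotary embedding is exposed at
    model.model.rotary_emb and returns a (cos, sin) tuple.
    """
    if module_kwargs.get("position_embeddings") is None:
        position_ids = module_kwargs.get("position_ids")
        if position_ids is None:
            # Fall back to sequential positions for the captured batch.
            seq_len = hidden_states.shape[1]
            position_ids = torch.arange(
                seq_len, device=hidden_states.device
            ).unsqueeze(0)
            module_kwargs["position_ids"] = position_ids
        # (cos, sin) tuple consumed by each decoder layer.
        module_kwargs["position_embeddings"] = model.model.rotary_emb(
            hidden_states, position_ids
        )
    return module_kwargs
```

Calling something like this on the kwargs captured for the first decoder layer keeps quantization working against transformers versions where `position_embeddings` is effectively required, without waiting for the upstream functions to be updated.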