AutoAWQ
prepare_inputs_for_generation and position_embeddings
This is not really an issue with AutoAWQ so much as with transformers: the `prepare_inputs_for_generation` functions are not being updated to include `position_embeddings`, which models now need for decoder layer inference (i.e. it must be put in `module_kwargs`). The main reason this hasn't been flagged yet is that transformers keeps pushing back the point at which it becomes mandatory (`modeling_llama.py` in transformers v4.47.1 still reads `position_embeddings: Optional[Tuple[torch.Tensor, torch.Tensor]] = None,  # will become mandatory in v4.45`), but eventually it will become a problem, and it already breaks things with some models that were added more recently.
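
For reference, here is a minimal workaround sketch that injects `position_embeddings` into the captured kwargs before layer inference. It assumes a Llama-style model whose rotary embedding lives at `model.model.rotary_emb` and returns a `(cos, sin)` tuple, as in recent transformers releases; the helper name `ensure_position_embeddings` and the exact `module_kwargs` plumbing are illustrative, not AutoAWQ's actual API:

```python
import torch

def ensure_position_embeddings(model, module_kwargs, hidden_states):
    """Add position_embeddings to module_kwargs when the decoder layers
    expect them but the captured kwargs lack them.

    Assumes a Llama-style model whose rotary embedding is exposed at
    model.model.rotary_emb and returns a (cos, sin) tuple.
    """
    if module_kwargs.get("position_embeddings") is None:
        position_ids = module_kwargs.get("position_ids")
        if position_ids is None:
            # Fall back to sequential positions for the captured batch.
            seq_len = hidden_states.shape[1]
            position_ids = torch.arange(
                seq_len, device=hidden_states.device
            ).unsqueeze(0)
            module_kwargs["position_ids"] = position_ids
        # (cos, sin) tuple consumed by each decoder layer.
        module_kwargs["position_embeddings"] = model.model.rotary_emb(
            hidden_states, position_ids
        )
    return module_kwargs
```

Calling something like this on the kwargs captured for the first decoder layer keeps quantization working against transformers versions where `position_embeddings` is effectively required, without waiting for the upstream functions to be updated.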