Make qwen2 `attention_qkv_bias` optional
What does this PR do?
The Qwen2 and Qwen2-MoE models are forced to add a bias to the query, key and value linear projections.
Following the trend of other recent models (e.g. LLaMA), I refactored this QKV bias into an optional `attention_qkv_bias` flag so that it can be configured in the config file.
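Roughly, the idea is to thread a config flag into the attention module's linear layers instead of hard-coding `bias=True`. A minimal sketch (the class and flag names here are illustrative, not the exact transformers code):

```python
# Sketch of wiring an optional QKV bias through a config flag.
# `attention_qkv_bias` is this PR's proposed option; the class below is a
# simplified stand-in for the actual Qwen2 attention implementation.
import torch.nn as nn


class Qwen2AttentionSketch(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int, attention_qkv_bias: bool = True):
        super().__init__()
        head_dim = hidden_size // num_heads
        # Previously bias=True was hard-coded; now it follows the config flag.
        self.q_proj = nn.Linear(hidden_size, num_heads * head_dim, bias=attention_qkv_bias)
        self.k_proj = nn.Linear(hidden_size, num_heads * head_dim, bias=attention_qkv_bias)
        self.v_proj = nn.Linear(hidden_size, num_heads * head_dim, bias=attention_qkv_bias)
        # The output projection stays bias-free, as in the existing Qwen2 code.
        self.o_proj = nn.Linear(num_heads * head_dim, hidden_size, bias=False)


# With attention_qkv_bias=False, the projections carry no bias term.
attn = Qwen2AttentionSketch(hidden_size=64, num_heads=8, attention_qkv_bias=False)
print(attn.q_proj.bias is None)
```

The default stays `True` so existing Qwen2 checkpoints load unchanged; only configs that explicitly set the flag to `False` drop the bias.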
Fixes #32892
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?
Who can review?
@ArthurZucker @stevhliu
RuntimeError: Failed to import transformers.models.audio_spectrogram_transformer.feature_extraction_audio_spectrogram_transformer because of the following error (look up to see its traceback):
I have no idea why the check_repo test failed with the error message above. Can you please give me some guidance? cc @ArthurZucker @stevhliu
cc @ArthurZucker
@ArthurZucker Thank you for your response!
What I intended is to allow this option to be set/unset in Qwen2 as well. In fact, the LLaMA model has never used a bias in its linear projections across the whole series, and in the official repository they are all set to bias=False. However, in the Hugging Face implementation this can be toggled through the config, so I want to provide the same functionality here as well.
@ArthurZucker Hey, do you have any further comments for this?
Hey! The reason we added that param for LLaMA was the release of #26302: InternLM was officially using this flag, meaning there was a model tied to that change! Unless a model is published, we usually avoid enabling something that does not have a model associated with it! 🤗