Fix: Add missing attributes to HuggingFace conversion functions
## Changes

This PR adds missing attributes to the lists of handled/ignored configuration attributes in the model conversion functions for:

- Llama models (`llama_from_huggingface_model`)
- Mistral models (`mistral_from_huggingface_model`)
- GPT-NeoX models (`gpt_neox_from_huggingface_model`)
## Problem

When converting HuggingFace models to Penzai models, the conversion would fail if the model configuration contained certain non-critical attributes, such as `pad_token_id` and `_name_or_path`. These attributes are present in many HuggingFace models (including test models) but don't affect the model's functionality.
## Solution

Added attributes to the `handled_or_ignored_attributes` set in the llama/mistral/gpt_neox model conversion functions, allowing the conversion to proceed while ignoring these non-critical configuration values.
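To illustrate the pattern, here is a minimal self-contained sketch of how such a set gates conversion. The function and attribute names are illustrative only (mirroring the PR description, not Penzai's actual implementation):

```python
def check_config_attributes(config: dict) -> None:
    """Raise if the config contains attributes the conversion can't handle.

    Illustrative sketch only; Penzai's real conversion functions use a
    similar handled_or_ignored_attributes set internally.
    """
    handled_or_ignored_attributes = {
        # Attributes the conversion actually maps to model settings
        # (a small illustrative subset):
        "vocab_size",
        "hidden_size",
        "num_hidden_layers",
        # Non-critical attributes this PR adds so they are ignored
        # instead of causing a failure:
        "pad_token_id",
        "_name_or_path",
    }
    unhandled = set(config) - handled_or_ignored_attributes
    if unhandled:
        raise ValueError(
            f"Unsupported config attributes: {sorted(unhandled)}"
        )

# Before this PR, a config carrying `pad_token_id` would have raised;
# with the attribute in the set, the check passes:
check_config_attributes({"vocab_size": 32000, "pad_token_id": 0})
```

A config with a genuinely unknown attribute still raises, so the conversion continues to fail loudly on attributes that could change model behavior.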
## Testing

The fix has been tested with:

- Existing test cases in `transformer_consistency_test.py`
- Newly implemented test cases that load tiny models from HuggingFace (`hf-internal-testing/tiny-random-[Llama/Mistral/GPTNeoX]ForCausalLM`). These new tests fail without the updates in this PR.
## Related Issue
Fixes #115
Hi @danieldjohnson, I submitted this PR from a fork and it looks like the CI checks haven’t been triggered yet.
Could you please approve or manually run the unittest-job workflow so the checks can proceed?
Thanks!
Hey @ealt, just wanted to ask whether you are still planning to work on this PR? Fixing the HuggingFace conversion seems pretty important. If you don't have the bandwidth to address these comments, I'd be happy to take it over and make the necessary changes myself (and credit you as a co-author in the commit).
(No rush, I just wanted to make sure this didn't get dropped!)
Closing this PR since I implemented the changes myself in #125. Hope you don't mind, and thanks for getting this started!