penzai icon indicating copy to clipboard operation
penzai copied to clipboard

Fix: Add missing attributes to hugging face conversion functions

Open ealt opened this issue 9 months ago • 1 comments

Changes

This PR adds missing attributes to the lists of handled/ignored configuration attributes in the model conversion functions for:

  • Llama models (llama_from_huggingface_model)
  • Mistral models (mistral_from_huggingface_model)
  • GPT-NeoX models (gpt_neox_from_huggingface_model)

Problem

When converting HuggingFace models to Penzai models, the conversion would fail if the model configuration contained certain non-critical attributes like pad_token_id and _name_or_path. These attributes are present in many HuggingFace models (including test models) but don't affect the model's functionality.

Solution

Added attributes to the handled_or_ignored_attributes set in llama/mistral/gpt_neox model conversion functions, allowing the conversion to proceed while ignoring these non-critical configuration values.

Testing

The fix has been tested with:

  • Existing test cases in transformer_consistency_test.py
  • Newly implemented test cases that load tiny models from huggingface (hf-internal-testing/tiny-random-[Llama/Mistral/GPTNeoX]ForCausalLM). These new test fail without the updates in this PR.

Related Issue

Fixes #115

ealt avatar Apr 22 '25 23:04 ealt

Hi @danieldjohnson, I submitted this PR from a fork and it looks like the CI checks haven’t been triggered yet. Could you please approve or manually run the unittest-job workflow so the checks can proceed? Thanks!

ealt avatar Apr 23 '25 20:04 ealt

Hey @ealt, just wanted to ask whether you are still planning to work on this PR? Fixing the HuggingFace conversion seems pretty important. If you don't have the bandwidth to address these comments, I'd be happy to take it over and make the necessary changes myself (and credit you as a co-author in the commit).

(No rush, I just wanted to make sure this didn't get dropped!)

danieldjohnson avatar Jun 08 '25 19:06 danieldjohnson

Closing this PR since I implemented the changes myself in #125. Hope you don't mind, and thanks for getting this started!

danieldjohnson avatar Jun 22 '25 05:06 danieldjohnson