
Conversion from pretrained HuggingFace models

ealt opened this issue 1 year ago · 0 comments

Bug Description

When attempting to convert a HuggingFace model to a Penzai model using [llama/mistral/gpt_neox]_from_huggingface_model, the conversion fails with a ValueError when the model configuration contains certain attributes that are not explicitly handled.

Steps to Reproduce

from penzai.models.transformer.variants import llama
import transformers

model_name = "hf-internal-testing/tiny-random-LlamaForCausalLM"
hf_model = transformers.LlamaForCausalLM.from_pretrained(model_name)
pz_model = llama.llama_from_huggingface_model(hf_model)

(similar for mistral and gpt_neox)

Expected Behavior

The conversion should complete successfully: attributes such as _name_or_path are not critical for constructing the Penzai model and can safely be ignored.

Actual Behavior

The conversion fails, raising a ValueError for configuration attributes that the converter does not recognize. For the llama example above:

 ValueError: Conversion of a LlamaForCausalLM does not support these configuration attributes: {'pad_token_id': -1, '_name_or_path': 'hf-internal-testing/tiny-random-LlamaForCausalLM'}
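Until the converters are updated, one possible workaround is to delete the offending attributes from the HuggingFace config before converting. The sketch below demonstrates the delattr pattern on a stand-in namespace rather than a real transformers config object (that a PretrainedConfig tolerates delattr for these fields is an assumption, not verified here):

```python
from types import SimpleNamespace

# Stand-in for hf_model.config (illustrative only; a real transformers
# PretrainedConfig stores these as plain attributes, which is an assumption).
config = SimpleNamespace(
    _name_or_path="hf-internal-testing/tiny-random-LlamaForCausalLM",
    pad_token_id=-1,
    vocab_size=32000,
)

# Strip the attributes the converter rejects before calling
# llama_from_huggingface_model(hf_model).
for attr in ("_name_or_path", "pad_token_id"):
    if hasattr(config, attr):
        delattr(config, attr)

# The attributes the model actually needs are untouched.
assert config.vocab_size == 32000
```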

Root Cause

In penzai/models/transformer/variants/[llama/mistral/gpt_neox].py, the [llama/mistral/gpt_neox]_from_huggingface_model functions check for unsupported configuration attributes but omit values like _name_or_path from their handled_or_ignored_attributes sets.
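The failing check follows a common pattern, sketched below. The function and variable names are illustrative, not penzai's actual source; only the ValueError message is taken from the report above:

```python
# Illustrative sketch of an attribute check that raises for any config
# attribute outside a known-safe set (not penzai's actual code).
def check_config_attributes(config_dict, handled_or_ignored_attributes):
    unsupported = {
        key: value
        for key, value in config_dict.items()
        if key not in handled_or_ignored_attributes
    }
    if unsupported:
        raise ValueError(
            "Conversion of a LlamaForCausalLM does not support these "
            f"configuration attributes: {unsupported}"
        )

# An attribute missing from the set triggers the ValueError, even if it is
# pure metadata like _name_or_path:
check_config_attributes({"vocab_size": 32000}, {"vocab_size"})  # passes
```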

Suggested Fix

Add missing attributes to the handled_or_ignored_attributes sets in the [llama/mistral/gpt_neox]_from_huggingface_model functions.
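The fix could look like the following sketch. The set contents are illustrative (penzai's actual handled_or_ignored_attributes sets contain many more entries); only _name_or_path and pad_token_id come from the error message above:

```python
# Illustrative sketch: widen the set of attributes the converter treats as
# handled or safely ignorable (actual penzai set differs).
handled_or_ignored_attributes = {
    "vocab_size",
    "hidden_size",
    # Newly added metadata-only attributes:
    "_name_or_path",  # model identifier; not needed to build the model
    "pad_token_id",   # tokenizer padding id; unused by the architecture
}

config = {
    "_name_or_path": "hf-internal-testing/tiny-random-LlamaForCausalLM",
    "pad_token_id": -1,
    "vocab_size": 32000,
}
unsupported = {
    k: v for k, v in config.items()
    if k not in handled_or_ignored_attributes
}
assert not unsupported  # conversion no longer rejects these attributes
```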

ealt · Apr 22 '25