Add `BioGPTForSequenceClassification`
What does this PR do?
Add Sequence Classification support for BioGPT.
Fixes #21530. Fixes #21535. This PR completes the stalled PR #21535.
Who can review?
@ArthurZucker @younesbelkada @NielsRogge @sgugger
@NielsRogge @sgugger Is there a way to skip the check for specific lines when I run `make repo-consistency`?
It gives an error when I add this:
# Copied from transformers.models.opt.modeling_opt.OPTForSequenceClassification with OPT->BioGpt.
Some attributes, like `word_embed_proj_dim`, do not exist for the BioGpt model. The replacement also changes the case of a docstring variable, which leads to a variable-not-found error.
Should I drop the copy attribution comment?
If some attributes do not exist, let's just add the `# Adapted from` mention, and put the `# Copied from` only where it properly fits!
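For reference, a minimal sketch of how I understand the two markers to behave (the class body here is a placeholder, not the real model code):

```python
# Adapted from transformers.models.opt.modeling_opt.OPTForSequenceClassification
# -> informational only; `make repo-consistency` does not verify this block.
class BioGptForSequenceClassification:
    # Copied from transformers.models.opt.modeling_opt.OPTForSequenceClassification.forward with OPT->BioGpt
    # -> enforced: the tooling re-copies the referenced source, applies the
    #    OPT->BioGpt rename, and fails if the result differs. That is why a
    #    block that drops attributes like word_embed_proj_dim cannot carry it.
    def forward(self, x):
        return x


model = BioGptForSequenceClassification()
print(model.forward("logits"))  # prints "logits"; the comments are what matter here
```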
@younesbelkada You're right, I haven't figured out how to solve this failing test.
@ArthurZucker Any suggestions as to how to fix this failing test?
I went through #18123. The code is extremely similar, but I still don't get why the test is failing. Maybe I am missing something. I need help to fix it.
```
_____________________________ BioGptModelTest.test_load_with_mismatched_shapes _____________________________

self = <tests.models.biogpt.test_modeling_biogpt.BioGptModelTest testMethod=test_load_with_mismatched_shapes>

    def test_load_with_mismatched_shapes(self):
        if not self.test_mismatched_shapes:
            return
        config, inputs_dict = self.model_tester.prepare_config_and_inputs_for_common()

        for model_class in self.all_model_classes:
            if model_class.__name__ not in get_values(MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES):
                continue

            with self.subTest(msg=f"Testing {model_class}"):
                with tempfile.TemporaryDirectory() as tmp_dir:
                    model = model_class(config)
                    model.save_pretrained(tmp_dir)

                    # Fails when we don't set ignore_mismatched_sizes=True
                    with self.assertRaises(RuntimeError):
                        new_model = AutoModelForSequenceClassification.from_pretrained(tmp_dir, num_labels=42)
                    with self.assertRaises(RuntimeError):
>                       new_model_without_prefix = AutoModel.from_pretrained(tmp_dir, vocab_size=10)
E                       AssertionError: RuntimeError not raised

tests/test_modeling_common.py:2640: AssertionError
```
Hey! I'll try to have a look. It looks like setting the vocab_size does not change the shape of the model, which means it does not raise an error when it should!
@ArthurZucker Thanks! I suspected the vocab_size argument as well. Since we inherit from BioGptModel, I assumed that was already handled, and I could not figure out what I was missing. Looking forward to your suggestions.
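A plain-Python sketch of the failure mode described above (names hypothetical, no transformers or torch): if the `vocab_size` kwarg never reaches the embedding size, the freshly built model matches the checkpoint exactly, loading succeeds, and `assertRaises(RuntimeError)` fails.

```python
class Config:
    def __init__(self, vocab_size=100, hidden_size=16):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size


class TinyModel:
    def __init__(self, config, honor_vocab_size=True):
        # Buggy wiring ignores config.vocab_size and keeps the default size.
        size = config.vocab_size if honor_vocab_size else 100
        self.embed_shape = (size, config.hidden_size)


def load_checkpoint(saved_shape, model):
    # Mimics strict state-dict loading: mismatched shapes raise RuntimeError.
    if saved_shape != model.embed_shape:
        raise RuntimeError(f"size mismatch: {saved_shape} vs {model.embed_shape}")


saved_shape = TinyModel(Config()).embed_shape  # checkpoint saved with vocab_size=100


def raises(model):
    try:
        load_checkpoint(saved_shape, model)
        return False
    except RuntimeError:
        return True


print(raises(TinyModel(Config(vocab_size=10), honor_vocab_size=False)))  # False: bug, test fails
print(raises(TinyModel(Config(vocab_size=10))))                          # True: fixed wiring
```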
@ArthurZucker Those changes did the trick; all the CI tests pass now. Thanks for your help!