Getting bad results when fine-tuning GLiNER
Hi everyone, I am trying to fine-tune GLiNER but the results are consistently poor.
At first, I used an artificially generated dataset, but after fine-tuning, the model completely lost its ability to recognize and correctly label entities.
To verify whether the problem was with my data, I ran the official examples/fine-tuning.ipynb notebook with the sample dataset provided by the authors. However, even with that dataset, the model produces very bad results. (It is unclear to me how it previously gave good results, since now it consistently fails, even though I am using the same code and the same dataset.)
I also read in another issue that the model may suffer from catastrophic forgetting, and that merging the Pile-NER dataset with the target dataset during fine-tuning could help. I tried that approach with various dataset sizes (small, medium, large), various ratios between my dataset and Pile-NER (1:2, 1:5), and various hyperparameters, but the results did not improve.
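A minimal sketch of the mixing step, in case it helps others reproduce it (both inputs are assumed to be lists of GLiNER-style records; the helper name and the fixed seed are my own choices, not from the GLiNER codebase):

```python
import random

def mix_datasets(target, pile_ner, ratio=2, seed=42):
    """Mix the target dataset with `ratio` times as many Pile-NER
    examples (or all of them, if fewer are available)."""
    rng = random.Random(seed)
    n_pile = min(len(pile_ner), ratio * len(target))
    mixed = list(target) + rng.sample(pile_ner, n_pile)
    rng.shuffle(mixed)
    return mixed

# Toy records standing in for real GLiNER examples
target = [{"tokenized_text": ["a"], "ner": []} for _ in range(10)]
pile = [{"tokenized_text": ["b"], "ner": []} for _ in range(100)]
mixed = mix_datasets(target, pile, ratio=2)
print(len(mixed))  # → 30 examples at a 1:2 target:Pile-NER ratio
```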
After fine-tuning, the model tends to over-predict by labeling every token, or sometimes even entire phrases, as entities. In many cases, it assigns the same label to all of them, and even when the labels vary, they are consistently incorrect. Here is a sample output:
Whiplash => treatment
, => treatment
a => treatment
soft => treatment
tissue injury to the neck, is also called neck sprain => treatment
or => treatment
strain => treatment
. => treatment
Treatment depends on the cause => treatment
, => treatment
but => treatment
may include => treatment
applying => treatment
ice => treatment
, => treatment
taking => treatment
pain relievers, getting physical therapy or wearing => treatment
a => treatment
cervical collar => treatment
. => treatment
You => treatment
rarely => treatment
need => treatment
surgery => treatment
. => treatment
Has anyone else experienced this issue? Is there a recommended approach to improve this?
@urchade @Ingvarstep Sorry for bothering but I'm doing a bachelor's degree project on this :)
@tamarastojanova, hi, thanks for informing me about this issue. It was caused by an incorrect interface to the loss function: the masking of negative labels was done incorrectly. Here is my fix; it's still under review, feel free to give your feedback: https://github.com/urchade/GLiNER/pull/296
Hi @Ingvarstep, I have uninstalled and redownloaded the library and the model, and I'm still getting the same behaviour. Here is my data (completely synthetic):
import json
import random

with open("processed_data.json", "r", encoding='utf-8') as f:
    data = json.load(f)
print('Dataset size:', len(data))
random.shuffle(data)
print('Dataset is shuffled...')
train_dataset = data[:int(len(data)*0.9)]
test_dataset = data[int(len(data)*0.9):]
print('Dataset is split...')
print(train_dataset[0])
{'tokenized_text': ['', '', 'Reporte', 'Clínico', '', '', '\n\n', '', '', 'Paciente', ':', '', 'María', 'del', 'Carmen', 'Ramos', 'Martínez', '\n', '', '', 'DNI', ':', '', '12345678', '\n', '', '', 'Fecha', 'de', 'Nacimiento', ':', '', '12', 'de', 'mayo', 'de', '1985', '\n', '', '', 'Teléfono', ':', '', '915654321', '\n', '', '', 'Dirección', ':', '', 'Cl', '.', 'de', 'Alcalá', ',', '123', ',', 'Madrid', '\n\n', '', '', 'Fecha', 'del', 'Examen', ':', '', '22', 'de', 'febrero', 'de', '2023', '\n', '', '', 'Médico', ':', '', 'Dr.', 'Juan', 'Luis', 'Ramos', '\n\n', '', '', 'Examen', 'Físico', ':', '', 'El', 'paciente', 'se', 'presentó', 'en', 'la', 'consulta', 'con', 'un', 'estado', 'general', 'de', 'salud', 'regular', '.', 'El', 'apoyo', 'corporal', 'es', 'de', '165', 'cm', 'y', 'un', 'peso', 'de', '55', 'kg', ',', 'lo', 'que', 'corresponde', 'a', 'una', 'IMC', 'de', '20,5', 'kg', '/', 'm2', '.', 'La', 'piel', 'y', 'los', 'tejidos', 'subcutáneos', 'presentan', 'una', 'normoflora', 'y', 'no', 'hay', 'alteraciones', 'observables', 'en', 'las', 'mucosas', '.', '\n\n', 'El', 'paciente', 'tiene', 'una', 'apnea', 'normal', 'y', 'la', 'periferia', 'pulpar', 'es', 'normal', '.', 'El', 'examen', 'abdominal', 'reveló', 'un', 'abdomen', 'laxo', 'y', 'asimétrico', ',', 'con', 'hipoacusia', 'en', 'el', 'cuadrante', 'superior', 'izquierdo', '.', 'No', 'se', 'observaron', 'lesiones', 'cutáneas', 'o', 'alteraciones', 'en', 'la', 'piel', '.', 'La', 'circulación', 'linfática', 'es', 'normal', '.', '\n\n', 'La', 'inspectión', 'del', 'pecho', 'reveló', 'una', 'silueta', 'corporal', 'regular', ',', 'con', 'respiración', 'ruidosa', 'periférica', 'en', 'ambos', 'lados', '.', 'Se', 'escuchó', 'un', 'sonido', 'cardíaco', 'normal', 'y', 'no', 'se', 'detectaron', 'murmuros', 'o', 'ruidos', 'anómalos', '.', 'La', 'palpación', 'reveló', 'un', 'pulso', 'regular', 'y', 'no', 'se', 'encontraron', 'áreas', 'de', 'tensión', 'muscular', 'anómalas', '.', '\n\n', '', '', 'Hallazgos', ':', '', 'Se', 
'observaron', 'signos', 'de', 'estrés', 'y', 'ansiedad', ',', 'manifestando', 'un', 'estado', 'de', 'alerta', 'y', 'vigilancia', '.', 'La', 'evaluación', 'neurológica', 'reveló', 'reflexos', 'normales', 'y', 'no', 'se', 'detectaron', 'alteraciones', 'en', 'la', 'marcha', 'o', 'coordinación', '.', '\n\n', '', '', 'Recomendaciones', ':', '', 'Como', 'resultado', 'del', 'examen', 'físico', ',', 'es', 'recomendable', 'que', 'el', 'paciente', 'realice', 'un', 'seguimiento', 'médico', 'cada', '6', 'meses', 'para', 'evaluar', 'su', 'estado', 'general', 'de', 'salud', 'y', 'controlar', 'las', 'posibles', 'complicaciones', '.', 'También', 'se', 'recomienda', 'que', 'el', 'paciente', 'realice', 'un', 'ejercicio', 'moderado', 'regularmente', 'para', 'mejorar', 'su', 'condición', 'física', 'y', 'reducir', 'el', 'estrés', '.', '\n\n', '', '', 'Next', 'Steps', ':', '', 'Se', 'programó', 'una', 'cita', 'para', 'revisión', 'en', '6', 'meses', 'y', 'se', 'le', 'recomienda', 'realizar', 'una', 'consulta', 'con', 'un', 'especialista', 'en', 'medicina', 'general', 'si', 'presenta', 'alguna', 'dolencia', 'o', 'síntoma', 'desagradable', 'en', 'el', 'futuro', '.', '\n\n', '', '', 'Firma', 'del', 'Médico', ':', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '\n', '', '', 'Firma', 'del', 'Paciente', ':', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''], 'ner': [[12, 17, 'PERSON'], [23, 24, 'DNI'], [32, 37, 'DATE'], [43, 44, 'TELEPHONE'], [50, 58, 'ADRESS'], [66, 71, 'DATE'], [78, 81, 'PERSON']]}
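One quick sanity check for data in this format: print what each `[start, end, label]` span actually covers, so off-by-one errors in the generator are easy to spot. A minimal sketch (the helper and the `inclusive_end` flag are illustrative; check which end-index convention your training pipeline expects):

```python
def check_spans(record, inclusive_end=True):
    """Print each annotated span's tokens so off-by-one errors are easy
    to spot; return any spans that fall outside the token list."""
    tokens = record["tokenized_text"]
    problems = []
    for start, end, label in record["ner"]:
        stop = end + 1 if inclusive_end else end
        if not (0 <= start < stop <= len(tokens)):
            problems.append((start, end, label))
            continue
        print(label, "->", " ".join(t for t in tokens[start:stop] if t.strip()))
    return problems

record = {
    "tokenized_text": ["Paciente", ":", "María", "del", "Carmen", "\n"],
    "ner": [[2, 4, "PERSON"]],
}
assert check_spans(record) == []  # no out-of-range spans
```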
import os
os.environ["TOKENIZERS_PARALLELISM"] = "true"
import torch
from gliner import GLiNER
from gliner.training import Trainer, TrainingArguments
from gliner.data_processing.collator import DataCollator
device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
entity_types = ["PERSON", "DNI", "DATE", "TELEPHONE", "ADRESS"]
model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")
# Use this collator for better performance; it mimics the original implementation but is less memory efficient
data_collator = DataCollator(
model.config,
data_processor=model.data_processor,
prepare_labels=True,
#entity_types=entity_types # <-- Add this line
)
# Move the model to the selected device
model.to(device)
print("done")
# Define key hyperparameters first for clarity
output_path = "spanish-medical-anonymization-gliner"
num_epochs = 15 # Increased epochs for the small dataset, monitor for overfitting
batch_size = 5 # Smaller batch size can improve performance on small datasets
training_args = TrainingArguments(
output_dir=output_path,
num_train_epochs=num_epochs,
per_device_train_batch_size=batch_size,
per_device_eval_batch_size=batch_size,
# --- Learning Rate & Scheduler ---
learning_rate=1e-5, # A slightly higher LR is often better for fine-tuning
others_lr=1e-4, # Faster learning for the new layers (span repr)
weight_decay=0.01,
lr_scheduler_type="cosine", # Cosine scheduler can lead to better convergence
warmup_ratio=0.1,
# --- Evaluation and Saving ---
eval_strategy="epoch", # Evaluate at the end of each epoch
save_strategy="epoch", # Save a checkpoint at the end of each epoch
load_best_model_at_end=True, # Crucial: loads the best model after training
save_total_limit=2, # Saves the best and the most recent checkpoint
# --- Other Settings ---
dataloader_num_workers=2, # Can speed up data loading if CPU has multiple cores
report_to="none",
)
trainer = Trainer(
model=model,
args=training_args,
train_dataset=train_dataset,
eval_dataset=test_dataset,
#tokenizer=model.data_processor.transformer_tokenizer,
processing_class=model.data_processor.transformer_tokenizer,
data_collator=data_collator,
)
trainer.train()
# --- Save the best model's configuration files ---
# This saves the fine-tuned GLiNER-specific config files like `gliner_config.json`
# The trainer already saved the full model weights in the best checkpoint directory.
model.save_pretrained(f'{output_path}/best_model_files', from_pt=True)
print(f"✅ Best model saved to checkpoint directory in '{output_path}'")
print(f"✅ GLiNER config files saved to '{output_path}/best_model_files'")
With this code I'm getting this behaviour on training:
I don't see what I am doing wrong. Can you please help me?
Thanks.
Hi, I'm sorry for the bug you experienced. To apply the latest changes, you need to install GLiNER from source; please check the instructions here: https://github.com/urchade/GLiNER/blob/main/README_Extended.md#install-from-source
Thanks! This solved the issue. Very grateful.
You are welcome
Hi again. I managed to train the model by installing from source, but now I'm running into a different error when trying to run inference with the model I just fine-tuned. Here is the code:
# --- Step 1: Load the fine-tuned model from the Hub ---
# This uses the same repository ID from the previous step.
from gliner import GLiNER
repo_id = "Juan281992/spanish-medical-anonymization-gliner_small-v2.5"
model = GLiNER.from_pretrained(repo_id)
# --- Step 2: Prepare your text and entity labels ---
# Replace this text with any other Spanish text you want to analyze.
text_to_test = """
**Informe de Consulta Ambulatoria**\n\n**Paciente:** María del Carmen Gómez Iglesias\n**Fecha de nacimiento:** 15 de mayo de 1985\n**DNI:** 74732156-L\n**Teléfono:** 696812345\n**Dirección:** Alameda de la flor, Nº 12, 3er piso, Puerta B, 28021 Madrid\n\n**Fecha y hora de la consulta:** 10 de marzo de 2023, 10:30 horas\n\nMaría del Carmen Gómez Iglesias, nacida en 1985, acudió a la consulta con queja de dolor abdominal crónico y flatulencia recurrente de varios días de evolución. Presentó una historia previa de diabetes tipo 2, hipertensión y obesidad mórbida.\n\nAl realizar la exploración física, se observó que la paciente presentaba dolor abdominal difuso, más intenso en la región epigástrica y con un patrón de doloramiento irregular. La peritonealidad fue anestésica y no se detectaron signos de inflamación abdominal evidente.\n\nSe solicitó un estudio de laboratorio, que incluyó análisis de sangre completo, urine y examen de hemograma, el cual reveló un aumento leve en la glucemia basal y presencia de proteínas en orina.\n\nConsiderando los estudios y la exploración física, se estableció el diagnóstico de enfermedad de Crohn, una enfermedad inflamatoria crónica del tracto gastrointestinal, caracterizada por la inflamación crónica y lesiones en la mucosa intestinal.\n\nSe estableció un plan de tratamiento para la paciente, que incluye el seguimiento con medicación antiinflamatoria, dieta restrictiva en grasas y fibras, y posibles intervenciones terapéticas láser en caso de no respuesta al tratamiento.\n\n**Firma y nombre del médico responsable:**\n\nDr. Juan Carlos Martín González, Médico de Familia\n\n**Fe de erratas:** Ninguna.
"""
# These are the PII (Personally Identifiable Information) entity types my model was trained to recognize.
entity_labels = [
"PERSON",
"ADDRESS",
"TELEPHONE",
"EMAIL",
"DNI",
"SEX"
]
# --- Step 3: Run prediction ---
print("Running inference on the test text...")
entities = model.predict(text_to_test,entity_labels)
# --- Step 4: Display the results ---
print("\n--- Entidades de Identificación Personal Encontradas ---")
if not entities:
    print("No se encontraron entidades.")
else:
    for entity in entities:
        print(f" - Texto: '{entity['text']}'")
        print(f"   Label: {entity['label']}")
        print(f"   Posición: [{entity['start']}, {entity['end']}]\n")
TypeError Traceback (most recent call last)
Cell In[13], line 25
23 # --- Step 3: Run prediction ---
24 print("Running inference on the test text...")
---> 25 entities = model.predict(text_to_test, entity_labels)
27 # --- Step 4: Display the results ---
28 print("\n--- Entidades de Identificación Personal Encontradas ---")
File [/opt/conda/lib/python3.11/site-packages/gliner/model.py:645](https://oxt9kwyeq2twj0g.studio.eu-south-2.sagemaker.aws/opt/conda/lib/python3.11/site-packages/gliner/model.py#line=644), in GLiNER.predict(self, batch, flat_ner, threshold, multi_label)
632 def predict(self, batch, flat_ner=False, threshold=0.5, multi_label=False):
633 """
634 Predict the entities for a given batch of data.
635
(...)
643 List: Predicted entities for each example in the batch.
644 """
--> 645 model_output = self.model(**batch)[0]
647 if not isinstance(model_output, torch.Tensor):
648 model_output = torch.from_numpy(model_output)
TypeError: SpanModel(
(token_rep_layer): Encoder(
(bert_layer): Transformer(
(model): DebertaV2Model(
(embeddings): DebertaV2Embeddings(
(word_embeddings): Embedding(128003, 768, padding_idx=0)
(LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
(dropout): StableDropout()
)
(encoder): DebertaV2Encoder(
(layer): ModuleList(
(0-5): 6 x DebertaV2Layer(
(attention): DebertaV2Attention(
(self): DisentangledSelfAttention(
(query_proj): Linear(in_features=768, out_features=768, bias=True)
(key_proj): Linear(in_features=768, out_features=768, bias=True)
(value_proj): Linear(in_features=768, out_features=768, bias=True)
(pos_dropout): StableDropout()
(dropout): StableDropout()
)
(output): DebertaV2SelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
(dropout): StableDropout()
)
)
(intermediate): DebertaV2Intermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): DebertaV2Output(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
(dropout): StableDropout()
)
)
)
(rel_embeddings): Embedding(512, 768)
(LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
)
)
)
)
(rnn): LstmSeq2SeqEncoder(
(lstm): LSTM(768, 384, batch_first=True, bidirectional=True)
)
(span_rep_layer): SpanRepLayer(
(span_rep_layer): SpanMarkerV0(
(project_start): Sequential(
(0): Linear(in_features=768, out_features=3072, bias=True)
(1): ReLU()
(2): Dropout(p=0.4, inplace=False)
(3): Linear(in_features=3072, out_features=768, bias=True)
)
(project_end): Sequential(
(0): Linear(in_features=768, out_features=3072, bias=True)
(1): ReLU()
(2): Dropout(p=0.4, inplace=False)
(3): Linear(in_features=3072, out_features=768, bias=True)
)
(out_project): Sequential(
(0): Linear(in_features=1536, out_features=3072, bias=True)
(1): ReLU()
(2): Dropout(p=0.4, inplace=False)
(3): Linear(in_features=3072, out_features=768, bias=True)
)
)
)
(prompt_rep_layer): Sequential(
(0): Linear(in_features=768, out_features=3072, bias=True)
(1): ReLU()
(2): Dropout(p=0.4, inplace=False)
(3): Linear(in_features=3072, out_features=768, bias=True)
)
) argument after ** must be a mapping, not str
From what I've seen in the examples, this usage looks correct, so I'm not sure whether there is another issue with the library or with the way my model was trained.
Thanks in advance for any help!
Hi, I just found my error. I was using the wrong method: the one I was calling is meant for batches. The correct method for a single string is `predict_entities`. I'll leave this here in case it happens to anyone else. Thank you!
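For future readers, here is the difference in one place. This is a sketch: `StubModel` stands in for the real fine-tuned model so the snippet runs without downloading weights; the actual call would be `GLiNER.from_pretrained(repo_id).predict_entities(text, labels, threshold=...)`, which returns a list of dicts with `text`, `label`, `start`, `end`, and `score` keys.

```python
class StubModel:
    """Stand-in for a GLiNER model; the real object would come from
    GLiNER.from_pretrained(repo_id)."""
    def predict_entities(self, text, labels, threshold=0.5):
        # Pretend the model found one PERSON span in the input text.
        start = text.index("María")
        end = start + len("María del Carmen Gómez Iglesias")
        return [{"text": text[start:end], "label": "PERSON",
                 "start": start, "end": end, "score": 0.98}]

model = StubModel()
text = "Paciente: María del Carmen Gómez Iglesias, DNI 74732156-L"
# model.predict(...) expects an already-collated batch dict, hence the
# "argument after ** must be a mapping, not str" TypeError above.
entities = model.predict_entities(text, ["PERSON", "DNI"], threshold=0.5)
for e in entities:
    print(f"{e['text']} => {e['label']} ({e['score']:.2f})")
# → María del Carmen Gómez Iglesias => PERSON (0.98)
```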
Thank you for this, I had the same issue, it is now resolved.
I would recommend updating the fine-tuning Google Colab example in two ways:
- env setup to install gliner from the source:
%%bash
pip uninstall gliner -y
rm -rf GLiNER
git clone https://github.com/urchade/GLiNER.git
cd GLiNER
pip install -r requirements.txt
pip install .
python -c "import gliner; print(f'✓ GLiNER version: {gliner.__version__}')"
- update the `eval_strategy` argument in the `TrainingArguments` class.
