Getting bad results when fine-tuning GLiNER
Hi everyone, I am trying to fine-tune GLiNER but the results are consistently poor.
At first, I used an artificially generated dataset, but after fine-tuning, the model completely lost its ability to recognize and correctly label entities.
To verify whether the problem was with my data, I ran the official examples/fine-tuning.ipynb notebook with the sample dataset provided by the authors. However, even with that dataset, the model produces very bad results. (It is unclear to me how it previously gave good results, since now it consistently fails, even though I am using the same code and the same dataset.)
I also read in another issue that the model may suffer from catastrophic forgetting, and that merging the Pile-NER dataset with the target dataset during fine-tuning could help. I tried that approach with various dataset sizes (small, medium, large), various ratios between my dataset and Pile-NER (1:2, 1:5), and various hyperparameters, but the results did not improve.
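A minimal sketch of the mixing step, in case it helps others reproduce it (both inputs are assumed to be lists of GLiNER-style records; the helper name and the fixed seed are my own choices, not from the GLiNER codebase):

```python
import random

def mix_datasets(target, pile_ner, ratio=2, seed=42):
    """Mix the target dataset with `ratio` times as many Pile-NER
    examples (or all of them, if fewer are available)."""
    rng = random.Random(seed)
    n_pile = min(len(pile_ner), ratio * len(target))
    mixed = list(target) + rng.sample(pile_ner, n_pile)
    rng.shuffle(mixed)
    return mixed

# Toy records standing in for real GLiNER examples
target = [{"tokenized_text": ["a"], "ner": []} for _ in range(10)]
pile = [{"tokenized_text": ["b"], "ner": []} for _ in range(100)]
mixed = mix_datasets(target, pile, ratio=2)
print(len(mixed))  # → 30 examples at a 1:2 target:Pile-NER ratio
```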
After fine-tuning, the model tends to over-predict by labeling every token, or sometimes even entire phrases, as entities. In many cases, it assigns the same label to all of them, and even when the labels vary, they are consistently incorrect. Here is a sample output:
Whiplash => treatment
, => treatment
a => treatment
soft => treatment
tissue injury to the neck, is also called neck sprain => treatment
or => treatment
strain => treatment
. => treatment
Treatment depends on the cause => treatment
, => treatment
but => treatment
may include => treatment
applying => treatment
ice => treatment
, => treatment
taking => treatment
pain relievers, getting physical therapy or wearing => treatment
a => treatment
cervical collar => treatment
. => treatment
You => treatment
rarely => treatment
need => treatment
surgery => treatment
. => treatment
Has anyone else experienced this issue? Is there a recommended approach to improve this?
@urchade @Ingvarstep Sorry for bothering but I'm doing a bachelor's degree project on this :)
@tamarastojanova, hi, thanks for informing me about this issue. It was caused by an incorrect interface to the loss function: the masking of negative labels was done incorrectly. Here is my fix; it's still under review, feel free to give your feedback: https://github.com/urchade/GLiNER/pull/296
Hi @Ingvarstep, I have uninstalled and redownloaded the library and the model, and I'm still getting the same behaviour. Here is my data (completely synthetic):
import json
import random

with open("processed_data.json", "r", encoding='utf-8') as f:
    data = json.load(f)
print('Dataset size:', len(data))
random.shuffle(data)
print('Dataset is shuffled...')
train_dataset = data[:int(len(data)*0.9)]
test_dataset = data[int(len(data)*0.9):]
print('Dataset is split...')
print(train_dataset[0])
{'tokenized_text': ['', '', 'Reporte', 'Clínico', '', '', '\n\n', '', '', 'Paciente', ':', '', 'María', 'del', 'Carmen', 'Ramos', 'Martínez', '\n', '', '', 'DNI', ':', '', '12345678', '\n', '', '', 'Fecha', 'de', 'Nacimiento', ':', '', '12', 'de', 'mayo', 'de', '1985', '\n', '', '', 'Teléfono', ':', '', '915654321', '\n', '', '', 'Dirección', ':', '', 'Cl', '.', 'de', 'Alcalá', ',', '123', ',', 'Madrid', '\n\n', '', '', 'Fecha', 'del', 'Examen', ':', '', '22', 'de', 'febrero', 'de', '2023', '\n', '', '', 'Médico', ':', '', 'Dr.', 'Juan', 'Luis', 'Ramos', '\n\n', '', '', 'Examen', 'Físico', ':', '', 'El', 'paciente', 'se', 'presentó', 'en', 'la', 'consulta', 'con', 'un', 'estado', 'general', 'de', 'salud', 'regular', '.', 'El', 'apoyo', 'corporal', 'es', 'de', '165', 'cm', 'y', 'un', 'peso', 'de', '55', 'kg', ',', 'lo', 'que', 'corresponde', 'a', 'una', 'IMC', 'de', '20,5', 'kg', '/', 'm2', '.', 'La', 'piel', 'y', 'los', 'tejidos', 'subcutáneos', 'presentan', 'una', 'normoflora', 'y', 'no', 'hay', 'alteraciones', 'observables', 'en', 'las', 'mucosas', '.', '\n\n', 'El', 'paciente', 'tiene', 'una', 'apnea', 'normal', 'y', 'la', 'periferia', 'pulpar', 'es', 'normal', '.', 'El', 'examen', 'abdominal', 'reveló', 'un', 'abdomen', 'laxo', 'y', 'asimétrico', ',', 'con', 'hipoacusia', 'en', 'el', 'cuadrante', 'superior', 'izquierdo', '.', 'No', 'se', 'observaron', 'lesiones', 'cutáneas', 'o', 'alteraciones', 'en', 'la', 'piel', '.', 'La', 'circulación', 'linfática', 'es', 'normal', '.', '\n\n', 'La', 'inspectión', 'del', 'pecho', 'reveló', 'una', 'silueta', 'corporal', 'regular', ',', 'con', 'respiración', 'ruidosa', 'periférica', 'en', 'ambos', 'lados', '.', 'Se', 'escuchó', 'un', 'sonido', 'cardíaco', 'normal', 'y', 'no', 'se', 'detectaron', 'murmuros', 'o', 'ruidos', 'anómalos', '.', 'La', 'palpación', 'reveló', 'un', 'pulso', 'regular', 'y', 'no', 'se', 'encontraron', 'áreas', 'de', 'tensión', 'muscular', 'anómalas', '.', '\n\n', '', '', 'Hallazgos', ':', '', 'Se', 
'observaron', 'signos', 'de', 'estrés', 'y', 'ansiedad', ',', 'manifestando', 'un', 'estado', 'de', 'alerta', 'y', 'vigilancia', '.', 'La', 'evaluación', 'neurológica', 'reveló', 'reflexos', 'normales', 'y', 'no', 'se', 'detectaron', 'alteraciones', 'en', 'la', 'marcha', 'o', 'coordinación', '.', '\n\n', '', '', 'Recomendaciones', ':', '', 'Como', 'resultado', 'del', 'examen', 'físico', ',', 'es', 'recomendable', 'que', 'el', 'paciente', 'realice', 'un', 'seguimiento', 'médico', 'cada', '6', 'meses', 'para', 'evaluar', 'su', 'estado', 'general', 'de', 'salud', 'y', 'controlar', 'las', 'posibles', 'complicaciones', '.', 'También', 'se', 'recomienda', 'que', 'el', 'paciente', 'realice', 'un', 'ejercicio', 'moderado', 'regularmente', 'para', 'mejorar', 'su', 'condición', 'física', 'y', 'reducir', 'el', 'estrés', '.', '\n\n', '', '', 'Next', 'Steps', ':', '', 'Se', 'programó', 'una', 'cita', 'para', 'revisión', 'en', '6', 'meses', 'y', 'se', 'le', 'recomienda', 'realizar', 'una', 'consulta', 'con', 'un', 'especialista', 'en', 'medicina', 'general', 'si', 'presenta', 'alguna', 'dolencia', 'o', 'síntoma', 'desagradable', 'en', 'el', 'futuro', '.', '\n\n', '', '', 'Firma', 'del', 'Médico', ':', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '\n', '', '', 'Firma', 'del', 'Paciente', ':', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''], 'ner': [[12, 17, 'PERSON'], [23, 24, 'DNI'], [32, 37, 'DATE'], [43, 44, 'TELEPHONE'], [50, 58, 'ADRESS'], [66, 71, 'DATE'], [78, 81, 'PERSON']]}
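One quick sanity check for data in this format: print what each `[start, end, label]` span actually covers, so off-by-one errors in the generator are easy to spot. A minimal sketch (the helper and the `inclusive_end` flag are illustrative; check which end-index convention your training pipeline expects):

```python
def check_spans(record, inclusive_end=True):
    """Print each annotated span's tokens so off-by-one errors are easy
    to spot; return any spans that fall outside the token list."""
    tokens = record["tokenized_text"]
    problems = []
    for start, end, label in record["ner"]:
        stop = end + 1 if inclusive_end else end
        if not (0 <= start < stop <= len(tokens)):
            problems.append((start, end, label))
            continue
        print(label, "->", " ".join(t for t in tokens[start:stop] if t.strip()))
    return problems

record = {
    "tokenized_text": ["Paciente", ":", "María", "del", "Carmen", "\n"],
    "ner": [[2, 4, "PERSON"]],
}
assert check_spans(record) == []  # no out-of-range spans
```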
import os
os.environ["TOKENIZERS_PARALLELISM"] = "true"
import torch
from gliner import GLiNER
from gliner.training import Trainer, TrainingArguments
from gliner.data_processing.collator import DataCollator
device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
entity_types = ["PERSON", "DNI", "DATE", "TELEPHONE", "ADRESS"]
model = GLiNER.from_pretrained("urchade/gliner_small-v2.1")
# Use this collator for better performance; it mimics the original implementation but is less memory efficient
data_collator = DataCollator(
model.config,
data_processor=model.data_processor,
prepare_labels=True,
#entity_types=entity_types # <-- Add this line
)
# Move the model to the selected device
model.to(device)
print("done")
# Define key hyperparameters first for clarity
output_path = "spanish-medical-anonymization-gliner"
num_epochs = 15 # Increased epochs for the small dataset, monitor for overfitting
batch_size = 5 # Smaller batch size can improve performance on small datasets
training_args = TrainingArguments(
output_dir=output_path,
num_train_epochs=num_epochs,
per_device_train_batch_size=batch_size,
per_device_eval_batch_size=batch_size,
# --- Learning Rate & Scheduler ---
learning_rate=1e-5, # A slightly higher LR is often better for fine-tuning
others_lr=1e-4, # Faster learning for the new layers (span repr)
weight_decay=0.01,
lr_scheduler_type="cosine", # Cosine scheduler can lead to better convergence
warmup_ratio=0.1,
# --- Evaluation and Saving ---
eval_strategy="epoch", # Evaluate at the end of each epoch
save_strategy="epoch", # Save a checkpoint at the end of each epoch
load_best_model_at_end=True, # Crucial: loads the best model after training
save_total_limit=2, # Saves the best and the most recent checkpoint
# --- Other Settings ---
dataloader_num_workers=2, # Can speed up data loading if CPU has multiple cores
report_to="none",
)
trainer = Trainer(
model=model,
args=training_args,
train_dataset=train_dataset,
eval_dataset=test_dataset,
#tokenizer=model.data_processor.transformer_tokenizer,
processing_class=model.data_processor.transformer_tokenizer,
data_collator=data_collator,
)
trainer.train()
# --- Save the best model's configuration files ---
# This saves the fine-tuned GLiNER-specific config files like `gliner_config.json`
# The trainer already saved the full model weights in the best checkpoint directory.
model.save_pretrained(f'{output_path}/best_model_files', from_pt=True)
print(f"✅ Best model saved to checkpoint directory in '{output_path}'")
print(f"✅ GLiNER config files saved to '{output_path}/best_model_files'")
With this code I'm getting this behaviour on training:
I don't see what I am doing wrong. Can you please help me?
Thanks.
Hi, I'm sorry for the bug you experienced. To apply the latest changes, you need to install GLiNER from source; please check the instructions here: https://github.com/urchade/GLiNER/blob/main/README_Extended.md#install-from-source
Thanks! This solved the issue. Very grateful.
You are welcome
Hi again. I managed to train the model by installing from source, but now I'm running into a different error when trying to run inference with the model I just fine-tuned. Here is the code:
# --- Step 1: Load the fine-tuned model from the Hub ---
# This uses the same repository ID from the previous step.
from gliner import GLiNER
repo_id = "Juan281992/spanish-medical-anonymization-gliner_small-v2.5"
model = GLiNER.from_pretrained(repo_id)
# --- Step 2: Prepare your text and entity labels ---
# Replace this text with any other Spanish text you want to analyze.
text_to_test = """
**Informe de Consulta Ambulatoria**\n\n**Paciente:** María del Carmen Gómez Iglesias\n**Fecha de nacimiento:** 15 de mayo de 1985\n**DNI:** 74732156-L\n**Teléfono:** 696812345\n**Dirección:** Alameda de la flor, Nº 12, 3er piso, Puerta B, 28021 Madrid\n\n**Fecha y hora de la consulta:** 10 de marzo de 2023, 10:30 horas\n\nMaría del Carmen Gómez Iglesias, nacida en 1985, acudió a la consulta con queja de dolor abdominal crónico y flatulencia recurrente de varios días de evolución. Presentó una historia previa de diabetes tipo 2, hipertensión y obesidad mórbida.\n\nAl realizar la exploración física, se observó que la paciente presentaba dolor abdominal difuso, más intenso en la región epigástrica y con un patrón de doloramiento irregular. La peritonealidad fue anestésica y no se detectaron signos de inflamación abdominal evidente.\n\nSe solicitó un estudio de laboratorio, que incluyó análisis de sangre completo, urine y examen de hemograma, el cual reveló un aumento leve en la glucemia basal y presencia de proteínas en orina.\n\nConsiderando los estudios y la exploración física, se estableció el diagnóstico de enfermedad de Crohn, una enfermedad inflamatoria crónica del tracto gastrointestinal, caracterizada por la inflamación crónica y lesiones en la mucosa intestinal.\n\nSe estableció un plan de tratamiento para la paciente, que incluye el seguimiento con medicación antiinflamatoria, dieta restrictiva en grasas y fibras, y posibles intervenciones terapéticas láser en caso de no respuesta al tratamiento.\n\n**Firma y nombre del médico responsable:**\n\nDr. Juan Carlos Martín González, Médico de Familia\n\n**Fe de erratas:** Ninguna.
"""
# These are the PII (Personally Identifiable Information) entity types my model was trained to recognize.
entity_labels = [
"PERSON",
"ADDRESS",
"TELEPHONE",
"EMAIL",
"DNI",
"SEX"
]
# --- Step 3: Run prediction ---
print("Running inference on the test text...")
entities = model.predict(text_to_test,entity_labels)
# --- Step 4: Display the results ---
print("\n--- Entidades de Identificación Personal Encontradas ---")
if not entities:
    print("No se encontraron entidades.")
else:
    for entity in entities:
        print(f" - Texto: '{entity['text']}'")
        print(f"   Label: {entity['label']}")
        print(f"   Posición: [{entity['start']}, {entity['end']}]\n")
TypeError Traceback (most recent call last)
Cell In[13], line 25
23 # --- Step 3: Run prediction ---
24 print("Running inference on the test text...")
---> 25 entities = model.predict(text_to_test, entity_labels)
27 # --- Step 4: Display the results ---
28 print("\n--- Entidades de Identificación Personal Encontradas ---")
File [/opt/conda/lib/python3.11/site-packages/gliner/model.py:645](https://oxt9kwyeq2twj0g.studio.eu-south-2.sagemaker.aws/opt/conda/lib/python3.11/site-packages/gliner/model.py#line=644), in GLiNER.predict(self, batch, flat_ner, threshold, multi_label)
632 def predict(self, batch, flat_ner=False, threshold=0.5, multi_label=False):
633 """
634 Predict the entities for a given batch of data.
635
(...)
643 List: Predicted entities for each example in the batch.
644 """
--> 645 model_output = self.model(**batch)[0]
647 if not isinstance(model_output, torch.Tensor):
648 model_output = torch.from_numpy(model_output)
TypeError: SpanModel(
(token_rep_layer): Encoder(
(bert_layer): Transformer(
(model): DebertaV2Model(
(embeddings): DebertaV2Embeddings(
(word_embeddings): Embedding(128003, 768, padding_idx=0)
(LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
(dropout): StableDropout()
)
(encoder): DebertaV2Encoder(
(layer): ModuleList(
(0-5): 6 x DebertaV2Layer(
(attention): DebertaV2Attention(
(self): DisentangledSelfAttention(
(query_proj): Linear(in_features=768, out_features=768, bias=True)
(key_proj): Linear(in_features=768, out_features=768, bias=True)
(value_proj): Linear(in_features=768, out_features=768, bias=True)
(pos_dropout): StableDropout()
(dropout): StableDropout()
)
(output): DebertaV2SelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
(dropout): StableDropout()
)
)
(intermediate): DebertaV2Intermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): DebertaV2Output(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
(dropout): StableDropout()
)
)
)
(rel_embeddings): Embedding(512, 768)
(LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
)
)
)
)
(rnn): LstmSeq2SeqEncoder(
(lstm): LSTM(768, 384, batch_first=True, bidirectional=True)
)
(span_rep_layer): SpanRepLayer(
(span_rep_layer): SpanMarkerV0(
(project_start): Sequential(
(0): Linear(in_features=768, out_features=3072, bias=True)
(1): ReLU()
(2): Dropout(p=0.4, inplace=False)
(3): Linear(in_features=3072, out_features=768, bias=True)
)
(project_end): Sequential(
(0): Linear(in_features=768, out_features=3072, bias=True)
(1): ReLU()
(2): Dropout(p=0.4, inplace=False)
(3): Linear(in_features=3072, out_features=768, bias=True)
)
(out_project): Sequential(
(0): Linear(in_features=1536, out_features=3072, bias=True)
(1): ReLU()
(2): Dropout(p=0.4, inplace=False)
(3): Linear(in_features=3072, out_features=768, bias=True)
)
)
)
(prompt_rep_layer): Sequential(
(0): Linear(in_features=768, out_features=3072, bias=True)
(1): ReLU()
(2): Dropout(p=0.4, inplace=False)
(3): Linear(in_features=3072, out_features=768, bias=True)
)
) argument after ** must be a mapping, not str
From what I've seen in the examples, this usage looks correct, so I'm not sure whether there is another issue with the library or with the way my model was trained.
Thanks in advance for any help!
Hi, I just found my error. I was using the wrong method: the one I was calling is meant for batches. The correct method for a single string is `predict_entities`. I'll leave this here in case it happens to anyone else. Thank you!
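For future readers, here is the difference in one place. This is a sketch: `StubModel` stands in for the real fine-tuned model so the snippet runs without downloading weights; the actual call would be `GLiNER.from_pretrained(repo_id).predict_entities(text, labels, threshold=...)`, which returns a list of dicts with `text`, `label`, `start`, `end`, and `score` keys.

```python
class StubModel:
    """Stand-in for a GLiNER model; the real object would come from
    GLiNER.from_pretrained(repo_id)."""
    def predict_entities(self, text, labels, threshold=0.5):
        # Pretend the model found one PERSON span in the input text.
        start = text.index("María")
        end = start + len("María del Carmen Gómez Iglesias")
        return [{"text": text[start:end], "label": "PERSON",
                 "start": start, "end": end, "score": 0.98}]

model = StubModel()
text = "Paciente: María del Carmen Gómez Iglesias, DNI 74732156-L"
# model.predict(...) expects an already-collated batch dict, hence the
# "argument after ** must be a mapping, not str" TypeError above.
entities = model.predict_entities(text, ["PERSON", "DNI"], threshold=0.5)
for e in entities:
    print(f"{e['text']} => {e['label']} ({e['score']:.2f})")
# → María del Carmen Gómez Iglesias => PERSON (0.98)
```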
Thank you for this, I had the same issue, it is now resolved.
I would recommend updating the fine-tuning Google Colab example in two ways:
- env setup to install gliner from the source:
%%bash
pip uninstall gliner -y
rm -rf GLiNER
git clone https://github.com/urchade/GLiNER.git
cd GLiNER
pip install -r requirements.txt
pip install .
python -c "import gliner; print(f'✓ GLiNER version: {gliner.__version__}')"
- update the `eval_strategy` argument in the `TrainingArguments` class.
