GLiNER: Regarding fine-tuning on a specific domain

Dear author, does examples/finetune.ipynb already include negative entity sampling? If not, how can we adjust it to incorporate negative entity sampling?

Apr 13 '24 12:04 QuangTQV

It already includes in-batch negative sampling.
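
For readers unfamiliar with the term, the idea behind in-batch negative sampling can be sketched roughly as follows: entity types that appear in other examples of the same batch are offered to an example as types it does not contain. This is only an illustration of the general technique, not GLiNER's actual implementation. The function name `add_in_batch_negatives`, the `max_negatives` parameter, and the `positive_types` / `negative_types` keys are all hypothetical, and the `tokenized_text` / `ner` layout with inclusive token indices is assumed to match the notebook's data.

```python
import random


def add_in_batch_negatives(batch, max_negatives=2, seed=0):
    """Attach negative entity types drawn from labels used elsewhere in the batch.

    Hypothetical helper for illustration only; not part of the GLiNER API.
    """
    rng = random.Random(seed)
    # Every entity type that appears anywhere in this batch.
    batch_labels = sorted({label for ex in batch for _, _, label in ex["ner"]})
    augmented = []
    for ex in batch:
        positives = sorted({label for _, _, label in ex["ner"]})
        # Candidate negatives: types seen in the batch but absent from this example.
        candidates = [label for label in batch_labels if label not in positives]
        negatives = rng.sample(candidates, min(max_negatives, len(candidates)))
        augmented.append(
            {**ex, "positive_types": positives, "negative_types": sorted(negatives)}
        )
    return augmented


batch = [
    {
        "tokenized_text": ["What", "is", "the", "bitcoin", "price", "?"],
        "ner": [[3, 3, "cryptocurrency"]],
    },
    {
        "tokenized_text": ["Uniswap", "runs", "on", "Ethereum"],
        "ner": [[0, 0, "protocol"], [3, 3, "blockchain"]],
    },
]
augmented = add_in_batch_negatives(batch)
```

Under this sketch, the first example is asked about `blockchain` and `protocol` in addition to `cryptocurrency`, so the model also sees types for which the correct answer is "no span".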

Apr 13 '24 12:04 urchade

thanks ^^

Apr 13 '24 16:04 QuangTQV

How can I make the GLiNER model biased towards my specific domain data? My domain is easily confused with other domains. For example, "harryporter price" is a question about a cryptocurrency price, but the model could mistakenly interpret it as being about a book or something else.

Apr 13 '24 17:04 QuangTQV

The solution is to fine-tune the model on your specialized domain. You can, for instance, generate synthetic data for that.
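
One simple way to produce such synthetic data is to fill templates with domain terms and record where each term lands. The sketch below is a minimal example under several assumptions: the `tokenized_text` / `ner` layout with inclusive token indices is assumed to match what the fine-tuning notebook expects, whitespace tokenization is a simplification, and `make_example`, the templates, and the label names are hypothetical.

```python
def make_example(template, slot_values):
    """Fill a whitespace-tokenized template and record inclusive token spans.

    Slots are written as {label}; the label doubles as the entity type.
    Hypothetical helper for illustration only.
    """
    tokens, spans = [], []
    for piece in template.split():
        if piece.startswith("{") and piece.endswith("}"):
            label = piece[1:-1]
            value_tokens = slot_values[label].split()
            start = len(tokens)
            tokens.extend(value_tokens)
            # End index is inclusive, so a one-token value has start == end.
            spans.append([start, len(tokens) - 1, label])
        else:
            tokens.append(piece)
    return {"tokenized_text": tokens, "ner": spans}


# The ambiguous query from this thread, labeled the way the domain needs it.
short_query = make_example("{cryptocurrency} price", {"cryptocurrency": "harryporter"})

# A multi-token value and a second (illustrative) entity type.
long_query = make_example(
    "price of {cryptocurrency} on {exchange}",
    {"cryptocurrency": "wrapped bitcoin", "exchange": "Binance"},
)
```

Varying the templates and the slot values gives the model many phrasings in which an ambiguous name is consistently tagged with the domain label.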

Apr 13 '24 18:04 urchade

I know I should fine-tune on my domain-specific data, but my dataset is very small compared to the data the model was pre-trained on, so I'm afraid fine-tuning won't bias the model towards my data. Do you have any suggestions for a good fine-tuning approach? My data consists of entities in the blockchain domain.

Apr 13 '24 18:04 QuangTQV

Even with a small dataset it should work. How many samples do you have exactly? I have read about someone fine-tuning with 20-30 samples and getting strong performance in their domain.

Apr 13 '24 18:04 urchade

I have 500 samples for each entity, and I need to extract about 8 entities.

Apr 13 '24 18:04 QuangTQV

I did some testing last week and generated 70 synthetic examples to bias the model towards classifying different kinds of labels associated with bird nesting and dietary habits. It works quite well. If your real-world data is fairly consistent, this helps too. You will want to adjust the number of steps in the fine-tune notebook accordingly.
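
As a rough illustration of scaling the step count with dataset size, the sketch below derives a number of training steps from the number of examples. The batch size of 8 and the 10 epochs are placeholder assumptions, not values taken from the notebook, and `estimate_train_steps` is a hypothetical helper. The figure of 4000 examples assumes 500 distinct samples for each of 8 entity types, which is an upper bound if samples are shared between types.

```python
import math


def estimate_train_steps(num_examples, batch_size=8, epochs=10):
    """Return the number of optimizer steps needed to cover the data `epochs` times.

    batch_size and epochs are placeholder defaults; tune them for your setup.
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


small_run = estimate_train_steps(70)    # the 70 synthetic examples mentioned above
large_run = estimate_train_steps(4000)  # assumed upper bound: 500 samples x 8 entity types
```

With these placeholder settings the small dataset needs 90 steps and the larger one 5000, which shows why a step count tuned for one dataset size should not be reused unchanged for another.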

Apr 14 '24 00:04 wjbmattingly