
Default max_len value

Open yishusong opened this issue 1 year ago • 4 comments

I'm trying to increase the size of the input texts, but it seems that all large versions of GLiNER (2, 2.1) have a default max_len of 384. What is the reasoning behind the value 384, and can I modify it during inference?

Much appreciated!

yishusong avatar Sep 03 '24 20:09 yishusong

In GLiNER, max_len refers to the word count, not the token count; 384 words are on average roughly equal to 512 tokens. Up to this range, DeBERTa, the backbone transformer of GLiNER, works best. You can increase the maximum length, but performance may start to degrade.

from gliner import GLiNER
import torch

# load the large v2.1 model with a larger max word length, in fp16 on GPU
model = GLiNER.from_pretrained("urchade/gliner_large-v2.1", max_length=768).to('cuda:0', dtype=torch.float16)
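
After loading, inference works as usual, for example (a minimal sketch; the labels and text below are just placeholders):

labels = ["person", "organization", "location"]  # example labels, replace with your own
text = "Long input text goes here..."
entities = model.predict_entities(text, labels, threshold=0.5)
for ent in entities:
    print(ent["text"], "=>", ent["label"])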

Ingvarstep avatar Sep 05 '24 08:09 Ingvarstep

Where can I find full documentation of the arguments for GLiNER? max_length, for example, isn't in any argument docs I could find, yet it works. So my conclusion is that I need those docs so I can search them for modifications. Thanks

GioPetro avatar Sep 06 '24 07:09 GioPetro

I got these warnings too:

UserWarning: Sentence of length 20415 has been truncated to 384
warnings.warn(f"Sentence of length {len(tokens)} has been truncated to 4,500")
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.

Does this mean that when I want to find entities in a large document (here, 20K words), I need to break it into sentences and then merge the results? I see there is a version of the predict method that takes a list of texts, as in the sketch below. That is fine, but it would be best if this were stated clearly.
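
To make the chunking idea concrete, here is roughly what I have in mind (a sketch only, not a tested solution: it splits on whitespace into ~350-word chunks and assumes batch_predict_entities is the batch variant of the predict method; long_document and the labels are placeholders):

from gliner import GLiNER

model = GLiNER.from_pretrained("urchade/gliner_large-v2.1")
labels = ["person", "organization", "location"]  # example labels

def chunk_words(text, max_words=350):
    # split the document into chunks that stay under the model's word limit
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

long_document = "..."  # placeholder for the 20K-word document
chunks = chunk_words(long_document)

# predict over all chunks at once, then merge the per-chunk results
# note: entity start/end offsets are relative to each chunk, not the original document
results = model.batch_predict_entities(chunks, labels, threshold=0.5)
entities = [ent for chunk_entities in results for ent in chunk_entities]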

geraldthewes avatar Mar 04 '25 13:03 geraldthewes

Hi @Ingvarstep

model = GLiNER.from_pretrained("knowledator/gliner-pii-large-v1.0", max_length=384).to(device)

But I am still getting the following warning: "Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation."

Ibrokhimsadikov avatar Nov 03 '25 20:11 Ibrokhimsadikov