Ihor Stepanov

25 comments by Ihor Stepanov

Thank you for pointing it out. You need to change the processor to rely on NumPy, and rewrite part of the conversion script to use ONNX instead of PyTorch. We...
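As a sketch of what "rely on NumPy" can look like in practice, here is a minimal torch-free padding/batching step of the kind an ONNX preprocessing pipeline needs. The function name and shapes are illustrative, not GLiNER's actual processor API:

```python
import numpy as np

def pad_batch(token_ids_batch, pad_id=0):
    """Pad variable-length token-id lists to a rectangular int64 array
    plus an attention mask, using only NumPy (no torch dependency)."""
    max_len = max(len(ids) for ids in token_ids_batch)
    input_ids = np.full((len(token_ids_batch), max_len), pad_id, dtype=np.int64)
    attention_mask = np.zeros_like(input_ids)
    for i, ids in enumerate(token_ids_batch):
        input_ids[i, :len(ids)] = ids
        attention_mask[i, :len(ids)] = 1
    return input_ids, attention_mask

ids, mask = pad_batch([[101, 7592, 102], [101, 102]])
print(ids.shape)  # (2, 3)
```

The resulting arrays can be fed directly to an ONNX Runtime session, which accepts NumPy inputs.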

In GLiNER, the max_len value refers not to the token count but to the word count; 384 words are on average approximately equal to 512 tokens. Up to this range the DeBERTa model -...
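A quick sketch of the word-vs-token distinction (the 1.33 tokens-per-word ratio is just the 384:512 average from above, not a property of any particular tokenizer):

```python
def truncate_to_max_words(text, max_len=384):
    """GLiNER's max_len is a word budget, not a token budget:
    split on whitespace and keep at most max_len words."""
    return " ".join(text.split()[:max_len])

def estimated_tokens(n_words, tokens_per_word=512 / 384):
    """384 words ~= 512 tokens implies roughly 1.33 subword tokens per word."""
    return round(n_words * tokens_per_word)

print(estimated_tokens(384))  # 512
```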

Interesting, thanks for contributing. I recently did something similar for GLiClass models. Let me experiment with your implementation first, and then we will merge it based on the results.

The issue was related to how ONNX matches positional arguments. In the latest commit, I included fixes for these issues.

Sorry for the delay in reviewing your PR. This is good work with important fixes. Thank you!

From my experiments, ONNX models work faster for sequences smaller than 124 words. With a longer input sequence, attention becomes the limiting factor and ONNX is not necessarily more efficient...
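The "attention becomes the limiting factor" point can be made concrete with a back-of-the-envelope FLOP count. The hidden/FFN sizes below are typical transformer defaults used purely for illustration, not measured GLiNER numbers:

```python
def attention_cost(seq_len, hidden=768):
    # score and value matmuls in self-attention scale as O(n^2 * d)
    return 2 * seq_len * seq_len * hidden

def ffn_cost(seq_len, hidden=768, ffn=3072):
    # the feed-forward block scales linearly in n
    return 2 * seq_len * hidden * ffn

# attention's share of compute grows with sequence length
for n in (64, 124, 512):
    print(n, round(attention_cost(n) / ffn_cost(n), 2))
```

Past a few hundred tokens the quadratic term dominates, so runtime-level gains from ONNX matter less than the attention cost itself.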

> Thanks @Ingvarstep. I was going through [GLiNER.cpp](https://github.com/Knowledgator/GLiNER.cpp) and could not find license details. Is it Apache 2.0 or MIT licensed?

It's Apache 2.0.

Hi @mikeg27, post-quantization performance is largely determined by the model’s training regime. Models not trained with quantization-aware methods often suffer from activation/weight distribution shifts once precision is reduced. Since none...
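A small NumPy simulation of why distribution shifts hurt post-training quantization: with symmetric per-tensor int8 quantization, a single outlier stretches the scale and degrades resolution for every other weight. This is a generic sketch of the mechanism, not GLiNER's quantization code:

```python
import numpy as np

def quantize_dequantize_int8(w):
    """Symmetric per-tensor int8 round trip: scale is set by the
    largest absolute value, so outliers coarsen the grid for everyone."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127)
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=10_000).astype(np.float32)
err_clean = np.abs(w - quantize_dequantize_int8(w)).mean()

w_outlier = w.copy()
w_outlier[0] = 2.0  # one large outlier, as distribution shift can produce
err_outlier = np.abs(w_outlier - quantize_dequantize_int8(w_outlier)).mean()

print(err_outlier > err_clean)  # the outlier inflates error for all weights
```

Quantization-aware training keeps these distributions friendly to the int8 grid, which is why models trained without it often degrade.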

Hello @Ani2019, we improved the ONNX conversion experience and fixed some bugs; feel free to follow this tutorial to convert the model and then use it: https://github.com/urchade/GLiNER/blob/main/examples/convert_to_onnx.ipynb