yishusong
Thank you very much for the replies! I'll try it out shortly. Re: @Marwen-Bhj's comment about GPU... I haven't looked into the source code yet, but is it possible...
Thanks! On CPU there is joblib, so there is room for more speedup.
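For reference, a minimal sketch of parallelizing CPU inference over chunks of texts. This uses the standard library's `concurrent.futures` as a stand-in for joblib, and `predict_chunk` is a hypothetical placeholder for the actual GLiNER call:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

def chunked(items, size):
    """Yield successive chunks of `size` items from a list."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk

def predict_chunk(texts):
    # Hypothetical stand-in: in practice this would be something like
    # model.batch_predict_entities(texts, labels) on a GLiNER model.
    return [text.upper() for text in texts]

def parallel_predict(texts, chunk_size=32, workers=4):
    """Run predict_chunk over chunks of texts in parallel threads."""
    chunks = list(chunked(texts, chunk_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(predict_chunk, chunks)
    # Flatten the per-chunk results back into one list.
    return [pred for chunk in results for pred in chunk]
```

Threads (rather than processes) avoid re-loading the model per worker, and PyTorch releases the GIL during tensor ops, so this can still overlap work on CPU.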
Thanks a lot! This indeed speeds up inference a lot. However, `model.to('cuda')` seems to only utilize 1 GPU. I looked it up online; `nn.DataParallel(model)` won't extend to GLiNER's `batch_predict`...
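Since `nn.DataParallel` doesn't wrap GLiNER's batch prediction, one workaround is plain data parallelism by hand: load one model copy per GPU and shard the input texts across them. A sketch under that assumption (only `shard_texts` is shown runnable; the GLiNER-specific usage, including the model name, is hypothetical and left in comments):

```python
def shard_texts(texts, n_devices):
    """Round-robin split of texts into one shard per device."""
    shards = [[] for _ in range(n_devices)]
    for i, text in enumerate(texts):
        shards[i % n_devices].append(text)
    return shards

# Hypothetical multi-GPU usage, assuming 2 GPUs:
# from gliner import GLiNER
# models = [GLiNER.from_pretrained("urchade/gliner_base").to(f"cuda:{i}")
#           for i in range(2)]
# shards = shard_texts(texts, n_devices=2)
# Each shard then runs on its own model copy, e.g. one thread per GPU
# calling models[i].batch_predict_entities(shards[i], labels),
# and the per-shard results are merged afterwards.
```

This trades DataParallel's automatic scatter/gather for explicit control, which sidesteps the wrapping problem entirely.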
I don't think Inferentia works, because it only supports a very limited list of HF models. Also, it might not be compatible with CUDA, so there might be other dependency...
I tried CPU, which turned out to be too slow.