
ArcFace ONNX input and output have a fixed batch size of 1

Open MiqdaadIndori opened this issue 1 year ago • 4 comments

Hello, first of all, thank you for your valuable contribution. I have an issue: I am trying to find an ArcFace model that supports dynamic batching. I was excited when I saw your ONNX file, which was described as supporting batch inference.

https://github.com/SthPhoenix/InsightFace-REST?tab=readme-ov-file#recognition

You can see it here.

However, when I download the model and open it in Netron, I see that the batch size is 1. Could you help me with this? I have been trying for some time to find an ArcFace ONNX model that supports dynamic batching.

Thank you.

MiqdaadIndori avatar Feb 15 '25 14:02 MiqdaadIndori

@SthPhoenix , hello! Same question here

MVoloshin avatar May 09 '25 23:05 MVoloshin

Hi! I can't recall the exact reason, something to do with backward compatibility, I guess, but the models do have a fixed batch size. However, during TRT engine building the models are reshaped to the desired dimensions, including batch size. You can check reshape_onnx.py for details.

SthPhoenix avatar May 09 '25 23:05 SthPhoenix

> Hi! I can't recall the exact reason, something to do with backward compatibility, I guess, but the models do have a fixed batch size. However, during TRT engine building the models are reshaped to the desired dimensions, including batch size. You can check reshape_onnx.py for details.

Thanks for the link. P.S. Do you know if it's even possible to make ArcFace produce N 512-element output vectors in ONNX? Currently I get 15 FPS for a single face and 7 FPS for 6 faces on my GeForce 1660, as I'm forced to execute the ArcFace model sequentially multiple times.

MVoloshin avatar May 09 '25 23:05 MVoloshin

> Thanks for the link. P.S. Do you know if it's even possible to make ArcFace produce N 512-element output vectors in ONNX? Currently I get 15 FPS for a single face and 7 FPS for 6 faces on my GeForce 1660, as I'm forced to execute the ArcFace model sequentially multiple times.

I don't fully understand your question, but it sounds like you are asking about batch inference, which is totally possible.

SthPhoenix avatar May 09 '25 23:05 SthPhoenix