FlagEmbedding
Deploying a reranker model on Triton Inference Server
Hey, does it make sense to deploy the reranking model on Triton Inference Server for efficiency? Or are there other recommendations for optimizing reranking inference?
Has anybody looked into this?