llm-engine
Comparison benchmarks?
Hi, thanks for open-sourcing the code.
I was wondering how this compares, in terms of throughput, with existing inference frameworks such as https://github.com/huggingface/text-generation-inference and https://github.com/vllm-project/vllm. Are there any benchmarks?
Thanks for the request — we will be sure to add some benchmarks. cc @yixu34
Under the hood, the inference serving component is handled by HF Text Generation Inference, so the inference throughput should be similar or equivalent to that library's.
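In the meantime, a rough throughput comparison is easy to run yourself. The sketch below is a minimal, framework-agnostic harness: it times a batch of prompts through any `generate(prompt) -> token_count` callable and reports tokens per second. The `fake_generate` stub is a placeholder — in practice you would replace it with a real client call to llm-engine, TGI, or vLLM (none of those client APIs are shown here).

```python
import time

def measure_throughput(generate, prompts):
    """Time generate() over all prompts; return (total_tokens, tokens_per_second).

    `generate` is any callable taking a prompt string and returning the
    number of tokens it produced.
    """
    start = time.perf_counter()
    total_tokens = sum(generate(p) for p in prompts)
    elapsed = time.perf_counter() - start
    return total_tokens, total_tokens / elapsed

# Stub standing in for a real inference client (hypothetical numbers):
def fake_generate(prompt):
    time.sleep(0.001)  # simulate network + decode latency
    return 32          # pretend 32 tokens were generated

tokens, tps = measure_throughput(fake_generate, ["hello"] * 10)
print(f"{tokens} tokens at {tps:.0f} tok/s")
```

Running the same harness with identical prompts, batch sizes, and generation parameters against each backend gives an apples-to-apples number, though serving-level features (continuous batching, paged attention) only show up under concurrent load, so a single-threaded loop like this understates the gap between frameworks.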