language-model-arithmetic
Inference acceleration
Excellent work. Is it possible to use a common inference acceleration framework, such as vLLM or LMDeploy, in the model loading section?