[Feature] Support Llama 3.2 family of models
Motivation
Motivation for supporting Llama 3.2 in the lmdeploy inference engine
Llama 3.2's release presents a strong case for expanding lmdeploy with dedicated support:
- Multi-modal Capabilities: Llama 3.2 introduces vision LLMs (11B & 90B) that enable image understanding and reasoning alongside text. LMDeploy integration would unlock applications such as document analysis, image captioning, and visual grounding.
- Edge & Mobile Deployment: The lightweight 1B & 3B models are tailored for on-device use cases (summarization, instruction following, rewriting). LMDeploy could enable fast, private, offline AI on resource-constrained hardware.
- Enhanced Performance: The pruning and distillation techniques used to build Llama 3.2 improve efficiency, so LMDeploy users would benefit from faster inference and reduced resource consumption with these models.
- Open Innovation: Supporting Llama 3.2 aligns with LMDeploy's open-source ethos, fostering a wider ecosystem and accelerating AI application development.
Related resources
- Llama 3.2: Revolutionizing edge AI and vision with open, customizable models
- Llama 3.2 Huggingface
- Download Llama Models
Additional context
No response
Sure. @AllentDan is working on it.
@AllentDan Is there any progress? Thanks ♪(・ω・)ノ
Please upgrade to v0.6.2.
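For anyone landing here later, a minimal sketch of trying a Llama 3.2 model through lmdeploy's `pipeline` API after upgrading. The model IDs, prompts, and image URL below are illustrative assumptions, not part of this issue; whether a given checkpoint is supported depends on the installed lmdeploy version.

```python
# Upgrade first (assumed command):
#   pip install -U "lmdeploy>=0.6.2"

from lmdeploy import pipeline
from lmdeploy.vl import load_image

# Text-only model (1B instruct checkpoint used here as an example model ID).
pipe = pipeline('meta-llama/Llama-3.2-1B-Instruct')
print(pipe(['Summarize the benefits of on-device inference in two sentences.']))

# Vision model (11B vision-instruct checkpoint, likewise an example model ID).
# In practice you would usually run only one pipeline at a time to fit in GPU memory.
vl_pipe = pipeline('meta-llama/Llama-3.2-11B-Vision-Instruct')
image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
print(vl_pipe(('Describe this image.', image)))
```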