
Issue: Update vLLM to version 0.5.0+, and a few suggestions

Open nerdylive123 opened this issue 1 year ago • 13 comments

Description

  1. 🌟 Upgrade vLLM: We need to rocket vLLM to version 0.5.0 or beyond! 🚀
  2. 🤖 Tensorize Awesomeness: The tensorizer feature is like giving vLLM a turbo boost. 🏎️ Check out the Tensorize vLLM example for a sneak peek.
    • 🚀 It lets us load the model while it downloads (but remember, the model needs to be converted first).
  3. 📦 Pip It Up: Why build vLLM from source when we can summon it with a pip package? Efficiency, my friend! 🧙‍♂️
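A minimal sketch of suggestion 3, assuming the worker image installs its Python dependencies at build time (the version pin below is just an example, not a recommendation):

```shell
# Hypothetical build step: install vLLM from PyPI instead of compiling it
# from source. Pinning an exact version keeps the worker reproducible,
# and bumping the pin is all it takes to pick up newly supported models.
pip install "vllm==0.5.0"

# Sanity-check which version actually landed in the image.
python -c "import vllm; print(vllm.__version__)"
```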

Kudos to the stellar maintainer! 🌟🙌

nerdylive123 avatar Jul 03 '24 04:07 nerdylive123

+1! I really would like to run Phi3VForCausalLM


FrederikHandberg avatar Jul 04 '24 12:07 FrederikHandberg

+1!

Sapessii avatar Jul 05 '24 13:07 Sapessii

+1, Gemma 2 support has been recently rolled out in vLLM!

shivanker avatar Jul 08 '24 23:07 shivanker

+1, it would make much more sense to `pip install vllm`, so that when a new model is released and implemented in vLLM it is automatically supported by this worker @alpayariyak

avacaondata avatar Jul 09 '24 08:07 avacaondata

Are there any plans to upgrade the vLLM version, and if so, can you provide a date?

d4rk6un avatar Jul 22 '24 00:07 d4rk6un

+1, then we could finally run DeepSeek-Coder v2

PhoenixSmaug avatar Jul 22 '24 12:07 PhoenixSmaug

+1

Llama 3.1 needs 0.5.3 https://github.com/vllm-project/vllm/releases/tag/v0.5.3

Can we upgrade this worker to support this out of the box in RunPod serverless vLLM?

harshal-pr avatar Jul 25 '24 18:07 harshal-pr

Also waiting for the update :) Let me know if I can help!

Lhemamou avatar Jul 26 '24 21:07 Lhemamou

Hi all, thank you so much for the suggestions! I've joined a different company, so @pandyamarut will be taking over. It's been a great pleasure serving you all!

alpayariyak avatar Jul 26 '24 22:07 alpayariyak

I wish you an amazing next work experience ;) welcome aboard @pandyamarut !

Lhemamou avatar Jul 26 '24 22:07 Lhemamou

Working on it, sorry for the delay. Thanks for maintaining the repo @alpayariyak

pandyamarut avatar Jul 26 '24 22:07 pandyamarut

Guys, do we know anything about the approximate time frame for the update? That way we can plan model updates on our roadmap. Thanks

TheAlexPG avatar Aug 05 '24 07:08 TheAlexPG

Please support the new fp8 quantization; refer to the vLLM docs.

I've got a whole new menu with a bunch of new options. I guess it's all of the arguments now, which is great, thank you for the update, staff and maintainers! Just the option values need to be updated :)
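If the worker does grow an fp8 option, a hedged sketch of how it might be configured, assuming worker-vllm's usual environment-variable style (the variable names and the model value here are placeholders, not confirmed settings):

```shell
# Hypothetical endpoint configuration: select the model and request
# fp8 quantization via environment variables, following the pattern
# worker-vllm uses for its other engine arguments.
export MODEL_NAME="your-org/your-fp8-model"
export QUANTIZATION="fp8"
```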

nerdylive123 avatar Aug 08 '24 06:08 nerdylive123