feat: Avoid downloading the same model twice when only the backend differs

Open matankley opened this issue 2 years ago • 0 comments

Feature request

Currently, when we run "openllm build --backend pt", the build process downloads the model and builds a bento. However, if we then run "openllm build --backend vllm", the build process downloads the model again, even though it is exactly the same model.

I suggest detecting these cases and skipping the second download of the model.
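To illustrate the idea, here is a rough sketch, not OpenLLM's actual code. It assumes both backends can load the same weight files, and every name in it (`cache_key`, `ensure_model`, the `.complete` marker, the `download` callback) is hypothetical. The point is only that the cache key is derived from the model id and revision, not from the backend:

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cache_key(model_id: str, revision: str = "main") -> str:
    # Hypothetical key: the backend is deliberately NOT part of it,
    # so "pt" and "vllm" builds resolve to the same directory.
    return hashlib.sha256(f"{model_id}@{revision}".encode()).hexdigest()[:16]


def ensure_model(
    model_id: str,
    backend: str,
    cache_root: Path,
    download: Callable[[str, Path], None],
    revision: str = "main",
) -> Path:
    # `backend` is accepted for the caller's convenience but ignored here.
    target = Path(cache_root) / cache_key(model_id, revision)
    marker = target / ".complete"  # written only after a successful download
    if not marker.exists():
        target.mkdir(parents=True, exist_ok=True)
        download(model_id, target)
        marker.touch()
    return target


# Demo with a fake downloader that only counts how often it is called.
calls = []
root = Path(tempfile.mkdtemp())
fake_download = lambda model_id, dest: calls.append(model_id)

pt_path = ensure_model("some-org/some-model", "pt", root, fake_download)
vllm_path = ensure_model("some-org/some-model", "vllm", root, fake_download)
print(len(calls), pt_path == vllm_path)
```

If the two backends turn out to need different file formats, the key would have to include the format (not the backend name) so that only genuinely different artifacts are fetched twice.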

Motivation

Avoid redundant model downloads, which can be large and consume significant network bandwidth.

Other

No response

Dec 09 '23 16:12 matankley