Model-aware scheduling
What would you like to be added:
Right now, model management is a tricky problem in the cluster. Models are large, so we need to cache them on the node just like images. However, the kubelet manages the lifecycle of images but not of files, and this gap is unlikely to be tackled in the near future. So we may need to manage models manually and make the scheduler aware of them when it makes pod placement decisions.
Why is this needed:
Efficient scheduling of pods that depend on models: the scheduler can prefer nodes where the required models are already cached.
Completion requirements:
This enhancement requires the following artifacts:
- [x] Design doc
- [ ] API change
- [x] Docs update
The artifacts should be linked in subsequent comments.
/kind feature
I see that the scheduler's built-in plugins currently include VolumeBinding, VolumeRestrictions, and VolumeZone. Is there a scoring plugin similar to ImageLocality (e.g. a VolumeLocality plugin) that covers this scenario?
Yes, basically the idea is that models are located on different nodes, and the scheduler should be aware of which node is the best candidate.
However, right now I'm developing a P2P-based model distribution project. You can think of it as a lightweight Dragonfly that works mostly for model weights. See https://github.com/InftyAI/Manta. Once this is finished, model-aware scheduling is not that urgent, because models can be transferred across nodes, but it is still applicable.
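To make the "best candidate node" idea concrete, here is a minimal sketch of a locality score, loosely in the spirit of the ImageLocality plugin. It is not a scheduler plugin and does not use the scheduler framework API. The `Node` type, the `cached_models` map, the function names, and the model names and sizes are all hypothetical, and the scoring formula (fraction of required model size already cached, scaled to 0..100) is an assumption rather than an agreed design.

```python
from dataclasses import dataclass, field

# kube-scheduler node scores range from 0 to 100; the sketch reuses that range.
MAX_NODE_SCORE = 100


@dataclass
class Node:
    name: str
    # Hypothetical bookkeeping: model name -> size in GiB cached on this node.
    cached_models: dict = field(default_factory=dict)


def model_locality_score(node, required):
    """Score a node by the share of required model size it already caches."""
    total = sum(required.values())
    if total == 0:
        return 0  # the pod needs no models, so locality gives no preference
    cached = sum(size for name, size in required.items()
                 if name in node.cached_models)
    return MAX_NODE_SCORE * cached // total


def best_node(nodes, required):
    """Return the highest-scoring node; ties go to the earliest node."""
    return max(nodes, key=lambda n: model_locality_score(n, required))


# Illustrative data only: model names and sizes are made up.
nodes = [
    Node("node-a", {"model-a": 8}),
    Node("node-b", {"model-a": 8, "model-b": 2}),
    Node("node-c"),
]
required = {"model-a": 8, "model-b": 2}

for n in nodes:
    print(n.name, model_locality_score(n, required))
print("best:", best_node(nodes, required).name)
```

In a real plugin this score would be one input among several, and a P2P distribution layer would lower the cost of a cache miss rather than remove the preference for a node that already holds the model.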