Deprecate `.inference.preset.accessMode` in Workspace CRD
Describe the bug
With the introduction of model weight download at runtime (#982), `accessMode` has become more confusing. A preset may need to download a "gated" HuggingFace model while using a public kaito-base image as the base model. There are valid arguments for classifying such a preset as either public or private. In addition, `.inference.preset.accessMode` duplicates the access mode already baked into each `model.go`.
Let’s deprecate the CRD field (without removing it from the API) and rely on the access mode in `model.go` as the source of truth for the validating webhook.
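A rough sketch of how the webhook side could look is below. It is only an illustration under stated assumptions: `AccessMode`, `PresetSpec`, `presetAccessModes`, and `resolveAccessMode` are hypothetical names invented for this example, not Kaito's actual types or functions, and the map stands in for whatever each `model.go` registers.

```go
package main

import "fmt"

// AccessMode is a hypothetical stand-in for the public/private classification.
type AccessMode string

const (
	AccessModePublic  AccessMode = "public"
	AccessModePrivate AccessMode = "private"
)

// presetAccessModes is an assumed registry representing the access mode
// that each model declares in its own model.go. The entries are made up.
var presetAccessModes = map[string]AccessMode{
	"example-public-model": AccessModePublic,
	"example-gated-model":  AccessModePrivate,
}

// PresetSpec is a simplified, hypothetical view of .inference.preset.
type PresetSpec struct {
	Name string
	// AccessMode is the deprecated CRD field. It stays in the API
	// but no longer influences validation.
	AccessMode AccessMode
}

// resolveAccessMode returns the access mode registered for the preset and
// ignores the user-supplied field. If the deprecated field is set, it
// returns a warning instead of rejecting the object.
func resolveAccessMode(p PresetSpec) (AccessMode, []string, error) {
	mode, ok := presetAccessModes[p.Name]
	if !ok {
		return "", nil, fmt.Errorf("unknown preset %q", p.Name)
	}
	var warnings []string
	if p.AccessMode != "" {
		warnings = append(warnings,
			"inference.preset.accessMode is deprecated and ignored")
	}
	return mode, warnings, nil
}

func main() {
	// The user marks a gated model as public; the registry value wins.
	spec := PresetSpec{Name: "example-gated-model", AccessMode: AccessModePublic}
	mode, warnings, err := resolveAccessMode(spec)
	fmt.Println(mode, warnings, err)
}
```

Returning a warning rather than an error keeps existing manifests that still set the field valid, which matches the goal of deprecating without removing it from the API.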
The field is deprecated by https://github.com/kaito-project/kaito/pull/1091, but we still need to remove its behavior from the codebase.
link #1100