Deprecate `.inference.preset.accessMode` in Workspace CRD

Open chewong opened this issue 1 year ago • 1 comments

Describe the bug

With the introduction of model weight download at runtime (#982), `accessMode` has become more confusing. For example, a preset may need to download a "gated" HuggingFace model while using a public kaito-base image as its base. There are valid arguments for classifying such a preset as either public or private. Additionally, `.inference.preset.accessMode` duplicates the access mode already baked into each model.go.

Let’s deprecate the CRD field (without removing it from the API) and rely on the access mode in model.go as the source of truth for the validating webhook.
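
As a rough illustration of the proposal, the sketch below keeps a deprecated `AccessMode` field on the spec but has validation consult a per-preset registry instead. Everything here is hypothetical and not taken from the Kaito codebase: the `PresetMeta` type, the `presetAccessModes` map (standing in for whatever each model.go declares), the preset names, and the example rule that a private preset requires a user-supplied image are all assumptions made for the sake of the example.

```go
package main

import "fmt"

// ModelAccessMode mirrors the public/private distinction discussed in this issue.
type ModelAccessMode string

const (
	ModelAccessModePublic  ModelAccessMode = "public"
	ModelAccessModePrivate ModelAccessMode = "private"
)

// PresetMeta is a hypothetical stand-in for the preset portion of the Workspace spec.
type PresetMeta struct {
	Name string
	// Deprecated: AccessMode stays in the API for compatibility,
	// but validation below ignores it.
	AccessMode ModelAccessMode
}

// presetAccessModes is a hypothetical registry standing in for the
// access mode that each model.go declares for its preset.
var presetAccessModes = map[string]ModelAccessMode{
	"example-public-model":  ModelAccessModePublic,
	"example-private-model": ModelAccessModePrivate,
}

// resolveAccessMode returns the registry's access mode for the preset.
// The deprecated spec field is never read.
func resolveAccessMode(p PresetMeta) (ModelAccessMode, error) {
	mode, ok := presetAccessModes[p.Name]
	if !ok {
		return "", fmt.Errorf("unsupported preset %q", p.Name)
	}
	return mode, nil
}

// validatePreset applies an illustrative webhook-style rule (an assumption):
// a private preset must come with a user-supplied image.
func validatePreset(p PresetMeta, image string) error {
	mode, err := resolveAccessMode(p)
	if err != nil {
		return err
	}
	if mode == ModelAccessModePrivate && image == "" {
		return fmt.Errorf("preset %q is private and requires a user-supplied image", p.Name)
	}
	return nil
}

func main() {
	// The spec claims "public", but the registry says "private"; the registry wins.
	p := PresetMeta{Name: "example-private-model", AccessMode: ModelAccessModePublic}
	mode, _ := resolveAccessMode(p)
	fmt.Println("effective access mode:", mode)
	fmt.Println("validation error:", validatePreset(p, ""))
}
```

With this shape, a stale or contradictory value in the spec cannot change the validation outcome, which is the point of treating the per-model definition as the single source of truth.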

chewong avatar Apr 24 '25 17:04 chewong

:bulb: Auto-generated documentation-based answer:

Kaito is an operator that automates AI/ML model inference and tuning workloads in a Kubernetes cluster. It supports popular open-source large models such as Falcon and Phi-3. Key features include managing large model files using container images, providing preset configurations, supporting inference runtimes like vLLM and transformers, auto-provisioning GPU nodes, and hosting large model images in the public Microsoft Container Registry (MCR) if the license allows.

Relevant Sources:

  • https://github.com/kaito-project/kaito/blob/main/README.md

kaito-pr-agent[bot] avatar Apr 24 '25 17:04 kaito-pr-agent[bot]

The field is deprecated by https://github.com/kaito-project/kaito/pull/1091, but we still need to remove its behavior from the codebase.

chewong avatar May 05 '25 16:05 chewong

link #1100

zhuangqh avatar May 07 '25 04:05 zhuangqh