TimWang

Results 13 issues of TimWang

### 1. Quick Debug Information

* OS/Version(Garden Linux 934.11):
* Kernel Version: 5.15.135-gardenlinux-amd64
* Container Runtime Type/Version(e.g. Containerd, CRI-O, Docker): containerd:/1.6.20
* K8s Flavor/Version(e.g. K8s, OCP, Rancher, GKE, EKS): K8s/v1.26.11...

This PR refactors the `UnmarshalJSON` method for `ReplicatedDevices` to improve code clarity and structure. Key changes include:

- **Modularization**: Created separate handlers (`handleStringInput`, `handleNumericInput`, `handleListInput`) for different input types, enhancing...

The code changes in `deployment.yaml` add the GPU node selector to the scheduler deployment. This change allows the scheduler to select nodes with the `gpu` label set to `"on"`. The...

### 1. Issue or feature description

#### Description

In our cluster, we have both VM nodes and BM (Bare Metal) nodes. However, only the BM nodes have GPUs. Therefore, I...

Tags in Git serve as a means to designate significant moments in your repository's history. They are commonly employed to indicate release versions, such as v1.0, v2.0, and so on....

### 📚 The doc issue

### Context:

I am currently managing H100 GPUs using Kubernetes (K8s). However, I’ve noticed that the vLLM documentation only provides deployment instructions for Docker, which...

documentation

### Description

This Pull Request introduces detailed documentation on how to deploy vLLM with Kubernetes. The new documentation is designed to help users efficiently manage and scale their machine learning...

ready

**What type of PR is this?**

During an offline debugging session with @archlitchi, we identified that the current NVIDIA device plugin (v1.4.0) is causing compatibility issues with nanoGPT, preventing...

kind/bug
dco-signoff: yes
do-not-merge/work-in-progress
size/XL

### 1. Issue or feature description

An issue has been identified when trying to run https://github.com/karpathy/nanoGPT with the HAMi framework; it's currently unsuccessful. However, when the same code is...

## Pull Request Description

This pull request updates the `samples/quickstart/model.yaml` file to enhance resource allocation, namespace organization, and deployment configuration for the `deepseek-r1-distill-llama-8b` model. Key changes include migrating to a...