NVIDIA device plugin for Kubernetes

Results 354 k8s-device-plugin issues

_The template below is mostly useful for bug reports and support questions. Feel free to remove anything which doesn't apply to you and add more information where it makes sense._...

lifecycle/stale

### 1. Issue or feature description After changing the k8s container runtime from docker to containerd, executing `nvidia-smi` in a k8s GPU pod returns an error: `Failed to...

lifecycle/stale

### 1. Issue or feature description Hi, I work at Microsoft and we are getting ready to go live with the A10 VMs (https://docs.microsoft.com/en-us/azure/virtual-machines/nva10v5-series). Ahead of this go-live, I am...

I am using a Jetson Xavier NX. I rewrote config.toml as follows and restarted containerd. However, when I run describe on the GPU pod, it shows a warning. The pod is...

lifecycle/stale

It looks like this repo isn't too `go install` friendly -- any way we can resolve that? I was able to remove workarounds and easily go-install nvidia-container-toolkit (and drop nvidia-container-runtime! 😌)...

It would be useful to allow containers/pods to share GPUs (similar to a shared workstation) when desired. --- I have a fork of this device plugin that implements the above...

### 1. Issue or feature description nvdp deployed via helm chart v0.12.2 with gfd enabled, no other changes to values.yaml. Running on RHEL 8 with selinux enabled. No nvidia.com/xxx labels...

### 1. Issue or feature description * I am trying to install JupyterHub on a bare metal machine using microk8s with GPU support on Ubuntu 22.04 LTS * I can...

### 1. Issue or feature description I use a GPU pod to run pytorch processes with the device plugin, and occasionally hit a problem that shows "CUDA unknown error". But after...

lifecycle/stale

### 1. Issue or feature description Is it possible to assign one GPU card to multiple pods? As far as I know, GPU sharing among multiple pods is not easy....