Kevin Klues

365 comments by Kevin Klues

Just to be clear, `vGPU` is a very different technology than `MIG`. It seems you are trying to configure `MIG` here and not `vGPUs`. In order to create GIs and...

I'm sorry, I don't quite understand your question. Can you rephrase it please?

> should i kill the process of nvidia-fabricmanager which use gpu (lsof /dev/nvidia* can find) when i set the mig of A100 GPU

(Almost) all clients of the GPU driver...

> I have annother question.

I didn't see an explicit question, but I'm guessing you are confused why tensorflow isn't using both MIG devices you passed it? At present, all...

The recommended method is to use a toleration or a node selector to *only* deploy the plugin on nodes that actually have GPUs on them. If you really want...

@dpressel how were you requesting the GPUs before, if not via:

```
resources:
  limits:
    nvidia.com/gpu: 1
```

If you don't set this at all, and your job lands on a...

This is very strange behaviour. The underlying code to detect and enumerate the various MIG devices is shared between gpu-feature-discovery and the k8s-device-plugin. We’ve also not had similar reports to...

If libnvidia-container doesn’t think you want access to MIG devices, it won’t inject these proc files (or the dev nodes they point to). When it comes to the plugin, this...

Sorry. I just meant exec into the container and show the value of the `NVIDIA_VISIBLE_DEVICES` environment variable.
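
As a rough sketch of that check: once you have a shell inside the container (for example via `kubectl exec -it <pod> -- sh`, where `<pod>` is a placeholder for your own pod name), something like the following would print the value. The helper name `show_visible_devices` is made up for illustration and is not part of any NVIDIA tooling.

```shell
# Run inside the container.
# Hypothetical helper: reports NVIDIA_VISIBLE_DEVICES, distinguishing
# "not set at all" from "set but empty".
show_visible_devices() {
  if [ -z "${NVIDIA_VISIBLE_DEVICES+x}" ]; then
    echo "NVIDIA_VISIBLE_DEVICES is not set"
  else
    echo "NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES}"
  fi
}

show_visible_devices
```

Telling "unset" apart from "empty" can help when diagnosing the problem, since the two cases may point to different causes.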