TimWang
Today, I had an offline debug session with @archlitchi. Despite setting `CUDA_DISABLE_CONTROL` to true and removing `ld.so.preload` from the GPU node, the issue persisted. We suspect that this is...
Confirmed that the issue mentioned also occurs in version **0.14.0** of [k8s-device-plugin](https://github.com/NVIDIA/k8s-device-plugin/releases/tag/v0.14.0). Thus, we should update k8s-device-plugin to at least version **0.14.5**.
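For anyone who wants to script the version check, here is a minimal sketch, assuming the plugin version is available as a plain tag string such as `v0.14.0`. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of k8s-device-plugin or HAMi.

```python
# Hypothetical helpers for illustration only; not part of any real project API.
# Handles plain numeric tags like "v0.14.5"; pre-release suffixes are not supported.

def parse_version(tag: str) -> tuple[int, ...]:
    """Turn a tag such as 'v0.14.5' into the tuple (0, 14, 5)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


# Minimum version suggested in the comment above.
MIN_PLUGIN_VERSION = parse_version("v0.14.5")


def meets_minimum(tag: str) -> bool:
    """Return True if the given tag is at least the suggested minimum."""
    return parse_version(tag) >= MIN_PLUGIN_VERSION
```

Comparing tuples of integers avoids the string-comparison trap where `"0.9.0"` would sort after `"0.14.5"`.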
@wawa0210 @archlitchi
There will be an extra env set in the `vgpu-scheduler-extender` if we enable the `vmBmMixMode`:

```
- name: vgpu-scheduler-extender
  image: projecthami/hami:v2.3.12
  imagePullPolicy: "IfNotPresent"
  env:
    - name: NODE_SELECTOR_GPU
      value: "on"
```
Fix the bug in this pull request: https://github.com/Project-HAMi/HAMi/pull/354, as it is causing the scheduler to restart.
Replaced by a new PR: https://github.com/Project-HAMi/HAMi/pull/855
Upon reviewing, I identified a bug in the previous submission that resulted in the failure of two existing unit test functions. To address this, I've removed the switch statement, which...