The program starts up fine, but when I run the test container and execute nvidia-smi inside it, I get the following error:
/tmp/cuda-control/src/loader.c:1139 can't get config file path


I have the same problem as you. Have you solved it?
My problem has been solved. This open-source project has many bugs, and I have fixed a lot of them. I can provide you with the repaired image, but after you test it, I need your help writing up a test report.
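Sure. Did you fork the project, or would you just give me the image to use? The former would be best. I have also been looking into user-space GPU virtualization solutions for k8s recently. Would you have time for me to ask you a few questions?
@khw934 I had this same problem and have solved it too. I hit the error on Ubuntu 22.04 and then switched to CentOS 7.9. My first impression is that it is a CUDA version incompatibility, but I don't yet know this part of the plugin well enough to change it, so for now I solved it by switching the operating system version.
Could you share your contact details so we can learn from each other?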
@khw934 How did you solve it in the end?
/tmp/cuda-control/src/loader.c:1139 can't get config file path
@khw934 Has this problem been solved? Could you explain what causes it and how you fixed it? Thanks.
Solved.
Had the same problem after migrating to Ubuntu 22.04, which uses cgroup v2 by default. The workaround was to edit /etc/default/grub so that it includes GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0", then run update-grub and reboot, as explained at https://docs.docker.com/config/containers/runmetrics/
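
To check whether a node is affected before touching the GRUB settings, it may help to confirm which cgroup version the host is running. The Docker page linked above notes that a cgroup.controllers file under /sys/fs/cgroup indicates cgroup v2. A minimal sketch of that check follows; the function name `cgroup_version` and its `root` parameter are made up for illustration and are not part of gpu-manager.

```python
import os


def cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 if `root` looks like a cgroup v2 unified hierarchy, else 1.

    Hypothetical helper for illustration only. It relies on the documented
    convention that the cgroup v2 root exposes a cgroup.controllers file.
    """
    marker = os.path.join(root, "cgroup.controllers")
    return 2 if os.path.exists(marker) else 1


if __name__ == "__main__":
    # Run on the host (not inside the container) to see what the node uses.
    print(f"cgroup v{cgroup_version()} detected")
```

If this reports v2 on a host where the error appears, the cgroup v1 workaround described above is a reasonable thing to try. It does not prove that cgroup v2 is the cause in every setup.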