GPU memory 'max_gpu_wrk_mem' seems to exceed the capacity of the actual GPU type in the GPU'20 trace?

Open matthewygf opened this issue 2 years ago • 0 comments

For example, in the GPU 2020 trace, the job 'e5d6d5b546bff61f93b47ebf' has a max_gpu_wrk_mem of '44.289062', but its GPU type is V100, whose memory capacity should be 16 GB, or at most 32 GB.

Should I assume that whenever max_gpu_wrk_mem > GPU_type_capacity, the worker hit an OOM?
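To make the comparison concrete, here is a minimal sketch of the check I have in mind. It assumes max_gpu_wrk_mem is reported in GB and that each record carries a GPU type next to it. The field names job_name and gpu_type, the capacity table, and the second sample row are my own placeholders, not something taken from the trace schema. A flagged row only means the value is above the card's nominal capacity; it does not by itself show that an OOM happened.

```python
# Nominal memory capacities in GB per GPU type (assumption: V100 ships as 16 GB or 32 GB).
GPU_CAPACITY_GB = {"V100": (16.0, 32.0)}


def exceeds_capacity(rows, capacities=GPU_CAPACITY_GB):
    """Return the rows whose max_gpu_wrk_mem is above the largest known capacity.

    Each row is a dict with the hypothetical keys 'job_name', 'gpu_type'
    and 'max_gpu_wrk_mem' (assumed to be in GB).
    """
    flagged = []
    for row in rows:
        caps = capacities.get(row["gpu_type"])
        if caps is None:
            # Unknown GPU type: nothing to compare against, so skip it.
            continue
        if float(row["max_gpu_wrk_mem"]) > max(caps):
            flagged.append(row)
    return flagged


rows = [
    # Values quoted in this issue.
    {"job_name": "e5d6d5b546bff61f93b47ebf", "gpu_type": "V100",
     "max_gpu_wrk_mem": "44.289062"},
    # Made-up row for contrast, not from the trace.
    {"job_name": "hypothetical_job", "gpu_type": "V100",
     "max_gpu_wrk_mem": "12.5"},
]

flagged = exceeds_capacity(rows)
print([row["job_name"] for row in flagged])
```

Comparing against the largest capacity (32 GB) is the conservative choice here, since the trace entry in question only says V100 and not which variant.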

Feb 09 '24 16:02 matthewygf