Felix Kuehling
The ROCm driver supports multiple concurrent processes on the GPU natively. We don't need MPS for that. This should also work for multiple containers. The processes end up sharing GPU...
I don't think you need to do anything special. Just start multiple containers. They will share the GPU in the same way as multiple processes in the same container.
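As a rough illustration of "just start multiple containers", the sketch below builds one `docker run` command line per container, each passing the same GPU device nodes through. The image name `example/rocm-app:latest`, the container names, and the `rocminfo` workload are placeholders chosen for this example, not something taken from the thread; the `--device` flags assume the usual `/dev/kfd` and `/dev/dri` nodes. The script only prints the commands and does not run them.

```python
import shlex

# Device nodes a ROCm container typically needs (assumption for this sketch).
GPU_DEVICES = ["/dev/kfd", "/dev/dri"]


def docker_run_command(name, image, workload):
    """Build a `docker run` argv that passes the GPU device nodes through."""
    cmd = ["docker", "run", "--rm", "--name", name]
    cmd += [f"--device={dev}" for dev in GPU_DEVICES]
    cmd.append(image)
    cmd += list(workload)
    return cmd


if __name__ == "__main__":
    # Two containers with the same device flags; nothing else differs
    # except the (hypothetical) container name.
    for name in ("gpu-job-1", "gpu-job-2"):
        cmd = docker_run_command(name, "example/rocm-app:latest", ["rocminfo"])
        print(shlex.join(cmd))
```

Both command lines are identical apart from the container name, which matches the point above: no per-container GPU setup is involved in this sketch.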
> > I don't think you need to do anything special. Just start multiple containers. They will share the GPU in the same way as multiple processes in the same...
Are /dev/kfd and /dev/dri/renderD* visible inside docker with the right permissions?
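One way to answer that question from inside the container is a small script like the following. It is only a sketch: the helper name `check_gpu_device_nodes` is made up for this example, and it assumes that read and write access for the current user is what "the right permissions" means here.

```python
import glob
import os


def check_gpu_device_nodes(paths):
    """Return {path: status}, where status is 'ok', 'missing', or 'no access'."""
    report = {}
    for path in paths:
        if not os.path.exists(path):
            report[path] = "missing"
        elif os.access(path, os.R_OK | os.W_OK):
            # The current user can open the node for reading and writing.
            report[path] = "ok"
        else:
            report[path] = "no access"
    return report


if __name__ == "__main__":
    # /dev/kfd plus every render node currently present under /dev/dri.
    nodes = ["/dev/kfd"] + sorted(glob.glob("/dev/dri/renderD*"))
    for path, status in check_gpu_device_nodes(nodes).items():
        print(f"{path}: {status}")
```

Run it as the same user that runs the GPU workload. A `missing` entry suggests the device was not passed into the container, while `no access` points at ownership or group membership.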
@dannysemi On bare metal, udevd may have a rule to add the local console user to the access control lists of /dev/kfd and /dev/dri/renderD*, so you don't need to mess...
I don't know of any precedent of using OpenCL in kernel mode. It requires significant user mode SW that would not be practical to run in kernel mode. I won't...
@Lucretia, the kernel oops is happening on a code path specific to Hawaii GPUs. It doesn't get any regular testing here. I can try to reproduce it locally. This problem...
@Lucretia, this quick patch should fix your kernel oops: https://lists.freedesktop.org/archives/amd-gfx/2019-September/039702.html
> @fxamd is that on top of the [other two](https://github.com/justxi/rocm/issues/72#issuecomment-526848585) I applied to stop the original crash, or in place of?

In place of. It looks like one of the...
You need the patch I pointed to yesterday. It's not submitted to any branch yet. It only exists as an email code review. I'm attaching it for your convenience. [0001-drm-amdgpu-Fix-KFD-related-kernel-oops-on-Hawaii.patch.txt](https://github.com/RadeonOpenCompute/ROCR-Runtime/files/3584551/0001-drm-amdgpu-Fix-KFD-related-kernel-oops-on-Hawaii.patch.txt)