Add sample RQD Dockerfile with CUDA base image
Link the Issue(s) this Pull Request is related to.
Related to PR https://github.com/AcademySoftwareFoundation/OpenCue/pull/1309. As discussed in the mail thread "Blender plugin development".
Summarize your change.
Adds a sample RQD Dockerfile with a CUDA-enabled base image for GPU rendering. Tested with a new derived RQD Blender image.
Works in combination with the Nvidia Container Toolkit.
General thought here -- the Dockerfile here seems to duplicate much of the standard RQD image, though I understand it's using a different base image.
It's possible to have a Dockerfile which selectively copies files from another image using COPY --from. I wonder if this CUDA Dockerfile could do that -- start with the CUDA base image, copy the RQD tarball from the base RQD image, then do anything else that's CUDA specific.
This would help keep this image in sync with the base RQD image. Otherwise we will make changes to the base RQD image and forget to update this one.
Could you give that a try?
I'm with Brian on the idea that it would be better if we could copy the rqd tarball from the base image
Noted @bcipriano @DiegoTavares, that's a lot more efficient.
Just to clarify, would this be the tarball generated in the /opt/opencue directory of the image?
> ...then do anything else that's CUDA specific.
Actually, there are no other changes done on the Dockerfile other than the inclusion of the base CUDA image.
Everything else is dependent on the Nvidia driver and Container Toolkit installation on the host machine beforehand.
But now that you mentioned it, I'll include the nvidia-smi command at end of this Dockerfile to verify correct installation of all CUDA components.
Additionally, should add some documentation about this but not quite sure where. Perhaps as an amendment to the Deploying RQD page?
> Just to clarify, would this be the tarball generated in the /opt/opencue directory of the image?
Yeah, that's probably the best way as you'll just need to copy that single file, everything is self-contained. Will be stored as /opt/opencue/rqd-{version}-all.tar.gz I believe.
Once the file is copied you'll still need to do any steps needed to install and run RQD.
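A minimal sketch of the suggestion above, assuming the tarball follows the `rqd-{version}-all.tar.gz` naming mentioned here. The CUDA base tag and the `RQD_VERSION` build argument are illustrative assumptions, not taken from this PR:

```dockerfile
# Assumption: example CUDA base tag, not necessarily the one used in this PR.
FROM nvidia/cuda:12.2.0-base-ubuntu22.04

# Hypothetical build argument; supply it at build time with
#   docker build --build-arg RQD_VERSION=<version> .
ARG RQD_VERSION

# Pull only the self-contained tarball out of the standard RQD image.
COPY --from=opencue/rqd /opt/opencue/rqd-${RQD_VERSION}-all.tar.gz /opt/opencue/

# The RQD install and startup steps still have to follow here.
```

Because the tarball always comes from `opencue/rqd`, changes to the base RQD image reach this image on the next build. The trade-off in this variant is that the version has to be known outside the Dockerfile.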
> Additionally, should add some documentation about this but not quite sure where. Perhaps as an amendment to the Deploying RQD page?
Hmm, how about this -- we have Customizing RQD, you could add a section there called like "Sample Dockerfiles" which links to the samples/rqd/ directory in the repo.
You could also add the Customizing RQD page to the Deploying RQD "What's Next?" section. That seems like it would flow nicely.
@bcipriano, as suggested I implemented the `COPY --from` command for the tarball, as well as for the RQD config file and the proto directory, which were not included with the tarball extraction. Tested with CueBot and it seems to be connecting and working as expected.
> I'll include the nvidia-smi command at end of this Dockerfile

Turns out that the command only works when the container is attached to a GPU, as in the `docker run` command. Running it during the image build seemed unnecessary for this Dockerfile, and it doesn't seem to be possible anyway.
> we have Customizing RQD, you could add a section there called like "Sample Dockerfiles" which links to the samples/rqd/ directory in the repo. You could also add the Customizing RQD page to the Deploying RQD "What's Next?" section. That seems like it would flow nicely.
Noted. That sounds good, will get on it.
Also, a couple of things to clarify:
- Would there be a graceful way to resolve the version number used for the tarball name via a variable? That would let the Dockerfile pick up the tarball filename dynamically. https://github.com/AcademySoftwareFoundation/OpenCue/blob/e05eb27059ef36406442d32d790fc7b3cd6db135/samples/rqd/cuda/Dockerfile#L19
  I tried this out, but my current test implementation has trouble extracting the value from the `VERSION` file and assigning it to an ARG or ENV variable:
      COPY --from=opencue/rqd /opt/opencue/VERSION /opt/opencue/VERSION
      ENV VERSION=""
      RUN cat /opt/opencue/VERSION > $VERSION
      COPY --from=opencue/rqd /opt/opencue/rqd-${VERSION}-custom-all.tar.gz /opt/opencue/rqd-${VERSION}-custom-all.tar.gz
- Are the gRPC-related instructions like the one below required for the installation, since I'm also importing the proto directory from the `opencue/rqd` image? If it's redundant, I'll remove it. https://github.com/AcademySoftwareFoundation/OpenCue/blob/e05eb27059ef36406442d32d790fc7b3cd6db135/samples/rqd/cuda/Dockerfile#L29-L33
Sorry for taking ages to reply to this:
- Unfortunately COPY runs on the build environment, so it doesn't have access to variables set by RUN. I don't see a simple solution here, maybe try copying with a wildcard:
      COPY --from=opencue/rqd /opt/opencue/rqd-*-all.tar.gz /opt/opencue/
- If you're copying the tarball from the rqd image, you don't need to run the build steps.
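To make the two points above concrete, here is a hedged sketch. In the earlier attempt, `ENV VERSION=""` leaves the variable empty, so the shell redirect in `RUN` has no target and `${VERSION}` in the following `COPY` expands to nothing. The wildcard sidesteps that, and a shell variable inside a single `RUN` is still available if the version is needed while unpacking. The CUDA base tag is an assumption, and the `VERSION` file path is taken from that earlier attempt rather than verified:

```dockerfile
# Assumption: example CUDA base tag; substitute the one the sample Dockerfile uses.
FROM nvidia/cuda:12.2.0-base-ubuntu22.04

# The wildcard is matched against the opencue/rqd image's filesystem,
# so the version never has to be written in this file.
COPY --from=opencue/rqd /opt/opencue/VERSION /opt/opencue/VERSION
COPY --from=opencue/rqd /opt/opencue/rqd-*-all.tar.gz /opt/opencue/

# A shell variable only lives for the duration of this one RUN instruction,
# which is enough to report the version and unpack the tarball.
RUN cd /opt/opencue \
    && version="$(cat VERSION)" \
    && echo "Unpacking RQD ${version}" \
    && tar -xzf rqd-*-all.tar.gz \
    && rm rqd-*-all.tar.gz
```

This assumes exactly one tarball matches the pattern; with more than one match the `tar` step would fail.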
No worries @DiegoTavares 😄. Took me a while to get back to this also.
> Unfortunately COPY runs on the build environment, so it doesn't have access to variables set by RUN. I don't see a simple solution here, maybe try copying with a wildcard
Noted. This worked, thanks!
Resolved in https://github.com/AcademySoftwareFoundation/OpenCue/pull/1327/commits/85dae74749da0bde9ecf135b860a0974b8ec4b64
However there seems to be a linting issue in an unrelated service.py wrapper.
Ping @DiegoTavares