Please help me slim a GPU Docker image
such as nvcr.io/nvidia/pytorch:21.10-py3
Do you have an example PyTorch app, and how do you use it?
I have been trying this too. @kcq
This is my Dockerfile
FROM nvcr.io/nvidia/pytorch:22.08-py3
WORKDIR /
ADD run.py /
CMD [ "python", "run.py" ]
and this is run.py
import torch
from torchvision.models.resnet import resnet18
model = resnet18()
model.eval().to("cuda:0").half()
x = torch.rand(1, 3, 224, 224).to("cuda:0").half()
_ = model(x)
I ran these commands to make the slim image
docker build -t pytorch_fat:1.0 .
docker-slim build --http-probe=false --cro-runtime=nvidia pytorch_fat:1.0
I can see the two images
pytorch_fat.slim latest e29f67494610 2 minutes ago 2.9GB
pytorch_fat 1.0 8f7954c308c3 10 minutes ago 14.6GB
But if I run the slim image, I get this error because .so files were removed:
nvidia-docker run --rm -it pytorch_fat.slim:latest
Error processing line 1 of /opt/conda/lib/python3.8/site-packages/google_auth-2.9.1-py3.10-nspkg.pth:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site.py", line 169, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "<frozen importlib._bootstrap>", line 553, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'
Remainder of file ignored
Error processing line 1 of /opt/conda/lib/python3.8/site-packages/matplotlib-3.5.2-py3.8-nspkg.pth:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site.py", line 169, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "<frozen importlib._bootstrap>", line 553, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'
Remainder of file ignored
Error processing line 1 of /opt/conda/lib/python3.8/site-packages/protobuf-3.20.1-py3.8-nspkg.pth:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site.py", line 169, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "<frozen importlib._bootstrap>", line 553, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'
Remainder of file ignored
Error processing line 1 of /opt/conda/lib/python3.8/site-packages/sphinxcontrib_applehelp-1.0.2-py3.8-nspkg.pth:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site.py", line 169, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "<frozen importlib._bootstrap>", line 553, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'
Remainder of file ignored
Error processing line 1 of /opt/conda/lib/python3.8/site-packages/sphinxcontrib_devhelp-1.0.2-py3.8-nspkg.pth:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site.py", line 169, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "<frozen importlib._bootstrap>", line 553, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'
Remainder of file ignored
Error processing line 1 of /opt/conda/lib/python3.8/site-packages/sphinxcontrib_htmlhelp-2.0.0-py3.9-nspkg.pth:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site.py", line 169, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "<frozen importlib._bootstrap>", line 553, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'
Remainder of file ignored
Error processing line 1 of /opt/conda/lib/python3.8/site-packages/sphinxcontrib_jsmath-1.0.1-py3.7-nspkg.pth:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site.py", line 169, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "<frozen importlib._bootstrap>", line 553, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'
Remainder of file ignored
Error processing line 1 of /opt/conda/lib/python3.8/site-packages/sphinxcontrib_qthelp-1.0.3-py3.8-nspkg.pth:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site.py", line 169, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "<frozen importlib._bootstrap>", line 553, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'
Remainder of file ignored
Error processing line 1 of /opt/conda/lib/python3.8/site-packages/sphinxcontrib_serializinghtml-1.1.5-py3.9-nspkg.pth:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site.py", line 169, in addpackage
exec(line)
File "<string>", line 1, in <module>
File "<frozen importlib._bootstrap>", line 553, in module_from_spec
AttributeError: 'NoneType' object has no attribute 'loader'
Remainder of file ignored
Traceback (most recent call last):
File "run.py", line 1, in <module>
import torch
File "/opt/conda/lib/python3.8/site-packages/torch/__init__.py", line 201, in <module>
_load_global_deps()
File "/opt/conda/lib/python3.8/site-packages/torch/__init__.py", line 154, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/opt/conda/lib/python3.8/ctypes/__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libopen-rte.so.40: cannot open shared object file: No such file or directory
Can slim be used for applications that use PyTorch?
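One way to narrow down which shared libraries went missing is to try loading them the same way torch does in the traceback above, via ctypes.CDLL. This is only a rough sketch: find_unloadable_libs is a made-up helper name, and the only library name taken from this thread is libopen-rte.so.40; any other names you pass in are your own guesses.

```python
import ctypes


def find_unloadable_libs(lib_names):
    """Return (name, error message) pairs for libraries that dlopen cannot load."""
    failures = []
    for name in lib_names:
        try:
            # Same call that fails inside torch/__init__.py in the traceback.
            ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
        except OSError as exc:
            failures.append((name, str(exc)))
    return failures


if __name__ == "__main__":
    # Only libopen-rte.so.40 comes from the error above; extend this list as needed.
    candidates = ["libopen-rte.so.40"]
    for name, error in find_unloadable_libs(candidates):
        print(f"{name}: {error}")
```

Running the script in both pytorch_fat:1.0 and pytorch_fat.slim:latest and comparing the output should show which libraries fail only in the slim image. Those files could then be kept explicitly, for example with docker-slim's --include-path option (please check docker-slim build --help for the exact flags in your version).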
Thank you for sharing your Dockerfile and app info, @ganessh22. It's super helpful for the repro!
So, do we need GPU in CI runner servers? haha
I haven't had enough cycles to investigate yet, and I don't have a local machine with an NVIDIA GPU. I'll try to repro it on AWS.