The Docker image is 5+ GB

Open bzinberg opened this issue 5 years ago • 2 comments

Partly due to having both TensorFlow and (once #45 is merged) PyTorch installs. Makes it pretty cumbersome to install without a strong internet connection. Not sure what we can do about this.

bzinberg, Mar 24 '20 14:03

Afaict, there's not much you can do about it. The current image does not delete the apt package lists after package installation; you can add rm -rf /var/lib/apt/lists/* to do so (in the same RUN line as the install, otherwise the files stay in the earlier layer). You can also delete the Julia archive after you have downloaded and unpacked it, i.e. rm julia-1.3.1-linux-x86_64.tar.gz. Also, each time you add another RUN line you create another layer in the layered file system, so you might want to combine a few lines, e.g. for the Julia installation in https://github.com/probcomp/gen-quickstart/blob/master/Dockerfile#L20. Finally, it seems unnecessary to use virtualenv inside a container, which is already well compartmentalized.
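To make the suggestions above concrete, here is a rough sketch of what the combined RUN lines could look like. The base image, the package list, and the download URL are my assumptions and are not copied from the gen-quickstart Dockerfile; only the two rm commands and the archive name come from the comment above.

```dockerfile
# Sketch only: base image, package list, and download URL are assumptions,
# not taken from the gen-quickstart Dockerfile.
FROM ubuntu:18.04

# Install packages and remove the apt lists in the same RUN line, so the
# lists never end up in a committed layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates wget python3 \
    && rm -rf /var/lib/apt/lists/*

# Download, unpack, and delete the Julia archive in a single layer.
RUN wget -q https://julialang-s3.julialang.org/bin/linux/x64/1.3/julia-1.3.1-linux-x86_64.tar.gz \
    && tar -xzf julia-1.3.1-linux-x86_64.tar.gz -C /opt \
    && rm julia-1.3.1-linux-x86_64.tar.gz \
    && ln -s /opt/julia-1.3.1/bin/julia /usr/local/bin/julia
```

The cleanup has to sit in the same RUN as the step that created the files. An rm in a later RUN only hides them in a new layer, and the image stays the same size.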

But the short answer is that none of these methods will have a tremendous impact. I pull a few tricks like this in https://github.com/probcomp/gen-quickstart/blob/master/Dockerfile.ubuntu-2004; it's not super optimized, but it does not make the image much smaller. And it would get even worse if you added GPU acceleration with nvidia/cuda-based images (expect 6-10GB). The methods that would really shrink this further, such as building in one image and then copying only the binaries into the production image, or using a smaller base image like Alpine, are not ideal for a developer image. So I can push some optimizations if you want, but if you wanted to shrink this to 1GB or so, I'm not too optimistic. We can try a few things, but the low-hanging fruit probably won't suffice to significantly reduce its size.
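For reference, the "build in one image, push only binaries" approach is what Docker calls a multi-stage build. A minimal sketch follows; the image tags, the paths, and the make command are all hypothetical and only illustrate the shape of it.

```dockerfile
# Sketch only: image tags, paths, and the build command are hypothetical.

# Stage 1: full toolchain, used only to produce build artifacts.
FROM ubuntu:20.04 AS builder
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*
COPY . /src
# Hypothetical build step that installs its output under /opt/app.
RUN make -C /src install PREFIX=/opt/app

# Stage 2: the image that actually gets pushed. Nothing from the builder
# stage is carried over except what COPY --from names explicitly.
FROM ubuntu:20.04
COPY --from=builder /opt/app /opt/app
```

The final image ships without the compiler and build tools, which is exactly why it is a poor fit for a developer image where people expect those tools to be available.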

PS:

  • You might not need the git dependency, since you copy the sources in
  • Not sure what python-tk is needed for. Isn't Tcl/Tk just relevant for GUIs? => Might be able to reduce dependencies a bit.

fplk, Mar 24 '20 15:03

Yeah, I figured as much. Thanks for shedding some light on this, @fplk. Given that there is no known low-hanging fruit and the current situation is tolerable, I think we should leave it as-is and keep the issue open.

bzinberg, Mar 24 '20 17:03