Target manylinux_2_24 for wheels

Open mattip opened this issue 4 years ago • 11 comments

The new manylinux_2_24 standard, based on Debian 9 and glibc 2.24, could enable #520 and solve pytorch/pytorch#51039. It would require a new Dockerfile, changing all the yum installation calls to apt, and changing the PATCHELF parts of the build.

mattip avatar Mar 01 '21 07:03 mattip

Copying selected comments from pypa/manylinux#1012, which discusses the fact that manylinux_2_24 is based on Debian 9, which does not ship newer gcc versions rebuilt against the libstdc++ needed for manylinux_2_24 compliance:


The GCC version in manylinux_2_24 is a dismal 6.3, much worse than both the year based manylinuxes and not even new enough for C++17 language support (mostly added in GCC 7)! This is a huge step backward for an image with a higher GLIBC version.

Once you install gcc-9, it will update at least libstdc++ which is a no go. RHEL dev toolset took care of that and also other libraries like libgcc_s so that binaries produced with devtoolset would still be compatible with the base image with no action.


Bottom line: in order to use the manylinux_2_24 standard for wheels (which would enable using CXX11_ABI), someone needs to put in the effort to build a gcc-9 or gcc-10 for manylinux_2_24.

mattip avatar May 31 '21 13:05 mattip

To add to this:

  • currently we still have manylinux1 wheels - that means glibc 2.5 (2006)
  • upgrading to manylinux2014 is much easier - that would be glibc 2.17 (2012)
  • manylinux_2_24 gives glibc 2.24 (2016)
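
The mapping in the list above can be checked against a running system. This is a minimal sketch, assuming a glibc-based Linux host: the `POLICY_GLIBC` table and the `compatible_policies` and `detect_glibc` helpers are illustrative names, not part of any packaging tool, and the manylinux2010 entry (glibc 2.12) is added here for completeness.

```python
import platform

# Minimum glibc version targeted by each manylinux policy.
# Insertion order runs from oldest to newest policy.
POLICY_GLIBC = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
    "manylinux_2_24": (2, 24),
}


def compatible_policies(glibc_version):
    """Return the policies whose glibc requirement is met by glibc_version."""
    return [
        name
        for name, required in POLICY_GLIBC.items()
        if glibc_version >= required
    ]


def detect_glibc():
    """Best-effort glibc detection; returns None on non-glibc systems."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return None
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


if __name__ == "__main__":
    found = detect_glibc()
    print("glibc:", found)
    if found is not None:
        print("compatible policies:", compatible_policies(found))
```

A system with glibc 2.17 (for example CentOS 7) would accept everything up to manylinux2014 but not manylinux_2_24, which is the trade-off being discussed here.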

@mattip you said 2.5 -> 2.17 was much less interesting than 2.17 -> 2.24. Can you explain why that is? Newer is better, but getting rid of manylinux1 would also be interesting I'd think?

rgommers avatar Jun 01 '21 14:06 rgommers

CentOS7, which is the base distro for manylinux2014, is the reason pytorch cannot use CXX11_ABI. Moving to manylinux2014 is easy and makes sense since the older distros are EOL, but I am not sure what the tangible benefits are for pytorch.

For a nice summary of glibc versions vs. distros, along with EOL dates, there is https://github.com/mayeut/pep600_compliance

mattip avatar Jun 01 '21 14:06 mattip

CentOS 7 is old but is not EOL. It will be supported until June 30th, 2024. It just will not get new features or support for new hardware, which is not a problem for build pipelines. Usually we run it in Docker, so it doesn't matter whether its kernel can support the latest CPU. And as release managers, we should always target the oldest distro we can.

But if the PyTorch team is willing to add a new set of packages for manylinux_2_24, that would be absolutely great.

snnn avatar Jun 02 '21 04:06 snnn

@malfet's feedback was: let's upgrade to manylinux2014 for now - CentOS 7 is still around, so let's do that first. manylinux1 to manylinux2014 upgrade is valuable.

rgommers avatar Jun 04 '21 20:06 rgommers

I have one additional comment: if pytorch decides to build manylinux2014 wheels, then you need to switch GCC from 9.x to 8.x for as long as you still need to support CUDA 10.2.

snnn avatar Jun 04 '21 20:06 snnn

let's upgrade to manylinux2014 for now

@malfet: is that "stop creating linux86_64 and create only manylinux2014 instead" or "add manylinux2014 wheels in addition to the linux86_64 wheels"

mattip avatar Jun 08 '21 15:06 mattip

let's upgrade to manylinux2014 for now

@malfet: is that "stop creating linux86_64 and create only manylinux2014 instead" or "add manylinux2014 wheels in addition to the linux86_64 wheels"

From looking at build_all_docker.sh, one would think that manylinux2014 is now the default, but the other image still appears to be built, possibly because of a variable name mismatch (MANYLINUX_VERSION vs. MANY_LINUX_VERSION), per update 2 in this other issue
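
To illustrate how such a naming mismatch can go unnoticed: a script that reads one spelling silently falls back to its default when the caller exports the other. The sketch below is hypothetical and is not the builder's actual code; which of the two names the real scripts read, and what the real default is, are not confirmed here. `resolve_manylinux_version`, `EXPECTED`, and `MISSPELLED` are made-up names.

```python
import os

# Assumption for illustration only: the script reads EXPECTED,
# while the caller exports MISSPELLED by mistake.
EXPECTED = "MANYLINUX_VERSION"
MISSPELLED = "MANY_LINUX_VERSION"


def resolve_manylinux_version(env, default=""):
    """Return (value, warning) for the version a script would end up using.

    `env` is any mapping of environment variables, e.g. os.environ.
    `warning` is None unless only the misspelled name is set.
    """
    value = env.get(EXPECTED, default)
    warning = None
    if EXPECTED not in env and MISSPELLED in env:
        warning = (
            f"{MISSPELLED} is set but only {EXPECTED} is read; "
            f"falling back to default {default!r}"
        )
    return value, warning


if __name__ == "__main__":
    value, warning = resolve_manylinux_version(os.environ)
    print("version used:", repr(value))
    if warning:
        print("warning:", warning)
```

A check like this makes the fallback visible instead of quietly building the older image.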

qhaas avatar Mar 26 '22 17:03 qhaas

Yes, that does look like a typo. Are the wheels being built? xref #979

mattip avatar Mar 26 '22 19:03 mattip

Yes, that does look like a typo. Are the wheels being built? xref #979

Honestly, I'm having trouble building the wheels inside either container 'the right way' because I'm not sure which scripts Meta uses to build them inside the container. The documentation only covers how to build the Docker container image.

Running manywheel/build.sh results in an error complaining the gcc version in the original Docker image is too old.

Running said script with the 2014 docker image results in an openssl so file not being found.

Inside the container, I ran the following command from the folder where pytorch builder is checked out, where /mnt/pytorch-1.11.0-cuda-11.3-src is the path to the pytorch source checkout:

    GPU_ARCH_TYPE=cuda OVERRIDE_TORCH_CUDA_ARCH_LIST=3.5 DESIRED_CUDA=11.3 DESIRED_PYTHON=3.9 PYTORCH_ROOT=/mnt/pytorch-1.11.0-cuda-11.3-src manywheel/build.sh

qhaas avatar Mar 27 '22 14:03 qhaas

@malfet: is that "stop creating linux86_64 and create only manylinux2014 instead" or "add manylinux2014 wheels in addition to the linux86_64 wheels"

This was definitely "drop manylinux1, upgrade existing jobs to manylinux2014".
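
One way to confirm which kind of wheel a job actually produced is to read the platform tag from the wheel filename, which follows the pattern name-version(-build)-python-abi-platform.whl. This is a rough sketch: `platform_tags` is a made-up helper, and the filenames in the usage section are illustrative rather than real build outputs.

```python
def platform_tags(wheel_filename):
    """Return the platform tags encoded in a wheel filename.

    The platform field is the last dash-separated component and may hold
    several dot-separated tags.
    """
    suffix = ".whl"
    if not wheel_filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {wheel_filename!r}")
    parts = wheel_filename[: -len(suffix)].split("-")
    # 5 components without a build tag, 6 with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {wheel_filename!r}")
    return parts[-1].split(".")


if __name__ == "__main__":
    # Illustrative filenames only.
    for name in (
        "example-1.0-cp39-cp39-manylinux2014_x86_64.whl",
        "example-1.0-cp39-cp39-linux_x86_64.whl",
    ):
        print(name, "->", platform_tags(name))
```

A wheel tagged only linux_x86_64 makes no manylinux compatibility claim, so checking for a manylinux2014 tag shows whether the upgraded job took effect.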

rgommers avatar Mar 28 '22 07:03 rgommers