Error building the CUDA version of ISCE2

Open gjustin40 opened this issue 2 years ago • 1 comment

I am currently building a Sentinel-1 stack with stackSentinel.py. Processing was taking too long on the CPU alone, so I'm trying to install the CUDA version.

Here's what I've done so far:

  • Successfully installed ISCE2 using the CPU version of the Docker image and have been using it without any issues.
  • When I tried to install the CUDA version using a Docker container, I encountered the following error:
~/github/isce2$ docker build --rm --force-rm -t hysds/isce2:latest-cuda -f docker/Dockerfile.cuda .
[+] Building 29.7s (9/15)
 => [internal] load .dockerignore                                                                                                                                                                           0.0s
 => => transferring context: 2B                                                                                                                                                                             0.0s
 => [internal] load build definition from Dockerfile.cuda                                                                                                                                                   0.0s
 => => transferring dockerfile: 3.26kB                                                                                                                                                                      0.0s
 => [internal] load metadata for docker.io/hysds/cuda-pge-base:latest                                                                                                                                       1.6s
 => [internal] load metadata for docker.io/hysds/cuda-dev:latest                                                                                                                                            1.6s
 => CACHED [stage-1 1/3] FROM docker.io/hysds/cuda-pge-base:latest@sha256:03177fd8b55e184f9b43a1aec5091ebab326c86cff4ebbcec3e06035ade0bc86                                                                  0.0s
 => [internal] load build context                                                                                                                                                                           0.2s
 => => transferring context: 1.84MB                                                                                                                                                                         0.2s
 => [stage-0 1/7] FROM docker.io/hysds/cuda-dev:latest@sha256:b3594240801f3f125ef2b17591a1431af36fdaa68a1afae96577c74c0b7858b9                                                                              0.0s
 => CACHED [stage-0 2/7] RUN set -ex  && yum update -y  && yum groupinstall -y "development tools"  && yum install -y       make ruby-devel rpm-build rubygems  && gem install ffi -v 1.12.2  && gem insta  0.0s
 => ERROR [stage-0 3/7] RUN set -ex  && . /opt/conda/bin/activate root  && conda install --yes       cython       gdal       git       h5py       libgdal       pytest       numpy       fftw       scipy  28.1s
------
 > [stage-0 3/7] RUN set -ex  && . /opt/conda/bin/activate root  && conda install --yes       cython       gdal       git       h5py       libgdal       pytest       numpy       fftw       scipy       scons       hdf4       hdf5       libgcc       libstdcxx-ng       cmake  && yum install -y uuid-devel x11-devel motif-devel jq     opencv opencv-devel opencv-python  && ln -sf /opt/conda/bin/cython /opt/conda/bin/cython3  && mkdir -p /opt/isce2/src:
#0 0.530 + . /opt/conda/bin/activate root
#0 0.530 ++ _CONDA_ROOT=/opt/conda
#0 0.530 ++ . /opt/conda/etc/profile.d/conda.sh
#0 0.530 +++ export CONDA_EXE=/opt/conda/bin/conda
#0 0.530 +++ CONDA_EXE=/opt/conda/bin/conda
#0 0.530 +++ export _CE_M=
#0 0.530 +++ _CE_M=
#0 0.530 +++ export _CE_CONDA=
#0 0.530 +++ _CE_CONDA=
#0 0.530 +++ export CONDA_PYTHON_EXE=/opt/conda/bin/python
#0 0.530 +++ CONDA_PYTHON_EXE=/opt/conda/bin/python
#0 0.530 +++ '[' -z '' ']'
#0 0.530 +++ export CONDA_SHLVL=0
#0 0.530 +++ CONDA_SHLVL=0
#0 0.530 +++ '[' -n '' ']'
#0 0.531 +++++ dirname /opt/conda/bin/conda
#0 0.536 ++++ dirname /opt/conda/bin
#0 0.540 +++ PATH=/opt/conda/condabin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#0 0.540 +++ export PATH
#0 0.540 +++ '[' -z '' ']'
#0 0.540 +++ PS1=
#0 0.540 ++ conda activate root
#0 0.540 ++ local cmd=activate
#0 0.540 ++ case "$cmd" in
#0 0.540 ++ __conda_activate activate root
#0 0.540 ++ '[' -n '' ']'
#0 0.540 ++ local ask_conda
#0 0.541 +++ PS1=
#0 0.541 +++ __conda_exe shell.posix activate root
#0 0.541 +++ /opt/conda/bin/conda shell.posix activate root
#0 0.675 ++ ask_conda='PS1='\''(base) '\''
#0 0.675 export PATH='\''/opt/conda/bin:/opt/conda/condabin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'\''
#0 0.675 export CONDA_PREFIX='\''/opt/conda'\''
#0 0.675 export CONDA_SHLVL='\''1'\''
#0 0.675 export CONDA_DEFAULT_ENV='\''base'\''
#0 0.675 export CONDA_PROMPT_MODIFIER='\''(base) '\''
#0 0.675 export CONDA_EXE='\''/opt/conda/bin/conda'\''
#0 0.675 export _CE_M='\'''\''
#0 0.675 export _CE_CONDA='\'''\''
#0 0.675 export CONDA_PYTHON_EXE='\''/opt/conda/bin/python'\''
#0 0.675 . "/opt/conda/etc/conda/activate.d/proj4-activate.sh"'
#0 0.675 ++ eval 'PS1='\''(base) '\''
#0 0.675 export PATH='\''/opt/conda/bin:/opt/conda/condabin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'\''
#0 0.675 export CONDA_PREFIX='\''/opt/conda'\''
#0 0.675 export CONDA_SHLVL='\''1'\''
#0 0.675 export CONDA_DEFAULT_ENV='\''base'\''
#0 0.675 export CONDA_PROMPT_MODIFIER='\''(base) '\''
#0 0.675 export CONDA_EXE='\''/opt/conda/bin/conda'\''
#0 0.675 export _CE_M='\'''\''
#0 0.675 export _CE_CONDA='\'''\''
#0 0.675 export CONDA_PYTHON_EXE='\''/opt/conda/bin/python'\''
#0 0.675 . "/opt/conda/etc/conda/activate.d/proj4-activate.sh"'
#0 0.675 +++ PS1='(base) '
#0 0.675 +++ export PATH=/opt/conda/bin:/opt/conda/condabin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#0 0.675 +++ PATH=/opt/conda/bin:/opt/conda/condabin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#0 0.675 +++ export CONDA_PREFIX=/opt/conda
#0 0.675 +++ CONDA_PREFIX=/opt/conda
#0 0.675 +++ export CONDA_SHLVL=1
#0 0.675 +++ CONDA_SHLVL=1
#0 0.675 +++ export CONDA_DEFAULT_ENV=base
#0 0.675 +++ CONDA_DEFAULT_ENV=base
#0 0.675 +++ export 'CONDA_PROMPT_MODIFIER=(base) '
#0 0.675 +++ CONDA_PROMPT_MODIFIER='(base) '
#0 0.675 +++ export CONDA_EXE=/opt/conda/bin/conda
#0 0.675 +++ CONDA_EXE=/opt/conda/bin/conda
#0 0.675 +++ export _CE_M=
#0 0.675 +++ _CE_M=
#0 0.675 +++ export _CE_CONDA=
#0 0.675 +++ _CE_CONDA=
#0 0.675 +++ export CONDA_PYTHON_EXE=/opt/conda/bin/python
#0 0.675 +++ CONDA_PYTHON_EXE=/opt/conda/bin/python
#0 0.675 +++ . /opt/conda/etc/conda/activate.d/proj4-activate.sh
#0 0.676 ++++ '[' -n '' ']'
#0 0.676 ++++ '[' -d /opt/conda/share/proj ']'
#0 0.676 ++++ export PROJ_LIB=/opt/conda/share/proj
#0 0.676 ++++ PROJ_LIB=/opt/conda/share/proj
#0 0.676 ++++ '[' -f /opt/conda/share/proj/copyright_and_licenses.csv ']'
#0 0.676 ++++ export PROJ_NETWORK=ON
#0 0.676 ++++ PROJ_NETWORK=ON
#0 0.676 ++ __conda_hashr
#0 0.676 ++ '[' -n '' ']'
#0 0.676 ++ '[' -n '' ']'
#0 0.676 ++ hash -r
#0 0.676 + conda install --yes cython gdal git h5py libgdal pytest numpy fftw scipy scons hdf4 hdf5 libgcc libstdcxx-ng cmake
#0 0.676 + local cmd=install
#0 0.676 + case "$cmd" in
#0 0.676 + __conda_exe install --yes cython gdal git h5py libgdal pytest numpy fftw scipy scons hdf4 hdf5 libgcc libstdcxx-ng cmake
#0 0.677 + /opt/conda/bin/conda install --yes cython gdal git h5py libgdal pytest numpy fftw scipy scons hdf4 hdf5 libgcc libstdcxx-ng cmake
#0 1.439 Collecting package metadata (current_repodata.json): ...working... done
#0 11.90 Solving environment: ...working... unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
#0 24.84 Solving environment: ...working... unsuccessful attempt using repodata from current_repodata.json, retrying with next repodata source.
#0 27.72
#0 27.72 ResolvePackageNotFound:
#0 27.72   - conda==22.11.1
#0 27.72
#0 28.00 + return
------
Dockerfile.cuda:22
--------------------
  21 |     # install isce requirements
  22 | >>> RUN set -ex \
  23 | >>>  && . /opt/conda/bin/activate root \
  24 | >>>  && conda install --yes \
  25 | >>>       cython \
  26 | >>>       gdal \
  27 | >>>       git \
  28 | >>>       h5py \
  29 | >>>       libgdal \
  30 | >>>       pytest \
  31 | >>>       numpy \
  32 | >>>       fftw \
  33 | >>>       scipy \
  34 | >>>       scons \
  35 | >>>       hdf4 \
  36 | >>>       hdf5 \
  37 | >>>       libgcc \
  38 | >>>       libstdcxx-ng \
  39 | >>>       cmake \
  40 | >>>  && yum install -y uuid-devel x11-devel motif-devel jq \
  41 | >>>     opencv opencv-devel opencv-python \
  42 | >>>  && ln -sf /opt/conda/bin/cython /opt/conda/bin/cython3 \
  43 | >>>  && mkdir -p /opt/isce2/src
  44 |
--------------------
ERROR: failed to solve: process "/bin/sh -c set -ex  && . /opt/conda/bin/activate root  && conda install --yes       cython       gdal       git       h5py       libgdal       pytest       numpy       fftw       scipy       scons       hdf4       hdf5       libgcc       libstdcxx-ng       cmake  && yum install -y uuid-devel x11-devel motif-devel jq     opencv opencv-devel opencv-python  && ln -sf /opt/conda/bin/cython /opt/conda/bin/cython3  && mkdir -p /opt/isce2/src" did not complete successfully: exit code: 1

My server's development environment is as follows:

  • Distributor ID: Ubuntu
  • Description: Ubuntu 20.04.1 LTS
  • Release: 20.04
  • Codename: focal
  • Architecture: x86_64
  • CPU op-mode(s): 32-bit, 64-bit
  • Address sizes: 46 bits physical, 48 bits virtual
  • Byte Order: Little Endian
  • CPU(s): 40
  • GPU: NVIDIA RTX A4000
  • NVIDIA-SMI 510.54
  • Driver Version: 510.54
  • CUDA Version: 11.6
  • Docker version: 24.0.2

How can I solve this? Thanks.

gjustin40 • Nov 01 '23 06:11

I'm afraid the docker images have been quite out of date for a while and aren't tested often, especially so for the CUDA version. Have you been able to build isce2 with CUDA support using cmake, outside of docker? You might be able to get farther that way.
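A rough sketch of what a CMake build outside Docker could look like, assuming the ISCE2 dependencies and the CUDA toolkit are already installed and that CMake is recent enough for `-S`/`-B` and `cmake --install` (3.15 or later). The checkout path mirrors the one in the log above; `PREFIX`, `BUILD_DIR`, `JOBS`, `DRY_RUN`, and the `run` and `build_isce2` helpers are illustrative names, not part of ISCE2. Whether the GPU components actually get built depends on the project's CMake configuration finding the toolkit, so the configure output is worth checking.

```shell
#!/bin/sh
# Sketch only: configure, build, and install ISCE2 with CMake outside Docker.
# All paths below are assumptions; adjust them to your setup.
SRC_DIR="${SRC_DIR:-$HOME/github/isce2}"    # checkout location, as in the log above
BUILD_DIR="${BUILD_DIR:-$SRC_DIR/build}"    # hypothetical out-of-source build directory
PREFIX="${PREFIX:-$HOME/isce2-install}"     # hypothetical install location

# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

build_isce2() {
    # Confirm that the CUDA compiler is on PATH before configuring.
    run nvcc --version || return 1
    # Configure with a standard CMake install prefix.
    run cmake -S "$SRC_DIR" -B "$BUILD_DIR" -DCMAKE_INSTALL_PREFIX="$PREFIX" || return 1
    # Build in parallel, then install into PREFIX.
    run cmake --build "$BUILD_DIR" --parallel "${JOBS:-4}" || return 1
    run cmake --install "$BUILD_DIR"
}
```

After sourcing the file, calling `build_isce2` with `DRY_RUN=1` set only prints the four commands, which is a cheap way to review the paths before running the real build.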

It looks like the error here was simply a failure to resolve all the dependencies needed. I find that opencv in particular is problematic for large environments - if you don't need any components that use opencv, you might be able to just remove that from the requirements and proceed.
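To see exactly which specs the solver could not find, the relevant lines can be pulled out of a saved build log. The sketch below assumes the log has the shape shown earlier in this thread (an optional BuildKit prefix such as `#0 27.72 `, then `ResolvePackageNotFound:` followed by `- <spec>` lines); `missing_specs` and `build.log` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: print the specs listed under "ResolvePackageNotFound:"
# in a conda / docker build log read from standard input.
missing_specs() {
    awk '
        # Drop a BuildKit prefix such as "#0 27.72 " if the line has one.
        { sub(/^#[0-9]+ [0-9.]+ ?/, "") }
        # The header line opens the list of unresolved specs.
        /^ResolvePackageNotFound:/ { in_block = 1; next }
        # Inside the list, each entry looks like "  - name==version".
        in_block && /^ *- / { sub(/^ *- */, ""); print; next }
        # Any other line ends the list.
        { in_block = 0 }
    '
}
```

With the log from this issue saved as `build.log`, running `missing_specs < build.log` prints `conda==22.11.1`, which is the only spec that log reports as unresolved.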

rtburns-jpl • Dec 29 '23 02:12