Problems building mxnet
For bugs or installation issues, please provide the following information. The more information you provide, the more likely people will be able to help you.
Environment info
Operating System: Ubuntu 16.04.3 LTS
Compiler: hipcc / hcc (clang 6, see version output below)
hipcc --version
HIP version: 1.4.17494
HCC clang version 6.0.0 (ssh://gerritgit/compute/ec/hcc-tot/clang 42ceed861a212d9bd0aef883ee7981144f3ecc02) (ssh://gerritgit/compute/ec/hcc-tot/llvm 23e086be6f627e6e983c6789d2e77da6bf85ebb6) (based on HCC 1.1.17493-2f85d8a-42ceed8-23e086b )
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm/hcc/bin
Package used (Python/R/Scala/Julia):
MXNet version:
Or if installed from source:
MXNet commit hash (git rev-parse HEAD):
d053ae86d5327ca36315b9a0646989678fff335d
If you are using python package, please provide
Python version and distribution:
If you are using R package, please provide
R sessionInfo():
Error Message:
Please paste the full error message, including stack trace.
The initial issue was that, with the latest rocm (1.7.60) installed from the repositories, there was a problem with rocBLAS and hcRNG was missing, so I built both from git. hcFFT was available as expected. At this point mxnet appears to compile, but multiple errors are reported. I have attached a build log from the second build attempt so it is less noisy.
I am also using cuda 9.1, but I did try cuda 8, which also failed. The environment variables in both cases were:
LD_LIBRARY_PATH=/usr/local/cuda/lib64 (symlinked to 8 or 9.1 depending on what is installed)
HIP_PLATFORM=hcc
The current git version of mxnet also does not need the Makefile modification presented, since it is already there.
Minimum reproducible example
if you are using your own code, please provide a short script that reproduces the error.
Steps to reproduce
or if you are running standard examples, please provide the commands you have run that lead to the error.
1. make -j $(nproc)
2.
3.
What have you tried to solve it?
The first stoppage in the log...
41 warnings and 2 errors generated.
Died at /opt/rocm/bin/hipcc line 500
...refers to a line in the hipcc script...
495 if ($runCmd) {
496     if ($HIP_PLATFORM eq "hcc" and exists($hipConfig{'HCC_VERSION'}) and $HCC_VERSION ne $hipConfig{'HCC_VERSION'}) {
497         print ("HIP ($HIP_PATH) was built using hcc $hipConfig{'HCC_VERSION'}, but you are using $HCC_HOME/hcc with version $HCC_VERSION from hipcc. Please rebuild HIP including cmake or update HCC_HOME variable.\n") ;
498         die unless $ENV{'HIP_IGNORE_HCC_VERSION'};
499     }
500     system ("$CMD") and die ();
501 }
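To make the control flow of that block explicit, here is a minimal Python sketch of how I read lines 496 to 500. The function name `run_compile` and its parameters are made up for illustration and are not part of hipcc; only the branching is meant to mirror the Perl above.

```python
import subprocess
import sys


def run_compile(cmd, built_with, in_use, ignore_mismatch=False, platform="hcc"):
    """Illustrative mirror of hipcc lines 496-500 (hypothetical helper, not hipcc code).

    cmd             -- argv list for the compiler invocation
    built_with      -- HCC version HIP was built with, or None if not recorded
    in_use          -- HCC version currently found under HCC_HOME
    ignore_mismatch -- stands in for the HIP_IGNORE_HCC_VERSION variable
    """
    # Lines 496-499: report a version mismatch and stop, unless the override is set.
    if platform == "hcc" and built_with is not None and built_with != in_use:
        print(f"HIP was built using hcc {built_with}, but hcc {in_use} is in use.",
              file=sys.stderr)
        if not ignore_mismatch:
            raise SystemExit("version mismatch (hipcc would die at line 498)")

    # Line 500: run the command; a non-zero exit status triggers the bare die().
    # The Perl passes one string through the shell; this sketch takes an argv list.
    status = subprocess.call(cmd)
    if status != 0:
        raise SystemExit("compile command failed (hipcc would die at line 500)")
    return status
```

On this reading, a bare "Died at ... line 500" is printed whenever the compiler command itself returns a non-zero status, while a version mismatch would stop at line 498 after printing the message from line 497.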
However, my HIP configuration appears to be good...
hipconfig
HIP version : 1.4.17494
== hipconfig
HIP_PATH : /opt/rocm
HIP_PLATFORM : hcc
CPP_CONFIG : -D__HIP_PLATFORM_HCC__= -I/opt/rocm/include -I/opt/rocm/hcc/include
== hcc
HSA_PATH : /opt/rocm/hsa
HCC_HOME : /opt/rocm/hcc
HCC clang version 6.0.0 (ssh://gerritgit/compute/ec/hcc-tot/clang 42ceed861a212d9bd0aef883ee7981144f3ecc02) (ssh://gerritgit/compute/ec/hcc-tot/llvm 23e086be6f627e6e983c6789d2e77da6bf85ebb6) (based on HCC 1.1.17493-2f85d8a-42ceed8-23e086b )
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm/hcc/bin
LLVM (http://llvm.org/):
  LLVM version 6.0.0svn
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: znver1
Registered Targets:
  amdgcn - AMD GCN GPUs
  r600 - AMD GPUs HD2XXX-HD6XXX
  x86 - 32-bit X86: Pentium-Pro and above
  x86-64 - 64-bit X86: EM64T and AMD64
HCC-cxxflags : -hc -std=c++amp -I/opt/rocm/hcc-1.0/include -I/opt/rocm/include
HCC-ldflags : -hc -std=c++amp -L/opt/rocm/hcc-1.0/lib -Wl,--rpath=/opt/rocm/hcc-1.0/lib -ldl -lm -lpthread -lunwind -lhc_am -Wl,--whole-archive -lmcwamp -Wl,--no-whole-archive
=== Environment Variables
PATH=/opt/rocm/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
LD_LIBRARY_PATH=/usr/local/cuda/lib64
HIP_PLATFORM=hcc
== Linux Kernel
Hostname :
~ ~ ~
I'm not sure what to try next. My guess is that there are some function differences between the mxnet code and the larger libraries it requires, but I don't know how to resolve that.
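One thing that may help narrow this down is pulling just the error lines out of the build log, since the "2 errors" are buried among the warnings. A minimal Python sketch, assuming clang-style diagnostics of the form `file:line:col: error: message`; the function name `extract_errors` and the log file name in the usage comment are placeholders, not anything from mxnet or hipcc.

```python
import re
from typing import Dict, Iterable, List

# Assumed clang-style diagnostic: path:line:col: error: message
# ("fatal error:" is accepted as well; warnings and notes are skipped).
ERROR_RE = re.compile(
    r"^(?P<file>[^:\s][^:]*):(?P<line>\d+):(?P<col>\d+): (?:fatal )?error: (?P<msg>.*)$"
)


def extract_errors(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Return one dict per error diagnostic found in the given log lines."""
    errors = []
    for raw in lines:
        match = ERROR_RE.match(raw.rstrip("\n"))
        if match:
            errors.append({
                "file": match["file"],
                "line": int(match["line"]),
                "col": int(match["col"]),
                "msg": match["msg"],
            })
    return errors


# Usage (the file name is hypothetical):
#   with open("build.log", errors="replace") as fh:
#       for err in extract_errors(fh):
#           print(f'{err["file"]}:{err["line"]}: {err["msg"]}')
```

Listing the files and messages this way should show which calls the compiler rejects, which is a more direct lead than the "Died at" line from hipcc.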