
Infinite recursion error when building cuda targets

Open kgreenek opened this issue 4 years ago • 2 comments

My machine is running Ubuntu 18.04 with an NVIDIA 2080 GPU.

After following the setup instructions and verifying that CUDA and clang work, I run the following command:

bazel build //tools:bef_executor

I get the following error:

ERROR: infinite symlink expansion detected
[start of symlink chain]
/usr/bin/X11
/usr/bin
[end of symlink chain]
INFO: Repository llvm-project instantiated at:
  /home/kevin/src/tensorflow/runtime/WORKSPACE:22:18: in <toplevel>
  /home/kevin/src/tensorflow/runtime/dependencies.bzl:38:9: in tfrt_dependencies
  /home/kevin/src/tensorflow/runtime/third_party/llvm/workspace.bzl:10:22: in repo
  /home/kevin/src/tensorflow/runtime/third_party/repo.bzl:114:23: in tfrt_http_archive
Repository rule _tfrt_http_archive defined at:
  /home/kevin/src/tensorflow/runtime/third_party/repo.bzl:65:37: in <toplevel>
ERROR: /home/kevin/.cache/bazel/_bazel_kevin/706c0853b511db4c8e87dca8def87940/external/rules_cuda/cuda/BUILD:128:20: every rule of type cuda_toolchain_info implicitly depends upon the target '@local_cuda//:cuda/bin/nvcc', but this target could not be found because of: no such package '@local_cuda//': Symlink issue while evaluating globs: Infinite symlink expansion: /usr/bin/X11- > /usr/bin

Bazel version: 4.0.0

I ran ls -l /usr/bin/X11, and I see that it points to /usr/bin. Hence the recursive symlink.

I also ran dpkg -S /usr/bin/X11, and I see that it is installed by the apt package x11-common.

I'm reluctant to just remove that symlink on my system: I assume it is there for a reason, and I'm afraid things will break.

It appears that something somewhere is globbing all the files under /usr/bin, but that causes infinite symlink recursion if x11-common is installed.
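To check which entries in a directory would trip up a recursive glob in this way, here is a minimal Python sketch. It is not part of Bazel or rules_cuda; the function name `find_ancestor_symlinks` is made up for illustration. It lists symlinks directly inside a directory that resolve to that directory or to one of its ancestors, which is the situation described above for /usr/bin/X11.

```python
import os


def find_ancestor_symlinks(root):
    """Return (link, target) pairs for symlinks directly inside `root`
    that resolve to `root` itself or to one of its ancestors.

    Hypothetical helper for illustration only; it inspects one directory
    level and does not reproduce Bazel's own glob logic.
    """
    root_real = os.path.realpath(root)
    loops = []
    with os.scandir(root_real) as entries:
        for entry in entries:
            if not entry.is_symlink():
                continue
            target = os.path.realpath(entry.path)
            # The link loops back if its target is root or a parent of root.
            if os.path.commonpath([root_real, target]) == target:
                loops.append((entry.path, target))
    return sorted(loops)
```

On a machine like the one described here, calling `find_ancestor_symlinks("/usr/bin")` would be expected to report the X11 link, though I have only reasoned about that from the ls -l output above.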

Output from clang -v:

Ubuntu clang version 11.1.0-++20210428103915+1fdec59bffc1-1~exp1~20210428204556.164
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/i686-linux-gnu/8
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/6.5.0
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/8
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found CUDA installation: /usr/local/cuda, version 10.2

I installed clang-11 from the llvm PPA.

kgreenek, May 12 '21 19:05

I suspect your CUDA toolkit at /usr/local/cuda might have been installed with a package manager. Can you try to use a toolkit extracted to a single directory, as explained here? https://llvm.org/docs/CompileCudaWithLLVM.html#prerequisites

chsigg, Jun 22 '21 19:06

@chsigg I am also encountering this problem. My CUDA toolkit at /usr/local/cuda was installed with the runfile, not a package manager.

$ dpkg -S /usr/local/cuda
dpkg-query: no path found matching pattern /usr/local/cuda
$ dpkg -S /usr/local/cuda-11.0/
dpkg-query: no path found matching pattern /usr/local/cuda-11.0/

sclarkson, Aug 17 '21 09:08