ImportError: cannot import name 'tutel_custom_kernel' from 'tutel.impls.jit_compiler'

Open zhaojiancheng007 opened this issue 3 years ago • 16 comments

zhaojiancheng007 avatar Mar 30 '23 08:03 zhaojiancheng007

This is usually an environment issue: PyTorch fails to find the CUDA SDK. Can you post the log of the installation command below?

python3 -m pip install --verbose --user --upgrade git+https://github.com/microsoft/tutel@main
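Alongside the install log, it can help to see what PyTorch itself reports about the CUDA toolkit. The following is a minimal sketch and not part of tutel: `cuda_build_report` is a hypothetical helper name, and it only reads `torch.__version__`, `torch.version.cuda`, and `torch.utils.cpp_extension.CUDA_HOME`, plus the `CUDA_HOME` environment variable and whether `nvcc` is on `PATH`.

```python
import os
import shutil


def cuda_build_report():
    """Collect the CUDA-related facts a C++ extension build depends on."""
    report = {
        "nvcc_on_path": shutil.which("nvcc"),          # None if nvcc is not on PATH
        "CUDA_HOME_env": os.environ.get("CUDA_HOME"),  # None if the variable is unset
    }
    try:
        import torch
        from torch.utils import cpp_extension

        report["torch_version"] = torch.__version__
        report["torch_built_for_cuda"] = torch.version.cuda
        report["torch_detected_cuda_home"] = cpp_extension.CUDA_HOME
    except Exception as exc:  # torch is missing or fails to load
        report["torch_error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in cuda_build_report().items():
        print(f"{key}: {value}")
```

If the CUDA version PyTorch was built for and the toolkit found on disk differ, that mismatch is worth mentioning when posting the log.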

ghostplant avatar Mar 30 '23 08:03 ghostplant

Using pip 23.0.1 from /home/ubuntu/anaconda3/envs/snerf/lib/python3.8/site-packages/pip (python 3.8) Looking in indexes: https://mirrors.bfsu.edu.cn/pypi/web/simple/ Collecting git+https://github.com/microsoft/tutel@main Cloning https://github.com/microsoft/tutel (to revision main) to /tmp/pip-req-build-f3vo8y7s Running command git version git version 2.25.1 Running command git clone --filter=blob:none https://github.com/microsoft/tutel /tmp/pip-req-build-f3vo8y7s Cloning into '/tmp/pip-req-build-f3vo8y7s'... Updating files: 3% (2/61) Updating files: 4% (3/61) Updating files: 6% (4/61) Updating files: 8% (5/61) Updating files: 9% (6/61) Updating files: 11% (7/61) Updating files: 13% (8/61) Updating files: 14% (9/61) Updating files: 16% (10/61) Updating files: 18% (11/61) Updating files: 19% (12/61) Updating files: 21% (13/61) Updating files: 22% (14/61) Updating files: 24% (15/61) Updating files: 26% (16/61) Updating files: 27% (17/61) Updating files: 29% (18/61) Updating files: 31% (19/61) Updating files: 32% (20/61) Updating files: 34% (21/61) Updating files: 36% (22/61) Updating files: 37% (23/61) Updating files: 39% (24/61) Updating files: 40% (25/61) Updating files: 42% (26/61) Updating files: 44% (27/61) Updating files: 45% (28/61) Updating files: 47% (29/61) Updating files: 49% (30/61) Updating files: 50% (31/61) Updating files: 52% (32/61) Updating files: 54% (33/61) Updating files: 55% (34/61) Updating files: 57% (35/61) Updating files: 59% (36/61) Updating files: 60% (37/61) Updating files: 62% (38/61) Updating files: 63% (39/61) Updating files: 65% (40/61) Updating files: 67% (41/61) Updating files: 68% (42/61) Updating files: 70% (43/61) Updating files: 72% (44/61) Updating files: 73% (45/61) Updating files: 75% (46/61) Updating files: 77% (47/61) Updating files: 78% (48/61) Updating files: 80% (49/61) Updating files: 81% (50/61) Updating files: 83% (51/61) Updating files: 85% (52/61) Updating files: 86% (53/61) Updating files: 88% (54/61) Updating 
files: 90% (55/61) Updating files: 91% (56/61) Updating files: 93% (57/61) Updating files: 95% (58/61) Updating files: 96% (59/61) Updating files: 98% (60/61) Updating files: 100% (61/61) Updating files: 100% (61/61), done. Running command git show-ref main 1456b49e27d3aaef09be65da5b74a7be0239bdb4 refs/heads/main 1456b49e27d3aaef09be65da5b74a7be0239bdb4 refs/remotes/origin/main Running command git symbolic-ref -q HEAD refs/heads/main Resolved https://github.com/microsoft/tutel to commit 1456b49e27d3aaef09be65da5b74a7be0239bdb4 Running command git rev-parse HEAD 1456b49e27d3aaef09be65da5b74a7be0239bdb4 Running command python setup.py egg_info running egg_info creating /tmp/pip-pip-egg-info-aiosqnkd/tutel.egg-info writing manifest file '/tmp/pip-pip-egg-info-aiosqnkd/tutel.egg-info/SOURCES.txt' writing manifest file '/tmp/pip-pip-egg-info-aiosqnkd/tutel.egg-info/SOURCES.txt' Preparing metadata (setup.py) ... done Building wheels for collected packages: tutel Running command git rev-parse HEAD 1456b49e27d3aaef09be65da5b74a7be0239bdb4 Running command python setup.py bdist_wheel running bdist_wheel running build running build_py creating build creating build/lib.linux-x86_64-3.8 creating build/lib.linux-x86_64-3.8/tutel copying tutel/system.py -> build/lib.linux-x86_64-3.8/tutel copying tutel/net.py -> build/lib.linux-x86_64-3.8/tutel copying tutel/jit.py -> build/lib.linux-x86_64-3.8/tutel copying tutel/moe.py -> build/lib.linux-x86_64-3.8/tutel copying tutel/init.py -> build/lib.linux-x86_64-3.8/tutel creating build/lib.linux-x86_64-3.8/tutel/jit_kernels copying tutel/jit_kernels/gating.py -> build/lib.linux-x86_64-3.8/tutel/jit_kernels copying tutel/jit_kernels/sparse.py -> build/lib.linux-x86_64-3.8/tutel/jit_kernels copying tutel/jit_kernels/init.py -> build/lib.linux-x86_64-3.8/tutel/jit_kernels creating build/lib.linux-x86_64-3.8/tutel/parted copying tutel/parted/patterns.py -> build/lib.linux-x86_64-3.8/tutel/parted copying tutel/parted/spmdx.py -> 
build/lib.linux-x86_64-3.8/tutel/parted copying tutel/parted/init.py -> build/lib.linux-x86_64-3.8/tutel/parted copying tutel/parted/solver.py -> build/lib.linux-x86_64-3.8/tutel/parted creating build/lib.linux-x86_64-3.8/tutel/examples copying tutel/examples/moe_mnist.py -> build/lib.linux-x86_64-3.8/tutel/examples copying tutel/examples/helloworld_from_scratch.py -> build/lib.linux-x86_64-3.8/tutel/examples copying tutel/examples/helloworld.py -> build/lib.linux-x86_64-3.8/tutel/examples copying tutel/examples/helloworld_amp.py -> build/lib.linux-x86_64-3.8/tutel/examples copying tutel/examples/moe_cifar10.py -> build/lib.linux-x86_64-3.8/tutel/examples copying tutel/examples/helloworld_ddp_tutel.py -> build/lib.linux-x86_64-3.8/tutel/examples copying tutel/examples/helloworld_deepspeed.py -> build/lib.linux-x86_64-3.8/tutel/examples copying tutel/examples/init.py -> build/lib.linux-x86_64-3.8/tutel/examples copying tutel/examples/helloworld_ddp.py -> build/lib.linux-x86_64-3.8/tutel/examples creating build/lib.linux-x86_64-3.8/tutel/experts copying tutel/experts/ffn.py -> build/lib.linux-x86_64-3.8/tutel/experts copying tutel/experts/init.py -> build/lib.linux-x86_64-3.8/tutel/experts creating build/lib.linux-x86_64-3.8/tutel/checkpoint copying tutel/checkpoint/scatter.py -> build/lib.linux-x86_64-3.8/tutel/checkpoint copying tutel/checkpoint/init.py -> build/lib.linux-x86_64-3.8/tutel/checkpoint copying tutel/checkpoint/gather.py -> build/lib.linux-x86_64-3.8/tutel/checkpoint creating build/lib.linux-x86_64-3.8/tutel/custom copying tutel/custom/init.py -> build/lib.linux-x86_64-3.8/tutel/custom creating build/lib.linux-x86_64-3.8/tutel/launcher copying tutel/launcher/run.py -> build/lib.linux-x86_64-3.8/tutel/launcher copying tutel/launcher/execl.py -> build/lib.linux-x86_64-3.8/tutel/launcher copying tutel/launcher/init.py -> build/lib.linux-x86_64-3.8/tutel/launcher creating build/lib.linux-x86_64-3.8/tutel/gates copying tutel/gates/cosine_top.py -> 
build/lib.linux-x86_64-3.8/tutel/gates copying tutel/gates/top.py -> build/lib.linux-x86_64-3.8/tutel/gates copying tutel/gates/init.py -> build/lib.linux-x86_64-3.8/tutel/gates creating build/lib.linux-x86_64-3.8/tutel/impls copying tutel/impls/fast_dispatch.py -> build/lib.linux-x86_64-3.8/tutel/impls copying tutel/impls/jit_compiler.py -> build/lib.linux-x86_64-3.8/tutel/impls copying tutel/impls/moe_layer.py -> build/lib.linux-x86_64-3.8/tutel/impls copying tutel/impls/overlap.py -> build/lib.linux-x86_64-3.8/tutel/impls copying tutel/impls/communicate.py -> build/lib.linux-x86_64-3.8/tutel/impls copying tutel/impls/init.py -> build/lib.linux-x86_64-3.8/tutel/impls copying tutel/impls/losses.py -> build/lib.linux-x86_64-3.8/tutel/impls creating build/lib.linux-x86_64-3.8/tutel/parted/backend copying tutel/parted/backend/init.py -> build/lib.linux-x86_64-3.8/tutel/parted/backend creating build/lib.linux-x86_64-3.8/tutel/parted/backend/torch copying tutel/parted/backend/torch/config.py -> build/lib.linux-x86_64-3.8/tutel/parted/backend/torch copying tutel/parted/backend/torch/executor.py -> build/lib.linux-x86_64-3.8/tutel/parted/backend/torch copying tutel/parted/backend/torch/init.py -> build/lib.linux-x86_64-3.8/tutel/parted/backend/torch running build_ext creating /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8 creating /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/tutel creating /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/tutel/custom Emitting ninja build file /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/build.ninja... Compiling objects... Allowing ninja to set a default number of workers... 
(overridable by setting the environment variable MAX_JOBS=N) [1/1] c++ -MMD -MF /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/tutel/custom/custom_kernel.o.d -pthread -B /home/ubuntu/anaconda3/envs/snerf/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/TH -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-11.6/include -I/home/ubuntu/anaconda3/envs/snerf/include/python3.8 -c -c /tmp/pip-req-build-f3vo8y7s/tutel/custom/custom_kernel.cpp -o /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/tutel/custom/custom_kernel.o -Wno-sign-compare -Wno-unused-but-set-variable -Wno-terminate -Wno-unused-function -Wno-strict-aliasing -DUSE_GPU -DUSE_NCCL -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=tutel_custom_kernel -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14 cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ g++ -pthread -shared -B /home/ubuntu/anaconda3/envs/snerf/compiler_compat -L/home/ubuntu/anaconda3/envs/snerf/lib -Wl,-rpath=/home/ubuntu/anaconda3/envs/snerf/lib -Wl,--no-as-needed -Wl,--sysroot=/ /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/./tutel/custom/custom_kernel.o -L/usr/local/cuda/lib64/stubs -L/home/ubuntu/.local/lib/python3.8/site-packages/torch/lib -L/usr/local/cuda-11.6/lib64 -lcuda -lnvrtc -lnccl -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda_cu -ltorch_cuda_cpp -o build/lib.linux-x86_64-3.8/tutel_custom_kernel.cpython-38-x86_64-linux-gnu.so installing to build/bdist.linux-x86_64/wheel running install running install_lib creating build/bdist.linux-x86_64 
creating build/bdist.linux-x86_64/wheel creating build/bdist.linux-x86_64/wheel/tutel creating build/bdist.linux-x86_64/wheel/tutel/jit_kernels creating build/bdist.linux-x86_64/wheel/tutel/parted creating build/bdist.linux-x86_64/wheel/tutel/parted/backend creating build/bdist.linux-x86_64/wheel/tutel/parted/backend/torch creating build/bdist.linux-x86_64/wheel/tutel/examples creating build/bdist.linux-x86_64/wheel/tutel/experts creating build/bdist.linux-x86_64/wheel/tutel/checkpoint creating build/bdist.linux-x86_64/wheel/tutel/custom creating build/bdist.linux-x86_64/wheel/tutel/launcher creating build/bdist.linux-x86_64/wheel/tutel/gates creating build/bdist.linux-x86_64/wheel/tutel/impls running install_egg_info running egg_info creating tutel.egg-info writing manifest file 'tutel.egg-info/SOURCES.txt' writing manifest file 'tutel.egg-info/SOURCES.txt' Copying tutel.egg-info to build/bdist.linux-x86_64/wheel/tutel-0.1-py3.8.egg-info running install_scripts creating build/bdist.linux-x86_64/wheel/tutel-0.1.dist-info/WHEEL creating '/tmp/pip-wheel-fsgwko7i/tutel-0.1-cp38-cp38-linux_x86_64.whl' and adding 'build/bdist.linux-x86_64/wheel' to it adding 'tutel_custom_kernel.cpython-38-x86_64-linux-gnu.so' adding 'tutel/init.py' adding 'tutel/jit.py' adding 'tutel/moe.py' adding 'tutel/net.py' adding 'tutel/system.py' adding 'tutel/checkpoint/init.py' adding 'tutel/checkpoint/gather.py' adding 'tutel/checkpoint/scatter.py' adding 'tutel/custom/init.py' adding 'tutel/examples/init.py' adding 'tutel/examples/helloworld.py' adding 'tutel/examples/helloworld_amp.py' adding 'tutel/examples/helloworld_ddp.py' adding 'tutel/examples/helloworld_ddp_tutel.py' adding 'tutel/examples/helloworld_deepspeed.py' adding 'tutel/examples/helloworld_from_scratch.py' adding 'tutel/examples/moe_cifar10.py' adding 'tutel/examples/moe_mnist.py' adding 'tutel/experts/init.py' adding 'tutel/experts/ffn.py' adding 'tutel/gates/init.py' adding 'tutel/gates/cosine_top.py' adding 
'tutel/gates/top.py' adding 'tutel/impls/init.py' adding 'tutel/impls/communicate.py' adding 'tutel/impls/fast_dispatch.py' adding 'tutel/impls/jit_compiler.py' adding 'tutel/impls/losses.py' adding 'tutel/impls/moe_layer.py' adding 'tutel/impls/overlap.py' adding 'tutel/jit_kernels/init.py' adding 'tutel/jit_kernels/gating.py' adding 'tutel/jit_kernels/sparse.py' adding 'tutel/launcher/init.py' adding 'tutel/launcher/execl.py' adding 'tutel/launcher/run.py' adding 'tutel/parted/init.py' adding 'tutel/parted/patterns.py' adding 'tutel/parted/solver.py' adding 'tutel/parted/spmdx.py' adding 'tutel/parted/backend/init.py' adding 'tutel/parted/backend/torch/init.py' adding 'tutel/parted/backend/torch/config.py' adding 'tutel/parted/backend/torch/executor.py' adding 'tutel-0.1.dist-info/LICENSE' adding 'tutel-0.1.dist-info/METADATA' adding 'tutel-0.1.dist-info/WHEEL' adding 'tutel-0.1.dist-info/top_level.txt' adding 'tutel-0.1.dist-info/RECORD' removing build/bdist.linux-x86_64/wheel Building wheel for tutel (setup.py) ... done Created wheel for tutel: filename=tutel-0.1-cp38-cp38-linux_x86_64.whl size=3818720 sha256=c9229a1d4450e51722ce8c3ee1ac1f168c52eb336f50cb8b74541f46db9908d6 Stored in directory: /tmp/pip-ephem-wheel-cache-8p0gi8d8/wheels/fd/b8/fb/efc186bf3c0931e42fd89af67fe0cfcdece6fb5b055e69ec0a Successfully built tutel Installing collected packages: tutel Running command git rev-parse HEAD 1456b49e27d3aaef09be65da5b74a7be0239bdb4 Successfully installed tutel-0.1

zhaojiancheng007 avatar Mar 30 '23 09:03 zhaojiancheng007

Thanks. What about the standard output of this:

python3 -c 'import torch; import tutel_custom_kernel'
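A related check, sketched here as an assumption rather than an official tutel diagnostic, is to ask Python where it would load the module from without actually importing it. `locate_module` is a hypothetical helper built on the standard `importlib.util.find_spec`.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    for name in ("torch", "tutel", "tutel_custom_kernel"):
        print(name, "->", locate_module(name))
```

A result of `None` means the file is not on `sys.path` for this interpreter. A real path combined with a failing import points instead at a loading problem, such as a missing shared library.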

ghostplant avatar Mar 30 '23 14:03 ghostplant

Thanks! It seems that the module 'tutel_custom_kernel' doesn't exist.

zhaojiancheng007 avatar Mar 30 '23 14:03 zhaojiancheng007

Can you search for the path of this file in your anaconda3 environment:

find /home/ubuntu/anaconda3 | grep tutel_custom_kernel

Your anaconda3 doesn't automatically add it to PYTHONPATH.

For a PyPI installation instead of anaconda, I don't think there would be such a problem; the file is usually installed at a path like:

/usr/local/lib/python3.8/dist-packages/tutel_custom_kernel.cpython-38m-x86_64-linux-gnu.so
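The same search can be done from Python across several roots at once. This is a rough sketch with a hypothetical helper name, `find_files`; the roots in the comment are only examples. Since the install command above uses `--user`, the user site directory (what `site.getusersitepackages()` returns) is probably worth including as well as the anaconda3 tree.

```python
from pathlib import Path


def find_files(roots, pattern="tutel_custom_kernel*"):
    """Recursively collect files under each root whose name matches the pattern."""
    hits = []
    for root in roots:
        root = Path(root)
        if root.is_dir():
            hits.extend(sorted(p for p in root.rglob(pattern) if p.is_file()))
    return hits


# Example call (paths are illustrative, adjust to your machine):
#   import site, sys
#   for path in find_files([sys.prefix, site.getusersitepackages()]):
#       print(path)
```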

ghostplant avatar Mar 30 '23 15:03 ghostplant

I'm sorry, I did follow the installation procedure, but I still couldn't find the file 'tutel_custom_kernel' in dist-packages. I'm not sure which part went wrong. I use CUDA 11.6 and torch==1.10.0+cu113. Another error that always shows up is 'ImportError: libnvrtc.so.11.0: cannot open shared object file: No such file or directory'.

zhaojiancheng007 avatar Mar 31 '23 06:03 zhaojiancheng007

OK, so the problem is not anaconda's site location; rather, your PyTorch fails to detect the CUDA library environment and its versioning.

You have several options:

  1. Find the location of libnvrtc.so.11.0 and add its directory to LD_LIBRARY_PATH.
  2. Find the location of libnvrtc.so.11.6, create a symbolic link to it, and name the link libnvrtc.so.11.0.
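Both options can be sketched in Python as follows. The helper names `find_libs` and `link_as` are hypothetical, and the directories in the example comment are guesses based on the paths in the log above. Treat option 2 as a workaround: a link from one version's file name to another version's library is not guaranteed to be binary-compatible.

```python
import os
from pathlib import Path


def find_libs(dirs, pattern="libnvrtc.so*"):
    """List libraries matching the pattern in the given directories (non-recursive)."""
    found = []
    for d in dirs:
        d = Path(d)
        if d.is_dir():
            found.extend(sorted(d.glob(pattern)))
    return found


def link_as(existing, alias_name):
    """Create a symlink called alias_name next to an existing library (option 2)."""
    existing = Path(existing)
    alias = existing.with_name(alias_name)
    if not (alias.is_symlink() or alias.exists()):
        os.symlink(existing.name, alias)  # relative link inside the same directory
    return alias


# Example (illustrative paths only):
#   dirs = os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
#   print(find_libs(dirs + ["/usr/local/cuda/lib64"]))
#   link_as("/usr/local/cuda-11.6/lib64/libnvrtc.so.11.6", "libnvrtc.so.11.0")
```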

ghostplant avatar Mar 31 '23 07:03 ghostplant

Because those shared libraries cannot be located on disk, the PyTorch C++ modules can't load at initialization.

ghostplant avatar Mar 31 '23 07:03 ghostplant

Thanks for your patience. I did what you told me, but the problem is still unsolved. I think maybe something went wrong with the ninja compiler during installation? I've pasted the installation log here. I use CUDA 10.2 with torch 1.10.0+cu102. Thanks a lot!

running install running bdist_egg running egg_info writing manifest file 'tutel.egg-info/SOURCES.txt' running install_lib running build_py running build_ext Emitting ninja build file /home/ubuntu/zcq/tutel/build/temp.linux-x86_64-3.8/build.ninja... Compiling objects... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) ninja: no work to do. g++ -pthread -shared -B /home/ubuntu/anaconda3/envs/snerf/compiler_compat -L/home/ubuntu/anaconda3/envs/snerf/lib -Wl,-rpath=/home/ubuntu/anaconda3/envs/snerf/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/ubuntu/zcq/tutel/build/temp.linux-x86_64-3.8/./tutel/custom/custom_kernel.o -L/usr/local/cuda/lib64/stubs -L/home/ubuntu/.local/lib/python3.8/site-packages/torch/lib -L/usr/local/cuda-11.6/lib64 -ldl -lcuda -lnvrtc -lnccl -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-3.8/tutel_custom_kernel.cpython-38-x86_64-linux-gnu.so creating build/bdist.linux-x86_64/egg creating build/bdist.linux-x86_64/egg/tutel creating build/bdist.linux-x86_64/egg/tutel/jit_kernels creating build/bdist.linux-x86_64/egg/tutel/parted creating build/bdist.linux-x86_64/egg/tutel/parted/backend creating build/bdist.linux-x86_64/egg/tutel/parted/backend/torch creating build/bdist.linux-x86_64/egg/tutel/examples creating build/bdist.linux-x86_64/egg/tutel/custom creating build/bdist.linux-x86_64/egg/tutel/launcher creating build/bdist.linux-x86_64/egg/tutel/impls byte-compiling build/bdist.linux-x86_64/egg/tutel/system.py to system.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/jit_kernels/gating.py to gating.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/jit_kernels/sparse.py to sparse.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/jit_kernels/init.py to init.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/net.py to net.cpython-38.pyc byte-compiling 
build/bdist.linux-x86_64/egg/tutel/jit.py to jit.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/backend/init.py to init.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/backend/torch/config.py to config.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/backend/torch/executor.py to executor.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/backend/torch/init.py to init.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/patterns.py to patterns.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/spmdx.py to spmdx.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/init.py to init.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/solver.py to solver.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld_from_scratch.py to helloworld_from_scratch.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld.py to helloworld.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld_amp.py to helloworld_amp.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld_deepspeed.py to helloworld_deepspeed.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/init.py to init.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld_ddp.py to helloworld_ddp.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/custom/init.py to init.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/moe.py to moe.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/launcher/run.py to run.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/launcher/execl.py to execl.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/launcher/init.py to init.cpython-38.pyc byte-compiling 
build/bdist.linux-x86_64/egg/tutel/init.py to init.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/fast_dispatch.py to fast_dispatch.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/jit_compiler.py to jit_compiler.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/moe_layer.py to moe_layer.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/communicate.py to communicate.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/init.py to init.cpython-38.pyc byte-compiling build/bdist.linux-x86_64/egg/tutel_custom_kernel.py to tutel_custom_kernel.cpython-38.pyc creating build/bdist.linux-x86_64/egg/EGG-INFO copying tutel.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO copying tutel.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying tutel.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying tutel.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying tutel.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO zip_safe flag not set; analyzing archive contents... 
pycache.tutel_custom_kernel.cpython-38: module references file removing 'build/bdist.linux-x86_64/egg' (and everything under it) creating /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg Extracting tutel-0.1-py3.8-linux-x86_64.egg to /home/ubuntu/.local/lib/python3.8/site-packages byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel_custom_kernel.py to tutel_custom_kernel.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/init.py to init.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/jit.py to jit.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/moe.py to moe.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/net.py to net.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/system.py to system.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/custom/init.py to init.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/init.py to init.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld.py to helloworld.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld_amp.py to helloworld_amp.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld_ddp.py to helloworld_ddp.cpython-38.pyc byte-compiling 
/home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld_deepspeed.py to helloworld_deepspeed.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld_from_scratch.py to helloworld_from_scratch.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/init.py to init.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/communicate.py to communicate.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/fast_dispatch.py to fast_dispatch.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/jit_compiler.py to jit_compiler.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/moe_layer.py to moe_layer.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/jit_kernels/init.py to init.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/jit_kernels/gating.py to gating.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/jit_kernels/sparse.py to sparse.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/launcher/init.py to init.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/launcher/execl.py to execl.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/launcher/run.py to run.cpython-38.pyc byte-compiling 
/home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/init.py to init.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/patterns.py to patterns.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/solver.py to solver.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/spmdx.py to spmdx.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/backend/init.py to init.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/backend/torch/init.py to init.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/backend/torch/config.py to config.cpython-38.pyc byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/backend/torch/executor.py to executor.cpython-38.pyc Adding tutel 0.1 to easy-install.pth file

Installed /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg Processing dependencies for tutel==0.1 Finished processing dependencies for tutel==0.1

zhaojiancheng007 avatar Mar 31 '23 15:03 zhaojiancheng007

Thanks! I reinstalled CUDA and torch, updated tutel to the latest version, and it works! Thanks for your patience, that really helped me a lot.

zhaojiancheng007 avatar Apr 01 '23 04:04 zhaojiancheng007

Can you share your CUDA and PyTorch versions? I have the same issue, and reinstalling doesn't work.

zachary62 avatar May 25 '23 05:05 zachary62

Mostly this happens when PyTorch fails to import a standard C++ extension because the extension's location is improper or messed up.

Here are several possibilities.

  1. The user running PyTorch is the root cause (e.g. root vs. non-root), because PyTorch was installed by a different user.
  2. Multiple copies of the C++ extension are found at different site locations (e.g. one version in the root site and another in the user site), making PyTorch import an improper one.
  3. The CUDA environment is not configured correctly, making the C++ extension fail during setup or library loading. In this case, however, you can usually see related errors in the installation log, e.g. nvcc or libcuda.so not found.
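The second possibility can be checked with a short script. This is a sketch under assumptions, not a tutel utility: `copies_on_path` is a hypothetical name, and it only looks at the top level of each directory on `sys.path`, in import order.

```python
import sys
from pathlib import Path


def copies_on_path(name, search_path=None):
    """Return (directory, matching file names) for each path entry holding the module."""
    hits = []
    for entry in (sys.path if search_path is None else search_path):
        directory = Path(entry or ".")  # an empty entry means the current directory
        if not directory.is_dir():
            continue  # skips zip archives and stale entries
        matches = sorted(p.name for p in directory.glob(name + "*"))
        if matches:
            hits.append((str(directory), matches))
    return hits


if __name__ == "__main__":
    for directory, names in copies_on_path("tutel_custom_kernel"):
        print(directory, names)
```

More than one directory in the output suggests that copies exist in several site locations. The earliest entry is usually the one the interpreter picks up, so removing the stale copies before reinstalling may help.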

ghostplant avatar May 25 '23 06:05 ghostplant