Cutlass missing from 3rdparty in new 5.2 release
Branch/Tag/Commit
v5.2
Docker Image Version
nvcr.io/nvidia/pytorch:22.09-py3
GPU name
A100
CUDA Driver
510.47.03
Reproduced Steps
I'm trying to build v5.2 following the T5 guide.
There seems to be a new dependency on cutlass in 3rdparty, but the submodule is missing.
When running cmake -DSM=80 -DCMAKE_BUILD_TYPE=Release -DBUILD_PYT=ON -DBUILD_MULTI_GPU=ON ..
the following error occurs:
-- The CXX compiler identification is GNU 9.4.0
-- The CUDA compiler identification is NVIDIA 11.8.89
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "11.8", minimum required is "10.2")
CUDA_VERSION 11 is greater or equal than 11, enable -DENABLE_BF16 flag
-- Add DBUILD_CUTLASS_MOE, requires CUTLASS. Increases compilation time
-- Add DBUILD_CUTLASS_MIXED_GEMM, requires CUTLASS. Increases compilation time
-- Running submodule update to fetch cutlass
error: pathspec '3rdparty/cutlass' did not match any file(s) known to git
CMake Error at CMakeLists.txt:56 (message):
git submodule update --init 3rdparty/cutlass failed with 1, please checkout
cutlass submodule
-- Configuring incomplete, errors occurred!
There is a submodule linking issue that we're fixing; it occurs because CUTLASS and FasterTransformer are out of sync. In the meantime, please try the following workaround to get v5.2 compiled:
git clone https://github.com/NVIDIA/FasterTransformer.git && cd FasterTransformer
git submodule add -f https://github.com/NVIDIA/cutlass.git 3rdparty/cutlass
mkdir build && cd build
cmake -DSM=<XX> -DCMAKE_BUILD_TYPE=Release -DBUILD_PYT=ON -DBUILD_MULTI_GPU=ON ..
cd ../3rdparty/cutlass
git checkout cc85b6
cd ../../build
make -j$(nproc)
This should get you past the CMake phase.
If you hit another error later, during the make phase, such as 3rdparty/cutlass/include/cutlass/epilogue/threadblock/epilogue.h:65:10: fatal error: cutlass/util/index_sequence.h: No such file or directory, cd to 3rdparty/cutlass and run git checkout cc85b6 again, then rerun the make command in the build folder. This should get v5.2 compiled.
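Before rerunning make, it may help to confirm that the cutlass checkout is still on the expected commit, since the make-phase error above suggests it can end up somewhere else. The following is only a minimal sketch: the helper names commit_matches and head_commit are hypothetical, and the only value taken from this thread is the abbreviated hash cc85b6.

```shell
#!/usr/bin/env bash
# Sketch: check that 3rdparty/cutlass is on the commit the workaround expects.
# EXPECTED is the abbreviated hash from the workaround; the function names
# below are hypothetical helpers, not part of FasterTransformer or CUTLASS.

EXPECTED="cc85b6"

# Succeeds if the full hash in $1 begins with the abbreviated hash in $2.
commit_matches() {
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Prints the HEAD commit of the git checkout at directory $1.
head_commit() {
    git -C "$1" rev-parse HEAD
}

# Usage, run from the FasterTransformer root:
#   if commit_matches "$(head_commit 3rdparty/cutlass)" "$EXPECTED"; then
#       echo "cutlass is on $EXPECTED"
#   else
#       echo "cutlass is on a different commit; check out $EXPECTED again" >&2
#   fi
```

If the check reports a different commit, repeating git checkout cc85b6 inside 3rdparty/cutlass and then rerunning make from the build folder matches the recovery step described above.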
Thank you for the explanation, symphonylyh. @michaelroyzen, I have fixed it. You can try again now.